Registering samples on the uBiome Explorer #5

Open
dhimmel opened this issue Dec 13, 2017 · 3 comments


dhimmel commented Dec 13, 2017

I messaged Elies regarding the status of our samples. It turns out, according to Elies, uBiome processed the samples on 2017-10-11! She wrote:

Let me know if you can access the data from the Explorer website or not. There was no user associated with the samples in the database, so maybe you never registered them?

It didn't occur to me to register these kits in the Explorer, which is why I didn't notice the processing was complete. So I registered the samples in the Explorer today, 2017-12-13.

Methods

I clicked on "Activate Kit" and entered the kit info from kits.tsv. For each sample (kit-tube pair), I entered the date as 2017-07-20 (the date @bemert transferred the samples into the uBiome tubes; see #2 (comment)). Then I changed the sample type from gut/spare to custom. Custom is documented as:

Custom samples are a special category for non-standard sampling, e.g. sampling your pet. Is this a custom sample?

In the notes section, I added the sample information. This was a manual process, so I was careful not to make errors. Elies sent me some data, which we can use in the future to confirm that our sample-to-tube assignments agree with uBiome's records.

Here are two screenshots of the 10 samples on the uBiome Explorer:

[Screenshot: ubiome-explorer-1]
[Screenshot: ubiome-explorer-2]

Note that we have data on both the concentrated and unconcentrated preparations, which hopefully will provide some comforting redundancy.

dhimmel added a commit that referenced this issue Dec 13, 2017
Downloaded from the uBiome Explorer. See
#5

The kit/tube to ssr id mapping is not yet known (although can be
deduced at download time).

Track .zip files with Git LFS, attempt to use GitLab LFS.
dhimmel added a commit that referenced this issue Dec 13, 2017
Downloaded from the uBiome Explorer. See
#5

The kit/tube to ssr id mapping is not yet known (although can be
deduced at download time).

Track .zip files with Git LFS.

Use GitLab LFS, which offers sufficient free tier LFS storage for
this project, rather than GitHub LFS as the LFS server. Created
a mirror of the repository on GitLab for this purpose at:
https://gitlab.com/dhimmel/fratjuice

dhimmel commented Dec 13, 2017

@eliesbik I'm wondering what identifier system uBiome uses for tubes. Knowing this would let us translate more easily between our kit/tube information and specific samples.

From downloading the data in the explorer (see #6), it looks like our ten tubes have the following IDs:

300909
300921
300945
300948
300960
300975
300987
301026
301029
301032

When I export taxonomy data as JSON, I see a line like:

  "sequencing_revision": "300987",

So perhaps these identifiers are called sequencing_revision IDs? Anyway, is this ID also the ID on each tube (if you were to read the QR code)? It would be helpful if we could add a sequencing_revision column to kits.tsv, along with any other IDs that uBiome uses.
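
A minimal sketch of how this identifier could be read from each JSON export, assuming `sequencing_revision` is a top-level key as the quoted line suggests. The rest of the export's structure is not shown in this thread, and the function name and example string are hypothetical:

```python
import json


def get_sequencing_revision(json_text):
    """Return the sequencing_revision value from a taxonomy JSON export.

    Assumes the export is a JSON object with a top-level
    "sequencing_revision" key (an assumption based on the quoted line).
    """
    export = json.loads(json_text)
    return export["sequencing_revision"]


# Hypothetical, heavily truncated export holding only the key of interest.
example_export = '{"sequencing_revision": "300987"}'
print(get_sequencing_revision(example_export))
```

Running this over every downloaded export would give the values for a sequencing_revision column in kits.tsv.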

Right now kits.tsv doesn't have a column that uniquely identifies each sequencing dataset (uniqueness comes from combining sample_id and sample_type). The goal here is to:

  1. get the right identifiers to identify the sequencing data
  2. get any other IDs that will help bridge analyses (e.g. what the tube QR code contains)


dhimmel commented Dec 13, 2017

Regarding my questions above about identifiers, the spreadsheet provided by @eliesbik in #7 appears to contain the relevant information (and more)! Specifically, the Mapping sheet of results.xlsx contains these columns (showing only the first four rows):

| Sample_ID | Sample_type | SeqID | tubeId | barcode | order_id |
|---|---|---|---|---|---|
| FJ5 - gut | concentrated | 300909 | NA0012382439 | 699179835 | 177574 |
| FJ1 - spare | unconcentrated | 300921 | NA0013010984 | 592185324 | 183059 |
| FJ2 - gut | concentrated | 300945 | NA0008969520 | 606185447 | 183182 |
| FJ2 - spare | unconcentrated | 300948 | NA0013010993 | 606185447 | 183182 |

A few questions @eliesbik to make sure I understand all the identifiers:

  1. tubeId is the unique identifier for each tube, which is encoded by the QR code? We will probably never need this, but it's good to have so we know which tube at uBiome corresponds to which sample.
  2. SeqID, the "unique sequence id", refers to the output from a sequencing run? In our case, each tube was only sequenced once, so the tubeId-to-SeqID mapping is one-to-one.
  3. barcode is the dehyphenated kit identifier.
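
To illustrate the third point, here is a rough sketch of looking up SeqIDs from a hyphenated kit identifier. The SeqID and barcode values come from the Mapping sheet excerpt above. The hyphenated form `606-185-447` and both function names are hypothetical, since the exact hyphen positions used in kits.tsv are not shown in this thread:

```python
def dehyphenate(kit_id):
    """Remove hyphens from a kit identifier to get the barcode form."""
    return kit_id.replace("-", "")


# (SeqID, barcode) pairs from the first four rows of the Mapping sheet.
mapping = [
    ("300909", "699179835"),
    ("300921", "592185324"),
    ("300945", "606185447"),
    ("300948", "606185447"),
]


def seq_ids_for_kit(kit_id, mapping):
    """Return every SeqID whose barcode matches the dehyphenated kit id."""
    barcode = dehyphenate(kit_id)
    return [seq_id for seq_id, row_barcode in mapping if row_barcode == barcode]


# Hypothetical hyphenation; the real kit id format may differ.
print(seq_ids_for_kit("606-185-447", mapping))
```

Because one kit holds two tubes, a barcode can match more than one SeqID, so the barcode alone does not identify a sequencing dataset.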


eliesbik commented Dec 14, 2017

Happy to answer your questions, @dhimmel!

  1. tubeID is the QR code, and refers to the physical tube that holds each sample.
  2. SeqID is indeed the output of a sequencing run. One tubeID can have multiple SeqIDs, for example when a sequencing run did not meet our quality controls, or when a particular sample did not yield enough reads. In both of these cases, we will redo the PCR and sequencing of that sample, or all samples in that run, and a sample will then have two SeqIDs. This is the identifier that will be linked to your raw data.
  3. Correct.
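
The one-to-many relationship between tubes and sequencing runs can be sketched as follows. The tubeId and SeqID values are the four rows quoted from the Mapping sheet. The function names are hypothetical, and the extra SeqID `999999` is invented purely to show what a rerun would look like:

```python
from collections import defaultdict

# (tubeId, SeqID) pairs from the first four rows of the Mapping sheet.
pairs = [
    ("NA0012382439", "300909"),
    ("NA0013010984", "300921"),
    ("NA0008969520", "300945"),
    ("NA0013010993", "300948"),
]


def seq_ids_by_tube(pairs):
    """Group SeqIDs under the tube they were sequenced from."""
    grouped = defaultdict(list)
    for tube_id, seq_id in pairs:
        grouped[tube_id].append(seq_id)
    return dict(grouped)


def resequenced_tubes(pairs):
    """Return only the tubes that have more than one SeqID."""
    return {
        tube_id: seq_ids
        for tube_id, seq_ids in seq_ids_by_tube(pairs).items()
        if len(seq_ids) > 1
    }


# No tube in the excerpt was sequenced twice, so this prints an empty dict.
print(resequenced_tubes(pairs))

# Hypothetical rerun of the first tube, using an invented SeqID.
print(resequenced_tubes(pairs + [("NA0012382439", "999999")]))
```

An empty result means the tubeId-to-SeqID mapping is one-to-one for the rows checked.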

Regarding when the samples were processed, 10-11-2017 refers to October 11, not November 10?

Yes
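
Given that answer, a date written as 10-11-2017 can be converted to the unambiguous ISO form used elsewhere in this thread. This is a small sketch with a hypothetical function name, assuming month-day-year order as confirmed above:

```python
from datetime import datetime


def parse_month_day_year(text):
    """Parse a month-day-year string such as 10-11-2017 into a date."""
    return datetime.strptime(text, "%m-%d-%Y").date()


print(parse_month_day_year("10-11-2017").isoformat())  # 2017-10-11
```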

dhimmel added a commit that referenced this issue Dec 18, 2017
Downloaded from the uBiome Explorer. See
#5

The kit/tube to ssr id mapping is not yet known (although can be
deduced at download time).

Track .zip files with Git LFS.

Use GitLab LFS, which offers sufficient free tier LFS storage for
this project, rather than GitHub LFS as the LFS server. Created
a mirror of the repository on GitLab for this purpose at:
https://gitlab.com/dhimmel/fratjuice