added relevant COI mock community info and rep seqs #92

devonorourke · 2019-05-13T21:24:42Z

Hi mockrobiota folk,
I've added the fasta file for the mock COI dataset I've used in a few bat guano related projects. Though I don't have a publication to link these data to at the moment, @nbokulich is on the forthcoming paper that describes their use. Reads are dumped as BioSamples via NCBI and I've provided a link in the README.md file for users to access.
Please let me know what other information you'd like me to add.
Cheers

nbokulich · 2019-05-14T00:20:18Z

Thanks @devonorourke !

It looks like the tests failed; could you please fix those, and then I can review once the tests pass? The error suggests that the dataset metadata file's header line is space-delimited, not tab-delimited.
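A quick local check for this kind of failure might look like the sketch below. It is not the repository's actual test; the function name and the example column names (`key`, `value`) are hypothetical. It only confirms that a header line splits into at least two fields on tab characters.

```python
def header_is_tab_delimited(header_line: str, min_columns: int = 2) -> bool:
    """Return True if the header splits into at least `min_columns` non-empty tab-separated fields."""
    fields = header_line.rstrip("\r\n").split("\t")
    # A space-delimited header contains no tabs, so it collapses into a single field.
    return len(fields) >= min_columns and all(field.strip() for field in fields)


if __name__ == "__main__":
    # Hypothetical column names, used only for illustration.
    print(header_is_tab_delimited("key\tvalue\n"))  # tab-delimited header
    print(header_is_tab_delimited("key value\n"))   # space-delimited header
```

To check a real file, the first line of `dataset-metadata.tsv` could be read and passed to the same function.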

devonorourke · 2019-05-14T01:32:51Z

Sorry; I fixed the dataset-metadata.tsv file so that it was tab-delimited.
Should be okay now

nbokulich

Looks good @devonorourke ! Just a couple minor comments and a request.

In addition to the source formats (which can be provided as-is), would it be possible to provide expected taxonomy files?

- See here for an example file.
- The directory structure should be `.../mock-29/<database-name>/<database-version-or-download-date-MMDDYYYY>/<OTU-cluster-percent>/`.
- The taxonomy file will contain taxon names (as row names) that match valid taxa in the reference database/version/OTU% that you used. Ideally these should be formatted for use with QIIME 2 (e.g., semicolon-delimited).
- The "database identifier" file is a list of reference database identifiers that match the expected taxon names.
- If you base this off of a custom database, just make sure the database is available on GitHub, Zenodo, or elsewhere (I think this is what you are already doing with your databases, correct?), and make sure it is all well documented (e.g., you can link to a GitHub repo with code describing how the database was made).
- Note: a long time ago I put together some shoddy, untested code for automatically generating the expected taxonomy files. Specifically, you want this.
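To make the layout above concrete, here is a minimal sketch of how the directory path and a semicolon-delimited taxon name could be assembled. The database name, date, OTU percent label, and rank names are placeholders chosen for illustration, not values from this pull request, and this is not the auto-generation code mentioned above.

```python
from pathlib import PurePosixPath
from typing import Sequence


def expected_taxonomy_dir(mock_id: str, database: str, version: str, otu_percent: str) -> PurePosixPath:
    """Build <mock-id>/<database-name>/<version-or-date>/<OTU-cluster-percent>."""
    return PurePosixPath(mock_id, database, version, otu_percent)


def format_taxon(ranks: Sequence[str]) -> str:
    """Join rank names into one semicolon-delimited taxon string."""
    return ";".join(rank.strip() for rank in ranks)


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    print(expected_taxonomy_dir("mock-29", "example-db", "05142019", "99-otus"))
    print(format_taxon(["Animalia", "Arthropoda", "Insecta"]))
```

The exact rank prefixes and separator spacing should follow whatever the chosen reference database uses, since the taxon names need to match it.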

The expected taxonomy files are not required at submission, so if this is too much to ask right now that is fine.

Thanks!

nbokulich · 2019-05-14T17:51:39Z

data/mock-coi1/README.md

@@ -0,0 +1,12 @@
+#  mock-coi1


let's call this mock-29 (to keep consistent)

nbokulich · 2019-05-14T17:57:09Z

data/mock-coi1/README.md

+Note:
+The mock sample described above was sequenced in conjunction with hundreds of bat guano samples in a single MiSeq run. All data are availble as BioSamples [here at NCBI](https://www.ncbi.nlm.nih.gov/bioproject/518082). Individual sequence data specific to the mock sample are found in the `dataset-metadata.tsv` document.
+
+These reads contain dual-index barcodes modeled after the Schloss lab [workflow described here](https://github.com/SchlossLab/MiSeq_WetLab_SOP/blob/master/MiSeq_WetLab_SOP.md). Reads were processed in QIIME2 as described in [this GitHub repo](https://github.com/devonorourke/tidybug/blob/master/docs/sequence_filtering.md#raw-sequence-data-processing).


it may be useful to provide a snippet of code showing how to import these reads into QIIME 2 (note that dual-index barcode support is now available in QIIME 2!)
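One possible starting point, sketched under the assumption that the mock reads have already been downloaded as demultiplexed paired-end FASTQ files: write a tab-separated manifest that QIIME 2 can import. The sample ID and file paths are hypothetical, and the column names follow the paired-end manifest layout (`PairedEndFastqManifestPhred33V2`) as I understand it, so they are worth checking against the QIIME 2 importing documentation for the release in use.

```python
import csv
import io
from typing import Dict, Tuple

# Column names assumed from the QIIME 2 paired-end manifest (V2) format.
MANIFEST_HEADER = ["sample-id", "forward-absolute-filepath", "reverse-absolute-filepath"]


def build_manifest(samples: Dict[str, Tuple[str, str]]) -> str:
    """Return manifest text with one row per sample: id, forward path, reverse path."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(MANIFEST_HEADER)
    for sample_id, (forward_path, reverse_path) in sorted(samples.items()):
        writer.writerow([sample_id, forward_path, reverse_path])
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical sample ID and absolute paths, for illustration only.
    manifest = build_manifest(
        {"mock-sample": ("/data/mock_R1.fastq.gz", "/data/mock_R2.fastq.gz")}
    )
    print(manifest, end="")
```

The resulting file could then be passed to `qiime tools import` with `--type 'SampleData[PairedEndSequencesWithQuality]'` and `--input-format PairedEndFastqManifestPhred33V2`. Reads that are still multiplexed would need a different import and demultiplexing route.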

added relevant COI mock community info and rep seqs

a2ce5e4

devonorourke added 3 commits May 13, 2019 21:24

Update dataset-metadata.tsv

cefc1ad

Update dataset-metadata.tsv

978e864

Update dataset-metadata.tsv

c1bf19b

devonorourke added 8 commits May 14, 2019 09:51

Update dataset-metadata.tsv

37e9cc4

Update README.md

5cc9c9b

Update README.md

38c8538

Update sample-metadata.tsv

4a6968e

Update sample-metadata.tsv

9d94c33

Update sample-metadata.tsv

ffa8de9

Update sample-metadata.tsv

786a7b8

Create taxonomy.tsv

447fb8a

nbokulich reviewed May 14, 2019
