blanks are special again #175
Fixes the double SampleID Column
@squirrelo @EmbrietteH, is this case still necessary to handle?
Still investigating.
The changes just pushed in were necessary to resolve issues encountered with the production data.
Assuming tests pass, would it be possible for a rapid review on this?
Tests should be passing, possible for review please?
No, stop asking ... jk, I'm on it. On (Nov-06-15|12:20), Daniel McDonald wrote:
The first step in the process is to merge the individual tables into larger ones, and then to merge the larger tables into a final one. We're not using QIIME's `parallel_merge_otu_tables.py` here as we also need one of the intermediate tables for subsequent processing.
The first merge we'll do is between the Global Gut and the American Gut.
We also need to make sure the metadata (the information about the samples) are also merged and consistent. Prior to merge, we're going to add in some additional detail about every sample, such as a column in the mapping file that is the combination of the study title and the body site. We're also going to "generalize" body sites to the type of site they're from (e.g., the back of the hand is just "skin"). This process will also clean the metadata to remove blanks and unknown sample types.
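The cleanup described in that paragraph could be sketched roughly as below. This is only an illustration under stated assumptions, not the code in this PR: the column names (`TITLE`, `BODY_SITE`, `SIMPLE_BODY_SITE`, `TITLE_BODY_SITE`), the `SITE_TO_TYPE` mapping, and the `clean_metadata` function are all hypothetical, and plain dictionaries stand in for the mapping file.

```python
# Hypothetical sketch: the column names and the site mapping are assumptions,
# not taken from the American Gut code base.
SITE_TO_TYPE = {
    "back of hand": "skin",
    "forehead": "skin",
    "feces": "gut",
    "tongue": "oral",
}

# Body-site values treated as blanks or unknown sample types (assumed).
UNWANTED_SITES = {"", "blank", "unknown"}


def clean_metadata(rows):
    """Return cleaned copies of the metadata rows (one dict per sample)."""
    cleaned = []
    for row in rows:
        site = row.get("BODY_SITE", "").strip().lower()
        # Drop blanks and samples whose type is unknown.
        if site in UNWANTED_SITES:
            continue
        new_row = dict(row)
        # Generalize the body site; keep the original value if it is unmapped.
        new_row["SIMPLE_BODY_SITE"] = SITE_TO_TYPE.get(site, site)
        # Combine the study title with the generalized body site.
        new_row["TITLE_BODY_SITE"] = f"{row['TITLE']}-{new_row['SIMPLE_BODY_SITE']}"
        cleaned.append(new_row)
    return cleaned


if __name__ == "__main__":
    samples = [
        {"SAMPLE_ID": "s1", "TITLE": "American Gut", "BODY_SITE": "Back of hand"},
        {"SAMPLE_ID": "s2", "TITLE": "Global Gut", "BODY_SITE": "feces"},
        {"SAMPLE_ID": "b1", "TITLE": "American Gut", "BODY_SITE": "BLANK"},
    ]
    for sample in clean_metadata(samples):
        print(sample["SAMPLE_ID"], sample["TITLE_BODY_SITE"])
```

Running the same function over each study's metadata before merging would give every study the same extra columns, which is one way to keep the merged metadata consistent.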
Very minor, but the term "mapping file" is used in this paragraph without being defined, and it seems like you were just introducing the concept of "metadata".
@mortonjt is going to clean this up
Sounds good!
On (Nov-06-15|12:37), Daniel McDonald wrote:
@@ -40,70 +40,76 @@ We're also going to generate some new files, so let's get them setup.
 ag_pgp_hmp_gg_cleaned_md = agu.get_new_path(agenv.paths['ag-pgp-hmp-gg-cleaned-md'])
-The first step in the process is to merge the individual tables into larger ones, and then to merge the larger tables into a final one. We're not using QIIME's `parallel_merge_otu_tables.py` here as we also need one of the intermediate tables for subsequent processing.
-The first merge we'll do is between the Global Gut and the American Gut.
+We also need to make sure the metadata (the information about the samples) are also merged and consistent. Prior to merge, we're going to add in some additional detail about every sample, such as a column in the mapping file that is the combination of the study title and the body site. We're also going to "generalize" body sites to the type of site they're from (e.g., the back of the hand is just "skin"). This process will also clean the metadata to remove blanks and unknown sample types.
@mortonjt is going to clean this up
Looks good to me, just a couple of comments, nothing blocking. Should be ready to go provided that tests pass.
👍
Yet another way to describe body sites for blanks