Balance summary #172
I'd recommend making Also, don't place an upper limit, as not all taxonomies will necessarily have 6 or 7 tidy levels. (if Looks great!
Finally this passed! @qiyunzhu, would you mind taking a look at this?
@nbokulich @ebolyen do you know how to force the ordering of options on the command line?
@mortonjt the underlying choices are stored as a set, so there's not a way to do that at this time.
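For readers unfamiliar with why a set rules out a fixed ordering, here is a minimal plain-Python sketch. It does not use the framework's API; the `levels` values are made up for illustration. A set carries no insertion order, so the only stable order available is one imposed afterwards, for example by sorting.

```python
# Hypothetical option values, for illustration only.
levels = {"phylum", "class", "order"}

# A set has no defined order, so the order in which the options were
# written above is not recoverable from `levels` itself.
# Sorting imposes a deterministic (alphabetical) order for display.
ordered = sorted(levels)
print(ordered)
```

Alphabetical order is deterministic but is not necessarily the order the author intended, which is the limitation described above.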
@mortonjt Great job! Only trivial comments.
def setUp(self):
    self.results = "results"
    os.mkdir(self.results)
Will it cause a problem not to use a temporary directory here?
Nope - because I tear it down right after creation. The main reason I do this is that it makes debugging a heck of a lot easier.
It assumes that the current directory is clean and doesn't already contain a results folder. This is a safe assumption, particularly given that these directories are created on the fly.
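For comparison, the temporary-directory alternative raised in the question could look roughly like the sketch below. This is not the code in this PR; the class name, test name, and file contents are hypothetical. It uses only the standard library (`tempfile.mkdtemp`, `shutil.rmtree`) and removes the assumption that the working directory is clean.

```python
import os
import shutil
import tempfile
import unittest


class TestWithTempDir(unittest.TestCase):  # hypothetical test case
    def setUp(self):
        # Create a uniquely named scratch directory, so the test never
        # collides with an existing "results" folder.
        self.results = tempfile.mkdtemp(prefix="results_")

    def tearDown(self):
        # Remove the directory and everything written into it.
        shutil.rmtree(self.results)

    def test_writes_index(self):
        index_fp = os.path.join(self.results, "index.html")
        with open(index_fp, "w") as index_f:
            index_f.write("<h1>Balance Taxonomy</h1>\n")
        self.assertTrue(os.path.exists(index_fp))
```

The trade-off is the one mentioned above: a fixed `results` folder is easier to inspect while debugging, whereas a temporary directory has a generated name that changes on every run.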
gneiss/plot/_plot.py
index_f.write('<h1>Balance Taxonomy</h1>\n')
index_f.write('<img src="barplots.svg" alt="barplots">\n\n')
index_f.write(('<h3>Numerator taxa</h3>\n'
               '<a href="numerator.csv">\n'
Trivial: these two HTML lines can be merged into one.
bam done!
num_clade = st.children[NUMERATOR]
denom_clade = st.children[DENOMINATOR]
if num_clade.is_tip():
    num_ = pd.DataFrame(
This num_ variable is not declared before the if block?
That doesn't matter - since it is declared in both the if and the else.
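As a small illustration of the point above (the function and values are hypothetical, not taken from the PR): Python has no separate declaration step, so a name first bound inside an `if`/`else` is available afterwards as long as every branch assigns it.

```python
def describe_clade(is_tip, name):
    # `label` is not defined before the conditional; each branch binds it.
    if is_tip:
        label = "tip:" + name
    else:
        label = "clade:" + name
    # Safe to use here because both paths assigned `label`.
    return label


print(describe_clade(True, "OTU1"))
print(describe_clade(False, "y0"))
```

If only one branch assigned the name, using it after the conditional could raise `UnboundLocalError` on the other path.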
Code looks good, a couple of questions:
- Should NUMERATOR and DENOMINATOR be parameters?
- I agree with @nbokulich's comment, and I will add that, depending on the target gene or dataset, the taxonomy (should it be feature-description?) might have more than 7 levels.
An example of a taxonomy with > 7 levels is the raw SILVA taxonomy. It has the same format as Greengenes but contains more levels, e.g.:
But there are other taxonomies that contain fewer than 7 levels, e.g., RDP taxonomy I believe is 6 levels. Another example (to consider non-taxonomic feature data) would be gene pathway/ontology data. E.g., picrust data reports KEGG pathways that are 3 levels deep.
Ok! I have made the taxonomies integer valued, so that multiple levels can be accessed.
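To make the integer-valued idea concrete, here is a minimal sketch of level lookup without a fixed upper limit. It is not the implementation in this PR: `taxonomy_at_level`, the 0-based indexing, the `;` separator, and the example strings are all assumptions for illustration.

```python
def taxonomy_at_level(taxonomy, level, sep=";"):
    """Return the label at a 0-based integer level, or None if absent.

    Hypothetical helper: no maximum depth is assumed, so 3-level
    pathway strings and 7+ level taxonomies are handled the same way.
    """
    if level < 0:
        raise ValueError("level must be non-negative")
    ranks = [rank.strip() for rank in taxonomy.split(sep)]
    return ranks[level] if level < len(ranks) else None


# Illustrative strings only: a truncated Greengenes-style lineage
# and a 3-level KEGG-style pathway.
lineage = "k__Bacteria; p__Firmicutes; c__Bacilli"
pathway = "Metabolism; Carbohydrate Metabolism; Glycolysis / Gluconeogenesis"

print(taxonomy_at_level(lineage, 1))
print(taxonomy_at_level(pathway, 2))
print(taxonomy_at_level(lineage, 6))
```

Returning `None` for a missing level, rather than raising, is one possible design choice; it lets a single level be requested across features whose annotations have different depths.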
Addresses #181
Still a work in progress - need to figure out how to fix the scaling in the rendering.
This provides better summaries of the individual balances, letting users visualize how a single balance relates to the metadata and which microbes the balance is composed of.
Below is an example of how to run it, along with the resulting visualization.
Help menu
@nbokulich do you have any thoughts on the user interface here?