imp: adds helpers to get metadata/artifacts from URLs in usage examples #658
This can be used as follows:

    def feature_table_filter_samples(use):
        feature_table = use.init_artifact_from_url(
            'feature_table',
            'https://docs.qiime2.org/{epoch}/data/tutorials/moving-pictures/table.qza'
        )
        sample_metadata = use.init_metadata_from_url(
            'sample_metadata',
            'https://data.qiime2.org/{epoch}/tutorials/moving-pictures/sample_metadata.tsv'
        )
        filtered_table, = use.action(
            use.UsageAction(plugin_id='feature_table', action_id='filter_samples'),
            use.UsageInputs(table=feature_table, metadata=sample_metadata,
                            where="[body-site] IN ('left palm', 'right palm')"),
            use.UsageOutputNames(filtered_table='filtered_table')
        )
This is great, thanks for adding this @gregcaporaso! Everything looks reasonable to me, and it seems like you have good test coverage for the common use cases.
Let's let @ebolyen take a final look over this one before it's merged (in case there's any missing context on something that could cause problems).
qiime2/sdk/usage.py
Outdated
        # Obtaining epoch modeled on qiime2.metadata.io.MetadataFileError
        import qiime2

        epoch = qiime2.__release__
I like this approach!
qiime2/sdk/usage.py
Outdated
        return data

    def init_artifact_from_url(self, name: str, url: str,
                               replace_url_epoch: bool = True
I like this too - on the off-chance that we ever don't need the epoch in a given artifact URL, this will be helpful.
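For readers following along, here is a minimal sketch of the placeholder substitution with an opt-out flag, written as a standalone function rather than the method in this PR. The function name `fill_url_epoch`, the explicit `epoch` argument (standing in for `qiime2.__release__`), and the value `'2024.5'` are assumptions for illustration only.

```python
def fill_url_epoch(url: str, epoch: str,
                   replace_url_epoch: bool = True) -> str:
    """Return url with the literal '{epoch}' placeholder filled in.

    Hypothetical helper: `epoch` stands in for qiime2.__release__.
    """
    if not replace_url_epoch:
        # Caller opted out, so the URL is passed through unchanged.
        return url
    return url.replace('{epoch}', epoch)


url = ('https://data.qiime2.org/{epoch}/tutorials/'
       'moving-pictures/sample_metadata.tsv')
filled = fill_url_epoch(url, epoch='2024.5')
untouched = fill_url_epoch(url, epoch='2024.5', replace_url_epoch=False)
```

Using `str.replace` rather than `str.format` means other braces in a URL are left alone, which is probably the safer choice for arbitrary URLs.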
qiime2/sdk/usage.py
Outdated

        return self.init_artifact(name, factory)

    def init_metadata_from_url(self, name: str, url: str, column: str = None,
Love the addition of the column parameter, this is so helpful!
qiime2/sdk/usage.py
Outdated
@@ -922,6 +922,146 @@ def init_format(self, name: str,
        """
        return self._usage_variable(name, factory, 'format')

    def _replace_url_epoch(self, url):
        # Obtaining epoch modeled on qiime2.metadata.io.MetadataFileError
        import qiime2
qiime2 is being imported at the top of this file, so it shouldn't be needed as an additional import here.
qiime2/sdk/usage.py
Outdated
        return url.replace('{epoch}', epoch)

    def _request_url(self, url):
        import requests
It's interesting - the requests module isn't being found when CI is running your tests. I'm wondering if it actually does need to be added at the top of the file, or maybe in the recipe file as well.
Oh, I wonder if it's not actually a framework dependency?
I might recommend using stdlib for this (even if it's slightly unpleasant), since there's not too much going on. If we were doing lots of URL requests, then a 3rd party lib might be a good choice.
Since almost everyone uses requests, it's easy for us to end up in a dependency bind someday, so not using it would avoid that.
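As a rough illustration of the stdlib route suggested above, a fetch helper could be sketched with `urllib.request` as follows. This is not the code that ended up in the PR; the function name `request_url`, the `timeout` default, and the choice to re-raise failures as `ValueError` are all assumptions.

```python
import urllib.error
import urllib.request


def request_url(url: str, timeout: float = 30.0) -> bytes:
    """Fetch url using only the standard library and return the body.

    Hypothetical helper; the error handling here is one possible choice.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read()
    except urllib.error.HTTPError as e:
        # The server answered, but with an error status (e.g. 404).
        raise ValueError(
            f'Could not obtain URL {url}: HTTP {e.code} {e.reason}') from e
    except urllib.error.URLError as e:
        # The request never got a valid response (DNS, refused, etc.).
        raise ValueError(f'Could not reach URL {url}: {e.reason}') from e


# A data: URL exercises the helper without touching the network.
body = request_url('data:text/plain,hello')
```

`HTTPError` is a subclass of `URLError`, so it has to be caught first for the status-specific message to be used.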
Added a couple of additional comments inline re: CI failures.
Big picture questions:
Thanks for the feedback @lizgehret and @ebolyen! @ebolyen: Did you mean
Oh yeah, sorry. That's what I get for doing quick drive-by comments.
@lizgehret, @ebolyen - I partially addressed the comments here.
Updated to use
I agree - generally don't expect them to be overridden, but I don't think there is harm if they are.
That makes sense to me, and I implemented this change. For folks using the previous version from this PR, you would change something like this:

    metadata_url = \
        'https://data.qiime2.org/{epoch}/tutorials/moving-pictures/sample_metadata.tsv'
    use.init_metadata_from_url('md', metadata_url)

to something like this:

    import qiime2
    metadata_url = \
        f'https://data.qiime2.org/{qiime2.__release__}/tutorials/moving-pictures/sample_metadata.tsv'
    use.init_metadata_from_url('md', metadata_url)

I think you're right that we shouldn't take this approach. Something I didn't notice until today is that the Going with
qiime2/sdk/usage.py
Outdated
        Examples
        --------
        >>> import qiime2
        >>> url = f'https://data.qiime2.org/{qiime2.__release__}/tutorials/moving-pictures/sample_metadata.tsv' # noqa: E501
I can't figure out how to get a multi-line string working as input in a doc string. As it stands, I'm disabling the max line length check for flake8. Open to other ideas though.
I think this will do what you want. Assuming I remembered how to show continued expressions in doctest.
    - >>> url = f'https://data.qiime2.org/{qiime2.__release__}/tutorials/moving-pictures/sample_metadata.tsv' # noqa: E501
    + >>> url = (f'https://data.qiime2.org/{qiime2.__release__}/tutorials/'
    + ...        'moving-pictures/sample_metadata.tsv')
I tested @ebolyen's suggestion locally, and that worked for me!
I could have sworn I tried that but I guess not! Or maybe something unrelated was failing when I tried and mistook it for this failing. In any case, thank you both!
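For anyone wanting to check the continued-expression form without a QIIME 2 install, a small self-contained sketch follows. It runs a doctest string through the `doctest` module directly; the `release = '2024.5'` line is an assumed stand-in for `qiime2.__release__`, and the names `DOCSTRING` and `continued_url` are made up for this example.

```python
import doctest

# A doctest whose second example spans two lines. The '...' prompt marks
# the continuation, so neither source line needs a noqa for line length.
DOCSTRING = """
>>> release = '2024.5'  # assumed stand-in for qiime2.__release__
>>> url = (f'https://data.qiime2.org/{release}/tutorials/'
...        'moving-pictures/sample_metadata.tsv')
>>> url.endswith('/2024.5/tutorials/moving-pictures/sample_metadata.tsv')
True
"""

# Parse the string into a DocTest and run it with a quiet runner.
test = doctest.DocTestParser().get_doctest(
    DOCSTRING, globs={}, name='continued_url', filename=None, lineno=0)
runner = doctest.DocTestRunner(verbose=False)
runner.run(test)
results = runner.summarize(verbose=False)
```

Adjacent string literals inside the parentheses are concatenated by Python, so only the first piece needs the `f` prefix.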
I decided to drop the column selection support from this PR for now as I was continuing to run into trouble with it as I developed unit tests and real-world examples and I don't want to hold this PR any more due to this. So, if you were previously doing something like:
You should now do:

    column = 'body-site'
    md = use.init_metadata_from_url('md', metadata_url)
    mdc = use.get_metadata_column('mdc', column, md)
These changes look great @gregcaporaso! Excited to have this available for usage examples moving forward. Unless @ebolyen has any change requests, this looks g2g 👍
Oh hmm that's interesting @Oddant1 - I wonder if this needs to be
So after a
@Oddant1 what QIIME 2 release is your conda env created from? I'm wondering if that is (at least part of) the inconsistency here. --> Yeah it looks like this is related to the conda env being used - I'm not sure if it's pulled from anything else, so I could totally be missing something.
@lizgehret I figured it out and deleted my comment (probably shoulda just edited it). I just needed to do
Thanks for the reviews everyone!
Addresses #657
Still needs:
- init_metadata_column_from_url: this is hooked up by passing column to init_metadata_from_url (update: going to skip this in this PR altogether)
- {epoch} in URLs to replace with the most recent epoch
- is {epoch} a reasonable way to handle this? (update: we decided to use f-strings instead, based on feedback from @ebolyen)
Submitting a draft PR so we can use it in our usage sprint example.