-
Notifications
You must be signed in to change notification settings - Fork 66
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Irwin lastl #5
Irwin lastl #5
Conversation
… /usr/bin/python.
''' | ||
raise ("not yet implemented") | ||
|
||
import tools.last, tools.prinseq |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious about this -- if you "import tools.last, tools.prinseq, scripts" at the top of the python file instead of in the filter_lastal method itself, does anything undesirable happen? If there isn't a good reason, it seems better to put it at the top of the file instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a general matter, if I'm pretty sure a module will be used inside only one function I will sometimes import it there instead of at the top of the file.
Reasons:
- Decreases dependencies. For example, sometimes the import will depend on something that isn't always present so the import will fail. No need to disable the other functions in the module by importing at the top when only one function depends on it. (I've been burned by this.)
- Can improve performance to access the other functions.
- Decrease clutter in the file and in the namespace if there are a zillion things imported by different functions.
In this case, "scripts" should certainly be at the top, tools.prinseq appears in at least one other recipe, and although "last" doesn't appear in any of the other recipes it might get used. So, all of those should be at the top. I've moved them there.
If, as a stylistic convention you'd prefer to always import at the top of the file unless there's a specific need not to then I'm fine with that; just let me know.
On Oct 15, 2014, at 11:29 PM, Daniel Park notifications@github.com wrote:
In taxon_filter.py:
'''
- raise ("not yet implemented")
- import tools.last, tools.prinseq
Curious about this -- if you "import tools.last, tools.prinseq, scripts" at the top of the python file instead of in the filter_lastal method itself, does anything undesirable happen? If there isn't a good reason, it seems better to put it at the top of the file instead?—
Reply to this email directly or view it on GitHub.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, let's play it by ear and we can switch things up later if need be. I see you pushed it to the top anyway. My own thinking on this was largely to explicitly declare all dependencies up front. In particular, if anything in this file required us to import some special package (and perhaps we forgot to put it in requirements.txt for auto-install), then we'd know right off the bat simply by importing the file (we wouldn't have to actually execute the methods). In theory, importing a module really shouldn't have much time cost, but in practice, I have seen some big ones (e.g. scipy) take a few seconds. And especially in a context where we're putting multiple different commands in a single python file (instead of one command per python script), your approach makes sense.
We can revisit this later after we implement a few more things...
This looks great, and the new test looks good. I think most pull requests should not take this long, but I think we worked out a few kinks in the process of this first real try. Thanks for being the guinea pig for this! |
Merge in irwin-lastl, the implementation of wrappers around lastal and prinseq.
Incidentally, this closes issue #3 |
No description provided.