[MRG] initial refactor of compute command (and associated test module) #734
Random questions: Should we put things in Is |
Codecov Report
@@            Coverage Diff             @@
##           master     #734      +/-   ##
==========================================
+ Coverage   88.68%   88.71%   +0.03%
==========================================
  Files          29       30       +1
  Lines        4604     4617      +13
  Branches       45       45
==========================================
+ Hits         4083     4096      +13
  Misses        519      519
  Partials        2        2
Firstly, Secondly, I was thinking we would move all the functions out of sourmash compute into a module called compute.py and unit test them. It might be complicated: we currently pass the command-line args around inside the class as a dictionary, but the standalone functions wouldn't have them, so we would need separate keywords for them. It shouldn't be that bad. We should keep the compute command inside commands.py so we have all commands in one place. Lastly, if we follow this pattern of making compute a separate file, should we also have commands_compute_10x.py, along with the corresponding tests?
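To make the "separate keywords" idea concrete, here is a minimal sketch. It assumes a hypothetical helper named `check_merge_options` and a wrapper that reads `args.merge` and `args.output`; neither name comes from this PR. Only the error message is taken from the diff under review.

```python
import argparse


def check_merge_options(*, merge=None, output=None):
    """Hypothetical helper: takes plain keywords instead of the args object."""
    if merge and not output:
        raise ValueError("must specify -o with --merge")


def compute(args):
    """Thin command wrapper: unpack the argparse namespace in one place."""
    check_merge_options(merge=args.merge, output=args.output)


# The helper can be unit tested without building any command-line arguments.
check_merge_options(merge="merged", output="file.sig")
compute(argparse.Namespace(merge=None, output=None))
```

With this split, the command itself could stay in commands.py while the keyword-only helpers live wherever the sketching code ends up.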
OK, in 4d1f0fa I made pathos and pysam optional, using

Assuming tests pass and there aren't any more basic structural things to tackle, I'd like to suggest that this PR get merged soon, and then we can tackle better unit testing and other things with the code that's isolated into these modules. This will minimize conflicts with and from other ongoing work. Thoughts @luizirber?

In response to @pranathivemuri's last comment: I'd like to avoid proliferating files as much as possible, so I think we should keep everything associated with making new sketches in one place rather than breaking things out further.
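For readers unfamiliar with optional dependencies, one common pattern is sketched below. This is an illustration only and not necessarily what 4d1f0fa does; `optional_import` and `require` are hypothetical names.

```python
import importlib


def optional_import(name):
    """Return the named module, or None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# Assumption: looked up once at import time; either value may be None.
pysam = optional_import("pysam")
pathos = optional_import("pathos")


def require(module, name):
    """Fail with a clear message only when the optional feature is used."""
    if module is None:
        raise ImportError(f"{name} is required for this option; please install it")
    return module
```

Code paths that need the BAM or multiprocessing support would call `require(pysam, "pysam")` at the point of use, so everything else keeps working without those packages installed.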
(actually, the better (& real) reason for merging now and then working on things in a new PR is that future diffs will be much smaller and reviewers will be able to tell which code was actually modified :)
I agree, please feel free to merge it then!
(still want to have @luizirber sign off on it :)
@olgabot if you have time to do a review I'd appreciate it :). There is a failing test that I need to fix, but other than that all is ready.
607664c to 92cde33
Great start! It would be great if the helper functions of compute were also refactored out separately as it makes the code a little confusing to read right now (at least it was for me when I was first exposed to the codebase)
sourmash compute file1.fa file2.fa --merge merged -o file.sig
=> creates one output file file.sig, with all sequences from
file1.fa and file2.fa combined into one signature.
"""
"""
sourmash compute --input-is-10x --barcodes barcodes.tsv possorted_genome_bam.bam
"""
error("must specify -o with --merge")
sys.exit(-1)

def make_minhashes():
Can these functions be refactored into separate functions?
@olgabot those are excellent refactoring suggestions! but per #734 (comment), the plan is to do them in a separate PR. The one thing git doesn't handle that well is moving large blocks of code around (which is understandable :) and it's quite challenging to review diffs against big reorgs.
Fixes #733.
Additional refactoring and tests will be the subject of another PR - this one aims to get the code in roughly the right place first!
make test: Did it pass the tests?
make coverage: Is the new code covered?
without a major version increment. Changing file formats also requires a major version number increment.
changes were made?
cc @pranathivemuri @luizirber