* SCP-2611 Correct gene models for dense matrices and account for single column feature files #145
Codecov Report
@@             Coverage Diff              @@
##           development    #145    +/-   ##
=============================================
+ Coverage        68.83%   69.00%   +0.17%
=============================================
  Files               22       22
  Lines             2663     2678     +15
=============================================
+ Hits              1833     1848     +15
  Misses             830      830
devonbush
left a comment
Code looks good. Please add a test case confirming that gene models get created for genes with no non-zero expression scores (i.e., a test case that would fail without this fix).
devonbush
left a comment
Needs a test to cover the single-column gene file case, plus one minor code format suggestion.
Co-authored-by: Devon <dbush@broadinstitute.org>
# models maybe less than the batch size.
if len(gene_models) > 0:
    self.create_models(
        [], [], None, None, gene_models, data_arrays, num_processed, True
Thanks for taking the time to explain this to me, Eno!
There was a regression in dense matrix parsing: gene models were created only for genes that had expression data. This PR fixes it by creating a gene model for every gene, regardless of whether expression data is present.
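To illustrate the intended behavior, here is a minimal sketch, not the actual ingest code. The function `build_models`, the dict layouts, and the sample gene and cell names are all hypothetical; the only point is that a gene whose scores are all zero still gets a gene model, while expression data is stored only for non-zero scores.

```python
from typing import Dict, List, Tuple


def build_models(
    cell_names: List[str], rows: List[Tuple[str, List[float]]]
) -> Tuple[List[Dict], List[Dict]]:
    """Return (gene_models, data_arrays) for the rows of a dense matrix.

    Hypothetical helper: the real parser's names and model shapes differ.
    """
    gene_models: List[Dict] = []
    data_arrays: List[Dict] = []
    for gene, scores in rows:
        # Always create the gene model, even when every score is zero.
        gene_models.append({"name": gene})
        # Only non-zero scores are kept as expression data.
        nonzero = [(c, s) for c, s in zip(cell_names, scores) if s != 0]
        if nonzero:
            data_arrays.append(
                {
                    "gene": gene,
                    "cells": [c for c, _ in nonzero],
                    "values": [s for _, s in nonzero],
                }
            )
    return gene_models, data_arrays


# "Gad1" has no non-zero scores but should still produce a gene model.
models, arrays = build_models(
    ["cell_a", "cell_b"],
    [("Sox2", [0.0, 1.5]), ("Gad1", [0.0, 0.0])],
)
```

A regression test along these lines would fail on the old behavior, since `Gad1` would be missing from `models`.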
There was also a second regression: the feature file in an MTX bundle can have a single column. As stated by 10x Genomics, we now set the gene name to the gene ID when only one column is present.
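The single-column fallback can be sketched as follows. This is an illustration under assumptions rather than the code in this PR: `parse_feature_row` is a hypothetical name, and the sketch assumes a tab-delimited feature file with the gene ID in the first column.

```python
from typing import Tuple


def parse_feature_row(line: str) -> Tuple[str, str]:
    """Return (gene_id, gene_name) for one line of a feature file.

    Hypothetical helper; assumes tab-delimited columns with the ID first.
    """
    columns = line.rstrip("\n").split("\t")
    gene_id = columns[0]
    # With only one column, fall back to using the gene ID as the name.
    gene_name = columns[1] if len(columns) > 1 else gene_id
    return gene_id, gene_name


multi_column = parse_feature_row("ENSG00000243485\tMIR1302-2HG\tGene Expression\n")
single_column = parse_feature_row("ENSG00000243485\n")
```

Here `single_column` carries the same value for both ID and name, which is the case the reviewer asked to be covered by a test.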