Filter to expressed variants #67
@@ -748,13 +800,52 @@ def load_neoantigens(self, patients=None, variant_type="snv", merge_type="union"
            dfs[patient.id] = df_epitopes
        return dfs

    def _load_single_patient_isovar(self, patient, variant_type, merge_type, epitope_lengths):
I believe the content of this is all the same; just moved.
    ref=row["ref"],
    alt=row["alt"],
    ensembl=genome)
if variant in variants:
This is going to go through all of `variants` each time, no?
As opposed to caching the output? Yeah, I wanted to try to simplify by not caching this.
Oh, maybe you mean that I'm iterating through the `VariantCollection` on each check; good point.
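The concern in this thread can be sketched in isolation. What follows is a minimal stand-in, not the code from this PR: `Variant` is a hypothetical namedtuple rather than the real variant class, `rows` stands in for the rows of the isovar `DataFrame`, the `"chr"` and `"pos"` keys are assumed column names, and `filter_to_expressed` is an illustrative name. The idea is to build a set from the variants once, so each per-row check is a hash lookup instead of a walk through the whole collection.

```python
from collections import namedtuple

# Hypothetical stand-in for a variant record; not the class used in the PR.
Variant = namedtuple("Variant", ["contig", "start", "ref", "alt"])


def filter_to_expressed(variants, rows):
    """Keep only the variants that also appear in the expression rows."""
    # Build the lookup set once; each later membership check is then a
    # hash lookup rather than a scan over every variant.
    candidates = set(variants)
    expressed = set()
    for row in rows:
        # "chr" and "pos" are assumed column names for this sketch.
        variant = Variant(row["chr"], row["pos"], row["ref"], row["alt"])
        if variant in candidates:
            expressed.add(variant)
    # Return the survivors in the same order as the input variants.
    return [v for v in variants if v in expressed]
```

This only works if the variant objects hash and compare by value, which holds for the namedtuple here and would need checking for the real class.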
My thoughts on this are I'm not sure it's worth building in the isovar filtering as special case to the |
@arahuja curious why not?
We spoke offline; @arahuja clarified that the |
@arahuja I'm not entirely clear on the best way to do this, since my Perhaps I should replace |
Perhaps? I think needing cohort would be odd, but it somewhat shows that some of the |
@arahuja I agree re pushing more functions to |
f9bd2b0 to ecc475c
12 days later, some updates to this PR:
Back to you @arahuja
@@ -215,10 +223,16 @@ def __init__(self,
                  join_how="inner",
                  check_provenance=False,
                  polyphen_dump_path=None,
-                 pageant_coverage_path=None):
+                 pageant_coverage_path=None,
+                 variant_type="snv",
any reason this is a cohort level property now?
Yeah, it became cumbersome and wordy to pass it around everywhere. Agree it's a little weird--
Yea I think I prefer the alternative, but also see the issue that you can pass one to `load_variants` and a different set to `load_effects`, which could be confusing, so I'm fine with this
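The trade-off discussed in this thread can be illustrated with a small sketch. Everything here is hypothetical: `CohortSketch` is not the real cohort class, and the loader bodies are placeholders that only report which setting they saw. It shows why storing `variant_type` on the object keeps `load_variants` and `load_effects` in agreement, whereas per-call arguments would let a caller pass different values to each.

```python
class CohortSketch:
    """Hypothetical stand-in for the cohort class; not the real constructor."""

    def __init__(self, variant_type="snv"):
        # Stored once, so every loader reads the same setting.
        self.variant_type = variant_type

    def load_variants(self):
        # Placeholder body: reports the setting instead of loading data.
        return "variants[{}]".format(self.variant_type)

    def load_effects(self):
        # Reads the same attribute, so it cannot disagree with load_variants.
        return "effects[{}]".format(self.variant_type)


cohort = CohortSketch(variant_type="indel")
summary = (cohort.load_variants(), cohort.load_effects())
```

The cost is the one raised above: `variant_type` becomes a property of the whole cohort, so switching types means changing the attribute or building a second object.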
777bf4d to 946eab4
In this PR:
- `df_isovar` logic out of `load_neoantigens` into a separate `_load_single_patient_isovar`, to use it outside the neoantigen context.
- `_filter_variants_to_expressed` to filter out any variant that isn't present in the `isovar` `DataFrame`.
- `variant_qc_filter`.
- @arahuja