-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add cellhash demux results #140
Conversation
This reverts commit 9fd15e2.
…ellhash_demux_results
Updating with some comments on the way this has changed since the initial draft:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me. I just had one minor suggestion about staying consistent with the use of the cellhash_techs
, but I'm not 100% sure if that will work as I think it will. I'm also not crazy about the organization of the addition of processes where the demux results are added to the SCE's being incorporated to the same script where there is generation of SCEs and filtering, since now things are definitely not as linear as they used to be, but I can't think of a better option since they are all related to sce processing.
main.nf
Outdated
merged_sce_ch = generate_merged_sce(feature_rna_quant_ch) | ||
feature_sce_ch = generate_merged_sce(feature_rna_quant_ch) | ||
.branch{ // branch cellhash libs | ||
cellhash: it[0]["feature_type"] == "cellhash" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cellhash: it[0]["feature_type"] == "cellhash" | |
cellhash: it[0]["technology"] in cellhash_techs |
Would you maybe be able to do this instead? I just found it generally confusing to map back where this was coming from while being able to use the cellhash_techs
that you defined earlier seems like it would be cleaner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, this won't work (this was what I tried to use first!). meta.technology
is set for the "primary" technology after we add feature data, so it is just 10Xv3
or whatever. generate_merged_sce
adds meta.feature_type
, which retains the kind of feature that us in the altexp of the SCE, so we have to use that.
I could maybe modify generate_merged_sce
to add feature_technology
without changing the string... that might be a better solution, come to think of it... I will try that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, I could use it[0]["feature_meta"]["technology"]
as that is already set by the generate_merged_sce
workflow (all of meta
from feature mappings ends up in meta.feature_meta
when the objects are combined). Do you have any thoughts on which is less confusing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that option makes sense (or at least to me). That way you would be able to do it[0]["feature_meta"]["technology"] in cellhash_techs
correct? What was originally confusing to me was where the value of cellhash
was coming from.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that is what it would be. I will test to make sure it works as expected.
This PR adds the basic demux results from cellhash tables to the script integrating multiplexed data.
It does this by adding flags to the
add_demux_sce.R
script to indicate whether the demultiplexing should be performed, along with the option to include a file containing a table of the cellhash sample ids. That file is found in the newcellhash_pool_file
param, which is set in theccdl
profile.This file is not currently documented, so we should probably add that either before merging or as a next step, which might include more documentation about running multiplexed samples.
One other future thought is that while we don't currently have any samples that don't have cellhash data, that is possible. It is also possible (maybe more likely?) that we could get cellhash samples without matched bulks samples for genetic demultiplexing. Right now the workflow doesn't really support that at all. Changing that is probably also related to #100.
Leaving this as a draft for now.