Using CellNet downstream of salmon #9

Closed
ghost opened this issue Feb 28, 2019 · 4 comments

Comments


ghost commented Feb 28, 2019

Thank you for developing CellNet. I am trying to apply CellNet to some of my RNA-seq data, following your Nature Protocols paper. I have a stable pipeline on my cluster to run salmon, and I was wondering whether there is a function within CellNet that takes the quant.sf files and continues from there. In other words, can we compute just the last part of cn_salmon locally?

Collaborator

pcahan1 commented Feb 28, 2019

Hi,

This should be possible. You will want to create an object similar to the one produced by running cn_salmon. This will entail calling:

salmon_load_tranEst, which will load the transcript estimates from the quant.sf file.

gene_expr_sum, which will provide gene-level estimates

trans_rnaseq, which will normalize by size

Take a look at the cn_salmon function definition so that you build a list with the expected element names.
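For readers who want to see what the first two steps amount to conceptually, here is a minimal Python sketch. It is not CellNet's implementation and does not call the R functions named above; it only assumes the standard tab-separated quant.sf layout written by salmon (columns Name, Length, EffectiveLength, TPM, NumReads). The helper names `load_quant_sf` and `sum_to_genes`, the transcript-to-gene mapping, and the toy data are all hypothetical.

```python
import csv
import io
from collections import defaultdict


def load_quant_sf(handle):
    """Read a salmon quant.sf table into {transcript_id: NumReads}."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["Name"]: float(row["NumReads"]) for row in reader}


def sum_to_genes(tx_counts, tx2gene):
    """Sum transcript-level counts into gene-level totals."""
    gene_counts = defaultdict(float)
    for tx, count in tx_counts.items():
        gene = tx2gene.get(tx)
        if gene is not None:  # transcripts without a gene mapping are skipped
            gene_counts[gene] += count
    return dict(gene_counts)


# Tiny made-up quant.sf, only to show the shape of the data.
example = io.StringIO(
    "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
    "tx1\t1000\t800.0\t10.0\t120.0\n"
    "tx2\t1500\t1300.0\t5.0\t80.0\n"
    "tx3\t900\t700.0\t2.5\t30.0\n"
)
tx2gene = {"tx1": "GeneA", "tx2": "GeneA", "tx3": "GeneB"}  # hypothetical mapping

gene_counts = sum_to_genes(load_quant_sf(example), tx2gene)
print(gene_counts)
```

In practice you would still go through salmon_load_tranEst and gene_expr_sum so that the resulting object matches what cn_salmon produces; the sketch only shows the kind of aggregation involved.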


rebekabato commented Sep 30, 2020

Hi Patrick!

I am also running CellNet on salmon quant.sf files. I created a single merged file containing gene-level estimates (raw read counts) for all my samples. I would like to use trans_rnaseq() to normalize my data; however, I can't get around the "total" argument that this function takes. I have read the manual, but I still don't understand what number I am supposed to use here or how to calculate it.

When I was googling around at some point I found that total=1e5 was applied in this function, but I believe I am supposed to customize this value to my experiment.

Would you please explain how to obtain the right value for the total argument in the trans_rnaseq function? I appreciate your time to answer!

Collaborator

pcahan1 commented Sep 30, 2020

Hi,

If you are using any of our pre-trained cnProc files, then yes, use total=1e5 so that your query data are compatible with the training data. If you are generating your own cnProc, then you should set it to a value that is less than or equal to the minimum expected total read count of any of your training or query samples. Thank you for your interest in CellNet. We hope to have an updated and more robust version available soon.
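The rule above can be checked with a few lines of code. The Python sketch below assumes that "total read count" means the sum of gene-level raw counts per sample, which is an interpretation rather than something stated in this thread. The sample names, the counts, and the helper `sample_totals` are made up for illustration.

```python
def sample_totals(counts):
    """counts: {sample: {gene: raw_count}} -> {sample: total raw count}."""
    return {sample: sum(genes.values()) for sample, genes in counts.items()}


# Hypothetical gene-level raw counts for training and query samples.
counts = {
    "train_1": {"GeneA": 250_000, "GeneB": 150_000},
    "train_2": {"GeneA": 90_000, "GeneB": 210_000},
    "query_1": {"GeneA": 120_000, "GeneB": 60_000},
}

totals = sample_totals(counts)

# Custom cnProc: any value of `total` up to this bound satisfies the rule.
upper_bound = min(totals.values())

# Pre-trained cnProc: total is fixed at 1e5, so flag samples that fall short.
below_pretrained_total = [s for s, t in totals.items() if t < 1e5]

print(upper_bound, below_pretrained_total)
```

With a pre-trained cnProc the value stays at 1e5 regardless of the bound; the second check only highlights samples whose depth may be too low for that setting.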

pcahan1 closed this as completed Sep 30, 2020
@rebekabato

Thanks for your prompt reply! Would you please elaborate on what kind of normalization happens at this step? I can see that you previously wrote it normalizes by size, but I would love to understand the process a bit more deeply. Or if you could direct me to some online source where I can read about this, that is fine too. Thanks so much!
