How does a ULM on LFCs from bulk RNA-seq work for TF activity inference? #21
Comments
Hi @adamklie, glad you find the package useful! As you know, decoupler takes an input matrix of gene expression (GEX) and a prior knowledge network (Net), which we transform into matrix format internally. In your case, your GEX is made up of the contrasts' statistics between conditions (if you only have one contrast, it is only one row). When you run […] To briefly answer your question: yes, we fit a separate model for each TF in […] Hope this is helpful! Feel free to ask more questions if needed.
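To make the "one model per TF" idea concrete, here is a minimal sketch in plain Python. It is not decoupler's implementation: the gene names, LFC values, and the toy network are hypothetical, and it assumes that genes which are not targets of a TF get a weight of zero and that the activity score is the t-value of the fitted slope. Each data point in a fit is one gene, with its LFC as the response and its TF weight as the explanatory variable.

```python
import math

def ulm_t_value(stats, weights):
    """t-value of the slope when regressing gene statistics on TF weights."""
    n = len(stats)
    mean_x = sum(weights) / n
    mean_y = sum(stats) / n
    sxx = sum((x - mean_x) ** 2 for x in weights)
    syy = sum((y - mean_y) ** 2 for y in stats)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(weights, stats))
    r = sxy / math.sqrt(sxx * syy)
    # For simple linear regression, the slope's t-value follows from r.
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Hypothetical toy data: one contrast, i.e. one row of the input matrix.
lfc = {"G1": 2.0, "G2": 1.5, "G3": -1.0, "G4": 0.1, "G5": -0.2, "G6": 0.3}

# Hypothetical network: TF -> {target gene: weight (+1 activates, -1 represses)}.
net = {
    "TF_A": {"G1": 1.0, "G2": 1.0, "G3": -1.0},
    "TF_B": {"G4": 1.0, "G5": 1.0, "G6": -1.0},
}

genes = sorted(lfc)
stats = [lfc[g] for g in genes]

activities = {}
for tf, targets in net.items():
    # Assumption: non-target genes get a weight of zero.
    weights = [targets.get(g, 0.0) for g in genes]
    # One separate univariate fit per TF.
    activities[tf] = ulm_t_value(stats, weights)

for tf, score in activities.items():
    print(tf, round(score, 2))
```

In this toy example TF_A gets a large positive score because its activated targets go up and its repressed target goes down, while TF_B's targets barely move and its score stays close to zero.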
Thank you so much for the detailed explanation and the figure! Makes it very clear how this is working. Now I'm trying to think about when you might expect one to work better than the other. I would expect that many of the explanatory TFs would have correlated weights that might make it harder to fit a […]
Very good points. In the end there is no free lunch; there is a tradeoff. The advantage of […] The advantage of […]
dc.check_corr(net, mat=mat)
If you see that some TF pairs have high correlations (> 0.9), you should definitely double-check how the obtained activities look for these when using […] Therefore, if you are not sure which one to pick, you can always use the […] Hope this is helpful!
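As a rough illustration of what "correlated TF pairs" means here, the sketch below computes the Pearson correlation between the weight vectors of each pair of TFs and flags pairs above 0.9. It is only a hand-rolled approximation of the idea, not the code behind dc.check_corr; the network, gene names, and weights are hypothetical, and it again assumes non-target genes get a weight of zero.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation between two equally long weight vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical network: TF_A and TF_B share almost the same targets and signs.
net = {
    "TF_A": {"G1": 1.0, "G2": 1.0, "G3": -1.0, "G4": -1.0},
    "TF_B": {"G1": 1.0, "G2": 1.0, "G3": -1.0, "G4": -0.8},
    "TF_C": {"G5": 1.0, "G6": -1.0, "G7": 1.0},
}

# Put every TF on the same gene axis (assumption: zero for non-targets).
genes = sorted({g for targets in net.values() for g in targets})
vectors = {tf: [t.get(g, 0.0) for g in genes] for tf, t in net.items()}

# Flag TF pairs whose weight vectors are highly correlated.
flagged = []
for tf1, tf2 in combinations(sorted(vectors), 2):
    r = pearson(vectors[tf1], vectors[tf2])
    if abs(r) > 0.9:
        flagged.append((tf1, tf2, round(r, 3)))

print(flagged)
```

Pairs that show up in such a list are the ones whose activities are worth comparing by hand, since a model that includes both TFs at once has little information to tell them apart.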
So helpful. Thanks again for taking the time to explain, I really appreciate it!
If we only care about regulatory activity in a list of about 30-50 genes (and their logFCs), can we still run consensus? Or should we stick to ORA in that case?
Hi @gjones3339, without a proper background of genes I would recommend just running ORA (there we assume a background of ~20k genes, i.e. all the protein-coding genes). For a list of genes you can easily run ORA with […]
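For intuition on why a short gene list is still enough for ORA, here is a small sketch of an over-representation test written as an upper-tail hypergeometric probability. It is an assumption that this matches the package's ORA closely; the details may differ. The gene names, the TF target set, and the helper name ora_pvalue are hypothetical, and the background size of 20,000 is taken from the comment above.

```python
from math import comb

def ora_pvalue(n_background, n_targets, n_query, n_overlap):
    """Upper-tail hypergeometric p-value: P(overlap >= n_overlap)."""
    max_overlap = min(n_targets, n_query)
    total = comb(n_background, n_query)
    tail = sum(
        comb(n_targets, k) * comb(n_background - n_targets, n_query - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total

# Hypothetical inputs: a 40-gene list of interest and a TF with 100 targets.
query = {f"G{i}" for i in range(40)}
tf_targets = {f"G{i}" for i in range(35, 135)}
n_background = 20_000  # assumed background of ~20k protein-coding genes

overlap = query & tf_targets
p = ora_pvalue(n_background, len(tf_targets), len(query), len(overlap))
print(len(overlap), p)
```

With 40 genes and 100 targets out of 20,000, the expected overlap by chance is about 0.2 genes, so even an overlap of 5 is strongly over-represented. The logFC values themselves are not used; only membership in the list matters.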
Describe your question
Thanks once again for putting together this package. I've found it super usable and useful so far. I just have hopefully a quick question about the underlying method for analysis of bulk RNA-seq data.
I'm getting some awesome results from fitting a univariate linear model to log fold changes (LFC) from DESeq2 using the DoRothEA network, but I don't really understand how I'm getting these TF activities. I follow the general idea that "the observed molecular readouts in mat are the response variable and the regulator weights in net are the explanatory ones," but I guess I don't really understand what these are in my specific context and how I end up with an activity score for each TF. I'm not sure I even follow what each data point is in this context. Am I fitting a separate model for each TF?
If someone could help me understand this better or point me to the details (I apologize in advance if I missed them), that would be amazing!