You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
We are generally interested in detecting perturbed regulator proteins and their networks using causal networks in a de-novo fashion, which would in theory make sense to apply inverse CARNIVAL to solve. However, I am a unclear on how to preprocess the input genes and network to get a reliably solvable problem. My questions are generally:
Is there a current general guideline for how to preprocess inputs for CARNIVAL based off of single-contrast DE runs (either limma, DESeq2 or edgeR) as well as PKNs?
Is there also a way to estimate whether the input PKN+measurement combination may not be solvable?
Below I discuss what I have tried so far.
I noticed that in the available examples, the PKNs and measurement inputs are filtered by some subset of genes, usually TFs and pathway genes. (Below)
These often have pre-loaded networks and TFs, but I am trying to develop a pipeline. I tried two different methods based on those and got very different outcomes in terms of converging on a solution. I am starting with 17.5k gene statistics (based on running limma) as measurements. I then used the following:
This is adapted more or less from the p53 repository
regulon_df<-OmnipathR::import_tfregulons_interactions(organism=9606)
network<-regulon_df|>dplyr::filter(is_directed==1, is_inhibition+is_stimulation==1,
dorothea_level%in% c("A", "B", "C")) |>dplyr::transmute(source=source_genesymbol,
interaction=-1*is_inhibition+is_stimulation,
target=target_genesymbol) |>dplyr::distinct()
TF_activities<-decoupleR::run_viper(gene_t_mat, dorothea::dorothea_hs_pancancer,
.source="tf", .target="target", .mor="mor")
measurements<-TF_activities[,c("source", "score")] |>tibble::deframe()
pathway_activities<-progeny::progeny(as.matrix(measurements), scale=TRUE, top=100, perm=1000, organism="Human")
# snip: convert these activities to progeny_weights via the progenyMembers.RdataCARNIVAL::generateLpFileCarnival(measurements=measurements,
priorKnowledgeNetwork=network,
weights=progeny_weights)
The VIPER/Omnipath method is solvable quickly with CBC, but as far as I am aware from what I am reading, is somewhat out of date in comparison to the previous method.
Thank you for your help!
The text was updated successfully, but these errors were encountered:
We are not using all DE genes as "measurements" in CARNIVAL. CARNIVAL models how sinaling leads to the regulators of differential gene expression, ie. to transcription factors. This is also why our prior knowledge network is a protein-protein interaction network and not a gene regulatory network.
About the solvability of problems:
I cannot think of a case when the CARNIVAL optimization would lead to infeasibility, i.e. when some constraints are contradictory. I guess what you observe is the numerical issues with CBC. We also observed strong limitation with the non-commercial optimizers and thus we recommend Gurobi or cplex that are free in academia.
If you would work with this in a non-academic setting, I would suggest to try to eliminate nodes and edges from the prior knowledge network. We used a couple of strategies in cosmos (https://github.com/saezlab/cosmosR), e.g.
(1) remove nodes from the network that are not expressed in any condition based on the RNAseq data
(2) remove nodes that are not connected with a directed path to the measurement (they are not observable anyways)
(3) remove nodes that are further than N number of steps from the inputs/measurements. E.g. N = 8. The rationale here is that the more steps we do on the network the less likely that our prediction makes any sense.
From these, mostly (3) has the strongest impact on the computational time. Maybe you need to go down to N=5 for CBC.
We are generally interested in detecting perturbed regulator proteins and their networks using causal networks in a de-novo fashion, which would in theory make sense to apply inverse CARNIVAL to solve. However, I am a unclear on how to preprocess the input genes and network to get a reliably solvable problem. My questions are generally:
Below I discuss what I have tried so far.
I noticed that in the available examples, the PKNs and measurement inputs are filtered by some subset of genes, usually TFs and pathway genes. (Below)
https://github.com/saezlab/transcriptutorial
https://github.com/saezlab/CausalToxNet
https://github.com/harrytr/p53
These often have pre-loaded networks and TFs, but I am trying to develop a pipeline. I tried two different methods based on those and got very different outcomes in terms of converging on a solution. I am starting with 17.5k gene statistics (based on running limma) as measurements. I then used the following:
ULM/collectri
VIPER/Omnipath
This is adapted more or less from the p53 repository
The VIPER/Omnipath method is solvable quickly with CBC, but as far as I am aware from what I am reading, is somewhat out of date in comparison to the previous method.
Thank you for your help!
The text was updated successfully, but these errors were encountered: