-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pretty low GRN scores obtained #8
Comments
add:every time running the grnscores(),warning messages were generated as follows.Hope this helps.
|
Hi, I am sorry for the delay!
Thanks for using CellRouter. Regarding your first question:
1) There is nothing wrong with your pipeline. The values are low though. I
think this could be due to dropouts in your data because the dynamics of
your regulators/target genes seem heterogeneous. What you could do is the
following: in the processtrajectory function, increase "neighs" from 1 to 2
or 3. This will use the second or third neighborhood of each cell along the
trajectory to smoothen the kinetic trend. Maybe, by chooing neighs=1 in
your dataset does not provide enough information for a good smoothing. Let
me know if that works.
*cellrouter <- processtrajectories(cellrouter, genes2use,
path.rank=ranks[3], num.cells = 3, neighs = 2) *
What could be also happing is that you may not have genes that
monotonically increase or decrease in expression along your trajectory but
are rather more transiently expressed.
Regarding your second question:
2) I do not a larger k will affect the grnscore that much. It should not. I
am trying to optimize the clustering step that CellRouter does for larger
datasets. For now, choosing a larget k is totally fine. No, the selected
path will not represent the entire process of differentiation. I am working
on now to also include other paths dynamics. The trajectory identified
represents the one where each cell along a particular path is more similar
to each other, according to our optimization algorithm. There might be many
other paths as well. I am extending the platform in this direction as well.
Please, keep me posted on how it goes!
Thanks!
2018-04-20 9:10 GMT-04:00 JianmeiZhong <notifications@github.com>:
… Hi,
I work through most part of the analysis pipeline but the results of GRN
scores obtained via grnscores() in specified paths SP_10.SP_8 as tSNE and
barplot shows.the grn scores have different levels compared to yours
presented in several exemplary analysis,that's level of 0.0* versus 0.*
.Sometimes,the GRNscores plot has one or two genes with low scores in both
directions.I don't know if there's some wrong configuration in my
pipeline.If my source and endpoints are correct and reliable,low scores
level are possible for some datasets and it's available?
[image: barplot]
<https://user-images.githubusercontent.com/37535747/39050789-438495c2-44d9-11e8-97ba-bd6a4ea498d0.png>
another questions asking for your guide :)
known knowledge this project based on are less subpopulations and
increasing number of cells along differentiation process.So,k=120 was set
according to my understanding in function buildGRN() and creatKNN() to
obtain relatively less subpopulations as tSNE plot shows.k value is
different a lot from yours.It might be a cause of low GRN scores?And if a
path didn't go through every subpopulation in non-branched cells
differentiation,gene expression dynamics with this path can represent
entire process of changes?
[image: tsne]
<https://user-images.githubusercontent.com/37535747/39050778-385bd228-44d9-11e8-9e05-571bf8013e36.jpg>
Thank a lot !
Jm Zhong
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#8>, or mute the thread
<https://github.com/notifications/unsubscribe-auth/AJqUR9y39O0haayxIve1In_OXKudlT-Yks5tqd4-gaJpZM4Tdb6B>
.
--
Edroaldo
|
OK. I will check that out as well. I have to seen this msgs erlier.
Thanks a lot for your comments!
2018-04-23 11:11 GMT-04:00 JianmeiZhong <notifications@github.com>:
… add:every time running the grnscores(),warning messages were generated as
follows.Hope this helps.
[1] 0
There were 20 warnings (use warnings() to see them)
warnings()
Warning messages:
1: In if (length(bla) == 1 & names(bla) == r) { ... :
the condition has length > 1 and only the first element will be used
2: In if (length(bla) == 1 & names(bla) == r) { ... :
the condition has length > 1 and only the first element will be used
3: In if (length(bla) == 1 & names(bla) == r) { ... :
the condition has length > 1 and only the first element will be used
4: In if (length(bla) == 1 & names(bla) == r) { ... :
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#8 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AJqURyeBCApYEN8C2Lrmh08QHZfj68SGks5tre8OgaJpZM4Tdb6B>
.
--
Edroaldo
|
Thanks! I got some very useful answers from your nice explications. :) |
or if there's any way that subpopulation id of each cell can be customized coercively?That will be a absolutely sole path which all cells can be included. grin: |
How many genes are you using for GRN reconstruction? I noticed that, for
some datasets, a larger GRN produces more interesting results because each
regulator has more target genes. If you have enough memory, you can do it
using all genes. If not, you can select more genes from the pca analysis at
the top of your script to have more genes in the genes2use variable.
Let me know if you decide to try it.
Thanks!
…On Mon, Apr 23, 2018, 1:03 PM JianmeiZhong ***@***.***> wrote:
I tried neigh=2 in another sample,nothing seems changed.
[image: sample2]
<https://user-images.githubusercontent.com/37535747/39141763-466ab774-475b-11e8-920f-97ce4143a0b1.png>
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#8 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AJqUR5HAwFz1cgHtNcQiQUSfuEhRf-D3ks5trgligaJpZM4Tdb6B>
.
|
Also, would you mind to show me the commands you are using to generate
these plots. Just want to double check how your defining the cutoffs...
Thanks!
On Tue, Apr 24, 2018, 7:42 AM Edroaldo Lummertz da Rocha <edroaldo@gmail.com>
wrote:
… How many genes are you using for GRN reconstruction? I noticed that, for
some datasets, a larger GRN produces more interesting results because each
regulator has more target genes. If you have enough memory, you can do it
using all genes. If not, you can select more genes from the pca analysis at
the top of your script to have more genes in the genes2use variable.
Let me know if you decide to try it.
Thanks!
On Mon, Apr 23, 2018, 1:03 PM JianmeiZhong ***@***.***>
wrote:
> I tried neigh=2 in another sample,nothing seems changed.
> [image: sample2]
> <https://user-images.githubusercontent.com/37535747/39141763-466ab774-475b-11e8-920f-97ce4143a0b1.png>
>
> —
> You are receiving this because you commented.
> Reply to this email directly, view it on GitHub
> <#8 (comment)>,
> or mute the thread
> <https://github.com/notifications/unsubscribe-auth/AJqUR5HAwFz1cgHtNcQiQUSfuEhRf-D3ks5trgligaJpZM4Tdb6B>
> .
>
|
mmmm....my apologies for missing your reply and my delayed respond.:) Here are commands generating the low-scores plot and several previous steps. |
I think this should be fine. I have not seen GRN score that low but your
analysis is correct. So, this something related to your dataset. I am
testing in other datasetsas well. Will let you know what I get.
Thanks!
2018-05-02 6:06 GMT-04:00 JianmeiZhong <notifications@github.com>:
… mmmm....my apologies for missing your reply and my delayed respond.:)
I tried to put all genes with non-zero variance into calculation of GRN as
you suggested and more genes were presented in GRNscores plot.increasing
number of input genes generated more nodes and more GRN genes shown but
with pretty low GRN scores mentioned before.more closer endpoint to source
subpopulation was tested to have higher GRN scores,for example 0.1 for SNCA
from SP_10 to SP_3.
Here are commands generating the low-scores plot and several previous
steps.
### selecting genes to use as regulated along developmental trajectories
pca <- prcomp(t(ndata),scale=TRUE,center=TRUE)
loadings <- pca$rotation
num_pc <- 5
quantile <- 0.975
genes2use <- unique(as.vector(unlist(apply(loadings[,1:num_pc],2,
function(x){names(x[which(abs(x) >= quantile(x,quantile))])}))))
ggrn <- buildGRN('Hs',ndata,genes2use,2,'results/GRN.R')
cellrouter <- findsubpopulations(cellrouter,90,'jaccard','results/kNN_
network.gml')
cellrouter <- createKNN(cellrouter,90,'jaccard','results/paths/kNN_
network_trajectory.gml')
tfs <- find_tfs(species = 'Hs')
t <- c('SP_10.SP_8')
x <- grnscores(cellrouter,tfs,t,direction='both',dir.targets='
up',columns=1,width=8,height=5,flip=T,filename=paste('results/',t,sep=''))
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#8 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AJqUR1e5ZfBAkM5Lhgi6O2ljZFMG53E5ks5tuYUWgaJpZM4Tdb6B>
.
--
Edroaldo
|
Hi,
I noticed that increasing k in the createKNN function provided better
results in the dataset generated by Paul et al (Cell 2015). Could you
please try to increase k in the createKNN function?
Please, let me know...
Thanks!
2018-05-08 12:29 GMT-04:00 Edroaldo Lummertz da Rocha <edroaldo@gmail.com>:
… I think this should be fine. I have not seen GRN score that low but your
analysis is correct. So, this something related to your dataset. I am
testing in other datasetsas well. Will let you know what I get.
Thanks!
2018-05-02 6:06 GMT-04:00 JianmeiZhong ***@***.***>:
> mmmm....my apologies for missing your reply and my delayed respond.:)
> I tried to put all genes with non-zero variance into calculation of GRN
> as you suggested and more genes were presented in GRNscores plot.increasing
> number of input genes generated more nodes and more GRN genes shown but
> with pretty low GRN scores mentioned before.more closer endpoint to source
> subpopulation was tested to have higher GRN scores,for example 0.1 for SNCA
> from SP_10 to SP_3.
>
> Here are commands generating the low-scores plot and several previous
> steps.
> ### selecting genes to use as regulated along developmental trajectories
> pca <- prcomp(t(ndata),scale=TRUE,center=TRUE)
> loadings <- pca$rotation
> num_pc <- 5
> quantile <- 0.975
> genes2use <- unique(as.vector(unlist(apply(loadings[,1:num_pc],2,function(x){names(x[which(abs(x)
> >= quantile(x,quantile))])}))))
> ggrn <- buildGRN('Hs',ndata,genes2use,2,'results/GRN.R')
> cellrouter <- findsubpopulations(cellrouter,
> 90,'jaccard','results/kNN_network.gml')
> cellrouter <- createKNN(cellrouter,90,'jacca
> rd','results/paths/kNN_network_trajectory.gml')
> tfs <- find_tfs(species = 'Hs')
> t <- c('SP_10.SP_8')
> x <- grnscores(cellrouter,tfs,t,direction='both',dir.targets='up'
> ,columns=1,width=8,height=5,flip=T,filename=paste('results/',t,sep=''))
>
> —
> You are receiving this because you commented.
> Reply to this email directly, view it on GitHub
> <#8 (comment)>,
> or mute the thread
> <https://github.com/notifications/unsubscribe-auth/AJqUR1e5ZfBAkM5Lhgi6O2ljZFMG53E5ks5tuYUWgaJpZM4Tdb6B>
> .
>
--
Edroaldo
--
Edroaldo
|
Ok,I'll try it and let you know how it goes. |
I tried increasing k in 3 samples but there are no better results than before.they are still pretty low when k=200 . |
So, I think this could be more specific to your dataset then, which should
be fine... I tested in other datasets I usually get higher values...
Thanks!
2018-05-17 4:25 GMT-04:00 JianmeiZhong <notifications@github.com>:
… I tried increasing k in 3 samples but there are no better results than
before.they are still pretty low when k=200 .
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#8 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AJqUR1fw-JPjqu9DCq3oWt07_0pU9ksiks5tzTQRgaJpZM4Tdb6B>
.
--
Edroaldo
|
Hi,
I work through most part of the analysis pipeline but the results of GRN scores obtained via grnscores() in specified paths SP_10.SP_8 as tSNE and barplot shows.the grn scores have different levels compared to yours presented in several exemplary analysis,that's level of 0.0* versus 0.* .Sometimes,the GRNscores plot has one or two genes with low scores in both directions.I don't know if there's some wrong configuration in my pipeline.If my source and endpoints are correct and reliable,low scores level are possible for some datasets and it's available?
another questions asking for your guide :)
known knowledge this project based on are less subpopulations and increasing number of cells along differentiation process.So,k=120 was set according to my understanding in function buildGRN() and creatKNN() to obtain relatively less subpopulations as tSNE plot shows.k value is different a lot from yours.It might be a cause of low GRN scores?And if a path didn't go through every subpopulation in non-branched cells differentiation,gene expression dynamics with this path can represent entire process of changes?
Thank a lot !
Jm Zhong
The text was updated successfully, but these errors were encountered: