Error: BiocParallel errors #14

Closed
MichaelPeibo opened this issue Jan 29, 2018 · 11 comments

@MichaelPeibo

Hi Ascend team,
After normalization with scranNormalise, I want to regress out the cell cycle factor with RegressConfoundingFactors. However, when I run this function, I get this error:

Error: BiocParallel errors
  element index: 1, 2, 3, 4, 5, 6, ...
  first error: NA/NaN/Inf in 'x'

> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS release 6.7 (Final)

Matrix products: default
BLAS: /share/app/cluster/R-3.4.3/lib64/R/lib/libRblas.so
LAPACK: /share/app/cluster/R-3.4.3/lib64/R/lib/libRlapack.so

 version
               _
platform       x86_64-pc-linux-gnu
arch           x86_64
os             linux-gnu
system         x86_64, linux-gnu
status
major          3
minor          4.3
year           2017
month          11
day            30
svn rev        73796
language       R
version.string R version 3.4.3 (2017-11-30)
nickname       Kite-Eating Tree


I did install and configure BiocParallel as you instructed. Any suggestions on this? Thanks!

@asenabouth asenabouth self-assigned this Jan 29, 2018
@asenabouth asenabouth added the bug label Jan 29, 2018
@asenabouth
Collaborator

Hi @MichaelPeibo,
Could you please email your script and EMSet (saved as an RDS) to me at a.senabouth @ imb.uq.edu.au so I can investigate this issue for you? Also, are you working in a high-performance computing environment (i.e. PBSPro, SLURM, LSF etc.)?

@MichaelPeibo
Author

MichaelPeibo commented Jan 30, 2018

Hi @asenabouth,
I sent you my em.set and script (which may be a little messy, but I believe you can find the key code).
Here is my system info. I am working on a Linux platform; I don't know if this info will be helpful for you.
cat /proc/cpuinfo | grep name | cut -f2 -d: | uniq -c
8 Intel(R) Xeon(R) CPU E5-2609 v2 @ 2.50GHz

uname -a
Linux mgt 2.6.32-696.13.2.el6.x86_64 #1 SMP Thu Oct 5 21:22:16 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
Thanks for your attention and help.

@asenabouth
Collaborator

Hi @MichaelPeibo - I've found the cause of your error, and it may have downstream effects. It turns out the scranNormalise function has converted some of the values in the expression matrix into infinite values, which is definitely not ideal. I will look into this; in the meantime, I recommend you use the other normalisation method, NormaliseByRLE. Thank you for your patience.
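
A quick way to confirm this kind of problem is to scan the normalised matrix for non-finite entries before running the regression. The sketch below uses base R only. `expr_mat` is a made-up stand-in for a normalised expression matrix (genes by cells); how to pull the real matrix out of an EMSet is not shown here and would depend on the installed ascend version.

```r
# Hypothetical stand-in for a normalised expression matrix (genes x cells).
expr_mat <- matrix(
  c(1.2, 0, Inf, 3.4, NaN, 2.1),
  nrow = 2,
  dimnames = list(c("geneA", "geneB"), c("cell1", "cell2", "cell3"))
)

# is.finite() is FALSE for Inf, -Inf, NaN and NA, i.e. exactly the values
# that trigger "NA/NaN/Inf in 'x'" in model-fitting code.
bad <- !is.finite(expr_mat)
n_bad <- sum(bad)

# Locate the offending entries so they can be traced back to genes and cells.
bad_idx <- which(bad, arr.ind = TRUE)
bad_genes <- rownames(expr_mat)[unique(bad_idx[, "row"])]
bad_cells <- colnames(expr_mat)[unique(bad_idx[, "col"])]

print(n_bad)
print(bad_genes)
print(bad_cells)
```

If `n_bad` is greater than zero, the affected genes and cells point to where the normalisation step went wrong.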

@asenabouth
Collaborator

scranNormalise has been fixed, and the RegressConfoundingFactors function now works on your dataset, @MichaelPeibo. Thank you for raising this issue. Please let me know if you have any other issues.

@MichaelPeibo
Author

@asenabouth
Really appreciate your help.
I re-installed the ascend package with install_github and got past the RegressConfoundingFactors step. However, when I ran PCA, I got this strange plot:
[image]
And when I ran RunCORE, I got this error:
```
clustered.set <- RunCORE(pca.set, conservative = TRUE)
[1] "Performing unsupervised clustering..."
Error in RunCORE(pca.set, conservative = TRUE) :
  Your dataset may contain cells that are too distinct from the main
  population of cells. We recommend you run this function with
  'remove_outlier = TRUE' or check the cell-cell normalisation of your
  dataset.
```
I filtered with the default settings. Any suggestions?

@asenabouth
Collaborator

That's a strange result. Your dataset is large enough to have enough variance (unless the regression removed this). We don't usually regress confounding factors on our dataset (the option is there for those that do wish to do this step). Do you get the same result on the dataset if you don't use the confounding factor regression?

You can also use the 'remove_outlier' option with RunCORE to see what you get. Note, however, that this step removes these outliers and is more time-consuming, as it repeats the dynamic tree cut until all remaining cells can be assigned to a cluster.

@asenabouth asenabouth reopened this Jan 31, 2018
@MichaelPeibo
Author

I skipped the confounding factor regression step and set remove_outlier = TRUE. The PC variance now looks 'not that strange':
[image]

But when I tried the most stable and the least stable RunCORE settings, the results were unchanged.

Also, does Ascend have any option to cluster 'once for all' or to tune the 'cluster resolution' as Seurat does, rather than clustering repeatedly?

@asenabouth
Collaborator

I had a look at your data to see if I could shed any more light on the issue. If you generate a PCA plot with PlotPCA, you will see the majority of the points in one location and some distinct data points separated from it. These would be the outliers in your dataset.

I also ran RunCORE with remove_outlier set to TRUE, which discarded these cells (but kept a record of them) and produced three clusters. The number of outlier cells was less than 20, which is the minimum cluster size set by dynamicTreeCut.

The way RunCORE works is it performs clustering at different resolutions and then selects the most stable resolution for you. Once you run the RunCORE function, you can view the results of all the resolutions by using the GetRandMatrix function and PlotStabilityDendro function, so you can decide if that was the best resolution for you.

We also introduced an option in the latest update to set the size of these sliding windows by using the "windows" argument (just input a sequence of numbers ranging from 0 to 1). It will still try 40 different resolutions however.
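
To illustrate the general idea of cutting a tree at many resolutions and keeping the most stable result, here is a minimal base R sketch. It is not RunCORE's actual implementation: the toy data, the use of `hclust` with `cutree`, and the stability score (how many windows yield the same number of clusters) are all simplifying assumptions made for illustration.

```r
set.seed(1)

# Toy data (hypothetical): three well-separated groups of 20 points each.
x <- rbind(
  matrix(rnorm(40, mean = 0,  sd = 0.3), ncol = 2),
  matrix(rnorm(40, mean = 5,  sd = 0.3), ncol = 2),
  matrix(rnorm(40, mean = 10, sd = 0.3), ncol = 2)
)
tree <- hclust(dist(x), method = "ward.D2")

# 40 cut heights, expressed as fractions of the tallest merge in the tree.
windows <- seq(0.025, 1, length.out = 40)
n_clusters <- vapply(
  windows,
  function(w) length(unique(cutree(tree, h = w * max(tree$height)))),
  integer(1)
)

# Simplified stability score: how many windows give the same cluster count.
stability <- table(n_clusters)
best_k <- as.integer(names(stability)[which.max(stability)])

print(stability)
print(best_k)
```

A sequence such as `seq(0.025, 1, length.out = 40)` is also the kind of input described above for the "windows" argument, although the exact values RunCORE expects should be checked against its documentation.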

Hope that helps. Our group is working on a more detailed clustering package for single cell data, but we don't have an ETA for that yet.

@MichaelPeibo
Author

Hi @asenabouth
Is there any update on the 'clustering once for all' option I mentioned above (with documentation)?

Another point that confused me is something you mentioned in your tutorial and your paper (congrats!): you found that some apoptosis pathway-related genes are enriched in cluster 2. How do you define that? Is there any way to determine it automatically?

Thanks!

@MichaelPeibo
Author

MichaelPeibo commented Mar 5, 2018

P.S.
For example, pathway analysis directly following processing with Ascend.

Also, I really like your DE volcano plot, shown here:
[image]
In these cases, you only label some of the genes rather than all of them. How do you plot it? (I did not find tunable parameters in the plot function.)

And what are the parameter settings for plotting the expression of a particular gene on a t-SNE plot? (Sorry for the thousands of questions...)

@asenabouth
Collaborator

Hi @MichaelPeibo - thanks for your questions! It gives us a good idea of how our users are using our package. I'm moving your comments to different threads, just so it will be easier to track and if any other users have similar questions, they can refer to your threads.

This was referenced Mar 6, 2018