Start from matrix #4

Closed
r3fang opened this issue Jul 27, 2018 · 17 comments

Comments

@r3fang

r3fang commented Jul 27, 2018

In the example, everything starts from the bam and peak files. Is there an easy way to use scATAC directly from a count matrix if we have already pre-processed the data?

@timydaley
Collaborator

Hi Rongxin, thank you for your interest in scABC. Yes; the only issue is how to weight the cells in clustering. We use the background to weight the cells, which requires the original bam files. Without the bam files, you have to find another way to weight the cells. One option would be to use the median of the count matrix (which we call the Foreground matrix in the vignette). This is not something we have fully tested: we don't know, for example, how much the background and foreground differ, or how much information is lost by not using the background to weight cells.

Actually, now that I look at the vignette I'm gonna have to shift some of the code around to accomplish this. I'll work on this a bit today while I have time.

@timydaley
Collaborator

I've added an option in the landmarks to input user-defined weights. I suggest you use the mean of the count matrix. I've tested it on the 6-cell-line in-silico mixture, and the results are very good. The vignette is available at https://github.com/timydaley/scABC/blob/master/vignettes/ClusteringWithCountsMatrix.html. When computing the gap statistic, you can use the cell-level means as BackGroundMedian. Otherwise, everything proceeds as in the other vignettes.

I hope this helps. Feel free to ask us any more questions that you may have, and please let us know if this is successful.
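
For readers following along, here is a minimal sketch of what "cell-level means" of a peak-by-cell count matrix could look like. It is written in plain Python rather than R, it is not scABC code, and the helper name `cell_weights` and the toy numbers are made up for illustration. It assumes rows are peaks and columns are cells, and it computes both the median mentioned earlier in the thread and the mean suggested above, so the two can be compared on your own data.

```python
from statistics import mean, median


def cell_weights(counts, summary=mean):
    """Summarise each cell (column) of a peak-by-cell count matrix.

    counts  -- list of rows, one row per peak, one entry per cell
    summary -- function applied to each cell's counts (mean or median)

    Hypothetical helper for illustration only; not part of scABC.
    """
    n_cells = len(counts[0])
    if any(len(row) != n_cells for row in counts):
        raise ValueError("all rows must have the same number of cells")
    # zip(*counts) transposes rows (peaks) into columns (cells).
    return [summary(column) for column in zip(*counts)]


# Toy matrix: 4 peaks (rows) x 3 cells (columns), made-up numbers.
toy_counts = [
    [0, 2, 1],
    [1, 0, 0],
    [0, 4, 0],
    [3, 2, 0],
]

mean_weights = cell_weights(toy_counts, mean)
median_weights = cell_weights(toy_counts, median)
print("per-cell mean:  ", mean_weights)
print("per-cell median:", median_weights)
```

In this toy example the third cell has a median of zero but a nonzero mean, which can happen whenever most peaks in a cell have no reads. Whether that matters for your data is something to check before choosing one summary over the other.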

@timydaley
Collaborator

@r3fang We noticed an issue in computing cluster-specific p-values when starting from a counts matrix (thank you, @MahdiZ11). We believe that we have solved the issue and have updated the vignette https://github.com/SUwonglab/scABC/blob/master/vignettes/ClusteringWithCountsMatrix.Rmd to reflect this. Let us know how this works for you; we're curious whether you have success. Thank you.

@r3fang
Author

r3fang commented Aug 17, 2018

Sorry for the late response! Let me try it out and get back to you today. Thank you for your help, I appreciate it!

@r3fang
Author

r3fang commented Aug 24, 2018

Last question: this is for finding clusters, but which matrix should I use to perform PCA or t-SNE, just for visualization?

@timydaley
Collaborator

timydaley commented Aug 24, 2018 via email

@timydaley
Collaborator

Also, I hope this isn't the last question. We are happy to help and answer any questions you may have. I will close this issue, but if you have any more questions, comments, or even critiques, you can open a new issue or email us directly. Thank you, Rongxin.

@MahdiZ11
Collaborator

To add to Tim's answer, please also take a look at https://github.com/SUwonglab/scABC/blob/master/vignettes/BatchEffect_NumberOfCluster.Rmd. At the end of this vignette, we explain how to plot the t-SNE of clustering results using differential peaks. You can apply this procedure to the foreground matrix as well. Thank you, Rongxin, for using scABC.

@r3fang
Author

r3fang commented Jan 8, 2019

Sorry for coming back again. I have tried ClusteringWithCountsMatrix. However, the performance was not good on my own data. My understanding is that InSilicoSCABCForeGroundMatrix is the peak-by-cell count matrix; am I right? I wonder if it is possible to share the count matrix you used in the script, so I can replicate your result on your data, just to make sure I did not make any mistakes.

@timydaley
Collaborator

Hi Rongxin,
can you give us your email so that we can send you the files directly?

@timydaley timydaley reopened this Jan 9, 2019
@r3fang
Author

r3fang commented Jan 9, 2019

It's r4fang@gmail.com. Thank you for your help!

@timydaley
Collaborator

We should note that we are more uncertain about clustering without background. The background contains a lot of information in the expected counts and allows us to much better quantify the uncertainty of the observed counts, therefore allowing for better clustering.

@MahdiZ11
Collaborator

MahdiZ11 commented Apr 2, 2019

Hi Rongxin,

Just wanted to make sure we answered your questions before closing this issue.

Thanks,
Mahdi

@r3fang
Author

r3fang commented Apr 2, 2019 via email

@MahdiZ11
Collaborator

MahdiZ11 commented Apr 2, 2019

No problem. Happy to answer any question/issue in the future.

@MahdiZ11 MahdiZ11 closed this as completed Apr 2, 2019
@mudappathir

Hi,

I am trying to replicate the vignette for ClusteringWithCountsMatrix.Rmd. I could not figure out the InSilicoSCABCForeGroundMatrix used. I saw this closed issue requesting the same. I kindly request you to send the matrix used. My email is rekha.m.mec@gmail.com. Thank you for your help.

@MahdiZ11
Collaborator

Hi rekha,

I'll email the matrix to you soon.

Thanks,
Mahdi
