Randomly downsample seurat object #3108

Closed
williamsdrake opened this issue Jun 4, 2020 · 5 comments

Comments

@williamsdrake

williamsdrake commented Jun 4, 2020

Hi Seurat Team,

Thanks for the wonderful package. I have two Seurat objects, one with about 40k cells and another with around 20k cells. I would like to randomly downsample the larger object so that it has the same number of cells as the smaller one, but I get an error when I try to subset. I followed the example in #243, but that issue used a previous version of Seurat and the code didn't work as-is. Here is the slightly modified code I tried, followed by the error:

# Object HV is the Seurat object having the highest number of cells
# Object PD is the second Seurat object with the lowest number of cells
# Compute the length of cells from PD
cells.to.sample <- length(PD@active.ident)

# Sample from HV as many cells as there are cells in PD
# For reproducibility, set a random seed
set.seed(12)
sampled.cells <- sample(x = HV@active.ident, size = cells.to.sample, replace = F)

# Subset Seurat object
HV.sub <- subset(x=HV, cells = sampled.cells)

The error after the last line is:
Error in CellsByIdentities(object = object, cells = cells) :
Cannot find cells provided

Seurat version 3.1.4

Any help or guidance would be appreciated. Thank you
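One possible reading of the error, sketched in base R only: `sample()` on a named factor returns the factor values (here, cluster labels) with the cell barcodes kept as names, so passing the result directly as `cells` may hand `subset()` labels rather than barcodes. The toy `idents` factor below is a hypothetical stand-in, and it assumes `HV@active.ident` is a factor named by cell barcode.

```r
# Toy stand-in (assumption) for HV@active.ident:
# a factor of cluster labels whose names are cell barcodes.
idents <- factor(c("0", "1", "0", "2", "1"))
names(idents) <- paste0("cell", 1:5)

set.seed(12)
picked <- sample(x = idents, size = 3, replace = FALSE)

# The sampled values are cluster labels, not barcodes ...
print(as.character(picked))
# ... while the barcodes travel along as the names of the result.
print(names(picked))
```

If that assumption holds, sampling `colnames(HV)` instead, or passing `names(sampled.cells)`, would give `subset()` actual cell names.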

@timoast
Collaborator

timoast commented Jun 5, 2020

You should be able to run:

downsampled.obj <- large.obj[, sample(colnames(large.obj), size = ncol(small.obj), replace=F)]
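The same column-indexing pattern can be tried without Seurat on plain matrices, where columns play the role of cells. This is only a sketch: `large.mat` and `small.mat` are made-up stand-ins for the two objects, and it assumes a Seurat object accepts the `[, cells]` indexing shown above.

```r
# Toy matrices (hypothetical): rows are genes, columns are cells.
large.mat <- matrix(0, nrow = 3, ncol = 10,
                    dimnames = list(paste0("gene", 1:3), paste0("L", 1:10)))
small.mat <- matrix(0, nrow = 3, ncol = 4,
                    dimnames = list(paste0("gene", 1:3), paste0("S", 1:4)))

set.seed(12)  # optional, makes the draw reproducible
# Draw as many column names from the large matrix as the small one has columns.
keep <- sample(colnames(large.mat), size = ncol(small.mat), replace = FALSE)
downsampled.mat <- large.mat[, keep]

# Both now have the same number of columns (cells).
print(ncol(downsampled.mat) == ncol(small.mat))
```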

timoast closed this as completed Jun 5, 2020
@williamsdrake
Author

Thank you Tim. If anybody happens upon this in the future, there was a missing ')' in the above code. This is what worked for me:

downsampled.obj <- large.obj[, sample(colnames(large.obj), size = ncol(small.obj), replace=F))]

@kmshort

kmshort commented Sep 2, 2020

They actually both fail due to syntax errors, yours included, @williamsdrake.
For new folks used to the Satija lab vignettes, I'll call large.obj pbmc and downsampled.obj pbmc.subsampled, and replace the size taken from the number of columns in another object with an integer, 2999:

pbmc.subsampled <- pbmc[, sample(colnames(pbmc), size = 2999, replace = FALSE)]

I hope this helps someone.

@EdoPredi

Hi everyone.

Just a quick question

Hi Km Short...

I was trying to do the same and used your code. The number of columns is reduced (and so is the object). My question is: is this randomized? Which command here does the randomization? I managed to reduce the vignette pbmc from 2700 to 600 cells. I am just worried it is picking the first 600 rather than sampling at random.

@kmshort

kmshort commented Dec 13, 2020

https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/sample
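As the linked documentation describes, the randomization comes from base R's `sample()`, which draws the column names at random rather than taking the first ones. A small sketch, using a hypothetical vector of 2700 barcodes in place of `colnames(pbmc)`, can be used to check this and to see how `set.seed()` makes a draw repeatable:

```r
# Hypothetical barcodes standing in for colnames(pbmc).
cells <- paste0("cell", 1:2700)

set.seed(1)
draw.a <- sample(cells, size = 600, replace = FALSE)
set.seed(1)
draw.b <- sample(cells, size = 600, replace = FALSE)

# Same seed, same draw.
print(identical(draw.a, draw.b))
# Compare the draw against "just the first 600" barcodes.
print(identical(draw.a, head(cells, 600)))
# Peek at the first few sampled barcodes.
print(head(draw.a))
```

Without a call to `set.seed()`, each run would generally return a different set of cells.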
