Subsetting Dataset #40

Closed
sylvia-science opened this issue Oct 20, 2023 · 2 comments

Comments

@sylvia-science

Hello,

If I'm particularly interested in a subset of cells in my dataset, is it valid to run SAVER on just that subset?

Furthermore, I'm working on a dataset compiled from many different scRNA-seq sources, where I suspect batch effects may be relevant. Would it be a good idea to run SAVER separately on each source instead of on the data combined from all sources?

Thank you!

@mohuangx
Owner

Hi,

I would recommend running SAVER on the entire dataset, so that the prediction is performed on more cells, and then looking at the subset of the SAVER output. If computation is an issue, though, it's perfectly fine to run SAVER on just the subset of cells.
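
To illustrate the two orderings described above, here is a minimal Python sketch. It does not call SAVER (an R package): `toy_denoise` is a hypothetical stand-in that simply pulls each value halfway toward its gene's mean, and `select_cells`, `denoise_then_subset`, and `subset_then_denoise` are illustrative names, not part of any library. The only point is that an estimate borrowing information across cells can change depending on which cells are present when it is computed.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]  # rows are genes, columns are cells


def select_cells(matrix: Matrix, cell_ids: Sequence[str],
                 wanted: Sequence[str]) -> Tuple[Matrix, List[str]]:
    """Keep only the columns whose cell ID is in `wanted`, preserving order."""
    wanted_set = set(wanted)
    idx = [j for j, cid in enumerate(cell_ids) if cid in wanted_set]
    return [[row[j] for j in idx] for row in matrix], [cell_ids[j] for j in idx]


def toy_denoise(matrix: Matrix) -> Matrix:
    """Toy stand-in, NOT SAVER: pull each value halfway toward its gene's mean."""
    out = []
    for row in matrix:
        mean = sum(row) / len(row) if row else 0.0
        out.append([(v + mean) / 2 for v in row])
    return out


def denoise_then_subset(counts, cell_ids, wanted, denoise=toy_denoise):
    """Estimate on all cells first, then keep the cells of interest."""
    return select_cells(denoise(counts), cell_ids, wanted)


def subset_then_denoise(counts, cell_ids, wanted, denoise=toy_denoise):
    """Cheaper order: keep the cells of interest first, then estimate on them."""
    sub, ids = select_cells(counts, cell_ids, wanted)
    return denoise(sub), ids


counts = [[0, 2, 4], [1, 1, 7]]
cell_ids = ["c1", "c2", "c3"]
full_first, _ = denoise_then_subset(counts, cell_ids, ["c1", "c2"])
subset_first, _ = subset_then_denoise(counts, cell_ids, ["c1", "c2"])
print(full_first)    # [[1.0, 2.0], [2.0, 2.0]]
print(subset_first)  # [[0.5, 1.5], [1.0, 1.0]]
```

The two results differ because the toy estimate for cells `c1` and `c2` uses cell `c3` only in the first ordering. With a real SAVER run the size and direction of any difference would depend on the data.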

I agree that it's probably better to split the SAVER runs and then possibly batch correct. This way, SAVER won't be picking up on differences between sources in performing the prediction.
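
Splitting the runs by source could be organized along these lines. This is a sketch under assumptions: the `denoise` argument is a hypothetical callable standing in for one SAVER run per source, `split_by_source` and `run_per_source` are illustrative names, and any batch correction afterwards is left out.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence, Tuple

Matrix = List[List[float]]  # rows are genes, columns are cells


def split_by_source(counts: Matrix, cell_ids: Sequence[str],
                    sources: Sequence[str]) -> Dict[str, Tuple[Matrix, List[str]]]:
    """Group the columns of `counts` by the source label of each cell."""
    columns = defaultdict(list)
    for j, src in enumerate(sources):
        columns[src].append(j)
    return {
        src: ([[row[j] for j in idx] for row in counts],
              [cell_ids[j] for j in idx])
        for src, idx in columns.items()
    }


def run_per_source(counts: Matrix, cell_ids: Sequence[str],
                   sources: Sequence[str],
                   denoise: Callable[[Matrix], Matrix]
                   ) -> Dict[str, Tuple[Matrix, List[str]]]:
    """Apply `denoise` to each source's cells independently."""
    return {
        src: (denoise(sub), ids)
        for src, (sub, ids) in split_by_source(counts, cell_ids, sources).items()
    }


counts = [[1, 2, 3, 4], [5, 6, 7, 8]]
cell_ids = ["a", "b", "c", "d"]
sources = ["s1", "s2", "s1", "s2"]
parts = split_by_source(counts, cell_ids, sources)
print(parts["s1"])  # ([[1, 3], [5, 7]], ['a', 'c'])
```

Because each source is processed on its own, the stand-in estimator never sees cells from another source, which mirrors the reasoning above about not picking up on differences between sources.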

@sylvia-science
Author

Thank you for the fast response!
