Question about batch effects in refine.bio datasets #455

Open
ghost opened this issue Mar 26, 2021 · 3 comments

@ghost

ghost commented Mar 26, 2021

Hi!

I am trying to understand whether batch effects are corrected for in the refine.bio pipeline.

I downloaded the dataset GSE99039 (microarray) from refine.bio and then examined it using PCA. I noticed that the dataset from refine.bio seems to show a clear separation that does not match any of the metadata.

[Image: PCA plot of the refine.bio dataset]
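
For readers who want to reproduce this kind of check, here is a minimal sketch of computing PCA scores and tabulating which side of PC1 each sample falls on against a metadata label. It assumes a samples x genes matrix and uses NumPy's SVD. The function names (`pca_scores`, `split_by_pc1`, `crosstab`) and the sign-of-PC1 split are illustrative assumptions, not part of refine.bio or of the analysis in the original post.

```python
import numpy as np


def pca_scores(expr, n_components=2):
    """Return (scores, explained_variance_ratio) for a samples x genes matrix.

    Hypothetical helper: PCA via SVD of the column-centered matrix.
    """
    x = np.asarray(expr, dtype=float)
    centered = x - x.mean(axis=0)  # center each gene across samples
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    variance = s ** 2
    ratio = variance[:n_components] / variance.sum()
    return scores, ratio


def split_by_pc1(scores):
    """Assign each sample to a side of PC1 by the sign of its score.

    Assumption: the separation of interest lies along PC1 and straddles zero.
    """
    return scores[:, 0] > 0


def crosstab(groups, labels):
    """Count samples for each (PC1 side, metadata label) pair."""
    counts = {}
    for side, label in zip(groups, labels):
        key = (bool(side), label)
        counts[key] = counts.get(key, 0) + 1
    return counts
```

If the counts from `crosstab` are spread evenly across every metadata column you try, the separation is not explained by the available annotations, which is the situation described above.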

Hence, I would like to ask:
i. Where in the pipeline is the (quantile?) normalization performed?
ii. Is any batch correction performed in the normalized data pipeline?

Thank you.

@jaclyn-taroni
Member

Hi @kengcher,

Thanks for your questions and for using refine.bio. The dataset you mention (GSE99039) is submitter-processed, which means we were unable to process the data from raw files and instead use whatever values the authors submitted to GEO (in this case, they are reported to be RMA-normalized values). We do quantile normalize submitter-processed data for delivery, but have less control over what happens prior to that step. We do not perform any batch correction (e.g., ComBat).
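
To illustrate what quantile normalization does in general, here is a minimal pure-Python sketch. It is not refine.bio's implementation: the function name `quantile_normalize` is hypothetical, and the reference distribution is derived from the input matrix itself, which is an assumption made for the sake of a self-contained example. The pipeline may use a different target distribution.

```python
def quantile_normalize(matrix):
    """Quantile normalize the columns (samples) of a genes x samples matrix.

    `matrix` is a list of rows. Every column of the result has the same
    distribution of values. Ties are broken by row order rather than
    averaged, which is a simplification.
    """
    n_genes = len(matrix)
    n_samples = len(matrix[0])

    # Work column by column, since each sample is normalized independently.
    columns = [[row[j] for row in matrix] for j in range(n_samples)]
    sorted_columns = [sorted(col) for col in columns]

    # Reference distribution: mean across samples of the value at each rank.
    reference = [
        sum(col[rank] for col in sorted_columns) / n_samples
        for rank in range(n_genes)
    ]

    result = [[0.0] * n_samples for _ in range(n_genes)]
    for j, col in enumerate(columns):
        # Row indices ordered from smallest to largest value in this sample.
        order = sorted(range(n_genes), key=lambda i: col[i])
        for rank, i in enumerate(order):
            result[i][j] = reference[rank]
    return result
```

Note that this step only forces samples onto a common distribution. It preserves the rank order of genes within each sample, so it does not remove batch-driven differences in which genes rank high or low, and that is why it is not a substitute for a batch correction method such as ComBat.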

Looking at the description for this particular experiment, I would want to know if that separation corresponds to idiopathic PD vs. controls, but you do mention that the separation does not match any of the metadata in your post.

Hope this helps! Let me know if you have additional questions.

@ghost
Author

ghost commented Mar 29, 2021 via email

@jaclyn-taroni
Member

We've looked into why this particular experiment was not processed from raw and believe we may have identified a fix, which we will now need to test. If the fix works, we expect to make the version of this experiment processed from raw available within the next few weeks. We're in the middle of some infrastructure changes for the project, so we appreciate your patience!
