
usage of cell type output #59

Open
rkb965 opened this issue Apr 23, 2024 · 0 comments

Comments


rkb965 commented Apr 23, 2024

Hello! Thank you for taking the time to write this great package. I have a few questions that are related to best practices with {meffil} output, but I admit they are not strictly about the package (and I certainly do not expect a response!). Please feel free to point me to a more appropriate spot for these.

  1. Is it standard practice to include all predicted cell types as adjustment variables? This code from the README suggests yes:

    # Add cell count estimates to the set of covariates.
    counts <- t(meffil.cell.count.estimates(norm.objects))
    covariates <- cbind(covariates, counts)
    # Run the EWAS.
    ewas.ret <- meffil.ewas(norm.beta, variable=variable, covariates=covariates)

The cell count estimates are highly correlated, and I think they may be producing inflated or unstable effect estimates. Is this common? I am entirely new to this, but I don't see it discussed. For full disclosure, in case this is somehow dealt with in {meffil}: I am using the cell count estimates from {meffil}, but the modeling is done in a different pipeline.

  2. If you include only a subset of predicted cell types, are there guidelines for how much information it is acceptable to drop? Most of my samples have near-zero (for some definition of "near") amounts of all but two cell types, but I absolutely have some samples with non-trivial contributions from each of the other five cell types.

  3. Are cell type predictions particularly sensitive to the choice of reference dataset? We have a pediatric population with saliva samples and are inclined to use {meffil}'s detailed (7 cell types) output based on the gse35069/gse48472 reference datasets. However, there is also a saliva reference dataset, BeadSorted.Saliva.EPIC, which comes from an explicitly pediatric population but only has leukocytes and epithelial cells. I'm not sure whether differential methylation by cell type makes {meffil}'s use of the saliva gse48472 reference more appropriate, or whether age-specific effects make BeadSorted.Saliva.EPIC more appropriate for this use case.
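
To make the collinearity concern in the first question concrete, here is a minimal base R sketch. It uses simulated proportions, not real {meffil} output, and assumes the seven estimates sum to one per sample (real estimates may only do so approximately). The `vif` helper and the `celltype1` to `celltype7` names are hypothetical and only for illustration.

```r
# Simulated cell-type proportions (hypothetical data, NOT meffil output):
# two dominant cell types and five minor ones, as in my samples.
set.seed(1)
n <- 200
shapes <- c(8, 6, 0.3, 0.3, 0.3, 0.3, 0.3)
raw <- matrix(rgamma(n * 7, shape = rep(shapes, each = n)), nrow = n)
counts <- raw / rowSums(raw)  # assumption: each row sums to exactly 1
colnames(counts) <- paste0("celltype", 1:7)

# With an intercept, all seven proportions are linearly dependent,
# so the design matrix is expected to lose one rank (7 instead of 8).
X_all <- cbind(intercept = 1, counts)
rank_all <- qr(X_all)$rank

# Hypothetical helper: variance inflation factor of each column,
# computed by regressing it on the remaining columns.
vif <- function(X) {
  out <- sapply(seq_len(ncol(X)), function(j) {
    r2 <- summary(lm(X[, j] ~ X[, -j]))$r.squared
    1 / (1 - r2)
  })
  setNames(out, colnames(X))
}

# Dropping one cell type removes the exact dependency; the remaining
# VIFs show how much correlation is left among the other six.
vif_drop_one <- vif(counts[, -7])
print(rank_all)
print(round(vif_drop_one, 2))
```

If the real estimates behave like this, the instability would come from the sum-to-one constraint rather than from anything specific to {meffil}, and the choice of which cell type to omit would only change the interpretation of the remaining coefficients.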

Many thanks for any wisdom you can share!
