Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

replacement using pData() broken #108

Open
kasperdanielhansen opened this issue Apr 28, 2017 · 6 comments
Open

replacement using pData() broken #108

kasperdanielhansen opened this issue Apr 28, 2017 · 6 comments

Comments

@kasperdanielhansen
Copy link
Collaborator

From Tajuddin, Salman salman.tajuddin@nih.gov

I was going through your 450K data analysis with minif, located here http://kasperdanielhansen.github.io/genbioconductor/html/minfi.html and encountered an error when merging the GEO phenotype data with the methylation data. Not sure what went wrong here.

The error message reads like this:

> pData(rgSet) <- pD
Error in (function (classes, fdef, mtable)  :
  unable to find an inherited method for function ‘pData<-’ for signature ‘"RGChannelSet", "data.frame"’

sessionInfo() is below for your reference

Thanks for your help.

Salman

image

@apastore
Copy link

for posterity. it need a DataFrame not data.frame

@chrislun16
Copy link

chrislun16 commented Jul 7, 2017

What does that mean apastore? What do I need to change from data.frame to a DataFrame? And if you can, what's the difference between the two?

Edit: I figured it out. You have to do pD<-as(pD,"DataFrame")

@kasperdanielhansen
Copy link
Collaborator Author

kasperdanielhansen commented Jul 8, 2017 via email

@chrislun16
Copy link

chrislun16 commented Jul 11, 2017

Do you know if the guide is still correct, because I seem to be getting incorrect beta values.

I was following the guide up until the preprocessing part. Meaning that I was able to obtain the RGset. Then I used my own preprocessing code to obtain the beta values. However, I cross checked them with the beta values that were on GEO and they weren't consistent.

For example, the accession number I used was GSE68777. So I went to that study on GEO and clicked on the first sample: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM1681154. Then I scrolled down and clicked "Download full table" to download the samples and beta values in a text file.

Then I typed head(beta) and chose the first sample. Then I did command F in the text file for that sample and it's value there wasn't the same as the value from the beta data table. Hopefully you can help find the error.

Here is the code I'm using:

library(GEOquery)
library(minfi)
library("IlluminaHumanMethylation450kanno.ilmn12.hg19")
library("IlluminaHumanMethylation450kmanifest")

######## Code copied from 450k Guide ########

getGEOSuppFiles("GSE68777")
untar("GSE68777/GSE68777_RAW.tar", exdir = "GSE68777/idat")
head(list.files("GSE68777/idat", pattern = "idat"))
idatFiles <- list.files("GSE68777/idat", pattern = "idat.gz$", full = TRUE)
sapply(idatFiles, gunzip, overwrite = TRUE)
rgSet <- read.metharray.exp("GSE68777/idat")
geoMat <- getGEO("GSE68777")
pD.all <- pData(geoMat[[1]])
pD <- pD.all[, c("title", "geo_accession", "characteristics_ch1.1", "characteristics_ch1.2")]
names(pD)[c(3,4)] <- c("group", "sex")
pD$group <- sub("^diagnosis: ", "", pD$group)
pD$sex <- sub("^Sex: ", "", pD$sex)
sampleNames(rgSet) <- sub(".*_5", "5", sampleNames(rgSet))
rownames(pD) <- pD$title
pD <- pD[sampleNames(rgSet),]
pData(rgSet) <- pD

@wvictor14
Copy link

It's not clear to me if those GEO sample submissions are processed or unprocessed beta values. Note that you did a normalization step:

MSet <- preprocessIllumina(rgSet, bg.correct = TRUE, normalize = "controls")

So your data is normalized. Why don't you compare the raw beta values to the GEO sample submissions? If those are the same, then good, those submissions are raw beta values. If they are not the same, then maybe they are processed betas and you'll need to find out their normalization step (details are missing in the article) in order to exactly reproduce their data.

@chrislun16
Copy link

chrislun16 commented Jul 12, 2017

You are right. When I compared the raw beta values, they were very similar. Thanks for the help.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants