Difference between DML lists #672

@yaaminiv

@sr320 and I obtained different numbers of DML. I think it's because our R code differs.

My list: http://gannet.fish.washington.edu/spartina/2018-10-10-project-virginica-oa-Large-Files/2019-03-07-Yaamini-Virginica-Repository/analyses/2018-10-25-MethylKit/2019-03-13-DML-Destrand-5x-Locations.bed
code: https://github.com/epigeneticstoocean/paper-gonad-meth/blob/master/code/04-methylkit.Rmd

Steven's list: https://github.com/sr320/nb-2019/blob/master/C_virginica/analyses/dml52.bed
code: https://github.com/epigeneticstoocean/paper-gonad-meth/blob/master/code/methylkit_sr.Rmd

The first thing I see is a difference in processBismarkAln. I used mincov = 5, while Steven used mincov = 2. Additionally, I used assembly = "v3", and Steven used assembly = "v001".

  • Mine: processBismarkAln(location = analysisFiles, sample.id = sample.IDs, assembly = "v3", read.context = "CpG", mincov = 5, treatment = treatmentSpecification)
  • Steven's: processBismarkAln(location = file.list_10, sample.id = list("1","2","3","4","5","6","7","8","9","10"), assembly = "v001", read.context="CpG", mincov=2, treatment = c(0,0,0,0,0,1,1,1,1,1))

Steven also uses this extra filtering step while I do not:

filtered.myobj=filterByCoverage(myobj_10,lo.count=5,lo.perc=NULL, hi.count=100,hi.perc=NULL)
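
To see how the two coverage settings interact, here is a minimal base-R sketch on made-up coverage values. It does not call methylKit. It assumes, as I read the methylKit documentation, that mincov keeps positions with at least that many reads, that lo.count drops positions below the count, and that hi.count drops positions above it. The coverage vector and variable names are hypothetical.

```r
# Hypothetical per-CpG read coverage values, for illustration only.
coverage <- c(1, 2, 3, 5, 8, 40, 100, 101, 250)

# Pipeline 1 (assumed behaviour): mincov = 5 at import, no further filtering.
kept_mincov5 <- coverage[coverage >= 5]

# Pipeline 2 (assumed behaviour): mincov = 2 at import,
# then lo.count = 5 and hi.count = 100 applied afterwards.
imported <- coverage[coverage >= 2]
kept_filtered <- imported[imported >= 5 & imported <= 100]

# Positions retained only by pipeline 1: those above the hi.count cap.
only_pipeline1 <- setdiff(kept_mincov5, kept_filtered)
print(only_pipeline1)
```

Under these assumptions both approaches end up with a minimum coverage of 5, so the remaining coverage-related difference would be the upper cap of 100 in Steven's code. In real data the filters apply per sample, so the effect on the united table may be larger than this toy case suggests.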

Both of these discrepancies could affect how many DML are identified and which loci they are. Which method is "correct"?
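
One way to quantify the discrepancy is to count which loci the two lists share and which are unique to each. The sketch below uses base R on toy tables. The chromosome names and coordinates are hypothetical stand-ins, and it assumes both BED files have chromosome, start, and end in their first three columns with the same coordinate convention.

```r
# Toy stand-ins for the two BED files (hypothetical loci, not real DML).
# In practice each table could be read with something like
# read.delim("<file>.bed", header = FALSE)[, 1:3], assuming no header line.
mine <- data.frame(chr   = c("chrA", "chrA", "chrB"),
                   start = c(100L, 250L, 400L),
                   end   = c(102L, 252L, 402L))
steven <- data.frame(chr   = c("chrA", "chrB", "chrC"),
                     start = c(100L, 400L, 900L),
                     end   = c(102L, 402L, 902L))

# Build one string key per locus so the lists can be compared as sets.
locus_key <- function(bed) paste(bed$chr, bed$start, bed$end, sep = ":")

shared      <- intersect(locus_key(mine), locus_key(steven))
only_mine   <- setdiff(locus_key(mine), locus_key(steven))
only_steven <- setdiff(locus_key(steven), locus_key(mine))

# Summary counts for each category.
print(c(shared = length(shared),
        only_mine = length(only_mine),
        only_steven = length(only_steven)))
```

If the two files use different coordinate conventions (for example, an offset of one base in the start column), exact key matching would undercount the shared loci, so that is worth checking before interpreting the counts.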
