Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

incorrect reported pileup height reported #18

Closed
ArielPaulson opened this issue Apr 1, 2013 · 4 comments
Closed

incorrect reported pileup height reported #18

ArielPaulson opened this issue Apr 1, 2013 · 4 comments

Comments

@ArielPaulson
Copy link

Perhaps this is related to issue 12, noninteger pileup height...

I have a set of peak calls from the following command (CentOS6 / Python 2.7.2):

macs2 callpeak -t IP.bam -c input.bam -g mm -p 0.05 --nomodel

And in the resulting peaks.xls file, the pileup heights are, almost without exception, higher than the actual pileup heights from the bam file.

Using the reported peak chrom, start, and end, I assay bam pileup heights with the following command:

samtools depth -r chrom:start-end IP.bam | cut -f3 | sort | uniq -c

And the maximum is always lower than the macs2 pileup value. This is especially problematic because I am using pileup heights to filter peaks, and if I want peaks with height >= 10 then macs2 should not tell me it is 10 when in fact it is 4, or 15 when it is 9, or 22 when 12, etc.

Also, I am testing the original bam, and macs2 supposedly removes duplicates, so the disparities are probably even worse.

So how are the pileup values generated, and why are they not the IP height?

Thanks,
Ariel

@benjschiller
Copy link
Collaborator

Is this just a single-end BAM file? I made some modifications to how paired
end files are handled but I didn't think they were invoked unless you
specifically specified -t BAMPE. I believe the default duplicate behavior
is figure out how many duplicates there would be according to the binomial
distribution (so more than one for very high depth of sequencing) and allow
at most that many. You can change that to 1 manually. I'm not clear on why
you are using --nomodel without specifying the shift parameters. Peaks are
filtered by pvalue, requiring min width etc. If you just want to create a
pileup and filter by height, I suggest using bedtools genomeCoverageBed
(the "-d" flag will generate a pileup)

Ben

On Sun, Mar 31, 2013 at 7:22 PM, Ariel Paulson notifications@github.comwrote:

Perhaps this is related to issue 12, noninteger pileup height...

I have a set of peak calls from the following command (CentOS6 / Python
2.7.2):

macs2 callpeak -t IP.bam -c input.bam -g mm -p 0.05 --nomodel

And in the resulting peaks.xls file, the pileup heights are, almost
without exception, higher than the actual pileup heights from the bam file.

Using the reported peak chrom, start, and end, I assay bam pileup heights
with the following command:

samtools depth -r chrom:start-end IP.bam | cut -f3 | sort | uniq -c

And the maximum is always lower than the macs2 pileup value. This is
especially problematic because I am using pileup heights to filter peaks,
and if I want peaks with height >= 10 then macs2 should not tell me it is
10 when in fact it is 4, or 15 when it is 9, or 22 when 12, etc.

Also, I am testing the original bam, and macs2 supposedly removes
duplicates, so the disparities are probably even worse.

So how are the pileup values generated, and why are they not the IP height?

Thanks,
Ariel


Reply to this email directly or view it on GitHubhttps://github.com//issues/18
.

@ArielPaulson
Copy link
Author

Hi Ben,

Just a single-end BAM. I was using --nomodel because of some earlier strangeness with the called peaks (in a different sample) which --nomodel seemed to resolve, so I left it out for the others. But if it is still locally shifting read piles, then this would explain why it is getting pileups which are higher than from the bam directly. And also why the macs:bam pileup ratio seems centered around 2.

Thanks,
Ariel


From: Benjamin Schiller [notifications@github.com]
Sent: Sunday, March 31, 2013 9:50 PM
To: taoliu/MACS
Cc: Paulson, Ariel
Subject: Re: [MACS] incorrect reported pileup height reported (#18)

Is this just a single-end BAM file? I made some modifications to how paired
end files are handled but I didn't think they were invoked unless you
specifically specified -t BAMPE. I believe the default duplicate behavior
is figure out how many duplicates there would be according to the binomial
distribution (so more than one for very high depth of sequencing) and allow
at most that many. You can change that to 1 manually. I'm not clear on why
you are using --nomodel without specifying the shift parameters. Peaks are
filtered by pvalue, requiring min width etc. If you just want to create a
pileup and filter by height, I suggest using bedtools genomeCoverageBed
(the "-d" flag will generate a pileup)

Ben

Ben Schiller
415.273.9236 (Cell)

UC San Francisco, PhD Candidate
Harvey Mudd College, '08

"There's no such thing as an ordinary human." -The Doctor

On Sun, Mar 31, 2013 at 7:22 PM, Ariel Paulson notifications@github.comwrote:

Perhaps this is related to issue 12, noninteger pileup height...

I have a set of peak calls from the following command (CentOS6 / Python
2.7.2):

macs2 callpeak -t IP.bam -c input.bam -g mm -p 0.05 --nomodel

And in the resulting peaks.xls file, the pileup heights are, almost
without exception, higher than the actual pileup heights from the bam file.

Using the reported peak chrom, start, and end, I assay bam pileup heights
with the following command:

samtools depth -r chrom:start-end IP.bam | cut -f3 | sort | uniq -c

And the maximum is always lower than the macs2 pileup value. This is
especially problematic because I am using pileup heights to filter peaks,
and if I want peaks with height >= 10 then macs2 should not tell me it is
10 when in fact it is 4, or 15 when it is 9, or 22 when 12, etc.

Also, I am testing the original bam, and macs2 supposedly removes
duplicates, so the disparities are probably even worse.

So how are the pileup values generated, and why are they not the IP height?

Thanks,
Ariel


Reply to this email directly or view it on GitHubhttps://github.com//issues/18
.


Reply to this email directly or view it on GitHubhttps://github.com//issues/18#issuecomment-15702463.

@taoliu
Copy link
Contributor

taoliu commented Oct 29, 2014

Related to issue #53?

@taoliu
Copy link
Contributor

taoliu commented Mar 8, 2016

Closing due to lack of activity.

@taoliu taoliu closed this as completed Mar 8, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants