The value of total_ivl_bp (and anything derived from it) in the output of bedtools summary is incorrect if the intervals in the input BED/GTF/VCF overlap. The reason is that this is calculated simply as the sum of the lengths (end - start) of all intervals, so any overlapping regions are double-counted. You can easily see this if you compare the value for all chromosomes in the total_ivl_bp column from a BED file with overlapping intervals to the output of bedtools jaccard -a x.bed -b x.bed, where intersection and union are identical and correspond to the correct value. (The latter is also the same value that you get when subtracting the sum of all intervals in the complement of the BED from the genome length.)
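To make the double-counting concrete, here is a minimal Python sketch. It does not reproduce the bedtools source; the function names `naive_total_bp` and `merged_total_bp` and the three example intervals are made up for illustration. The first function sums end - start over all intervals, as described above; the second merges overlapping intervals per chromosome before summing, which is the covered length one would expect to be reported.

```python
from collections import defaultdict


def naive_total_bp(intervals):
    """Sum of (end - start) over all intervals; overlaps are counted repeatedly."""
    return sum(end - start for _chrom, start, end in intervals)


def merged_total_bp(intervals):
    """Number of bases covered by at least one interval (overlaps counted once)."""
    by_chrom = defaultdict(list)
    for chrom, start, end in intervals:
        by_chrom[chrom].append((start, end))

    total = 0
    for ivls in by_chrom.values():
        ivls.sort()  # sort by start, then end
        cur_start, cur_end = ivls[0]
        for start, end in ivls[1:]:
            if start <= cur_end:
                # Overlapping or directly adjacent: extend the current block.
                cur_end = max(cur_end, end)
            else:
                # Gap: close the current block and start a new one.
                total += cur_end - cur_start
                cur_start, cur_end = start, end
        total += cur_end - cur_start
    return total


# Hypothetical BED-style records (chrom, start, end), 0-based half-open.
intervals = [
    ("chr1", 0, 100),
    ("chr1", 50, 150),  # overlaps the previous interval by 50 bp
    ("chr2", 10, 20),
]

naive = naive_total_bp(intervals)    # 100 + 100 + 10
merged = merged_total_bp(intervals)  # 150 + 10
print(naive, merged, naive - merged)
```

For this toy input the naive sum exceeds the merged length by exactly the 50 bp that the two chr1 intervals share, which is the kind of discrepancy visible when comparing total_ivl_bp with the jaccard output.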
tdanhorn changed the title from "Length calculation in summary is incorrect if intervals in the inout file overlap" to "Length calculation in summary is incorrect if intervals in the input file overlap" on Jan 3, 2024