**Take a look at the intersect manual**

In [1]:
module load bedtools2

In [15]:
bedtools intersect -h  >& intersect.log

In [16]:
cat intersect.log


Tool:    bedtools intersect (aka intersectBed)
Version: v2.25.0
Summary: Report overlaps between two feature files.

Usage:   bedtools intersect [OPTIONS] -a <bed/gff/vcf/bam> -b <bed/gff/vcf/bam>

	Note: -b may be followed with multiple databases and/or 
	wildcard (*) character(s). 
Options: 
	-wa	Write the original entry in A for each overlap.

	-wb	Write the original entry in B for each overlap.
		- Useful for knowing _what_ A overlaps. Restricted by -f and -r.

	-loj	Perform a "left outer join". That is, for each feature in A
		report each overlap with B.  If no overlaps are found, 
		report a NULL feature for B.

	-wo	Write the original A and B entries plus the number of base
		pairs of overlap between the two features.
		- Overlaps restricted by -f and -r.
		  Only A features with overlap are reported.

	-wao	Write the original A and B entries plus the number of base
		pairs of overlap between the two features.
		- Overlapping features restricted by -f and -r.
		  However, A fea

In [4]:
TAB="$(printf '\t')"

cat > A.bed << EOF
chr1${TAB}1${TAB}2
chr1${TAB}3${TAB}5
chr1${TAB}2${TAB}10
chr2${TAB}1${TAB}2
EOF

In [5]:
TAB="$(printf '\t')"

cat > B.bed << EOF
chr1${TAB}1${TAB}5
chr1${TAB}3${TAB}10
chr1${TAB}5${TAB}15
EOF

## Default behavior: report the interval of each pairwise overlap

For the regions on chromosome 1 (BED intervals are 0-based and half-open, so 3-5 and 5-15 touch but do not overlap):
```
1-2   + 1-5  ===> 1-2
1-2   + 3-10 ===> NULL
1-2   + 5-15 ===> NULL
3-5   + 1-5  ===> 3-5
3-5   + 3-10 ===> 3-5
3-5   + 5-15 ===> NULL
2-10  + 1-5  ===> 2-5
2-10  + 3-10 ===> 3-10
2-10  + 5-15 ===> 5-10
```
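
The table above can be reproduced without bedtools. The following is a minimal sketch, not how bedtools itself is implemented: it assumes tab-separated BED3 input and compares every A interval with every B interval, keeping a pair when `max(start) < min(end)`. The scratch directory and the variable names (`workdir`, `overlaps`) are only illustrative.

```shell
# Work in a scratch directory so the real A.bed and B.bed are untouched.
workdir="$(mktemp -d)"

# Same intervals as A.bed and B.bed above (tab-separated).
printf 'chr1\t1\t2\nchr1\t3\t5\nchr1\t2\t10\nchr2\t1\t2\n' > "$workdir/A.bed"
printf 'chr1\t1\t5\nchr1\t3\t10\nchr1\t5\t15\n' > "$workdir/B.bed"

# First file (B) is loaded into arrays; every line of the second file (A)
# is then compared against all B intervals on the same chromosome.
# Half-open intervals overlap when max(start) < min(end).
overlaps="$(awk 'BEGIN { FS = OFS = "\t" }
  NR == FNR { bchr[NR] = $1; bstart[NR] = $2 + 0; bend[NR] = $3 + 0; n = NR; next }
  {
    for (i = 1; i <= n; i++) {
      if ($1 != bchr[i]) continue
      s = ($2 + 0 > bstart[i]) ? $2 + 0 : bstart[i]   # max of the two starts
      e = ($3 + 0 < bend[i])   ? $3 + 0 : bend[i]     # min of the two ends
      if (s < e) print $1, s, e
    }
  }' "$workdir/B.bed" "$workdir/A.bed")"

printf '%s\n' "$overlaps"
rm -r "$workdir"
```

The six lines it prints correspond to the six non-NULL rows of the table; `e - s` for each pair is the overlap length in base pairs.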

In [6]:
bedtools intersect -a A.bed -b B.bed

chr1	1	2
chr1	3	5
chr1	3	5
chr1	2	5
chr1	3	10
chr1	5	10


## Report the original entries instead of the overlap intervals

From the output above, we already know that the overlapping pairs are:
```
1-2   + 1-5  ===> 1-2
3-5   + 1-5  ===> 3-5
3-5   + 3-10 ===> 3-5
2-10  + 1-5  ===> 2-5
2-10  + 3-10 ===> 3-10
2-10  + 5-15 ===> 5-10
```

**1. Original entries from A (`-wa`)**

In [7]:
bedtools intersect -a A.bed -b B.bed -wa

chr1	1	2
chr1	3	5
chr1	3	5
chr1	2	10
chr1	2	10
chr1	2	10


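**2. Original entries from B (`-wb`)**

Used alone, `-wb` still reports the overlapping portion of A in the first three columns and appends the original B entry.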

In [8]:
bedtools intersect -a A.bed -b B.bed -wb

chr1	1	2	chr1	1	5
chr1	3	5	chr1	1	5
chr1	3	5	chr1	3	10
chr1	2	5	chr1	1	5
chr1	3	10	chr1	3	10
chr1	5	10	chr1	5	15


**3. Original entries from both A and B (`-wa -wb`)**

In [9]:
bedtools intersect -a A.bed -b B.bed -wa -wb

chr1	1	2	chr1	1	5
chr1	3	5	chr1	1	5
chr1	3	5	chr1	3	10
chr1	2	10	chr1	1	5
chr1	2	10	chr1	3	10
chr1	2	10	chr1	5	15


## Report the length of each overlap

In [10]:
bedtools intersect -a A.bed -b B.bed -wo

chr1	1	2	chr1	1	5	1
chr1	3	5	chr1	1	5	2
chr1	3	5	chr1	3	10	2
chr1	2	10	chr1	1	5	3
chr1	2	10	chr1	3	10	7
chr1	2	10	chr1	5	15	5


In [11]:
bedtools intersect -a A.bed -b B.bed -wo | wc -l

6


## Count the number of overlaps for regions in A

In [59]:
bedtools intersect -a A.bed -b B.bed -c

chr1	1	2	1
chr1	3	5	2
chr1	2	10	3
chr2	1	2	0


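For regions with at least one overlap, this gives the same counts as the command below. Note that `uniq -c` only counts adjacent duplicate lines and cannot report the zero-overlap region on chr2: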

In [61]:
bedtools intersect -a A.bed -b B.bed -wa | uniq -c

      1 chr1	1	2
      2 chr1	3	5
      3 chr1	2	10


## Report regions in A that overlap nothing in B

In [62]:
bedtools intersect -a A.bed -b B.bed -v

chr2	1	2


## Other metric related to overlap: Jaccard index

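`bedtools jaccard` expects input sorted by chromosome and then by start position, so running `bedtools jaccard -a A.bed -b B.bed` on the unsorted files fails: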
```
Error: Sorted input specified, but the file A.bed has the following out of order record
chr1	2	10
```

In [67]:
# Sort by chromosome name, then numerically by start position
sort -k1,1 -k2,2n A.bed > A_sort.bed
sort -k1,1 -k2,2n B.bed > B_sort.bed
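
A plain `sort` compares whole lines as text, which happens to give coordinate order for the small files here but not in general. The sketch below uses a hypothetical three-line file (`demo.bed`, not part of this tutorial) to show where text order and `sort -k1,1 -k2,2n` disagree; `LC_ALL=C` is set only to make the text ordering reproducible.

```shell
workdir="$(mktemp -d)"

# Hypothetical intervals chosen so that text order and numeric order differ.
printf 'chr1\t10\t20\nchr1\t2\t5\nchr1\t100\t200\n' > "$workdir/demo.bed"

# Text sort: "10" and "100" sort before "2" because '1' < '2' as characters.
text_sorted="$(LC_ALL=C sort "$workdir/demo.bed")"

# Coordinate sort: chromosome as text, start position as a number.
coord_sorted="$(LC_ALL=C sort -k1,1 -k2,2n "$workdir/demo.bed")"

printf 'text order:\n%s\n\ncoordinate order:\n%s\n' "$text_sorted" "$coord_sorted"
rm -r "$workdir"
```

Only the second ordering puts the interval starting at 2 first, which is what tools that require sorted input expect.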

In [68]:
bedtools jaccard -a A_sort.bed -b B_sort.bed

intersection	union-intersection	jaccard	n_intersections
9	15	0.6	1
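
The first three numbers can be checked by hand. The sketch below is not how bedtools computes the statistic; it assumes that the second column is the number of bases covered by A or B, and that `jaccard` is the first column divided by the second. It simply marks every covered base, which is only practical for toy intervals like these. The names `workdir` and `stats` are illustrative.

```shell
workdir="$(mktemp -d)"

# Same intervals as A.bed and B.bed above (tab-separated).
printf 'chr1\t1\t2\nchr1\t3\t5\nchr1\t2\t10\nchr2\t1\t2\n' > "$workdir/A.bed"
printf 'chr1\t1\t5\nchr1\t3\t10\nchr1\t5\t15\n' > "$workdir/B.bed"

# Mark each base (chrom:position) covered by A and by B, then count the
# bases covered by both files and the bases covered by at least one.
stats="$(awk 'BEGIN { FS = "\t" }
  NR == FNR { for (p = $2 + 0; p < $3 + 0; p++) in_a[$1 ":" p] = 1; next }
            { for (p = $2 + 0; p < $3 + 0; p++) in_b[$1 ":" p] = 1 }
  END {
    for (k in in_a) { either++; if (k in in_b) both++ }
    for (k in in_b) if (!(k in in_a)) either++
    printf "%d\t%d\t%g\n", both, either, both / either
  }' "$workdir/A.bed" "$workdir/B.bed")"

printf 'intersection\tcovered_by_either\tratio\n%s\n' "$stats"
rm -r "$workdir"
```

A covers 10 bases (chr1 1-10 and chr2 1-2), B covers 14 (chr1 1-15), and 9 are shared, so the ratio is 9 / (10 + 14 - 9) = 9 / 15 = 0.6.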
