# AIList benchmarks

This section shows how to benchmark the code. We assume you have finished the introduction, compiled `ailist`, and put the executable on your path.

We also include implementations of two other data structures: NCList (obtained from [ncls](https://github.com/hunt-genes/ncls)) and AITree (obtained from [kerneltree](https://github.com/biocore-ntnu/kerneltree/)). Compile these tools as follows:

In [7]:
cd
cd AIList/src_AITree
make

gcc -Wall -ggdb -D_FILE_OFFSET_BITS=64 -c AITree.c -o AITree.o
AITree.c: In function ‘main’:
     clock_t start1, end1, end2;
                           ^
     clock_t start1, end1, end2;
                     ^
     clock_t start1, end1, end2;
             ^
AITree.c: At top level:
 static void print_nodes(unsigned long start, unsigned long end)
             ^
gcc -Wall -ggdb -D_FILE_OFFSET_BITS=64 interval_tree.o rbtree.o AITree.o -o AITree


In [9]:
cd ../src_NCList
gcc -o NCList intervaldb.c

Now that all our tools are compiled, we will make sure we can run them:

In [11]:
./AIList/bin/AIList AIListTestData/chainOrnAna1.bed AIListTestData/exons.bed | head

chr1	11871	25924	13
chr1	14786	15089	2
chr1	16586	17305	3
chr1	17962	18067	1
chr1	18118	18426	1
chr1	19159	24916	1
chr1	24680	24904	1
chr1	29183	29815	1
chr1	49736	63898	0
chr1	52067	70851	1


In [12]:
time ./AIList/src_AITree/AITree AIListTestData/chainOrnAna1.bed AIListTestData/exons.bed | head

chr1	11871	25924
	13
chr1	14786	15089
	2
chr1	16586	17305
	3
chr1	17962	18067
	1
chr1	18118	18426
	1

real	0m1.314s
user	0m1.292s
sys	0m0.020s


In [9]:
cd 
time ./AIList/src_NCList/NCList AIListTestData/chainOrnAna1.bed AIListTestData/exons.bed | head

0:	43366	41323	1005
1:	34022	32959	554
2:	25183	24276	501
3:	16265	15757	269
4:	18229	17534	371
5:	21608	20636	492
6:	20752	19925	447
7:	14516	14031	261
8:	17835	17188	332
9:	19420	18656	363

real	0m1.045s
user	0m1.028s
sys	0m0.016s


In [16]:
time bedtools intersect -c -a AIListTestData/chainOrnAna1.bed -b AIListTestData/exons.bed | head

chr1	11871	25924	13
chr1	14786	15089	2
chr1	16586	17305	3
chr1	17962	18067	1
chr1	18118	18426	1
chr1	19159	24916	1
chr1	24680	24904	1
chr1	29183	29815	1
chr1	49736	63898	0
chr1	52067	70851	1

real	0m1.920s
user	0m1.848s
sys	0m0.060s
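The first ten lines from `bedtools intersect -c` match the first ten lines from AIList shown earlier. Whether the two tools agree on the whole file is not stated here, so the sketch below checks it by comparing complete outputs. `compare_results` and the `.out` file names are hypothetical, and the tool invocations are copied from the commands above.

```shell
# Hypothetical helper: report whether two result files are byte-identical.
compare_results() {
    if cmp -s "$1" "$2"; then
        echo "identical"
    else
        echo "different"
        # Show the first few differing lines for inspection.
        diff "$1" "$2" | head -n 5
    fi
}

a=AIListTestData/chainOrnAna1.bed
b=AIListTestData/exons.bed

# Only run the comparison when both tools and both inputs are available.
if [ -x ./AIList/bin/AIList ] && command -v bedtools >/dev/null 2>&1 \
        && [ -f "$a" ] && [ -f "$b" ]; then
    ./AIList/bin/AIList "$a" "$b" > ailist.out
    bedtools intersect -c -a "$a" -b "$b" > bedtools.out
    compare_results ailist.out bedtools.out
fi
```

A result of `different` does not by itself indicate a bug, since the tools may differ in ordering or formatting beyond the lines shown above.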


The next section shows how to reproduce the benchmark figures from the paper.

## Benchmark code

Download the test data for the benchmarks (the commands above also read their input from this `AIListTestData` directory):

In [7]:
cd
wget http://big.databio.org/example_data/sailer/AIListTestData.tgz
tar -xf AIListTestData.tgz

--2019-01-08 17:25:00--  http://big.databio.org/example_data/sailer/AIListTestData.tgz
Resolving big.databio.org (big.databio.org)... 128.143.8.170
Connecting to big.databio.org (big.databio.org)|128.143.8.170|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 413053197 (394M) [application/octet-stream]
Saving to: ‘AIListTestData.tgz’


2019-01-08 17:25:04 (102 MB/s) - ‘AIListTestData.tgz’ saved [413053197/413053197]
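Because the archive is large, it can be worth confirming that the download completed before unpacking it. The sketch below compares the file size against the 413053197 bytes reported in the log above. `has_size` is a hypothetical helper, and the expected size is an assumption that holds only if the archive on the server has not changed since that log was recorded.

```shell
# Hypothetical helper: succeed only if FILE exists and is exactly BYTES long.
has_size() {
    [ -f "$1" ] || return 1
    # wc pads its output on some systems, so strip the spaces.
    actual=$(wc -c < "$1" | tr -d ' ')
    [ "$actual" -eq "$2" ]
}

# Expected size taken from the wget log above (an assumption, see lead-in).
if has_size "$HOME/AIListTestData.tgz" 413053197; then
    echo "AIListTestData.tgz matches the expected size"
else
    echo "AIListTestData.tgz is missing or has an unexpected size"
fi
```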



Here is additional code that will run meaningful benchmarks on these larger datasets: