# Alignment API Usage

## Import package

In [2]:
from pyBioTools import Alignment
from pyBioTools.common import jhelp

## index_reads

In [8]:
jhelp(Alignment.index_reads)

**index_reads** (bam_fn, min_mapq, keep_unmapped, keep_secondary, keep_supplementary, kwargs)

Index the reads of a coordinate-sorted bam file by read_id. The resulting index file can be used to randomly access the alignment file by read_id.

---

* **bam_fn** (required) [str]

Path to the bam file to index

* **min_mapq** (default: 0) [int]

Minimum mapping quality (mapq) a read must have to be included in the index

* **keep_unmapped** (default: False) [bool]

If True, unmapped reads are included in the index

* **keep_secondary** (default: False) [bool]

If True, secondary alignments are included in the index

* **keep_supplementary** (default: False) [bool]

If True, supplementary alignments are included in the index

* **kwargs**

Allows passing extra options such as verbose and quiet



### Basic usage

In [3]:
Alignment.index_reads("./data/sample_1.bam", verbose=True)

Checking Bam file
Parsing reads
13684 Reads [00:00, 20616.59 Reads/s]

Read counts summary
	primary: 10,584
	secondary: 1,496
	unmapped: 1,416
	supplementary: 188
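
As a quick sanity check, the four categories in the summary above should add up to the 13,684 reads reported by the progress bar. The snippet below only re-adds the numbers printed above; it does not call pyBioTools, and the variable names are illustrative.

```python
# Counts copied from the "Read counts summary" printed above
counts = {
    "primary": 10_584,
    "secondary": 1_496,
    "unmapped": 1_416,
    "supplementary": 188,
}

# The categories partition the parsed reads, so their sum is the total
total = sum(counts.values())
print(f"{total:,} reads in total")

# Share of each category among all parsed reads
for name, n in counts.items():
    print(f"{name}: {n / total:.1%}")
```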



### Excluding reads from index

In [3]:
Alignment.index_reads("./data/sample_1.bam", verbose=True, min_mapq=30, keep_supplementary=True)

Checking Bam file
Parsing reads
13684 Reads [00:00, 20352.85 Reads/s]

Read counts summary
	retained
		primary: 9,824
		supplementary: 188
	skipped
		secondary: 1,496
		unmapped: 1,416
		low_mapq: 760
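
The retained and skipped categories above can be pictured as a per-read decision based on the SAM flag and the mapping quality. The following is a minimal sketch under stated assumptions, not the pyBioTools implementation: `classify_read` is a hypothetical helper, and the order of the checks is a guess, chosen so that the secondary count does not depend on `min_mapq`, as in the two summaries shown. Only the flag bit values come from the SAM specification.

```python
from collections import Counter

# SAM flag bits, as defined in the SAM specification
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800

def classify_read(flag, mapq, min_mapq=0, keep_unmapped=False,
                  keep_secondary=False, keep_supplementary=False):
    """Hypothetical helper: return (category, retained) for one alignment.

    The order of the checks is an assumption, not taken from pyBioTools.
    """
    if flag & UNMAPPED:
        return "unmapped", keep_unmapped
    if flag & SECONDARY:
        return "secondary", keep_secondary
    if mapq < min_mapq:
        return "low_mapq", False
    if flag & SUPPLEMENTARY:
        return "supplementary", keep_supplementary
    return "primary", True

# Toy (flag, mapq) records, made up for illustration
records = [(0, 60), (16, 12), (4, 0), (256, 0), (2048, 45)]

summary = Counter()
for flag, mapq in records:
    category, retained = classify_read(
        flag, mapq, min_mapq=30, keep_supplementary=True
    )
    summary[("retained" if retained else "skipped", category)] += 1

for (status, category), n in sorted(summary.items()):
    print(status, category, n)
```

With the same options as the call above (`min_mapq=30`, `keep_supplementary=True`), the toy primary and supplementary records are retained and the other three are skipped.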



## sample_reads

In [4]:
jhelp(Alignment.sample_reads)

**sample_reads** (bam_fn, out_folder, out_prefix, n_reads, n_samples, rand_seed, kwargs)

Randomly sample `n_reads` reads from a bam file and write them to `n_samples` downsampled bam files. If the input bam file is not indexed by read_id, `index_reads` is called automatically.

---

* **bam_fn** (required) [str]

Path to the indexed bam file

* **out_folder** (default: ./) [str]

Path to the folder where sample files are written

* **out_prefix** (default: out) [str]

Prefix for the names of the sample files

* **n_reads** (default: 1000) [int]

Number of randomly selected reads in each sample

* **n_samples** (default: 1) [int]

Number of samples to generate files for

* **rand_seed** (default: 42) [int]

Seed for the pseudo-random generator. For non-deterministic behaviour, set to 0

* **kwargs**

Allows passing extra options such as verbose and quiet
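
To illustrate how `n_reads`, `n_samples` and `rand_seed` interact, here is a minimal pure-Python sketch of seeded sampling over a list of read ids. It is not the pyBioTools code: `draw_samples` is a hypothetical helper, and two points are assumptions, namely that reads are drawn without replacement within each sample and that a seed of 0 leaves the generator unseeded.

```python
import random

def draw_samples(read_ids, n_reads=1000, n_samples=1, rand_seed=42):
    """Hypothetical helper: draw n_samples random subsets of n_reads read ids."""
    # Assumption: a seed of 0 means "do not fix the seed"
    rng = random.Random(rand_seed if rand_seed != 0 else None)
    # Sort first so that a given seed always yields the same samples
    ordered_ids = sorted(read_ids)
    # Each sample is drawn without replacement, independently of the others
    return [rng.sample(ordered_ids, n_reads) for _ in range(n_samples)]

# Made-up read ids for illustration
read_ids = [f"read_{i}" for i in range(100)]

first = draw_samples(read_ids, n_reads=5, n_samples=3)
second = draw_samples(read_ids, n_reads=5, n_samples=3)
print(first == second)  # True: same seed, same samples
```

Keeping the default seed makes downsampled files reproducible between runs, which is usually what is wanted when comparing tools on the same subset.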



### Basic usage

In [7]:
Alignment.sample_reads("./data/sample_1.bam", out_folder="./output/sample_reads/", out_prefix="1K", n_reads=500, n_samples=3)

Load index
	Index: 11188it [00:00, 274508.60it/s]
Write sample reads
	Sample 1: 100%|██████████| 500/500 [00:00<00:00, 1185.16 Reads/s]
	Index output file
	Sample 2: 100%|██████████| 500/500 [00:00<00:00, 1262.96 Reads/s]
	Index output file
	Sample 3: 100%|██████████| 500/500 [00:00<00:00, 1272.82 Reads/s]
	Index output file


In [8]:
ll ./output/sample_reads/

total 4580
-rw-rw-r-- 1 aleg 1584473 May 30 16:47 1K_1.bam
-rw-rw-r-- 1 aleg   10624 May 30 16:47 1K_1.bam.bai
-rw-rw-r-- 1 aleg 1504506 May 30 16:47 1K_2.bam
-rw-rw-r-- 1 aleg   11256 May 30 16:47 1K_2.bam.bai
-rw-rw-r-- 1 aleg 1558478 May 30 16:47 1K_3.bam
-rw-rw-r-- 1 aleg   11272 May 30 16:47 1K_3.bam.bai
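
Judging from the listing above, each sample is written to `<out_prefix>_<sample number>.bam` in `out_folder`, next to a `.bai` index. The sketch below rebuilds those paths; `sample_paths` is a hypothetical helper, and the naming pattern is inferred from this listing only, not from the pyBioTools source.

```python
import os

def sample_paths(out_folder="./", out_prefix="out", n_samples=1):
    """Hypothetical helper: list the (bam, bai) path pairs expected per sample."""
    paths = []
    # Samples are numbered from 1, as in the listing above
    for i in range(1, n_samples + 1):
        bam = os.path.join(out_folder, f"{out_prefix}_{i}.bam")
        paths.append((bam, bam + ".bai"))
    return paths

for bam, bai in sample_paths("./output/sample_reads/", "1K", 3):
    print(bam, bai)
```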
