Read_extractor 🧬📊

Read_extractor it's a simple Linux script that uses Samtools to obtain reads from several genomic loci in a sorted and indexed BAM file, using a list of loci from a .csv file.

This script gets every read from the specified genomic location (formatted according to the reference genome used), and counts how many times it's repeated, to finally add that number in a new column.

Usage

-i: Path to input file. Must be a sorted and indexed BAM file.

-f: Path to csv file containing the list of loci to extract. You can easily format your loci to csv using Sed, e.g: if you have your loci in a calc or excel sheet, you can change the new line (\n) to comma (,) with sed ':a;N;$!ba;s/\n/,/g' inputfile > output.csv .

-s: Select sense or antisense strand. Works with SAM flags, select 0 for sense or 16 for antisense.

-r: Minimun number of reads to keep.

Example

./Read_extractor.sh -i input_sorted.bam -f Loci.csv -s 0 > ./output/sense.tab

Using Read_extractor_individual.sh will generate one file for each locus in the loci file, it will put the output files in the working directory, don't add ">".

Your output should look like this one:

Name		Name	Last commit message	Last commit date
Latest commit History 12 Commits
README.md		README.md
Read_extractor.sh		Read_extractor.sh
Read_extractor_individual.sh		Read_extractor_individual.sh
Reads.png		Reads.png

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

Read_extractor.sh

Read_extractor.sh

Read_extractor_individual.sh

Read_extractor_individual.sh

Reads.png

Reads.png

Repository files navigation

Read_extractor 🧬📊

Usage

Example

About

Releases

Packages

Languages

Agusabb/Read_extractor

Folders and files

Latest commit

History

Repository files navigation

Read_extractor 🧬📊

Usage

Example

About

Resources

Stars

Watchers

Forks

Languages