
## TCLamnidis / Sex.DetERRmine Public

A python script to calculate the relative coverage of X and Y chromosomes, and their associated error bars, from the depth of coverage at specified SNPs.

# Sex.DetERRmine

A python script to calculate the relative coverage of X and Y chromosomes, and their associated error bars, from the depth of coverage at specified SNPs.

# Instructions

The Python script takes a modified output of samtools depth as input, either via stdin or from a file given with the -I option. The samtools depth output must be edited manually to add a header line that begins with a # and lists the sample names (generic or specific) as column headers, as shown below:

#Chr	Pos	Sample1	Sample2	Sample3	Sample4	Sample5
1	752566	1	0	1	0	1
1	776546	0	0	0	0	0
1	832918	0	1	0	0	0
1	842013	0	1	0	3	1
...


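Alternatively, a sample/BAM list can be provided using the -f option, in which case no header is needed. This list should contain one name per line, in the same order as the columns of the samtools depth output, and can be the same list used for the samtools depth command.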
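If you prefer to add the header programmatically rather than by hand, something like the following could work. This is only a minimal sketch: `add_header` is a hypothetical helper, not part of Sex.DetERRmine, and the depth values and sample names are toy data.

```python
import io


def add_header(depth_lines, sample_names):
    """Yield a '#Chr<TAB>Pos<TAB>sample...' header, then the depth lines unchanged.

    Hypothetical helper for illustration; not part of Sex.DetERRmine.
    """
    yield "#Chr\tPos\t" + "\t".join(sample_names)
    for line in depth_lines:
        yield line.rstrip("\n")


# Toy stand-in for raw `samtools depth` output with two samples.
raw_depth = io.StringIO("1\t752566\t1\t0\n1\t776546\t0\t0\n")
samples = ["Sample1", "Sample2"]

with_header = list(add_header(raw_depth, samples))
print("\n".join(with_header))
```

The sample names must be given in the same order as the depth columns, otherwise the results will be attributed to the wrong samples.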

For instructions on the options available you can try running the script with the -h flag:

$ Sex.DetERRmine.py -h

usage: Sex.DetERRmine.py [-h] [-I <INPUT FILE>] [-f SAMPLELIST]

Calculate the relative X- and Y-chromosome coverage of data, as well as the
associated error bars for each.

optional arguments:
  -h, --help            show this help message and exit
  -I <INPUT FILE>, --Input <INPUT FILE>
                        The input samtools depth file. Omit to read from
                        stdin.
  -f SAMPLELIST, --SampleList SAMPLELIST
                        A list of samples/bams that were in the depth file.
                        One per line. Should be in the order of the samtools
                        depth output.



The script will print out the number of SNPs and the number of reads found on each of Autosomes/X/Y, as well as the relative X/Y coverage and their associated errors.
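To make the counting step concrete, here is a rough sketch of how the per-sample SNP and read tallies could be gathered from a depth file with a header. It is not the script's actual code. It assumes that the sex chromosomes are named exactly `X` and `Y`, that every other chromosome counts as autosomal, and that every position in the file counts as one SNP. The function name `tally_depth` and the input rows are made up for illustration.

```python
import io


def tally_depth(lines):
    """Return {sample: {"Aut"|"X"|"Y": [n_snps, n_reads]}} from depth lines.

    Illustrative only; chromosome naming ("X", "Y") is an assumption.
    """
    samples = []
    counts = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        fields = line.split("\t")
        if line.startswith("#"):
            # Header line: columns 3 onwards are the sample names.
            samples = fields[2:]
            counts = {s: {"Aut": [0, 0], "X": [0, 0], "Y": [0, 0]} for s in samples}
            continue
        if not samples:
            raise ValueError("depth data found before a '#' header line")
        region = fields[0] if fields[0] in ("X", "Y") else "Aut"
        for sample, depth in zip(samples, fields[2:]):
            counts[sample][region][0] += 1           # one more SNP position
            counts[sample][region][1] += int(depth)  # reads covering it
    return counts


# Toy input; positions and depths are invented for the example.
example = io.StringIO(
    "#Chr\tPos\tSample1\tSample2\n"
    "1\t752566\t1\t0\n"
    "1\t842013\t0\t3\n"
    "X\t2700157\t2\t1\n"
    "Y\t2655180\t0\t1\n"
)
counts = tally_depth(example)
for sample, regions in counts.items():
    print(sample, regions)
```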

It is possible to pipe the samtools depth output directly to this script:

samtools depth -a -q30 -Q30 -b <BED File> -f <BAM file list> | Sex.DetERRmine.py -f <BAM file list>


# Mathematical explanation

We assume that sequenced reads are distributed along the genome randomly and independently from each other. The "genome" here is made up only of positions in the input depth file.

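$N_i$ is the number of sequenced reads in a chunk $i$ of the genome (autosomes, X or Y); summed over all chunks, these give the total number of reads on target, $N$.

We can then calculate:

$$p_i = \frac{N_i}{N}, \qquad \Delta N_i = \sqrt{N \, p_i \, (1 - p_i)}$$

where $p_i$ is the proportion of all sequenced reads that map to SNPs in $i$, estimated from the input depths. The error around $N_i$, $\Delta N_i$, is the standard deviation of the binomial distribution. Then:

$$d_i = \frac{N_i}{S_i}, \qquad \Delta d_i = \frac{\Delta N_i}{S_i}$$

where $d_i$ is the average depth on SNPs within $i$, and $S_i$ is the number of SNPs in $i$.

The relative coverage on the X and Y chromosomes, taken relative to the autosomal depth $d_{aut}$, can then be calculated as:

$$x = \frac{d_X}{d_{aut}}, \qquad y = \frac{d_Y}{d_{aut}}$$

We can then use error propagation to calculate the errors around the relative X and Y coverages:

$$\Delta x = \sqrt{\left(\frac{\Delta d_X}{d_{aut}}\right)^2 + \left(\frac{d_X \, \Delta d_{aut}}{d_{aut}^2}\right)^2}, \qquad \Delta y = \sqrt{\left(\frac{\Delta d_Y}{d_{aut}}\right)^2 + \left(\frac{d_Y \, \Delta d_{aut}}{d_{aut}^2}\right)^2}$$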
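The calculation described in this section can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the script's own implementation: the binomial error is taken as sqrt(N·p·(1−p)), coverage is expressed relative to the autosomal depth, and the errors are combined with the standard first-order propagation formula for a ratio of uncorrelated quantities. The function name `relative_coverage` and the input numbers are hypothetical.

```python
import math


def relative_coverage(snps, reads):
    """Return {"X": (ratio, error), "Y": (ratio, error)}.

    snps, reads: dicts keyed by "Aut", "X", "Y" holding the number of SNPs
    and the number of reads in each chunk. Illustrative sketch only.
    """
    total = sum(reads.values())
    if total == 0 or reads["Aut"] == 0:
        raise ValueError("need at least one autosomal read")

    depth, depth_err = {}, {}
    for region in ("Aut", "X", "Y"):
        p = reads[region] / total                   # proportion of reads in chunk
        n_err = math.sqrt(total * p * (1 - p))      # binomial error on the read count
        depth[region] = reads[region] / snps[region]  # mean depth per SNP
        depth_err[region] = n_err / snps[region]

    result = {}
    for region in ("X", "Y"):
        ratio = depth[region] / depth["Aut"]
        # First-order error propagation for a ratio (assumes uncorrelated errors).
        err = math.sqrt(
            (depth_err[region] / depth["Aut"]) ** 2
            + (depth[region] * depth_err["Aut"] / depth["Aut"] ** 2) ** 2
        )
        result[region] = (ratio, err)
    return result


# Invented counts, for illustration only.
snps = {"Aut": 1000, "X": 100, "Y": 50}
reads = {"Aut": 2000, "X": 100, "Y": 50}
rates = relative_coverage(snps, reads)
print("X rate: %.3f +/- %.3f" % rates["X"])
print("Y rate: %.3f +/- %.3f" % rates["Y"])
```

Treating the chunk errors as uncorrelated is a simplification, since the read counts share the same total $N$; the sketch keeps it for clarity.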


1.1.2 Latest
Jun 11, 2020
