Thetas

ANGSD Thetas (diversity statistics) calculation. See the ANGSD documentation for full details on this method.

Before running this script

Note: This is a piggybacked method: it runs on the output of another method.

Make sure you have run SFS first; its output is used as the input for this method. That is, you should run:

bash ./scripts/SFS.sh ./scripts/SFS_TAXON.conf
bash ./scripts/THETAS.sh ./scripts/THETAS_TAXON.conf

with the proper taxon name filled in in place of TAXON. See the SFS method page for details on running SFS.
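
The two-step order above can be wrapped so that Thetas only starts when SFS has finished successfully. This is a minimal sketch, not part of the wrapper itself: the function name run_sfs_then_thetas and its optional scripts-directory argument are hypothetical, and it assumes SFS.sh exits with a non-zero status when it fails.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not shipped with the scripts): run SFS, then Thetas,
# for one taxon. Assumes SFS.sh returns a non-zero exit status on failure.
run_sfs_then_thetas() {
    local taxon="$1"
    local scripts_dir="${2:-./scripts}"

    # '&&' runs THETAS.sh only if SFS.sh exited successfully
    bash "${scripts_dir}/SFS.sh" "${scripts_dir}/SFS_${taxon}.conf" &&
        bash "${scripts_dir}/THETAS.sh" "${scripts_dir}/THETAS_${taxon}.conf"
}
```

Calling run_sfs_then_thetas TAXON from the project directory (with the real taxon name) is then equivalent to the two commands above, except that a failed SFS run stops the sequence.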

DO_SAF create SFS (default=2)
UNIQUE_ONLY uniquely mapped reads (default=1)
MIN_BASEQUAL minimum base quality (default=20)
BAQ adjust qscores around indels (as SAMtools) (default=1)
MIN_IND minimum number of individuals needed to use site (default=1)
GT_LIKELIHOOD estimate genotype likelihoods (default=2)
MIN_MAPQ minimum base mapping quality to use (default=30)
N_CORES number of cores to use (default=32)
DO_MAJORMINOR estimate major/minor alleles (default=1)
DO_MAF calculate per site frequencies (default=1)
DO_THETAS calculate diversity stats (default=1)
OVERRIDE set to true to redo analyses that have already been run; set to false to skip them (default=false)
SLIDING_WINDOW set to true to enable sliding window analysis (default=false)
WIN window size for sliding window analysis (default=50000)
STEP step size for sliding window analysis (default=10000)
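
As an illustration, the settings listed above might be written as follows. This is a sketch that assumes the .conf file is a plain shell file of NAME=value assignments; every value is the listed default except SLIDING_WINDOW, which is switched on here as an example. The OVERLAP line is illustration only and is not a variable the script is known to read.

```shell
# Sketch of the analysis settings in THETAS_TAXON.conf
# (assumed format: shell variable assignments)
DO_SAF=2
UNIQUE_ONLY=1
MIN_BASEQUAL=20
BAQ=1
MIN_IND=1
GT_LIKELIHOOD=2
MIN_MAPQ=30
N_CORES=32
DO_MAJORMINOR=1
DO_MAF=1
DO_THETAS=1
OVERRIDE=false

# Example override: enable sliding window analysis
SLIDING_WINDOW=true
WIN=50000
STEP=10000

# Illustration only: because STEP is smaller than WIN, consecutive
# windows share WIN - STEP base pairs
OVERLAP=$(( WIN - STEP ))
echo "Consecutive windows overlap by ${OVERLAP} bp"
```

With the default WIN and STEP, each window shares 40000 bp with the next one; setting STEP equal to WIN would give non-overlapping windows.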

UNIX_USER this variable fills in absolute paths for the rest of the config file and script. It should match the name of the user's home directory.
PROJECT_DIR absolute path to the location of the analysis folder
ANGSD_DIR the location of ANGSD (default=${PROJECT_DIR}/angsd)
ANC_SEQ the path to the ancestral sequence file
REF_SEQ the path to the reference sequence file
TAXON the name of the data being analyzed. The script looks in the data directory for the files listed below, which carry this name. If they are not present, the script will not work correctly.
${TAXON}_samples.txt contains a list of paths to BAM files. Check the data folder for an example.
${TAXON}_F.txt contains inbreeding coefficients for each of these samples. Check the data folder for an example.
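
Because the script does not work correctly when these files are missing, a quick check before launching can save a failed run. The sketch below is a hypothetical helper, not part of the wrapper: the function name check_taxon_inputs and the data-directory argument are assumptions, and it only checks that both files exist and are non-empty.

```shell
#!/usr/bin/env bash

# Hypothetical pre-flight check: confirm that both per-taxon input files
# exist and are non-empty in the given data directory.
check_taxon_inputs() {
    local data_dir="$1"
    local taxon="$2"
    local missing=0
    local f

    for f in "${data_dir}/${taxon}_samples.txt" "${data_dir}/${taxon}_F.txt"; do
        # -s is true only if the file exists and has a size greater than zero
        if [ ! -s "$f" ]; then
            echo "Missing or empty: $f" >&2
            missing=1
        fi
    done

    # Returns 0 when both files are present, 1 otherwise
    return "$missing"
}
```

For example, check_taxon_inputs ./data TAXON (with the real data directory and taxon name) returns success only when both files are in place.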