# Simulate Genomic Data for Exercise 1 in Applied Genetic Analysis
This notebook simulates a pedigree to be analysed in the course Applied Genetic Analysis in SS2019. The material shown here is based on the JuliaCourse held in December 2018 in Munich.

First, the packages `Distributions` and `Random` are loaded. Then a seed is set so that the simulation is reproducible.

In [1]:
using Distributions, Random
Random.seed!(2345);
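
The effect of the seed can be checked with a small sketch that uses only the standard library: reseeding with the same value reproduces the same random draws. The variable names `a` and `b` are illustrative only.

```julia
using Random

Random.seed!(2345)
a = rand(3)          # first draws after seeding

Random.seed!(2345)   # reset the generator to the same state
b = rand(3)          # same draws as `a`

println(a == b)      # true
```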

## Initialize The Sampler
We start from the code in day2/dataSimulation and reduce it to a much smaller example: one chromosome with 100 equally spaced loci, each with allele frequency 0.5 and no mutation.

In [2]:
using XSim
chrLength = 1.0
numChr    = 1
numLoci   = 100
mutRate   = 0.0
locusInt  = chrLength/numLoci
mapPos    = collect(0:locusInt:(chrLength-0.0001))
geneFreq  = fill(0.5,numLoci)
XSim.build_genome(numChr,chrLength,numLoci,geneFreq,mapPos,mutRate)
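
The map positions passed to `build_genome` can be inspected in plain Julia, without XSim. The sketch below repeats the construction from the cell above and compares it with an alternative that fixes the number of points instead of subtracting a small offset from the upper bound. `mapPosAlt` is a name introduced here for illustration.

```julia
chrLength = 1.0
numLoci   = 100
locusInt  = chrLength / numLoci

# Same construction as in the cell above: stop just short of chrLength
mapPos = collect(0:locusInt:(chrLength - 0.0001))

# Alternative without the 0.0001 offset: fix the number of points instead
mapPosAlt = collect(range(0.0, step = locusInt, length = numLoci))

println(length(mapPos))                         # 100
println(first(mapPos), " ... ", last(mapPos))
```

Both vectors hold one position per locus, starting at 0.0 and ending one interval before the end of the chromosome.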

In [3]:
### #checking the result
XSim.G

XSim.GenomeInfo(XSim.ChromosomeInfo[], 0, 0.0, 0.0, Int64[], Float64[])

## Random Mating In Finite Populations
We start by generating a founder population. 

In [4]:
### # specify the number of sires and dams
popSizeFounderSire=10
popSizeFounderDam=40
### # sample the founder population
sires = sampleFounders(popSizeFounderSire);
dams = sampleFounders(popSizeFounderDam);
animalFounders = concatCohorts(sires,dams);

Sampling 10 animals into base population.
Sampling 40 animals into base population.
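
As a rough illustration of what founder sampling amounts to when every locus has allele frequency 0.5 (as set in `geneFreq`), the sketch below draws two haplotypes per founder from independent Bernoulli draws and adds them to a 0/1/2 genotype. This is an assumption about the general idea, not XSim's implementation, and `sampleHaplotype` is a hypothetical helper.

```julia
using Random

rng      = MersenneTwister(2345)
numLoci  = 100
geneFreq = fill(0.5, numLoci)

# Hypothetical helper: one haplotype, allele 1 at locus j with probability freq[j]
sampleHaplotype(rng, freq) = Int.(rand(rng, length(freq)) .< freq)

nFounders = 50   # 10 sires + 40 dams
# Genotype = sum of the two haplotypes, coded 0/1/2
genotypes = [sampleHaplotype(rng, geneFreq) .+ sampleHaplotype(rng, geneFreq)
             for _ in 1:nFounders]

println(length(genotypes), " founders, ", length(genotypes[1]), " loci each")
```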


## Random Mating
The founder cohorts `sires` and `dams` are mated at random to produce the next generation of 500 offspring (reported as generation 2 in the output below). The results are stored in `sires1`, `dams1` and `gen1`.

In [5]:
ngen,popSize = 1,500
sires1,dams1,gen1 = sampleRan(popSize, ngen, sires, dams);

Generation     2: sampling   250 males and   250 females


## Selection
From the males generated above, we keep the first 25 as sires for the next round of mating.

In [6]:
sires1sel= XSim.Cohort(Array{XSim.Animal,1}(undef,0),Array{Int64,2}(undef,0,0))
sires1sel.animalCohort = sires1.animalCohort[1:25];
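
The slice `[1:25]` keeps the first 25 males in storage order. If a random subset were wanted instead, one option in plain Julia is sketched below. `maleIds` is a hypothetical stand-in for the animals in `sires1.animalCohort`, so the sketch runs without XSim.

```julia
using Random

rng     = MersenneTwister(2345)
maleIds = collect(1:250)                # hypothetical stand-in for the 250 males

firstN  = maleIds[1:25]                 # what the cell above does: first 25
randomN = shuffle(rng, maleIds)[1:25]   # 25 drawn at random without replacement

println(length(firstN), " ", length(randomN))   # 25 25
```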

## Generation 2

In [7]:
sires = concatCohorts(sires,sires1sel);
dams = concatCohorts(dams, dams1);

In [8]:
ngen,popSize = 1,5000
sires2,dams2,gen2 = sampleRan(popSize, ngen, sires, dams);

Generation     2: sampling  2500 males and  2500 females


## Combining all data

All animals are combined into a single cohort, from which the genotypes and phenotypes are obtained.


In [13]:
animals=concatCohorts(sires, dams, sires2, dams2);
M = getOurGenotypes(animals);
resVar=9.3
P = getOurPhenVals(animals, resVar);
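
`getOurPhenVals` is called with a residual variance of 9.3. A common model behind such a call is phenotype = genetic value + normally distributed residual. The sketch below assumes that model and uses made-up genetic values `g`. It does not reproduce XSim's internals.

```julia
using Random, Statistics

rng    = MersenneTwister(2345)
resVar = 9.3
n      = 10_000

g = randn(rng, n)                    # made-up genetic values, for illustration only
e = sqrt(resVar) .* randn(rng, n)    # residuals with variance resVar
y = g .+ e                           # phenotypes under the assumed model

println(round(var(e), digits = 2))   # close to 9.3
```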

## Writing The Data
Now that we have generated the data, we must write them to files. The data consist of 

* marker and QTL genotypes
* phenotypic observations
* pedigree information

Before the data is written, we first delete any old files from previous runs. Otherwise new data gets appended to the old files.

In [15]:
outFile = "data_w09"
# delete old files first; force=true ignores files that do not exist
for ext in ("ped", "phe", "brc", "gen")
    rm("$(outFile).$ext", force=true)
end
# write new output
outputPedigree(animals, outFile)
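
Why the cleanup matters can be shown with a small sketch in a temporary directory. It assumes, as stated above, that output is written in append mode. `appendRecord` and the record content are hypothetical and only mimic that behaviour.

```julia
path = joinpath(mktempdir(), "demo.ped")   # hypothetical file in a temp dir

# Hypothetical writer that appends one record to the file
appendRecord(path) = open(io -> println(io, "1 0 0"), path, "a")

appendRecord(path)
appendRecord(path)                    # second run without cleanup
nStale = length(readlines(path))      # 2: old and new record

rm(path, force = true)                # cleanup before writing
appendRecord(path)
nFresh = length(readlines(path))      # 1: only the new record

println(nStale, " ", nFresh)          # 2 1
```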

## Convert This Notebook


In [11]:
;jupyter nbconvert --to slides SimulateDataEx04.ipynb



