### qpWave and qpAdm

qpWave and qpAdm let us model a target population as a mixture of other populations, given a set of reference groups.

Following (https://comppopgenworkshop2019.readthedocs.io/en/latest/contents/05_qpwave_qpadm/qpwave_qpadm.html):

qpWave and qpAdm are tools that summarize information from multiple F-statistics to make demographic inferences. With qpWave and qpAdm we can:

- Detect the minimum number of independent gene pools needed to explain a set of target populations (qpWave)
- Test the sufficiency of an admixture model within the resolution of the data (qpAdm)
- Estimate admixture proportions (qpAdm)

Both qpWave and qpAdm require input files in EIGENSTRAT format. We can reuse the F4_dataset.* files we used for the F4 statistics.

In [None]:
mkdir qpAdm
cd qpAdm

#### Prepare the Left and Right populations

To run both qpWave and qpAdm we need two simple text files: the **right** and **left** files. Each file contains a list of populations, one per line. Single-sample groups can also be used. Every population group we list must appear in the third column of the .ind file.

- The left file should list the proxy sources of the admixture event we want to test with qpAdm.
- The right file should list the reference groups: populations differentially related to the left populations and the admixed target.

In [None]:
echo -e "FranceLIA.Anc_imputed\nNedEMA.Anc_imputed" > left.txt
echo -e "YRI\nBulgarian.HO\nCzech.HO\nEnglish.HO\nFrench.HO\nItalian.North.HO\nNorwegian.HO\nScottish.HO" > right.txt 
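
Because every label must match the third column of the .ind file exactly, it can help to check the two lists before running anything. The following is a minimal sketch, not part of the original workflow: the helper name `check_pops` is hypothetical, and the path `../F4_dataset.ind` is an assumption about where the dataset sits relative to the `qpAdm` directory.

```shell
# Print every label in a population list that is missing from column 3 of an .ind file.
# Usage: check_pops <ind_file> <list_file>   (hypothetical helper)
check_pops() {
    awk 'NR == FNR { seen[$3] = 1; next }   # first file: remember the group labels
         NF && !($1 in seen) { print $1 }   # second file: report unknown labels
        ' "$1" "$2"
}

# Assumed location of the .ind file; adjust to your own layout.
IND=../F4_dataset.ind
if [ -f "$IND" ]; then
    check_pops "$IND" left.txt
    check_pops "$IND" right.txt
fi
```

No output means that every listed population was found in the .ind file.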

### Running qpWave

To detect the minimum number of independent gene pools needed to explain a set of target populations, we are going to run qpWave. qpWave lets us check whether there was any gene flow between the left and the right populations, with the aim of selecting groups that are as independent as possible.

If the right and left populations are independent, we can move on to qpAdm.

#### Preparing the par file
As with all other EIGENSOFT/AdmixTools programs, we are going to prepare a par file for the software.
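
The par file itself is not shown here, so the following is a minimal sketch of what it might contain. It assumes the EIGENSTRAT files sit one directory up as `../F4_dataset.geno`, `../F4_dataset.snp` and `../F4_dataset.ind`. `details: YES` is an optional setting that asks for more verbose output.

```shell
# Assumption: the EIGENSTRAT files live one directory above the qpAdm folder.
DATA=../F4_dataset

# Write the par file shared by qpWave and qpAdm.
cat > qpWave_qpAdm.par <<EOF
genotypename: ${DATA}.geno
snpname:      ${DATA}.snp
indivname:    ${DATA}.ind
popleft:      left.txt
popright:     right.txt
details:      YES
EOF
```

Both programs read the left and right population lists through `popleft` and `popright`, which is why the same par file can be reused once left.txt has been edited.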

In [None]:
qpWave -p qpWave_qpAdm.par >> qpWave.log

In the log file, qpWave lists the files used, as well as the left and right populations considered for the run.

We are interested in the last lines, which contain the rank tests. Specifically, we focus on the last row, which corresponds to the highest rank. We are currently testing N=2 left populations, so the maximum rank is N-1.

We are looking for evidence that the populations considered are independent: a p-value < 0.05 in the taildiff column indicates that the selected groups are indeed independent.
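
To avoid scrolling through the whole log, the rank rows can be pulled out directly. This is a small sketch that assumes the rank rows in the log contain the string `f4rank`. The helper name `last_rank_row` is hypothetical.

```shell
# Print the last rank row of a qpWave log.
# Assumption: rank rows contain the string "f4rank".
last_rank_row() {
    grep 'f4rank' "$1" | tail -n 1
}

if [ -f qpWave.log ]; then
    last_rank_row qpWave.log
fi
```

The taildiff value discussed above is then read from that single row.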

### Running qpAdm

So far, we have tested the left and the right groups, that is, the proxy sources of the admixture event and the reference groups. We have not yet studied their relationship with the target admixed samples. Now that we know that the right and the left groups are independent, we can proceed to run qpAdm with those samples to model the target admixed group.

To run qpAdm, we need to add the target admixed group to the left.txt file. Make sure it is the first entry in left.txt.

In [None]:
sed  -i '1s/^/KoksijdeEMA.Anc_imputed\n/' left.txt
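
The `sed -i` call above relies on GNU sed behaviour (in-place editing without a suffix argument, and `\n` in the replacement text). Where that is not available, a portable alternative can be sketched as below. The helper name `ensure_first` is hypothetical. It prepends the target only when it is not already the first line, so running it twice does not duplicate the entry.

```shell
# Make sure <target> is the first line of <file>, prepending it if needed.
# Usage: ensure_first <target> <file>   (hypothetical helper)
ensure_first() {
    ef_target=$1
    ef_file=$2
    if [ "$(head -n 1 "$ef_file")" != "$ef_target" ]; then
        { printf '%s\n' "$ef_target"; cat "$ef_file"; } > "$ef_file.tmp" &&
            mv "$ef_file.tmp" "$ef_file"
    fi
}

if [ -f left.txt ]; then
    ensure_first KoksijdeEMA.Anc_imputed left.txt
    head -n 1 left.txt   # show the first entry, which should now be the target
fi
```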

We can now run qpAdm with the same par file we created earlier for qpWave.

In [None]:
qpAdm -p qpWave_qpAdm.par >> qpAdm.log

We modelled **KoksijdeEMA.Anc_imputed** as a mixture of FranceLIA.Anc_imputed and NedEMA.Anc_imputed (following the left.txt file).

- best coefficients lists the ancestry proportions assigned to each source
- std. errors lists the standard errors, computed via block jackknife
- summ gives a summary of the run
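
The three entries described above can be extracted from the log in one step. This is a minimal sketch that assumes the labels appear in the log as `best coefficients`, `std. errors` and `summ:`. The helper name `qpadm_summary` is hypothetical.

```shell
# Print the result lines of a qpAdm log.
# Assumption: the labels are spelled as in the list above.
qpadm_summary() {
    grep -E 'best coefficients|std\. errors|summ:' "$1"
}

if [ -f qpAdm.log ]; then
    qpadm_summary qpAdm.log
fi
```

The values on the best coefficients row follow the order of the sources in left.txt, after the target.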