## Example notebook: Structure with pop assignments

This notebook shows how to use the `ipyrad.analysis` tools to generate structure input files that include population information.

#### Required software

In [1]:
# conda install ipyrad -c ipyrad
# conda install structure clumpp -c ipyrad

In [2]:
## import modules
import ipyrad.analysis as ipa

### Create a structure analysis object
If you include a 'mapfile', locus information is used to subsample a single SNP from each locus, so that the resulting data file meets structure's expectation that SNPs are "unlinked". If you create multiple replicate files using different random seeds, different SNPs will be selected in each replicate.

In [3]:
s = ipa.structure(
    name="test",
    workdir="analysis-structure",
    data="analysis-ipyrad/ped_min10_outfiles/ped_min10.str",
    mapfile="analysis-ipyrad/ped_min10_outfiles/ped_min10.snps.map",
)
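
To illustrate the subsampling idea described above, here is a minimal sketch that picks one SNP per locus at random. It is not ipyrad's actual implementation: the `subsample_unlinked` helper and the `snp_to_locus` list are hypothetical names used only for illustration.

```python
import random
from collections import defaultdict

def subsample_unlinked(snp_to_locus, seed):
    """Return sorted column indices of one randomly chosen SNP per locus."""
    rng = random.Random(seed)
    by_locus = defaultdict(list)
    # group SNP column indices by the locus they belong to
    for snp_idx, locus in enumerate(snp_to_locus):
        by_locus[locus].append(snp_idx)
    # draw a single SNP from each locus
    return sorted(rng.choice(by_locus[locus]) for locus in sorted(by_locus))

# hypothetical map: position = SNP column index, value = locus id
snp_to_locus = [0, 0, 0, 1, 2, 2, 3, 3, 3, 3]

rep1 = subsample_unlinked(snp_to_locus, seed=1)
rep2 = subsample_unlinked(snp_to_locus, seed=2)
print(rep1, rep2)
```

Each replicate keeps exactly one SNP per locus, and a different seed may select a different SNP within a locus.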

#### Set params for the structure analysis
These values are used to generate the "mainparams" and "extraparams" files for structure. 

In [4]:
## set run parameters
s.mainparams.burnin = 20000
s.mainparams.numreps = 100000

## tell structure to expect popdata & popflag
s.mainparams.popdata = 1
s.mainparams.popflag = 1

## print all mainparams
s.mainparams

burnin             20000               
extracols          0                   
label              1                   
locdata            0                   
mapdistances       0                   
markernames        0                   
markovphase        0                   
missing            -9                  
notambiguous       -999                
numreps            100000              
onerowperind       0                   
phased             0                   
phaseinfo          0                   
phenotype          0                   
ploidy             2                   
popdata            1                   
popflag            1                   
recessivealleles   0                   
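
As a rough picture of how such settings end up in a params file, the sketch below renders a dictionary in the `#define KEY value` style that structure's parameter files use. The `render_params` helper is hypothetical, and the exact text that ipyrad writes may differ.

```python
def render_params(params):
    """Render a dict of settings as '#define KEY value' lines."""
    return "\n".join(
        f"#define {key.upper()} {value}" for key, value in sorted(params.items())
    )

# a small subset of the mainparams set above, for illustration only
mainparams = {"burnin": 20000, "numreps": 100000, "popdata": 1, "popflag": 1}

text = render_params(mainparams)
print(text)
```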

In [5]:
## tell structure to use popinfo
s.extraparams.usepopinfo = 1

## print all other extraparams
s.extraparams

admburnin           500                 
alpha               1.0                 
alphamax            10.0                
alphapriora         1.0                 
alphapriorb         2.0                 
alphapropsd         0.025               
ancestdist          0                   
ancestpint          0.9                 
computeprob         1                   
echodata            0                   
fpriormean          0.01                
fpriorsd            0.05                
freqscorr           1                   
gensback            2                   
inferalpha          1                   
inferlambda         0                   
intermedsave        0                   
lambda_             1.0                 
linkage             0                   
locispop            0                   
locprior            0                   
locpriorinit        1.0                 
log10rmax           1.0                 
log10rmin           -4.0                
log10rpropsd    

#### By default the 'header' of the str file is empty

In [6]:
s.header

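| | labels | popdata | popflag | locdata | phenotype |
|---|---|---|---|---|---|
| 0 | 29154_superba | | | | |
| 1 | 30556_thamno | | | | |
| 2 | 30686_cyathophylla | | | | |
| 3 | 32082_przewalskii | | | | |
| 4 | 33413_thamno | | | | |
| 5 | 33588_przewalskii | | | | |
| 6 | 35236_rex | | | | |
| 7 | 35855_rex | | | | |
| 8 | 38362_rex | | | | |
| 9 | 39618_rex | | | | |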


#### You can fill it in by setting the popdata and popflag attributes
`popdata` is the *a priori* assignment of each individual to a population. `popflag` indicates whether that assignment is used in the analysis (1) or the individual's population is left to be inferred (0).

In [7]:
s.popdata = [0, 0, 0, 1, 2, 1, 2, 2, 2, 2, 2, 0, 0]
s.popflag = [1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
s.header

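| | labels | popdata | popflag | locdata | phenotype |
|---|---|---|---|---|---|
| 0 | 29154_superba | 0 | 1 | | |
| 1 | 30556_thamno | 0 | 1 | | |
| 2 | 30686_cyathophylla | 0 | 0 | | |
| 3 | 32082_przewalskii | 1 | 0 | | |
| 4 | 33413_thamno | 2 | 0 | | |
| 5 | 33588_przewalskii | 1 | 0 | | |
| 6 | 35236_rex | 2 | 0 | | |
| 7 | 35855_rex | 2 | 0 | | |
| 8 | 38362_rex | 2 | 1 | | |
| 9 | 39618_rex | 2 | 1 | | |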
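
Instead of typing the two lists by hand, they could be derived from the sample labels. The following is a sketch under stated assumptions: the `known` mapping is hypothetical, only five of the labels are used, and samples without a prior assignment get a placeholder `popdata` of 0 with `popflag` 0.

```python
# a few sample labels, in the same order as they appear in the header
labels = [
    "29154_superba",
    "30556_thamno",
    "30686_cyathophylla",
    "38362_rex",
    "39618_rex",
]

# hypothetical a priori assignments; samples not listed are left to be inferred
known = {
    "29154_superba": 0,
    "30556_thamno": 0,
    "38362_rex": 2,
    "39618_rex": 2,
}

# placeholder population 0 for samples with no prior assignment
popdata = [known.get(name, 0) for name in labels]
# popflag is 1 only where a prior assignment was supplied
popflag = [1 if name in known else 0 for name in labels]

print(popdata)
print(popflag)
```

Both lists must have one entry per individual, in the same order as the labels, before being assigned to `s.popdata` and `s.popflag`.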


### Write the new structure files
This writes the `.str` file (subsampling SNPs if you included a mapfile) with the header information included, along with mainparams and extraparams files containing the parameter settings entered above.

In [8]:
s.write_structure_files(kpop=3)

('/home/deren/Documents/ipyrad/tests/analysis-structure/tmp-test-3-1.mainparams.txt',
 '/home/deren/Documents/ipyrad/tests/analysis-structure/tmp-test-3-1.extraparams.txt',
 '/home/deren/Documents/ipyrad/tests/analysis-structure/tmp-test-3-1.strfile.txt')

### Run structure
The cell below runs structure from the command line on the files written above, using the following arguments.

In [None]:
%%bash
structure -i analysis-structure/tmp-test-3-1.strfile.txt \
          -m analysis-structure/tmp-test-3-1.mainparams.txt \
          -e analysis-structure/tmp-test-3-1.extraparams.txt \
          -K 3 \
          -N 13 \
          -D 12345 \
          -L 17337 \
          -o analysis-structure/tmp-test-3-1.results
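
The same call can be assembled in Python, which is convenient when looping over replicates or values of K. This is a sketch only: the `structure_command` helper is hypothetical, and it simply mirrors the flags used in the bash cell above.

```python
import shlex

def structure_command(prefix, kpop, ninds, nloci, seed):
    """Build the structure argument list for one set of written files."""
    return [
        "structure",
        "-i", f"{prefix}.strfile.txt",
        "-m", f"{prefix}.mainparams.txt",
        "-e", f"{prefix}.extraparams.txt",
        "-K", str(kpop),
        "-N", str(ninds),
        "-D", str(seed),
        "-L", str(nloci),
        "-o", f"{prefix}.results",
    ]

cmd = structure_command(
    "analysis-structure/tmp-test-3-1",
    kpop=3, ninds=13, nloci=17337, seed=12345,
)
print(shlex.join(cmd))

# To actually launch the run (requires structure on your PATH):
# import subprocess
# subprocess.run(cmd, check=True)
```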