# Downstream analysis
We create a vcf file containing small and structral variants annotated with their AF. Here we are going to go through how to explore those variants. The following table describes the metadata tagged for each variant.


| Metadata      | Description |
| -- |:-----------:|
| AC | Allele count in genotypes|
| AC_Het | Allele counts in homozygous genotypes|
| AC_Hom | Allele counts in heterozygous genotypes|
| AC_Hemi | Allele counts in hemizygous genotypes|
| AF | Allele frequency |
| MAF | Minor Allele frequency |
| NS | Number of samples with data   |
| AN | Total number of alleles in called genotypes |
| HWE | Hardy-Weinberg equilibrium |
| ExcHet | Test excess heterozygosity; 1=good, 0=bad |


Let's first check a file called "samples.csv" containing ethnici information of the 273 samples. The following command print the first 10 samples

In [1]:
%%bash
cat index/SGDP/samples.csv |head|tr -s ',' $'\t'| tr -s ' ' '_'| scripts/prettytable 3

┌────────────┬────────────────────────────────────────┬──────────────────┐
│[0msample      [0m│[0mpopulation                              [0m│[0msuper_population  [0m│
├────────────┼────────────────────────────────────────┼──────────────────┤
│abh100      │Abkhasian_in_Abkhazia_or_Russia_(SGDP)  │EUR               │
│abh107      │Abkhasian_in_Abkhazia_or_Russia_(SGDP)  │EUR               │
│ALB212      │Albanian_in_Albania_(SGDP)              │EUR               │
│Ale14       │Tlingit_in_Russia_(SGDP)                │EUR               │
│Ale20       │Aleut_in_Russia_(SGDP)                  │EUR               │
│Ale22       │Aleut_in_Russia_(SGDP)                  │EUR               │
│Ale32       │Tlingit_in_Russia_(SGDP)                │EUR               │
│altai363p   │Altaian_in_Russia_(SGDP)                │EUR               │
│armenia293  │Armenian_in_Armenia_(SGDP)              │EUR               │
└────────────┴────────────────────────────────────────┴──────────────────┘


The commands below count the number of samples per breed

In [2]:
%%bash
cut -f3 -d, index/SGDP/samples.csv|tail -n+2 |tr -s ' ' '_' |sort |uniq -c| awk '{print $2"\t"$1}' |sort -k2,2nr > tmp
cat <(echo -e "super_population\tcount") tmp | scripts/prettytable 2

┌──────────────────┬───────┐
│[0msuper_population  [0m│[0mcount  [0m│
├──────────────────┼───────┤
│ASIA              │114    │
│EUR               │75     │
│AFR               │44     │
│AUS               │20     │
│N_AMR             │12     │
│S_AMR             │11     │
└──────────────────┴───────┘


# Hail
Although bcftools is very helpful and fast but it is hard to do complex tasks with it. Here we are suggesting using Hail to be able explore the population genotyping results and get meaningful results. Hail is a python library for genomic data expoloration. It creates a matrix table for vcf files which is very similar to R dataframes.

So let's do some coding by intializing Hail engine

In [3]:
import hail as hl
hl.init()
from hail.plot import show
from pprint import pprint
hl.plot.output_notebook()



2023-03-27 11:29:32.190 WARN  NativeCodeLoader:60 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable


Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Running on Apache Spark version 3.1.3
SparkUI available at http://c3-52.farm.cse.ucdavis.edu:4040
Welcome to
     __  __     <>__
    / /_/ /__  __/ /
   / __  / _ `/ / /
  /_/ /_/\_,_/_/_/   version 0.2.107-2387bb00ceee
LOGGING: writing to /home/mshokrof/TheGreatGenotyper/hail-20230327-1129-0.2.107-2387bb00ceee.log


Now we are going to load the vcf and samples information to create Hail Matrix table

In [20]:
vcf="test_output/merged.vcf.bgz"
samplesInfo="index/SGDP/samples.updated.csv"

mt = hl.import_vcf(vcf,reference_genome="GRCh38")
table = (hl.import_table(samplesInfo, impute=True,delimiter=",")
         .key_by('sample'))
mt = mt.annotate_cols(population = table[mt.s])

2023-03-27 11:45:20.685 Hail: INFO: wrote table with 277 rows in 1 partition to /tmp/persist_table4sDFsWhUO1
2023-03-27 11:45:20.867 Hail: INFO: Reading table to impute column types
2023-03-27 11:45:21.086 Hail: INFO: Finished type imputation
  Loading field 'sample' as type str (imputed)
  Loading field 'population' as type str (imputed)
  Loading field 'super_population' as type str (imputed)


Lets see how the hail matrix table is organized

In [18]:
mt.rows().show(5)

Unnamed: 0_level_0,Unnamed: 1_level_0,Unnamed: 2_level_0,Unnamed: 3_level_0,Unnamed: 4_level_0,info,info,info,info,info,info,info,info,info,info,info,info,info,info
locus,alleles,rsid,qual,filters,AF,UK,AK,MA,ID,AN,AC,NS,AC_Hom,AC_Het,AC_Hemi,MAF,HWE,ExcHet
locus<GRCh38>,array<str>,str,float64,set<str>,array<float64>,int32,array<int32>,int32,array<str>,int32,array<int32>,int32,array<int32>,array<int32>,array<int32>,float64,array<float64>,array<float64>
chr21:13224360,"[""G"",""A""]","""chr21-13224360-SNV-0-1""",-10.0,{},[4.42e-01],31,"[31,0]",0,"[""chr21-13224360-SNV-0-1""]",552,[244],276,[32],[212],[0],0.442,[1.45e-21],[1.31e-21]
chr21:13224428,"[""C"",""T""]","""chr21-13224428-SNV-0-1""",-10.0,{},[3.99e-01],8,"[13,4]",0,"[""chr21-13224428-SNV-0-1""]",552,[220],276,[108],[112],[0],0.399,[1.19e-02],[9.97e-01]
chr21:13224943,"[""C"",""CA""]","""chr21-13224943-INS-0-1""",-10.0,{},[9.91e-01],44,"[24,11]",0,"[""chr21-13224943-INS-0-1""]",552,[547],276,[542],[5],[0],0.00906,[1.00e+00],[9.82e-01]
chr21:13229905,"[""A"",""G""]","""chr21-13229905-SNV-0-1""",-10.0,{},[9.98e-01],18,"[8,29]",0,"[""chr21-13229905-SNV-0-1""]",552,[551],276,[550],[1],[0],0.00181,[1.00e+00],[1.00e+00]
chr21:13231128,"[""A"",""G""]","""chr21-13231128-SNV-0-1""",-10.0,{},[7.37e-01],60,"[31,0]",0,"[""chr21-13231128-SNV-0-1""]",552,[407],276,[270],[137],[0],0.263,[7.01e-07],[3.92e-07]


In [6]:
mt.GT.show(5)

Unnamed: 0_level_0,Unnamed: 1_level_0,'abh100','abh107','ALB212','Ale14'
locus,alleles,GT,GT,GT,GT
locus<GRCh38>,array<str>,call,call,call,call
chr21:13224360,"[""G"",""A""]",0/0,0/0,0/1,0/1
chr21:13224428,"[""C"",""T""]",0/0,0/0,0/1,0/0
chr21:13224943,"[""C"",""CA""]",1/1,1/1,1/1,1/1
chr21:13229905,"[""A"",""G""]",1/1,1/1,1/1,1/1
chr21:13231128,"[""A"",""G""]",0/1,1/1,1/1,0/1


In [21]:
samplesPercohort=mt.aggregate_cols(hl.agg.counter(mt.population.super_population))
print(samplesPercohort)

{'Africa': 44, 'America': 22, 'Central Asia and Siberia': 25, 'East Asia': 47, 'Oceania': 25, 'South Asia': 35, 'West Eurasia': 75, None: 3}


### Stratify population allele frequency
Here we are trying to answer questions like which variants are frequent in the african populations. We are going to calculate allele frequencies per super population.


In [33]:
mt=mt.annotate_rows(AF_AFR=hl.agg.filter(mt.population.super_population =="Africa",
                                     hl.agg.sum(mt.GT.n_alt_alleles())
                                     / samplesPercohort["Africa"]*2 ))
mt=mt.annotate_rows(AF_EUR=hl.agg.filter(mt.population.super_population =="West Eurasia",
                                     hl.agg.sum(mt.GT.n_alt_alleles())
                                     / samplesPercohort["West Eurasia"]*2 ))
mt.rows().show(5)

Unnamed: 0_level_0,Unnamed: 1_level_0,Unnamed: 2_level_0,Unnamed: 3_level_0,Unnamed: 4_level_0,info,info,info,info,info,info,info,info,info,info,info,info,info,info,Unnamed: 19_level_0,Unnamed: 20_level_0
locus,alleles,rsid,qual,filters,AF,UK,AK,MA,ID,AN,AC,NS,AC_Hom,AC_Het,AC_Hemi,MAF,HWE,ExcHet,AF_AFR,AF_EUR
locus<GRCh38>,array<str>,str,float64,set<str>,array<float64>,int32,array<int32>,int32,array<str>,int32,array<int32>,int32,array<int32>,array<int32>,array<int32>,float64,array<float64>,array<float64>,float64,float64
chr21:13224360,"[""G"",""A""]","""chr21-13224360-SNV-0-1""",-10.0,{},[4.42e-01],31,"[31,0]",0,"[""chr21-13224360-SNV-0-1""]",552,[244],276,[32],[212],[0],0.442,[1.45e-21],[1.31e-21],2.09,1.41
chr21:13224428,"[""C"",""T""]","""chr21-13224428-SNV-0-1""",-10.0,{},[3.99e-01],8,"[13,4]",0,"[""chr21-13224428-SNV-0-1""]",552,[220],276,[108],[112],[0],0.399,[1.19e-02],[9.97e-01],2.68,0.96
chr21:13224943,"[""C"",""CA""]","""chr21-13224943-INS-0-1""",-10.0,{},[9.91e-01],44,"[24,11]",0,"[""chr21-13224943-INS-0-1""]",552,[547],276,[542],[5],[0],0.00906,[1.00e+00],[9.82e-01],4.0,3.89
chr21:13229905,"[""A"",""G""]","""chr21-13229905-SNV-0-1""",-10.0,{},[9.98e-01],18,"[8,29]",0,"[""chr21-13229905-SNV-0-1""]",552,[551],276,[550],[1],[0],0.00181,[1.00e+00],[1.00e+00],4.0,4.0
chr21:13231128,"[""A"",""G""]","""chr21-13231128-SNV-0-1""",-10.0,{},[7.37e-01],60,"[31,0]",0,"[""chr21-13231128-SNV-0-1""]",552,[407],276,[270],[137],[0],0.263,[7.01e-07],[3.92e-07],3.27,2.77


Now we calculated startified AF per cohort lets find the frequent variants in african samples

In [34]:
AFR_Frequent=mt.filter_rows(mt.AF_AFR > 0.7)
AFR_Frequent.rows().show()

Unnamed: 0_level_0,Unnamed: 1_level_0,Unnamed: 2_level_0,Unnamed: 3_level_0,Unnamed: 4_level_0,info,info,info,info,info,info,info,info,info,info,info,info,info,info,Unnamed: 19_level_0,Unnamed: 20_level_0
locus,alleles,rsid,qual,filters,AF,UK,AK,MA,ID,AN,AC,NS,AC_Hom,AC_Het,AC_Hemi,MAF,HWE,ExcHet,AF_AFR,AF_EUR
locus<GRCh38>,array<str>,str,float64,set<str>,array<float64>,int32,array<int32>,int32,array<str>,int32,array<int32>,int32,array<int32>,array<int32>,array<int32>,float64,array<float64>,array<float64>,float64,float64
chr21:13224360,"[""G"",""A""]","""chr21-13224360-SNV-0-1""",-10.0,{},[4.42e-01],31,"[31,0]",0,"[""chr21-13224360-SNV-0-1""]",552,[244],276,[32],[212],[0],0.442,[1.45e-21],[1.31e-21],2.09,1.41
chr21:13224428,"[""C"",""T""]","""chr21-13224428-SNV-0-1""",-10.0,{},[3.99e-01],8,"[13,4]",0,"[""chr21-13224428-SNV-0-1""]",552,[220],276,[108],[112],[0],0.399,[1.19e-02],[9.97e-01],2.68,0.96
chr21:13224943,"[""C"",""CA""]","""chr21-13224943-INS-0-1""",-10.0,{},[9.91e-01],44,"[24,11]",0,"[""chr21-13224943-INS-0-1""]",552,[547],276,[542],[5],[0],0.00906,[1.00e+00],[9.82e-01],4.0,3.89
chr21:13229905,"[""A"",""G""]","""chr21-13229905-SNV-0-1""",-10.0,{},[9.98e-01],18,"[8,29]",0,"[""chr21-13229905-SNV-0-1""]",552,[551],276,[550],[1],[0],0.00181,[1.00e+00],[1.00e+00],4.0,4.0
chr21:13231128,"[""A"",""G""]","""chr21-13231128-SNV-0-1""",-10.0,{},[7.37e-01],60,"[31,0]",0,"[""chr21-13231128-SNV-0-1""]",552,[407],276,[270],[137],[0],0.263,[7.01e-07],[3.92e-07],3.27,2.77
chr21:13231256,"[""C"",""T""]","""chr21-13231256-SNV-0-1""",-10.0,{},[5.80e-01],55,"[24,31]",0,"[""chr21-13231256-SNV-0-1""]",552,[320],276,[206],[114],[0],0.42,[1.33e-02],[9.96e-01],2.14,2.29
chr21:13231853,"[""C"",""T""]","""chr21-13231853-SNV-0-1""",-10.0,{},[5.89e-01],10,"[0,7]",0,"[""chr21-13231853-SNV-0-1""]",552,[325],276,[228],[97],[0],0.411,[6.47e-06],[1.00e+00],2.27,2.29
chr21:13231869,"[""C"",""T""]","""chr21-13231869-SNV-0-1""",-10.0,{},[5.89e-01],10,"[0,7]",0,"[""chr21-13231869-SNV-0-1""]",552,[325],276,[228],[97],[0],0.411,[6.47e-06],[1.00e+00],2.27,2.29
chr21:13231870,"[""A"",""G""]","""chr21-13231870-SNV-0-1""",-10.0,{},[5.89e-01],10,"[0,7]",0,"[""chr21-13231870-SNV-0-1""]",552,[325],276,[228],[97],[0],0.411,[6.47e-06],[1.00e+00],2.27,2.29
chr21:13232002,"[""A"",""G""]","""chr21-13232002-SNV-0-1""",-10.0,{},[7.19e-01],62,"[31,31]",0,"[""chr21-13232002-SNV-0-1""]",552,[397],276,[282],[115],[0],0.281,[6.57e-01],[3.68e-01],3.32,2.69


We can easily get the ids of the common variants

In [35]:
AFR_Frequent.rows().rsid.collect()[:10]

[Stage 225:>                                                        (0 + 1) / 1]

['chr21-13224360-SNV-0-1',
 'chr21-13224428-SNV-0-1',
 'chr21-13224943-INS-0-1',
 'chr21-13229905-SNV-0-1',
 'chr21-13231128-SNV-0-1',
 'chr21-13231256-SNV-0-1',
 'chr21-13231853-SNV-0-1',
 'chr21-13231869-SNV-0-1',
 'chr21-13231870-SNV-0-1',
 'chr21-13232002-SNV-0-1']

## Explore population genotypes of a specfic variant

Let's explore the population data of a specific variant


In [36]:
SV=mt.filter_rows(mt.rsid=="chr21-13880084-INS-1-38892")
SV.rows().show()

[Stage 228:>                                                        (0 + 1) / 1]

Unnamed: 0_level_0,Unnamed: 1_level_0,Unnamed: 2_level_0,Unnamed: 3_level_0,Unnamed: 4_level_0,info,info,info,info,info,info,info,info,info,info,info,info,info,info,Unnamed: 19_level_0,Unnamed: 20_level_0
locus,alleles,rsid,qual,filters,AF,UK,AK,MA,ID,AN,AC,NS,AC_Hom,AC_Het,AC_Hemi,MAF,HWE,ExcHet,AF_AFR,AF_EUR
locus<GRCh38>,array<str>,str,float64,set<str>,array<float64>,int32,array<int32>,int32,array<str>,int32,array<int32>,int32,array<int32>,array<int32>,array<int32>,float64,array<float64>,array<float64>,float64,float64
chr21:13880084,"[""A"",""ACAGCAAAACAAAAGGCACCTTGGCTCCTGTTTGCAGTAAGTTCCTCATTTTCATCTCACAGCCTCTCAATCTGATCCTTACTGTCTATTTTCCTGTGAGCCTTCTGATCACAAGTATTCAACAATTCTTTACAAAGATCCAAACTTACCCTCATCTCCCTGTCTTTGAAGTCCTCCAAACTCTCCAGAACTCCATCTGCTACCCCCTTCTGAACCTGCTTCTACATTATCAGCTATCTTTGTCATACCCTGGCAATGTGGTAAAGGAAGACAAGCCCATTTTCCGGGGAAAAATTCAAGGAGGTTTCAGATACTTGAATAAAAAGAAGCTGAGTGCTGATTGCCAAGACATTAGAGAGAAGGCCTTGAAGACATTTAATAGATCCACTTTGCAGTAATAATTTTCTCCATGATCATAAAGAAAAGAGGTTTAATAGGTTAATGATTCTGCAGGCTGTAAGGAAGCATAGTGGCTTCTGCATCTGACAGGACTCAGGAAGCCTCCCAATCATACCAGAACGTCAAGGGGCAAGGAGATGTCTCATTTGGCAGAAGTAGGAGCAAGACAGAGAAAGGAAACAGGTGTCATGCCCCATTACACAAGCAGATCTCATGAGAACTCACTATCACAAGGTCAGCATCTAGAAAATGGTGCTTAAACATTGGTGAAGGATCCGCCCCACACACCCAACTCCCACTGTTTCCAGGCAGAAGCCTCCTTCAGATGCAGAGCCTCTTGGAAAACCTCTATTATGGAAGTGCAGAAGGAAAATATGGGCTTGAAGTCCCCACACAGGTGGCCACCAACCTCCAGACCCCAGATTCATAGGCCCACCAACAACTCACGCCCTTAGTGTAGAAAAGCTACAGGCCCTCAATACCAGCCCACCCCACGAGAACAGCTGAGGGGCTAAAACCTGCAAAGCCACAGGTGCACTGCCCTAGTAGAGGTTTTCCACGAGGCTTTGCCTCTGCAGCAGGCTACTCCCCCTTCCTACTACCCCCCACCCTCCCACCACCCTACTGCCAACCCACTCCTCCCAATCCTACCCATCCCTTTTACCTTCCACCGCCACCAACCCCCTGTCCATAATTAAGTCACCTACTTCAACATTAGGGATTACAATTCCACATGAGTTTCATAGGGACACACAGGAAAACCATATAATTCTGACCCTGATATTCCAGAATCTCATGTCCTTATCACAGAGCAAAATACAGTCATGACCTCTCAAAAGTTTGCAAAAGTCTTAACTCATTCCAAATGTAAAAAATTCAGTCTCATCTGAGACAAGGCCACAGTCCCTTCTGTCTATGAGTCCCTGAATTTAAAACGGAGTTCTTTTCTTTCAAGGTACAATGATGGTAGAGACATTGTGTAAGCTTTCTCAGTCCAAAGGGAAGAAATTTCCCAGAAAAATAACACAAATGGGCCTACAGGCCCAATGCAAGTCCAAAACCCAGCGGGACAGTATTCACTCAGTCTCTCAGCTCCAAAATCATCAAGAGAATTCACTATCATGTGAACAGCATTAAGGAGAGCGTGTTTATCCATTTGTGAAACATCTGACCCCACCCTCATCTTTCACTCCCACCCACAAAAAAATCTCTTCTATTCTCCCCACTCCCTTACCTCCAACCCCCATGCTTCTCCATGATTAAATCGCCTCCCACCAGGCCCCATTTTTAACATTCCCCATTATAATTCCACATGATTTTGGTAGGGATGCAGAGCCAAATCATATTATTCTGACCCTGGCCCCCGTATCTCATGTTCTTCTCACACTGCAAAATACAATGATGCCTTCTCTACAGTTTCCCAATCTCTTAACTCATTCCAGCATTTACTGAAATGTCCAAAGCCCAAAGTCTCTTCTGAGACAAGGCTGCAGTCCCTTCTGCCCCTGAGCCTCTGAAATACAAAGCAAGTTAACTACTTCCAAGGTACAATGATTTTACAGGCATTGGGTAAGTATTCCCAGTCAAAAGGAAAAAATTTGTCAGAAAGAAGCACAAAACACAGTTGAGACTTACAAACACCATGCAAGTCAAAAACCCAGCAGGATAGTCATTGAATCTTACAGCTCCAAATCATTTTTTTTAAATCCAGATCCCACATCCAGAGCACTAGGGTATGAGGCCTGGGCTCCAAAGGCCTTGGGCAGCTCTGCACCTGTGGCTTTGCAAGGTCTAACCTCCACAGTGGCTCTCATGGGCTGGGCTAGTGTTGAGTACCTGCAGCTTTTCCACAATGAGGGTGCAAGCTGGTAGAGAGTCTATGAATCTGGCATTTGCAGAATGGTGTTACCCTGTATGGGAGTTCCAACCCTATATGTTCCTTCTGTACTGCCCTACTAAAGGAGGCTCTGCCTCTTGGAAAATTTTGACCTGGACACCCAGGTTTTTCCATATGTACTCTGGAGTCTAGAAAAAGGATCCCCAGCCTCCAGTTTTGTGCTCTGCGCACCTGCTGGCTTAACACTATGTGGAAGCCACCAAAGCCTGCAGCTTGTGCCCCTTGAAGCAGTGACCCAAGCTGTACATGTGCATCTTTCAGCCATGGCTGGAGCTGGAGCTTCAGGAATGCAGCCAGCAGTGTCCTGAGGATGGACACAAGAGTGGGGCCATGGTGCTGGAGAAGGAAACCATTCTTTTCTCCCAGGCCTCAGGGCCTGTGATAGCAAGGGCTGCTACAAAAGTCTCTGAAATGCCTTCATGGCCTTTTTAACACTGTATTGGCTATTAGCACTCAACTCCATTTTATGCAAATTTCTGAAGACTTCTTGAACTTTCTCACTGACAATCAGCTTTTCTTTTTGACCACTTGGCCAGGCTGCAAATTTTCCAAACTTTTAAATATGTTTCACCATGAGGTCATTTCTTTGGTCACATATAGGACCACAGGCTGTTCGACACAGACAGGACACCTCTTAAGCTTTGCTGCATAAAATTTCCTTCCAACAGATACAATCTAAATTATCACCCTGATGTTCAAAATTTCACAGGTCTCCAGGTTAAGGACATCGTGCCGCAACGTTCTTTGCTAAGGAAAAAACAAAAGTGACCTTGCCTCCTGTTCCCAGCAAGCTCCTCATTTTCATGTGAGACCTTCTAAGCCTGGTGATCACTGTCCATCCTTCTGTCACCTTTTTAATTACAACTATTTGACAAGTCTCTACACTGATCCAAACTTTTCCTCATCTTCCTGTCTTCTTCCAAGACCTCTAAACTCTCCAACCTCTGGCCATTACACACTTCTTAACCTGCTTCTACATTTTCAGCTACATGTGTCACAGCCTGGCAATGTGGTAAAAGAAGAAAAGTCCATTTCAGGAGAAAAATTAATGCAGGCTTCAGACATTTCCCTGAAAAGAAGCTGAGTGCTGATCACCAATACAACAAGGAAAAGGTCTTGAAGGCATTTCATAGCTCCACTTCACAGCACTAATTTTCTGTATAATCAGAAAGAAAAGAGGTTGAACTGGCTCATGGTTCTGCAAGCTTTAAATAAATCATAGAGGTTTCTGCTTCTGGGAGGACTCAGGAAGCCTCCCAATCATACCAGAAGACCAAGCAGCAATGGGATGTTTTATATGGCAGAAGTAGAAACAAGACAGAGAGAGGAAAAAGCTGCCACATGTTGTATAAACCTGTTATACAAGCAGATTTCCTGAGAACTCACTATCACAAGGTCAGCATCAAGAAGATGTTGCTTAACCACTGGTGAAAGATCTGCCCCCTACCACCCCTCCCCCCACTGTTTCCAGGCAGAAGCCTGAGGCAGAGGCAGAGCCACTAGGAAAACCTCTACTAGGGCAGTGCAGAAAAAATATATGGGCTTGTAGGCCCCACACAGGAGGTTACCATCCTCCAGACCCAAGATTCATAGACCCACCAGCAGCTTGCACTCTCAGTATGGAAAAGCCACAGGCACTCAACACCAGCCCAGCCCATGGGAGCAAACACGGGGGCTAAAGCCTGCAAAGCCACAGGTGCACTGCCCTGGTATAGAGGTTTTCCATGAGGCTCTGCCTCTGCAGCAGGCTACTCCCCCTTTCTACTGCCCACCACCCTCTCACCACCCTACTGCCAGCCTATTTCTCCCCACCCTACACACTTGTTTCTCTTCCACCCCTACCACTCTGCCGTTTGTGGTTAAATCACATCCCACTAGGCCCCACCTACAACAGTCAGGAATACAATTCCCCATGAGTTTTTGTAGGGAAACACAGCCAAACCATATTATTCTGATCCTGACACCCCCACATCTCATATCCTTCTCACACAGAAAAATACAAACAGGCCTTTTCAAGAGTTTCCAAAAGTCTTAACTCACTCCAGCAGTAACTCAAATGTAGTAAGTTCAAATCTCAACCAACAGAAGGCTGCAATCTCTTCTGCCTATGAGTCCCTGAATGTAAAAGACAATTCTTTTCTTTCAAGTTATAATGATGGCACAGGCACTGGGTAAGCTTTCTCAAACCAAAGGGAAGATTTTCCCAGAAAAAATAACACAATTGGTACACAGGCCCAATCCGAGTCCAAAACCCAGCAGGACAGCATTCATTTATCATGAGAACTCACTATCACACAGACTGCATTAAGGAGATAGTATTTAACCATTTGTGAAGGATCTGCCACCCATCCCCATGTTTTGCCCTCACCCATACCATGAACCCCCATTCTCCCACATCCCACATCCCCCTTCCAACCCCCATTCCCTACCATGATTAAATCACCTTCTACCAAGCCCCACATTTAACATTCCCCATTACAATTCCACATAAGTTTTGGTAGGGACACAGAGCCAAATCATATTATTCTTCCTTTGGCCCCCCCAATCTCATGTCCTTCTCATACTGCAAAATACAATGATGCATTCTCTACAGTCCCCCAATGTCTGAACTCATTCCAGCATTTACTCAAATGTCCATTTGTGAAGGATCCGCCCCCACCCCTGCCTTTCACCCCCAACCCCACCACAATCCCCCCAACCCTCACCACCCCCCAATCCCCCACCTCCCCACCACCACAATCCCCCCAACCCTCCCCGTCCCACAATCCCCTCAACCCTCCCCACCCTCCCAACCATCCAACCTCCACTCTCCACCAGGATTAAATCACCTTCCACCAGCCCCCACCTTTAACATTTCCCATTAAAATTCCACATGAGTTTTGGTAGAGACACAGAGCCAAAACATATTATTTTGTCCCTGGTCCCCCAAAGTTCATGTCTTTCTCACATTGCAAAATGCAATGATGCCTTCCCTAGAGTCCCCCAAGTCTTAACTCATCCCAACATTTACTCAAATATCCAAAGCCCAAAGTCTCTTCTGAGACAAGGCTGCCGGATCTTTTGCCCCGAGCCTCTGAAATACAAAGCAAGTTAACCACTTCCAAGGTGCAATGATTGTACAGGCTTTGGGTAAGCATTCCCAGCCAAAAAGAAGAAATTTGCCAGAAAGAAGCACAAAACACAGATGGGACTTAAAACTCCCTGCAAGTCAAAAACCCAGCAGGCCAGTCATTCCATCATACAGCTCCAAATCATCTTTTTGGAATCTATGTCCACATCCAGAGCACAGGGTGGTGTGACAGCTGGGATCCCAAGGCCTTGGGCAGCTCTGCACCTGTGGCATTGCAGAATCTTTCCCCTACAGCTGCCCTCATTGGCTAGGCTGGTGTTGAGTGCCTGTAGCTTTTCAACACTAAGGGTGCAAGCAGTTGGTGGGTCTATGAAACTGGGGCCTGGAAAATTGTGCCCCCCTGTATGGGGAATCCAACCCTATACTCTCCTTGTGTACTGCCCCAGTAGAGGTTTACCATGATGCTCTGCCTCTTGGAAAAGCTTCTGGTTGGACAATCAGGCTTTCTGATACATGCTCTGGAGTCTAGATGAAGGCTCAGAAGCTTCTAGACCTTGCTTTGTGCACCTGCTGGCTTAACACCATGTGGAGGCCACCAAGTCTATGAGCTTGCATCCTCTGAAGCAGTGATGCAAGCTGTACCTGTGCATCTTTCATCCATGGCTGGAGATGGAGATGGAGCTGCAGGGATGCAGGCAGCAGTGTCCTGAGGCTGCACACAGCAGTGGAGCCATGGGGCTGGCCCAGGAAACCATCCTTTTCTCCTAGGCCCCAGGGCCAGTGACAGCAAGGGCTGCTACAAAAGTCTCTGAAATGCCTTCAAGGCCTTTTTCCCATTATCTTGAATTATTAGCACTCGGCTCCTTTTTATGCAAATATCTGAAGCCTTCTTGACTTTCCCCCTGAAAATCAGCTTTTCTTTCTGACCACTTGTGGAGATTACAAATTTTCCATATGTTTAAGCTCTGCTTCTCATTTAAATATAAGTTCCAACTTATAGTCATTTCTTTGACCACACATAGGTGCACAGGGTGTTCAATGTAGGTAGGAGAACTCTTGAGCATTGTTGCTTAGAAGTTCATTCCACCAGATACACTCTAAATCATCACCCTCAAGTTCAGTTTCACAGATCTCCTGGGAAGGATCAATGTGTAGTCAATTACTTTGCTAAAGCAAAACAAAAAAACCTTGGCTCCTTTTCCCAGTAAGTCCTTCATTTTCACCTGACATCTTGTAAGCCTGGCCTTCACTGTCCATCCTTATGTTAGCCTTTTAATCACAACTATTTAACAAGTCTCTGCAATGGTCCAAACTTTCCCTCATCTTCCTGTCTTCTTCCAAGCTCTCCGAACTCTCTAACCTCTGGCCATTACCTAATTCAGAACCTGCTTCTACACTGTCAGCTATCTTTTTCGCAGCCTGGCAATGTGGTAAAAGAAGAAAAGACCATTATCAGGGGGAAACATCAAGAAGGTCTCCAATATCTGCATTGAAAAAAGCTCAGTGCTAATAGCCAAGAGATTGGGGGAAAGGCCTAGAAGTCATTTCATAGCTTCACTTCACAGCATTAATTTTCTGTATCTACATAAAGAAAAGAGACTTAGTTGACTCACAGTTCTTCAGGCTTTAAAGAAAGCACAGTGGTTTCTGCTTGTAGGAGGACTCAGGAAGCCTCCCAATCATACCAGAAGGCCAAGCGGCAATGAAATGTTTCATATGGCAGGAGAAGAAGCAAGACAGAGAGAGGAAAGAGGTGCAACATCCTCTTATACAACTAGATCTCATGAGAGCTCACTATCAGGAGATCAGCATCAAGAAGATGGTGCTTCACTGTTGGTGAAGGATTCGCCCACCACCCCATATCCACCACCCACTGTTTCCAAGCACAAGCCTGAGACAGAGGCAGATCCTCTTGGAAAACCTCTACTATGGCTGTTTAGAAGAAAACTATGGGCTTGGAGCCCCCATGGAGGATACCACCATCCTCCAGACCCCAGACCCATAGATCCATCAACAGCTCACACCCCATGTACGGAAAAGCTACAGGCACTCAACACCAGCCCAGCCCATGAGAGCAGCTACAGGTGCTAAACCCTGCAAAGCCCCAGGTGCACTGCCTTAGTAGAGTTTTTCCATGAGCCTCTGCCTCTGCGGCAGGCTACTCCCCTCCTGCTACACACCACCCTACAGCCAGCCTACTCCTCCCCACTTTACCCACCTGTTTTTACGTCCAATCCAACCCCTCTCCCATCCATGAATAAATCACCTCTCACAAGGCCCCACATGCAACATTCGGGATTACAATTACATGTGAGTTTAGGTAGGGACACACAGCTAAACCATATTATTCTGACCCTGATCCCCCGAATATCATATCCTTCTCACAGAGTAAAATACAATCAGGCCTTTTCAAAAGTTGTCGAAAGTCTTAAGTCATTGCAGCATTAAGTCAAATGTAAAAAGTTCAACGTCTCACCTGAGAAAAGGCTACAGTCCCTTTTGCCTATAAGTTCCTGAATTTAAAAGGGATTTATTTTCTTTCAAGGTACAAAGATGGTACAGGCATTGGGTAAGTTTTCTCAATCCAAAGGGGAGAGGTTTGCCAGGAAAATAACACAAATTGGATCACAGGGCCAATGCAAGTCCAAAACCCAGGAGACCACTATCCATTCAATCTCACAGCTCCAAAACCATCATGAGAACTCACCATCATGAAGAAAGAATTAAGGAGATGGTGTTTAACCGTTTGTGAAGGATCCTCCCCCACCCCCACTTTTCACCCCTCACCCCCACCATAATCCACCCATTCTCCCCAATCCCCACCTTCCAATACCCAGTGCCCTCCACGATTAAATCACCTTCTACCTGGCCCCACTTTTAACATTTCTGATTACAATTCCATATGAGTTTCCATAGGGACACACAGCCAAATCTTATTATTCTGTCCCTGCCCCACAAATCTCACGTCCTTCTCACTTTGCAAAATACAATGATGCCTTACTTACCATTCCCCAAGCTACTGTGCTTTTTTTTTACAGCCTGTAGAACCATGAGTCAATTAAACCTTTTTGTTATGATCATACAGAAAATTAGTATTGTGAAGTGAAGCTATGAAATGCCTTCAATGACTATTCCCCATCATCTTGGCTAAGACCCCCAAGGTCTTAACTCATTCTAGCATTTAGTCAAATGTCTGAAGCCCAAAGTCTCATCTGAGACAAAGATGCAGTCCCTTCTCCTCCTGAGCCTCTGAAATACAAAGCAAGTTAACTGCTTCCAAGGTATGATTGTCCAGACATTGAGTAAGAATTCCCAACCAAAAGGAAGATTTTTGCCAGAGAGAAGAACAAAACACAAACGAGACTTACAGGTCCCATGAAAATCCAAAACCCAGCAGGCCAGTTATTCAAACCTACAGCTCCAAAGTCATCCTTTTTCAATCCTTGTCCCATCTCCAGGGCACAAGGGTATGAGGGCTGGGTTCCCAAGGCCTTGGGCAGCTCTCTACCTGTGGCTTTGCAGTGTTCAGTCCCCACAGCTGCCCTCATGGGCTGTGCTGGTGTTGACTGACTGTAGTTTTTACCCACAGAGAGTACGAAGTTCTTGGTGGGTCTATGAATCTGGGGTCTGCATGATGCTGGCCTCCATTGTGGGGGCTACAAGCCCATATTTTCCTTCTGCACTGCCCTAATAGAGGTTTCCCAGGAGGCTCTGCTTTTTGGCTCCTTCTGTCTGGACTCCCAGGCATTTTCATATGTCTTCCGAAATCTATATGAAGGCTCCGAAGCCTCTGGGCTAGTGCTCTGTGCACTCGCGGGCTTAAAACTATATGGAAGCCATGAAGCCTTATAGCCTGTACCCTCTGAAGCAGTGATGCAATCTGTACCGGTGCATCTTTCAGCCAAGTTTGGAACAGGAGCTGGGGCTGCCAGGATGCAGGCAGCAGTGTCCTGAGGCTGCAAACAGCCGCAGGGTCATGGGGCTGGCCCAGGAAACCATTCTTCTCTCCTAGGTCCCAGGGCCTGTGACAGCAAGGGCTGCTGCAAACATCTCTGAAATGCCTCCAAGACTTTTTCCCCCATTGTCTTGGCTATTAGCATTGGCCTCCATTTTATGCAAGTTTCTGGAGCCTTCATGAATTTTCCCCCTGAAAATCAGCTTTTCTTTTTAACCACTTGGCCAGGCTGCGGATATTCCAAACTTTTGAGCTCTGCTTGTCATTTAAATATAAGTTCCAACTTGAGGTCATTTCCTCGGTCACACATAACCTCGGTCACACATGAAAGCACAGGCTGTTTGATGGAGACATGCCCCCACCCTTGTGCTATGCTGCCTAGAAATTTATTCCAACAGATATGCACTAAATCATCACCCTAGAGTTCAAAATTTCACAGATCTTGAGGGCAAGGTCGCCCTGCAGCCATGTTCTTTGCTACAGCAAAACAAAAGCTAACCTTGGCTCCCGTTCCCAGTAAGATCATCATTTTCATCTGAGACCTTGTAAGCCTGGCCTTCACTGTCCATCCTTCTGCCAGGCTTTTAATCACAACTATTTAACAAGTGCCTACAATGGTCCAAACTTTCCTTCATCTCTCTGTCTTCTTTCAAGATCTCCAAACTCTCCAACCTCTGGCTGTTACCCACTTCCGAACCTGCTTTACATTTTCAGCTATCTTTGTTGCAGCCTGGCAATGCAGAAGAAAAAGAAGTCCATTTTCAGGGGGAAACTTCAAGAAGCCTTCAGATATTTGCATTAAAAAGAAGTCCAGTGCTAATAGCCAAGACGATGGGGAAATGTCATTGAAGATATTTCATAGCTCCACTTCGAAGTACTTTATTTTCTGTATGATCATAACGAAAAGGGGTTTAATTGGCTCATGGTTCTGCAGGCTGTAAAGAAAGCATAGTGACTTCTGCTTCTGGGAGGACTCAGGAAGCCTCCCAATCATACCGGTAGGAAAACAGCAATGAAATGTTTCATACAGCAGGAGTAGGAGCAAGGCTGAGAGAGGAAAGTGGTGCCACACCATCCTGTAACCAGATCTCATGAGAACTCACTATCACTAGGTCAGCATCAAGAAGATGGTGCTTAACCATTGGTGAAGGATGCGCCCCCCAACACAGTTCCACCCCCTACCGTTCCAGACAGAAACCTGCTGCAGAGGCAGAGGCTCTTGGAAATCCTGTACTATGGCAGTGCAGAAGGAAAATAAGGGCTTTGAGTGACTATGCAGGAGGCCTCCAGCCTCTAGACCCCAGATTCATAGACATACCAACAGTTCACAACCTCAGTATGGAAAAGTGATAGGCACTCAACACCAGCCCAGCCCATGAGAGCAGCCATGGGGGCTGAAGCCTCCAAAGCCACAGGCGCACTGCCCTGGTAGAGGTTTTCCATGAGCCGCTGACTCTGCGGCAGGCTACTCCCCCTTCCTACTACCCACCACCCTCCCACCACCCTACAGCCAGCCTACTCTTCCCCACCCTACCCACCCCTTTTTTCTTCCACCCCTACCCCTCCCATGCATGATAAAATAATCTCACAGCAGGCCCCAACTCCAACATTTTGGATTACAATTCCACATGAGTTTTTCCAGGGGCACACAGCCAAATCATATTATGCTGACCTTGACCCCCCAAATCTCATATCCTTCTCACAGAATAAAATACAATCGTGTCTTTTCAAAGTTTCCAAAAGCCTTAACTCATTCCCGCATTAACTCAAATGTAAAAAGTTCAAAGTCTCATCTGAGACAAGGCTACAATCTCTTCTGCCTATGAGTCCCTGAAGTTAAAAGGGTGTTCGTTTCTTTCAAGGTACAATGATGGTACAGGTATTGGGTAAGCTTTCTCAATCCAAAGGGAAGAAATTTCCAAGAAAAATAACACAAATGGGACCACAGGCCCAATGGACATCCAAAATCCCGCAGGTCAGTGTTCATTCAGTCTCACAGCTCCAAAATCATGAAGAGAACTCACTATCAGAAGGACGGCATTAAGGAGATGGTGTTTAACAATTTGTGAAGGATGCACCCCCGCCCCTGCCTTACACCCCCAACCCCACCATAATCCCCTCCAACCCTCCTCACCACCCAATCCCCCCCAACCCTCCCCAACCCCCAGCCATCCAACCTGCACTCTCCGTCATGAATAAATCATCTTCCACCAGCCCCCACCTTTAACATTTCCCATTAACATTCCACAAGAGCTTTGGTAGAGACAGAGAGCCAAATCATATTACTCTGTCCCGGGTTGCCCAAAGTTCATGTCTTTCTCACATTGCAAAATGCAATGATGCCTTCCCTAGAGTCTCTCAAATCTTAACTCATTCCAGCATTTATTCAAGTGCCCAAAGCCCAGAGTCTTATCTGAGACAAGTCTACACTCCCTTCTGCCCATGAGCCACTGAATTATATATAAAGGAAGTTTACTACTTCCAAGGAGCAATCATTACACAGGTGTTGGGTAAGCATTCCCAGCCAAAAGGAAAAAAATTGCCAGAAAGAAGCTCAAAACACAGATGGGACTTACAGACCCCATGCAAGTCAAAAACCCAGCAGGCCAGTCATTGAATCCTACAGCTCCCAAATCATCTTTTGTGAATCTATACCCCACATCTGGAGCACAGGGGTGGATGGCTGTGCTCCCAAGACCTTCAGCAGCTCAGCATCTCTGGCTGTGCAGGGTATATCCCCCACAGGTGCCCTCATGGGCTGGGTTGGTGTTGAGTGCCTTTGACTTTTCCACACTGAGGGTGCCAGCAGTTGGTGCATCTATGAATCTGGGGTCTGGAAAATGGTGCATTCCTGTGTGGGGGCTGCAACCCTATATGTTCCTTCTGTACTGCCCTAGTAAATGTTTCCCATGAGGCTCTGCCTCTTGGAAAAGCTTCTGCCTCAACACCCAGGTTTTTCTGTATATACTCTGGAGTCTAGACAAAGGCTCCCAAGCCTCTAGTTCTGTGCTCTGTGCACCTGCTGGCTTAGTACTATGTGGAAGCCACCAAGGTTTGAAGCTTGCACCCCTGAAGCAGTGATGCAAGCTGTACCTGTGCATCTTTCAGCCGTGACTGGAGCTGGAGCTGGAGCTGGAGCTGCAGGGATGCAGACAGCAGTGTCCTGAGGACGGACACAGTAGCGCGACCATGGGACTCAATCAGGAAAGCATTCTTCGGTCCTAGACCTCAGGGCCTGTGACAGCAAGCTCTGCTGCAAAGGTCTCTGAAATGGCTTCAAGGCCTTTTAACCTTGTCTTAGCTATTAGCACTGGGCTCCATTTTATGCAAATTTCTGAAGGCTTCTTGAATTTTCCCACTGAAAATCAGCTTTGCTTTTTGACCACTTGGCCAGGCTGCAAATTTTCCAAACTTTTAAGCCCTGCTTCTCATTTAAATATAAGTTTCAATTTGAGGTCATTTATTCAGTCACAGAGAAGACCACAGGCTGTTCAAAACAGACAAGACACCTCTTGAGCTTTGCTGCCTACTTCATTTCACCAGATATACCCTAAATCATCACACTCAAGTTCAAAGTTTCAGAGGTCTCCAGGGCAGGGACACCATCCTGCCAAGTTCTTTGCTAAGGGAAAACAAAAGTAACCTTGACTCCTGTTCCCAGTAAGTTGCTCATTTTCATCTGAGACCTTCTAAGCCTGGTCTTCACTGTCCATCCTTCAGTCACTATTTTAATTATAGCTATGTAACAAGTCTCCATGGTCACCCTTTTAATTTAACACATCTCTACAATAGTCCAAATTTTCCCTCATCTTTCTTTCTTCTTCCAAGCCCCCCAAACTGTGCAGCGTCCGGTGGTTACCCACTTCTGAACCTGCTTCTACGTTTTCAGCCACCGTTGTGGCAGCCTAGCAATGTGGTGAAAGGAGAAAAGTCCATTTTCAGGGGGAAAATTCAAGAAGGCTTCAGATATTACCATATAAAAGTGGGCAAATGTTAATAGTAAAAAAAAAAAAAAATGGGGAAAAAACCTTGAGGGCATTTCATGGCTCCACTCTACAGTACAAATTTTCTGTAATATTATTTTTAAAAGAGTTTTAATTGGCTCATGGTTCTGCAGGTTGTAAAGGCAGCAAGTGGTTTCTGCTTCTGGGAGGACTCAGGGAGCCTCCCAATCATACCAAAAGGCCAAGTGACAATGAGATATTCCATATGGCAGGAGTAGGAAGAAGACACAGAGAAGGAAGAGGTGCCACACCAGTTTATACAACCAGATCTCATGAGAACTCACTATCAGGAGATCAGCATCAGGAAGATTAACCAATGGTGAAGGATCCACCCACACCACCGCCTACTGTTTCCAGGCAGAAGCCTCCTGCAGAGCAGAACCTCTTGGAGAACCTCTACGAGTGCAGTGCAGAAGGAAAATATGGGCTTGGAGCCCCCACACAGGAGGCCACCATCCTCCAGACTGCAGATACATAGACCCAACAGCTTGCACTCTCTGCGTGGAACAGCTACAGGCACTCCACACCAGCCCAGCCCATGAGAGCAGCCATGGGGGCTACACCCTGCAAAACCACAGGTGCACTGCCCTAGTAGAGACTTTCCCTGAGCCTCTGCCTCTGCAGCAGGCTACTCCCACTTCCTGCTACCCACCACCCTCTCACCACCCTACTAACAACCTACTCCTCACCCTACCCACCCCTTTTCCTTCCACCCCCGGCCCCCGACCATCCACGATTAAATCACCTCCAGCCAGGCCCCACCTCCAACATTAAAGATTACAATTCACTTGAGTTTTGTTAAAGAAACACAGCCAAATCATATTATTCTGACCCTGATCCCCACAGTCTCATGTGCTTCTCACAGAGAAAAATATATTCATGCCTTTTCAAAAGTTTCCAAAAGTCTTAAATCATTCCAACATTAACTCAAATATAAAAAAATCAACTTCTCATCTGAGACAGTTCTACAGTACGTTTTGCCTATGAGTCTCTGAATTTAAAAGGATGTTCTTTTCTTTCAAGGTACAATAATGGTACAGGCTTTGGGTAAGATTTTTCAATCCAGAGGGAAGAAATTTCCGAGGAAGGAAACACAAATGGGACCACAGGCCCAATACAAGTCCAAAACCCAGAAGGCCAGTATCCATTCAATCTTACAGCTCCAAAATCATGAAGAGAACTCACTATCACAAGGACAGCAATAAGGAGAATGTTTAATCATTTGTGAAGCATCCGCCCCACAACTCCCAATTTTCACTCCTCACCCGCACCATAAATCCCCCATTCTCCCTACGCCCCATCTTCCAACCCCAACTCTCCACCGTGATTAAATCACCTTCCACCAGGCCCCACCTTTAACATTCCGATGACAATTCCACATGAGTTTTGGTAGAGACACAGAGCTGAATTTTATTATTCTGTCCCTGGCTCCCCAAATCTCATGTCCTTCTCACATTGAAAAATATAATGATGCCTTCCCTACAATCCCCCAAAATCTTATATCATTACAGCATTTATTCAAATGTTGAAAGCTTAAAGTCTCATCTGGCACAAGGCTACAGTTGCTTAGGCCCATGAGCCTCTGAACTATAAAGCAAGTTAACTACTTCCAAGGTACAATGTTTGTAGAGCCATTGGGTAAGCATTCCCAGCCAAAAGAAAGAATTTTGCCAGAAAAAAAACAAAACATAGACAGGACTTACCGGTCCCATGAAACTCCAAAACCAAAAGGCCAGTCATTCAATCCTACAGCTCCAAAATCACCCTTTTTGAAACCCTGTCCCACATCCAGGGCACAGGGGTGTGAGGGCTGGGCTCCCAAGGCCTTGGGCAGCTTGGCACCTGTGGCTTTGCAGGGTTTATGCCCACGGCTGCCCTCAGGACCTCGGCTGGTGTTGAGTGCCTGTGACTTTTCCCCACTAAGGATACAAGTTGTTGGGGGTCTATGAATCTGGGGTCTGCATGATGGTGGCCTCCAGTGTGGAGGCTCCAACCCCATGTTTTCCTTCTGCACTGCCCTAGTAGAAGTTTCATATAAGGCTCTGCCTTTTTGGGATGTTTTTGCCTGGACACCCAGGCATTTCCATACATCTTCCAAAATCTATAGAGAGGTTTCCAAGCCTCTGGTCTCATGCTCCGTCCACCAATGGCTTAACACAATGAGGAAATTACCAAGGCTTCTAGCTGGCACCCTCTGTAGCAGTGACCCAAGCTGTAGCTGTGCATCTTTCAGTCATGGCTGGAGCTGGAGCTGGAGCTGGAGCTGGAGCTGCAGGGATGCAGGCAGCAGTGTCCTGAGGCTGCACACAGAGGGGGGGCATGGAACTGGCCCAGGAAACCATTCTTCTCTCCTAGGCCCCAGGGCCTGTAACAGCCAGGGCTGCTGCAAAGGTCTCTGAAATGCCTTCAAGGCCTTTTCCCTATTGTCTTGTCTATTAGCACTGGGCTCCTTTTCATGCGAGTTTCTGAAGCCTTCCTCAACTTTCCCCCTGAAAATCAGCTTTTCTTTTTGACCACATGGCCAGGCTGCAAATTTTCCAAACTTTTGAGTTCTGTTTCTCATGTAATGTGAGAGTTGGGACTCATTTAATGTAAGTCTCATCCAGAAGTCATTTCCTCCATCACACATAAGAACACAGGCTGTGTGATGCAGACAGGACACCTCTTGAGTTTGCTGCTCAGTTCATTCCACCAGATACTCAGTAAATCCTCACCCTCAAGTTCAAAATTTCACAGATCTCCAGGGCGAGGTCACCGTGCAGCCATGTTCTTTGCTAAGGAAACAAAAGTAACTTTGACTTCTGTTCCCAGTAAGAGCTTCATTTTCATCTGAGACCTTCTAAGTCGGGCTTTCACTAACCATTTTCCTGTGAGCCTTCTGATCACAAGTGTTTAGCAATTCTTTACAAAGATCCAAACTTTCTTTCATCTTCTTGTCTTTGAAGCCCTCCAAACTCTCCCGACCTCTGTCCGCTACTCCCTTCTGAATCTGCTTCTACATTATCACTATCTTTGCCACAGCCTGGCAATGTGGTAAAGGAAAACAAGTCCATTTTCAGGGGAAAATTCATGAAGCCTTCACATACTTGAATGAAAAGAAGCTGAGTGCTGATTGCCACGACAATGACATTTAATAGTTCCACTTTGCACTACTAATTTTCTCTATGATCATAAAGAAAAGAAGTTTAATTGGCTCATGATTCTGCAGACCATAAGGAAACATAATGGCTTCTGAATCTGGGAGGACTCAGGAAGCCTTCCAATCATACCAGAATGTCCAAGGGCAATGACATGATTCATGTGGCAGGAGTAGGCACAAGACACACACAGGAGAGAGGACCACACCCTATTATACAAACAGATCTCATGAGAACTCACTATCACAAGGTCAGCATCATGAAGATGGTGCTTAAACATTGAGGAAGGAACAACCACCCACCCCCAACTCCCACTGTTTCCAAACAGAAGCCTGCTGCAGAGGCAGAGCCTCTTGGAAAACCTCTACCAGGGAAGTGTGGAAGGAAAATATGGGCTTGAAGCCCCCAGGCAGATGGCCACCAACCTCCAGACACCAGATTCATAGACCCACCAACAGCTCACACCCTCTGTGGAAAAGCTACAGGCACTCAACAACAGCCCAGACCGTGAGAGCAGCTGCAGGGGCTAAACCCTGCAAAGCCACAGGTGCTCTGTCCTAGTAGAACTTTTCCATGAGGCTTTGCCTCTGCAGCAGGCTGCTCCCCCTTGCTACTACCCCCCACGCTCCCACCATTCTACTGCCAGCCTACTCCTCCCCACCCTAACCAACCCTTTTGTGATACCCTACCTTGTTTTAACCTGGTCGACTCTCCCTTAGCTGAGAGGGCCAGACAGACTCCATCTTGGCTCCTTCACTTGCAGCCCCTTACCCACCCCCCTTCCTCAAGGACTTAACTTGTGCAAGCTGACTCCCAGCACATCAAAGAATGCAATTACTGATAAGATACTCTGGCAAGCTATATCCACAGTTCCCAGGAATTCGCCCGGTTGATAGTACACAAAACCCCAGCATTTGTGTCCAGTTGATAGCACCCAAAGCCCCCACATCTATCACCTTTGGATGGATTTAAAGCCCCTGCACATGGAAATGTTTGTTTTCCTGTAGCCATTTATCTTTTTAACTTTTTTGCCTGTTTTGCTGCTGTGAGAGTCCTTCAGCGAGGCTCCCCCTCCCCTTTCTAAACCAAAGTATAAAAGAAAATCTAGCCCCTGCTTCCGGGCCAAGAGAATTTTGAGCACTAGCGGGCTCTCAGTTGCCGGCAATAAAGGTCTCCTGAAGTCGTCTCATGGTTTGGCGTTTCTCTACAACTCACTCGGTTACAACCCTTTTCCTTCCACCCCAACCTCCTCCAATCAATGACTAAATGATCTCCCCCAGGCCCCACCTTCAACATTTGGAATTACAATTGCACCTGAATTTTTATAGGGACACACAGCCAAACCATATTATTCTGATCCTGATATTCCAGAATCTCATGTCCTTAACACAGAGCAAAATACAATCATGCCTTTTCAAAACTTCCAACAGCCTTAACTCATTCCAAGTGTAAAAAGTTCAAAGTCTCATCTGAGACAAGGCTACTGTCCCTTCTGCCCATGGGTTCCTGAATTTAAAAGAGATTTCTTTTCTTTCAAGGTACAATGATGGTACAGGTGTTGTGTAAGCTTTCTCAATCCAAAGGGAAGAAATTTCCCAGAATACACAAACGGGACCGCAGGGCAAATGCAAGTCCAACACCCAGCAGGACACTATTCACTCAATCTCACAGCTCAAAAATCATCAAGAGAACTCACTGTCATGCGGACAGCATTAAGGAGATAGCGTTTACCCATTTGTGAAGAATCTTCCATCCCACCCTCATCTTTCACTCCCACCCACAAAATAATCTCTCCCATTCTCCCCACACCCCTACCTCCAACACCCACTCTTCTCCATGATTAAATCACCTCCCACCAGGTCCCACCTTTAACATTTCCCACTACAATTCCACATGAATATTGGTAGAGAGACAGAATCAAATCGTATTATTCTGACTCTTGCTCCCCAAATCTTGTATCCTTGTCACGCTGCAAAATACAATGATGACTTCTCTACTGTCCCCCAATGACTTAACTCATTCCAGCATTTACTGAAATGTCCAAGGACTTACAGACCCCATACAAGTCAAAAACCCAGCAGGCCAGTCATTGAATCCTACAGCTCCAAATCATTTTTTCTGAATGGATATCTCACATCCAGAGCACAGGTGTGTCATGGCTGGGCTCCCAAGGCCTTGGGCAGCTCTGCACCTGTGGCTGTGCAGGGTCTATCCCCCATAGCTGCCCTCATGGGCTGGGCTGGTGTTGAGTGCCTGTAGCTTTTCCACACCAAGGTTGGAAAAGCCAGCTGTTGGTGGGTCTATGAATCTGGGGTCTGGAGAATGGTGCCTCCATGTTTAGGGACTCCAGTCTTATATTCTCCTTCTGTACTGCCCTAGTAAAGGTTTCCCATGAGGCTCTGCCTCTTGGAAAAGCTTCTACCTGAAAACTCAGGTTTTTCTGTAGATACTCTGGAGTCTAGACAAAGGCTCCCAAGCCTCTAGTTTTGTTCTCTGTGCACCTGCTGGCTTAACACTATGTGGAAGCCACCAAAGCTTGAAGCTTACACCCCTGAAGCATGATGCAAGCTGTACCTGTGCATCTTTCAGCCATGGCTGGAACTAGAGCTGCAGGGATGCAGGCAGCAGTGTCTTGAGACTGCACATAGAGGGGGGTCATGGAACTGGCCTAGGAAACCATTCTTCTCTCCGAGGCCCCAGGGCCTATGATAGCAAGGGCTGCTGCAAAGGTTTCTGAAATGGCTTCAAGGCCTTTTCCCTATTGTCTTGGCTATTAGCACTGGGCTCCTTTTCATGAAAATTTCCGAAGCCTTCCTCAGTTTTCCCCTGAAAATCAGCTTTTCTTTTTGACCATTTGGCCAGGCTGCAAATTTTTGAGTTCTGTTTCTCATTTAATATAAGAGTTGGGACTCATTTAATGTAAGACCCATCCAGATGTCATTTCCTCGGTCACACATAAGGGCACGGGCTGTTCGATACAGACAGGACACCCCTTGAGCTTTGCTGTCCAGAAGTTTATTCCATCAGATACACAGTAAGTCATCACCCACAAGTTCAAAGTTTCACAGATCTCCAGGGCAGTGTCACTGTGCAGCCACGTTCTTTGCTACAGCAAAACAAAAGGCACCTTGGCTCCTGTTTGCAGTAAGTTCCTCATTTTCATCTGAGAGCTTCTCAATCTGATCCTTACTGTCTATTTTCCTGTGAGCCTTCTGATCACAAGTATTCAACAATTCTTTACAAAGATCCAAACTTACCCTCATCTCCCTGTCTTTGAAGTCCTCCAAACTCTCCAGAACTCCATCTGCTACCCCCTTCTGAACCTGCTTCTACATTATCAGCTATCTTTGTCATACCCTGGCAATGTGGTAAAGGAAGACAAGCCCATTTTCCGGGGAAAAATTCAAGGAGGTTTCAGATACTTGAATAAAAAGAAGCTGAGTGCTGATTGCCAAGACATTAGAGAGAAGGCCTTGAAGACATTTAATAGATCCACTTTGCAGTAATAATTTTCTCCATGATCATAAAGAAAAGAGGTTTAATAGGTTAATGATTCTGCAGGCTGTAAGGAAGCATAGTGGCTTCTGCATCTGACAGGACTCAGGAAGCCTCCCAATCATACCAGAACGTCAAGGGGCAAGGAGATGTCTCATTTGGCAGAAGTAGGAGCAAGACAGAGAAAGGAAACAGGTGTCATGCCCCATTACACAAGCAGATCTCATGAGAACTCACTATCACAAGGTCAGCATCTAGAAAATGGTGCTTAAACGTTGGTGAAGGATCCGCCCCACACACCCAACTCCCACTGTTTCCAGGCAGAAGCCTCCTTCAGATGCAGAGCCTCTTGGAAAACCTCTATTATGGAAGTGCAGAAGGAAAATATGGGCTTGAAGTCCCCACACAGGTGGCCACCAACCTCCAGACCCCAGATTCATAGGCCCACCAACAACTCACGCCCTTAGTGTAGAAAAGCTACAGGCCCTCAATACCAGCCCACCCCACGAGAACAGCTGAGGGGCTAAAACCTGCAAAGCCACAGGTGCACTGCCCTAGTAGAGGTTTTCCACGAGGCTTTGCCTCTGCAGCAGGCTACTCCCCCTTCCTACTACCCCCCACCCTCCCACTACCCTACTGCCAACCCACTCCTCCCAATCCTACCCATCCCTTTTACCTTCCACCGCCACCAACCCCCTGTCCATAATTAAGTCACCTACTTCAACATTAGGGATTACAATTCCACATGAGTTTCATAGGGACACACAGGAAAACCATATAATTCTGACCCTGATATTCCAGAATCTCATGTCCTTATCACAGAGCAAAATACCATCATGACCTCTCAAAAGTTTGCAAAAGTCTTAACTCATTCCAAATGTAAAAAATTCAGTCTCATCTGAGACAAGGCCACAGTCCCTTCTGTCTATGAGTCCCTGAATTTAAAACGGAGTTCTTTTCTTTCAAGGTACAATGATGGTAGAGACATTGTGTAAGCTTTCTCAGTCCAAAGGGAAGAAATTTCCCAGAAAAATAATACAAATGGGCCTACAGGCCCAATGCAAGTCCAAAACCCAGCGGGACAGTATTCACTCAGTCTCTCAGCTCCAAAATCATCAAGAGAACTCACTATCATGTGAACAGCATTAAGGAGAGCGTGTTTATCCATTTGTGAAACATCCGCCCCCCACCCTCATCTTTCACTCCCACCCACAAAAAAATCTCTTCTATTCTCCCCACTCCCTTACTTCCAACCCCCATGCTTCTCCATGATTAAATCGCCTCCCACCAGGCCCCATTTTTAACATTCCCCATTATAATTCCACATGATTTTGGTAGGGATGCAGAGCCAAATCATATTATTCTGACCCTGGCCCCCGTATCTCATGTTCTTCTCACACTGCAAAATACCATGATGCCTTCTCTACAGTTTCCCAATCTCTTAACTCATTCCAGCATTTACTGAAATGTCCAAAGCCCAAAGTCTCTTCTGAGACAAGGCTGGGGTCTCTTCTGCCCCTGAGCCTCTGAAATACAAAGTAAGTTAACTACTACCAAAGTACAGTGATTGTACACACATTGGGTAAGTATTCCCAGCCAAAAGGAAGAAATTAGCCAGAAAGAAGAACAAAACACAGATGGGACTTACAAACCCCATAGGAGTCAAAAATCCAACAGCCAGTCATTGAATCCTACAGTTCCAAACCATTTTTTTTGAATCCAGATCCCACATTCAGAGCACAAGGGTATGAGGCCTGGGCTCCCAAGGCCTTGGGCAGCTCTGCACCTGTGAAGTTGCAGGGTCTAACCTCCACAGTTGTCCTCATGGGCTGGGCTGATGTTGAATACCTATAGGTTTTCCACAATGAGGGTGCAAGCTGCTAGTAAGTCTATGAATCTGGTGTTTGCAGAATGGTGCCTCCCTGTATTGGGGTTCCAACCCTATATTTTCCTTCTGTACTGCCCTAGTAAAGGAGGCTCTGCCTCTTGGAAAAATTTCGACCTGGACACCCAACTTTTTCCATACATACTCTGGAGTCCAGACAAAGGATCCCCAGCCTCCAGTTTTGTGCTCTGAGCACCTGCTGGCTTAACACTATGTGGAAGCCACCAAGGCTTGCAGTTTGCACCCTCGGAAGCAGTGACCCAGGCTGTACCTGTGCATCTTTCAACCATGGCTGGAGCTGGAGCTTCAGGAATCCAGCCAGCAGTGTCCTGAGGATGGACATAGCAGCGGGGTCATGGGGCTGAAGAAGGAAACCATTCTTTTCTCCCAGGCCTCAGGGCCTGTGATAGCAAGGGCTGCTGAAAAAGTCTCTGAAATGCCTTCAAGGCCTTTTTAACATTGTATTGGCTATTAGCACTGAACTCCATTTTATAAACATTTCTGAAGCCTTCTTGAACTTTCCCACTGATAATAAGCTTTTCTTTTTGACCACTTGGCCAGGCTGCAAATTTTCACAACTTTTAAGCTCTCCTTGTCATTTAAATATGTTTCACCTTGAGGTCATTTCTTTGGTCACATATTTTCATGTGAGACCTTCTAAGCCTGGTGGTCACTGTCCATCCTTATGTCACCTTCTTAATTATAACTATTTAACAAGTCTCTACACTGATCCAAACTTTTCCTCATCTTCCTGTCTTCTTCCAAGACCTCCAAACTCTCCAACCTCTGGCCATTACACACTTCTTAACCTGCTTCTACATTTTCAGCTACATGTGTCACAGCCTGGCAATGTGGTAAAAGAAGAAAAGTCCATTTCAGGAGAAAAATTAATGCAGGCTTTAGACTTTTGCCTGAAAAGAAGCTGAGTGCTGATTGCCATGACAATAGGGAAAAGGTCTTGAAGGCATTTCATAGCTCCACTTTACAGCATTAATTTTCTGTATAATCAGAAAGAAAAGAGGTTGAACTGGCTCATGGTTCTGCAAGCTTTAAATAAATCATAGAGGCTTCTGCTTCTGGGAGGACTCAGGAAGCCTCCCAATCATACCAGAAGACCAAGCAACAATGGGAAGTTTTATATGGCAGAAGTAGAAACAAGACAGAGAGGAAAAAGGTGGCACACGTTGTATAAACCTGTTATACAACCAGATTTCCTGAGAACTCACTATCACAAGGTCAGCATCAAGAAGATGTTGCTTAACCATTGGTGAAAGATCTGCCCCCTACCACCCCCACCCCCCACTGTTTCCAGGCAGAAGCCTGAGGCACAGGCAGAGCCTCTTGGAAAACCTCTACTAGGGCAGTGCAGAAAAAATATATGGGCTTGGAGGCCCCACGCAACCATCCACCAGACCACAGATTCATAGACCCAACAATAGCTTGCATTCTCAGTATGGAAAAGCCACAGGCACTCAACACCAGCCCAGCCGACGAAGGCAGCCATGGGGGCTAATGCCTGCAAAGCCACAGGTGCACTGCCCTGGTAGAGCTTTTCCATGAGGCTCTGCCTCTGCAGCAGGCTACTCCCTCTTCCTACTGCCCACCACACTCTCACCACCCTACTGCCAGCCTACTCCTCCCCACCCTACCCACTTGTTTTCCCTTCCACCCCTACCAACCTCCCATTTGTGATTAAATCACTTCCCACCAGGCCCCACCTGCAACAGTCAGGAATACAATTCACCAAGAGTTTTTGTAGGGAAACACAGTCAAACCATATTATTCTGACCCTGAAACCCCCACATCTCATGTCCTTCTCACACAGAAAAATACAAACATGCCTTTTCAAAAGTTTCCAAAAGTCTTAACTCATTCCAGCAGTAACTCAAATGTAGTAAGTTCAAGTCTCATCCAAGACAAGGCTGCAATCTCTTCTGCCTATGAGTCCCTGAATGTAAAAGACAATTCTTTTCTTTCAAGTTACAATGATGGCACAGGCACTGGGTAAGCTTTCTCAATCCAAAGGGAAGATTTTCCCTGAAAAATAACACAACTGGGACACAGGCCCAATCCGAGTCCAAAACCCAGCAGGACAGCATTCATTTATCATGAGAACTCACTATCGCACAGACTGCATTAAGGTGATAGTATTTAACCATTTGTGAAGGATCTGCCACCCATCCCCATGTTTCACCCTCACCCACACCATGAACCCCCATTCTCCCACATCCACCTTCCAACCCCCATTCTCTACCATGATTAAATCACCTTCTACCAAGCCCCACATTTGACATTCCCCATTACAATTCCACATGAGTTTTGGAAGGGACACAGAGCCAAATCATATTATTCTCCCCTTGGTCCCCAACCTCATGACCTTCTTATACTGCAAAATATAATGATGTGTTCTCTAAGGTCCCCCAATGTCTGAACTCATTCCAGCATTTACTCAAATGTCCATTTGTGAAGGATCCACCCCCCACCCCTCCCTTTCAGCCACAAATGCACCACAATCCGCCAACACTCCCCACCCACCTATAGCCCCAACCCTCCACACCACCCCCAGCATCCACCCTCCATGCTCCACCATGATTAAATCACCTTCCACCAGCCCCCACCTTTAACATTTCCCATTAAAATTCCACATGAGTTTTGGTAGAGACACAGAGCCAAAACATATTATTCTGCCCCTGGTCCCCCAAATCTCATGTCTTTTTCATATTGCAAAATGCAATGATGCCTTCCCTAGAGTCCCCTAAATCTCAACTCATTCCAGCCTTTACTCAAATATCCAAAGCCCAAAGTCTCTTCTGAGACAAGGCTGCAGTCTCTTCTGCCCTGAGACTCTGAAATACACAGCAAATTAACTACTTCCAAGGTACAATGACTGTACTGGCATTGGTTAAGCATTCCCAGGCAAAAAGAAGAAATTTGCCAGAAAGAAGCATAAAACACAGATGGGACTTACAAACTCCCTGCAAGTCAAAAGCCCAGCAGACCAGTCATCCATCATACACCACCAAATCACCTTTTTGGAATCTATGTCCACATCCAGAGCACAGGGTGATGTGACAGCTGGGATCCCAAGGCCTTGGGCAGCTCTGCACCTGTGGCATTGCAGAATCTTTCCCCTACAGTTGCCCTCATTGACTAGGCTGGTGTTGAGTGCCTGTAGCTTTTCAACACTAAGGGTGCAAGCAGCTGGTGCGTCTATGAAACTGGGGCCTGGAAAATGGTGCCTCCCTGTATGGGGACTCCAACCCTATACTCTCCTTGTGTACTGCCTGAGTAGAAGTTTACGATGAGGCTCTGCCTCTTGGAAAAGCTTCTGGCTGGACAATCAGGCTTTCTGATACATCCTCTGAAGTCTAGATGAAGGCTCAGAAGCTTCTAGTCCTTGCTTTATGCAACTGCTGGCTTAACACCTTGTGGAAGCCACCAAGTCTTCGAGCTTGCATTCTCTGAAGCAGTGACACAAGCTGTACCTGTGCATCTTTCATCCATGGCTGGAGCTGGAGATGGAGCTGCAGGGATGCAGGCAGCAGTGTCCTGAGGCTGCACACAGCAGTGGAGCCATGGGGCTGGCCCAGGAAACCATCCTTTTCTCCTAGGCCCCAGGGCCAGTGACAGCAAGGGCTGCTACAAAAGTCTCTGAAATGCCTTCAAGGCCTTTTTCCCATTATCTTGAATTATTAGCACTCGGCTCCTTTTTATGCAAATATCTGAAGCCTTCTTGACTTTCCCCCTGAAAATCAGCTTTTCTTTCTGACCACTTGTGGAGATTACAAATTTTCCATATGTTTAAGCTCTGCTTCTCATTTAAATATAAGTTCCAACTTATAGTGATTTCTTTGACCACACATAGGTGCGCAGGCTGTTCAATGTAGGCAGGACAACTCTTGAGCATTGTTGCTTAGAAGTTCATTCTACCAGATACACTCTAAATCATCACCCTCAAGTTCAGTTTCACAGATCTCCTGGAAAGGATCAATGTGTAGTCAATTACTTTGCTAAGGCAAAACAAAAAAACCTTGGCTCCTCTTCCCAGTGAGTACTTCATTTTCACCTGAGATCTTGTAAGCCTGACATTCACTGTCCATCCTTATGTTAGCCTTTTAATCACAACTATTGAACAAGTCTCTGCAATGGTCCAAACTTTCCCTCATCTTCCTGTTTTCTTCCAAGATCTCCGAACTCTCTAACCTCTGGCCATTACCTAATTCAGAACCTGCTTCTACACTATCAGGTATCATTTTCGCAGCCTGGCAATGTGGTAACAGAAGAAAAGTCTATTATCAGGGGGAAACATCAAGATGGTATCCAATATTTGCATTGAAAAAAGCTCAGTGCTAATAACCAAGAGATTGGGGGAAGGCCTAGAAGTCATTTCATAGCTTCACTTCGCAGCATTAATTGTCTGTATGTACATAAAGAAAAGAGGTTTAGTTGACTCACAGTTCTTCAGGCTATAAAGAAAGCATAGTGGATTCTGCTTGTAGGAGGACTCAGGAAGCCTCCCAATCATACCAGAGGCCAAGCAGCAATGAAATGTTTCATATGCCAGAAGTAGAAGCAAGACAGAAAGAGGAAAGAGGTGCGACATCCTCTTATACAACTAGATCTCATGAGAGCTCACTGTCAGGAGATCAGCATCAAGAAGATGGTGCTTCACTGTTGGTGAAGGATCCGCCCACCACCCCATATCCACCACCCACTGTTTCCAAGTAGAAGCCTGAGACAGAGGCAGATCCTCTTGGAAAACCTCTACTATGGCTGTTTAGAAGAAAACTATGGGCTTGGAGCCCCCATGTAGGATACCAGCATCCTCCAATCCCCAGATTCATAGAACCATCAACAGCTCACACCCCCAGTATCGAAAAGCTACAGGCACTCAACACCAGCCCAGCCCATGAGAGCAGCTACAGGTGCTAAACCCTGCAAAGCCCCAGGTGCACTGCCTTAGTAGAGTTTTTCCATGAGCCTCTGCCTCTGCGGCAGGCTACTCCCATCCTGCTACACACCACCCTACAGCCAGCCTACTCCTCCCCACTTTACCCACCTGTTTTTACGTCCAATCCAACCCCTCTCCCATCCATGAATAAATCACCTCCCACCAGGCCTCACCTGCAACATTGGGGATTACAATTACATGTGAGTTTAGGTAGGGACACACAGCTAAACAATATTATTCTGACCCTGATCCCCCAAATATCATATCCTTCTCACAGAGTAAAATACAATCATGCCTTTTCAAAAGCTGCCAAAAGTCTTAAGTCATTTCAGCATTAACTCAAATTTAAAAAGTTCAAAGTCTCACCTGAGAAAAGGCTACAGACCCTTTGGCCTATAAGTTCCTGAATTTAAAAGGGATTTATTTTCTTTCAAGATACAAAGATGGTACAGGCATTGGGTAAGTTTTGTCAATCCAAAGGGGAGAGGTGTGCCAGGAAAATAACACAAATGGGATCACAGGGCCAATGCAAGTCCAAAACCCAGGAGGCCAGTATCCATTCAATCTCACTGCTCCAAAACAATCACAAGAACTCATCATCATGAGGAAAGGATTAAGGAGATGGTGTTTAACCATTTGTGAAGGATCCTACCCCCACCCCCACTTTTCACCCCTCACCCCCACCATAATCCACCCATTCTCCCCAATCCCCACCTTCCAATACCCAGTGCCCTCCACGATTAAATCACCTTCCACCTGGCCCCACTTTCAACATTTCTGATTACAATTCCATATGAGTTTCCATAGGGACACACAGCCAAATCTTATTATTCTGTCCCTGCCCCACAAATCTCATGTCCTTCTCACTTTGCAAAATACAATGATGCCTTACTTACCATTCCCCAAGCCACTGTGCTTTTTTTTTACAGCCTGCAGAACCATGAGCCCATTAAACCCCTTTTTGTTATGATCATACAGAAAATTAGTATTGTGAAGTGAAGCTATGAAATGCCTTCAATGACTTTTCCCCATCATCTCGGCTAAGACCCCCAAGGTCTTAACTCATTCCATCATTTACTCAACTGTCTGAAGCCCAAAGTCTCATCTAAGACAAAGATGCAGTCCCATCTCCTCCTGAGCCTCTGAAATACAAAGCAAGTTAACTACTTCCAAGGTATGATTGTCCAGACATTGAGTAAGAATTCCCAACCAAAAGGAAGATTTATGCCAGAGACAAGAACAAAACATAAACGGGACTTACAGGTCCCTTGAAAATCCAAAACCCAGCAGGCCAGTTATTCAAACCTACAGCTCCAAAGTCATCCTTTTTCAATCCTTGTCCCACATCCAGGGCACAAGTGTATGAGGGCTGGGTTCCCAAGGCCTTGGGCAGCACTCTACCTGTGGCTTTGCAGTGTTCAGTCCCCACAGCTGCCCTCATGGGCTGTGCTGGTGTTGAGTGCCTGTAGTTTTTACCCACAGAGGGTACAAAGTTCTTGGTGGGTCTATGAATCTGGGGTCTGCATGATGCTGGCATCCAGTGTGGGGGCCCCAACCCCATATTTTCCTTCTGCACTGCCCTAGTAGAGGTTTCCCAGGAGTCTCTGCTTTTTTGGCAGCCTTCTGTCTGGACACCAAGACACTTTCATACATCTTCCGAAATCTGTATGAAGGCTCCAAAGCCTCTGGGCTAGTGCTCTGTGCACTCGCTAGCTTAAAACTATATGGAAGCCATGAAGCCTTACAGCCTGTACCCTCTGAAGCAGTGATGCAATCTGTACCTGTGCATCTTTCAGCCAAGGTCGGTGCAGGAGCTGGGGCAGCCGGGTTTCAGGCAGCAGTGTCCTGAGGCTGCACACAGCAGCAGGGTCATGGGGCTGGCCCAGGAAACCATTCTTCTCTAATAGGCCCCAGGGACTGTGACAGCAAGGGCTGCTGCAAACATCTCTGAAATGCCTCCAAGGCTTTTTCCCCCAGTGTCTTGGCTATTAGCACTGGCCTCCATTTTATGCAAGTTTCTGGAGCCTTCATGAATTTTCCCCCGAAAATCAGCTTTTCTTTTTGACCACTTGGCCAGGCTGCGGATATTCCAAACTTTTGAGCTCTGCTTGTCATTTAAATATAAGTTCCAACTTGAGGTCATTTCCTCGGTCACACATAACCTCGGTCACACATGAAAGCACAGGCTGTTTGATGCAGACATGCCCCCACCCTTGTGCTATGCTGCCTAGAAGTTCTTTCCACCAGATATGCACTAAATCTTCACCCTAGAGTTCAAAATTTCACAGATCTTGAGGGCAAGGTCGCCCTGCAGCCATGTTCTTTGCTACAGCAAAACAAAAGCTAACCTTGGCTCCCGTTCCCAGTAAGATCCTCATTTTCATCTGAGACCTTGTAAGCCTGGCCTTCACTGTCCATCCTTCTGCCAGCCTTTTAATCACAACTATTTAACAAGTGCCTACAATGGTCCAAATTTCCCTTCATCTCCCTGCCTTCTTTCAAGATCTCCAAACTCTCCAACCTCTGGCTGTTACCCACTTCTGAACCTGCTTTACATTTTCAGCTATCTTTGTTGCAGCCTGGCAATGCAGAAGAAAAAGAAGTCCATTTTCAGGGGGAAAACTTCAGGAAGCCTTCAGATATTTGCATTAAAAAGAAGTCCAGTGCTAATAGCCAAGAAGATGGGGAAATGTCATTGAAGATATTTCATAGCTCCACTTCGCAGTACTTTATTTTCTGCATGATCATAACGAAAAGGGGTTTAATTGGCTCATGGTTCTGCAGGCTGTAAAGAAAGCATAGTGACTTCTGCTTCTGGGAGGACTCAGGAAGCCTCCCAATCATACCAGTAGGAAAACAGCAATGAAATGTTTCATACAGCAGGAGTAGGAGCAAGGCTGAGAGAGGAAAGTGGTGCCACACCGTCCTATAACCAGATCTCATGAGAACTCACTATCACTAGGTCAGCATCAAGAAGATGGTGCTTAAACATTGGTGAAGGATCCGCCCCCCAACACAGCTCCACCCCCTACCGTTCCAGACAGAAGCCTGCTGCAGAGGCAGAGGCTCTTGGAAATCCTGTACTGTGGCAGTGCAGAAGGAAAATAAGGGCTTTGAGTGACTATGCAGGAGGCCACCAGCCTCTAGACCCCAGATTCATAGACCTACCAACAGTTCACACCCTCAGTATGGAAAAGTGATAGGCACTCAACACCAGCCAAGCCTATGAGAGCAGGCTTGTGGGCTAAAGCCTGCAAAGCCACAGGCGCACTGCCCAGGTAGAGGTTTTCCAAGAGCCGCTGTCTCTGCAGCAGGCTACTCCCCCTTCCTACTACCCACCACCCTCCCACCACCCTACAGCCAGCCTACTCTTCCCCACCCTACCCACCCCTTTTTTCTTCCACCCCTACCCCTCCCATTCATGATTAAATAATCTCACACCAGACCCCAACTCCAACATTTTGGATTACAATTCCACATGAGTTTTTCCAGGGGCACACAGCCAAATCATATTATGCTGACCTTGACCCCACCAAATCTCATATCCTTCTCACAGAATAAAATAAAATCGTGTCTTTTCAAAGTTTCCAAAAGCCTTAACTCATTCCCACATTAACTCAAATGTAAAAAGCTCAAAGTCTCATCTGAGACAAGGTTACAATCTCTTCTGCCTATGAGTCCCTGAAGTTAAAAGGGTGTTCGTTTCTTTCAAGGTACAATGATGGTACAGATATTGGGTAACTTTTCTCAATCCAAAGGGAAGAAATTTCCAAGAACAATAACACAAATGGGACCACAGGCCCAATGGAAATCCAAAATCCCACAGGTCAGTGTTCAGTCAATCTCACAGCTCCAAAATCATGAAGAGAACTCAATATCAGAAGGACAGCATTAAGGAGATGGTGTTTAACCATTTGTGAAGGATGCACCCCCACCCCTGCCTTACACCCCCAACCCCACCACAATCCCCTCCAACCCTCCTCACCACCCAATCCCCCCCATCTCTCCCCAACCCTGAAACATCCAACCTCCTCTCTCCATCATGATTAAATCACCTTCCACCAGCCCCCACCTTTAACATTTCCCATTAACATTCCACATTAGCTTTGGTAGAGACAGAGAGCCAAAACATATTATTCTGTCCCTGGTCCCCCAGAGTTCATGTTTTTCTCACATTGCAAAATGTAATGATGCCTTCCTTAGAGTCTCTCAAATCTTAACCCATTCCAGCATTTACTCAAATGCCCAAAGCCGAGAGTCTTATTTGAGACAAGTCTACAGACCCTTCTGCCCATGAGCCACTGAATTATATATAAAGCAAGTTTACTACTTCCAAGGTGCAATGATTGTACAGGCGTTGGGTAAGCATTCCCAGCCAAAAGGAAAAAAATTGCCAGAAAGAAGCACAAAACACAGATGGGACTTACAGACCCCATGCAAGTCAAAAACCCAGCAAGCCAGTCATTGAATCCTACAGCTCCCAAATCATCTTTTCTGAATCTATACCTCACATCTGGGGCACAGGGGTGGATGGCTGGGCTCCCAAGGCCTTGGGCAGCTCAGCATCTGTGGCTGTGCAGTGTCTATCCCCGACAGGTGCCCTTATGGGCTGGGCTGGCGTTGAGTACCTGTGGCTTTTCCACACTGAGGGTGCGAGCAGTTGACGGGTCTATGAATCTGGGGTCTGGAAAGTGGTGCATTCCTGTGTGGGGGCTGCAACCCTATATGTTCCTTCTGTACTGCCCTAGTAAATGTTTCCCATGAGGCTCTGCCTCTTGGAAAAGCTTCTGCCTCAACACCCAGGTTTTTCTGTATATACTCTGGAGTCTAGACAAAGGCTCCCAAGCCTCTAGTTCTGTGCTCTGTGCACCTGCTGGCTTAGTACTATGTGGAAGCCACCAAGGTTTGAAGCTTGCACCCCTGAAGCAGTGATGCAAGCTGTACCTGTGCATCTTTCAGCCGTGACTGGAGCTGTAGCTGTAGCTGTAGCTGCAGGGATGCAGACAGCAGTGTCCTGAGGACGGACACAGTAGCGCGACCATGGGACTCAATCAGGAAAGCATTCTTCGGTCCTAGACCTCAGGGCCTGTGACAGCAAGCTCTGCTGCAAAGGTCTCTGAAATGGCTTCAAGACCTTTTAACCTTGTCTTAGCTATTAGCACTGGGCTCCATTTTATGCAAATTTCTGAAGCCTTCTTGAATTTTCCCACTGAAAATCAGCTTTGCTTTTTGACCACTTGGCCAGGCTGCAAATTTTCCAAACTTTTAAGCCCTGCTTCTCATTTAAATATAAGTTTCAATTTGAGGTCATTTATTCAGTCACAGAGAAGACCACAGGCTGTTCAAAACAGACAAGACACCTCTTGAGCTTTGCTGCCTACTTCATTTCACCAGATATACCCTAAATCATCACACTCAAGTTCAAAGTTTCAGAGGTCTCCAGGGCAGGGACACCATCCAGCCAAGTTCTTTGCTAAGGGAAAACAAAAGTAACCTTGACTCCTGTTCCCAGTAAGTTGCTCATTTTCATCTGAGACCTTCTAAGCCTGGTCTTCACTGTCCATCCTTCAGTCACTATTTTAATTATAGCTATGTAACAAGTCTCCATGGTCACCCTTTTAATTTAACACATCTCTACAATAGTCCAAATTTTCCCTCATCTTTCTTTCTTCTTCCAAGCCCCCCAAACTGTGCAACGTCCGGTGGTTACCCACTTCTGAACCTGCTTCTACGTTTTCAGCCACCGTTGTGGCAGCCTGGCAATGTGGTGAAAGGAGAAAAGTCCATTTTCAGGGGGAAAATTCAAGAAGGCTTCAGATATTACCATATAAAAGTGGGCAAATGTTAATAGTAAAAAAAAAAAAAAAATGGGGAAAAAACCTTGAGGGCATTTCATGGCTCCACTCTACAGTACAAATTTTCTGTAATATTATTTTTAAAAGAGTTTTAATTGGCTCATGGTTCTGCAGGTTGTAAAGGCAGCAAGTGGTTTCTGCTTCTGGGAGGACTCAGGGAGCCTCCCAATCATACCAAAAGGCCAAGTGACAATGAGATATTCCATATGGCAGGAGTAGGAAGAAGACACAGAGAAGGAAGAGGTGCCACACCAGTTTATACAACCAGATCTCATGAGAACTCACTATCAGGAGATCAGCATCAGGAAGATTAACCAATGGTGAAGGATCCACCCACACCACCGCCTACTGTTTCCAGGCAGAAGCCTCCTGCAGAGCAGAACCTCTTGGAGAACCTCTACGAGTGCAGTGCAGAAGGAAAATATGGGCTTGGAGCCCCCACACAGGAGGCCACCATCCTCCAGACTGCAGATACATAGACCCAACAGCTTGCACTCTCTGCGTGGAACAGCTACAGGCACTCCACACCAGCCCAGCCCATGAGAGCAGCCATGGGGGCTACACCCTGCAAAACCACAGGTGCACTGCCCTAGTAGAGACTTTCCATGAGCCTCTGCCTCTGCAGCAGGCTACTCCCCCTTCCTGCTACCCACCACCCTCTCACCACCCTACTAACAACATACTCCTCACCCTACCCACCCCTTTTCCTTCCACCCCCGGCCCCCGACCATCCACGATTAAATCACCTCCAGCCAGGCCCCACCTCCAACATTAAAGATTACAATTCACTTGAGTTTTGTTAAAGAAACACAGCCAAATCATATTATTCTGACCCTGATCCCCACAGTCTCATGTGCTTCTCACAGAGAAAAATATATTCATGCCTTTTCAAAAGTTTCCAAAAGTCTTAAATCATTCCAACATTAACTCAAATATAAAAAAATCAACTTCTCATCTGAGACAGTTCTACAGTACGTTTTGCCTATGAGTCTCTGAATTTAAAAGGATGTTCTTTTCTTTCAAGGTACAATAATGGTACAGGCTTTGGGTAAGATTTTTCAATCCAGAGGGAAGAAATTTCCGAGGAAGGAAACACAAATGGGACCACAGGCCCAATACAAGTCCAAAACCCAGAAGGCCAGTATCCATTCAATCTTACAGCTCCAAAATCATGAAGAGAACTCACTATCACAAGGACAGCAATAAGGAGAATGTTTAATCATTTGTGAAGCATCCGCCCCACAACTCCCAATTTTCACTCCTCACCCGCACCACAAATCCCCCATTCTCCCTACGCCCCATCTTCCAACCCCAACTCTCCACCGTGATTAAATCACCTTCCACCAGGCCCCACCTTTAACATTCTGATGACAATTCCACATGAGTTTTGGTAGAGACACAGAGCTGAATTTTATTATTCTGTCCCTGGCTCCCCAAATCTCATGTCCTTCTCACATTGAAAAATATAATGATGCCTTCCCTACAATCCCCCAAAATCTTATATCATTACAGCATTTATTCAAATGTTGAAAGCTTAAAGTCTCATCTGGCACAAGGCTACAGTTGCTTAGGCCCATGAGCCTCTGAACTATAAAGCAAGTTAACTACTTCCAAGGTACAATGTTTGTAGAGCCATTGGGTAAGCATTCCCAGCCAAAAGAAAGAATTTTGCCAGAAAAAACAAAACATAGACAGGACTTACCGGTCCCATGAAACTCCAAAACCAAAAGGCCAGTCATTCAATCCTACAGCTCCAAAATCACCCTTTTTGAAACCCTGTCCCACATCCAGGGCACAGGGGTGTGAGGGCTGGGCTCCCAAGGCCTTGGGCAGCGTGGCATCTGTGGCTTTGCAGGGTTTATGCCCACGGCTGCCCTCAGGACCTCGGCTGGTGTTGAGTGCCTGTGACTTTTCCCCACTAAGGATACAAGTTGTTGGGGGTCTATGAATCTGGGGTCTGCATGATGGTGGCCTCCAGTGTGGAGGCTCCAACCCCATGTTTTCCTTCTGCACTGCCCTAGTAGAAGTTTCATATAAGGCTCTGCCTTTTTGGGATGTTTTTGCCTGGACACCCAGGCATTTCCATACATCTTCCAAAATCTATAGAGAGGTTTCCAAGCCTCTGGTCTCATGCTCCGTCCACCAATGGCTTAACACAATGAGGAAATTACCAAGGCTTCTAGCTGGCACCCTCTGTAGCAGTGACCCAAGCTGTAGCTGTGCATCTTTCAGTCATGGCTGGAGCTGGAGCTGGAGCTGGAGCTGGAGCTGCAGGGATGCAGGCAGCAGTGTCCTGAGGCTGCACACAGAGGGGGGGCATGGAACTGGCCCAGGAAACCATTCTTCTCTCCTAGGCCCCAGGGCCTGTAACAGCCAGGGCTGCTGCAAAGGTCTCTGAAATGCCTTCAAGGCCTTTTCCCTATTGTCTTGTCTATTAGCACTGGGCTCCTTTTCATGCAAGTTTCTGAAGCCTTCCTCAACTTTCCCCCTGAAAATCAGCTTTTCTTTTTGACCACATGGCCAGGCTGCAAATTTTCCAAACTTTTGAGTTCTGTTTCTCATGTAATGTGAGAGTTGGGACTCATTTAATGTAAGTCTCATCCAGAAGTCATTTCCTCCATCACACATAAGAACACAGGCTGTGTGATGCAGACAGGACACCTCTTGAGTTTGCTGCTCAGTTCATTCCACCAGATACTCAGTAAATCATCACCCTCAAGTTCAAAATTTCACAGATCTCCAGGGCGAGGTCACCGTGCAGCCACGTTCTTTGCTAAGGAAACAAAAGTAACTTTGACTTCTGTTCCCAGTAAGAGCTTCATTTTCATCTGAGACCTTCTAAGTGGGGCTTTCACTAACCATTTTCCTGTGAGCCTTCTGATCACAAGTGTTTAGCAATTCTTTACAAAGATCCAAACTTTCTTTCATCTTCTTGTCTTTGAAGCCCTCCAAACTCTCCCGACCTCTGTCCGCTACTCCCTTCTGAATCTGCTTCTACATTATCACTATCTTTGCCACATCCTGGCAATGTGGTAAAGGAAAACAAGTCCATTTTCAGGGGAAAATTCATGAAGCCTTCACATACTTGAATGAAAAGAAGCTGAGTGCTGATTGCCACGACAATGACATTTAATAGTTCCACTTTGCACTACTAATTTTCTCTATGATCATAAAGAAAAGAGGTTTAATTGGCTCATGATTCTGCAGACCATAAGGAAACATAATGGCTTCTGAATCTGGGAGGACTCAGGAAGCCTTCCAATCATACCAGAATGTCCAGGGGCAATGACATGATTCATGTGGCAGGAGTAGACACAAGACACACACAGGAGAGAGGGCCACACCCTATTATACAAACAGATCTCATGAGAACTCACTATCACAAGGTCAGCATCATGAAGATGGTGCTTAAACATTGAGGAAGGAACAACCACCCACCCCCAACTCCCACTGTTTCCAAACAGAAGCCTGCTGCAGAGGCAGAGCCTCTTGGAAAACCTCTACCAGGGAAGTGTGGAAGGAAAATATGGGCTTGAAGCCCCCATGCAGATGGCCACCAACCTCCAGACCCCAGATTCATAGACCCACCAAGAGCTCACACCCTCTGTGGAAAAGCTACAGGCACTCAACAACAGCCCAGACCGTGAGAGCAGCTGCAGGGGCTAAACCCTGCAAAGCCACAGGTGCTCTGTCCTAGTAGAACTTTTCCATGAGGCTTTGCCTCTGCAGCAGGCTGCTCCCCCTTGCTACTACCCCCCACGCTCCCACCATTCTACTGCCAGCCTACTCCTCCCCACCCTAACCAACCCTTTTGTGATACCCTACCTTGTTTTAACCTGGTCGACTCTCCCTTAGCTGAGAGGGCCAGACAGACTCCATCTTGGCTCCTTCACTTGCAGCCCCTTACCCACCCCCCTTCCTCAAGGACTTAACTTGTGCAAGCTGACTCCCAGCACATCAAAGAATGCAATTACTGATAAGATACTCTGGCAAGCTATATCCACAGTTCCCAGGAATTCGCCCGGTTGATAGTACACAAAACCCCAGCATTTGTGTCCAGTTGATAGCACCCAAAGCCCCCACATCTATCACCTTTGGATGGATTTAAAGCCCCTGCACATGGAAATGTTTGTTTTCCTGTAGCCATTTATCTTTTTAACTTTTTTGCCTGTTTTGCTGCTGTGAGAGTCCTTCAGCGAGGCTCCCCCTCCCCTTTCTAAACCAAAGTATAAAAGAAAATCTAGCCCCTTCTTCCAGGCCAAGAGAATTTTGAGCACTAGCGGGCTCTCAGTTGCCGGCAATAAAGGTCTCCTGAAGTCGTCTCATGGTTTGGCGTTTCTCTACAACTCACTCGGTTACAACCCTTTTCCTTCCACCCCAACCTCCTCCAATCAATGACTAAATGATCTCCCCCAGGCCCCACCTTCAACATTTGGAATTACAATTGCACCTGAATTTTTATAGGGACACACAGCCAAACCATATTATTCTGACCCTGATATTCCAGAATCTCATGTCCTTAACACAGAGCAAAATACAATCATGCCTTTTCAAAACTTCCAACAGCCTTAACTCATTCCAAGTGTAAAAAGTTCAAAGTCTCATCTGAGACAAGGCTACTGTCCCTTCTGCCCATGGGTTCCTGAATTTAAAAGAGATTTCTTTTCTTTCAAGGTACAATGATGGTACAGGTGTTGTGTAAGCTTTCTCAATCCAAAGGGAAGAAATTTCCCAGAATACACAAACGGGACCGCAGGGCAAATGCAAGTCCAACACCCAGCAGGACACTATTCACTCAATCTCACAGCTCAAAAATCATCAAGAGAACTCACTATCATGCGGACAGCATTAAGGAGATAGCGTTTACCCATTTGTGAAGAATCTTCCATCCCACCCTCATCTTTCACTCCCACCCACAAAATAATCTCTCCCATTCTCCCCACACCCCTACCTCCAACATCCACTCTTCTCCATGATTAAATCACCTCCCACCAGGTCCCACCTTTAACATTCCCCACGACAATTCCACATGAATATTGGTAGAGAGACAGAATCAAATCGTATTATTCTGACTCTTGCTCCCCAAATCTTGTATCCTTGTCACGCTGCAAAATACAATGATGACTTCTCTACTGTCCCCCAATGACTTAACTCATTCCAGCATTTACTGAAATGTCCAAGGACTTACAGACCCCATACAAGTCAAAAACCCAGCAGGCCAGTCATTGAATCCTACAGCTCCAAATCATTTTTTCTGAATGGATATCTCACATCCAGAGCACAGGTGTGTCATGGCTGGGCTCCCAAGGCCTTGGGCAGCTCTGCACCTGTGGCTGTGCATGGTCTATCCCCCACAGCTGCCCTCATGGGCTGGGCTGGTGTTGAGTGCCTGTAGCTTTTCCAAACTAAGGTTGGAAAAGCCAGCTGTTGGTGGGTCTATGAATCTGGGGTCTGGAGAATGGTGCCTCCATGTTTAGGGACTCCAGTCTTATACTCTCCTTCTGTACTGCCCTAGTAAAGGTTTCCCATGAGGCTTTGCCTCTTGGAAAAGCTTCTACCTGAAAACCCAGGTTTTTCCGTAGATACTCTGGAGTCTAGACAAAGGCTCCCAAGCCTCTAGTTTTGTTCTCTGTGCACCTGCTGGCTTAACACTATGTGGAAGCCACCAAGGCTTGAAGCTTACACCCCTGAAGCATGATGCAAGCTGTACCTGTGCATCTTTCAGCCATGGCTGGAACTAGAGCTGCAGGGATGCAGGCAGCAGTGTCTTGAGACTGCACATAGAGGGGGGTCATGGAACTGGCCTAGGAAACCATTCTTCTCTCCGAGGCCCCAGGGCCTATGATAGCAAGGGCTGCTGCAAAGGTTTCTGAAATGGCTTCAAGGCCTTTTCCCTATTGTCTTGGCTATTAGCACTGGGTTCCTTTTCATGCAAATTTCCGAAGCCTTCCTCAGTTTTCCCCTGAAAATCAGATTTTCTTTTTGACCATTTGGCCAGGCTGCACATTTTTGAGTTCTGTTTCTCATTTAATATAAGAGTTGGGACTCATTTAATGTAAGACCCATACAGATGTCATTTCCTCGGTCACACATAAGGGCACGGGCTGTTCGATACAGACAGGACACCCCTTGAGCTTTGCTGCCCAGAAGTTTATTCCATCAGATACACAGTAAGTCATCACCCACAAGTTCAAAGTTTCACAGATCTCCAGGGCAGTGTCACTGTGCAGCCACGTTCTTTGCTC""]","""chr21-13880084-INS-1-38892""",-10.0,{},[3.99e-02],301,"[0,301]",0,"[""chr21-13880084-INS-1-38892""]",552,[22],276,[2],[20],[0],0.0399,[3.53e-01],[9.40e-01],0.0,0.56


In [38]:
print("AFR Freq =%.2f"% SV.rows().AF_AFR.collect()[0])
print("West Eurasia Freq =%.2f"% SV.rows().AF_EUR.collect()[0])

[Stage 234:>                                                        (0 + 1) / 1]

AFR Freq =0.00


[Stage 236:>                                                        (0 + 1) / 1]

West Eurasia Freq =0.56


### Here we are showing the sum of alleles found per each population.


In [39]:
entries = SV.entries()
results = (entries.group_by(breed = entries.population.population)
      .aggregate(alleleCount = hl.agg.sum(entries.GT.n_alt_alleles())))
results=results.order_by(-results.alleleCount)
results.show()

2023-03-27 12:01:46.715 Hail: INFO: Ordering unsorted dataset with network shuffle
2023-03-27 12:01:49.021 Hail: INFO: Ordering unsorted dataset with network shuffle


breed,alleleCount
str,int64
"""Armenian in Armenia (SGDP)""",2
"""Bedouin B in Israel(Negev) (SGDP)""",2
"""Jordanian in Jordan (SGDP)""",2
"""Turkish in Turkey (SGDP)""",2
"""Yemenite Jew in Yemen (SGDP)""",2
"""Basque in France (SGDP)""",1
"""Bergamo in Italy(Bergamo) (SGDP)""",1
"""Estonian in Estonia (SGDP)""",1
"""Georgian in Georgia (SGDP)""",1
"""Greek in Greece (SGDP)""",1


Finally get the ids of the samples that have this variant

In [40]:
results = entries.filter(entries.GT.is_non_ref())
print(results.s.collect())

[Stage 247:>                                                        (0 + 1) / 1]

['Armenian222', 'Est375', 'HG01504', 'HG02494', 'HGDP00616', 'HGDP00650', 'HGDP00722', 'HGDP01172', 'HGDP01364', 'Jordan603', 'Kayseri23827', 'Kayseri24424', 'NA17374', 'NorthOssetia5', 'Sam02', 'YemeniteJew4695', 'YemeniteJew5433', 'armenia293', 'iran11', 'mg27', 'tdj409_shugnan']


# Run principal component analysis (PCA) on the Hardy-Weinberg-normalized genotype call matrix.
Finally lets run pca on the genotypes and visualize how the samples are related to each others

In [22]:
eigenvalues, pcs, _ = hl.hwe_normalized_pca(mt.GT)
mt = mt.annotate_cols(scores = pcs[mt.s].scores)


2023-03-27 11:49:16.724 Hail: INFO: hwe_normalize: found 62021 variants after filtering out monomorphic sites.
2023-03-27 11:49:17.248 Hail: INFO: Coerced sorted dataset
2023-03-27 11:49:19.919 Hail: INFO: pca: running PCA with 10 components...) / 1]
2023-03-27 11:49:30.292 Hail: INFO: wrote table with 0 rows in 0 partitions to /tmp/persist_tableUATPGblIIQ
    Total size: 23.71 KiB
    * Rows: 0.00 B
    * Globals: 23.71 KiB
    * Smallest partition: N/A
    * Largest partition:  N/A


In [27]:
from bokeh.models import  CategoricalColorMapper
from bokeh.palettes import Category10

pallete=Category10[7]
colors={
    'Africa': "#12eeff",
    'America': '#310079',
    'Central Asia and Siberia': '#01daa0',
    'East Asia':'#ff48de',
    'Oceania':"#34afff",
    "South Asia":"#008c1e",
    "West Eurasia":"#001f54",
    None: pallete[6]
}


colorTable={}
for s in table.collect():
    colorTable[s.population]=colors[s.super_population]

factors=[]
pallete=[]
for k,v in colorTable.items():
    factors.append(k)
    pallete.append(v)
    
color_mapper = CategoricalColorMapper(factors=factors, palette=pallete)    

p = hl.plot.scatter(mt.scores[0],
                    mt.scores[1],
                    label=mt.population.population,
                    colors=color_mapper,
                    title='PCA', xlabel='PC1', ylabel='PC2')
show(p)