## Testing Lower Stack Depth

In my combined data set for batch 1, I encountered a lot of missing genotypes in the Alaskan samples. The hypothesis is that low stack depth in these samples is preventing pstacks from genotyping individuals at a locus (rather than reads failing to align to the catalog locus in the original `.sam` files). So I want to see whether using a lower minimum stack depth for pstacks will help genotype Alaskan individuals. But first I need to make sure that we can *trust* the genotypes generated with a lower stack depth.


In this notebook, I ...
1. subset n = 50 Korean individuals
2. subsample the reads in the `.sam` alignment files in those individuals
3. run my new, subsampled data files through stacks: pstacks --> sstacks --> populations
4. extract from the populations output file the SNPs that were genotyped in `batch 1`
5. compare genotypes between batches for the subset of individuals


<br>
#### 9/20/2017
<br>
### Step One: Subset Korean individuals



In [1]:
cd ../

/mnt/hgfs/Pacific cod/DataAnalysis/PCod-Compare-repo


In [4]:
import random

# create a list of 50 random numbers
index = random.sample(range(0,325,1), 50)

popmap = open("scripts/PopMap_KOR.txt", "r")
new_popmap = open("scripts/PopMap_KOR_subset.txt", "w")

# if the line number in the popmap is in the list of 50 random numbers, write out that sample to the new popmap
i = 0
for line in popmap:
    if i in index:
        new_popmap.write(line)
    i += 1
popmap.close()
new_popmap.close()
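
The random draw above can also be made repeatable and independent of a hard-coded line count. This is only a minimal sketch, not the code that was run: `subset_lines` and its `seed` argument are hypothetical names introduced here, the sample names are invented, and it is written for Python 3.

```python
import random

def subset_lines(lines, n, seed=None):
    """Return n randomly chosen lines, preserving their original order."""
    rng = random.Random(seed)              # seeding makes the draw repeatable
    keep = set(rng.sample(range(len(lines)), n))
    return [line for i, line in enumerate(lines) if i in keep]

# toy popmap with invented sample names
popmap_lines = ["sample_%02d\tKOR\n" % i for i in range(20)]
subset = subset_lines(popmap_lines, 5, seed=42)
```

Recording the seed alongside the subset popmap would make it possible to regenerate the same 50 individuals later.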

### Step Two: Subsample reads from each individual's .sam file

To do this, I need to choose a proportion of each `.sam` file to keep, so that the Korean files end up approximately the same size as the Alaskan files.

In the Excel file `readsVtags_KOR_batch1`, I took the 10 `.sam` files with the largest number of reads from each population. I then divided each pair (AK # reads / KOR # reads) and took the average.

The proportion I ended up with was `0.87`.
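
The arithmetic behind that proportion can be sketched as follows. The read counts are made-up placeholders, not values from `readsVtags_KOR_batch1`, and `mean_pairwise_ratio` is a hypothetical helper. It assumes "each pair" means files paired by rank (largest AK file with largest KOR file, and so on).

```python
def mean_pairwise_ratio(ak_reads, kor_reads):
    """Average of AK/KOR read-count ratios, pairing files by rank."""
    ak_sorted = sorted(ak_reads, reverse=True)
    kor_sorted = sorted(kor_reads, reverse=True)
    ratios = [float(a) / k for a, k in zip(ak_sorted, kor_sorted)]
    return sum(ratios) / len(ratios)

# placeholder read counts (NOT the real data)
ak = [900000, 800000, 700000]
kor = [1000000, 1000000, 1000000]
proportion = mean_pairwise_ratio(ak, kor)
```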

In [19]:
cd ../

/mnt/hgfs/Pacific cod/DataAnalysis/PCod-Compare-repo


In [24]:
## First, I need to unzip the sam files for these individuals ##
infile = open("scripts/PopMap_KOR_subset.txt", "r")
bash_array = []
for line in infile:
    bash_array.append(line.strip().split()[0] + ".sam.gz")
infile.close()

outfile = open("scripts/gunzip_Korean_subset.sh", "w")
outfile.write("#!/bin/bash\n")
for sample in bash_array:
    outfile.write("gzip -d ../../PCod-Korea-repo/stacks_b7_wgenome/" + sample + "\n")
outfile.close()

In [25]:
!head scripts/gunzip_Korean_subset.sh

#!/bin/bash
gzip -d ../../PCod-Korea-repo/stacks_b7_wgenome/PO010715_26.sam.gz
gzip -d ../../PCod-Korea-repo/stacks_b7_wgenome/PO010715_27.1.sam.gz
gzip -d ../../PCod-Korea-repo/stacks_b7_wgenome/PO020515_10.1.sam.gz
gzip -d ../../PCod-Korea-repo/stacks_b7_wgenome/PO031715_04.sam.gz
gzip -d ../../PCod-Korea-repo/stacks_b7_wgenome/PO031715_23.sam.gz
gzip -d ../../PCod-Korea-repo/stacks_b7_wgenome/PO010715_07_rep.sam.gz
gzip -d ../../PCod-Korea-repo/stacks_b7_wgenome/GE011215_08.1.sam.gz
gzip -d ../../PCod-Korea-repo/stacks_b7_wgenome/GE011215_30.1.sam.gz
gzip -d ../../PCod-Korea-repo/stacks_b7_wgenome/GE012315_05.1.sam.gz


In [29]:
pwd

u'/mnt/hgfs/Pacific cod/DataAnalysis/PCod-Compare-repo'

In [30]:
!mkdir stacks_b2_wgenome

In [35]:
## Subsampling ##

popmap = open("scripts/PopMap_KOR_subset.txt", "r")

#create sample list
sample_List = []
for line in popmap:
    sample_List.append(line.strip().split()[0])
popmap.close()
print "read in sample list."

proportion = 0.50


for sample in sample_List:
    sam_path = "../PCod-Korea-repo/stacks_b7_wgenome/" + sample + ".sam"

    # count alignment records only (header lines start with "@")
    samfile = open(sam_path, "r")
    nreads = 0
    for line in samfile:
        if not line.startswith("@"):
            nreads += 1
    keeplines = int(proportion * nreads)
    samfile.close()

    # write new samfile: the full header plus the first `keeplines` alignment records
    samfile = open(sam_path, "r")
    newfile = open("stacks_b2_wgenome/" + sample + "_subset.sam", "w")
    count = 0
    for line in samfile:
        if line.startswith("@"):
            newfile.write(line)
        elif count < keeplines:
            newfile.write(line)
            count += 1
        else:
            break
    samfile.close()
    newfile.close()
    print "created new file for sample ", sample

read in sample list.
created new file for sample  PO010715_26
created new file for sample  PO010715_27.1
created new file for sample  PO020515_10.1
created new file for sample  PO031715_04
created new file for sample  PO031715_23
created new file for sample  PO010715_07_rep
created new file for sample  GE011215_08.1
created new file for sample  GE011215_30.1
created new file for sample  GE012315_05.1
created new file for sample  GE012315_10.1
created new file for sample  GEO012315_02
created new file for sample  GEO012315_12
created new file for sample  GEO012315_18
created new file for sample  GEO012315_21
created new file for sample  GE012315_11_2
created new file for sample  GE011215_11
created new file for sample  NA021015_02.1
created new file for sample  NA021015_10.1
created new file for sample  NA021015_25
created new file for sample  YS_121316_18
created new file for sample  YS_121316_28
created new file for sample  YS_121316_29
created new file for sample  YS_121316_21_2
crea
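
The loop above keeps the first portion of each file. If the alignments are not in random order (for example, in a coordinate-sorted file), a random draw of records may reduce depth more evenly across loci. A minimal sketch, assuming the whole file fits in memory; `subsample_sam_lines` is a hypothetical helper and the toy records are not valid full SAM lines. Written for Python 3.

```python
import random

def subsample_sam_lines(lines, proportion, seed=None):
    """Keep all header lines and a random fraction of alignment records."""
    header = [l for l in lines if l.startswith("@")]
    records = [l for l in lines if not l.startswith("@")]
    n_keep = int(proportion * len(records))
    rng = random.Random(seed)
    keep = sorted(rng.sample(range(len(records)), n_keep))  # keep file order
    return header + [records[i] for i in keep]

# toy input: 2 header lines and 10 fake alignment records
toy = ["@HD\tVN:1.0\n", "@SQ\tSN:chr1\tLN:100\n"]
toy += ["read%d\t0\tchr1\t%d\n" % (i, i + 1) for i in range(10)]
subset = subsample_sam_lines(toy, 0.5, seed=1)
```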

### Run pstacks through populations

In [48]:
pwd

u'/mnt/hgfs/Pacific cod/DataAnalysis/PCod-Compare-repo/scripts'

In [49]:
!python pstacks_populations_genShell_KORdepthSubset_9-21.py PopMap_KOR_subset.txt

In [50]:
!head pstacks_populations_KORsubsetDepth_9-20.sh

#!/bin/bash

#pstacks
pstacks -t sam -f ../stacks_b2_wgenome/PO010715_26_subset.sam -o ../stacks_b2_wgenome -i 1000 -m 3 -p 6 --model_type bounded 2>> ../stacks_b2_wgenome/pstacks_out_b2_wgenome
pstacks -t sam -f ../stacks_b2_wgenome/PO010715_27.1_subset.sam -o ../stacks_b2_wgenome -i 1001 -m 3 -p 6 --model_type bounded 2>> ../stacks_b2_wgenome/pstacks_out_b2_wgenome
pstacks -t sam -f ../stacks_b2_wgenome/PO020515_10.1_subset.sam -o ../stacks_b2_wgenome -i 1002 -m 3 -p 6 --model_type bounded 2>> ../stacks_b2_wgenome/pstacks_out_b2_wgenome
pstacks -t sam -f ../stacks_b2_wgenome/PO031715_04_subset.sam -o ../stacks_b2_wgenome -i 1003 -m 3 -p 6 --model_type bounded 2>> ../stacks_b2_wgenome/pstacks_out_b2_wgenome
pstacks -t sam -f ../stacks_b2_wgenome/PO031715_23_subset.sam -o ../stacks_b2_wgenome -i 1004 -m 3 -p 6 --model_type bounded 2>> ../stacks_b2_wgenome/pstacks_out_b2_wgenome
pstacks -t sam -f ../stacks_b2_wgenome/PO010715_07_rep_subset.sam -o ../stacks_b2_wgenome -i 1005 -m 

In [None]:
# in terminal:
./pstacks_populations_KORdepthSubset_9-20.sh
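
The generator script itself is not shown here, but the command lines it produces follow a regular pattern. Below is a minimal sketch of how such lines could be built, with flag values copied from the script output above. `pstacks_command` and its arguments are hypothetical, and the real script may well differ.

```python
def pstacks_command(sample, sql_id, outdir="../stacks_b2_wgenome",
                    min_depth=3, threads=6):
    """Build one pstacks command line in the same form as the script above."""
    return ("pstacks -t sam -f {out}/{s}_subset.sam -o {out} -i {i} "
            "-m {m} -p {p} --model_type bounded "
            "2>> {out}/pstacks_out_b2_wgenome").format(
                out=outdir, s=sample, i=sql_id, m=min_depth, p=threads)

samples = ["PO010715_26", "PO010715_27.1"]
# IDs start at 1000, as in the script output above
commands = [pstacks_command(s, 1000 + n) for n, s in enumerate(samples)]
```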

<br>
### Get a list of the locus_snp pairs selected in batch 1

#### 9/21/2017

In [19]:
pwd

u'/mnt/hgfs/Pacific cod/DataAnalysis/PCod-Compare-repo'

In [30]:
cd scripts

/mnt/hgfs/Pacific cod/DataAnalysis/PCod-Compare-repo/scripts


In [31]:
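infile = open("../stacks_b1_wgenome/batch_1.genepop", "r")
infile.readline() # skip over header
# strip the trailing newline so that the last locus_snp name can be matched
loci_list = infile.readline().strip().split(",")
infile.close()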

In [32]:
print len(loci_list)

22054


### Extract only those locus_snp pairs from batch 2 genepop

I used the `--write_random_snp` option for populations in batch 1, so only one SNP per locus was written there. To compare batches, I need to extract those same locus / SNP pairs from batch 2 (where populations genotyped all SNPs at a locus).


In [33]:
# open files
infile = open("../stacks_b2_wgenome/batch_2.genepop", "r")
outfile = open("../stacks_b2_wgenome/batch_2_matched.genepop", "w")

# write genepop header and locus names to new file
header = infile.readline()
outfile.write(header)
outfile.write(",".join(loci_list) + "\n")

# find indices of locus_snp pairs from batch 2 genepop that match batch 1 locus_snp pairs
snps = infile.readline()
snp_list_b2 = snps.strip().split(",")


In [34]:
print len(snp_list_b2)

22531


In [35]:
# open files
infile = open("../stacks_b2_wgenome/batch_2.genepop", "r")
outfile = open("../stacks_b2_wgenome/batch_2_matched.genepop", "w")

# write genepop header and locus names to new file
header = infile.readline()
outfile.write(header)
outfile.write(",".join(loci_list) + "\n")

# find indices of locus_snp pairs from batch 2 genepop that match batch 1 locus_snp pairs
snps = infile.readline()
snp_list_b2 = snps.strip().split(",")

indices = []
index = 0

snps_kept = []
snps_removed = []

for snp in snp_list_b2:
    if snp in loci_list:
        indices.append(index)
        snps_kept.append(snp)
    else:
        snps_removed.append(snp)
    index += 1

print indices[-1]

22527


In [36]:
print len(indices)

10481


In [37]:
print len(snps_kept)

10481


In [38]:
print len(snps_removed)

12050


In [39]:
print "snps in the batch 1 genepop that were not identified in the batch 2 genepop:"
print len([i for i in loci_list if i not in snp_list_b2])

snps in the batch 1 genepop that were not identified in the batch 2 genepop:
11573
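
The counts above are internally consistent, which can be checked with set arithmetic. In this small sketch the toy locus_snp names are invented, while the totals are the ones printed above; it assumes each name appears only once per file.

```python
# toy example with invented locus_snp names
b1 = {"10_5", "11_20", "12_3"}
b2 = {"11_20", "12_3", "12_40", "13_8"}
shared = b1 & b2          # pairs present in both batches
b1_only = b1 - b2         # batch 1 pairs missing from batch 2
b2_only = b2 - b1         # batch 2 pairs not selected in batch 1

# the totals printed above obey the same identities
n_b1, n_b2, n_shared = 22054, 22531, 10481
n_b1_only = n_b1 - n_shared   # 11573
n_b2_only = n_b2 - n_shared   # 12050
```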


<br>

*could this have something to do with the settings in populations??*

### RERUN POPULATIONS WITH LESS STRINGENT FLAGS TO RETAIN MAXIMUM NUMBER OF LOCI

In [53]:
pwd

u'/mnt/hgfs/Pacific cod/DataAnalysis/PCod-Compare-repo/scripts'

In [54]:
cd ../

/mnt/hgfs/Pacific cod/DataAnalysis/PCod-Compare-repo


In [55]:
!populations -b 2 -P stacks_b2_wgenome -M scripts/PopMap_KOR_subset_populations.txt -t 36 -r 0 -p 0 -m 10 --genepop --fasta 2>> poopulations_out_b2wgenome_p2

### Get a list of the locus_snp pairs selected in batch 1
#### attempt # 2

In [None]:
cd ../scripts

In [4]:
pwd

u'/mnt/hgfs/Pacific cod/DataAnalysis/PCod-Compare-repo/scripts'

In [5]:
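infile = open("../stacks_b1_wgenome/batch_1.genepop", "r")
infile.readline() # skip over header
# strip the trailing newline so that the last locus_snp name can be matched
loci_list = infile.readline().strip().split(",")
infile.close()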

In [6]:
print len(loci_list)

22054


### Extract only those locus_snp pairs from batch 2 genepop

I used the `--write_random_snp` option for populations in batch 1, so only one SNP per locus was written there. To compare batches, I need to extract those same locus / SNP pairs from batch 2 (where populations genotyped all SNPs at a locus).

**Attempt # 2**

In [7]:
# open files
infile = open("../stacks_b2_wgenome/batch_2.genepop", "r")

# read (and skip) the genepop header line
header = infile.readline()

# find indices of locus_snp pairs from batch 2 genepop that match batch 1 locus_snp pairs
snps = infile.readline()
snp_list_b2 = snps.strip().split(",")

loci_set = set(loci_list) # set membership tests are much faster than list lookups

indices = []
index = 0

snps_kept = []
snps_removed = []

for snp in snp_list_b2:
    if snp in loci_set:
        indices.append(index)
        snps_kept.append(snp)
    else:
        snps_removed.append(snp)
    index += 1

print indices[-1]

26179


In [8]:
infile.close()

In [9]:
print len(indices)

11496


In [10]:
print len(snps_kept)

11496


In [11]:
print len(snps_removed)

14705


In [12]:
print "snps in the batch 1 genepop that were not identified in the batch 2 genepop:"
print len([i for i in loci_list if i not in snp_list_b2])

snps in the batch 1 genepop that were not identified in the batch 2 genepop:
10558


### create new batch 2 genepop with only matching snps

In [13]:
# write genotypes for each individual into the new file IF the index of the genotype is in the indices list
infile = open("../stacks_b2_wgenome/batch_2.genepop", "r")
outfile = open("../stacks_b2_wgenome/batch_2_matched.genepop", "w")

header = infile.readline()
outfile.write(header)
infile.readline() # skip over loci list line
outfile.write(",".join(snps_kept) + "\n")

# record which column indices were written for each individual, as a sanity check
outfile2 = open("index_check.txt", "w")

for line in infile:
    index_check = ""
    if line.startswith("pop"):
        outfile.write(line)
    else:
        linelist = line.strip().split("\t")
        # linelist[0] is the individual name; genotype k is at linelist[k + 1]
        kept = [linelist[k + 1] for k in indices]
        outfile.write(linelist[0] + "\t" + "\t".join(kept) + "\n")
        index_check = "".join(["," + str(k) for k in indices])
    outfile2.write(index_check + "\n")
infile.close()
outfile.close()
outfile2.close()
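
The column filtering can also be wrapped in a small function that is easy to test on toy input. This is a sketch under the assumption that each individual line is `name,` followed by tab-separated genotypes, as in the files above; `filter_genepop_line` is a hypothetical name, not part of the original notebook.

```python
def filter_genepop_line(line, keep_indices):
    """Keep the sample name plus the genotypes at the given 0-based indices."""
    fields = line.strip().split("\t")
    name, genotypes = fields[0], fields[1:]
    kept = [genotypes[k] for k in keep_indices]
    return name + "\t" + "\t".join(kept) + "\n"

# toy individual with four genotypes; keep the 1st and 3rd
line = "ind1,\t0101\t0102\t0000\t0202\n"
filtered = filter_genepop_line(line, [0, 2])
```

Because the genotypes are selected by index, the output columns always follow the order of `keep_indices`.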

### create new batch 1 genepop with only matching snps

In [14]:
# open files
infile = open("../stacks_b1_wgenome/batch_1.genepop", "r")

#skip over header
header = infile.readline()

# find indices of loci in the list "snps_kept"
index_1 = 0
indices_1 = []
loci_list = infile.readline().strip().split(",") #infile: read up to loci
for locus in loci_list:
    if locus in snps_kept:
        indices_1.append(index_1)
    index_1 += 1
infile.close()
    
print len(snps_kept)
print len(indices_1)
print indices_1[len(indices_1) -1]

11496
11496
22047


In [15]:
pwd

u'/mnt/hgfs/Pacific cod/DataAnalysis/PCod-Compare-repo/scripts'

In [23]:
# write genotypes for each individual into the new file IF the index of the genotype is in the indices list
infile = open("../stacks_b1_wgenome/batch_1.genepop", "r")
header = infile.readline()
infile.readline() # skip over loci list

## open output file and write in header
outfile = open("../stacks_b1_wgenome/batch_1_matched.genepop", "w")
outfile.write(header)

## write in locus_snp pairs, in the same (batch 1) column order as the genotypes written below
outfile.write(",".join([loci_list[k] for k in indices_1]) + "\n")


# create sample list (genepop sample names end in a comma)
popmap = open("PopMap_KOR_subset.txt", "r")
sample_List = []
for line in popmap:
    sample_List.append(line.strip().split()[0] + ",")
popmap.close()


outfile2 = open("index_check_batch1.txt", "w")

for line in infile:
    index_check = ""
    if line.startswith("pop"):
        outfile.write(line)
    else:
        linelist = line.strip().split("\t")
        if linelist[0] in sample_List:
            # linelist[0] is the individual name; genotype k is at linelist[k + 1]
            kept = [linelist[k + 1] for k in indices_1]
            outfile.write(linelist[0] + "\t" + "\t".join(kept) + "\n")
            index_check = "".join(["," + str(k) for k in indices_1])
        outfile2.write(index_check + "\n")
infile.close()
outfile.close()
outfile2.close()


### Compare genotypes at each locus

In [1]:
pwd

u'/mnt/hgfs/Pacific cod/DataAnalysis/PCod-Compare-repo/notebooks'

In [2]:
cd ../

/mnt/hgfs/Pacific cod/DataAnalysis/PCod-Compare-repo


In [3]:
gen1 = open("stacks_b1_wgenome/batch_1_matched.genepop", "r")
gen2 = open("stacks_b2_wgenome/batch_2_matched.genepop", "r")

#skip over header of genepop
gen1.readline()
gen2.readline()

#check to make sure that the same locus_snp pairs are present, in the same order, in both files
loci1 = gen1.readline().strip().split(",")
loci2 = gen2.readline().strip().split(",")
for i in loci1:
    if i not in loci2: 
        print "oh no! You're missing locus ", i, " in your batch 2 file."
if loci1 != loci2:
    print "oh no! The locus_snp pairs are not in the same order in the two files."

# make a dictionary where the key is the sample and the value is a list of genotypes
import collections

b2_genotype_dict = collections.OrderedDict()
for line in gen2:
    linelist = line.strip().split()
    if len(linelist) > 1:
        # remove the trailing comma, then the exact "_subset" suffix
        # (str.strip("_subset,") would strip any of those characters from both ends)
        sample_name = linelist[0].rstrip(",")
        if sample_name.endswith("_subset"):
            sample_name = sample_name[:-len("_subset")]
        b2_genotype_dict[sample_name] = linelist[1:]
gen2.close()      
        
b1_genotype_dict = collections.OrderedDict()
for line in gen1:
    linelist = line.strip().split()
    if len(linelist) > 1:
        sample_name = linelist[0].rstrip(",")
        if sample_name in b2_genotype_dict:
            b1_genotype_dict[sample_name] = linelist[1:]
gen1.close()
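
When trimming the `_subset` suffix from sample names, note that `str.strip` treats its argument as a set of characters to remove from both ends, not as a literal suffix. The sketch below uses an invented sample name to show the difference; `remove_suffix` is a hypothetical helper.

```python
def remove_suffix(name, suffix="_subset"):
    """Remove an exact trailing suffix, leaving the rest of the name intact."""
    if name.endswith(suffix):
        return name[:-len(suffix)]
    return name

# invented sample name that ends in characters from the strip set
name = "KOR_best_subset"
stripped = name.strip("_subset,")   # strips characters, not the suffix
cleaned = remove_suffix(name)
```

Here `stripped` loses part of the real name, while `cleaned` only loses the suffix. The sample names in this data set end in digits or `rep`, so they happen not to be affected.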

In [4]:
print b1_genotype_dict.keys()

['PO010715_07_rep', 'PO010715_26', 'PO010715_27.1', 'PO020515_10.1', 'PO031715_04', 'PO031715_23', 'GE011215_08.1', 'GE011215_11', 'GE011215_30.1', 'GE012315_05.1', 'GE012315_10.1', 'GE012315_11_2', 'GEO012315_02', 'GEO012315_12', 'GEO012315_18', 'GEO012315_21', 'NA021015_02.1', 'NA021015_10.1', 'NA021015_25', 'YS_121316_18', 'YS_121316_21_2', 'YS_121316_28', 'YS_121316_29', 'JUK07_12', 'JUK07_13', 'JUK07_14', 'JUK07_15', 'JUK07_29.1', 'JUK07_32', 'JB121807_01', 'JB121807_05', 'JB121807_05_2', 'JB121807_12.1', 'JB121807_19.1', 'JB121807_23_2', 'JB121807_25', 'JB121807_33.1', 'JB121807_41.1', 'JB021108_25.1', 'JB021108_45', 'JB021108_46_rep.1', 'JB021108_48.1', 'BOR07_07.1', 'BOR07_10.1', 'BOR07_12.1', 'GEO020414_11', 'GEO020414_16', 'GEO020414_2', 'GEO020414_27', 'GEO020414_4']


In [5]:
print b2_genotype_dict.keys()

['PO010715_07_rep', 'PO010715_26', 'PO010715_27.1', 'PO020515_10.1', 'PO031715_04', 'PO031715_23', 'GE011215_08.1', 'GE011215_11', 'GE011215_30.1', 'GE012315_05.1', 'GE012315_10.1', 'GE012315_11_2', 'GEO012315_02', 'GEO012315_12', 'GEO012315_18', 'GEO012315_21', 'NA021015_02.1', 'NA021015_10.1', 'NA021015_25', 'YS_121316_18', 'YS_121316_21_2', 'YS_121316_28', 'YS_121316_29', 'JUK07_12', 'JUK07_13', 'JUK07_14', 'JUK07_15', 'JUK07_29.1', 'JUK07_32', 'JB121807_01', 'JB121807_05_2', 'JB121807_05', 'JB121807_12.1', 'JB121807_19.1', 'JB121807_23_2', 'JB121807_25', 'JB121807_33.1', 'JB121807_41.1', 'JB021108_25.1', 'JB021108_45', 'JB021108_46_rep.1', 'JB021108_48.1', 'BOR07_07.1', 'BOR07_10.1', 'BOR07_12.1', 'GEO020414_11', 'GEO020414_16', 'GEO020414_27', 'GEO020414_2', 'GEO020414_4']


In [6]:
print len(b2_genotype_dict['PO010715_07_rep'])

11496


In [7]:
print len(b1_genotype_dict['PO010715_07_rep'])

11496


In [8]:
outfile = open("stacks_pipeline_analyses/analyses_AK_LowGenotypeRate/compare_KOR_genos_Depths3v10.txt", "w")

# write to an output file:
## --- (0) sample name
## --- (1) number of genotyped loci in batch 1
## --- (2) number of genotyped loci in batch 2

# calculate and write to output file:
## --- (3) number of same genotypes (includes missing genotypes)
## --- (4) number of same genotypes (does not include missing genotypes)

# calculate and write to output file (same order as the header columns below):
## --- (5) number of different genotypes
## --- (6) number of different genotypes het in batch 1 --> hom in batch 2
## --- (7) number of different genotypes hom in batch 1 --> het in batch 2
## --- (8) number of different genotypes het in batch 1 --> diff het in batch 2
## --- (9) number of different genotypes hom in batch 1 --> diff hom in batch 2
## --- (10) number of different genotypes b/c batch 1 is missing, batch 2 is het
## --- (11) number of different genotypes b/c batch 1 is missing, batch 2 is hom
## --- (12) number of different genotypes b/c batch 2 is missing, batch 1 is het
## --- (13) number of different genotypes b/c batch 2 is missing, batch 1 is hom

outfile.write("sample\tgenotyped.b1\tgenotyped.b2\tsame.genos.wmissing\tsame.genos\tdiff.genos")
outfile.write("\tdiff.het_hom\tdiff.hom_het\tdiff.het_het\tdiff.hom_hom\tdiff.miss_het\tdiff.miss_hom\tdiff.het_miss\tdiff.hom_miss\n")


for sample in b2_genotype_dict.keys():
    outfile.write(sample + "\t")
    b2_genotypes = b2_genotype_dict[sample]
    b1_genotypes = b1_genotype_dict[sample]
    if len(b2_genotypes) != len(b1_genotypes):
        print "crap! you don't have the same number of genotypes in your batches."
        break
    # write to an output file: sample name, number genotyped per batch
    b1_genotyped = len([i for i in b1_genotypes if i != "0000"])
    b2_genotyped = len([i for i in b2_genotypes if i != "0000"])
    outfile.write(str(b1_genotyped) + "\t" + str(b2_genotyped) + "\t")
    # calculate and write to output file: matches and mismatches
    matched_wmissing = 0
    matched = 0
    different = 0
    diff_het_hom = 0
    diff_hom_het = 0
    diff_het_het = 0
    diff_hom_hom = 0
    diff_miss_hom = 0
    diff_miss_het = 0
    diff_het_miss = 0
    diff_hom_miss = 0
    for i in range(0, len(b1_genotypes)):
        # number of matching genotypes, with and without missing genotypes
        if b2_genotypes[i] == b1_genotypes[i]:
            if b2_genotypes[i] != "0000":
                matched += 1
                matched_wmissing += 1
            elif b2_genotypes[i] == "0000":
                matched_wmissing += 1
        # differing pairs, and the type of difference between them
        elif b2_genotypes[i] != b1_genotypes[i]:
            different += 1
            # if b1 is missing genotype
            if b1_genotypes[i] == "0000":
                if b2_genotypes[i][0:2] != b2_genotypes[i][2:]:
                    diff_miss_het += 1
                elif b2_genotypes[i][0:2] == b2_genotypes[i][2:]:
                    diff_miss_hom += 1
                else: 
                    print "your code is effed up! b1 missing"
            # if b2 is missing genotype
            elif b2_genotypes[i] == "0000":
                if b1_genotypes[i][0:2] != b1_genotypes[i][2:]:
                    diff_het_miss += 1
                elif b1_genotypes[i][0:2] == b1_genotypes[i][2:]:
                    diff_hom_miss += 1
                else: 
                    print "your code is effed up! b2 missing"
            # het --> het / hom
            elif b1_genotypes[i][0:2] != b1_genotypes[i][2:]:
                if b2_genotypes[i][0:2] == b2_genotypes[i][2:]:
                    diff_het_hom += 1
                elif b2_genotypes[i][0:2] != b2_genotypes[i][2:]:
                    diff_het_het += 1
            # hom --> het / hom
            elif b1_genotypes[i][0:2] == b1_genotypes[i][2:]:
                if b2_genotypes[i][0:2] != b2_genotypes[i][2:]:
                    diff_hom_het += 1
                elif b2_genotypes[i][0:2] == b2_genotypes[i][2:]:
                    diff_hom_hom += 1
            
    outfile.write(str(matched_wmissing) + "\t" + str(matched) + "\t" + str(different) + "\t")
    outfile.write(str(diff_het_hom) + "\t" + str(diff_hom_het) + "\t" + str(diff_het_het) + "\t" + str(diff_hom_hom) + "\t")
    outfile.write(str(diff_miss_het) + "\t" + str(diff_miss_hom) + "\t" + str(diff_het_miss) + "\t" + str(diff_hom_miss) + "\n")

outfile.close()
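
The mismatch categories tallied above can be summarized more compactly by labelling each genotype as missing, homozygous or heterozygous and then pairing the two labels. This is a minimal sketch rather than the code that produced the output file: `call_type` and `classify_pair` are hypothetical helpers, the genotypes are toy values, and it assumes 4-character genepop genotypes with `0000` as missing. Written for Python 3.

```python
from collections import Counter

MISSING = "0000"

def call_type(genotype):
    """Label a 4-character genepop genotype as miss, hom or het."""
    if genotype == MISSING:
        return "miss"
    return "hom" if genotype[0:2] == genotype[2:] else "het"

def classify_pair(b1, b2):
    """Return 'same' or a label such as 'het_hom' (batch 1 type, batch 2 type)."""
    if b1 == b2:
        return "same"
    return call_type(b1) + "_" + call_type(b2)

# toy genotypes for one individual in each batch
b1 = ["0101", "0102", "0000", "0202", "0102"]
b2 = ["0101", "0101", "0102", "0000", "0103"]
counts = Counter(classify_pair(a, b) for a, b in zip(b1, b2))
```

Note that `same` here also counts positions where both batches are missing, which corresponds to the `same.genos.wmissing` column.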

I checked the output manually in Excel, and it matches up!

![img-overall](https://github.com/mfisher5/PCod-Compare-repo/blob/master/stacks_pipeline_analyses/analyses_AK_LowGenotypeRate/GenoComparison_plot1.png?raw=true)
![img-differences](https://github.com/mfisher5/PCod-Compare-repo/blob/master/stacks_pipeline_analyses/analyses_AK_LowGenotypeRate/GenoComparison_Mismatches.png?raw=true)