## Resistant lines

This notebook extracts the IDs of the DGRP2 lines (our "individuals") that carry a resistant allele at any of the loci of interest, and writes one list of line IDs per locus.

In [1]:
from pathlib import Path

import pandas as pd

### Ace and Cyp6g1

In [2]:
for pref in Path('output/resistant-lines/').glob('*.012'):
    # Row labels (line IDs) and column labels (chrom-pos) live in sidecar files.
    with open(str(pref) + '.indv', 'r') as f:
        lines = [line.strip() for line in f]
    with open(str(pref) + '.pos', 'r') as f:
        pos = [line.strip().replace('\t', '-') for line in f]
    # The first column of the .012 file is a running index, so use it as the index.
    data = pd.read_table(pref, index_col=0, header=None)
    data.index = lines
    data.columns = pos
    # A line counts as resistant if any site in the file is coded 2.
    resistant = data.loc[data.eq(2).any(axis='columns')]
    identifier = pref.name.split('.')[0]
    with open(pref.parent / (identifier.split('-')[0] + '-resistant-lines.txt'), 'w') as f:
        f.write('\n'.join(resistant.index) + '\n')
    print(f"{identifier}: {len(resistant)}/{len(data)} ({len(resistant)/len(data):%}) resistant")

cyp-genotype: 155/205 (75.609756%) resistant
ace-genotype: 78/205 (38.048780%) resistant
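
The selection step in the loop above can be tried on a toy genotype matrix. This is a minimal sketch, assuming the `.012` files follow the vcftools `--012` convention (each entry counts non-reference alleles as 0, 1, or 2, with -1 for a missing call) and that the non-reference allele is the resistant one. The line names and positions are made up for illustration.

```python
import pandas as pd

# Hypothetical toy matrix: rows are lines, columns are "chrom-pos" sites.
toy = pd.DataFrame(
    [[0, 0], [2, 0], [0, 2], [-1, 0]],
    index=['line_A', 'line_B', 'line_C', 'line_D'],
    columns=['3R-100', '3R-200'],
)

# Keep lines that are coded 2 (homozygous non-reference) at any site.
toy_resistant = toy.loc[toy.eq(2).any(axis='columns')]

# Lines with a missing call (-1) are not resistant by this rule,
# but they still count toward the denominator len(toy).
n_missing = int(toy.eq(-1).any(axis='columns').sum())

print(toy_resistant.index.tolist())
print(f"{len(toy_resistant)}/{len(toy)} resistant, {n_missing} with a missing call")
```

Under that assumption, a line with a missing call is treated the same as a non-resistant line, which may be worth keeping in mind when reading the percentages above.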


For CHKoV1, we use the data from Frank Jiggins. Because those calls were made for the DGRP1 and we are working with the DGRP2, we first need to...

### Compare DGRP1 and DGRP2 line names

In [3]:
data = pd.read_table(snakemake.input['jiggins'], sep=' ', skiprows=3)

In [4]:
# Rename the 'X...' IDs in the Jiggins table to the 'line_...' form used in DGRP2.
lines1 = sorted(data.id.str.replace('X', 'line_', regex=False))

In [5]:
len(lines1)

189

In [6]:
with open(Path('output/resistant-lines/ace-genotype.012.indv'), 'r') as f:
    lines2 = sorted(list(line.strip() for line in f))

In [7]:
len(lines2)

205

In [8]:
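with open(Path('output/resistant-lines/dgrp1-dgrp2-lines-comparison.txt'), 'w') as f:
    i = 0
    j = 0
    f.write('DGRP1\t\tDGRP2\n')
    # Walk both sorted lists in step, writing shared names side by side.
    while (i < len(lines1)) and (j < len(lines2)):
        if lines1[i] == lines2[j]:
            f.write(f'{lines1[i]:8s}\t{lines2[j]:8s}\n')
            i += 1
            j += 1
        elif lines1[i] < lines2[j]:
            f.write(lines1[i] + '\n')
            i += 1
        else:
            f.write('\t\t' + lines2[j] + '\n')
            j += 1
    # Write whatever remains in either list once the other is exhausted.
    for name in lines1[i:]:
        f.write(name + '\n')
    for name in lines2[j:]:
        f.write('\t\t' + name + '\n')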
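
The same side-by-side comparison can also be expressed as an outer join. The sketch below is an alternative to the two-pointer loop, not the method used in this notebook: it relies on `DataFrame.merge` with `indicator=True`, which adds a `_merge` column saying whether each name occurs in the left list only, the right list only, or both. The helper `compare_names` and the example names are hypothetical.

```python
import pandas as pd

def compare_names(names1, names2):
    """Outer-join two lists of names; `_merge` records where each name occurs."""
    left = pd.DataFrame({'name': sorted(set(names1))})
    right = pd.DataFrame({'name': sorted(set(names2))})
    merged = left.merge(right, on='name', how='outer', indicator=True)
    return merged.sort_values('name').reset_index(drop=True)

# Hypothetical line names, not taken from the real panels.
table = compare_names(
    ['line_1', 'line_2', 'line_4'],
    ['line_2', 'line_3', 'line_4', 'line_5'],
)

# Count how many names fall in each category.
counts = table['_merge'].astype(str).value_counts().to_dict()
print(table)
print(counts)
```

An outer join keeps every name from both inputs, so names at the end of either list cannot be dropped by accident.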

### CHKoV1

In [9]:
resistant = data.loc[data.doc.eq(1)].id

In [10]:
resistant = sorted(resistant.str.replace('X', 'line_').tolist())
chkov_in_dgrp2 = set(resistant) & set(lines2)

In [11]:
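with open('output/resistant-lines/chkov-resistant-lines.txt', 'w') as f:
    # Sort so the output file has a stable order (set iteration order is arbitrary).
    f.write('\n'.join(sorted(chkov_in_dgrp2)) + '\n')
# The denominator is every DGRP2 line, including those absent from the DGRP1 table.
print(f"CHKoV1: {len(chkov_in_dgrp2)}/{len(lines2)} ({len(chkov_in_dgrp2)/len(lines2):%}) resistant")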

CHKoV1: 139/205 (67.804878%) resistant
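
Since the DGRP1 table has 189 lines and the DGRP2 has 205, some DGRP2 lines have no CHKoV1 call at all and are implicitly counted as non-resistant in the fraction above. The sketch below shows one way to separate the two denominators. It uses made-up inputs (`dgrp1_calls`, `dgrp2_lines`) rather than the notebook's data, and assumes a call of 1 marks a resistant line, as in the `doc` column filter above.

```python
# Hypothetical stand-ins for the notebook's variables.
dgrp1_calls = {'line_1': 1, 'line_2': 0, 'line_3': 1}  # line -> call in the DGRP1 table
dgrp2_lines = ['line_1', 'line_2', 'line_4']

dgrp2 = set(dgrp2_lines)
# Resistant in DGRP1 and still present in DGRP2.
resistant_shared = {name for name, call in dgrp1_calls.items() if call == 1} & dgrp2
# Lines that have a call and are present in DGRP2.
genotyped = set(dgrp1_calls) & dgrp2
# DGRP2 lines with no call in the DGRP1 table.
not_genotyped = dgrp2 - set(dgrp1_calls)

print(f"{len(resistant_shared)}/{len(dgrp2)} of all DGRP2 lines")
print(f"{len(resistant_shared)}/{len(genotyped)} of lines present in both panels")
print(f"{len(not_genotyped)} DGRP2 lines without a call")
```

Which denominator is appropriate depends on how the resistant-line lists are used downstream.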
