## Table of Contents
- [Import libraries](#1)
- [Processing tables](#2)
- [Download tables](#3)


<a name='1'></a>
## Import libraries

This cell sets up the environment for data analysis and visualization. It imports pandas for tabular data structures, numpy for numerical operations, tqdm for progress bars, glob and os for file discovery and path handling, tarfile for reading archives, re for regular expressions, matplotlib and seaborn for plotting, and matplotlib_venn for Venn diagrams.

The cell also appends a custom directory to `sys.path` so that the project's own modules can be imported. The `config` and `functions` modules are pulled in with wildcard imports (`from config import *` and `from functions import *`), so names used later without a visible definition, such as `FILTERING_PATH_BRCA`, `RESTRICTS_BRCA`, and `table_processing`, presumably come from these modules.
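As a minimal sketch of the same mechanism, the snippet below makes a directory importable through `sys.path` and then imports one name explicitly, which keeps the origin of each name visible. The module `demo_config` and its value are hypothetical stand-ins created in a temporary directory; they are not the project's real `config` module.

```python
import importlib
import os
import sys
import tempfile

# Stand-in for the custom "func" directory (assumption: the real one lives on the cluster).
func_dir = tempfile.mkdtemp()
with open(os.path.join(func_dir, "demo_config.py"), "w") as handle:
    handle.write("FILTERING_ID = 'demo_filter'\n")  # hypothetical value

# Make the directory importable, in the same way as sys.path.append in the notebook cell.
if func_dir not in sys.path:
    sys.path.append(func_dir)
importlib.invalidate_caches()

# Explicit import instead of "from demo_config import *".
from demo_config import FILTERING_ID

print(FILTERING_ID)
```

Importing names explicitly is optional, but it makes it easier to see which constants a cell depends on.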

In [1]:
# %load /cluster/home/myurchikova/github/projects2020_ohsu/eth/learning_Master_thesis/TASKS/func/base_imports.py
import pandas as pd
import numpy as np
import tqdm 
import glob
import os
import matplotlib
import matplotlib.pyplot as plt
import seaborn as sns
import tarfile
import re
from matplotlib_venn import venn2, venn2_circles, venn2_unweighted
from matplotlib_venn import venn3, venn3_circles
import sys
sys.path.append(r"/cluster/home/myurchikova/github/projects2020_ohsu/eth/learning_Master_thesis/TASKS/func")
from config import *
from functions import *

<a name='2'></a>
## Processing tables

This cell pairs the ETH and OHSU result tables and collects per-sample comparison statistics:

- Builds the filtering directory path from `FILTERING_PATH_BRCA` and `FILTERING_ID`.
- Initializes an empty DataFrame, `out_df_filtered`, to hold the combined results.
- Uses the `OHSU_BRCA_NEW` flag to switch between the old and new file-naming schemes and to select the matching OHSU tar archive.
- Pairs each ETH file with the OHSU archive member whose name contains the same sample and filter stem.
- For each pair, optionally restricted to the samples in `RESTRICTS_BRCA`, reads both tables and derives junction coordinates.
- Records the sample name, the foreground and background filter settings, and the sizes of the k-mer and junction-coordinate sets, their intersection, and their differences.
- Assigns a priority to each entry from the last background filter value (`Any` gives 0, `10` gives 1, `2` gives 2, anything else `None`).
- Prints intermediate values for verification.
- Concatenates the per-restriction tables into `out_df_filtered` for later analysis and visualization.
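The pairing, filter-decoding, and set-comparison steps can be sketched on toy inputs as follows. This is an illustrative sketch, not the notebook's own code: the function names `pair_files`, `decode_filter`, and `compare_sets`, the two lookup dictionaries, and the result keys are hypothetical, and the character mapping (`A`, `X`, `N`) and priority values are copied from the logic in the cell below.

```python
import os

# Assumed single-character codes and priorities, mirroring the notebook cell.
CODE_TO_LABEL = {"A": "Any", "X": "10", "N": "None"}
PRIORITY = {"Any": 0, "10": 1, "2": 2}


def pair_files(eth_paths, ohsu_names):
    """Map each ETH path to the OHSU name that contains the same sample/filter stem."""
    pairs = {}
    for eth in eth_paths:
        stem = os.path.basename(eth).replace("G_", "").replace(".gz", "")
        for ohsu in ohsu_names:
            if stem in ohsu:
                pairs[eth] = ohsu
    return pairs


def decode_filter(code):
    """Decode the first five characters into foreground (3) and background (2) values."""
    labels = [CODE_TO_LABEL.get(ch, ch) for ch in code[:5]]
    foreground, background = tuple(labels[:3]), tuple(labels[3:5])
    # Priority depends only on the last background value; unknown values give None.
    return foreground, background, PRIORITY.get(background[1])


def compare_sets(eth_items, ohsu_items):
    """Return the sizes of both sets, their intersection, and both differences."""
    eth_items, ohsu_items = set(eth_items), set(ohsu_items)
    return {
        "size_eth": len(eth_items),
        "size_ohsu": len(ohsu_items),
        "size_intersection": len(eth_items & ohsu_items),
        "size_eth_only": len(eth_items - ohsu_items),
        "size_ohsu_only": len(ohsu_items - eth_items),
    }


# Toy usage with names shaped like those in the cell output.
pairs = pair_files(
    ["/data/G_TCGA-A2-A0D2-01A-21R-A034-07_02501GA.tsv.gz"],
    ["J_TCGA-24-1431-01A-01R-1566-13_02101GA.tsv",
     "J_TCGA-A2-A0D2-01A-21R-A034-07_02501GA.tsv"],
)
decoded = decode_filter("0AN3XGA")
stats = compare_sets({"GALVYAAKP", "QFIENRVLF"},
                     {"QFIENRVLF", "AGALVYAAK", "YAAKPNEEI"})
print(pairs, decoded, stats, sep="\n")
```

Working on plain sets keeps the comparison independent of row order and of duplicate rows in either table.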

In [4]:
# Loading tables according to given paths

filter_dir = os.path.join(FILTERING_PATH_BRCA, 'filtering_samples', FILTERING_ID)

out_df_filtered=pd.DataFrame()

out_var=''
title_venn = '{sample}'

# ETH Names
# OLD code
if not OHSU_BRCA_NEW: 
    eth_all = glob.glob(os.path.join(filter_dir, 'G*'))
    salt=''
    file=TAR_OHSU_BRCA
    
#New code
if OHSU_BRCA_NEW:
    eth_all = glob.glob(os.path.join(filter_dir, 'G*'))
    eth_all = [i for i in eth_all if len(os.path.basename(i)) == 45]  # keep only file names of length 45 (the shortened naming scheme)
    salt='OHSU_BRSA_NEW'
    file=TAR_OSHU_BRCA_NEW
    
# OHSU Names
with tarfile.open(file, "r:*") as tar:
    ohsu_all = tar.getnames()

print(ohsu_all[0],TT)
# Get file pairs
file_pair = {}
for idx_eth, eth in enumerate(eth_all):
    pattern = os.path.basename(eth).replace('G_', '').replace('.gz', '') 
    for idx_ohsu, ohsu in enumerate(ohsu_all):
        if pattern in ohsu:
            file_pair[eth] = ohsu
        
restricts = RESTRICTS_BRCA
for restrict in restricts:
    df = {'sample' : [], 
      'filter_foreground' : [], 
      'filter_background' : [], 
      'filter': [],
      'size_ohsu' : [], 
      'size_eth' : [], 
      'size_intersection' : [], 
      'size_ohsu\eth' : [], 
      'size_eth\ohsu' : [],
      'eth_kmers\inter':[],
      'ohsu_kmers\inter':[],
      'inter_kmers':[],
          
      'coord_OHSU':[],
      'coord_ETH':[],
      'size_ohsu_coor' : [], 
      'size_eth_coor' : [], 
      'size_intersection_coor' : [], 
      'size_ohsu\eth_coor' : [], 
      'size_eth\ohsu_coor' : [],
      'eth_coor\inter_coor':[],
      'ohsu_coor\inter_coor':[],
      'inter_coor':[],
      'eth_coor\ohsu_coor':[],
      'ohsu_coor\eth_coor':[],
      'priority':[],
         }
    with tarfile.open(file, "r:*") as tar: #OHSU
        for eth, ohsu in file_pair.items(): # ETH
            print('eth',eth,TT,'ohsu',ohsu)
            if (not restrict) or restrict == re.findall(r'G_([\s\S]+?)_',eth)[0].replace('-',''): #Restrict to category of interest
                    print(restrict)
                    df_ohsu = pd.read_csv(ohsu_path:=tar.extractfile(ohsu), sep="\t")
                    df_ohsu.reset_index(inplace=True)
                    if not df_ohsu.empty: df_ohsu = table_processing.ohsu_to_eth_coord(df_ohsu)
                    if not df_ohsu.empty: df_ohsu['junction_coordinate'] = df_ohsu['jx_shifted'].apply(lambda x: ':'.join(x.split(';')[1:3]))
                    df_eth = pd.read_csv(eth, sep="\t")
                    if not df_eth.empty: df_eth=table_processing.get_junction_coordinates(df_eth,'coord')
                    df1=df_eth
                    df2=df_ohsu
                    print(df_eth.columns,df_eth,sep='\n')
                    print(df_ohsu.columns,df_ohsu,sep='\n')
                    df_eth = set(df_eth['kmer'])
                    df_ohsu = set(df_ohsu['index'])
                    df_eth_coor = set(df1['junction_coordinate']) if not df1.empty else set([])
                    df_ohsu_coor = set(df2['junction_coordinate']) if not df2.empty else set([])
                    name = os.path.basename(ohsu).replace('.tsv', '').split('_')
                    print(name)
                    df['coord_ETH'].append(df1['junction_coordinate'] if not df1.empty else 'None' )
                    df['coord_OHSU'].append(df2['junction_coordinate'] if not df2.empty else 'None')
                    df['sample'].append(ss:=name[1].replace('-',''))
                
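                    if not OHSU_BRCA_NEW:
                        a = None  # filter codes are only decoded for the new naming scheme
                        df['filter_foreground'].append(name[2])
                        df['filter_background'].append(name[3])
                        # NOTE: name[3][1] is a single character, so it can never equal 'Any', 10, or 2;
                        # as written, every entry in this branch gets priority None.
                        if name[3][1] == 'Any':
                            priority=0
                        elif name[3][1] == 10:
                            priority=1
                        elif name[3][1] == 2:
                            priority=2
                        else:
                            priority = None
                        df['priority'].append(priority)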
                        df['filter'].append(name[2]+' '+name[3])
                    else:
                        df['filter'].append(name[2])
                        a = []
                        for i in range(5):
                            if name[2][i] == 'A':
                                a.append('Any')
                            elif name[2][i] == 'X':
                                a.append('10')
                            elif name[2][i] == 'N':
                                a.append('None')
                            else:
                                a.append(name[2][i])
                        df['filter_foreground'].append(f'({a[0]}, {a[1]}, {a[2]})')
                        df['filter_background'].append(f'({a[3]}, {a[4]})')
                        if a[4] == 'Any':
                            priority=0
                        elif a[4] == '10':
                            priority=1
                        elif a[4] == '2':
                            priority=2
                        else:
                            priority = None
                        df['priority'].append(priority)
                    print(a)
                    print(priority)
                    df['size_ohsu'].append(len(df_ohsu))
                    df['size_eth'].append(len(df_eth))
                    df['size_ohsu_coor'].append(len(df_ohsu_coor))
                    df['size_eth_coor'].append(len(df_eth_coor))
                    df['size_ohsu\eth'].append(len(df_ohsu_filter:=df_ohsu.difference(df_eth)))
                    df['size_eth\ohsu'].append(len(df_eth_filter:=df_eth.difference(df_ohsu)))
                
                    if name[2] in ['0AN3XGA','0AN3AGA'] and name[1].replace('-','') == 'TCGAC8A12P01A11RA11507':
                       out_var+= f'sample: {ss}\n filter_for: ({a[0]}, {a[1]}, {a[2]})\n  filter_back: ({a[3]}, {a[4]})\n ohsu/eth: {len(df_ohsu_filter)}\n path_OSHU: {ohsu_path}\n path_ETH: {eth}\n'
                    
                    df['size_ohsu\eth_coor'].append(len(df_ohsu_filter_coor:=df_ohsu_coor.difference(df_eth_coor)))
                    df['size_eth\ohsu_coor'].append(len(df_eth_filter_coor:=df_eth_coor.difference(df_ohsu_coor)))
                    df['ohsu_coor\eth_coor'].append(np.array(df_ohsu_filter_coor))
                    print(len(df_ohsu_filter_coor))
                    print(len(df_eth_filter_coor))
                    df['eth_coor\ohsu_coor'].append(np.array(df_eth_filter_coor))
                    df['size_intersection'].append(len(df_inter_filter:=set(df_ohsu) & set(df_eth)))
                    df['size_intersection_coor'].append(len(df_inter_filter_coor:=df_ohsu_coor & df_eth_coor))
                    print(len(df_inter_filter_coor))
                    print('NEXT')
                    df['eth_kmers\inter'].append(df_eth_witout_inter:=list(df_eth_filter.difference(df_inter_filter)))
                    df['ohsu_kmers\inter'].append(df_ohsu_witout_inter:=list(df_ohsu_filter.difference(df_inter_filter)))
                    df['eth_coor\inter_coor'].append(df_eth_witout_inter_coor:=list(df_eth_filter_coor.difference(df_inter_filter_coor)))
                    df['ohsu_coor\inter_coor'].append(df_ohsu_witout_inter_coor:=list(df_ohsu_filter_coor.difference(df_inter_filter_coor)))
                    df['inter_kmers'].append(list(df_inter_filter))
                    df['inter_coor'].append(list(df_inter_filter_coor))
    df = pd.DataFrame(df)
         #print(df)
    if not out_df_filtered.empty:
        out_df_filtered = pd.concat([out_df_filtered, df])
    else:
        out_df_filtered = df

J_TCGA-24-1431-01A-01R-1566-13_02101GA.tsv 
---------------------------------------------

eth /cluster/work/grlab/projects/projects2020_OHSU/peptides_generation/CANCER_eth/commit_c4dd02c_conf2_Frame_cap0_runs/TCGA_Breast_1102/filtering_samples/filters_19May_order_5ge_wAnnot_GPstar/G_TCGA-A2-A0D2-01A-21R-A034-07_02501GA.tsv.gz 
---------------------------------------------
 ohsu J_TCGA-A2-A0D2-01A-21R-A034-07_02501GA.tsv
eth /cluster/work/grlab/projects/projects2020_OHSU/peptides_generation/CANCER_eth/commit_c4dd02c_conf2_Frame_cap0_runs/TCGA_Breast_1102/filtering_samples/filters_19May_order_5ge_wAnnot_GPstar/G_TCGA-C8-A12P-01A-11R-A115-07_0A512GA.tsv.gz 
---------------------------------------------
 ohsu J_TCGA-C8-A12P-01A-11R-A115-07_0A512GA.tsv
TCGAC8A12P01A11RA11507


46it [00:00, 6384.02it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                              coord  \
0   GALVYAAKP      54634783:54634802:54632373:54632381:None:None   
1   QFIENRVLF      54631737:54631750:54629607:54629621:None:None   
2   AGALVYAAK      54634783:54634805:54632376:54632381:None:None   
3   YAAKPNEEI      54634783:54634790:54632361:54632381:None:None   
4   KTIHRKQSP      54631737:54631754:54629611:54629621:None:None   
5   NSSKTESCS      54631737:54631749:54629606:54629621:None:None   
6   GGKQFIENR      54631737:54631759:54629616:54629621:None:None   
7   LVYAAKPNE      54634783:54634796:54632367:54632381:None:None   
8   HRKQSPVLS      54631737:54631745:54629602:54629621:None:None   
9   VAGALVYAA      54634783:54634808:54632379:54632381:None:None   
10  GKQFIENRV      54631737:54631756:54629613:54629621:None:None   
11  FIENRVLFC      54631737:54631747:54629604:54629621

504it [00:00, 3899.95it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    TLYPKYELH      37724981:37725002:37795269:37795275:None:None   
1    PKYELHKLK      37724990:37725002:37795269:37795284:None:None   
2    LYPKYELHK      37724984:37725002:37795269:37795278:None:None   
3    ETLYPKYEL      37724978:37725002:37795269:37795272:None:None   
4    ELHKLKKAV      37724999:37725002:37795269:37795293:None:None   
..         ...                                                ...   
499  GMLIQLRSS      63761229:63761250:63760972:63760978:None:None   
500  LRSSRDKTY      63761229:63761235:63760957:63760978:None:None   
501  HEAVHQHTS  186628500:186628523:186628495:186628499:None:None   
502  AVHQHTSRS  186628500:186628517:186628489:186628499:None:None   
503  EAVHQHTSR  186628500:186628520:186628492:186628499:None:None   

     junctionAnnotated  readFrameAnnotate

720it [00:00, 3707.27it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    TLYPKYELH      37724981:37725002:37795269:37795275:None:None   
1    PKYELHKLK      37724990:37725002:37795269:37795284:None:None   
2    LYPKYELHK      37724984:37725002:37795269:37795278:None:None   
3    ETLYPKYEL      37724978:37725002:37795269:37795272:None:None   
4    ELHKLKKAV      37724999:37725002:37795269:37795293:None:None   
..         ...                                                ...   
715  GMLIQLRSS      63761229:63761250:63760972:63760978:None:None   
716  LRSSRDKTY      63761229:63761235:63760957:63760978:None:None   
717  HEAVHQHTS  186628500:186628523:186628495:186628499:None:None   
718  AVHQHTSRS  186628500:186628517:186628489:186628499:None:None   
719  EAVHQHTSR  186628500:186628520:186628492:186628499:None:None   

     junctionAnnotated  readFrameAnnotate

46it [00:00, 6305.99it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                              coord  \
0   GALVYAAKP      54634783:54634802:54632373:54632381:None:None   
1   QFIENRVLF      54631737:54631750:54629607:54629621:None:None   
2   AGALVYAAK      54634783:54634805:54632376:54632381:None:None   
3   YAAKPNEEI      54634783:54634790:54632361:54632381:None:None   
4   KTIHRKQSP      54631737:54631754:54629611:54629621:None:None   
5   NSSKTESCS      54631737:54631749:54629606:54629621:None:None   
6   GGKQFIENR      54631737:54631759:54629616:54629621:None:None   
7   LVYAAKPNE      54634783:54634796:54632367:54632381:None:None   
8   HRKQSPVLS      54631737:54631745:54629602:54629621:None:None   
9   VAGALVYAA      54634783:54634808:54632379:54632381:None:None   
10  GKQFIENRV      54631737:54631756:54629613:54629621:None:None   
11  FIENRVLFC      54631737:54631747:54629604:54629621

28it [00:00, 6194.45it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                          coord  \
0   GALVYAAKP  54634783:54634802:54632373:54632381:None:None   
1   QFIENRVLF  54631737:54631750:54629607:54629621:None:None   
2   AGALVYAAK  54634783:54634805:54632376:54632381:None:None   
3   YAAKPNEEI  54634783:54634790:54632361:54632381:None:None   
4   KTIHRKQSP  54631737:54631754:54629611:54629621:None:None   
5   NSSKTESCS  54631737:54631749:54629606:54629621:None:None   
6   GGKQFIENR  54631737:54631759:54629616:54629621:None:None   
7   LVYAAKPNE  54634783:54634796:54632367:54632381:None:None   
8   HRKQSPVLS  54631737:54631745:54629602:54629621:None:None   
9   VAGALVYAA  54634783:54634808:54632379:54632381:None:None   
10  GKQFIENRV  54631737:54631756:54629613:54629621:None:None   
11  FIENRVLFC  54631737:54631747:54629604:54629621:None:None   
12  IHRKQSPVL  54631737:54631748:54629

118it [00:00, 2734.23it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    GALVYAAKP      54634783:54634802:54632373:54632381:None:None   
1    QFIENRVLF      54631737:54631750:54629607:54629621:None:None   
2    AGALVYAAK      54634783:54634805:54632376:54632381:None:None   
3    YAAKPNEEI      54634783:54634790:54632361:54632381:None:None   
4    KTIHRKQSP      54631737:54631754:54629611:54629621:None:None   
..         ...                                                ...   
113  RDNMWVLPH  215423413:215423425:215423354:215423369:None:None   
114  RRDNMWVLP  215423413:215423428:215423357:215423369:None:None   
115  EGRRDNMWV  215423413:215423434:215423363:215423369:None:None   
116  SEGRRDNMW  215423413:215423437:215423366:215423369:None:None   
117  WNSSNPRHH  107670488:107670508:107516798:107516805:None:None   

     junctionAnnotated  readFrameAnnotate

202it [00:00, 3452.65it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                          coord  \
0    GALVYAAKP  54634783:54634802:54632373:54632381:None:None   
1    QFIENRVLF  54631737:54631750:54629607:54629621:None:None   
2    AGALVYAAK  54634783:54634805:54632376:54632381:None:None   
3    YAAKPNEEI  54634783:54634790:54632361:54632381:None:None   
4    GDLDSVLGA  54633062:54633087:54632379:54632381:None:None   
..         ...                                            ...   
197  RGYGKINKQ  73291597:73291621:73291177:73291180:None:None   
198  GYGKINKQI  73291597:73291618:73291174:73291180:None:None   
199  NKQITSCGP  73291597:73291603:73291159:73291180:None:None   
200  KQITSCGPS  73291597:73291600:73291156:73291180:None:None   
201  YGKINKQIT  73291597:73291615:73291171:73291180:None:None   

     junctionAnnotated  readFrameAnnotated strand junction_coordinate  
0                

117it [00:00, 3556.09it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    GALVYAAKP      54634783:54634802:54632373:54632381:None:None   
1    QFIENRVLF      54631737:54631750:54629607:54629621:None:None   
2    AGALVYAAK      54634783:54634805:54632376:54632381:None:None   
3    YAAKPNEEI      54634783:54634790:54632361:54632381:None:None   
4    KTIHRKQSP      54631737:54631754:54629611:54629621:None:None   
..         ...                                                ...   
112  GRRDNMWVL  215423413:215423431:215423360:215423369:None:None   
113  RDNMWVLPH  215423413:215423425:215423354:215423369:None:None   
114  RRDNMWVLP  215423413:215423428:215423357:215423369:None:None   
115  EGRRDNMWV  215423413:215423434:215423363:215423369:None:None   
116  SEGRRDNMW  215423413:215423437:215423366:215423369:None:None   

     junctionAnnotated  readFrameAnnotate

102it [00:00, 6764.47it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                          coord  \
0    QELNPLNGS  54632240:54632259:54630945:54630953:None:None   
1    GALVYAAKP  54634783:54634802:54632373:54632381:None:None   
2    IENSNCQLG  54631737:54631744:54631532:54631552:None:None   
3    QFIENRVLF  54631737:54631750:54629607:54629621:None:None   
4    AGALVYAAK  54634783:54634805:54632376:54632381:None:None   
..         ...                                            ...   
97   QREQGAFPT  49515751:49515760:49526487:49526505:None:None   
98   TRRQREQGA  49515742:49515760:49526487:49526496:None:None   
99   VITRRQREQ  49515736:49515760:49526487:49526490:None:None   
100  RQREQGAFP  49515748:49515760:49526487:49526502:None:None   
101  EQGAFPTTN  49515757:49515760:49526487:49526511:None:None   

     junctionAnnotated  readFrameAnnotated strand junction_coordinate  
0                

217it [00:00, 4661.60it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    QELNPLNGS      54632240:54632259:54630945:54630953:None:None   
1    GALVYAAKP      54634783:54634802:54632373:54632381:None:None   
2    IENSNCQLG      54631737:54631744:54631532:54631552:None:None   
3    QFIENRVLF      54631737:54631750:54629607:54629621:None:None   
4    AGALVYAAK      54634783:54634805:54632376:54632381:None:None   
..         ...                                                ...   
212  RDNMWVLPH  215423413:215423425:215423354:215423369:None:None   
213  RRDNMWVLP  215423413:215423428:215423357:215423369:None:None   
214  EGRRDNMWV  215423413:215423434:215423363:215423369:None:None   
215  SEGRRDNMW  215423413:215423437:215423366:215423369:None:None   
216  WNSSNPRHH  107670488:107670508:107516798:107516805:None:None   

     junctionAnnotated  readFrameAnnotate

118it [00:00, 6866.46it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    GALVYAAKP      54634783:54634802:54632373:54632381:None:None   
1    QFIENRVLF      54631737:54631750:54629607:54629621:None:None   
2    AGALVYAAK      54634783:54634805:54632376:54632381:None:None   
3    YAAKPNEEI      54634783:54634790:54632361:54632381:None:None   
4    KTIHRKQSP      54631737:54631754:54629611:54629621:None:None   
..         ...                                                ...   
113  RDNMWVLPH  215423413:215423425:215423354:215423369:None:None   
114  RRDNMWVLP  215423413:215423428:215423357:215423369:None:None   
115  EGRRDNMWV  215423413:215423434:215423363:215423369:None:None   
116  SEGRRDNMW  215423413:215423437:215423366:215423369:None:None   
117  WNSSNPRHH  107670488:107670508:107516798:107516805:None:None   

     junctionAnnotated  readFrameAnnotate

118it [00:00, 4004.95it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    GALVYAAKP      54634783:54634802:54632373:54632381:None:None   
1    QFIENRVLF      54631737:54631750:54629607:54629621:None:None   
2    AGALVYAAK      54634783:54634805:54632376:54632381:None:None   
3    YAAKPNEEI      54634783:54634790:54632361:54632381:None:None   
4    KTIHRKQSP      54631737:54631754:54629611:54629621:None:None   
..         ...                                                ...   
113  RDNMWVLPH  215423413:215423425:215423354:215423369:None:None   
114  RRDNMWVLP  215423413:215423428:215423357:215423369:None:None   
115  EGRRDNMWV  215423413:215423434:215423363:215423369:None:None   
116  SEGRRDNMW  215423413:215423437:215423366:215423369:None:None   
117  WNSSNPRHH  107670488:107670508:107516798:107516805:None:None   

     junctionAnnotated  readFrameAnnotate

28it [00:00, 5889.99it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                          coord  \
0   GALVYAAKP  54634783:54634802:54632373:54632381:None:None   
1   QFIENRVLF  54631737:54631750:54629607:54629621:None:None   
2   AGALVYAAK  54634783:54634805:54632376:54632381:None:None   
3   YAAKPNEEI  54634783:54634790:54632361:54632381:None:None   
4   KTIHRKQSP  54631737:54631754:54629611:54629621:None:None   
5   NSSKTESCS  54631737:54631749:54629606:54629621:None:None   
6   GGKQFIENR  54631737:54631759:54629616:54629621:None:None   
7   LVYAAKPNE  54634783:54634796:54632367:54632381:None:None   
8   HRKQSPVLS  54631737:54631745:54629602:54629621:None:None   
9   VAGALVYAA  54634783:54634808:54632379:54632381:None:None   
10  GKQFIENRV  54631737:54631756:54629613:54629621:None:None   
11  FIENRVLFC  54631737:54631747:54629604:54629621:None:None   
12  IHRKQSPVL  54631737:54631748:54629

77it [00:00, 2437.24it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                          coord  \
0   GALVYAAKP  54634783:54634802:54632373:54632381:None:None   
1   IENSNCQLG  54631737:54631744:54631532:54631552:None:None   
2   QFIENRVLF  54631737:54631750:54629607:54629621:None:None   
3   AGALVYAAK  54634783:54634805:54632376:54632381:None:None   
4   YAAKPNEEI  54634783:54634790:54632361:54632381:None:None   
..        ...                                            ...   
72  QREQGAFPT  49515751:49515760:49526487:49526505:None:None   
73  TRRQREQGA  49515742:49515760:49526487:49526496:None:None   
74  VITRRQREQ  49515736:49515760:49526487:49526490:None:None   
75  RQREQGAFP  49515748:49515760:49526487:49526502:None:None   
76  EQGAFPTTN  49515757:49515760:49526487:49526511:None:None   

    junctionAnnotated  readFrameAnnotated strand junction_coordinate  
0               False         

42it [00:00, 6163.99it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                              coord  \
0   MKAKRTKAR      37796926:37796947:37968444:37968450:None:None   
1   AKRTKARKL      37796932:37796947:37968444:37968456:None:None   
2   KRTKARKLG      37796935:37796947:37968444:37968459:None:None   
3   RTKARKLGA      37796938:37796947:37968444:37968462:None:None   
4   KAKRTKARK      37796929:37796947:37968444:37968453:None:None   
5   TKARKLGAD      37796941:37796947:37968444:37968465:None:None   
6   KLLQEVFLT      37569743:37569759:37564578:37564589:None:None   
7   LKKLLQEVF      37569743:37569765:37564584:37564589:None:None   
8   KKLLQEVFL      37569743:37569762:37564581:37564589:None:None   
9   QEVFLTTTI      37569743:37569750:37564569:37564589:None:None   
10  LQEVFLTTT      37569743:37569753:37564572:37564589:None:None   
11  LLQEVFLTT      37569743:37569756:37564575:37564589

452it [00:00, 3660.85it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    MKAKRTKAR      37796926:37796947:37968444:37968450:None:None   
1    AKRTKARKL      37796932:37796947:37968444:37968456:None:None   
2    KRTKARKLG      37796935:37796947:37968444:37968459:None:None   
3    RTKARKLGA      37796938:37796947:37968444:37968462:None:None   
4    KAKRTKARK      37796929:37796947:37968444:37968453:None:None   
..         ...                                                ...   
447  PATFGKVQM  101466776:101466794:101466222:101466231:None:None   
448  KVQMSLLPL  101466776:101466779:101466207:101466231:None:None   
449  KVQMSLLPL  101466776:101466779:101466207:101466231:None:None   
450  KVQMSLLPL  101466776:101466779:101466207:101466231:None:None   
451  KVQMSLLPL  101466776:101466779:101466207:101466231:None:None   

     junctionAnnotated  readFrameAnnotate

452it [00:00, 3899.39it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    MKAKRTKAR      37796926:37796947:37968444:37968450:None:None   
1    AKRTKARKL      37796932:37796947:37968444:37968456:None:None   
2    KRTKARKLG      37796935:37796947:37968444:37968459:None:None   
3    RTKARKLGA      37796938:37796947:37968444:37968462:None:None   
4    KAKRTKARK      37796929:37796947:37968444:37968453:None:None   
..         ...                                                ...   
447  PATFGKVQM  101466776:101466794:101466222:101466231:None:None   
448  KVQMSLLPL  101466776:101466779:101466207:101466231:None:None   
449  KVQMSLLPL  101466776:101466779:101466207:101466231:None:None   
450  KVQMSLLPL  101466776:101466779:101466207:101466231:None:None   
451  KVQMSLLPL  101466776:101466779:101466207:101466231:None:None   

     junctionAnnotated  readFrameAnnotate

452it [00:00, 3852.23it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    MKAKRTKAR      37796926:37796947:37968444:37968450:None:None   
1    AKRTKARKL      37796932:37796947:37968444:37968456:None:None   
2    KRTKARKLG      37796935:37796947:37968444:37968459:None:None   
3    RTKARKLGA      37796938:37796947:37968444:37968462:None:None   
4    KAKRTKARK      37796929:37796947:37968444:37968453:None:None   
..         ...                                                ...   
447  PATFGKVQM  101466776:101466794:101466222:101466231:None:None   
448  KVQMSLLPL  101466776:101466779:101466207:101466231:None:None   
449  KVQMSLLPL  101466776:101466779:101466207:101466231:None:None   
450  KVQMSLLPL  101466776:101466779:101466207:101466231:None:None   
451  KVQMSLLPL  101466776:101466779:101466207:101466231:None:None   

     junctionAnnotated  readFrameAnnotate

6it [00:00, 4251.70it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
        kmer                                          coord  \
0  MKAKRTKAR  37796926:37796947:37968444:37968450:None:None   
1  AKRTKARKL  37796932:37796947:37968444:37968456:None:None   
2  KRTKARKLG  37796935:37796947:37968444:37968459:None:None   
3  RTKARKLGA  37796938:37796947:37968444:37968462:None:None   
4  KAKRTKARK  37796929:37796947:37968444:37968453:None:None   
5  TKARKLGAD  37796941:37796947:37968444:37968465:None:None   

   junctionAnnotated  readFrameAnnotated strand junction_coordinate  
0              False               False      +   37796947:37968444  
1              False               False      +   37796947:37968444  
2              False               False      +   37796947:37968444  
3              False               False      +   37796947:37968444  
4              False               False      +   37796947:37968444  
5       
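
In the six rows printed above, every `junction_coordinate` value (`37796947:37968444`) matches the second and third colon-separated fields of the corresponding `coord` string, and all six rows are on the `+` strand. The code that fills this column is not shown in this output, so the following is only a minimal sketch of that observed relationship. The helper name `junction_from_coord` is hypothetical, and the behaviour for `-` strand rows is not visible here and is not modelled.

```python
import pandas as pd


def junction_from_coord(coord: str) -> str:
    """Return 'end_of_first_segment:start_of_second_segment' from a coord string.

    Assumption: coord has the form 'a:b:c:d:None:None' and the junction is 'b:c',
    as seen for the '+' strand rows printed above. Hypothetical helper, not the
    notebook's own function.
    """
    fields = coord.split(":")
    return f"{fields[1]}:{fields[2]}"


# Two rows copied from the printed table above
df = pd.DataFrame(
    {
        "kmer": ["MKAKRTKAR", "AKRTKARKL"],
        "coord": [
            "37796926:37796947:37968444:37968450:None:None",
            "37796932:37796947:37968444:37968456:None:None",
        ],
        "strand": ["+", "+"],
    }
)

# Derive the junction column by applying the helper to each coord string
df["junction_coordinate"] = df["coord"].map(junction_from_coord)
print(df[["kmer", "strand", "junction_coordinate"]])
```

Both rows yield the same junction because the k-mers overlap the same splice site and differ only in their start and end offsets.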

317it [00:00, 3341.64it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                          coord  \
0    SSDEKTKKN  58578018:58578025:58574389:58574409:None:None   
1    ISSSDEKTK  58578018:58578031:58574395:58574409:None:None   
2    LISSSDEKT  58578018:58578034:58574398:58574409:None:None   
3    LLISSSDEK  58578018:58578037:58574401:58574409:None:None   
4    TLLISSSDE  58578018:58578040:58574404:58574409:None:None   
..         ...                                            ...   
312  KSTTLGGSW  52902029:52902033:52901942:52901965:None:None   
313  KSTTLGGSW  52902029:52902033:52901942:52901965:None:None   
314  KSTTLGGSW  52902029:52902033:52901942:52901965:None:None   
315  KSTTLGGSW  52902029:52902033:52901942:52901965:None:None   
316  QNKMLETKS  52902029:52902054:52901963:52901965:None:None   

     junctionAnnotated  readFrameAnnotated strand junction_coordinate  
0                

876it [00:00, 3439.89it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                          coord  \
0    SSDEKTKKN  58578018:58578025:58574389:58574409:None:None   
1    ISSSDEKTK  58578018:58578031:58574395:58574409:None:None   
2    LISSSDEKT  58578018:58578034:58574398:58574409:None:None   
3    LLISSSDEK  58578018:58578037:58574401:58574409:None:None   
4    TLLISSSDE  58578018:58578040:58574404:58574409:None:None   
..         ...                                            ...   
871  KSTTLGGSW  52902029:52902033:52901942:52901965:None:None   
872  KSTTLGGSW  52902029:52902033:52901942:52901965:None:None   
873  KSTTLGGSW  52902029:52902033:52901942:52901965:None:None   
874  KSTTLGGSW  52902029:52902033:52901942:52901965:None:None   
875  QNKMLETKS  52902029:52902054:52901963:52901965:None:None   

     junctionAnnotated  readFrameAnnotated strand junction_coordinate  
0                

164it [00:00, 3492.09it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                          coord  \
0    MKAKRTKAR  37796926:37796947:37968444:37968450:None:None   
1    AKRTKARKL  37796932:37796947:37968444:37968456:None:None   
2    KRTKARKLG  37796935:37796947:37968444:37968459:None:None   
3    RTKARKLGA  37796938:37796947:37968444:37968462:None:None   
4    KAKRTKARK  37796929:37796947:37968444:37968453:None:None   
..         ...                                            ...   
159  KSTTLGGSW  52902029:52902033:52901942:52901965:None:None   
160  KSTTLGGSW  52902029:52902033:52901942:52901965:None:None   
161  KSTTLGGSW  52902029:52902033:52901942:52901965:None:None   
162  KSTTLGGSW  52902029:52902033:52901942:52901965:None:None   
163  QNKMLETKS  52902029:52902054:52901963:52901965:None:None   

     junctionAnnotated  readFrameAnnotated strand junction_coordinate  
0                

1000it [00:00, 3503.83it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                          coord  \
0    SSDEKTKKN  58578018:58578025:58574389:58574409:None:None   
1    ISSSDEKTK  58578018:58578031:58574395:58574409:None:None   
2    LISSSDEKT  58578018:58578034:58574398:58574409:None:None   
3    LLISSSDEK  58578018:58578037:58574401:58574409:None:None   
4    TLLISSSDE  58578018:58578040:58574404:58574409:None:None   
..         ...                                            ...   
995  KSTTLGGSW  52902029:52902033:52901942:52901965:None:None   
996  KSTTLGGSW  52902029:52902033:52901942:52901965:None:None   
997  KSTTLGGSW  52902029:52902033:52901942:52901965:None:None   
998  KSTTLGGSW  52902029:52902033:52901942:52901965:None:None   
999  QNKMLETKS  52902029:52902054:52901963:52901965:None:None   

     junctionAnnotated  readFrameAnnotated strand junction_coordinate  
0                

28it [00:00, 6329.66it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                              coord  \
0   MKAKRTKAR      37796926:37796947:37968444:37968450:None:None   
1   AKRTKARKL      37796932:37796947:37968444:37968456:None:None   
2   KRTKARKLG      37796935:37796947:37968444:37968459:None:None   
3   RTKARKLGA      37796938:37796947:37968444:37968462:None:None   
4   KAKRTKARK      37796929:37796947:37968444:37968453:None:None   
5   TKARKLGAD      37796941:37796947:37968444:37968465:None:None   
6   KLLQEVFLT      37569743:37569759:37564578:37564589:None:None   
7   LKKLLQEVF      37569743:37569765:37564584:37564589:None:None   
8   KKLLQEVFL      37569743:37569762:37564581:37564589:None:None   
9   QEVFLTTTI      37569743:37569750:37564569:37564589:None:None   
10  LQEVFLTTT      37569743:37569753:37564572:37564589:None:None   
11  LLQEVFLTT      37569743:37569756:37564575:37564589

65it [00:00, 6841.91it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                          coord  \
0   SSSDSTLKI  52145845:52145846:52138216:52138242:None:None   
1   HYQSSSDST  52145845:52145855:52138225:52138242:None:None   
2   LLQHYQSSS  52145845:52145864:52138234:52138242:None:None   
3   YQSSSDSTL  52145845:52145852:52138222:52138242:None:None   
4   QSSSDSTLK  52145845:52145849:52138219:52138242:None:None   
..        ...                                            ...   
60  FTIHMSSLS      8069590:8069598:8073884:8073903:None:None   
61  TSFTIHMSS      8069584:8069598:8073884:8073897:None:None   
62  ALLQASQYT      8069576:8069598:8073884:8073889:None:None   
63  SFTIHMSSL      8069587:8069598:8073884:8073900:None:None   
64  LLQASQYTC      8069579:8069598:8073884:8073892:None:None   

    junctionAnnotated  readFrameAnnotated strand junction_coordinate  
0               False         

28it [00:00, 6152.26it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                              coord  \
0   MKAKRTKAR      37796926:37796947:37968444:37968450:None:None   
1   AKRTKARKL      37796932:37796947:37968444:37968456:None:None   
2   KRTKARKLG      37796935:37796947:37968444:37968459:None:None   
3   RTKARKLGA      37796938:37796947:37968444:37968462:None:None   
4   KAKRTKARK      37796929:37796947:37968444:37968453:None:None   
5   TKARKLGAD      37796941:37796947:37968444:37968465:None:None   
6   KLLQEVFLT      37569743:37569759:37564578:37564589:None:None   
7   LKKLLQEVF      37569743:37569765:37564584:37564589:None:None   
8   KKLLQEVFL      37569743:37569762:37564581:37564589:None:None   
9   QEVFLTTTI      37569743:37569750:37564569:37564589:None:None   
10  LQEVFLTTT      37569743:37569753:37564572:37564589:None:None   
11  LLQEVFLTT      37569743:37569756:37564575:37564589

32it [00:00, 4573.63it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                          coord  \
0   RLRQFPGQR  38367548:38367572:38370158:38370161:None:None   
1   SGSFQGRGV  38367552:38367572:38370158:38370165:None:None   
2   QGRGVDTST  38367564:38367572:38370158:38370177:None:None   
3   GRGVDTSTY  38367567:38367572:38370158:38370180:None:None   
4   RQFPGQRSG  38367554:38367572:38370158:38370167:None:None   
5   QFPGQRSGH  38367557:38367572:38370158:38370170:None:None   
6   GQRSGHEHL  38367566:38367572:38370158:38370179:None:None   
7   FQGRGVDTS  38367561:38367572:38370158:38370174:None:None   
8   FPGQRSGHE  38367560:38367572:38370158:38370173:None:None   
9   GSFQGRGVD  38367555:38367572:38370158:38370168:None:None   
10  DSGSFQGRG  38367549:38367572:38370158:38370162:None:None   
11  SFQGRGVDT  38367558:38367572:38370158:38370171:None:None   
12  LRQFPGQRS  38367551:38367572:38370

39it [00:00, 6034.75it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                          coord  \
0   MKAKRTKAR  37796926:37796947:37968444:37968450:None:None   
1   AKRTKARKL  37796932:37796947:37968444:37968456:None:None   
2   KRTKARKLG  37796935:37796947:37968444:37968459:None:None   
3   RTKARKLGA  37796938:37796947:37968444:37968462:None:None   
4   KAKRTKARK  37796929:37796947:37968444:37968453:None:None   
5   TKARKLGAD  37796941:37796947:37968444:37968465:None:None   
6   GDAGPAGPR  50196618:50196643:50196526:50196528:None:None   
7   RLRQFPGQR  38367548:38367572:38370158:38370161:None:None   
8   SGSFQGRGV  38367552:38367572:38370158:38370165:None:None   
9   QGRGVDTST  38367564:38367572:38370158:38370177:None:None   
10  GRGVDTSTY  38367567:38367572:38370158:38370180:None:None   
11  RQFPGQRSG  38367554:38367572:38370158:38370167:None:None   
12  QFPGQRSGH  38367557:38367572:38370

206it [00:00, 3496.41it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                          coord  \
0    MKAKRTKAR  37796926:37796947:37968444:37968450:None:None   
1    AKRTKARKL  37796932:37796947:37968444:37968456:None:None   
2    KRTKARKLG  37796935:37796947:37968444:37968459:None:None   
3    RTKARKLGA  37796938:37796947:37968444:37968462:None:None   
4    KAKRTKARK  37796929:37796947:37968444:37968453:None:None   
..         ...                                            ...   
201  KSTTLGGSW  52902029:52902033:52901942:52901965:None:None   
202  KSTTLGGSW  52902029:52902033:52901942:52901965:None:None   
203  KSTTLGGSW  52902029:52902033:52901942:52901965:None:None   
204  KSTTLGGSW  52902029:52902033:52901942:52901965:None:None   
205  QNKMLETKS  52902029:52902054:52901963:52901965:None:None   

     junctionAnnotated  readFrameAnnotated strand junction_coordinate  
0                

28it [00:00, 6235.56it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                              coord  \
0   MKAKRTKAR      37796926:37796947:37968444:37968450:None:None   
1   AKRTKARKL      37796932:37796947:37968444:37968456:None:None   
2   KRTKARKLG      37796935:37796947:37968444:37968459:None:None   
3   RTKARKLGA      37796938:37796947:37968444:37968462:None:None   
4   KAKRTKARK      37796929:37796947:37968444:37968453:None:None   
5   TKARKLGAD      37796941:37796947:37968444:37968465:None:None   
6   KLLQEVFLT      37569743:37569759:37564578:37564589:None:None   
7   LKKLLQEVF      37569743:37569765:37564584:37564589:None:None   
8   KKLLQEVFL      37569743:37569762:37564581:37564589:None:None   
9   QEVFLTTTI      37569743:37569750:37564569:37564589:None:None   
10  LQEVFLTTT      37569743:37569753:37564572:37564589:None:None   
11  LLQEVFLTT      37569743:37569756:37564575:37564589

441it [00:00, 3157.43it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                          coord  \
0    SSDEKTKKN  58578018:58578025:58574389:58574409:None:None   
1    ISSSDEKTK  58578018:58578031:58574395:58574409:None:None   
2    LISSSDEKT  58578018:58578034:58574398:58574409:None:None   
3    LLISSSDEK  58578018:58578037:58574401:58574409:None:None   
4    TLLISSSDE  58578018:58578040:58574404:58574409:None:None   
..         ...                                            ...   
436  KSTTLGGSW  52902029:52902033:52901942:52901965:None:None   
437  KSTTLGGSW  52902029:52902033:52901942:52901965:None:None   
438  KSTTLGGSW  52902029:52902033:52901942:52901965:None:None   
439  KSTTLGGSW  52902029:52902033:52901942:52901965:None:None   
440  QNKMLETKS  52902029:52902054:52901963:52901965:None:None   

     junctionAnnotated  readFrameAnnotated strand junction_coordinate  
0                

107it [00:00, 2690.85it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                          coord  \
0    MKAKRTKAR  37796926:37796947:37968444:37968450:None:None   
1    AKRTKARKL  37796932:37796947:37968444:37968456:None:None   
2    KRTKARKLG  37796935:37796947:37968444:37968459:None:None   
3    RTKARKLGA  37796938:37796947:37968444:37968462:None:None   
4    KAKRTKARK  37796929:37796947:37968444:37968453:None:None   
..         ...                                            ...   
102  LQGLGINTH  39715758:39715776:39715836:39715845:None:None   
103  TLQGLGINT  39715755:39715776:39715836:39715842:None:None   
104  LGINTHLCF  39715767:39715776:39715836:39715854:None:None   
105  GLGINTHLC  39715764:39715776:39715836:39715851:None:None   
106  LTLQGLGIN  39715752:39715776:39715836:39715839:None:None   

     junctionAnnotated  readFrameAnnotated strand junction_coordinate  
0                

6it [00:00, 4645.71it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
        kmer                                          coord  \
0  MKAKRTKAR  37796926:37796947:37968444:37968450:None:None   
1  AKRTKARKL  37796932:37796947:37968444:37968456:None:None   
2  KRTKARKLG  37796935:37796947:37968444:37968459:None:None   
3  RTKARKLGA  37796938:37796947:37968444:37968462:None:None   
4  KAKRTKARK  37796929:37796947:37968444:37968453:None:None   
5  TKARKLGAD  37796941:37796947:37968444:37968465:None:None   

   junctionAnnotated  readFrameAnnotated strand junction_coordinate  
0              False               False      +   37796947:37968444  
1              False               False      +   37796947:37968444  
2              False               False      +   37796947:37968444  
3              False               False      +   37796947:37968444  
4              False               False      +   37796947:37968444  
5       
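
In the rows printed above, the `junction_coordinate` value (`37796947:37968444`) matches the second and third colon-separated fields of `coord`. The following is a minimal sketch of that relationship, not the notebook's own code. It assumes the pattern holds for `+` strand rows, which are the only rows whose `junction_coordinate` is visible in this output; it may not hold for `-` strand rows. The helper `junction_from_coord` and the column `junction_coordinate_check` are hypothetical names introduced for illustration.

```python
import pandas as pd

# Three rows copied from the printed output above (all on the '+' strand).
df = pd.DataFrame({
    "kmer": ["MKAKRTKAR", "AKRTKARKL", "KRTKARKLG"],
    "coord": [
        "37796926:37796947:37968444:37968450:None:None",
        "37796932:37796947:37968444:37968456:None:None",
        "37796935:37796947:37968444:37968459:None:None",
    ],
    "strand": ["+", "+", "+"],
    "junction_coordinate": ["37796947:37968444"] * 3,
})


def junction_from_coord(coord: str) -> str:
    """Hypothetical helper: join the 2nd and 3rd fields of `coord`.

    Assumption: this matches `junction_coordinate` for '+' strand rows only.
    """
    fields = coord.split(":")
    return f"{fields[1]}:{fields[2]}"


# Recompute the junction from `coord` and compare with the stored column.
df["junction_coordinate_check"] = df["coord"].map(junction_from_coord)
df["matches"] = df["junction_coordinate_check"] == df["junction_coordinate"]
print(df[["kmer", "junction_coordinate_check", "matches"]])
```

A check like this can flag rows where the stored junction disagrees with `coord`, for example before merging tables on `junction_coordinate`.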

308it [00:00, 4221.39it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                          coord  \
0    MKAKRTKAR  37796926:37796947:37968444:37968450:None:None   
1    AKRTKARKL  37796932:37796947:37968444:37968456:None:None   
2    KRTKARKLG  37796935:37796947:37968444:37968459:None:None   
3    RTKARKLGA  37796938:37796947:37968444:37968462:None:None   
4    KAKRTKARK  37796929:37796947:37968444:37968453:None:None   
..         ...                                            ...   
303  KSTTLGGSW  52902029:52902033:52901942:52901965:None:None   
304  KSTTLGGSW  52902029:52902033:52901942:52901965:None:None   
305  KSTTLGGSW  52902029:52902033:52901942:52901965:None:None   
306  KSTTLGGSW  52902029:52902033:52901942:52901965:None:None   
307  QNKMLETKS  52902029:52902054:52901963:52901965:None:None   

     junctionAnnotated  readFrameAnnotated strand junction_coordinate  
0                

28it [00:00, 6196.08it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                              coord  \
0   MKAKRTKAR      37796926:37796947:37968444:37968450:None:None   
1   AKRTKARKL      37796932:37796947:37968444:37968456:None:None   
2   KRTKARKLG      37796935:37796947:37968444:37968459:None:None   
3   RTKARKLGA      37796938:37796947:37968444:37968462:None:None   
4   KAKRTKARK      37796929:37796947:37968444:37968453:None:None   
5   TKARKLGAD      37796941:37796947:37968444:37968465:None:None   
6   KLLQEVFLT      37569743:37569759:37564578:37564589:None:None   
7   LKKLLQEVF      37569743:37569765:37564584:37564589:None:None   
8   KKLLQEVFL      37569743:37569762:37564581:37564589:None:None   
9   QEVFLTTTI      37569743:37569750:37564569:37564589:None:None   
10  LQEVFLTTT      37569743:37569753:37564572:37564589:None:None   
11  LLQEVFLTT      37569743:37569756:37564575:37564589

608it [00:00, 3752.63it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    MKAKRTKAR      37796926:37796947:37968444:37968450:None:None   
1    AKRTKARKL      37796932:37796947:37968444:37968456:None:None   
2    KRTKARKLG      37796935:37796947:37968444:37968459:None:None   
3    RTKARKLGA      37796938:37796947:37968444:37968462:None:None   
4    KAKRTKARK      37796929:37796947:37968444:37968453:None:None   
..         ...                                                ...   
603  PATFGKVQM  101466776:101466794:101466222:101466231:None:None   
604  KVQMSLLPL  101466776:101466779:101466207:101466231:None:None   
605  KVQMSLLPL  101466776:101466779:101466207:101466231:None:None   
606  KVQMSLLPL  101466776:101466779:101466207:101466231:None:None   
607  KVQMSLLPL  101466776:101466779:101466207:101466231:None:None   

     junctionAnnotated  readFrameAnnotate

6it [00:00, 4622.67it/s]

Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
        kmer                                          coord  \
0  MKAKRTKAR  37796926:37796947:37968444:37968450:None:None   
1  AKRTKARKL  37796932:37796947:37968444:37968456:None:None   
2  KRTKARKLG  37796935:37796947:37968444:37968459:None:None   
3  RTKARKLGA  37796938:37796947:37968444:37968462:None:None   
4  KAKRTKARK  37796929:37796947:37968444:37968453:None:None   
5  TKARKLGAD  37796941:37796947:37968444:37968465:None:None   

   junctionAnnotated  readFrameAnnotated strand junction_coordinate  
0              False               False      +   37796947:37968444  
1              False               False      +   37796947:37968444  
2              False               False      +   37796947:37968444  
3              False               False      +   37796947:37968444  
4              False               False      +   37796947:37968444  
5       


6it [00:00, 4565.64it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
        kmer                                          coord  \
0  MKAKRTKAR  37796926:37796947:37968444:37968450:None:None   
1  AKRTKARKL  37796932:37796947:37968444:37968456:None:None   
2  KRTKARKLG  37796935:37796947:37968444:37968459:None:None   
3  RTKARKLGA  37796938:37796947:37968444:37968462:None:None   
4  KAKRTKARK  37796929:37796947:37968444:37968453:None:None   
5  TKARKLGAD  37796941:37796947:37968444:37968465:None:None   

   junctionAnnotated  readFrameAnnotated strand junction_coordinate  
0              False               False      +   37796947:37968444  
1              False               False      +   37796947:37968444  
2              False               False      +   37796947:37968444  
3              False               False      +   37796947:37968444  
4              False               False      +   37796947:37968444  
5       

436it [00:00, 4388.46it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    MKAKRTKAR      37796926:37796947:37968444:37968450:None:None   
1    AKRTKARKL      37796932:37796947:37968444:37968456:None:None   
2    KRTKARKLG      37796935:37796947:37968444:37968459:None:None   
3    RTKARKLGA      37796938:37796947:37968444:37968462:None:None   
4    KAKRTKARK      37796929:37796947:37968444:37968453:None:None   
..         ...                                                ...   
431  PATFGKVQM  101466776:101466794:101466222:101466231:None:None   
432  KVQMSLLPL  101466776:101466779:101466207:101466231:None:None   
433  KVQMSLLPL  101466776:101466779:101466207:101466231:None:None   
434  KVQMSLLPL  101466776:101466779:101466207:101466231:None:None   
435  KVQMSLLPL  101466776:101466779:101466207:101466231:None:None   

     junctionAnnotated  readFrameAnnotate

61it [00:00, 6247.77it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                          coord  \
0   MKAKRTKAR  37796926:37796947:37968444:37968450:None:None   
1   AKRTKARKL  37796932:37796947:37968444:37968456:None:None   
2   KRTKARKLG  37796935:37796947:37968444:37968459:None:None   
3   RTKARKLGA  37796938:37796947:37968444:37968462:None:None   
4   KAKRTKARK  37796929:37796947:37968444:37968453:None:None   
..        ...                                            ...   
56  YKKKPSIPL  52319155:52319166:52318565:52318581:None:None   
57  LEGYKKKPS  52319155:52319175:52318574:52318581:None:None   
58  KKKPSIPLP  52319155:52319163:52318562:52318581:None:None   
59  GYKKKPSIP  52319155:52319169:52318568:52318581:None:None   
60  KKPSIPLPP  52319155:52319160:52318559:52318581:None:None   

    junctionAnnotated  readFrameAnnotated strand junction_coordinate  
0               False         

42it [00:00, 4882.91it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                              coord  \
0   MKAKRTKAR      37796926:37796947:37968444:37968450:None:None   
1   AKRTKARKL      37796932:37796947:37968444:37968456:None:None   
2   KRTKARKLG      37796935:37796947:37968444:37968459:None:None   
3   RTKARKLGA      37796938:37796947:37968444:37968462:None:None   
4   KAKRTKARK      37796929:37796947:37968444:37968453:None:None   
5   TKARKLGAD      37796941:37796947:37968444:37968465:None:None   
6   KLLQEVFLT      37569743:37569759:37564578:37564589:None:None   
7   LKKLLQEVF      37569743:37569765:37564584:37564589:None:None   
8   KKLLQEVFL      37569743:37569762:37564581:37564589:None:None   
9   QEVFLTTTI      37569743:37569750:37564569:37564589:None:None   
10  LQEVFLTTT      37569743:37569753:37564572:37564589:None:None   
11  LLQEVFLTT      37569743:37569756:37564575:37564589

42it [00:00, 6313.10it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                              coord  \
0   MKAKRTKAR      37796926:37796947:37968444:37968450:None:None   
1   AKRTKARKL      37796932:37796947:37968444:37968456:None:None   
2   KRTKARKLG      37796935:37796947:37968444:37968459:None:None   
3   RTKARKLGA      37796938:37796947:37968444:37968462:None:None   
4   KAKRTKARK      37796929:37796947:37968444:37968453:None:None   
5   TKARKLGAD      37796941:37796947:37968444:37968465:None:None   
6   KLLQEVFLT      37569743:37569759:37564578:37564589:None:None   
7   LKKLLQEVF      37569743:37569765:37564584:37564589:None:None   
8   KKLLQEVFL      37569743:37569762:37564581:37564589:None:None   
9   QEVFLTTTI      37569743:37569750:37564569:37564589:None:None   
10  LQEVFLTTT      37569743:37569753:37564572:37564589:None:None   
11  LLQEVFLTT      37569743:37569756:37564575:37564589

58it [00:00, 6416.52it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                          coord  \
0   PGDRRLSLS  71109940:71109954:71105575:71105588:None:None   
1   RPGDRRLSL  71109940:71109957:71105578:71105588:None:None   
2   PRPGDRRLS  71109940:71109960:71105581:71105588:None:None   
3   DRRLSLSLS  71109940:71109948:71105569:71105588:None:None   
4   GDRRLSLSL  71109940:71109951:71105572:71105588:None:None   
5   NFHDPETGD  71109940:71109965:71105586:71105588:None:None   
6   YGHPADLPA  50186008:50186014:50185958:50185979:None:None   
7   FEYGHPADL  50186008:50186020:50185964:50185979:None:None   
8   EYGHPADLP  50186008:50186017:50185961:50185979:None:None   
9   RLRQFPGQR  38367548:38367572:38370158:38370161:None:None   
10  SGSFQGRGV  38367552:38367572:38370158:38370165:None:None   
11  QGRGVDTST  38367564:38367572:38370158:38370177:None:None   
12  GRGVDTSTY  38367567:38367572:38370

42it [00:00, 6039.73it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                              coord  \
0   MKAKRTKAR      37796926:37796947:37968444:37968450:None:None   
1   AKRTKARKL      37796932:37796947:37968444:37968456:None:None   
2   KRTKARKLG      37796935:37796947:37968444:37968459:None:None   
3   RTKARKLGA      37796938:37796947:37968444:37968462:None:None   
4   KAKRTKARK      37796929:37796947:37968444:37968453:None:None   
5   TKARKLGAD      37796941:37796947:37968444:37968465:None:None   
6   KLLQEVFLT      37569743:37569759:37564578:37564589:None:None   
7   LKKLLQEVF      37569743:37569765:37564584:37564589:None:None   
8   KKLLQEVFL      37569743:37569762:37564581:37564589:None:None   
9   QEVFLTTTI      37569743:37569750:37564569:37564589:None:None   
10  LQEVFLTTT      37569743:37569753:37564572:37564589:None:None   
11  LLQEVFLTT      37569743:37569756:37564575:37564589

224it [00:00, 3816.27it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                          coord  \
0    MKAKRTKAR  37796926:37796947:37968444:37968450:None:None   
1    AKRTKARKL  37796932:37796947:37968444:37968456:None:None   
2    KRTKARKLG  37796935:37796947:37968444:37968459:None:None   
3    RTKARKLGA  37796938:37796947:37968444:37968462:None:None   
4    KAKRTKARK  37796929:37796947:37968444:37968453:None:None   
..         ...                                            ...   
219  KSTTLGGSW  52902029:52902033:52901942:52901965:None:None   
220  KSTTLGGSW  52902029:52902033:52901942:52901965:None:None   
221  KSTTLGGSW  52902029:52902033:52901942:52901965:None:None   
222  KSTTLGGSW  52902029:52902033:52901942:52901965:None:None   
223  QNKMLETKS  52902029:52902054:52901963:52901965:None:None   

     junctionAnnotated  readFrameAnnotated strand junction_coordinate  
0                

57it [00:00, 6712.77it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                              coord  \
0   AQPHQKMGC      39098421:39098427:39098351:39098372:None:None   
1   SPGAQAQPH      39098421:39098442:39098366:39098372:None:None   
2   DSPGAQAQP      39098421:39098445:39098369:39098372:None:None   
3   GAQAQPHQK      39098421:39098436:39098360:39098372:None:None   
4   AQAQPHQKM      39098421:39098433:39098357:39098372:None:None   
5   PGAQAQPHQ      39098421:39098439:39098363:39098372:None:None   
6   QAQPHQKMG      39098421:39098430:39098354:39098372:None:None   
7   MKYTTATGL  183094558:183094580:183079278:183079283:None:None   
8   MKYTTATGL  183094558:183094580:183079278:183079283:None:None   
9   MKYTTATGL  183094558:183094580:183079278:183079283:None:None   
10  MKYTTATGL  183094558:183094580:183079278:183079283:None:None   
11  NGECPWQEE      10498710:10498731:10497766:10497772

974it [00:00, 3592.88it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    RTEPSPNRV          8877029:8877050:8868506:8868512:None:None   
1    ERTEPSPNR          8877029:8877053:8868509:8868512:None:None   
2    LKARKVLTG      58458488:58458513:58458461:58458463:None:None   
3    ARKVLTGRL      58458488:58458507:58458455:58458463:None:None   
4    KVLTGRLCL      58458488:58458501:58458449:58458463:None:None   
..         ...                                                ...   
969  VRRAPHGCP  154464868:154464886:154465017:154465026:None:None   
970  PHGCPEILP  154464880:154464886:154465017:154465038:None:None   
971  PEVRRAPHG  154464862:154464886:154465017:154465020:None:None   
972  RRAPHGCPE  154464871:154464886:154465017:154465029:None:None   
973  RAPHGCPEI  154464874:154464886:154465017:154465032:None:None   

     junctionAnnotated  readFrameAnnotate

61it [00:00, 6701.75it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                          coord  \
0   AQPHQKMGC  39098421:39098427:39098351:39098372:None:None   
1   SPGAQAQPH  39098421:39098442:39098366:39098372:None:None   
2   DSPGAQAQP  39098421:39098445:39098369:39098372:None:None   
3   GAQAQPHQK  39098421:39098436:39098360:39098372:None:None   
4   AQAQPHQKM  39098421:39098433:39098357:39098372:None:None   
..        ...                                            ...   
56  FYITFAKLP  94623279:94623300:94625523:94625529:None:None   
57  EFYITFAKL  94623276:94623300:94625523:94625526:None:None   
58  FAKLPLLCS  94623291:94623300:94625523:94625541:None:None   
59  YITFAKLPL  94623282:94623300:94625523:94625532:None:None   
60  TFAKLPLLC  94623288:94623300:94625523:94625538:None:None   

    junctionAnnotated  readFrameAnnotated strand junction_coordinate  
0               False         

88it [00:00, 2183.03it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                          coord  \
0   ELANPIRKY  31798546:31798552:31800978:31800999:None:None   
1   LANPIRKYQ  31798549:31798552:31800978:31801002:None:None   
2   ARLRRELAN  31798531:31798552:31800978:31800984:None:None   
3   LRRELANPI  31798537:31798552:31800978:31800990:None:None   
4   RELANPIRK  31798543:31798552:31800978:31800996:None:None   
..        ...                                            ...   
83  MRNIRECAY  13201486:13201504:13204218:13204227:None:None   
84  NIRECAYTH  13201492:13201504:13204218:13204233:None:None   
85  EKMRNIREC  13201480:13201504:13204218:13204221:None:None   
86  KMRNIRECA  13201483:13201504:13204218:13204224:None:None   
87  RECAYTHFK  13201498:13201504:13204218:13204239:None:None   

    junctionAnnotated  readFrameAnnotated strand junction_coordinate  
0               False         

396it [00:00, 3228.38it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    LHGPGLPRT      99603166:99603173:99517163:99517183:None:None   
1    IPNGLHGPG      99603166:99603185:99517175:99517183:None:None   
2    GLHGPGLPR      99603166:99603176:99517166:99517183:None:None   
3    PNGLHGPGL      99603166:99603182:99517172:99517183:None:None   
4    NGLHGPGLP      99603166:99603179:99517169:99517183:None:None   
..         ...                                                ...   
391  VRRAPHGCP  154464868:154464886:154465017:154465026:None:None   
392  PHGCPEILP  154464880:154464886:154465017:154465038:None:None   
393  PEVRRAPHG  154464862:154464886:154465017:154465020:None:None   
394  RRAPHGCPE  154464871:154464886:154465017:154465029:None:None   
395  RAPHGCPEI  154464874:154464886:154465017:154465032:None:None   

     junctionAnnotated  readFrameAnnotate

757it [00:00, 3286.23it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    LKARKVLTG      58458488:58458513:58458461:58458463:None:None   
1    ARKVLTGRL      58458488:58458507:58458455:58458463:None:None   
2    KVLTGRLCL      58458488:58458501:58458449:58458463:None:None   
3    KARKVLTGR      58458488:58458510:58458458:58458463:None:None   
4    RKVLTGRLC      58458488:58458504:58458452:58458463:None:None   
..         ...                                                ...   
752  VRRAPHGCP  154464868:154464886:154465017:154465026:None:None   
753  PHGCPEILP  154464880:154464886:154465017:154465038:None:None   
754  PEVRRAPHG  154464862:154464886:154465017:154465020:None:None   
755  RRAPHGCPE  154464871:154464886:154465017:154465029:None:None   
756  RAPHGCPEI  154464874:154464886:154465017:154465032:None:None   

     junctionAnnotated  readFrameAnnotate

224it [00:00, 4561.90it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    LKARKVLTG      58458488:58458513:58458461:58458463:None:None   
1    ARKVLTGRL      58458488:58458507:58458455:58458463:None:None   
2    KVLTGRLCL      58458488:58458501:58458449:58458463:None:None   
3    KARKVLTGR      58458488:58458510:58458458:58458463:None:None   
4    RKVLTGRLC      58458488:58458504:58458452:58458463:None:None   
..         ...                                                ...   
219  TKKRNERSG  151102942:151102957:151103058:151103070:None:None   
220  GFTKKRNER  151102936:151102957:151103058:151103064:None:None   
221  GFTKKRNER  151102936:151102957:151103058:151103064:None:None   
222  GFTKKRNER  151102936:151102957:151103058:151103064:None:None   
223  GFTKKRNER  151102936:151102957:151103058:151103064:None:None   

     junctionAnnotated  readFrameAnnotate

15it [00:00, 5081.95it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                              coord  \
0   NGECPWQEE      10498710:10498731:10497766:10497772:None:None   
1   ECPWQEEER      10498710:10498725:10497760:10497772:None:None   
2   CPWQEEERT      10498710:10498722:10497757:10497772:None:None   
3   QNGECPWQE      10498710:10498734:10497769:10497772:None:None   
4   GECPWQEEE      10498710:10498728:10497763:10497772:None:None   
5   PWQEEERTW      10498710:10498719:10497754:10497772:None:None   
6   WQEEERTWD      10498710:10498716:10497751:10497772:None:None   
7   MLVRSRDGP  133208468:133208470:133207920:133207945:None:None   
8   EATFTFMLV  133208468:133208488:133207938:133207945:None:None   
9   TFMLVRSRD  133208468:133208476:133207926:133207945:None:None   
10  TFTFMLVRS  133208468:133208482:133207932:133207945:None:None   
11  FTFMLVRSR  133208468:133208479:133207929:133207945

224it [00:00, 3414.63it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    LKARKVLTG      58458488:58458513:58458461:58458463:None:None   
1    ARKVLTGRL      58458488:58458507:58458455:58458463:None:None   
2    KVLTGRLCL      58458488:58458501:58458449:58458463:None:None   
3    KARKVLTGR      58458488:58458510:58458458:58458463:None:None   
4    RKVLTGRLC      58458488:58458504:58458452:58458463:None:None   
..         ...                                                ...   
219  TKKRNERSG  151102942:151102957:151103058:151103070:None:None   
220  GFTKKRNER  151102936:151102957:151103058:151103064:None:None   
221  GFTKKRNER  151102936:151102957:151103058:151103064:None:None   
222  GFTKKRNER  151102936:151102957:151103058:151103064:None:None   
223  GFTKKRNER  151102936:151102957:151103058:151103064:None:None   

     junctionAnnotated  readFrameAnnotate

76it [00:00, 2742.34it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                          coord  \
0   AQPHQKMGC  39098421:39098427:39098351:39098372:None:None   
1   SPGAQAQPH  39098421:39098442:39098366:39098372:None:None   
2   DSPGAQAQP  39098421:39098445:39098369:39098372:None:None   
3   GAQAQPHQK  39098421:39098436:39098360:39098372:None:None   
4   AQAQPHQKM  39098421:39098433:39098357:39098372:None:None   
..        ...                                            ...   
71  FYITFAKLP  94623279:94623300:94625523:94625529:None:None   
72  EFYITFAKL  94623276:94623300:94625523:94625526:None:None   
73  FAKLPLLCS  94623291:94623300:94625523:94625541:None:None   
74  YITFAKLPL  94623282:94623300:94625523:94625532:None:None   
75  TFAKLPLLC  94623288:94623300:94625523:94625538:None:None   

    junctionAnnotated  readFrameAnnotated strand junction_coordinate  
0               False         

220it [00:00, 2648.87it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    LKARKVLTG      58458488:58458513:58458461:58458463:None:None   
1    ARKVLTGRL      58458488:58458507:58458455:58458463:None:None   
2    KVLTGRLCL      58458488:58458501:58458449:58458463:None:None   
3    KARKVLTGR      58458488:58458510:58458458:58458463:None:None   
4    RKVLTGRLC      58458488:58458504:58458452:58458463:None:None   
..         ...                                                ...   
215  TKKRNERSG  151102942:151102957:151103058:151103070:None:None   
216  GFTKKRNER  151102936:151102957:151103058:151103064:None:None   
217  GFTKKRNER  151102936:151102957:151103058:151103064:None:None   
218  GFTKKRNER  151102936:151102957:151103058:151103064:None:None   
219  GFTKKRNER  151102936:151102957:151103058:151103064:None:None   

     junctionAnnotated  readFrameAnnotate

15it [00:00, 5579.02it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                              coord  \
0   NGECPWQEE      10498710:10498731:10497766:10497772:None:None   
1   ECPWQEEER      10498710:10498725:10497760:10497772:None:None   
2   CPWQEEERT      10498710:10498722:10497757:10497772:None:None   
3   QNGECPWQE      10498710:10498734:10497769:10497772:None:None   
4   GECPWQEEE      10498710:10498728:10497763:10497772:None:None   
5   PWQEEERTW      10498710:10498719:10497754:10497772:None:None   
6   WQEEERTWD      10498710:10498716:10497751:10497772:None:None   
7   MLVRSRDGP  133208468:133208470:133207920:133207945:None:None   
8   EATFTFMLV  133208468:133208488:133207938:133207945:None:None   
9   TFMLVRSRD  133208468:133208476:133207926:133207945:None:None   
10  TFTFMLVRS  133208468:133208482:133207932:133207945:None:None   
11  FTFMLVRSR  133208468:133208479:133207929:133207945

15it [00:00, 5469.40it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                              coord  \
0   NGECPWQEE      10498710:10498731:10497766:10497772:None:None   
1   ECPWQEEER      10498710:10498725:10497760:10497772:None:None   
2   CPWQEEERT      10498710:10498722:10497757:10497772:None:None   
3   QNGECPWQE      10498710:10498734:10497769:10497772:None:None   
4   GECPWQEEE      10498710:10498728:10497763:10497772:None:None   
5   PWQEEERTW      10498710:10498719:10497754:10497772:None:None   
6   WQEEERTWD      10498710:10498716:10497751:10497772:None:None   
7   MLVRSRDGP  133208468:133208470:133207920:133207945:None:None   
8   EATFTFMLV  133208468:133208488:133207938:133207945:None:None   
9   TFMLVRSRD  133208468:133208476:133207926:133207945:None:None   
10  TFTFMLVRS  133208468:133208482:133207932:133207945:None:None   
11  FTFMLVRSR  133208468:133208479:133207929:133207945

457it [00:00, 3739.91it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    ISVVNHQDH  102448235:102448257:102410796:102410801:None:None   
1    VNHQDHPPC  102448235:102448248:102410787:102410801:None:None   
2    SVVNHQDHP  102448235:102448254:102410793:102410801:None:None   
3    VVNHQDHPP  102448235:102448251:102410790:102410801:None:None   
4    LHGPGLPRT      99603166:99603173:99517163:99517183:None:None   
..         ...                                                ...   
452  VRRAPHGCP  154464868:154464886:154465017:154465026:None:None   
453  PHGCPEILP  154464880:154464886:154465017:154465038:None:None   
454  PEVRRAPHG  154464862:154464886:154465017:154465020:None:None   
455  RRAPHGCPE  154464871:154464886:154465017:154465029:None:None   
456  RAPHGCPEI  154464874:154464886:154465017:154465032:None:None   

     junctionAnnotated  readFrameAnnotate

61it [00:00, 6427.16it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                          coord  \
0   AQPHQKMGC  39098421:39098427:39098351:39098372:None:None   
1   SPGAQAQPH  39098421:39098442:39098366:39098372:None:None   
2   DSPGAQAQP  39098421:39098445:39098369:39098372:None:None   
3   GAQAQPHQK  39098421:39098436:39098360:39098372:None:None   
4   AQAQPHQKM  39098421:39098433:39098357:39098372:None:None   
..        ...                                            ...   
56  FYITFAKLP  94623279:94623300:94625523:94625529:None:None   
57  EFYITFAKL  94623276:94623300:94625523:94625526:None:None   
58  FAKLPLLCS  94623291:94623300:94625523:94625541:None:None   
59  YITFAKLPL  94623282:94623300:94625523:94625532:None:None   
60  TFAKLPLLC  94623288:94623300:94625523:94625538:None:None   

    junctionAnnotated  readFrameAnnotated strand junction_coordinate  
0               False         

15it [00:00, 6100.51it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                              coord  \
0   NGECPWQEE      10498710:10498731:10497766:10497772:None:None   
1   ECPWQEEER      10498710:10498725:10497760:10497772:None:None   
2   CPWQEEERT      10498710:10498722:10497757:10497772:None:None   
3   QNGECPWQE      10498710:10498734:10497769:10497772:None:None   
4   GECPWQEEE      10498710:10498728:10497763:10497772:None:None   
5   PWQEEERTW      10498710:10498719:10497754:10497772:None:None   
6   WQEEERTWD      10498710:10498716:10497751:10497772:None:None   
7   MLVRSRDGP  133208468:133208470:133207920:133207945:None:None   
8   EATFTFMLV  133208468:133208488:133207938:133207945:None:None   
9   TFMLVRSRD  133208468:133208476:133207926:133207945:None:None   
10  TFTFMLVRS  133208468:133208482:133207932:133207945:None:None   
11  FTFMLVRSR  133208468:133208479:133207929:133207945

76it [00:00, 6590.18it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                          coord  \
0   AQPHQKMGC  39098421:39098427:39098351:39098372:None:None   
1   SPGAQAQPH  39098421:39098442:39098366:39098372:None:None   
2   DSPGAQAQP  39098421:39098445:39098369:39098372:None:None   
3   GAQAQPHQK  39098421:39098436:39098360:39098372:None:None   
4   AQAQPHQKM  39098421:39098433:39098357:39098372:None:None   
..        ...                                            ...   
71  FYITFAKLP  94623279:94623300:94625523:94625529:None:None   
72  EFYITFAKL  94623276:94623300:94625523:94625526:None:None   
73  FAKLPLLCS  94623291:94623300:94625523:94625541:None:None   
74  YITFAKLPL  94623282:94623300:94625523:94625532:None:None   
75  TFAKLPLLC  94623288:94623300:94625523:94625538:None:None   

    junctionAnnotated  readFrameAnnotated strand junction_coordinate  
0               False         

170it [00:00, 3139.95it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                          coord  \
0    ELANPIRKY  31798546:31798552:31800978:31800999:None:None   
1    LANPIRKYQ  31798549:31798552:31800978:31801002:None:None   
2    ARLRRELAN  31798531:31798552:31800978:31800984:None:None   
3    LRRELANPI  31798537:31798552:31800978:31800990:None:None   
4    RELANPIRK  31798543:31798552:31800978:31800996:None:None   
..         ...                                            ...   
165  FYITFAKLP  94623279:94623300:94625523:94625529:None:None   
166  EFYITFAKL  94623276:94623300:94625523:94625526:None:None   
167  FAKLPLLCS  94623291:94623300:94625523:94625541:None:None   
168  YITFAKLPL  94623282:94623300:94625523:94625532:None:None   
169  TFAKLPLLC  94623288:94623300:94625523:94625538:None:None   

     junctionAnnotated  readFrameAnnotated strand junction_coordinate  
0                

86it [00:00, 6690.60it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                          coord  \
0   SATSSYLDK  41528204:41528220:41527956:41527967:None:None   
1   TSSYLDKVR  41528204:41528214:41527950:41527967:None:None   
2   QSSATSSYL  41528204:41528226:41527962:41527967:None:None   
3   SSATSSYLD  41528204:41528223:41527959:41527967:None:None   
4   ATSSYLDKV  41528204:41528217:41527953:41527967:None:None   
..        ...                                            ...   
81  AKVDALLHL  33403949:33403955:33406195:33406216:None:None   
82  GHLAKVDAL  33403940:33403955:33406195:33406207:None:None   
83  LEGHLAKVD  33403934:33403955:33406195:33406201:None:None   
84  KVDALLHLA  33403952:33403955:33406195:33406219:None:None   
85  HLAKVDALL  33403943:33403955:33406195:33406210:None:None   

    junctionAnnotated  readFrameAnnotated strand junction_coordinate  
0               False         

61it [00:00, 6639.14it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                          coord  \
0   AQPHQKMGC  39098421:39098427:39098351:39098372:None:None   
1   SPGAQAQPH  39098421:39098442:39098366:39098372:None:None   
2   DSPGAQAQP  39098421:39098445:39098369:39098372:None:None   
3   GAQAQPHQK  39098421:39098436:39098360:39098372:None:None   
4   AQAQPHQKM  39098421:39098433:39098357:39098372:None:None   
..        ...                                            ...   
56  FYITFAKLP  94623279:94623300:94625523:94625529:None:None   
57  EFYITFAKL  94623276:94623300:94625523:94625526:None:None   
58  FAKLPLLCS  94623291:94623300:94625523:94625541:None:None   
59  YITFAKLPL  94623282:94623300:94625523:94625532:None:None   
60  TFAKLPLLC  94623288:94623300:94625523:94625538:None:None   

    junctionAnnotated  readFrameAnnotated strand junction_coordinate  
0               False         

35it [00:00, 6446.26it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                              coord  \
0   ELANPIRKY      31798546:31798552:31800978:31800999:None:None   
1   LANPIRKYQ      31798549:31798552:31800978:31801002:None:None   
2   ARLRRELAN      31798531:31798552:31800978:31800984:None:None   
3   LRRELANPI      31798537:31798552:31800978:31800990:None:None   
4   RELANPIRK      31798543:31798552:31800978:31800996:None:None   
5   RLRRELANP      31798534:31798552:31800978:31800987:None:None   
6   LARLRRELA      31798528:31798552:31800978:31800981:None:None   
7   RRELANPIR      31798540:31798552:31800978:31800993:None:None   
8   EYDSYCGVG      16008948:16008965:16000609:16000619:None:None   
9   DSYCGVGLS      16008948:16008959:16000603:16000619:None:None   
10  LEEYDSYCG      16008948:16008971:16000615:16000619:None:None   
11  YCGVGLSFL      16008948:16008953:16000597:16000619

383it [00:00, 3142.11it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    LKARKVLTG      58458488:58458513:58458461:58458463:None:None   
1    ARKVLTGRL      58458488:58458507:58458455:58458463:None:None   
2    KVLTGRLCL      58458488:58458501:58458449:58458463:None:None   
3    KARKVLTGR      58458488:58458510:58458458:58458463:None:None   
4    RKVLTGRLC      58458488:58458504:58458452:58458463:None:None   
..         ...                                                ...   
378  TKKRNERSG  151102942:151102957:151103058:151103070:None:None   
379  GFTKKRNER  151102936:151102957:151103058:151103064:None:None   
380  GFTKKRNER  151102936:151102957:151103058:151103064:None:None   
381  GFTKKRNER  151102936:151102957:151103058:151103064:None:None   
382  GFTKKRNER  151102936:151102957:151103058:151103064:None:None   

     junctionAnnotated  readFrameAnnotate

594it [00:00, 3130.73it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    MLNKYFKLG      28929701:28929710:28935521:28935539:None:None   
1    DFEMLNKYF      28929692:28929710:28935521:28935530:None:None   
2    FEMLNKYFK      28929695:28929710:28935521:28935533:None:None   
3    LNKYFKLGM      28929704:28929710:28935521:28935542:None:None   
4    EMLNKYFKL      28929698:28929710:28935521:28935536:None:None   
..         ...                                                ...   
589  VRRAPHGCP  154464868:154464886:154465017:154465026:None:None   
590  PHGCPEILP  154464880:154464886:154465017:154465038:None:None   
591  PEVRRAPHG  154464862:154464886:154465017:154465020:None:None   
592  RRAPHGCPE  154464871:154464886:154465017:154465029:None:None   
593  RAPHGCPEI  154464874:154464886:154465017:154465032:None:None   

     junctionAnnotated  readFrameAnnotate

188it [00:00, 3581.47it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    TWNRTFEGR      32774821:32774843:32786233:32786238:None:None   
1    WNRTFEGRN      32774824:32774843:32786233:32786241:None:None   
2    NRTFEGRNT      32774827:32774843:32786233:32786244:None:None   
3    RTFEGRNTE      32774830:32774843:32786233:32786247:None:None   
4    SLQEPALQK  170808731:170808752:170809982:170809988:None:None   
..         ...                                                ...   
183  AKVDALLHL      33403949:33403955:33406195:33406216:None:None   
184  GHLAKVDAL      33403940:33403955:33406195:33406207:None:None   
185  LEGHLAKVD      33403934:33403955:33406195:33406201:None:None   
186  KVDALLHLA      33403952:33403955:33406195:33406219:None:None   
187  HLAKVDALL      33403943:33403955:33406195:33406210:None:None   

     junctionAnnotated  readFrameAnnotate

72it [00:00, 2103.81it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                          coord  \
0   AQPHQKMGC  39098421:39098427:39098351:39098372:None:None   
1   SPGAQAQPH  39098421:39098442:39098366:39098372:None:None   
2   DSPGAQAQP  39098421:39098445:39098369:39098372:None:None   
3   GAQAQPHQK  39098421:39098436:39098360:39098372:None:None   
4   AQAQPHQKM  39098421:39098433:39098357:39098372:None:None   
..        ...                                            ...   
67  FYITFAKLP  94623279:94623300:94625523:94625529:None:None   
68  EFYITFAKL  94623276:94623300:94625523:94625526:None:None   
69  FAKLPLLCS  94623291:94623300:94625523:94625541:None:None   
70  YITFAKLPL  94623282:94623300:94625523:94625532:None:None   
71  TFAKLPLLC  94623288:94623300:94625523:94625538:None:None   

    junctionAnnotated  readFrameAnnotated strand junction_coordinate  
0               False         

76it [00:00, 6501.60it/s]

Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                          coord  \
0   AQPHQKMGC  39098421:39098427:39098351:39098372:None:None   
1   SPGAQAQPH  39098421:39098442:39098366:39098372:None:None   
2   DSPGAQAQP  39098421:39098445:39098369:39098372:None:None   
3   GAQAQPHQK  39098421:39098436:39098360:39098372:None:None   
4   AQAQPHQKM  39098421:39098433:39098357:39098372:None:None   
..        ...                                            ...   
71  FYITFAKLP  94623279:94623300:94625523:94625529:None:None   
72  EFYITFAKL  94623276:94623300:94625523:94625526:None:None   
73  FAKLPLLCS  94623291:94623300:94625523:94625541:None:None   
74  YITFAKLPL  94623282:94623300:94625523:94625532:None:None   
75  TFAKLPLLC  94623288:94623300:94625523:94625538:None:None   

    junctionAnnotated  readFrameAnnotated strand junction_coordinate  
0               False         


69it [00:00, 2632.74it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                              coord  \
0   SLQEPALQK  170808731:170808752:170809982:170809988:None:None   
1   IAENMLSVC  170808748:170808752:170809982:170810005:None:None   
2   ALQKICYLY  170808746:170808752:170809982:170810003:None:None   
3   QEPALQKIC  170808737:170808752:170809982:170809994:None:None   
4   CIAENMLSV  170808745:170808752:170809982:170810002:None:None   
..        ...                                                ...   
64  AKVDALLHL      33403949:33403955:33406195:33406216:None:None   
65  GHLAKVDAL      33403940:33403955:33406195:33406207:None:None   
66  LEGHLAKVD      33403934:33403955:33406195:33406201:None:None   
67  KVDALLHLA      33403952:33403955:33406195:33406219:None:None   
68  HLAKVDALL      33403943:33403955:33406195:33406210:None:None   

    junctionAnnotated  readFrameAnnotated strand  jun

670it [00:00, 3307.88it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    RTEPSPNRV          8877029:8877050:8868506:8868512:None:None   
1    ERTEPSPNR          8877029:8877053:8868509:8868512:None:None   
2    MLNKYFKLG      28929701:28929710:28935521:28935539:None:None   
3    DFEMLNKYF      28929692:28929710:28935521:28935530:None:None   
4    FEMLNKYFK      28929695:28929710:28935521:28935533:None:None   
..         ...                                                ...   
665  VRRAPHGCP  154464868:154464886:154465017:154465026:None:None   
666  PHGCPEILP  154464880:154464886:154465017:154465038:None:None   
667  PEVRRAPHG  154464862:154464886:154465017:154465020:None:None   
668  RRAPHGCPE  154464871:154464886:154465017:154465029:None:None   
669  RAPHGCPEI  154464874:154464886:154465017:154465032:None:None   

     junctionAnnotated  readFrameAnnotate

150it [00:00, 3919.64it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                          coord  \
0    ELANPIRKY  31798546:31798552:31800978:31800999:None:None   
1    LANPIRKYQ  31798549:31798552:31800978:31801002:None:None   
2    ARLRRELAN  31798531:31798552:31800978:31800984:None:None   
3    LRRELANPI  31798537:31798552:31800978:31800990:None:None   
4    RELANPIRK  31798543:31798552:31800978:31800996:None:None   
..         ...                                            ...   
145  FYITFAKLP  94623279:94623300:94625523:94625529:None:None   
146  EFYITFAKL  94623276:94623300:94625523:94625526:None:None   
147  FAKLPLLCS  94623291:94623300:94625523:94625541:None:None   
148  YITFAKLPL  94623282:94623300:94625523:94625532:None:None   
149  TFAKLPLLC  94623288:94623300:94625523:94625538:None:None   

     junctionAnnotated  readFrameAnnotated strand junction_coordinate  
0                

224it [00:00, 2921.10it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    LKARKVLTG      58458488:58458513:58458461:58458463:None:None   
1    ARKVLTGRL      58458488:58458507:58458455:58458463:None:None   
2    KVLTGRLCL      58458488:58458501:58458449:58458463:None:None   
3    KARKVLTGR      58458488:58458510:58458458:58458463:None:None   
4    RKVLTGRLC      58458488:58458504:58458452:58458463:None:None   
..         ...                                                ...   
219  TKKRNERSG  151102942:151102957:151103058:151103070:None:None   
220  GFTKKRNER  151102936:151102957:151103058:151103064:None:None   
221  GFTKKRNER  151102936:151102957:151103058:151103064:None:None   
222  GFTKKRNER  151102936:151102957:151103058:151103064:None:None   
223  GFTKKRNER  151102936:151102957:151103058:151103064:None:None   

     junctionAnnotated  readFrameAnnotate

338it [00:00, 4893.48it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                          coord  \
0    MLNKYFKLG  28929701:28929710:28935521:28935539:None:None   
1    DFEMLNKYF  28929692:28929710:28935521:28935530:None:None   
2    FEMLNKYFK  28929695:28929710:28935521:28935533:None:None   
3    LNKYFKLGM  28929704:28929710:28935521:28935542:None:None   
4    EMLNKYFKL  28929698:28929710:28935521:28935536:None:None   
..         ...                                            ...   
333  AKVDALLHL  33403949:33403955:33406195:33406216:None:None   
334  GHLAKVDAL  33403940:33403955:33406195:33406207:None:None   
335  LEGHLAKVD  33403934:33403955:33406195:33406201:None:None   
336  KVDALLHLA  33403952:33403955:33406195:33406219:None:None   
337  HLAKVDALL  33403943:33403955:33406195:33406210:None:None   

     junctionAnnotated  readFrameAnnotated strand junction_coordinate  
0                

7it [00:00, 4535.78it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
        kmer                                  coord  junctionAnnotated  \
0  FRIRKQHRR  346147:346152:310026:310048:None:None              False   
1  EGREFRIRK  346147:346164:310038:310048:None:None              False   
2  GREFRIRKQ  346147:346161:310035:310048:None:None              False   
3  YSEGREFRI  346147:346170:310044:310048:None:None              False   
4  SEGREFRIR  346147:346167:310041:310048:None:None              False   
5  EFRIRKQHR  346147:346155:310029:310048:None:None              False   
6  REFRIRKQH  346147:346158:310032:310048:None:None              False   

   readFrameAnnotated strand junction_coordinate  
0                True      -       310048:346147  
1                True      -       310048:346147  
2                True      -       310048:346147  
3                True      -       310048:346147  
4                True 

7it [00:00, 4935.30it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
        kmer                                  coord  junctionAnnotated  \
0  FRIRKQHRR  346147:346152:310026:310048:None:None              False   
1  EGREFRIRK  346147:346164:310038:310048:None:None              False   
2  GREFRIRKQ  346147:346161:310035:310048:None:None              False   
3  YSEGREFRI  346147:346170:310044:310048:None:None              False   
4  SEGREFRIR  346147:346167:310041:310048:None:None              False   
5  EFRIRKQHR  346147:346155:310029:310048:None:None              False   
6  REFRIRKQH  346147:346158:310032:310048:None:None              False   

   readFrameAnnotated strand junction_coordinate  
0                True      -       310048:346147  
1                True      -       310048:346147  
2                True      -       310048:346147  
3                True      -       310048:346147  
4                True 

7it [00:00, 4665.52it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
        kmer                                  coord  junctionAnnotated  \
0  FRIRKQHRR  346147:346152:310026:310048:None:None              False   
1  EGREFRIRK  346147:346164:310038:310048:None:None              False   
2  GREFRIRKQ  346147:346161:310035:310048:None:None              False   
3  YSEGREFRI  346147:346170:310044:310048:None:None              False   
4  SEGREFRIR  346147:346167:310041:310048:None:None              False   
5  EFRIRKQHR  346147:346155:310029:310048:None:None              False   
6  REFRIRKQH  346147:346158:310032:310048:None:None              False   

   readFrameAnnotated strand junction_coordinate  
0                True      -       310048:346147  
1                True      -       310048:346147  
2                True      -       310048:346147  
3                True      -       310048:346147  
4                True 

110it [00:00, 7095.65it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    FRIRKQHRR              346147:346152:310026:310048:None:None   
1    EGREFRIRK              346147:346164:310038:310048:None:None   
2    GREFRIRKQ              346147:346161:310035:310048:None:None   
3    YSEGREFRI              346147:346170:310044:310048:None:None   
4    SEGREFRIR              346147:346167:310041:310048:None:None   
..         ...                                                ...   
105  QQCGADSLG  141628113:141628116:141627936:141627960:None:None   
106  QPTCIVLQQ  141628113:141628137:141627957:141627960:None:None   
107  PTCIVLQQC  141628113:141628134:141627954:141627960:None:None   
108  TCIVLQQCG  141628113:141628131:141627951:141627960:None:None   
109  CIVLQQCGA  141628113:141628128:141627948:141627960:None:None   

     junctionAnnotated  readFrameAnnotate

41it [00:00, 6425.05it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                              coord  \
0   FRIRKQHRR              346147:346152:310026:310048:None:None   
1   EGREFRIRK              346147:346164:310038:310048:None:None   
2   GREFRIRKQ              346147:346161:310035:310048:None:None   
3   YSEGREFRI              346147:346170:310044:310048:None:None   
4   SEGREFRIR              346147:346167:310041:310048:None:None   
5   EFRIRKQHR              346147:346155:310029:310048:None:None   
6   REFRIRKQH              346147:346158:310032:310048:None:None   
7   EGHEADLRG      71372166:71372170:71384782:71384805:None:None   
8   EGHEADLRG      71372166:71372170:71384782:71384805:None:None   
9   EGHEADLRG      71372166:71372170:71384782:71384805:None:None   
10  EGHEADLRG      71372166:71372170:71384782:71384805:None:None   
11  NGYIEGHEA      71372154:71372170:71384782:71384793

458it [00:00, 3533.85it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    TGLCQIFSE      33296217:33296222:33295165:33295187:None:None   
1    GLCQIFSEE      33296217:33296219:33295162:33295187:None:None   
2    LLQTGLCQI      33296217:33296231:33295174:33295187:None:None   
3    LQTGLCQIF      33296217:33296228:33295171:33295187:None:None   
4    ESFLLQTGL      33296217:33296240:33295183:33295187:None:None   
..         ...                                                ...   
453  ELQQIMKGG  180318792:180318814:180312462:180312467:None:None   
454  QIMKGGKRS  180318792:180318805:180312453:180312467:None:None   
455  IMKGGKRSS  180318792:180318802:180312450:180312467:None:None   
456  LKLSAECQK      49379664:49379690:49363232:49363233:None:None   
457  KLSAECQKF      49379664:49379687:49363229:49363233:None:None   

     junctionAnnotated  readFrameAnnotate

37it [00:00, 6404.31it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                          coord  \
0   FRIRKQHRR          346147:346152:310026:310048:None:None   
1   EGREFRIRK          346147:346164:310038:310048:None:None   
2   GREFRIRKQ          346147:346161:310035:310048:None:None   
3   YSEGREFRI          346147:346170:310044:310048:None:None   
4   SEGREFRIR          346147:346167:310041:310048:None:None   
5   EFRIRKQHR          346147:346155:310029:310048:None:None   
6   REFRIRKQH          346147:346158:310032:310048:None:None   
7   VKVYRPATV  72667427:72667434:72635279:72635299:None:None   
8   DLFGVKVYR  72667427:72667446:72635291:72635299:None:None   
9   GVKVYRPAT  72667427:72667437:72635282:72635299:None:None   
10  LFGVKVYRP  72667427:72667443:72635288:72635299:None:None   
11  KVYRPATVL  72667427:72667431:72635276:72635299:None:None   
12  FGVKVYRPA  72667427:72667440:72635

401it [00:00, 3129.28it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    TGLCQIFSE      33296217:33296222:33295165:33295187:None:None   
1    GLCQIFSEE      33296217:33296219:33295162:33295187:None:None   
2    LLQTGLCQI      33296217:33296231:33295174:33295187:None:None   
3    LQTGLCQIF      33296217:33296228:33295171:33295187:None:None   
4    ESFLLQTGL      33296217:33296240:33295183:33295187:None:None   
..         ...                                                ...   
396  ELQQIMKGG  180318792:180318814:180312462:180312467:None:None   
397  QIMKGGKRS  180318792:180318805:180312453:180312467:None:None   
398  IMKGGKRSS  180318792:180318802:180312450:180312467:None:None   
399  LKLSAECQK      49379664:49379690:49363232:49363233:None:None   
400  KLSAECQKF      49379664:49379687:49363229:49363233:None:None   

     junctionAnnotated  readFrameAnnotate

13it [00:00, 5765.06it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                          coord  \
0   FRIRKQHRR          346147:346152:310026:310048:None:None   
1   EGREFRIRK          346147:346164:310038:310048:None:None   
2   GREFRIRKQ          346147:346161:310035:310048:None:None   
3   YSEGREFRI          346147:346170:310044:310048:None:None   
4   SEGREFRIR          346147:346167:310041:310048:None:None   
5   EFRIRKQHR          346147:346155:310029:310048:None:None   
6   REFRIRKQH          346147:346158:310032:310048:None:None   
7   VKVYRPATV  72667427:72667434:72635279:72635299:None:None   
8   DLFGVKVYR  72667427:72667446:72635291:72635299:None:None   
9   GVKVYRPAT  72667427:72667437:72635282:72635299:None:None   
10  LFGVKVYRP  72667427:72667443:72635288:72635299:None:None   
11  KVYRPATVL  72667427:72667431:72635276:72635299:None:None   
12  FGVKVYRPA  72667427:72667440:72635

37it [00:00, 6411.19it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                          coord  \
0   FRIRKQHRR          346147:346152:310026:310048:None:None   
1   EGREFRIRK          346147:346164:310038:310048:None:None   
2   GREFRIRKQ          346147:346161:310035:310048:None:None   
3   YSEGREFRI          346147:346170:310044:310048:None:None   
4   SEGREFRIR          346147:346167:310041:310048:None:None   
5   EFRIRKQHR          346147:346155:310029:310048:None:None   
6   REFRIRKQH          346147:346158:310032:310048:None:None   
7   VKVYRPATV  72667427:72667434:72635279:72635299:None:None   
8   DLFGVKVYR  72667427:72667446:72635291:72635299:None:None   
9   GVKVYRPAT  72667427:72667437:72635282:72635299:None:None   
10  LFGVKVYRP  72667427:72667443:72635288:72635299:None:None   
11  KVYRPATVL  72667427:72667431:72635276:72635299:None:None   
12  FGVKVYRPA  72667427:72667440:72635

171it [00:00, 3202.95it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    TGLCQIFSE      33296217:33296222:33295165:33295187:None:None   
1    GLCQIFSEE      33296217:33296219:33295162:33295187:None:None   
2    LLQTGLCQI      33296217:33296231:33295174:33295187:None:None   
3    LQTGLCQIF      33296217:33296228:33295171:33295187:None:None   
4    ESFLLQTGL      33296217:33296240:33295183:33295187:None:None   
..         ...                                                ...   
166  VEDLMDSGN  240787250:240787256:240786342:240786363:None:None   
167  EDLMDSGNK  240787250:240787253:240786339:240786363:None:None   
168  YSVEDLMDS  240787250:240787262:240786348:240786363:None:None   
169  SYSVEDLMD  240787250:240787265:240786351:240786363:None:None   
170  SVEDLMDSG  240787250:240787259:240786345:240786363:None:None   

     junctionAnnotated  readFrameAnnotate

110it [00:00, 4601.40it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    FRIRKQHRR              346147:346152:310026:310048:None:None   
1    EGREFRIRK              346147:346164:310038:310048:None:None   
2    GREFRIRKQ              346147:346161:310035:310048:None:None   
3    YSEGREFRI              346147:346170:310044:310048:None:None   
4    SEGREFRIR              346147:346167:310041:310048:None:None   
..         ...                                                ...   
105  QQCGADSLG  141628113:141628116:141627936:141627960:None:None   
106  QPTCIVLQQ  141628113:141628137:141627957:141627960:None:None   
107  PTCIVLQQC  141628113:141628134:141627954:141627960:None:None   
108  TCIVLQQCG  141628113:141628131:141627951:141627960:None:None   
109  CIVLQQCGA  141628113:141628128:141627948:141627960:None:None   

     junctionAnnotated  readFrameAnnotate

7it [00:00, 3860.13it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
        kmer                                  coord  junctionAnnotated  \
0  FRIRKQHRR  346147:346152:310026:310048:None:None              False   
1  EGREFRIRK  346147:346164:310038:310048:None:None              False   
2  GREFRIRKQ  346147:346161:310035:310048:None:None              False   
3  YSEGREFRI  346147:346170:310044:310048:None:None              False   
4  SEGREFRIR  346147:346167:310041:310048:None:None              False   
5  EFRIRKQHR  346147:346155:310029:310048:None:None              False   
6  REFRIRKQH  346147:346158:310032:310048:None:None              False   

   readFrameAnnotated strand junction_coordinate  
0                True      -       310048:346147  
1                True      -       310048:346147  
2                True      -       310048:346147  
3                True      -       310048:346147  
4                True 

7it [00:00, 4875.48it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
        kmer                                  coord  junctionAnnotated  \
0  FRIRKQHRR  346147:346152:310026:310048:None:None              False   
1  EGREFRIRK  346147:346164:310038:310048:None:None              False   
2  GREFRIRKQ  346147:346161:310035:310048:None:None              False   
3  YSEGREFRI  346147:346170:310044:310048:None:None              False   
4  SEGREFRIR  346147:346167:310041:310048:None:None              False   
5  EFRIRKQHR  346147:346155:310029:310048:None:None              False   
6  REFRIRKQH  346147:346158:310032:310048:None:None              False   

   readFrameAnnotated strand junction_coordinate  
0                True      -       310048:346147  
1                True      -       310048:346147  
2                True      -       310048:346147  
3                True      -       310048:346147  
4                True 

543it [00:00, 3499.44it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    TGLCQIFSE      33296217:33296222:33295165:33295187:None:None   
1    GLCQIFSEE      33296217:33296219:33295162:33295187:None:None   
2    LLQTGLCQI      33296217:33296231:33295174:33295187:None:None   
3    LQTGLCQIF      33296217:33296228:33295171:33295187:None:None   
4    ESFLLQTGL      33296217:33296240:33295183:33295187:None:None   
..         ...                                                ...   
538  ELQQIMKGG  180318792:180318814:180312462:180312467:None:None   
539  QIMKGGKRS  180318792:180318805:180312453:180312467:None:None   
540  IMKGGKRSS  180318792:180318802:180312450:180312467:None:None   
541  LKLSAECQK      49379664:49379690:49363232:49363233:None:None   
542  KLSAECQKF      49379664:49379687:49363229:49363233:None:None   

     junctionAnnotated  readFrameAnnotate

110it [00:00, 2479.64it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    FRIRKQHRR              346147:346152:310026:310048:None:None   
1    EGREFRIRK              346147:346164:310038:310048:None:None   
2    GREFRIRKQ              346147:346161:310035:310048:None:None   
3    YSEGREFRI              346147:346170:310044:310048:None:None   
4    SEGREFRIR              346147:346167:310041:310048:None:None   
..         ...                                                ...   
105  QQCGADSLG  141628113:141628116:141627936:141627960:None:None   
106  QPTCIVLQQ  141628113:141628137:141627957:141627960:None:None   
107  PTCIVLQQC  141628113:141628134:141627954:141627960:None:None   
108  TCIVLQQCG  141628113:141628131:141627951:141627960:None:None   
109  CIVLQQCGA  141628113:141628128:141627948:141627960:None:None   

     junctionAnnotated  readFrameAnnotate

96it [00:00, 3680.86it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                              coord  \
0   FRIRKQHRR              346147:346152:310026:310048:None:None   
1   EGREFRIRK              346147:346164:310038:310048:None:None   
2   GREFRIRKQ              346147:346161:310035:310048:None:None   
3   YSEGREFRI              346147:346170:310044:310048:None:None   
4   SEGREFRIR              346147:346167:310041:310048:None:None   
..        ...                                                ...   
91  AREQLQGKI  17636654:17636666:17636576:17636588:17635661:1...   
92  REQLQGKIR  17636654:17636663:17636576:17636588:17635658:1...   
93  LVAREQLQG      17636654:17636672:17636579:17636588:None:None   
94  QLVAREQLQ      17636654:17636675:17636582:17636588:None:None   
95  VAREQLQGK      17636654:17636669:17636576:17636588:None:None   

    junctionAnnotated  readFrameAnnotated strand  \
0

14it [00:00, 5672.91it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                              coord  \
0   FRIRKQHRR              346147:346152:310026:310048:None:None   
1   EGREFRIRK              346147:346164:310038:310048:None:None   
2   GREFRIRKQ              346147:346161:310035:310048:None:None   
3   YSEGREFRI              346147:346170:310044:310048:None:None   
4   SEGREFRIR              346147:346167:310041:310048:None:None   
5   EFRIRKQHR              346147:346155:310029:310048:None:None   
6   REFRIRKQH              346147:346158:310032:310048:None:None   
7   NMSYSVEDL  240787250:240787271:240786357:240786363:None:None   
8   MSYSVEDLM  240787250:240787268:240786354:240786363:None:None   
9   VEDLMDSGN  240787250:240787256:240786342:240786363:None:None   
10  EDLMDSGNK  240787250:240787253:240786339:240786363:None:None   
11  YSVEDLMDS  240787250:240787262:240786348:240786363

37it [00:00, 6364.65it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                          coord  \
0   FRIRKQHRR          346147:346152:310026:310048:None:None   
1   EGREFRIRK          346147:346164:310038:310048:None:None   
2   GREFRIRKQ          346147:346161:310035:310048:None:None   
3   YSEGREFRI          346147:346170:310044:310048:None:None   
4   SEGREFRIR          346147:346167:310041:310048:None:None   
5   EFRIRKQHR          346147:346155:310029:310048:None:None   
6   REFRIRKQH          346147:346158:310032:310048:None:None   
7   VKVYRPATV  72667427:72667434:72635279:72635299:None:None   
8   DLFGVKVYR  72667427:72667446:72635291:72635299:None:None   
9   GVKVYRPAT  72667427:72667437:72635282:72635299:None:None   
10  LFGVKVYRP  72667427:72667443:72635288:72635299:None:None   
11  KVYRPATVL  72667427:72667431:72635276:72635299:None:None   
12  FGVKVYRPA  72667427:72667440:72635

88it [00:00, 3486.11it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                              coord  \
0   TGLCQIFSE      33296217:33296222:33295165:33295187:None:None   
1   GLCQIFSEE      33296217:33296219:33295162:33295187:None:None   
2   LLQTGLCQI      33296217:33296231:33295174:33295187:None:None   
3   LQTGLCQIF      33296217:33296228:33295171:33295187:None:None   
4   ESFLLQTGL      33296217:33296240:33295183:33295187:None:None   
..        ...                                                ...   
83  VEDLMDSGN  240787250:240787256:240786342:240786363:None:None   
84  EDLMDSGNK  240787250:240787253:240786339:240786363:None:None   
85  YSVEDLMDS  240787250:240787262:240786348:240786363:None:None   
86  SYSVEDLMD  240787250:240787265:240786351:240786363:None:None   
87  SVEDLMDSG  240787250:240787259:240786345:240786363:None:None   

    junctionAnnotated  readFrameAnnotated strand  jun

67it [00:00, 6425.77it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                  coord  junctionAnnotated  \
0   FRIRKQHRR  346147:346152:310026:310048:None:None              False   
1   EGREFRIRK  346147:346164:310038:310048:None:None              False   
2   GREFRIRKQ  346147:346161:310035:310048:None:None              False   
3   YSEGREFRI  346147:346170:310044:310048:None:None              False   
4   SEGREFRIR  346147:346167:310041:310048:None:None              False   
..        ...                                    ...                ...   
62  HGSGTRERP  397303:397313:399378:399395:None:None              False   
63  RHGSGTRER  397300:397313:399378:399392:None:None              False   
64  GSGTRERPC  397306:397313:399378:399398:None:None              False   
65  VRHGSGTRE  397297:397313:399378:399389:None:None              False   
66  SGTRERPCR  397309:397313:399378:399401:No

249it [00:00, 3060.02it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                          coord  \
0    TGLCQIFSE  33296217:33296222:33295165:33295187:None:None   
1    GLCQIFSEE  33296217:33296219:33295162:33295187:None:None   
2    LLQTGLCQI  33296217:33296231:33295174:33295187:None:None   
3    LQTGLCQIF  33296217:33296228:33295171:33295187:None:None   
4    ESFLLQTGL  33296217:33296240:33295183:33295187:None:None   
..         ...                                            ...   
244  HGSGTRERP          397303:397313:399378:399395:None:None   
245  RHGSGTRER          397300:397313:399378:399392:None:None   
246  GSGTRERPC          397306:397313:399378:399398:None:None   
247  VRHGSGTRE          397297:397313:399378:399389:None:None   
248  SGTRERPCR          397309:397313:399378:399401:None:None   

     junctionAnnotated  readFrameAnnotated strand junction_coordinate  
0                

37it [00:00, 5444.28it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                          coord  \
0   FRIRKQHRR          346147:346152:310026:310048:None:None   
1   EGREFRIRK          346147:346164:310038:310048:None:None   
2   GREFRIRKQ          346147:346161:310035:310048:None:None   
3   YSEGREFRI          346147:346170:310044:310048:None:None   
4   SEGREFRIR          346147:346167:310041:310048:None:None   
5   EFRIRKQHR          346147:346155:310029:310048:None:None   
6   REFRIRKQH          346147:346158:310032:310048:None:None   
7   VKVYRPATV  72667427:72667434:72635279:72635299:None:None   
8   DLFGVKVYR  72667427:72667446:72635291:72635299:None:None   
9   GVKVYRPAT  72667427:72667437:72635282:72635299:None:None   
10  LFGVKVYRP  72667427:72667443:72635288:72635299:None:None   
11  KVYRPATVL  72667427:72667431:72635276:72635299:None:None   
12  FGVKVYRPA  72667427:72667440:72635

37it [00:00, 6335.81it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                          coord  \
0   FRIRKQHRR          346147:346152:310026:310048:None:None   
1   EGREFRIRK          346147:346164:310038:310048:None:None   
2   GREFRIRKQ          346147:346161:310035:310048:None:None   
3   YSEGREFRI          346147:346170:310044:310048:None:None   
4   SEGREFRIR          346147:346167:310041:310048:None:None   
5   EFRIRKQHR          346147:346155:310029:310048:None:None   
6   REFRIRKQH          346147:346158:310032:310048:None:None   
7   VKVYRPATV  72667427:72667434:72635279:72635299:None:None   
8   DLFGVKVYR  72667427:72667446:72635291:72635299:None:None   
9   GVKVYRPAT  72667427:72667437:72635282:72635299:None:None   
10  LFGVKVYRP  72667427:72667443:72635288:72635299:None:None   
11  KVYRPATVL  72667427:72667431:72635276:72635299:None:None   
12  FGVKVYRPA  72667427:72667440:72635

35it [00:00, 1913.63it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                              coord  \
0   TGLCQIFSE      33296217:33296222:33295165:33295187:None:None   
1   GLCQIFSEE      33296217:33296219:33295162:33295187:None:None   
2   LLQTGLCQI      33296217:33296231:33295174:33295187:None:None   
3   LQTGLCQIF      33296217:33296228:33295171:33295187:None:None   
4   ESFLLQTGL      33296217:33296240:33295183:33295187:None:None   
5   SFLLQTGLC      33296217:33296237:33295180:33295187:None:None   
6   FLLQTGLCQ      33296217:33296234:33295177:33295187:None:None   
7   QTGLCQIFS      33296217:33296225:33295168:33295187:None:None   
8   FRIRKQHRR              346147:346152:310026:310048:None:None   
9   EGREFRIRK              346147:346164:310038:310048:None:None   
10  GREFRIRKQ              346147:346161:310035:310048:None:None   
11  YSEGREFRI              346147:346170:310044:310048

31it [00:00, 6100.09it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                          coord  \
0   FRIRKQHRR          346147:346152:310026:310048:None:None   
1   EGREFRIRK          346147:346164:310038:310048:None:None   
2   GREFRIRKQ          346147:346161:310035:310048:None:None   
3   YSEGREFRI          346147:346170:310044:310048:None:None   
4   SEGREFRIR          346147:346167:310041:310048:None:None   
5   EFRIRKQHR          346147:346155:310029:310048:None:None   
6   REFRIRKQH          346147:346158:310032:310048:None:None   
7   KTDMHGDSE  12939232:12939235:12939591:12939615:None:None   
8   KDKDRHARR  12939225:12939235:12939591:12939608:None:None   
9   TRRKIKTDM  12939217:12939235:12939591:12939600:None:None   
10  DEEKDKDRH  12939216:12939235:12939591:12939599:None:None   
11  KIKTDMHGD  12939226:12939235:12939591:12939609:None:None   
12  DKDRHARRL  12939228:12939235:12939

13it [00:00, 5849.17it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                          coord  \
0   FRIRKQHRR          346147:346152:310026:310048:None:None   
1   EGREFRIRK          346147:346164:310038:310048:None:None   
2   GREFRIRKQ          346147:346161:310035:310048:None:None   
3   YSEGREFRI          346147:346170:310044:310048:None:None   
4   SEGREFRIR          346147:346167:310041:310048:None:None   
5   EFRIRKQHR          346147:346155:310029:310048:None:None   
6   REFRIRKQH          346147:346158:310032:310048:None:None   
7   VKVYRPATV  72667427:72667434:72635279:72635299:None:None   
8   DLFGVKVYR  72667427:72667446:72635291:72635299:None:None   
9   GVKVYRPAT  72667427:72667437:72635282:72635299:None:None   
10  LFGVKVYRP  72667427:72667443:72635288:72635299:None:None   
11  KVYRPATVL  72667427:72667431:72635276:72635299:None:None   
12  FGVKVYRPA  72667427:72667440:72635

95it [00:00, 2262.94it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                              coord  \
0   TGLCQIFSE      33296217:33296222:33295165:33295187:None:None   
1   GLCQIFSEE      33296217:33296219:33295162:33295187:None:None   
2   LLQTGLCQI      33296217:33296231:33295174:33295187:None:None   
3   LQTGLCQIF      33296217:33296228:33295171:33295187:None:None   
4   ESFLLQTGL      33296217:33296240:33295183:33295187:None:None   
..        ...                                                ...   
90  VEDLMDSGN  240787250:240787256:240786342:240786363:None:None   
91  EDLMDSGNK  240787250:240787253:240786339:240786363:None:None   
92  YSVEDLMDS  240787250:240787262:240786348:240786363:None:None   
93  SYSVEDLMD  240787250:240787265:240786351:240786363:None:None   
94  SVEDLMDSG  240787250:240787259:240786345:240786363:None:None   

    junctionAnnotated  readFrameAnnotated strand  jun

13it [00:00, 5922.87it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                          coord  \
0   FRIRKQHRR          346147:346152:310026:310048:None:None   
1   EGREFRIRK          346147:346164:310038:310048:None:None   
2   GREFRIRKQ          346147:346161:310035:310048:None:None   
3   YSEGREFRI          346147:346170:310044:310048:None:None   
4   SEGREFRIR          346147:346167:310041:310048:None:None   
5   EFRIRKQHR          346147:346155:310029:310048:None:None   
6   REFRIRKQH          346147:346158:310032:310048:None:None   
7   VKVYRPATV  72667427:72667434:72635279:72635299:None:None   
8   DLFGVKVYR  72667427:72667446:72635291:72635299:None:None   
9   GVKVYRPAT  72667427:72667437:72635282:72635299:None:None   
10  LFGVKVYRP  72667427:72667443:72635288:72635299:None:None   
11  KVYRPATVL  72667427:72667431:72635276:72635299:None:None   
12  FGVKVYRPA  72667427:72667440:72635

424it [00:00, 3489.61it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    TGLCQIFSE      33296217:33296222:33295165:33295187:None:None   
1    GLCQIFSEE      33296217:33296219:33295162:33295187:None:None   
2    LLQTGLCQI      33296217:33296231:33295174:33295187:None:None   
3    LQTGLCQIF      33296217:33296228:33295171:33295187:None:None   
4    ESFLLQTGL      33296217:33296240:33295183:33295187:None:None   
..         ...                                                ...   
419  ELQQIMKGG  180318792:180318814:180312462:180312467:None:None   
420  QIMKGGKRS  180318792:180318805:180312453:180312467:None:None   
421  IMKGGKRSS  180318792:180318802:180312450:180312467:None:None   
422  LKLSAECQK      49379664:49379690:49363232:49363233:None:None   
423  KLSAECQKF      49379664:49379687:49363229:49363233:None:None   

     junctionAnnotated  readFrameAnnotate

31it [00:00, 6079.84it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                          coord  \
0   FRIRKQHRR          346147:346152:310026:310048:None:None   
1   EGREFRIRK          346147:346164:310038:310048:None:None   
2   GREFRIRKQ          346147:346161:310035:310048:None:None   
3   YSEGREFRI          346147:346170:310044:310048:None:None   
4   SEGREFRIR          346147:346167:310041:310048:None:None   
5   EFRIRKQHR          346147:346155:310029:310048:None:None   
6   REFRIRKQH          346147:346158:310032:310048:None:None   
7   KTDMHGDSE  12939232:12939235:12939591:12939615:None:None   
8   KDKDRHARR  12939225:12939235:12939591:12939608:None:None   
9   TRRKIKTDM  12939217:12939235:12939591:12939600:None:None   
10  DEEKDKDRH  12939216:12939235:12939591:12939599:None:None   
11  KIKTDMHGD  12939226:12939235:12939591:12939609:None:None   
12  DKDRHARRL  12939228:12939235:12939

168it [00:00, 4398.85it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    TGLCQIFSE      33296217:33296222:33295165:33295187:None:None   
1    GLCQIFSEE      33296217:33296219:33295162:33295187:None:None   
2    LLQTGLCQI      33296217:33296231:33295174:33295187:None:None   
3    LQTGLCQIF      33296217:33296228:33295171:33295187:None:None   
4    ESFLLQTGL      33296217:33296240:33295183:33295187:None:None   
..         ...                                                ...   
163  VEDLMDSGN  240787250:240787256:240786342:240786363:None:None   
164  EDLMDSGNK  240787250:240787253:240786339:240786363:None:None   
165  YSVEDLMDS  240787250:240787262:240786348:240786363:None:None   
166  SYSVEDLMD  240787250:240787265:240786351:240786363:None:None   
167  SVEDLMDSG  240787250:240787259:240786345:240786363:None:None   

     junctionAnnotated  readFrameAnnotate

339it [00:00, 4396.15it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    TGLCQIFSE      33296217:33296222:33295165:33295187:None:None   
1    GLCQIFSEE      33296217:33296219:33295162:33295187:None:None   
2    LLQTGLCQI      33296217:33296231:33295174:33295187:None:None   
3    LQTGLCQIF      33296217:33296228:33295171:33295187:None:None   
4    ESFLLQTGL      33296217:33296240:33295183:33295187:None:None   
..         ...                                                ...   
334  ELQQIMKGG  180318792:180318814:180312462:180312467:None:None   
335  QIMKGGKRS  180318792:180318805:180312453:180312467:None:None   
336  IMKGGKRSS  180318792:180318802:180312450:180312467:None:None   
337  LKLSAECQK      49379664:49379690:49363232:49363233:None:None   
338  KLSAECQKF      49379664:49379687:49363229:49363233:None:None   

     junctionAnnotated  readFrameAnnotate

294it [00:00, 2985.95it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    TGLCQIFSE      33296217:33296222:33295165:33295187:None:None   
1    GLCQIFSEE      33296217:33296219:33295162:33295187:None:None   
2    LLQTGLCQI      33296217:33296231:33295174:33295187:None:None   
3    LQTGLCQIF      33296217:33296228:33295171:33295187:None:None   
4    ESFLLQTGL      33296217:33296240:33295183:33295187:None:None   
..         ...                                                ...   
289  ELQQIMKGG  180318792:180318814:180312462:180312467:None:None   
290  QIMKGGKRS  180318792:180318805:180312453:180312467:None:None   
291  IMKGGKRSS  180318792:180318802:180312450:180312467:None:None   
292  LKLSAECQK      49379664:49379690:49363232:49363233:None:None   
293  KLSAECQKF      49379664:49379687:49363229:49363233:None:None   

     junctionAnnotated  readFrameAnnotate

37it [00:00, 6665.35it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                          coord  \
0   FRIRKQHRR          346147:346152:310026:310048:None:None   
1   EGREFRIRK          346147:346164:310038:310048:None:None   
2   GREFRIRKQ          346147:346161:310035:310048:None:None   
3   YSEGREFRI          346147:346170:310044:310048:None:None   
4   SEGREFRIR          346147:346167:310041:310048:None:None   
5   EFRIRKQHR          346147:346155:310029:310048:None:None   
6   REFRIRKQH          346147:346158:310032:310048:None:None   
7   VKVYRPATV  72667427:72667434:72635279:72635299:None:None   
8   DLFGVKVYR  72667427:72667446:72635291:72635299:None:None   
9   GVKVYRPAT  72667427:72667437:72635282:72635299:None:None   
10  LFGVKVYRP  72667427:72667443:72635288:72635299:None:None   
11  KVYRPATVL  72667427:72667431:72635276:72635299:None:None   
12  FGVKVYRPA  72667427:72667440:72635

149it [00:00, 4532.31it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
1    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
2    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
3    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
4    LEAKFETLQ      52238712:52238724:52243011:52243026:None:None   
..         ...                                                ...   
144  EGSPSADFA  136407428:136407432:136407204:136407227:None:None   
145  EEGSPSADF  136407428:136407435:136407207:136407227:None:None   
146  DLLDEEEGS  136407428:136407450:136407222:136407227:None:None   
147  LLDEEEGSP  136407428:136407447:136407219:136407227:None:None   
148  EEEGSPSAD  136407428:136407438:136407210:136407227:None:None   

     junctionAnnotated  readFrameAnnotate

38it [00:00, 6562.77it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                              coord  \
0   WAGPFPPAS      95287734:95287752:95288229:95288238:None:None   
1   PFPPASPSP      95287743:95287752:95288229:95288247:None:None   
2   FPPASPSPE      95287746:95287752:95288229:95288250:None:None   
3   AGPFPPASP      95287737:95287752:95288229:95288241:None:None   
4   PWAGPFPPA      95287731:95287752:95288229:95288235:None:None   
5   LPWAGPFPP      95287728:95287752:95288229:95288232:None:None   
6   GPFPPASPS      95287740:95287752:95288229:95288244:None:None   
7   FVKTLRWDA  124913290:124913306:124913082:124913093:None:None   
8   FVKTLRWDA  124913290:124913306:124913082:124913093:None:None   
9   FVKTLRWDA  124913290:124913306:124913082:124913093:None:None   
10  FVKTLRWDA  124913290:124913306:124913082:124913093:None:None   
11  KTLRWDANL  124913290:124913300:124913076:124913093

77it [00:00, 3178.22it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                              coord  \
0   VELEAKFET      52238706:52238724:52243011:52243020:None:None   
1   VELEAKFET      52238706:52238724:52243011:52243020:None:None   
2   VELEAKFET      52238706:52238724:52243011:52243020:None:None   
3   VELEAKFET      52238706:52238724:52243011:52243020:None:None   
4   LEAKFETLQ      52238712:52238724:52243011:52243026:None:None   
..        ...                                                ...   
72  TLRWDANLR  124913290:124913297:124913073:124913093:None:None   
73  IFVKTLRWD  124913290:124913309:124913085:124913093:None:None   
74  IFVKTLRWD  124913290:124913309:124913085:124913093:None:None   
75  IFVKTLRWD  124913290:124913309:124913085:124913093:None:None   
76  IFVKTLRWD  124913290:124913309:124913085:124913093:None:None   

    junctionAnnotated  readFrameAnnotated strand  jun

807it [00:00, 3629.66it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    DKLFQSWSS      89497170:89497175:89421235:89421257:None:None   
1    DKLFQSWSS      89497170:89497175:89421235:89421257:None:None   
2    DKLFQSWSS      89497170:89497175:89421235:89421257:None:None   
3    DKLFQSWSS      89497170:89497175:89421235:89421257:None:None   
4    IPDKLFQSW      89497170:89497181:89421241:89421257:None:None   
..         ...                                                ...   
802  EGSPSADFA  136407428:136407432:136407204:136407227:None:None   
803  EEGSPSADF  136407428:136407435:136407207:136407227:None:None   
804  DLLDEEEGS  136407428:136407450:136407222:136407227:None:None   
805  LLDEEEGSP  136407428:136407447:136407219:136407227:None:None   
806  EEEGSPSAD  136407428:136407438:136407210:136407227:None:None   

     junctionAnnotated  readFrameAnnotate

306it [00:00, 4261.03it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    FLLQRSDTS  101148149:101148161:101160072:101160087:None:None   
1    VSFLLQRSD  101148143:101148161:101160072:101160081:None:None   
2    AVSFLLQRS  101148140:101148161:101160072:101160078:None:None   
3    SFLLQRSDT  101148146:101148161:101160072:101160084:None:None   
4    LLQRSDTSL  101148152:101148161:101160072:101160090:None:None   
..         ...                                                ...   
301  EAGGRAGGS              944778:944783:944715:944737:None:None   
302  GDPDAEAGG              944778:944798:944730:944737:None:None   
303  PDAEAGGRA              944778:944792:944724:944737:None:None   
304  DPDAEAGGR              944778:944795:944727:944737:None:None   
305  DAEAGGRAG              944778:944789:944721:944737:None:None   

     junctionAnnotated  readFrameAnnotate

447it [00:00, 3569.87it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    DKLFQSWSS      89497170:89497175:89421235:89421257:None:None   
1    DKLFQSWSS      89497170:89497175:89421235:89421257:None:None   
2    DKLFQSWSS      89497170:89497175:89421235:89421257:None:None   
3    DKLFQSWSS      89497170:89497175:89421235:89421257:None:None   
4    IPDKLFQSW      89497170:89497181:89421241:89421257:None:None   
..         ...                                                ...   
442  EGSPSADFA  136407428:136407432:136407204:136407227:None:None   
443  EEGSPSADF  136407428:136407435:136407207:136407227:None:None   
444  DLLDEEEGS  136407428:136407450:136407222:136407227:None:None   
445  LLDEEEGSP  136407428:136407447:136407219:136407227:None:None   
446  EEEGSPSAD  136407428:136407438:136407210:136407227:None:None   

     junctionAnnotated  readFrameAnnotate

63it [00:00, 6708.16it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                          coord  \
0   WAGPFPPAS  95287734:95287752:95288229:95288238:None:None   
1   PFPPASPSP  95287743:95287752:95288229:95288247:None:None   
2   FPPASPSPE  95287746:95287752:95288229:95288250:None:None   
3   AGPFPPASP  95287737:95287752:95288229:95288241:None:None   
4   PWAGPFPPA  95287731:95287752:95288229:95288235:None:None   
..        ...                                            ...   
58  EAGGRAGGS          944778:944783:944715:944737:None:None   
59  GDPDAEAGG          944778:944798:944730:944737:None:None   
60  PDAEAGGRA          944778:944792:944724:944737:None:None   
61  DPDAEAGGR          944778:944795:944727:944737:None:None   
62  DAEAGGRAG          944778:944789:944721:944737:None:None   

    junctionAnnotated  readFrameAnnotated strand junction_coordinate  
0               False         

190it [00:00, 3397.08it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                          coord  \
0    VELEAKFET  52238706:52238724:52243011:52243020:None:None   
1    VELEAKFET  52238706:52238724:52243011:52243020:None:None   
2    VELEAKFET  52238706:52238724:52243011:52243020:None:None   
3    VELEAKFET  52238706:52238724:52243011:52243020:None:None   
4    LEAKFETLQ  52238712:52238724:52243011:52243026:None:None   
..         ...                                            ...   
185  EAGGRAGGS          944778:944783:944715:944737:None:None   
186  GDPDAEAGG          944778:944798:944730:944737:None:None   
187  PDAEAGGRA          944778:944792:944724:944737:None:None   
188  DPDAEAGGR          944778:944795:944727:944737:None:None   
189  DAEAGGRAG          944778:944789:944721:944737:None:None   

     junctionAnnotated  readFrameAnnotated strand junction_coordinate  
0                

134it [00:00, 2636.83it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
1    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
2    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
3    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
4    LEAKFETLQ      52238712:52238724:52243011:52243026:None:None   
..         ...                                                ...   
129  EGSPSADFA  136407428:136407432:136407204:136407227:None:None   
130  EEGSPSADF  136407428:136407435:136407207:136407227:None:None   
131  DLLDEEEGS  136407428:136407450:136407222:136407227:None:None   
132  LLDEEEGSP  136407428:136407447:136407219:136407227:None:None   
133  EEEGSPSAD  136407428:136407438:136407210:136407227:None:None   

     junctionAnnotated  readFrameAnnotate

473it [00:00, 3353.53it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    FLLQRSDTS  101148149:101148161:101160072:101160087:None:None   
1    VSFLLQRSD  101148143:101148161:101160072:101160081:None:None   
2    AVSFLLQRS  101148140:101148161:101160072:101160078:None:None   
3    SFLLQRSDT  101148146:101148161:101160072:101160084:None:None   
4    LLQRSDTSL  101148152:101148161:101160072:101160090:None:None   
..         ...                                                ...   
468  EGSPSADFA  136407428:136407432:136407204:136407227:None:None   
469  EEGSPSADF  136407428:136407435:136407207:136407227:None:None   
470  DLLDEEEGS  136407428:136407450:136407222:136407227:None:None   
471  LLDEEEGSP  136407428:136407447:136407219:136407227:None:None   
472  EEEGSPSAD  136407428:136407438:136407210:136407227:None:None   

     junctionAnnotated  readFrameAnnotate

337it [00:00, 4350.47it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    SAAGKEQRV      44940171:44940186:44958774:44958786:None:None   
1    GKEQRVWFL      44940180:44940186:44958774:44958795:None:None   
2    AAGKEQRVW      44940174:44940186:44958774:44958789:None:None   
3    AGKEQRVWF      44940177:44940186:44958774:44958792:None:None   
4    KEQRVWFLL      44940183:44940186:44958774:44958798:None:None   
..         ...                                                ...   
332  EGSPSADFA  136407428:136407432:136407204:136407227:None:None   
333  EEGSPSADF  136407428:136407435:136407207:136407227:None:None   
334  DLLDEEEGS  136407428:136407450:136407222:136407227:None:None   
335  LLDEEEGSP  136407428:136407447:136407219:136407227:None:None   
336  EEEGSPSAD  136407428:136407438:136407210:136407227:None:None   

     junctionAnnotated  readFrameAnnotate

99it [00:00, 4005.09it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                              coord  \
0   VELEAKFET      52238706:52238724:52243011:52243020:None:None   
1   VELEAKFET      52238706:52238724:52243011:52243020:None:None   
2   VELEAKFET      52238706:52238724:52243011:52243020:None:None   
3   VELEAKFET      52238706:52238724:52243011:52243020:None:None   
4   LEAKFETLQ      52238712:52238724:52243011:52243026:None:None   
..        ...                                                ...   
94  VEDLMDSGN  240787250:240787256:240786342:240786363:None:None   
95  EDLMDSGNK  240787250:240787253:240786339:240786363:None:None   
96  YSVEDLMDS  240787250:240787262:240786348:240786363:None:None   
97  SYSVEDLMD  240787250:240787265:240786351:240786363:None:None   
98  SVEDLMDSG  240787250:240787259:240786345:240786363:None:None   

    junctionAnnotated  readFrameAnnotated strand  jun

31it [00:00, 6209.04it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                              coord  \
0   WAGPFPPAS      95287734:95287752:95288229:95288238:None:None   
1   PFPPASPSP      95287743:95287752:95288229:95288247:None:None   
2   FPPASPSPE      95287746:95287752:95288229:95288250:None:None   
3   AGPFPPASP      95287737:95287752:95288229:95288241:None:None   
4   PWAGPFPPA      95287731:95287752:95288229:95288235:None:None   
5   LPWAGPFPP      95287728:95287752:95288229:95288232:None:None   
6   GPFPPASPS      95287740:95287752:95288229:95288244:None:None   
7   FVKTLRWDA  124913290:124913306:124913082:124913093:None:None   
8   FVKTLRWDA  124913290:124913306:124913082:124913093:None:None   
9   FVKTLRWDA  124913290:124913306:124913082:124913093:None:None   
10  FVKTLRWDA  124913290:124913306:124913082:124913093:None:None   
11  KTLRWDANL  124913290:124913300:124913076:124913093

31it [00:00, 1070.75it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                              coord  \
0   WAGPFPPAS      95287734:95287752:95288229:95288238:None:None   
1   PFPPASPSP      95287743:95287752:95288229:95288247:None:None   
2   FPPASPSPE      95287746:95287752:95288229:95288250:None:None   
3   AGPFPPASP      95287737:95287752:95288229:95288241:None:None   
4   PWAGPFPPA      95287731:95287752:95288229:95288235:None:None   
5   LPWAGPFPP      95287728:95287752:95288229:95288232:None:None   
6   GPFPPASPS      95287740:95287752:95288229:95288244:None:None   
7   FVKTLRWDA  124913290:124913306:124913082:124913093:None:None   
8   FVKTLRWDA  124913290:124913306:124913082:124913093:None:None   
9   FVKTLRWDA  124913290:124913306:124913082:124913093:None:None   
10  FVKTLRWDA  124913290:124913306:124913082:124913093:None:None   
11  KTLRWDANL  124913290:124913300:124913076:124913093

31it [00:00, 6341.37it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                              coord  \
0   WAGPFPPAS      95287734:95287752:95288229:95288238:None:None   
1   PFPPASPSP      95287743:95287752:95288229:95288247:None:None   
2   FPPASPSPE      95287746:95287752:95288229:95288250:None:None   
3   AGPFPPASP      95287737:95287752:95288229:95288241:None:None   
4   PWAGPFPPA      95287731:95287752:95288229:95288235:None:None   
5   LPWAGPFPP      95287728:95287752:95288229:95288232:None:None   
6   GPFPPASPS      95287740:95287752:95288229:95288244:None:None   
7   FVKTLRWDA  124913290:124913306:124913082:124913093:None:None   
8   FVKTLRWDA  124913290:124913306:124913082:124913093:None:None   
9   FVKTLRWDA  124913290:124913306:124913082:124913093:None:None   
10  FVKTLRWDA  124913290:124913306:124913082:124913093:None:None   
11  KTLRWDANL  124913290:124913300:124913076:124913093

134it [00:00, 3787.26it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
1    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
2    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
3    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
4    LEAKFETLQ      52238712:52238724:52243011:52243026:None:None   
..         ...                                                ...   
129  EGSPSADFA  136407428:136407432:136407204:136407227:None:None   
130  EEGSPSADF  136407428:136407435:136407207:136407227:None:None   
131  DLLDEEEGS  136407428:136407450:136407222:136407227:None:None   
132  LLDEEEGSP  136407428:136407447:136407219:136407227:None:None   
133  EEEGSPSAD  136407428:136407438:136407210:136407227:None:None   

     junctionAnnotated  readFrameAnnotate

969it [00:00, 3430.13it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    DKLFQSWSS      89497170:89497175:89421235:89421257:None:None   
1    DKLFQSWSS      89497170:89497175:89421235:89421257:None:None   
2    DKLFQSWSS      89497170:89497175:89421235:89421257:None:None   
3    DKLFQSWSS      89497170:89497175:89421235:89421257:None:None   
4    IPDKLFQSW      89497170:89497181:89421241:89421257:None:None   
..         ...                                                ...   
964  EGSPSADFA  136407428:136407432:136407204:136407227:None:None   
965  EEGSPSADF  136407428:136407435:136407207:136407227:None:None   
966  DLLDEEEGS  136407428:136407450:136407222:136407227:None:None   
967  LLDEEEGSP  136407428:136407447:136407219:136407227:None:None   
968  EEEGSPSAD  136407428:136407438:136407210:136407227:None:None   

     junctionAnnotated  readFrameAnnotate

134it [00:00, 2805.99it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
1    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
2    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
3    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
4    LEAKFETLQ      52238712:52238724:52243011:52243026:None:None   
..         ...                                                ...   
129  EGSPSADFA  136407428:136407432:136407204:136407227:None:None   
130  EEGSPSADF  136407428:136407435:136407207:136407227:None:None   
131  DLLDEEEGS  136407428:136407450:136407222:136407227:None:None   
132  LLDEEEGSP  136407428:136407447:136407219:136407227:None:None   
133  EEEGSPSAD  136407428:136407438:136407210:136407227:None:None   

     junctionAnnotated  readFrameAnnotate

77it [00:00, 3216.78it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                              coord  \
0   VELEAKFET      52238706:52238724:52243011:52243020:None:None   
1   VELEAKFET      52238706:52238724:52243011:52243020:None:None   
2   VELEAKFET      52238706:52238724:52243011:52243020:None:None   
3   VELEAKFET      52238706:52238724:52243011:52243020:None:None   
4   LEAKFETLQ      52238712:52238724:52243011:52243026:None:None   
..        ...                                                ...   
72  TLRWDANLR  124913290:124913297:124913073:124913093:None:None   
73  IFVKTLRWD  124913290:124913309:124913085:124913093:None:None   
74  IFVKTLRWD  124913290:124913309:124913085:124913093:None:None   
75  IFVKTLRWD  124913290:124913309:124913085:124913093:None:None   
76  IFVKTLRWD  124913290:124913309:124913085:124913093:None:None   

    junctionAnnotated  readFrameAnnotated strand  jun

570it [00:00, 3018.05it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    FLLQRSDTS  101148149:101148161:101160072:101160087:None:None   
1    VSFLLQRSD  101148143:101148161:101160072:101160081:None:None   
2    AVSFLLQRS  101148140:101148161:101160072:101160078:None:None   
3    SFLLQRSDT  101148146:101148161:101160072:101160084:None:None   
4    LLQRSDTSL  101148152:101148161:101160072:101160090:None:None   
..         ...                                                ...   
565  EGSPSADFA  136407428:136407432:136407204:136407227:None:None   
566  EEGSPSADF  136407428:136407435:136407207:136407227:None:None   
567  DLLDEEEGS  136407428:136407450:136407222:136407227:None:None   
568  LLDEEEGSP  136407428:136407447:136407219:136407227:None:None   
569  EEEGSPSAD  136407428:136407438:136407210:136407227:None:None   

     junctionAnnotated  readFrameAnnotate

31it [00:00, 5393.15it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                              coord  \
0   WAGPFPPAS      95287734:95287752:95288229:95288238:None:None   
1   PFPPASPSP      95287743:95287752:95288229:95288247:None:None   
2   FPPASPSPE      95287746:95287752:95288229:95288250:None:None   
3   AGPFPPASP      95287737:95287752:95288229:95288241:None:None   
4   PWAGPFPPA      95287731:95287752:95288229:95288235:None:None   
5   LPWAGPFPP      95287728:95287752:95288229:95288232:None:None   
6   GPFPPASPS      95287740:95287752:95288229:95288244:None:None   
7   FVKTLRWDA  124913290:124913306:124913082:124913093:None:None   
8   FVKTLRWDA  124913290:124913306:124913082:124913093:None:None   
9   FVKTLRWDA  124913290:124913306:124913082:124913093:None:None   
10  FVKTLRWDA  124913290:124913306:124913082:124913093:None:None   
11  KTLRWDANL  124913290:124913300:124913076:124913093

441it [00:00, 4026.13it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    DKLFQSWSS      89497170:89497175:89421235:89421257:None:None   
1    DKLFQSWSS      89497170:89497175:89421235:89421257:None:None   
2    DKLFQSWSS      89497170:89497175:89421235:89421257:None:None   
3    DKLFQSWSS      89497170:89497175:89421235:89421257:None:None   
4    IPDKLFQSW      89497170:89497181:89421241:89421257:None:None   
..         ...                                                ...   
436  EGSPSADFA  136407428:136407432:136407204:136407227:None:None   
437  EEGSPSADF  136407428:136407435:136407207:136407227:None:None   
438  DLLDEEEGS  136407428:136407450:136407222:136407227:None:None   
439  LLDEEEGSP  136407428:136407447:136407219:136407227:None:None   
440  EEEGSPSAD  136407428:136407438:136407210:136407227:None:None   

     junctionAnnotated  readFrameAnnotate

432it [00:00, 3035.45it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    DKLFQSWSS      89497170:89497175:89421235:89421257:None:None   
1    DKLFQSWSS      89497170:89497175:89421235:89421257:None:None   
2    DKLFQSWSS      89497170:89497175:89421235:89421257:None:None   
3    DKLFQSWSS      89497170:89497175:89421235:89421257:None:None   
4    IPDKLFQSW      89497170:89497181:89421241:89421257:None:None   
..         ...                                                ...   
427  EGSPSADFA  136407428:136407432:136407204:136407227:None:None   
428  EEGSPSADF  136407428:136407435:136407207:136407227:None:None   
429  DLLDEEEGS  136407428:136407450:136407222:136407227:None:None   
430  LLDEEEGSP  136407428:136407447:136407219:136407227:None:None   
431  EEEGSPSAD  136407428:136407438:136407210:136407227:None:None   

     junctionAnnotated  readFrameAnnotate

149it [00:00, 3878.87it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
1    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
2    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
3    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
4    LEAKFETLQ      52238712:52238724:52243011:52243026:None:None   
..         ...                                                ...   
144  EGSPSADFA  136407428:136407432:136407204:136407227:None:None   
145  EEGSPSADF  136407428:136407435:136407207:136407227:None:None   
146  DLLDEEEGS  136407428:136407450:136407222:136407227:None:None   
147  LLDEEEGSP  136407428:136407447:136407219:136407227:None:None   
148  EEEGSPSAD  136407428:136407438:136407210:136407227:None:None   

     junctionAnnotated  readFrameAnnotate

408it [00:00, 3483.12it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    SAAGKEQRV      44940171:44940186:44958774:44958786:None:None   
1    GKEQRVWFL      44940180:44940186:44958774:44958795:None:None   
2    AAGKEQRVW      44940174:44940186:44958774:44958789:None:None   
3    AGKEQRVWF      44940177:44940186:44958774:44958792:None:None   
4    KEQRVWFLL      44940183:44940186:44958774:44958798:None:None   
..         ...                                                ...   
403  EGSPSADFA  136407428:136407432:136407204:136407227:None:None   
404  EEGSPSADF  136407428:136407435:136407207:136407227:None:None   
405  DLLDEEEGS  136407428:136407450:136407222:136407227:None:None   
406  LLDEEEGSP  136407428:136407447:136407219:136407227:None:None   
407  EEEGSPSAD  136407428:136407438:136407210:136407227:None:None   

     junctionAnnotated  readFrameAnnotate

134it [00:00, 2958.29it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
1    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
2    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
3    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
4    LEAKFETLQ      52238712:52238724:52243011:52243026:None:None   
..         ...                                                ...   
129  EGSPSADFA  136407428:136407432:136407204:136407227:None:None   
130  EEGSPSADF  136407428:136407435:136407207:136407227:None:None   
131  DLLDEEEGS  136407428:136407450:136407222:136407227:None:None   
132  LLDEEEGSP  136407428:136407447:136407219:136407227:None:None   
133  EEEGSPSAD  136407428:136407438:136407210:136407227:None:None   

     junctionAnnotated  readFrameAnnotate

76it [00:00, 2702.03it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                          coord  \
0   WAGPFPPAS  95287734:95287752:95288229:95288238:None:None   
1   PFPPASPSP  95287743:95287752:95288229:95288247:None:None   
2   FPPASPSPE  95287746:95287752:95288229:95288250:None:None   
3   AGPFPPASP  95287737:95287752:95288229:95288241:None:None   
4   PWAGPFPPA  95287731:95287752:95288229:95288235:None:None   
..        ...                                            ...   
71  EAGGRAGGS          944778:944783:944715:944737:None:None   
72  GDPDAEAGG          944778:944798:944730:944737:None:None   
73  PDAEAGGRA          944778:944792:944724:944737:None:None   
74  DPDAEAGGR          944778:944795:944727:944737:None:None   
75  DAEAGGRAG          944778:944789:944721:944737:None:None   

    junctionAnnotated  readFrameAnnotated strand junction_coordinate  
0               False         

162it [00:00, 2935.55it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
1    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
2    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
3    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
4    LEAKFETLQ      52238712:52238724:52243011:52243026:None:None   
..         ...                                                ...   
157  EGSPSADFA  136407428:136407432:136407204:136407227:None:None   
158  EEGSPSADF  136407428:136407435:136407207:136407227:None:None   
159  DLLDEEEGS  136407428:136407450:136407222:136407227:None:None   
160  LLDEEEGSP  136407428:136407447:136407219:136407227:None:None   
161  EEEGSPSAD  136407428:136407438:136407210:136407227:None:None   

     junctionAnnotated  readFrameAnnotate

77it [00:00, 3286.00it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                              coord  \
0   VELEAKFET      52238706:52238724:52243011:52243020:None:None   
1   VELEAKFET      52238706:52238724:52243011:52243020:None:None   
2   VELEAKFET      52238706:52238724:52243011:52243020:None:None   
3   VELEAKFET      52238706:52238724:52243011:52243020:None:None   
4   LEAKFETLQ      52238712:52238724:52243011:52243026:None:None   
..        ...                                                ...   
72  TLRWDANLR  124913290:124913297:124913073:124913093:None:None   
73  IFVKTLRWD  124913290:124913309:124913085:124913093:None:None   
74  IFVKTLRWD  124913290:124913309:124913085:124913093:None:None   
75  IFVKTLRWD  124913290:124913309:124913085:124913093:None:None   
76  IFVKTLRWD  124913290:124913309:124913085:124913093:None:None   

    junctionAnnotated  readFrameAnnotated strand  jun

524it [00:00, 2849.78it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    DKLFQSWSS      89497170:89497175:89421235:89421257:None:None   
1    DKLFQSWSS      89497170:89497175:89421235:89421257:None:None   
2    DKLFQSWSS      89497170:89497175:89421235:89421257:None:None   
3    DKLFQSWSS      89497170:89497175:89421235:89421257:None:None   
4    IPDKLFQSW      89497170:89497181:89421241:89421257:None:None   
..         ...                                                ...   
519  EGSPSADFA  136407428:136407432:136407204:136407227:None:None   
520  EEGSPSADF  136407428:136407435:136407207:136407227:None:None   
521  DLLDEEEGS  136407428:136407450:136407222:136407227:None:None   
522  LLDEEEGSP  136407428:136407447:136407219:136407227:None:None   
523  EEEGSPSAD  136407428:136407438:136407210:136407227:None:None   

     junctionAnnotated  readFrameAnnotate

149it [00:00, 4015.47it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
1    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
2    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
3    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
4    LEAKFETLQ      52238712:52238724:52243011:52243026:None:None   
..         ...                                                ...   
144  EGSPSADFA  136407428:136407432:136407204:136407227:None:None   
145  EEGSPSADF  136407428:136407435:136407207:136407227:None:None   
146  DLLDEEEGS  136407428:136407450:136407222:136407227:None:None   
147  LLDEEEGSP  136407428:136407447:136407219:136407227:None:None   
148  EEEGSPSAD  136407428:136407438:136407210:136407227:None:None   

     junctionAnnotated  readFrameAnnotate

149it [00:00, 2571.93it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
1    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
2    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
3    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
4    LEAKFETLQ      52238712:52238724:52243011:52243026:None:None   
..         ...                                                ...   
144  EGSPSADFA  136407428:136407432:136407204:136407227:None:None   
145  EEGSPSADF  136407428:136407435:136407207:136407227:None:None   
146  DLLDEEEGS  136407428:136407450:136407222:136407227:None:None   
147  LLDEEEGSP  136407428:136407447:136407219:136407227:None:None   
148  EEEGSPSAD  136407428:136407438:136407210:136407227:None:None   

     junctionAnnotated  readFrameAnnotate

77it [00:00, 3221.08it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
         kmer                                              coord  \
0   VELEAKFET      52238706:52238724:52243011:52243020:None:None   
1   VELEAKFET      52238706:52238724:52243011:52243020:None:None   
2   VELEAKFET      52238706:52238724:52243011:52243020:None:None   
3   VELEAKFET      52238706:52238724:52243011:52243020:None:None   
4   LEAKFETLQ      52238712:52238724:52243011:52243026:None:None   
..        ...                                                ...   
72  TLRWDANLR  124913290:124913297:124913073:124913093:None:None   
73  IFVKTLRWD  124913290:124913309:124913085:124913093:None:None   
74  IFVKTLRWD  124913290:124913309:124913085:124913093:None:None   
75  IFVKTLRWD  124913290:124913309:124913085:124913093:None:None   
76  IFVKTLRWD  124913290:124913309:124913085:124913093:None:None   

    junctionAnnotated  readFrameAnnotated strand  jun

180it [00:00, 2756.06it/s]


Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
1    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
2    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
3    VELEAKFET      52238706:52238724:52243011:52243020:None:None   
4    LEAKFETLQ      52238712:52238724:52243011:52243026:None:None   
..         ...                                                ...   
175  EGSPSADFA  136407428:136407432:136407204:136407227:None:None   
176  EEGSPSADF  136407428:136407435:136407207:136407227:None:None   
177  DLLDEEEGS  136407428:136407450:136407222:136407227:None:None   
178  LLDEEEGSP  136407428:136407447:136407219:136407227:None:None   
179  EEEGSPSAD  136407428:136407438:136407210:136407227:None:None   

     junctionAnnotated  readFrameAnnotate

447it [00:00, 4812.74it/s]

Index(['kmer', 'coord', 'junctionAnnotated', 'readFrameAnnotated', 'strand',
       'junction_coordinate'],
      dtype='object')
          kmer                                              coord  \
0    DKLFQSWSS      89497170:89497175:89421235:89421257:None:None   
1    DKLFQSWSS      89497170:89497175:89421235:89421257:None:None   
2    DKLFQSWSS      89497170:89497175:89421235:89421257:None:None   
3    DKLFQSWSS      89497170:89497175:89421235:89421257:None:None   
4    IPDKLFQSW      89497170:89497181:89421241:89421257:None:None   
..         ...                                                ...   
442  EGSPSADFA  136407428:136407432:136407204:136407227:None:None   
443  EEGSPSADF  136407428:136407435:136407207:136407227:None:None   
444  DLLDEEEGS  136407428:136407450:136407222:136407227:None:None   
445  LLDEEEGSP  136407428:136407447:136407219:136407227:None:None   
446  EEEGSPSAD  136407428:136407438:136407210:136407227:None:None   

     junctionAnnotated  readFrameAnnotate




<a name='3'></a>
## Download tables

This cell uses the custom create_path function to build the file paths under which the tables are stored. The base directory depends on the OHSU_BRCA_NEW flag: when the flag is true, an extra OHSU_BRCA_NEW subdirectory is appended, so the "new" and "old" sets of tables are kept apart.
Three paths are then derived from that base directory, one each for NAME_FILTERING_BRCA, NAME_ETH_TASK_BRCA and NAME_OHSU_TASK_BRCA. Judging by the names, these correspond to the filtering step and to the ETH and OHSU task tables for BRCA, which most likely refers to the breast cancer data.
Finally, the cell writes the DataFrame out_df_filtered to a CSV file at path_filtering, with a header row and a semicolon (;) as the separator. Only path_filtering is used in this cell; path_ETH and path_OHSU are defined but nothing is written to them here.

In [None]:
# Choose the table directory; the "new" OHSU BRCA run gets its own subdirectory.
if OHSU_BRCA_NEW:
    PATH_DATA = create_path.create_path(SAVE_DIR, [DIR_CSV, DIR_BRCA, NAME_TABLES, 'OHSU_BRCA_NEW', ''])
else:
    PATH_DATA = create_path.create_path(SAVE_DIR, [DIR_CSV, DIR_BRCA, NAME_TABLES])

# Output paths for the filtering, ETH and OHSU tables.
path_filtering = create_path.create_path(PATH_DATA, [NAME_FILTERING_BRCA])
path_ETH = create_path.create_path(PATH_DATA, [NAME_ETH_TASK_BRCA])
path_OHSU = create_path.create_path(PATH_DATA, [NAME_OHSU_TASK_BRCA])

# Write the filtered table with a header row, using ';' as the separator.
out_df_filtered.to_csv(path_filtering, header=True, sep=';')
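
The path switching and the semicolon-separated export can be tried out in isolation with the sketch below. It is only an illustration: build_path and table_dir are hypothetical stand-ins (the real create_path lives in the custom functions module and may behave differently), the directory names "csv", "brca" and "tables" and the file name are placeholders for the constants in config, and the DataFrame holds toy values rather than real results.

```python
import os
import tempfile

import pandas as pd


def build_path(base_dir, parts):
    """Hypothetical stand-in for create_path: join `parts` onto `base_dir`
    and make sure the parent directory of the result exists."""
    path = os.path.join(base_dir, *parts)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    return path


def table_dir(save_dir, use_new):
    """Return the table directory, with an extra subdirectory for the 'new' run."""
    parts = ["csv", "brca", "tables"]  # placeholder directory names
    if use_new:
        parts.append("OHSU_BRCA_NEW")
    # The trailing "" makes the joined path end with a separator,
    # so the directory itself is created by build_path.
    return build_path(save_dir, parts + [""])


def save_and_reload(save_dir, use_new):
    """Write a toy table with ';' as separator and read it back."""
    path = build_path(table_dir(save_dir, use_new), ["filtering_brca.csv"])
    toy = pd.DataFrame({"kmer": ["AAA", "BBB"], "value": [1, 2]})  # toy data
    toy.to_csv(path, header=True, sep=";")
    # The same separator must be passed when reading; index_col=0 restores the index.
    loaded = pd.read_csv(path, sep=";", index_col=0)
    return path, loaded


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for flag in (False, True):
            path, loaded = save_and_reload(tmp, flag)
            print(os.path.relpath(path, tmp), loaded.shape)
```

Because the file is written with sep=';', any later pd.read_csv call on it needs the same sep argument; with the default comma separator each row would be read as a single column.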