## Table of Contents
- [Import libraries](#1)
- [Process tables](#2)

<a name='1'></a>
## Import libraries

This cell sets up the environment for data analysis and visualization. It imports pandas for data structures, numpy for numerical operations, tqdm for progress bars, glob and os for file paths and operating system interaction, tarfile for reading archives, re for regular expressions, matplotlib and seaborn for plotting, and matplotlib_venn for Venn diagrams.

The cell also appends a custom directory to `sys.path` so that the project's own modules can be imported. These modules are pulled in with wildcard imports (`from config import *` and `from functions import *`), so names used later in the notebook, such as `table_processing`, `create_path`, `SAVE_DIR`, and `OHSU_BRCA_NEW`, are not defined in the notebook itself and presumably come from these modules.
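
Wildcard imports make it hard to see where a name comes from. A minimal sketch of an alternative, assuming a hypothetical module directory and module (`demo_func_dir`, `demo_config`, and `SAVE_DIR_DEMO` are made up for illustration and are not the project's real names): add the directory to `sys.path` once and import the module by name.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for the project's custom module directory.
demo_func_dir = Path(tempfile.mkdtemp())
(demo_func_dir / "demo_config.py").write_text("SAVE_DIR_DEMO = 'results'\n")

# Add the directory to the import path only once.
if str(demo_func_dir) not in sys.path:
    sys.path.append(str(demo_func_dir))

# Make sure the import system notices the freshly written file.
importlib.invalidate_caches()

# Import the module by name instead of `from demo_config import *`,
# so every use site shows where the constant comes from.
demo_config = importlib.import_module("demo_config")
print(demo_config.SAVE_DIR_DEMO)
```

With this style, a reader who meets `demo_config.SAVE_DIR_DEMO` later in the notebook can tell immediately which module defines it.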


In [2]:
# %load /cluster/home/myurchikova/github/projects2020_ohsu/eth/learning_Master_thesis/TASKS/func/base_imports.py
import pandas as pd
import numpy as np
import tqdm 
import glob
import os
import matplotlib
import matplotlib.pyplot as plt
import seaborn as sns
import tarfile
import re
from matplotlib_venn import venn2, venn2_circles, venn2_unweighted
from matplotlib_venn import venn3, venn3_circles
import sys
sys.path.append(r"/cluster/home/myurchikova/github/projects2020_ohsu/eth/learning_Master_thesis/TASKS/func")
from config import *
from functions import *




<a name='2'></a>
## Process tables

- Configuration and setup: the cell defines the input locations, namely the directory with the ETH filtering results (`filter_dir`) and the OHSU tar archive (`tar_file_OHSU`). It also sets the colors `OHSU_COLOR`, `ETH_COLOR`, and `OHSU_ETH_COLOR`, presumably for later Venn diagrams, together with a language switch (`LANG`), a logging flag (`LOG`), and the list of target samples (`sample_target`).
- Data loading and pairing: ETH file names are collected with glob, and OHSU member names are read from the tar archive without extracting it. Each ETH file is paired with the OHSU member whose name contains the ETH base name with the `G_` prefix and `.gz` suffix removed.
- Sample filtering: a regular expression extracts the sample ID from each ETH file name, and only the pairs that belong to the current target sample are processed.
- Table processing: both files of a pair are read as tab-separated tables. The OHSU table is converted to ETH coordinates with `table_processing.ohsu_to_eth_coord`, and junction coordinates are derived for both tables.
- Filter parsing: the filter settings are decoded from the OHSU file name into a foreground and a background description. A priority is assigned from the last background value: 0 for 'Any', 1 for '10', 2 for '2', and None otherwise.
- Comparison: for both k-mers and junction coordinates, the cell records the size of each set, the size of the intersection, and the sizes of the two set differences, along with the elements themselves.
- Error handling and logging: the try-except block around the per-file processing is commented out, so any read or parsing error stops the run. When `LOG` is True, the OHSU and ETH k-mer sets are printed.
- Output: the results for each sample are converted to a DataFrame and concatenated into `out_df_filtered`, which a later cell writes to a CSV file.
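
The comparison step boils down to plain set arithmetic. A minimal sketch, assuming made-up k-mers and a hypothetical helper `compare_sets` (neither comes from the project's modules):

```python
def compare_sets(ohsu, eth):
    """Return overlap sizes for two collections of k-mers or coordinates."""
    ohsu, eth = set(ohsu), set(eth)
    return {
        "size_ohsu": len(ohsu),
        "size_eth": len(eth),
        "size_intersection": len(ohsu & eth),  # found by both pipelines
        "size_ohsu_only": len(ohsu - eth),     # found only by OHSU
        "size_eth_only": len(eth - ohsu),      # found only by ETH
    }


# Made-up k-mers for illustration only.
toy_ohsu = ["AAAAAAAAA", "CCCCCCCCC", "GGGGGGGGG"]
toy_eth = ["CCCCCCCCC", "GGGGGGGGG", "TTTTTTTTT", "ACGTACGTA"]

result = compare_sets(toy_ohsu, toy_eth)
print(result)
```

The last three values are the region sizes that a two-set Venn diagram displays.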

In [3]:
TEXT_SIZE = 65

In [4]:
filter_dir ='/cluster/work/grlab/projects/projects2020_OHSU/peptides_generation/CANCER_eth/commit_c4dd02c_conf2_Frame_cap0_runs/TCGA_Ovarian_374/filtering_samples/filters_19May_order_5ge_wAnnot_GPstar'
             #/cluster/work/grlab/projects/projects2020_OHSU/peptides_generation/CANCER_eth/commit_c4dd02c_conf2_Frame_cap0_runs/TCGA_Breast_1102/filtering_samples/filters_19May_order_5ge_wAnnot_GPstar
tar_file_OHSU = '/cluster/work/grlab/projects/projects2020_OHSU/share_OHUS_PNLL/June28_renamed_kmerfiles_OHSU.tar.gz'#'/cluster/work/grlab/projects/projects2020_OHSU/share_OHUS_PNLL/OHSU_June2023_filter-debug_all_output.tar.gz'
pwd = '/cluster/home/myurchikova/github/projects2020_ohsu/eth/learning_Master_thesis/'

OHSU_COLOR = 'red'
ETH_COLOR = 'green'
OHSU_ETH_COLOR = 'yellow'
LANG = 'ENG'
LOG = False
out_df_filtered=pd.DataFrame()
sample_target=['TCGA25131901A01R156513',
            'TCGA25131301A01R156513',
            'TCGA61200801A02R156813',
            'TCGA24143101A01R156613',
            'TCGA24229801A01R156913',]
# sample_target=['TCGA-C8-A12P-01A-11R-A115-07']
                
if LANG == 'ENG':
    title_venn = ' {sample}'
else:
    title_venn = '{sample}'

# ETH Names
eth_all = glob.glob(os.path.join(filter_dir, 'G*'))

# OHSU Names
with tarfile.open(tar_file_OHSU, "r:*") as tar:
    ohsu_all = tar.getnames()

# Pair each ETH file with the OHSU archive member whose name contains its base name
file_pair = {}
for eth in eth_all:
    pattern = os.path.basename(eth).replace('G_', '').replace('.gz', '')
    for ohsu in ohsu_all:
        if pattern in ohsu:
            file_pair[eth] = ohsu  # if several members match, the last one wins

restricts = sample_target
for restrict in restricts:

    # Raw strings keep the backslash in the column names without invalid-escape warnings
    df = {'sample': [],
          'filter_foreground': [],
          'filter_background': [],
          'filter': [],
          # k-mer level comparison
          'size_ohsu': [],
          'size_eth': [],
          'size_intersection': [],
          r'size_ohsu\eth': [],
          r'size_eth\ohsu': [],
          r'eth_kmers\inter': [],
          r'ohsu_kmers\inter': [],
          # junction-coordinate level comparison
          'coord_OHSU': [],
          'coord_ETH': [],
          'size_ohsu_coor': [],
          'size_eth_coor': [],
          'size_intersection_coor': [],
          r'size_ohsu\eth_coor': [],
          r'size_eth\ohsu_coor': [],
          r'eth_coor\inter_coor': [],
          r'ohsu_coor\inter_coor': [],
          'inter_coor': [],
          r'eth_coor\ohsu_coor': [],
          r'ohsu_coor\eth_coor': [],
          'priority': [],
         }
    with tarfile.open(tar_file_OHSU, "r:*") as tar: #OHSU
        for eth, ohsu in file_pair.items(): # ETH
            if (not restrict) or restrict == re.findall(r'G_([\s\S]+?)_', eth)[0].replace('-', ''):  # restrict to the sample of interest
                # try:
                    df_ohsu = pd.read_csv(tar.extractfile(ohsu), sep="\t")
                    df_ohsu.reset_index(inplace=True)
                    if not df_ohsu.empty: df_ohsu = table_processing.ohsu_to_eth_coord(df_ohsu)
                    if not df_ohsu.empty: df_ohsu['junction_coordinate'] = df_ohsu['jx_shifted'].apply(lambda x: ':'.join(x.split(';')[1:3]))
                   
                    df_eth = pd.read_csv(eth, sep="\t")
                    if not df_eth.empty: df_eth=table_processing.get_junction_coordinates(df_eth,'coord')
                    df1=df_eth
                    df2=df_ohsu
                    # Each coordinate set falls back to an empty set when its own table is empty
                    df_eth_coor = set(df1['junction_coordinate']) if not df1.empty else set()
                    df_ohsu_coor = set(df2['junction_coordinate']) if not df2.empty else set()
                    df['coord_ETH'].append(df1['junction_coordinate'] if not df1.empty else 'None' )
                    df['coord_OHSU'].append(df2['junction_coordinate'] if not df2.empty else 'None')
                    df_eth = set(df_eth['kmer'])
                    df_ohsu = set(df_ohsu['index'])
                    name = os.path.basename(ohsu).replace('.tsv', '').split('_')
                    print(restrict,name)
                    df['sample'].append(name[1].replace('-',''))
                    if not OHSU_BRCA_NEW:
                        df['filter_foreground'].append(name[2])
                        df['filter_background'].append(name[3])
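                        # NOTE: name[3][1] is a single character, so the comparisons
                        # with 'Any', 10 and 2 below can never match as written.
                        if name[3][1] == 'Any':
                            priority=0
                        elif name[3][1] == 10:
                            priority=1
                        elif name[3][1] == 2:
                            priority=2
                        else:
                            priority = None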
                        df['priority'].append(priority)
                        df['filter'].append(name[2]+' '+name[3])
                    else:
                        df['filter'].append(name[2])
                        a = []
                        for i in range(5):
                            if name[2][i] == 'A':
                                a.append('Any')
                            elif name[2][i] == 'X':
                                a.append('10')
                            elif name[2][i] == 'N':
                                a.append('None')
                            else:
                                a.append(name[2][i])
                        df['filter_foreground'].append(f'({a[0]}, {a[1]}, {a[2]})')
                        df['filter_background'].append(f'({a[3]}, {a[4]})')
                        if a[4] == 'Any':
                            priority=0
                        elif a[4] == '10':
                            priority=1
                        elif a[4] == '2':
                            priority=2
                        else:
                            priority = None
                        df['priority'].append(priority)
                        print(a)  # `a` only exists in this branch
                    print(priority)
                    # k-mer level comparison
                    df['size_ohsu'].append(len(df_ohsu))
                    df['size_eth'].append(len(df_eth))
                    df[r'size_ohsu\eth'].append(len(df_ohsu_filter:=df_ohsu.difference(df_eth)))
                    df[r'size_eth\ohsu'].append(len(df_eth_filter:=df_eth.difference(df_ohsu)))
                    df['size_intersection'].append(len(df_inter_filter:=df_ohsu & df_eth))
                    df[r'eth_kmers\inter'].append(df_eth_witout_inter:=list(df_eth_filter.difference(df_inter_filter)))
                    df[r'ohsu_kmers\inter'].append(df_ohsu_witout_inter:=list(df_ohsu_filter.difference(df_inter_filter)))

                    # junction-coordinate level comparison
                    df['size_ohsu_coor'].append(len(df_ohsu_coor))
                    df['size_eth_coor'].append(len(df_eth_coor))
                    df[r'size_ohsu\eth_coor'].append(len(df_ohsu_filter_coor:=df_ohsu_coor.difference(df_eth_coor)))
                    df[r'size_eth\ohsu_coor'].append(len(df_eth_filter_coor:=df_eth_coor.difference(df_ohsu_coor)))
                    df['size_intersection_coor'].append(len(df_inter_filter_coor:=df_ohsu_coor & df_eth_coor))
                    df[r'eth_coor\inter_coor'].append(df_eth_witout_inter_coor:=list(df_eth_filter_coor.difference(df_inter_filter_coor)))
                    df[r'ohsu_coor\inter_coor'].append(df_ohsu_witout_inter_coor:=list(df_ohsu_filter_coor.difference(df_inter_filter_coor)))
                    df[r'eth_coor\ohsu_coor'].append(df_eth_witout_inter_coor:=list(df_eth_filter_coor.difference(df_ohsu_filter_coor)))
                    df[r'ohsu_coor\eth_coor'].append(df_ohsu_witout_inter_coor:=list(df_ohsu_filter_coor.difference(df_eth_filter_coor)))

                    df['inter_coor'].append(list(df_inter_filter_coor))
                    if LOG:
                        print("\n\nOHSU\n\n")
                        print(df_ohsu)
                        print("\n\nETH\n\n")
                        print(df_eth)

                # except Exception as e:
                #     print("Error",e)
                #     continue
    df = pd.DataFrame(df)
    if not out_df_filtered.empty:
        out_df_filtered = pd.concat([out_df_filtered, df])
    else:
        out_df_filtered = df

56807it [00:08, 6793.87it/s]


TCGA25131901A01R156513 ['J', 'TCGA-25-1319-01A-01R-1565-13', '0A501GA']
['0', 'Any', '5', '0', '1']
None


106206it [00:15, 6903.75it/s]


TCGA25131901A01R156513 ['J', 'TCGA-25-1319-01A-01R-1565-13', '0A13XGA']
['0', 'Any', '1', '3', '10']
1


73042it [00:10, 6880.24it/s]


TCGA25131901A01R156513 ['J', 'TCGA-25-1319-01A-01R-1565-13', '0AN1XGA']
['0', 'Any', 'None', '1', '10']
1


88048it [00:12, 6918.64it/s]


TCGA25131901A01R156513 ['J', 'TCGA-25-1319-01A-01R-1565-13', '0AN32GA']
['0', 'Any', 'None', '3', '2']
2


47656it [00:07, 6772.77it/s]


TCGA25131901A01R156513 ['J', 'TCGA-25-1319-01A-01R-1565-13', '0253AGA']
['0', '2', '5', '3', 'Any']
0


58810it [00:08, 6799.65it/s]


TCGA25131901A01R156513 ['J', 'TCGA-25-1319-01A-01R-1565-13', '0A51AGA']
['0', 'Any', '5', '1', 'Any']
0


54874it [00:08, 6830.13it/s]


TCGA25131901A01R156513 ['J', 'TCGA-25-1319-01A-01R-1565-13', '02101GA']
['0', '2', '1', '0', '1']
None


27839it [00:04, 6837.33it/s]


TCGA25131901A01R156513 ['J', 'TCGA-25-1319-01A-01R-1565-13', '02501GA']
['0', '2', '5', '0', '1']
None


28765it [00:04, 6657.39it/s]


TCGA25131901A01R156513 ['J', 'TCGA-25-1319-01A-01R-1565-13', '0251XGA']
['0', '2', '5', '1', '10']
1


72916it [00:10, 6844.00it/s]


TCGA25131901A01R156513 ['J', 'TCGA-25-1319-01A-01R-1565-13', '0AN12GA']
['0', 'Any', 'None', '1', '2']
2


85136it [00:12, 6659.07it/s]


TCGA25131901A01R156513 ['J', 'TCGA-25-1319-01A-01R-1565-13', '0A132GA']
['0', 'Any', '1', '3', '2']
2


92397it [00:13, 6747.32it/s]


TCGA25131901A01R156513 ['J', 'TCGA-25-1319-01A-01R-1565-13', '0A53XGA']
['0', 'Any', '5', '3', '10']
1


70340it [00:10, 6863.13it/s]


TCGA25131901A01R156513 ['J', 'TCGA-25-1319-01A-01R-1565-13', '0A11AGA']
['0', 'Any', '1', '1', 'Any']
0


109314it [00:16, 6831.91it/s]


TCGA25131901A01R156513 ['J', 'TCGA-25-1319-01A-01R-1565-13', '0AN3XGA']
['0', 'Any', 'None', '3', '10']
1


28765it [00:04, 6883.87it/s]


TCGA25131901A01R156513 ['J', 'TCGA-25-1319-01A-01R-1565-13', '0251AGA']
['0', '2', '5', '1', 'Any']
0


35176it [00:05, 6837.84it/s]


TCGA25131901A01R156513 ['J', 'TCGA-25-1319-01A-01R-1565-13', '02532GA']
['0', '2', '5', '3', '2']
2


109444it [00:16, 6810.58it/s]


TCGA25131901A01R156513 ['J', 'TCGA-25-1319-01A-01R-1565-13', '0A13AGA']
['0', 'Any', '1', '3', 'Any']
0


86917it [00:12, 6815.92it/s]


TCGA25131901A01R156513 ['J', 'TCGA-25-1319-01A-01R-1565-13', '0213XGA']
['0', '2', '1', '3', '10']
1


45856it [00:06, 6578.81it/s]


TCGA25131901A01R156513 ['J', 'TCGA-25-1319-01A-01R-1565-13', '0253XGA']
['0', '2', '5', '3', '10']
1


56681it [00:08, 6577.02it/s]


TCGA25131901A01R156513 ['J', 'TCGA-25-1319-01A-01R-1565-13', '0211XGA']
['0', '2', '1', '1', '10']
1


70214it [00:10, 6652.17it/s]


TCGA25131901A01R156513 ['J', 'TCGA-25-1319-01A-01R-1565-13', '0A112GA']
['0', 'Any', '1', '1', '2']
2


95310it [00:14, 6776.13it/s]


TCGA25131901A01R156513 ['J', 'TCGA-25-1319-01A-01R-1565-13', '0A53AGA']
['0', 'Any', '5', '3', 'Any']
0


68867it [00:10, 6637.43it/s]


TCGA25131901A01R156513 ['J', 'TCGA-25-1319-01A-01R-1565-13', '02132GA']
['0', '2', '1', '3', '2']
2


68164it [00:10, 6754.10it/s]


TCGA25131901A01R156513 ['J', 'TCGA-25-1319-01A-01R-1565-13', '0A101GA']
['0', 'Any', '1', '0', '1']
None


73042it [00:10, 6920.53it/s]


TCGA25131901A01R156513 ['J', 'TCGA-25-1319-01A-01R-1565-13', '0AN1AGA']
['0', 'Any', 'None', '1', 'Any']
0


70859it [00:10, 6840.58it/s]


TCGA25131901A01R156513 ['J', 'TCGA-25-1319-01A-01R-1565-13', '0AN01GA']
['0', 'Any', 'None', '0', '1']
None


112573it [00:16, 6859.15it/s]


TCGA25131901A01R156513 ['J', 'TCGA-25-1319-01A-01R-1565-13', '0AN3AGA']
['0', 'Any', 'None', '3', 'Any']
0


58691it [00:08, 6766.46it/s]


TCGA25131901A01R156513 ['J', 'TCGA-25-1319-01A-01R-1565-13', '0A512GA']
['0', 'Any', '5', '1', '2']
2


56681it [00:08, 6696.93it/s]


TCGA25131901A01R156513 ['J', 'TCGA-25-1319-01A-01R-1565-13', '0211AGA']
['0', '2', '1', '1', 'Any']
0


56580it [00:08, 6837.31it/s]


TCGA25131901A01R156513 ['J', 'TCGA-25-1319-01A-01R-1565-13', '02112GA']
['0', '2', '1', '1', '2']
2


28717it [00:04, 6547.17it/s]


TCGA25131901A01R156513 ['J', 'TCGA-25-1319-01A-01R-1565-13', '02512GA']
['0', '2', '5', '1', '2']
2


58810it [00:09, 6326.88it/s]


TCGA25131901A01R156513 ['J', 'TCGA-25-1319-01A-01R-1565-13', '0A51XGA']
['0', 'Any', '5', '1', '10']
1


89689it [00:13, 6636.97it/s]


TCGA25131901A01R156513 ['J', 'TCGA-25-1319-01A-01R-1565-13', '0213AGA']
['0', '2', '1', '3', 'Any']
0


72436it [00:11, 6473.95it/s]


TCGA25131901A01R156513 ['J', 'TCGA-25-1319-01A-01R-1565-13', '0A532GA']
['0', 'Any', '5', '3', '2']
2


70340it [00:10, 6751.73it/s]


TCGA25131901A01R156513 ['J', 'TCGA-25-1319-01A-01R-1565-13', '0A11XGA']
['0', 'Any', '1', '1', '10']
1


42233it [00:06, 6569.44it/s]


TCGA25131301A01R156513 ['J', 'TCGA-25-1313-01A-01R-1565-13', '0AN1XGA']
['0', 'Any', 'None', '1', '10']
1


42233it [00:06, 6590.14it/s]


TCGA25131301A01R156513 ['J', 'TCGA-25-1313-01A-01R-1565-13', '0AN1AGA']
['0', 'Any', 'None', '1', 'Any']
0


14978it [00:02, 5875.88it/s]


TCGA25131301A01R156513 ['J', 'TCGA-25-1313-01A-01R-1565-13', '02501GA']
['0', '2', '5', '0', '1']
None


47353it [00:09, 4956.73it/s]


TCGA25131301A01R156513 ['J', 'TCGA-25-1313-01A-01R-1565-13', '0A53XGA']
['0', 'Any', '5', '3', '10']
1


57519it [00:17, 3379.64it/s]


TCGA25131301A01R156513 ['J', 'TCGA-25-1313-01A-01R-1565-13', '0A13AGA']
['0', 'Any', '1', '3', 'Any']
0


45541it [00:11, 4079.69it/s]


TCGA25131301A01R156513 ['J', 'TCGA-25-1313-01A-01R-1565-13', '0213XGA']
['0', '2', '1', '3', '10']
1


35945it [00:05, 6713.92it/s]


TCGA25131301A01R156513 ['J', 'TCGA-25-1313-01A-01R-1565-13', '02132GA']
['0', '2', '1', '3', '2']
2


29582it [00:04, 6849.40it/s]


TCGA25131301A01R156513 ['J', 'TCGA-25-1313-01A-01R-1565-13', '02112GA']
['0', '2', '1', '1', '2']
2


30291it [00:04, 6683.65it/s]


TCGA25131301A01R156513 ['J', 'TCGA-25-1313-01A-01R-1565-13', '0A512GA']
['0', 'Any', '5', '1', '2']
2


30326it [00:04, 6746.64it/s]


TCGA25131301A01R156513 ['J', 'TCGA-25-1313-01A-01R-1565-13', '0A51XGA']
['0', 'Any', '5', '1', '10']
1


46993it [00:06, 6908.37it/s]


TCGA25131301A01R156513 ['J', 'TCGA-25-1313-01A-01R-1565-13', '0213AGA']
['0', '2', '1', '3', 'Any']
0


29607it [00:04, 6956.65it/s]


TCGA25131301A01R156513 ['J', 'TCGA-25-1313-01A-01R-1565-13', '0211XGA']
['0', '2', '1', '1', '10']
1


29607it [00:08, 3326.90it/s]


TCGA25131301A01R156513 ['J', 'TCGA-25-1313-01A-01R-1565-13', '0211AGA']
['0', '2', '1', '1', 'Any']
0


37449it [00:10, 3413.88it/s]


TCGA25131301A01R156513 ['J', 'TCGA-25-1313-01A-01R-1565-13', '0A112GA']
['0', 'Any', '1', '1', '2']
2


49919it [00:07, 6857.94it/s]


TCGA25131301A01R156513 ['J', 'TCGA-25-1313-01A-01R-1565-13', '0AN32GA']
['0', 'Any', 'None', '3', '2']
2


48784it [00:07, 6888.05it/s]


TCGA25131301A01R156513 ['J', 'TCGA-25-1313-01A-01R-1565-13', '0A53AGA']
['0', 'Any', '5', '3', 'Any']
0


30326it [00:04, 6722.09it/s]


TCGA25131301A01R156513 ['J', 'TCGA-25-1313-01A-01R-1565-13', '0A51AGA']
['0', 'Any', '5', '1', 'Any']
0


37484it [00:05, 6777.26it/s]


TCGA25131301A01R156513 ['J', 'TCGA-25-1313-01A-01R-1565-13', '0A11XGA']
['0', 'Any', '1', '1', '10']
1


42192it [00:06, 6772.21it/s]


TCGA25131301A01R156513 ['J', 'TCGA-25-1313-01A-01R-1565-13', '0AN12GA']
['0', 'Any', 'None', '1', '2']
2


15483it [00:02, 6863.19it/s]


TCGA25131301A01R156513 ['J', 'TCGA-25-1313-01A-01R-1565-13', '02512GA']
['0', '2', '5', '1', '2']
2


28775it [00:04, 6809.29it/s]


TCGA25131301A01R156513 ['J', 'TCGA-25-1313-01A-01R-1565-13', '02101GA']
['0', '2', '1', '0', '1']
None


15494it [00:02, 6832.91it/s]


TCGA25131301A01R156513 ['J', 'TCGA-25-1313-01A-01R-1565-13', '0251XGA']
['0', '2', '5', '1', '10']
1


29408it [00:04, 6761.08it/s]


TCGA25131301A01R156513 ['J', 'TCGA-25-1313-01A-01R-1565-13', '0A501GA']
['0', 'Any', '5', '0', '1']
None


37484it [00:05, 6766.21it/s]


TCGA25131301A01R156513 ['J', 'TCGA-25-1313-01A-01R-1565-13', '0A11AGA']
['0', 'Any', '1', '1', 'Any']
0


55922it [00:08, 6911.85it/s]


TCGA25131301A01R156513 ['J', 'TCGA-25-1313-01A-01R-1565-13', '0A13XGA']
['0', 'Any', '1', '3', '10']
1


37104it [00:05, 6911.40it/s]


TCGA25131301A01R156513 ['J', 'TCGA-25-1313-01A-01R-1565-13', '0A532GA']
['0', 'Any', '5', '3', '2']
2


15494it [00:02, 6823.56it/s]


TCGA25131301A01R156513 ['J', 'TCGA-25-1313-01A-01R-1565-13', '0251AGA']
['0', '2', '5', '1', 'Any']
0


19577it [00:02, 6875.81it/s]


TCGA25131301A01R156513 ['J', 'TCGA-25-1313-01A-01R-1565-13', '02532GA']
['0', '2', '5', '3', '2']
2


44879it [00:06, 6847.09it/s]


TCGA25131301A01R156513 ['J', 'TCGA-25-1313-01A-01R-1565-13', '0A132GA']
['0', 'Any', '1', '3', '2']
2


26157it [00:03, 6853.13it/s]


TCGA25131301A01R156513 ['J', 'TCGA-25-1313-01A-01R-1565-13', '0253XGA']
['0', '2', '5', '3', '10']
1


27058it [00:03, 6868.53it/s]


TCGA25131301A01R156513 ['J', 'TCGA-25-1313-01A-01R-1565-13', '0253AGA']
['0', '2', '5', '3', 'Any']
0


63099it [00:09, 6895.65it/s]


TCGA25131301A01R156513 ['J', 'TCGA-25-1313-01A-01R-1565-13', '0AN3AGA']
['0', 'Any', 'None', '3', 'Any']
0


36525it [00:05, 6816.12it/s]


TCGA25131301A01R156513 ['J', 'TCGA-25-1313-01A-01R-1565-13', '0A101GA']
['0', 'Any', '1', '0', '1']
None


41224it [00:05, 6898.48it/s]


TCGA25131301A01R156513 ['J', 'TCGA-25-1313-01A-01R-1565-13', '0AN01GA']
['0', 'Any', 'None', '0', '1']
None


61487it [00:09, 6827.81it/s]


TCGA25131301A01R156513 ['J', 'TCGA-25-1313-01A-01R-1565-13', '0AN3XGA']
['0', 'Any', 'None', '3', '10']
1


148334it [00:21, 6768.44it/s]


TCGA61200801A02R156813 ['J', 'TCGA-61-2008-01A-02R-1568-13', '0AN3AGA']
['0', 'Any', 'None', '3', 'Any']
0


87533it [00:12, 6777.81it/s]


TCGA61200801A02R156813 ['J', 'TCGA-61-2008-01A-02R-1568-13', '0A11AGA']
['0', 'Any', '1', '1', 'Any']
0


69153it [00:10, 6867.02it/s]


TCGA61200801A02R156813 ['J', 'TCGA-61-2008-01A-02R-1568-13', '0211AGA']
['0', '2', '1', '1', 'Any']
0


114871it [00:16, 6845.34it/s]


TCGA61200801A02R156813 ['J', 'TCGA-61-2008-01A-02R-1568-13', '0AN32GA']
['0', 'Any', 'None', '3', '2']
2


94099it [00:13, 6800.99it/s]


TCGA61200801A02R156813 ['J', 'TCGA-61-2008-01A-02R-1568-13', '0AN12GA']
['0', 'Any', 'None', '1', '2']
2


107646it [00:15, 6876.44it/s]


TCGA61200801A02R156813 ['J', 'TCGA-61-2008-01A-02R-1568-13', '0A132GA']
['0', 'Any', '1', '3', '2']
2


87272it [00:12, 6772.71it/s]


TCGA61200801A02R156813 ['J', 'TCGA-61-2008-01A-02R-1568-13', '0A112GA']
['0', 'Any', '1', '1', '2']
2


70467it [00:10, 6938.73it/s]


TCGA61200801A02R156813 ['J', 'TCGA-61-2008-01A-02R-1568-13', '0A51AGA']
['0', 'Any', '5', '1', 'Any']
0


33356it [00:04, 6758.48it/s]


TCGA61200801A02R156813 ['J', 'TCGA-61-2008-01A-02R-1568-13', '02512GA']
['0', '2', '5', '1', '2']
2


94360it [00:13, 6787.19it/s]


TCGA61200801A02R156813 ['J', 'TCGA-61-2008-01A-02R-1568-13', '0AN1XGA']
['0', 'Any', 'None', '1', '10']
1


115268it [00:16, 6866.68it/s]


TCGA61200801A02R156813 ['J', 'TCGA-61-2008-01A-02R-1568-13', '0A53XGA']
['0', 'Any', '5', '3', '10']
1


94360it [00:15, 6021.70it/s]


TCGA61200801A02R156813 ['J', 'TCGA-61-2008-01A-02R-1568-13', '0AN1AGA']
['0', 'Any', 'None', '1', 'Any']
0


58352it [00:08, 6562.90it/s]


TCGA61200801A02R156813 ['J', 'TCGA-61-2008-01A-02R-1568-13', '0253AGA']
['0', '2', '5', '3', 'Any']
0


70230it [00:10, 6669.64it/s]


TCGA61200801A02R156813 ['J', 'TCGA-61-2008-01A-02R-1568-13', '0A512GA']
['0', 'Any', '5', '1', '2']
2


140698it [00:20, 6780.84it/s]


TCGA61200801A02R156813 ['J', 'TCGA-61-2008-01A-02R-1568-13', '0A13AGA']
['0', 'Any', '1', '3', 'Any']
0


135718it [00:20, 6734.39it/s]


TCGA61200801A02R156813 ['J', 'TCGA-61-2008-01A-02R-1568-13', '0A13XGA']
['0', 'Any', '1', '3', '10']
1


91623it [00:13, 6872.55it/s]


TCGA61200801A02R156813 ['J', 'TCGA-61-2008-01A-02R-1568-13', '0AN01GA']
['0', 'Any', 'None', '0', '1']
None


108856it [00:15, 6862.03it/s]


TCGA61200801A02R156813 ['J', 'TCGA-61-2008-01A-02R-1568-13', '0213XGA']
['0', '2', '1', '3', '10']
1


87533it [00:12, 6803.94it/s]


TCGA61200801A02R156813 ['J', 'TCGA-61-2008-01A-02R-1568-13', '0A11XGA']
['0', 'Any', '1', '1', '10']
1


113185it [00:16, 6813.44it/s]


TCGA61200801A02R156813 ['J', 'TCGA-61-2008-01A-02R-1568-13', '0213AGA']
['0', '2', '1', '3', 'Any']
0


33424it [00:04, 6808.11it/s]


TCGA61200801A02R156813 ['J', 'TCGA-61-2008-01A-02R-1568-13', '0251AGA']
['0', '2', '5', '1', 'Any']
0


69002it [00:10, 6793.86it/s]


TCGA61200801A02R156813 ['J', 'TCGA-61-2008-01A-02R-1568-13', '02112GA']
['0', '2', '1', '1', '2']
2


69153it [00:10, 6798.30it/s]


TCGA61200801A02R156813 ['J', 'TCGA-61-2008-01A-02R-1568-13', '0211XGA']
['0', '2', '1', '1', '10']
1


67070it [00:09, 6927.55it/s]


TCGA61200801A02R156813 ['J', 'TCGA-61-2008-01A-02R-1568-13', '02101GA']
['0', '2', '1', '0', '1']
None


143297it [00:20, 6826.11it/s]


TCGA61200801A02R156813 ['J', 'TCGA-61-2008-01A-02R-1568-13', '0AN3XGA']
['0', 'Any', 'None', '3', '10']
1


88779it [00:13, 6790.70it/s]


TCGA61200801A02R156813 ['J', 'TCGA-61-2008-01A-02R-1568-13', '0A532GA']
['0', 'Any', '5', '3', '2']
2


33424it [00:04, 6795.40it/s]


TCGA61200801A02R156813 ['J', 'TCGA-61-2008-01A-02R-1568-13', '0251XGA']
['0', '2', '5', '1', '10']
1


68039it [00:09, 6807.32it/s]


TCGA61200801A02R156813 ['J', 'TCGA-61-2008-01A-02R-1568-13', '0A501GA']
['0', 'Any', '5', '0', '1']
None


70467it [00:10, 6797.21it/s]


TCGA61200801A02R156813 ['J', 'TCGA-61-2008-01A-02R-1568-13', '0A51XGA']
['0', 'Any', '5', '1', '10']
1


32270it [00:04, 6859.50it/s]


TCGA61200801A02R156813 ['J', 'TCGA-61-2008-01A-02R-1568-13', '02501GA']
['0', '2', '5', '0', '1']
None


42132it [00:10, 4031.43it/s]


TCGA61200801A02R156813 ['J', 'TCGA-61-2008-01A-02R-1568-13', '02532GA']
['0', '2', '5', '3', '2']
2


85236it [00:18, 4705.58it/s]


TCGA61200801A02R156813 ['J', 'TCGA-61-2008-01A-02R-1568-13', '02132GA']
['0', '2', '1', '3', '2']
2


55747it [00:08, 6886.86it/s]


TCGA61200801A02R156813 ['J', 'TCGA-61-2008-01A-02R-1568-13', '0253XGA']
['0', '2', '5', '3', '10']
1


119866it [00:34, 3437.46it/s]


TCGA61200801A02R156813 ['J', 'TCGA-61-2008-01A-02R-1568-13', '0A53AGA']
['0', 'Any', '5', '3', 'Any']
0


84819it [00:12, 6864.79it/s]


TCGA61200801A02R156813 ['J', 'TCGA-61-2008-01A-02R-1568-13', '0A101GA']
['0', 'Any', '1', '0', '1']
None


40564it [00:12, 3323.17it/s]


TCGA24143101A01R156613 ['J', 'TCGA-24-1431-01A-01R-1566-13', '0AN1XGA']
['0', 'Any', 'None', '1', '10']
1


41243it [00:12, 3186.40it/s]


TCGA24143101A01R156613 ['J', 'TCGA-24-1431-01A-01R-1566-13', '0A532GA']
['0', 'Any', '5', '3', '2']
2


33852it [00:07, 4258.56it/s]


TCGA24143101A01R156613 ['J', 'TCGA-24-1431-01A-01R-1566-13', '0A51XGA']
['0', 'Any', '5', '1', '10']
1


53216it [00:07, 6771.74it/s]


TCGA24143101A01R156613 ['J', 'TCGA-24-1431-01A-01R-1566-13', '0A53XGA']
['0', 'Any', '5', '3', '10']
1


32658it [00:04, 6858.34it/s]


TCGA24143101A01R156613 ['J', 'TCGA-24-1431-01A-01R-1566-13', '0A501GA']
['0', 'Any', '5', '0', '1']
None


50531it [00:07, 6821.88it/s]


TCGA24143101A01R156613 ['J', 'TCGA-24-1431-01A-01R-1566-13', '0213XGA']
['0', '2', '1', '3', '10']
1


32315it [00:04, 6809.75it/s]


TCGA24143101A01R156613 ['J', 'TCGA-24-1431-01A-01R-1566-13', '0211XGA']
['0', '2', '1', '1', '10']
1


39415it [00:05, 6905.20it/s]


TCGA24143101A01R156613 ['J', 'TCGA-24-1431-01A-01R-1566-13', '0A11AGA']
['0', 'Any', '1', '1', 'Any']
0


60229it [00:08, 6843.63it/s]


TCGA24143101A01R156613 ['J', 'TCGA-24-1431-01A-01R-1566-13', '0A13XGA']
['0', 'Any', '1', '3', '10']
1


52401it [00:07, 6854.49it/s]


TCGA24143101A01R156613 ['J', 'TCGA-24-1431-01A-01R-1566-13', '0213AGA']
['0', '2', '1', '3', 'Any']
0


22013it [00:03, 6745.93it/s]


TCGA24143101A01R156613 ['J', 'TCGA-24-1431-01A-01R-1566-13', '02532GA']
['0', '2', '5', '3', '2']
2


33852it [00:04, 6862.83it/s]


TCGA24143101A01R156613 ['J', 'TCGA-24-1431-01A-01R-1566-13', '0A51AGA']
['0', 'Any', '5', '1', 'Any']
0


40455it [00:05, 6973.48it/s]


TCGA24143101A01R156613 ['J', 'TCGA-24-1431-01A-01R-1566-13', '0AN12GA']
['0', 'Any', 'None', '1', '2']
2


17722it [00:02, 6921.57it/s]


TCGA24143101A01R156613 ['J', 'TCGA-24-1431-01A-01R-1566-13', '0251XGA']
['0', '2', '5', '1', '10']
1


40564it [00:05, 6915.01it/s]


TCGA24143101A01R156613 ['J', 'TCGA-24-1431-01A-01R-1566-13', '0AN1AGA']
['0', 'Any', 'None', '1', 'Any']
0


39443it [00:05, 6917.25it/s]


TCGA24143101A01R156613 ['J', 'TCGA-24-1431-01A-01R-1566-13', '02132GA']
['0', '2', '1', '3', '2']
2


62406it [00:09, 6847.52it/s]


TCGA24143101A01R156613 ['J', 'TCGA-24-1431-01A-01R-1566-13', '0A13AGA']
['0', 'Any', '1', '3', 'Any']
0


28876it [00:04, 6878.64it/s]


TCGA24143101A01R156613 ['J', 'TCGA-24-1431-01A-01R-1566-13', '0253XGA']
['0', '2', '5', '3', '10']
1


55173it [00:08, 6882.49it/s]


TCGA24143101A01R156613 ['J', 'TCGA-24-1431-01A-01R-1566-13', '0A53AGA']
['0', 'Any', '5', '3', 'Any']
0


31194it [00:04, 6769.51it/s]


TCGA24143101A01R156613 ['J', 'TCGA-24-1431-01A-01R-1566-13', '02101GA']
['0', '2', '1', '0', '1']
None


17722it [00:02, 6821.58it/s]


TCGA24143101A01R156613 ['J', 'TCGA-24-1431-01A-01R-1566-13', '0251AGA']
['0', '2', '5', '1', 'Any']
0


17050it [00:02, 6837.59it/s]


TCGA24143101A01R156613 ['J', 'TCGA-24-1431-01A-01R-1566-13', '02501GA']
['0', '2', '5', '0', '1']
None


47525it [00:06, 6911.55it/s]


TCGA24143101A01R156613 ['J', 'TCGA-24-1431-01A-01R-1566-13', '0A132GA']
['0', 'Any', '1', '3', '2']
2


32212it [00:05, 6332.23it/s]


TCGA24143101A01R156613 ['J', 'TCGA-24-1431-01A-01R-1566-13', '02112GA']
['0', '2', '1', '1', '2']
2


33743it [00:10, 3319.44it/s]


TCGA24143101A01R156613 ['J', 'TCGA-24-1431-01A-01R-1566-13', '0A512GA']
['0', 'Any', '5', '1', '2']
2


61473it [00:18, 3252.64it/s]


TCGA24143101A01R156613 ['J', 'TCGA-24-1431-01A-01R-1566-13', '0AN3XGA']
['0', 'Any', 'None', '3', '10']
1


63664it [00:18, 3358.79it/s]


TCGA24143101A01R156613 ['J', 'TCGA-24-1431-01A-01R-1566-13', '0AN3AGA']
['0', 'Any', 'None', '3', 'Any']
0


32315it [00:04, 6677.85it/s]


TCGA24143101A01R156613 ['J', 'TCGA-24-1431-01A-01R-1566-13', '0211AGA']
['0', '2', '1', '1', 'Any']
0


39264it [00:05, 6792.42it/s]


TCGA24143101A01R156613 ['J', 'TCGA-24-1431-01A-01R-1566-13', '0AN01GA']
['0', 'Any', 'None', '0', '1']
None


39306it [00:05, 6757.67it/s]


TCGA24143101A01R156613 ['J', 'TCGA-24-1431-01A-01R-1566-13', '0A112GA']
['0', 'Any', '1', '1', '2']
2


48688it [00:07, 6715.81it/s]


TCGA24143101A01R156613 ['J', 'TCGA-24-1431-01A-01R-1566-13', '0AN32GA']
['0', 'Any', 'None', '3', '2']
2


38115it [00:05, 6768.68it/s]


TCGA24143101A01R156613 ['J', 'TCGA-24-1431-01A-01R-1566-13', '0A101GA']
['0', 'Any', '1', '0', '1']
None


30115it [00:04, 6749.76it/s]


TCGA24143101A01R156613 ['J', 'TCGA-24-1431-01A-01R-1566-13', '0253AGA']
['0', '2', '5', '3', 'Any']
0


17661it [00:02, 6782.06it/s]


TCGA24143101A01R156613 ['J', 'TCGA-24-1431-01A-01R-1566-13', '02512GA']
['0', '2', '5', '1', '2']
2


39415it [00:05, 6738.90it/s]


TCGA24143101A01R156613 ['J', 'TCGA-24-1431-01A-01R-1566-13', '0A11XGA']
['0', 'Any', '1', '1', '10']
1


74974it [00:10, 6864.13it/s]


TCGA24229801A01R156913 ['J', 'TCGA-24-2298-01A-01R-1569-13', '0A501GA']
['0', 'Any', '5', '0', '1']
None


93349it [00:13, 6817.48it/s]


TCGA24229801A01R156913 ['J', 'TCGA-24-2298-01A-01R-1569-13', '0A101GA']
['0', 'Any', '1', '0', '1']
None


75369it [00:11, 6725.83it/s]


TCGA24229801A01R156913 ['J', 'TCGA-24-2298-01A-01R-1569-13', '02112GA']
['0', '2', '1', '1', '2']
2


164923it [00:24, 6730.00it/s]


TCGA24229801A01R156913 ['J', 'TCGA-24-2298-01A-01R-1569-13', '0AN3AGA']
['0', 'Any', 'None', '3', 'Any']
0


73147it [00:10, 6762.93it/s]


TCGA24229801A01R156913 ['J', 'TCGA-24-2298-01A-01R-1569-13', '02101GA']
['0', '2', '1', '0', '1']
None


36821it [00:05, 6676.51it/s]


TCGA24229801A01R156913 ['J', 'TCGA-24-2298-01A-01R-1569-13', '02512GA']
['0', '2', '5', '1', '2']
2


61051it [00:09, 6637.05it/s]


TCGA24229801A01R156913 ['J', 'TCGA-24-2298-01A-01R-1569-13', '0253XGA']
['0', '2', '5', '3', '10']
1


45249it [00:06, 6715.80it/s]


TCGA24229801A01R156913 ['J', 'TCGA-24-2298-01A-01R-1569-13', '02532GA']
['0', '2', '5', '3', '2']
2


77523it [00:11, 6695.14it/s]


TCGA24229801A01R156913 ['J', 'TCGA-24-2298-01A-01R-1569-13', '0A51XGA']
['0', 'Any', '5', '1', '10']
1


35692it [00:05, 6720.78it/s]


TCGA24229801A01R156913 ['J', 'TCGA-24-2298-01A-01R-1569-13', '02501GA']
['0', '2', '5', '0', '1']
None


125692it [00:18, 6771.59it/s]


TCGA24229801A01R156913 ['J', 'TCGA-24-2298-01A-01R-1569-13', '0AN32GA']
['0', 'Any', 'None', '3', '2']
2


36926it [00:05, 6735.52it/s]


TCGA24229801A01R156913 ['J', 'TCGA-24-2298-01A-01R-1569-13', '0251XGA']
['0', '2', '5', '1', '10']
1


95209it [00:14, 6740.00it/s]


TCGA24229801A01R156913 ['J', 'TCGA-24-2298-01A-01R-1569-13', '0A532GA']
['0', 'Any', '5', '3', '2']
2


125591it [00:18, 6686.72it/s]


TCGA24229801A01R156913 ['J', 'TCGA-24-2298-01A-01R-1569-13', '0A53XGA']
['0', 'Any', '5', '3', '10']
1


63694it [00:09, 6687.69it/s]


TCGA24229801A01R156913 ['J', 'TCGA-24-2298-01A-01R-1569-13', '0253AGA']
['0', '2', '5', '3', 'Any']
0


96064it [00:14, 6801.69it/s]


TCGA24229801A01R156913 ['J', 'TCGA-24-2298-01A-01R-1569-13', '0A112GA']
['0', 'Any', '1', '1', '2']
2


75554it [00:11, 6746.83it/s]


TCGA24229801A01R156913 ['J', 'TCGA-24-2298-01A-01R-1569-13', '0211AGA']
['0', '2', '1', '1', 'Any']
0


104864it [00:15, 6754.11it/s]


TCGA24229801A01R156913 ['J', 'TCGA-24-2298-01A-01R-1569-13', '0AN1AGA']
['0', 'Any', 'None', '1', 'Any']
0


129908it [00:19, 6743.92it/s]


TCGA24229801A01R156913 ['J', 'TCGA-24-2298-01A-01R-1569-13', '0A53AGA']
['0', 'Any', '5', '3', 'Any']
0


119022it [00:17, 6730.65it/s]


TCGA24229801A01R156913 ['J', 'TCGA-24-2298-01A-01R-1569-13', '0213XGA']
['0', '2', '1', '3', '10']
1


160092it [00:23, 6802.48it/s]


TCGA24229801A01R156913 ['J', 'TCGA-24-2298-01A-01R-1569-13', '0AN3XGA']
['0', 'Any', 'None', '3', '10']
1


104864it [00:15, 6706.57it/s]


TCGA24229801A01R156913 ['J', 'TCGA-24-2298-01A-01R-1569-13', '0AN1XGA']
['0', 'Any', 'None', '1', '10']
1


75554it [00:11, 6686.86it/s]


TCGA24229801A01R156913 ['J', 'TCGA-24-2298-01A-01R-1569-13', '0211XGA']
['0', '2', '1', '1', '10']
1


91520it [00:13, 6771.21it/s]


TCGA24229801A01R156913 ['J', 'TCGA-24-2298-01A-01R-1569-13', '02132GA']
['0', '2', '1', '3', '2']
2


36926it [00:05, 6797.47it/s]


TCGA24229801A01R156913 ['J', 'TCGA-24-2298-01A-01R-1569-13', '0251AGA']
['0', '2', '5', '1', 'Any']
0


149577it [00:22, 6757.71it/s]


TCGA24229801A01R156913 ['J', 'TCGA-24-2298-01A-01R-1569-13', '0A13XGA']
['0', 'Any', '1', '3', '10']
1


101878it [00:15, 6698.22it/s]


TCGA24229801A01R156913 ['J', 'TCGA-24-2298-01A-01R-1569-13', '0AN01GA']
['0', 'Any', 'None', '0', '1']
None


116393it [00:17, 6581.19it/s]


TCGA24229801A01R156913 ['J', 'TCGA-24-2298-01A-01R-1569-13', '0A132GA']
['0', 'Any', '1', '3', '2']
2


104637it [00:16, 6222.00it/s]


TCGA24229801A01R156913 ['J', 'TCGA-24-2298-01A-01R-1569-13', '0AN12GA']
['0', 'Any', 'None', '1', '2']
2


96286it [00:14, 6766.34it/s]


TCGA24229801A01R156913 ['J', 'TCGA-24-2298-01A-01R-1569-13', '0A11AGA']
['0', 'Any', '1', '1', 'Any']
0


154355it [00:22, 6785.87it/s]


TCGA24229801A01R156913 ['J', 'TCGA-24-2298-01A-01R-1569-13', '0A13AGA']
['0', 'Any', '1', '3', 'Any']
0


96286it [00:14, 6797.32it/s]


TCGA24229801A01R156913 ['J', 'TCGA-24-2298-01A-01R-1569-13', '0A11XGA']
['0', 'Any', '1', '1', '10']
1


122968it [00:18, 6811.91it/s]


TCGA24229801A01R156913 ['J', 'TCGA-24-2298-01A-01R-1569-13', '0213AGA']
['0', '2', '1', '3', 'Any']
0


77310it [00:11, 6764.55it/s]


TCGA24229801A01R156913 ['J', 'TCGA-24-2298-01A-01R-1569-13', '0A512GA']
['0', 'Any', '5', '1', '2']
2


77523it [00:11, 6721.41it/s]


TCGA24229801A01R156913 ['J', 'TCGA-24-2298-01A-01R-1569-13', '0A51AGA']
['0', 'Any', '5', '1', 'Any']
0
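
The printed lines above show how each filter code from the file name (for example `0A13XGA`) is decoded: the first five characters are translated one by one, the first three form the foreground filter, the last two form the background filter, and the priority follows from the last background value. The sketch below restates that logic as a small function; `decode_filter`, `CODE_MAP`, and `PRIORITY` are names introduced here for illustration and are not part of the project's modules.

```python
# Single-letter abbreviations used in the filter code.
CODE_MAP = {"A": "Any", "X": "10", "N": "None"}
# Priority derived from the last background value; anything else maps to None.
PRIORITY = {"Any": 0, "10": 1, "2": 2}


def decode_filter(code):
    """Decode the first five characters of a filter code such as '0A13XGA'."""
    values = [CODE_MAP.get(ch, ch) for ch in code[:5]]
    foreground = f"({values[0]}, {values[1]}, {values[2]})"
    background = f"({values[3]}, {values[4]})"
    priority = PRIORITY.get(values[4])
    return values, foreground, background, priority


values, foreground, background, priority = decode_filter("0A13XGA")
print(values, foreground, background, priority)
```

For `0A13XGA` this reproduces the values printed in the output above: `['0', 'Any', '1', '3', '10']` with priority 1.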


In [8]:
path_filtering=create_path.create_path(SAVE_DIR,[DIR_CSV,DIR_OVARIAN,NAME_TABLES,NAME_FILTERING_OVARIAN])
out_df_filtered.to_csv(path_filtering,header=True,sep=';')
