<a id='top'></a>

# Spreadsheet Transformations
1. [Transpose](#transpose)
2. [Common Samples](#common_samples)
3. [Merge](#merge)
4. [Select Genes](#select_genes)
5. [Cluster Averages](#cluster_averages)
6. [Select Categorical](#select_categorical)

In [1]:
# %%html
# <style>
# div.input {
#     display:none;
# }
# div.output_stderr{
#     display:none
# }
# </style>

In [2]:
#                                         single cell for select, open and transpose:
#                                         target directory set for docker run -v `pwd`:...   ==  mount user data
target_dir = '../../user_data'

import warnings
warnings.filterwarnings('ignore')

import os
import sys
import pandas as pd
import knpackage.toolbox as kn
import numpy as np

from IPython.display import display, HTML, clear_output
import ipywidgets as widgets
#                                         Get list of (docker run -v) mounted files:
flist = os.listdir(target_dir)
FEXT = ['.tsv', '.txt', '.df']
my_file_list = []
for f in flist:
    if os.path.isfile(os.path.join(target_dir, f)):
        noNeed, f_ext = os.path.splitext(f)
        if f_ext in FEXT:
            my_file_list.append(f)

#                                         No usable files were found in the (docker run -v) mounted directory:
if len(my_file_list) <= 0:
    my_file_list.append('No Data')

### Note: 
#### View a file:
* **Select your spreadsheet file from the "Select File" dropdown listbox below.**
* **Press the "Visualize" button and the spreadsheet (or the upper-left part if it is too large) will be displayed.**

In [3]:
#                                         local function to open and Visualize:
def visualize_selected_file(button):
    """Display the upper-left 10 x 10 corner of each selected spreadsheet file."""
    clear_output()
    try:
        if hasattr(button, 'fname_list'): 
            full_fname_list = button.fname_list
        else: 
            full_fname = os.path.join(target_dir, button.dropdown_box.value)
            full_fname_list = [full_fname]
        for full_fname in full_fname_list: 
            pheno_df = pd.read_csv(full_fname, sep='\t', header=0, index_col=0)
            # show only the upper-left 10 x 10 corner of large spreadsheets
            preview_df = pheno_df.iloc[0:10, 0:10]
            display(HTML(preview_df.to_html()))
        
    except OSError:
        print("No data to visualize! ")
        
def clear(change):
    clear_output()

<a id='transpose'></a>

## Transpose [[back to top]](#top) 

### Transpose a file:
* **Select your spreadsheet file from the "Select File" dropdown listbox below.**
* **Press the "Transpose" button. The transposed spreadsheet is written to the same directory as the input file, under the input file's name with "_T" appended (always with a ".tsv" extension).**
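
The transpose step and the output naming can be tried outside the widget with a small in-memory table. This is a minimal sketch, not the notebook's own cell: the spreadsheet contents and the `user_data/expression.txt` path are made up for illustration.

```python
import io
import os

import pandas as pd

# Hypothetical 2 x 3 tab-separated spreadsheet (rows = genes, columns = samples).
tsv_text = "\tS1\tS2\tS3\nG1\t1\t2\t3\nG2\t4\t5\t6\n"
spreadsheet_df = pd.read_csv(io.StringIO(tsv_text), sep="\t", index_col=0, header=0)

# Rows become columns and columns become rows.
transposed_df = spreadsheet_df.transpose()

# Output name: same directory and base name, with "_T.tsv" replacing the extension.
in_name = os.path.join("user_data", "expression.txt")  # hypothetical input path
name_base, _ = os.path.splitext(in_name)
out_name = name_base + "_T.tsv"

print(transposed_df.shape)  # (3, 2)
print(out_name)
```

Because the original extension is dropped before "_T.tsv" is added, a `.txt` or `.df` input also produces a `.tsv` output.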

In [4]:
#                                         local function to open and transpose:
def transpose_selected_file(button):
    if len(my_file_list) == 0 or my_file_list[0] == 'No Data':
        return
    
    file_name = os.path.join(target_dir, flistbx.value)
    spreadsheet_df = pd.read_csv(file_name, sep='\t', index_col=0, header=0)
    spreadsheet_df = spreadsheet_df.transpose()
    name_base, file_extension = os.path.splitext(file_name)
    outfile_name = name_base + '_T.tsv'
    spreadsheet_df.to_csv(outfile_name, sep='\t')
    print('Output written to\n', outfile_name)
    button.fname_list = [outfile_name]
    visualize_selected_file(button)


#                                         Create and display the widget controls:
flistbx = widgets.Dropdown(
    options=my_file_list,
    value=my_file_list[0],
    description='Select File:'
)
display(flistbx)

visualize_file_button = widgets.Button(
    description='Visualize',
    disabled=False,
    button_style='',
    tooltip='file to visualize',
    dropdown_box=flistbx
    )
visualize_file_button.on_click(visualize_selected_file)
display(visualize_file_button)

output_file_button = widgets.Button(
    description='Transpose',
    disabled=False,
    button_style='',
    tooltip='file to transpose',
    data_file_key='output_file_name'
    )
output_file_button.on_click(transpose_selected_file)
display(output_file_button)

<a id='common_samples'></a>

## Common Samples [[back to top]](#top)

### Reduce two spreadsheets to the samples they have in common:
* **Select your two samples x phenotypes spreadsheet files from the "Select File 1" and "Select File 2" dropdown listboxes below.**
* **Press the "Get Common Samples" button. The subset of each spreadsheet containing only the samples present in both is written to the same directory as the input files, under each input file's name with "_Com" appended.**
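
The common-samples operation amounts to intersecting the two row indexes and keeping only those rows in each table. The sketch below uses `pandas.Index.intersection` as a stand-in for the `knpackage` helpers called in the cell, whose exact behavior is not shown here; the two small tables are invented for illustration.

```python
import pandas as pd

# Two hypothetical samples x phenotypes spreadsheets (rows = samples).
sxp_1_df = pd.DataFrame({"stage": ["i", "iii", "i"]}, index=["S1", "S2", "S3"])
sxp_2_df = pd.DataFrame({"age": [67, 61, 73]}, index=["S2", "S3", "S4"])

# Samples present in both spreadsheets (pandas stand-in for the knpackage helpers).
common_samples = sxp_1_df.index.intersection(sxp_2_df.index)

# Keep only the common rows in each spreadsheet.
sxp_1_trim_df = sxp_1_df.loc[common_samples]
sxp_2_trim_df = sxp_2_df.loc[common_samples]

print(sorted(common_samples))
print(sxp_1_trim_df)
print(sxp_2_trim_df)
```

Both trimmed tables end up with the same row labels, so they can be compared or joined row by row afterwards.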

In [5]:
#                                         local function to read files and get common samples and write:
def get_common_samples(button):
    if len(my_file_list) == 0 or my_file_list[0] == 'No Data':
        return
    
    file_name_1 = os.path.join(target_dir, flistbx_2.value)
    file_name_2 = os.path.join(target_dir, flistbx_3.value)
    sxp_1_df = pd.read_csv(file_name_1, sep='\t', index_col=0, header=0)
    sxp_2_df = pd.read_csv(file_name_2, sep='\t', index_col=0, header=0)
    sxp_1_gene_names = kn.extract_spreadsheet_gene_names(sxp_1_df)
    sxp_2_gene_names = kn.extract_spreadsheet_gene_names(sxp_2_df)
    common_samples_list = kn.find_common_node_names(sxp_1_gene_names, sxp_2_gene_names)
    sxp_1_trim_df,sxp_2_trim_df = sxp_1_df.loc[common_samples_list], sxp_2_df.loc[common_samples_list]
    name_base_1, file_extension_1 = os.path.splitext(file_name_1)
    outfile_name_1 = name_base_1 + '_Com.tsv'
    name_base_2, file_extension_2 = os.path.splitext(file_name_2)
    outfile_name_2 = name_base_2 + '_Com.tsv'
    sxp_1_trim_df.to_csv(outfile_name_1, sep='\t', index=True, header=True)
    sxp_2_trim_df.to_csv(outfile_name_2, sep='\t', index=True, header=True)
    print('Outputs written to\n', outfile_name_1,'\nand\n',outfile_name_2)
    button.fname_list = [outfile_name_1, outfile_name_2]
    visualize_selected_file(button)


#                                         Create and display the widget controls:
flistbx_2 = widgets.Dropdown(
    options=my_file_list,
    value=my_file_list[0],
    description='Select File 1:'
)
display(flistbx_2)
flistbx_2.observe(clear, names='value')

visualize_file_button_2 = widgets.Button(
    description='Visualize',
    disabled=False,
    button_style='',
    tooltip='file to visualize',
    dropdown_box=flistbx_2
    )
display(visualize_file_button_2)
visualize_file_button_2.on_click(visualize_selected_file)


sample_id,ICDO3site,stage,stage_simple,stage_ismeta,grade,grade_simple,residual
TCGA-A5-A0G1,c54.1,stage ia,i,0,grade 3,High Grade,r0
TCGA-A5-A0G3,c54.1,stage iiic2,iii,0,grade 3,High Grade,r0
TCGA-A5-A0G5,c54.1,stage ib,i,0,grade 3,High Grade,r0
TCGA-A5-A0GA,c54.1,stage iiic2,iii,0,grade 3,High Grade,r0
TCGA-A5-A0GB,c54.1,stage ib,i,0,grade 3,High Grade,r0
TCGA-A5-A0GD,c54.1,stage ia,i,0,grade 2,"Low grade (1,2)",r0
TCGA-A5-A0GE,c54.1,stage ia,i,0,grade 2,"Low grade (1,2)",r0
TCGA-A5-A0GG,c54.1,stage ia,i,0,grade 1,"Low grade (1,2)",r0
TCGA-A5-A0GH,c54.1,stage ia,i,0,grade 3,High Grade,r0
TCGA-A5-A0GI,c54.1,stage ia,i,0,grade 2,"Low grade (1,2)",r0


In [6]:
flistbx_3 = widgets.Dropdown(
    options=my_file_list,
    value=my_file_list[0],
    description='Select File 2:'
)
display(flistbx_3)
flistbx_3.observe(clear, names='value')

visualize_file_button_3 = widgets.Button(
    description='Visualize',
    disabled=False,
    button_style='',
    tooltip='file to visualize',
    dropdown_box=flistbx_3
    )
display(visualize_file_button_3)
visualize_file_button_3.on_click(visualize_selected_file)


sample_id,ICDO3site,stage,stage_simple,stage_ismeta,grade,grade_simple,residual
TCGA-A5-A0G1,c54.1,stage ia,i,0,grade 3,High Grade,r0
TCGA-A5-A0G3,c54.1,stage iiic2,iii,0,grade 3,High Grade,r0
TCGA-A5-A0G5,c54.1,stage ib,i,0,grade 3,High Grade,r0
TCGA-A5-A0GA,c54.1,stage iiic2,iii,0,grade 3,High Grade,r0
TCGA-A5-A0GB,c54.1,stage ib,i,0,grade 3,High Grade,r0
TCGA-A5-A0GD,c54.1,stage ia,i,0,grade 2,"Low grade (1,2)",r0
TCGA-A5-A0GE,c54.1,stage ia,i,0,grade 2,"Low grade (1,2)",r0
TCGA-A5-A0GG,c54.1,stage ia,i,0,grade 1,"Low grade (1,2)",r0
TCGA-A5-A0GH,c54.1,stage ia,i,0,grade 3,High Grade,r0
TCGA-A5-A0GI,c54.1,stage ia,i,0,grade 2,"Low grade (1,2)",r0


In [7]:

output_file_button_2 = widgets.Button(
    description='Get Common Samples',
    disabled=False,
    button_style='',
    tooltip='get common samples button',
    data_file_key='output_file_name'
    )
display(output_file_button_2)
output_file_button_2.on_click(get_common_samples)


sample_id,days_to_death,days_to_last,days_survival,diag_age,race,ethnicity,gender
TCGA-A5-A0G1,3251.0,,3251,67,white,not hispanic or latino,female
TCGA-A5-A0G3,,1079.0,1079,61,black or african american,,female
TCGA-A5-A0G5,,790.0,790,73,black or african american,,female
TCGA-A5-A0GA,543.0,,543,67,white,not hispanic or latino,female
TCGA-A5-A0GB,,275.0,275,65,white,,female
TCGA-A5-A0GE,,2717.0,2717,38,asian,not hispanic or latino,female
TCGA-A5-A0GG,,2516.0,2516,76,black or african american,not hispanic or latino,female
TCGA-A5-A0GI,,1750.0,1750,63,white,not hispanic or latino,female
TCGA-A5-A0GJ,,1447.0,1447,44,white,not hispanic or latino,female
TCGA-A5-A0GN,,1477.0,1477,65,white,not hispanic or latino,female


sample_id,ICDO3site,stage,stage_simple,stage_ismeta,grade,grade_simple,residual
TCGA-A5-A0G1,c54.1,stage ia,i,0,grade 3,High Grade,r0
TCGA-A5-A0G3,c54.1,stage iiic2,iii,0,grade 3,High Grade,r0
TCGA-A5-A0G5,c54.1,stage ib,i,0,grade 3,High Grade,r0
TCGA-A5-A0GA,c54.1,stage iiic2,iii,0,grade 3,High Grade,r0
TCGA-A5-A0GB,c54.1,stage ib,i,0,grade 3,High Grade,r0
TCGA-A5-A0GE,c54.1,stage ia,i,0,grade 2,"Low grade (1,2)",r0
TCGA-A5-A0GG,c54.1,stage ia,i,0,grade 1,"Low grade (1,2)",r0
TCGA-A5-A0GI,c54.1,stage ia,i,0,grade 2,"Low grade (1,2)",r0
TCGA-A5-A0GJ,c54.1,stage ia,i,0,grade 2,"Low grade (1,2)",r0
TCGA-A5-A0GN,c54.1,stage ib,i,0,grade 1,"Low grade (1,2)",r0


<a id='merge'></a>

## Merge [[back to top]](#top)

### Combine two spreadsheets into one with all samples and phenotypes (missing values filled with NaN):
* **Select your two samples x phenotypes spreadsheet files from the "Select File 1" and "Select File 2" dropdown listboxes below.**
* **Press the "Merge" button. The union of the two spreadsheets is written to the same directory as the input files; the output name is the first input file's name followed by the second input file's name and "_Mrg".**
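
The NaN filling comes from concatenating along the column axis, which by default takes the union of the row labels. A minimal sketch with two invented tables:

```python
import pandas as pd

# Two hypothetical samples x phenotypes spreadsheets with partly different samples.
df_1 = pd.DataFrame({"stage": ["i", "iii"]}, index=["S1", "S2"])
df_2 = pd.DataFrame({"age": [61, 73]}, index=["S2", "S3"])

# axis=1 places the phenotype columns side by side; the default outer join
# keeps every sample and fills the cells that have no value with NaN.
merged_df = pd.concat([df_1, df_2], axis=1)

print(merged_df)
```

Note that `pd.concat` does not reconcile columns that share a name: if both inputs contain the same phenotype, the merged table carries two columns with that label.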

In [8]:
#                                         local function to read two files, merge them and write:
def merge(button):
    if len(my_file_list) == 0 or my_file_list[0] == 'No Data':
        return
    
    file_name_1 = os.path.join(target_dir, flistbx_4.value)
    file_name_2 = os.path.join(target_dir, flistbx_5.value)

    spreadsheet_1_df = pd.read_csv(file_name_1, sep='\t', index_col=0, header=0)
    spreadsheet_2_df = pd.read_csv(file_name_2, sep='\t', index_col=0, header=0)
    spreadsheet_1_samples = kn.extract_spreadsheet_gene_names(spreadsheet_1_df)
    spreadsheet_2_samples = kn.extract_spreadsheet_gene_names(spreadsheet_2_df)
    
    #all_samples_list = kn.find_unique_node_names(spreadsheet_1_samples, spreadsheet_2_samples)
    
    spreadsheet_1_phenotypes = list(spreadsheet_1_df.columns)
    spreadsheet_2_phenotypes = list(spreadsheet_2_df.columns)
    
    #all_phenotypes_list = kn.find_unique_node_names(spreadsheet_1_phenotypes, spreadsheet_2_phenotypes)
    
    spreadsheet_X_df = pd.concat([spreadsheet_1_df, spreadsheet_2_df], axis=1)
    name_base_1, file_extension_1 = os.path.splitext(file_name_1)
    name_base_2, file_extension_2 = os.path.splitext(file_name_2)
    # print(os.path.basename(name_base_2))
    # print(os.path.relpath(name_base_2,start=target_dir))
    outfile_name = name_base_1 + '_' + os.path.basename(name_base_2) + '_Mrg.tsv'
    spreadsheet_X_df.to_csv(outfile_name, sep='\t', index=True, header=True)
    print('Output written to\n', outfile_name)
    button.fname_list = [outfile_name]
    visualize_selected_file(button)

#                                         Create and display the widget controls:
flistbx_4 = widgets.Dropdown(
    options=my_file_list,
    value=my_file_list[0],
    description='Select File 1:'
)
display(flistbx_4)

visualize_file_button_4 = widgets.Button(
    description='Visualize',
    disabled=False,
    button_style='',
    tooltip='file to visualize',
    dropdown_box=flistbx_4
    )
visualize_file_button_4.on_click(visualize_selected_file)
display(visualize_file_button_4)


flistbx_5 = widgets.Dropdown(
    options=my_file_list,
    value=my_file_list[0],
    description='Select File 2:'
)
display(flistbx_5)

visualize_file_button_5 = widgets.Button(
    description='Visualize',
    disabled=False,
    button_style='',
    tooltip='file to visualize',
    dropdown_box=flistbx_5
    )
visualize_file_button_5.on_click(visualize_selected_file)
display(visualize_file_button_5)



output_file_button_3 = widgets.Button(
    description='Merge',
    disabled=False,
    button_style='',
    tooltip='merge button',
    data_file_key='output_file_name'
    )
output_file_button_3.on_click(merge)
display(output_file_button_3)

<a id='select_genes'></a>

## Select Genes [[back to top]](#top)

### Return one spreadsheet with only the genes named in an input list:
* **Select your genes x samples spreadsheet file and your gene list file from the "Select Spreadsheet File" and "Select Gene List File" dropdown listboxes below.**
* **Press the "Select Genes" button. The spreadsheet restricted to the listed genes is written to the same directory as the input files, under the input spreadsheet's name with "_Slct_Gn" appended.**
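
The selection can be sketched with plain pandas: split the gene list on whitespace, then keep the spreadsheet rows whose label appears in the list. The gene names and values below are invented, and `Index.isin` stands in for the `knpackage` helpers used in the cell, so the row order of the real output may differ.

```python
import pandas as pd

# Hypothetical gene list file contents: one gene name per line.
gene_list_text = "G3\nG1\nG9\n"
gene_select_list = gene_list_text.split()

# Hypothetical genes x samples spreadsheet.
spreadsheet_df = pd.DataFrame(
    {"S1": [1, 2, 3], "S2": [4, 5, 6]},
    index=["G1", "G2", "G3"],
)

# Boolean mask over the rows; listed genes that are absent from the
# spreadsheet (G9 here) are simply ignored.
keep = spreadsheet_df.index.isin(gene_select_list)
selected_df = spreadsheet_df.loc[keep]

print(selected_df)
```

Filtering with a mask keeps the spreadsheet's own row order rather than the order of the gene list.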

In [9]:
# utility
def read_a_list_file(input_file_name):
    """
    Args:
        input_file_name:     full path name of a file containing a list
    Returns:
        a list that is contained in the file
    """
    with open(input_file_name, 'r') as fh:
        str_input = fh.read()
    return list(str_input.split())

#                                         local function to read a spreadsheet and a gene list, select the listed genes and write:
def select_genes(button):
    if len(my_file_list) == 0 or my_file_list[0] == 'No Data':
        return
    
    file_name_1 = os.path.join(target_dir, flistbx_6.value)
    file_name_2 = os.path.join(target_dir, flistbx_7.value)

    gene_select_list = read_a_list_file(file_name_2)
    spreadsheet_df = pd.read_csv(file_name_1, sep='\t', index_col=0, header=0)
    gene_names = kn.extract_spreadsheet_gene_names(spreadsheet_df)
    intersection_names = kn.find_common_node_names(gene_names, gene_select_list)
    spreadsheet_intersected_df = spreadsheet_df.loc[intersection_names] 
    
    name_base_1, file_extension_1 = os.path.splitext(file_name_1)
    # print(os.path.basename(name_base_2))
    # print(os.path.relpath(name_base_2,start=target_dir))
    outfile_name = name_base_1 + '_Slct_Gn.tsv'
    spreadsheet_intersected_df.to_csv(outfile_name, sep='\t', index=True, header=True)
    print('Output written to\n', outfile_name)
    button.fname_list = [outfile_name]
    visualize_selected_file(button)


#                                         Create and display the widget controls:
flistbx_6 = widgets.Dropdown(
    options=my_file_list,
    value=my_file_list[0],
    description='Select Spreadsheet File:'
)
display(flistbx_6)

visualize_file_button_6 = widgets.Button(
    description='Visualize',
    disabled=False,
    button_style='',
    tooltip='file to visualize',
    dropdown_box=flistbx_6
    )
visualize_file_button_6.on_click(visualize_selected_file)
display(visualize_file_button_6)


flistbx_7 = widgets.Dropdown(
    options=my_file_list,
    value=my_file_list[0],
    description='Select Gene List File:'
)
display(flistbx_7)

visualize_file_button_7 = widgets.Button(
    description='Visualize',
    disabled=False,
    button_style='',
    tooltip='file to visualize',
    dropdown_box=flistbx_7
    )
visualize_file_button_7.on_click(visualize_selected_file)
display(visualize_file_button_7)


output_file_button_4 = widgets.Button(
    description='Select Genes',
    disabled=False,
    button_style='',
    tooltip='select genes button',
    data_file_key='output_file_name'
    )
output_file_button_4.on_click(select_genes)
display(output_file_button_4)

<a id='cluster_averages'></a>

## Cluster Averages [[back to top]](#top)

### Return a spreadsheet of averages for each category, given a genes x samples spreadsheet and a samples classification dictionary:
* **Select your genes x samples spreadsheet file and your samples classification dictionary file from the "Select Spreadsheet File" and "Select Dictionary File" dropdown listboxes below.**
* **Press the "Get Cluster Averages" button. The spreadsheet of averages for each category is written to the same directory as the input files, under the input spreadsheet's name with "_Clst_Avg" appended.**
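
One way to compute per-category averages is to group the sample columns by their cluster label. The sketch below is an alternative formulation, not the cell's own code: it aligns the labels to the spreadsheet's column order by sample name before grouping, and all names and values are invented.

```python
import pandas as pd

# Hypothetical genes x samples spreadsheet.
spreadsheet_df = pd.DataFrame(
    {"S1": [1.0, 2.0], "S2": [3.0, 4.0], "S3": [5.0, 6.0]},
    index=["G1", "G2"],
)

# Hypothetical samples classification: sample name -> cluster number.
labels = pd.Series({"S3": 1, "S1": 0, "S2": 0}, name="cluster_number")

# Align the labels to the spreadsheet's column order by sample name.
labels = labels.reindex(spreadsheet_df.columns)

# Transpose so samples are rows, average within each cluster, transpose back.
cluster_ave_df = spreadsheet_df.T.groupby(labels).mean().T

print(cluster_ave_df)
```

Matching labels to columns by name means the dictionary file does not have to list the samples in the same order as the spreadsheet.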

In [10]:
#                                         local function to read files, average the samples in each cluster and write:
def get_cluster_averages(button):
    if len(my_file_list) == 0 or my_file_list[0] == 'No Data':
        return
    file_name_1 = os.path.join(target_dir, flistbx_8.value)
    file_name_2 = os.path.join(target_dir, flistbx_9.value)

    spreadsheet_df = pd.read_csv(file_name_1, sep='\t', index_col=0, header=0)
    labels_df = pd.read_csv(file_name_2, sep='\t', index_col=0, names=['sample','cluster_number'])
    # print(spreadsheet_df)
    # print(labels_df)
    labels_dict = labels_df.to_dict()['cluster_number']
    cluster_numbers = list(np.unique(list(labels_dict.values())))
    # a NumPy array is needed so that "labels == i" is an element-wise boolean mask
    # (comparing a plain list to a scalar gives a single False)
    labels = np.array(list(labels_dict.values()))
    cluster_ave_df = pd.DataFrame({i: spreadsheet_df.iloc[:, labels == i].mean(axis=1) for i in cluster_numbers})
    name_base_1, file_extension_1 = os.path.splitext(file_name_1)
    # print(os.path.basename(name_base_2))
    # print(os.path.relpath(name_base_2,start=target_dir))
    outfile_name = name_base_1 + '_Clst_Avg.tsv'
    cluster_ave_df.to_csv(outfile_name, sep='\t', index=True, header=True)
    print('Output written to\n', outfile_name)
    button.fname_list = [outfile_name]
    visualize_selected_file(button)

#                                         Create and display the widget controls:
flistbx_8 = widgets.Dropdown(
    options=my_file_list,
    value=my_file_list[0],
    description='Select Spreadsheet File:'
)
display(flistbx_8)

visualize_file_button_8 = widgets.Button(
    description='Visualize',
    disabled=False,
    button_style='',
    tooltip='file to visualize',
    dropdown_box=flistbx_8
    )
visualize_file_button_8.on_click(visualize_selected_file)
display(visualize_file_button_8)


flistbx_9 = widgets.Dropdown(
    options=my_file_list,
    value=my_file_list[0],
    description='Select Dictionary File:'
)
display(flistbx_9)

visualize_file_button_9 = widgets.Button(
    description='Visualize',
    disabled=False,
    button_style='',
    tooltip='file to visualize',
    dropdown_box=flistbx_9
    )
visualize_file_button_9.on_click(visualize_selected_file)
display(visualize_file_button_9)


output_file_button_5 = widgets.Button(
    description='Get Cluster Averages',
    disabled=False,
    button_style='',
    tooltip='get cluster averages button',
    data_file_key='output_file_name'
    )
output_file_button_5.on_click(get_cluster_averages)
display(output_file_button_5)

<a id='select_categorical'></a>

## Select Categorical [[back to top]](#top)

### From a genes x samples spreadsheet and a samples x phenotypes spreadsheet, return both spreadsheets restricted to the samples that fall in one category of a phenotype:
* **Select your genes x samples spreadsheet and samples x phenotypes spreadsheet from the "Select G x S File" and "Select S x P File" dropdown listboxes below, then select the phenotype id and the category from the next 2 dropdown listboxes.**
* **Press the "Select Categorical" button. Both spreadsheets, restricted to the samples in the selected category, are written to the same directory as the input files, under each input file's name with "_Slct_Ctg" appended.**
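
The selection boils down to a boolean comparison on one phenotype column, followed by subsetting rows in one table and columns in the other. A minimal sketch with invented data; the extra check that each sample also exists in the genes x samples table is an assumption added here, not something the cell does.

```python
import pandas as pd

# Hypothetical samples x phenotypes and genes x samples spreadsheets.
phenotype_df = pd.DataFrame(
    {"grade": ["grade 3", "grade 2", "grade 3"]},
    index=["S1", "S2", "S3"],
)
spreadsheet_df = pd.DataFrame(
    {"S1": [1, 2], "S2": [3, 4], "S3": [5, 6]},
    index=["G1", "G2"],
)

phenotype_id = "grade"
select_category = "grade 3"

# Samples whose phenotype value equals the chosen category.
matching = phenotype_df.index[phenotype_df[phenotype_id] == select_category]

# Assumption: keep only samples that also appear as spreadsheet columns.
samples = [s for s in matching if s in spreadsheet_df.columns]

phenotype_category_df = phenotype_df.loc[samples]   # rows = selected samples
spreadsheet_category_df = spreadsheet_df[samples]   # columns = selected samples

print(phenotype_category_df)
print(spreadsheet_category_df)
```

A NaN category cannot be matched with `==`, because NaN never compares equal to itself; selecting samples with a missing phenotype value would need `phenotype_df[phenotype_id].isna()` instead.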

In [11]:
#                                         local function to read files and select categorical and write:
def select_categorical(button):
    if len(my_file_list) == 0 or my_file_list[0] == 'No Data':
        return
    
    file_name_1 = os.path.join(target_dir, flistbx_10.value)
    file_name_2 = os.path.join(target_dir, flistbx_11.value)
    spreadsheet_df = pd.read_csv(file_name_1, sep='\t', index_col=0, header=0)
    phenotype_df = pd.read_csv(file_name_2, sep='\t', index_col=0, header=0)
    phenotype_id = flistbx_12.value
    select_category = flistbx_13.value
    samples_list = phenotype_df.index[phenotype_df[phenotype_id] == select_category]
    #print(phenotype_df.index)
    # print(phenotype_df[phenotype_id][2])
    # print(samples_list)
    phenotype_category_df = phenotype_df.loc[samples_list]
    spreadsheet_category_df = spreadsheet_df[samples_list]
    
    name_base_1, file_extension_1 = os.path.splitext(file_name_1)
    outfile_name_1 = name_base_1 + '_Slct_Ctg.tsv'
    name_base_2, file_extension_2 = os.path.splitext(file_name_2)
    outfile_name_2 = name_base_2 + '_Slct_Ctg.tsv'
    spreadsheet_category_df.to_csv(outfile_name_1, sep='\t', index=True, header=True)
    phenotype_category_df.to_csv(outfile_name_2, sep='\t', index=True, header=True)
    print('Outputs written to\n', outfile_name_1,'\nand\n',outfile_name_2)
    button.fname_list = [outfile_name_1, outfile_name_2]
    visualize_selected_file(button)
    
def all_phenotypes(file_rel_path):
    """get all the phenotypes, i.e. column names, of the samples x phenotypes dataframe,
    which is read from the file_rel_path file"""
    try:
        phenotype_df = pd.read_csv(os.path.join(target_dir,file_rel_path), sep='\t', index_col=0, header=0)
        return list(phenotype_df.columns)
    except Exception:
        return ['No Data or Invalid File']


def nan_unique(x):
    """a wrapper of the numpy.unique function that handles the NaN problem, 
    since numpy.unique will return multiple NaN's"""
    a = np.unique(x)
    r = []
    has_nan = False
    for i in a:
        if isinstance(i,float) and np.isnan(i):
            if has_nan: 
                continue
            else:
                has_nan = True
                r.append(i)
        else:
            r.append(i)
    return np.array(r)

def all_categories(file_rel_path,phenotype_id):
    """get all the categories, i.e. the values, of a specific phenotype,
    in the dataframe read from file_rel_path file"""
    try:
        phenotype_df = pd.read_csv(os.path.join(target_dir,file_rel_path), sep='\t', index_col=0, header=0)
        # print(list(np.unique(phenotype_df[phenotype_id])))
        return list(nan_unique(phenotype_df[phenotype_id]))
    except Exception:
        return ['No Data or Invalid File']

#                                         Create and display the widget controls:
flistbx_10 = widgets.Dropdown(
    options=my_file_list,
    value=my_file_list[0],
    description='Select G x S File:'
)
display(flistbx_10)

visualize_file_button_10 = widgets.Button(
    description='Visualize',
    disabled=False,
    button_style='',
    tooltip='file to visualize',
    dropdown_box=flistbx_10
    )
visualize_file_button_10.on_click(visualize_selected_file)
display(visualize_file_button_10)


flistbx_11 = widgets.Dropdown(
    options=my_file_list,
    value=my_file_list[0],
    description='Select S x P File:'
)
display(flistbx_11)

visualize_file_button_11 = widgets.Button(
    description='Visualize',
    disabled=False,
    button_style='',
    tooltip='file to visualize',
    dropdown_box=flistbx_11
    )
visualize_file_button_11.on_click(visualize_selected_file)
display(visualize_file_button_11)


flistbx_12 = widgets.Dropdown(
    options=all_phenotypes(my_file_list[0]),
    value=all_phenotypes(my_file_list[0])[0],
    description='Select Phenotype Id:'
)

flistbx_13 = widgets.Dropdown(
    options=all_categories(my_file_list[0],flistbx_12.value),
    value=all_categories(my_file_list[0],flistbx_12.value)[0],
    description='Select Category:'
)

def handle_file_change(change):
    """the callback registered to handle changes in the 'value'
    attribute of widget 'flistbx_11'"""
    flistbx_12.options = all_phenotypes(change['new']) 
    flistbx_12.value = all_phenotypes(change['new'])[0]
    flistbx_13.options = all_categories(change['new'],flistbx_12.value)
    flistbx_13.value = all_categories(change['new'],flistbx_12.value)[0]

flistbx_11.observe(handle_file_change, names='value')


def handle_phenotype_change(change):
    """the callback registered to handle changes in the 'value'
    attribute of widget 'flistbx_12'"""
    flistbx_13.options = all_categories(flistbx_11.value,change['new'])
    flistbx_13.value = all_categories(flistbx_11.value,change['new'])[0]

flistbx_12.observe(handle_phenotype_change, names='value')

display(flistbx_12, flistbx_13)

output_file_button_6 = widgets.Button(
    description='Select Categorical',
    disabled=False,
    button_style='',
    tooltip='select categorical button',
    data_file_key='output_file_name'
    )
output_file_button_6.on_click(select_categorical)
display(output_file_button_6)

