<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Preparing-data-for-SmartTables" data-toc-modified-id="Preparing-data-for-SmartTables-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Preparing data for SmartTables</a></span><ul class="toc-item"><li><span><a href="#First,-we-read-each-Excel-sheet-into-a-separate-data-frame" data-toc-modified-id="First,-we-read-each-Excel-sheet-into-a-separate-data-frame-1.1"><span class="toc-item-num">1.1&nbsp;&nbsp;</span>First, we read each Excel sheet into a separate data frame</a></span></li><li><span><a href="#Next,-we-concatenate-the-$log_2FC$-and-$p_{adj}$-columns-from-each-dataframe-into-one-overall-table,-while-preserving-the-name-of-the-sheet-where-the-columns-came-from" data-toc-modified-id="Next,-we-concatenate-the-$log_2FC$-and-$p_{adj}$-columns-from-each-dataframe-into-one-overall-table,-while-preserving-the-name-of-the-sheet-where-the-columns-came-from-1.2"><span class="toc-item-num">1.2&nbsp;&nbsp;</span>Next, we concatenate the $log_2FC$ and $p_{adj}$ columns from each dataframe into one overall table, while preserving the name of the sheet where the columns came from</a></span></li><li><span><a href="#SmartTables-can't-handle-multi-index-headers,-so-we-flatten-them" data-toc-modified-id="SmartTables-can't-handle-multi-index-headers,-so-we-flatten-them-1.3"><span class="toc-item-num">1.3&nbsp;&nbsp;</span>SmartTables can't handle multi-index headers, so we flatten them</a></span></li><li><span><a href="#If-you-are-using--Binder,-download-the-data-file-to-your-Desktop" data-toc-modified-id="If-you-are-using--Binder,-download-the-data-file-to-your-Desktop-1.4"><span class="toc-item-num">1.4&nbsp;&nbsp;</span>If you are using  Binder, <a href="Data/all_data.tab" target="_blank">download the data file</a> to your Desktop</a></span></li></ul></li></ul></div>

# Preparing data for SmartTables

## First, we read each Excel sheet into a separate data frame

In [None]:
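import os
import pandas as pd

excel_file = 'Data4Deanna.xlsx'
sheets = ['oneA', 'thirty3B', 'fiveE', 'oneAA', 'fiveE1AA', 'Sp245']
tab_file = 'Data/{}.tab'

# Make sure the output directory exists before writing to it
os.makedirs('Data', exist_ok=True)

# Read each sheet into its own data frame, indexed by locus,
# and save a tab-delimited copy of each one
data = {}
for sheet_name in sheets:
    data[sheet_name] = pd.read_excel(excel_file,
                                     sheet_name=sheet_name,
                                     index_col='locus')
    data[sheet_name].to_csv(tab_file.format(sheet_name), sep='\t')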

In [None]:
data['oneA'].head()

In [None]:
!head Data/oneA.tab

## Next, we concatenate the $log_2FC$ and $p_{adj}$ columns from each dataframe into one overall table, while preserving the name of the sheet where the columns came from

In [None]:
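# Keep only the log2FoldChange and padj columns from each sheet and place
# them side by side; `keys` adds the sheet name as the top header level
all_data = pd.concat([data[sheet_name][['log2FoldChange', 'padj']]
                      for sheet_name in sheets],
                     axis=1,
                     keys=sheets,
                     names=['sheet_names', 'columns'])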
all_data.head()
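
To see what `keys` and `names` do without needing the Excel file, here is a minimal sketch on two made-up frames. The sheet names (`sheetA`, `sheetB`), locus IDs, and values are hypothetical stand-ins, not taken from the real data. Because the frames are aligned on their `locus` index, a locus missing from one sheet should simply show up as `NaN` in that sheet's columns.

```python
import pandas as pd

# Hypothetical toy frames standing in for two of the Excel sheets
toy = {
    'sheetA': pd.DataFrame({'log2FoldChange': [1.5, -0.2],
                            'padj': [0.01, 0.80]},
                           index=pd.Index(['gene1', 'gene2'], name='locus')),
    'sheetB': pd.DataFrame({'log2FoldChange': [0.3],
                            'padj': [0.04]},
                           index=pd.Index(['gene2'], name='locus')),
}

# Same call pattern as above: one two-level header, sheet name on top
toy_all = pd.concat([toy[name] for name in toy],
                    axis=1,
                    keys=list(toy),
                    names=['sheet_names', 'columns'])

# Select a single value by (sheet name, column name)
print(toy_all.loc['gene2', ('sheetB', 'padj')])
print(toy_all)
```

Selecting with a tuple such as `('sheetB', 'padj')` is what the two-level header buys you inside pandas, even though the header gets flattened before export.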

## SmartTables can't handle multi-index headers, so we flatten them 

In [None]:
# Join the two header levels into a single name, e.g. 'oneA_padj'
flat_headers = ['{}_{}'.format(sheet_name, column)
                for sheet_name, column in all_data.columns]

# Work on a copy so the multi-index version stays available
all_data_w_flat_headers = all_data.copy()
all_data_w_flat_headers.columns = flat_headers
all_data_w_flat_headers.to_csv('Data/all_data.tab', sep='\t')
all_data_w_flat_headers.head()
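
As a quick sanity check of the flattening step, the sketch below builds a small two-level header by hand, flattens it, and writes to an in-memory buffer instead of a file so the resulting header line can be inspected. The sheet names, locus ID, and values are hypothetical, and `'_'.join(...)` is used here as an equivalent shorthand for the `format` call.

```python
import io
import pandas as pd

# Hypothetical two-level header like the one produced by pd.concat(..., keys=...)
columns = pd.MultiIndex.from_product(
    [['sheetA', 'sheetB'], ['log2FoldChange', 'padj']],
    names=['sheet_names', 'columns'])
toy = pd.DataFrame([[1.5, 0.01, 0.3, 0.04]],
                   index=pd.Index(['gene1'], name='locus'),
                   columns=columns)

# Iterating over a MultiIndex yields one tuple per column,
# so each (sheet, column) pair can be joined into a single string
flat = toy.copy()
flat.columns = ['_'.join(pair) for pair in toy.columns]

# Write to an in-memory buffer rather than to disk
buffer = io.StringIO()
flat.to_csv(buffer, sep='\t')
header_line = buffer.getvalue().splitlines()[0]
print(header_line.split('\t'))
```

The file now has a single header row, which is the shape the section heading above says SmartTables needs.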

## If you are using  Binder, [download the data file](Data/all_data.tab) to your Desktop

