# Pandas XL Writer functions

### Writing Pandas projects to an Excel workbook
***
This notebook contains Python functions for writing an Excel project workbook whose sheets are one or more Pandas DataFrames. This is useful when a Python project builds up several related DataFrames whose data needs to be shared with a consulting client or other user. Usage is to call the XLWriterPrep function for each DataFrame that will later be included in the Excel workbook. The XLWriter function is then called at the end of the Python code to create the Excel workbook. Together, the two functions organize the creation of a formatted workbook whose columns have specified number formats and column widths, which control how the data appears in Excel. XLWriter uses [Pandas ExcelWriter](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.ExcelWriter.html) with the XlsxWriter engine. [Here is a useful ExcelWriter code example](https://xlsxwriter.readthedocs.io/example_pandas_column_formats.html) from the XlsxWriter documentation.

The XLWriterPrep function updates four lists used later by XLWriter. It appends the DataFrame to a list of DataFrames and its sheet name to a list of sheet names. Two lists of lists hold the formatting specifications for the columns on each sheet: XLWriterPrep appends a list of blank Excel number formats (empty strings) to one and a list of blank column widths (zeros) to the other. Each of these inner lists has one entry per index level plus one per DataFrame column, because the index is written to the leftmost worksheet column(s). The entries can then be updated manually to specify the Excel number format and column width for each column.

J.D. Landgrebe,

October 25, 2019

In [39]:
import pandas as pd
import numpy as np

## XLWriter functions

In [40]:
#Initialize XLWriter lists
list_dfs = []
list_shts = []
list_fmts = []
list_colwidths = []

#Loads new item into lists of DataFrames, sheet names.  Initializes column formats
def XLWriterPrep(lst_dfs, lst_shts, lst_fmts, lst_colwidths, df, sht):
    lst_dfs.append(df)
    lst_shts.append(sht)
    #One blank entry per index level plus one per DataFrame column
    ncols = len(df.index.names) + len(df.columns)
    lst_fmts.append([''] * ncols)
    lst_colwidths.append([0] * ncols)
    return lst_dfs, lst_shts, lst_fmts, lst_colwidths

# Write list of DataFrames to Excel workbook as separate worksheets
def XLWriter(wkbk, lst_dfs, lst_shts, lst_fmts, lst_colwidths):
    writer = pd.ExcelWriter(wkbk, engine='xlsxwriter')
    workbook = writer.book
    #Write each DataFrame to its own worksheet
    for i in range(len(lst_dfs)):
        lst_dfs[i].to_excel(writer, sheet_name=lst_shts[i])
    
    #Add all unique formats to a dict
    dict_fmts = {}
    format = []
    k = 0
    for i in range(len(lst_fmts)):
        for j in range(len(lst_fmts[i])):
            curfmt = lst_fmts[i][j]
            if len(curfmt) > 0 and curfmt not in dict_fmts:
                dict_fmts[curfmt] = k #Save the index, k, as dictionary value for later
                format.append(workbook.add_format({'num_format': curfmt}))
                k += 1
    
    #Assign specified formats and column widths to each sheet
    for i in range(len(lst_shts)):
        
        #Get the XlsxWriter worksheet object for this sheet
        worksheet = writer.sheets[lst_shts[i]]
        
        #Assign any specified column widths and number formats.
        #List position j is worksheet column j + 1; position 0 (index column A) is skipped
        for j in range(1,len(lst_fmts[i])):
            colstr = XLColString(j + 1)
            colwidth = None
            fmt = None
            
            if lst_colwidths[i][j] > 0: colwidth = lst_colwidths[i][j]
            if len(lst_fmts[i][j]) > 0: fmt = lst_fmts[i][j]

            if fmt is not None:
                worksheet.set_column(colstr, colwidth, format[dict_fmts[fmt]])
            else:
                worksheet.set_column(colstr, colwidth, None)
                
    #Write the workbook and return
    writer.close() #close() writes the file; ExcelWriter.save() was removed in pandas 2.0
    return()

#Converts a 1-based column number, icol, into an Excel column range (Example: icol = 30 --> 'AD:AD')
def XLColString(icol):
    letters = ''
    while icol > 0:
        icol, rem = divmod(icol - 1, 26) #Bijective base 26: A=1 ... Z=26, AA=27
        letters = chr(ord('A') + rem) + letters
    return letters + ':' + letters
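
Excel column letters follow a bijective base-26 scheme: A to Z are columns 1 to 26, AA is column 27, AD is column 30, and so on. The sketch below shows the conversion in both directions so that the round trip can be checked. The helper names `col_letters` and `col_number` are hypothetical and are not part of the XLWriter functions.

```python
def col_letters(icol):
    """Return Excel column letters for a 1-based column number (hypothetical helper)."""
    if icol < 1:
        raise ValueError('column numbers start at 1')
    letters = ''
    while icol > 0:
        # Shift to 0-based before extracting each "digit", since there is no zero letter
        icol, rem = divmod(icol - 1, 26)
        letters = chr(ord('A') + rem) + letters
    return letters


def col_number(letters):
    """Inverse of col_letters, e.g. 'AD' gives 30 (hypothetical helper)."""
    n = 0
    for ch in letters.upper():
        n = n * 26 + (ord(ch) - ord('A') + 1)
    return n


# Print a few boundary cases: last single letter, first double letter, and so on
for icol in (1, 26, 27, 30, 702, 703):
    print(icol, col_letters(icol), col_number(col_letters(icol)))
```

A single repeated letter (for example 'DD' for column 30) is not a valid result under this scheme; only a digit-by-digit conversion covers columns beyond Z correctly.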

### Demo of XLWriter usage

In [41]:
df = pd.DataFrame([['Company A','Product A',27.46],
                   ['Company B','Product B',18.33],
                   ['Company C','Product C',14.0],
                   ['Company A','Product A',19.27], 
                   ['Company C','Product B',94.17],
                   ['Company B','Product B',18.13],
                   ['Company B','Product B',15.05],
                   ['Company C','Product B',19.25],
                   ['Company A','Product B',27.6]], 
                  columns=['Company','Product','Revenue'])
df

|   | Company | Product | Revenue |
|---|---|---|---|
| 0 | Company A | Product A | 27.46 |
| 1 | Company B | Product B | 18.33 |
| 2 | Company C | Product C | 14.0 |
| 3 | Company A | Product A | 19.27 |
| 4 | Company C | Product B | 94.17 |
| 5 | Company B | Product B | 18.13 |
| 6 | Company B | Product B | 15.05 |
| 7 | Company C | Product B | 19.25 |
| 8 | Company A | Product B | 27.6 |


In [42]:
#Add the DataFrame as a sheet for Excel output. First list position is index column A
XLWriterPrep(list_dfs,list_shts,list_fmts,list_colwidths,df,'Sales Data')
i = len(list_dfs)-1
list_fmts[i] = ['','@','@','$#,##0.00']
list_colwidths[i] = [0,12,12,10]
print(len(list_dfs),list_fmts, list_colwidths)

1 [['', '@', '@', '$#,##0.00']] [[0, 12, 12, 10]]
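
Because the format and width lists are overwritten by hand, it is easy to supply a list of the wrong length. XLWriter does not check this, so a mismatch can surface later as an IndexError or as silently unformatted columns. A minimal sketch of a pre-flight check follows. `check_specs` is a hypothetical helper, not part of the XLWriter functions, and it takes the expected column count as a plain integer so that it runs without pandas. For a DataFrame `df`, that count would be `len(df.index.names) + len(df.columns)`.

```python
def check_specs(sheet, n_expected, fmts, colwidths):
    """Raise ValueError if a sheet's format or width list has the wrong length.

    Hypothetical helper for illustration. n_expected is the number of
    worksheet columns: index levels plus DataFrame columns.
    """
    for label, spec in (('formats', fmts), ('column widths', colwidths)):
        if len(spec) != n_expected:
            raise ValueError(
                f"{sheet}: expected {n_expected} {label}, got {len(spec)}")


# The 'Sales Data' sheet has 1 index level + 3 columns = 4 worksheet columns
check_specs('Sales Data', 4, ['', '@', '@', '$#,##0.00'], [0, 12, 12, 10])

# A width list that is one entry short is reported before any file is written
try:
    check_specs('Sales Data', 4, ['', '@', '@', '$#,##0.00'], [0, 12, 12])
except ValueError as err:
    message = str(err)
    print(message)
```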


In [43]:
df_summ = df.groupby(['Company','Product']).sum()
df_summ.reset_index(inplace=True)
df_summ

|   | Company | Product | Revenue |
|---|---|---|---|
| 0 | Company A | Product A | 46.73 |
| 1 | Company A | Product B | 27.6 |
| 2 | Company B | Product B | 51.51 |
| 3 | Company C | Product B | 113.42 |
| 4 | Company C | Product C | 14.0 |


In [44]:
#Add the DataFrame as a sheet for Excel output. First list position is index column A
XLWriterPrep(list_dfs,list_shts,list_fmts,list_colwidths,df_summ,'Sales Summary')
i = len(list_dfs)-1
list_fmts[i] = ['','@','@','$#,##0']
list_colwidths[i] = [0,12,12,10]
print(len(list_dfs),list_fmts, list_colwidths)

2 [['', '@', '@', '$#,##0.00'], ['', '@', '@', '$#,##0']] [[0, 12, 12, 10], [0, 12, 12, 10]]
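
With both sheets registered, the format lists contain three distinct non-empty number formats. XLWriter creates one workbook format per distinct string and reuses it wherever that string appears. The sketch below reproduces only that bookkeeping with plain Python. `unique_formats` is a hypothetical helper name, and no workbook format objects are created here.

```python
def unique_formats(lst_fmts):
    """Map each distinct non-empty format string to an index, in first-seen order.

    Hypothetical helper for illustration; it mirrors the dict-building loop
    in XLWriter without touching a workbook.
    """
    dict_fmts = {}
    for sheet_fmts in lst_fmts:
        for curfmt in sheet_fmts:
            if curfmt and curfmt not in dict_fmts:
                dict_fmts[curfmt] = len(dict_fmts)
    return dict_fmts


# Format lists for the two sheets registered above
demo_fmts = [['', '@', '@', '$#,##0.00'], ['', '@', '@', '$#,##0']]
format_index = unique_formats(demo_fmts)
print(format_index)
```

The text format '@' appears four times across the two sheets but is assigned a single index, so only one workbook format would be needed for it.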


In [45]:
#XL writer - create an Excel workbook with the DataFrames on sheets
XLWriter('AllData.xlsx',list_dfs,list_shts,list_fmts,list_colwidths)

()