# Example: Pandas to Word

## Using: mspandas.pandasDOC

This document shows how to create a Word report from Python using the `pandasDOC` module from the [mspandas](https://github.com/knanne/mspandas) library.

**NOTE:** The `pandasDOC` module is designed as an addition to the existing `python-docx` library, to help automate writing Pandas DataFrames to Word documents. Please familiarize yourself with how [python-docx](https://python-docx.readthedocs.io/en/latest/) works first.

# Contents

- [Important Info](#Important-Info)
    - [Background](#Background)
    - [Instructions](#Instructions)
- [Dependencies](#Dependencies)
- [Dummy Data](#Dummy-Data)
- [Create Presentation](#Create-Presentation)
- [Add Content](#Add-Content)
- [Save Report](#Save-Report)
- [Preview](#Preview)

# Important Info

## Background

To create a sample report, we initialize a `docx.Document`, which builds the necessary file for us. In this case, a template file is not necessary. However, you may wish to create one with preset formatting based on your brand's style guide.

## Instructions

The process to create the Word report in Python is as follows:

- create a new doc object, either by initializing an empty document or by loading a template
- repeatedly add content to the doc using any of the following:
    - add a paragraph to the document
    - add a table to the document with `mspandas.pandasDOC.create_table()`
    - add an image to the document (e.g. a chart exported from matplotlib)
- save the doc object to file

**Note that this library currently only supports the creation of tables in Word.** This means that, in order to add visualizations, you will need to create your own charts in Python and add them to the Word document as static images.

# Dependencies

To install mspandas, refer to the [project's homepage](https://github.com/knanne/mspandas/#Installation).

You will also need the following libraries installed:  

[Python DOCX](https://python-docx.readthedocs.io/en/latest/). Install via `pip install python-docx`  

[Pandas](http://pandas.pydata.org/). Install via `pip install pandas`  

[Numpy](http://www.numpy.org/). Install via `pip install numpy`  

In [1]:
import sys
import os

# path to a local clone of mspandas (adjust for your machine)
mspandas_path = 'C:/Users/Kain/OneDrive/projects/mspandas/'

# add mspandas to the import path
sys.path.append(os.path.abspath(mspandas_path))

In [2]:
import mspandas
from mspandas.pandasDOC import Table
from mspandas.utils.doc import docFunctions
from mspandas.utils.pd import pdFunctions

In [3]:
import docx
import pandas as pd
import numpy as np

# Dummy Data


For the purpose of this demo, let's create a Pandas DataFrame of dummy data with some more complex structures: MultiIndex columns and a MultiIndex row index that contains a time series.

In [4]:
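# pd.Timestamp.now() is used because pd.datetime is no longer available in recent pandas releases
# freq='Q' means quarter end; newer pandas releases prefer the alias 'QE'
df = pd.DataFrame(np.random.rand(6, 4),
                  columns=pd.MultiIndex.from_tuples([('Group1','A'),('Group1','B'),('Group2','C'),('Group2','D')]),
                  index=pd.MultiIndex.from_tuples([(d.strftime('%Y'),d) for d in pd.date_range(end=pd.Timestamp.now(), periods=6, freq='Q')]))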

df.columns = df.columns.set_names(['Group','Series'])
df.index = df.index.set_names(['Year','Quarter End'])

In [5]:
df

| Year | Quarter End | Group1 / A | Group1 / B | Group2 / C | Group2 / D |
| --- | --- | --- | --- | --- | --- |
| 2018 | 2018-09-30 12:28:45.380465 | 0.947519 | 0.725774 | 0.622965 | 0.606724 |
| 2018 | 2018-12-31 12:28:45.380465 | 0.064062 | 0.271584 | 0.661532 | 0.85513 |
| 2019 | 2019-03-31 12:28:45.380465 | 0.913462 | 0.076513 | 0.30185 | 0.354734 |
| 2019 | 2019-06-30 12:28:45.380465 | 0.710577 | 0.001451 | 0.187222 | 0.947276 |
| 2019 | 2019-09-30 12:28:45.380465 | 0.649732 | 0.28346 | 0.897122 | 0.843704 |
| 2019 | 2019-12-31 12:28:45.380465 | 0.948963 | 0.560266 | 0.836463 | 0.81231 |
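
To get a feel for the two MultiIndex axes before handing the frame to a table writer, it can help to slice it by level. The sketch below uses a small seeded stand-in frame (`demo`, with fixed dates, both assumptions made so the example is repeatable) that mirrors the structure of `df` above.

```python
import numpy as np
import pandas as pd

# Seeded stand-in with the same column and index levels as the demo frame.
rng = np.random.default_rng(0)
dates = pd.to_datetime(['2018-09-30', '2018-12-31', '2019-03-31', '2019-06-30'])
demo = pd.DataFrame(
    rng.random((4, 4)),
    columns=pd.MultiIndex.from_tuples(
        [('Group1', 'A'), ('Group1', 'B'), ('Group2', 'C'), ('Group2', 'D')],
        names=['Group', 'Series']),
    index=pd.MultiIndex.from_arrays(
        [dates.strftime('%Y'), dates],
        names=['Year', 'Quarter End']),
)

# Select one column group; the 'Group' level is dropped from the result.
group1 = demo.xs('Group1', axis=1, level='Group')

# Select one year; the 'Year' level is dropped from the row index.
year_2019 = demo.loc['2019']
```

Both selections return ordinary DataFrames, so either one could be written to the report on its own.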


# Create Presentation

In [6]:
doc = docx.Document()

# Add Content

In [7]:
h = doc.add_heading(text='mspandas Demo', level=0)

In [8]:
p = doc.add_paragraph()

In [9]:
table = Table(doc, df,
              header=True, index=True,
              dtype_format={np.datetime64:'%b %d, %Y',
                            np.float64:'{:.2f}%'},
              font_size=9, header_font_color='#FFFFFF',
              column_totals=True, row_totals=True,
              keep_names='index')

# required call to convert DataFrame to Table
output_df = table.convert()

# Save Report

In [10]:
doc.save('example_report.docx')

# Preview

![Example Page](example_page.png)