# Example: Pandas to PowerPoint

## Using: mspandas.pandasPPT

This document shows how to create a PowerPoint report from Python, using the `pandasPPT` module from the [mspandas](https://github.com/knanne/mspandas) library.

**NOTE:** The `pandasPPT` module is designed as an addition to the existing `python-pptx` library for helping to automate the process of writing Pandas DataFrames to PowerPoint presentations. Please educate yourself on how [python-pptx](https://python-pptx.readthedocs.io/en/latest/) works first.  

# Contents

- [Important Info](#Important-Info)
    - [Background](#Background)
    - [Instructions](#Instructions)
    - [Layouts](#Layouts)
    - [Shapes](#Shapes)
    - [Template File](#Template-File)
- [Dependencies](#Dependencies)
- [Dummy Data](#Dummy-Data)
- [Create Presentation](#Create-Presentation)
- [Map Layouts](#Map-Layouts)
- [Add Content](#Add-Content)
- [Save Report](#Save-Report)
- [Preview](#Preview)

# Important Info

## Background

To initialize a report, we utilize a "template". This is simply a blank presentation. The presentation contains a slide master by default which can be customized with preset "layouts" based on your brand's styleguide.  

Example tasks of customizing a template for reusable reporting would include:  

- Add placeholders to layouts for reporting objects to be appended to (e.g. chart, table, picture)
- Rename the placeholders with logical names for fetching in python (e.g. chart, table, picture)
- Rename the layouts with logical names for fetching in python (e.g. custom_chart, custom_chart_table)

To learn more on PowerPoint templates, read [more by Microsoft here](https://support.office.com/en-us/article/design-your-slides-53c20bd5-e594-4837-a7ad-525706e09960)  

## Instructions

The process for creating the PowerPoint report in Python is as follows. Each step links to the corresponding code further down.

- [create a new ppt object from template](#Create-Presentation)
- [map the layouts available in template](#Map-Layouts)
- [repeatedly add content to ppt with the following process](#Add-Content)
    1. add a slide using an available layout
    2. map the shapes available in layout
    3. define placeholders using shape map
    4. add content to placeholders
- [save the ppt object to file](#Save-Report)

Use the following code to create a ppt object from a template file:

```python
# create new presentation from template
ppt = pptx.Presentation('template.pptx')
```

Use the following code to save a ppt object to file when finished:

```python
# save modified presentation to file
ppt.save('my_custom_presentation.pptx')
```

**Please educate yourself on the two important concepts below: Layouts and Shapes.**


## Layouts

Layouts are the predefined slide templates within a template presentation file, and are accessed and edited through the "Slide Master" in PowerPoint.  

Use the following code to map layouts in a ppt object, and add a new slide to a ppt object.  

```python
# create a map of template layouts
layout_map = Handler.map_layouts(ppt=ppt)

# add slide to ppt from layout
slide = ppt.slides.add_slide(layout_map['custom_text'])
```

## Shapes

Shapes are the predefined placeholder objects on a slide layout, used for adding content to the slide. The shapes used are: `table`, `chart`, `text`, and `picture`. 

Use the following code to map shapes in a slide layout, and add content to a shape.

```python
# create a map of layout shapes
shape_map = Handler.map_shapes(layout_map['custom_text'])

# define placeholders using shape map
shape_title = slide.placeholders[shape_map['title']]
shape_text = slide.placeholders[shape_map['text']]

# add title
shape_title.text = 'Dummy Title'

# add text
shape_text.text = 'Dummy Text'
```

**Note that charts can only be added to chart placeholders and tables can only be added to table placeholders.** This means you will need to explicitly define a slide layout with these shapes.  

## Template File

To create a reusable report, you will need to know how to edit the template file yourself. This is done from the "Slide Master" section in PowerPoint. Brief instructions for the most important actions are given below. For more detail, consult the [official Microsoft documentation](https://support.office.com/en-us/article/what-is-a-slide-master-b9abb2a0-7aef-4257-a14e-4329c904da54).  

To access layouts in PowerPoint:
- Open the template file, go to `View > Slide Master`

To add a layout:
- In the slide master, either copy an existing layout and paste it or right click and select `Insert Slide Layout`

To rename a layout:
- In the slide master, right click on a layout and select `Rename`

To add a placeholder shape:
- In a slide layout, go to `Slide Master > Insert Placeholder`

To rename a placeholder shape:
- In a slide layout, click on a placeholder and go to `Format > Selection Pane` then double click an existing placeholder name in the menu to rename it

# Dependencies

To install mspandas, refer to the [project's homepage](https://github.com/knanne/mspandas/#Installation)

You will also need the following libraries installed:  

[Python PPTX](https://python-pptx.readthedocs.io/en/latest/). Install via `pip install python-pptx`  

[Pandas](http://pandas.pydata.org/). Install via `pip install pandas`  

[Numpy](http://www.numpy.org/). Install via `pip install numpy`  
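
Before running the rest of this example, a quick check like the following can confirm that the three libraries are importable. It is a small sketch using only the standard library. The one non-obvious detail is that the `python-pptx` package is imported under the module name `pptx`.

```python
import importlib.util

# import name -> pip package name
required = {'pptx': 'python-pptx', 'pandas': 'pandas', 'numpy': 'numpy'}

# find_spec returns None when a top-level module cannot be found
missing = [pip_name for module, pip_name in required.items()
           if importlib.util.find_spec(module) is None]

if missing:
    print('Missing packages; install with: pip install ' + ' '.join(missing))
else:
    print('All dependencies are available.')
```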

```python
import sys
import os

# define path to the local copy of mspandas
mspandas = 'C:/Users/Kain/SkyDrive/projects/mspandas/'

# add mspandas to path
sys.path.append(os.path.abspath(mspandas))
```

```python
import mspandas
from mspandas.pandasPPT import Table
```

```python
import pptx
import pandas as pd
import numpy as np
```

# Dummy Data


For the purpose of this demo, let's create a Pandas DataFrame made out of dummy data and add some complex structures like a multiindex and timeseries.  

```python
# pd.datetime was removed in pandas 2.0, so pd.Timestamp.now() is used instead
# (freq='Q' is spelled 'QE' from pandas 2.2 onward)
df = pd.DataFrame(np.random.rand(6, 4),
                  columns=pd.MultiIndex.from_tuples([('Group1','A'),('Group1','B'),('Group2','C'),('Group2','D')]),
                  index=pd.MultiIndex.from_tuples([(d.strftime('%Y'),d) for d in pd.date_range(end=pd.Timestamp.now(), periods=6, freq='Q')]))

df.columns = df.columns.set_names(['Group','Series'])
df.index = df.index.set_names(['Year','Quarter End'])
```

```python
df
```

| Year | Quarter End | Group1 / A | Group1 / B | Group2 / C | Group2 / D |
|------|-------------|------------|------------|------------|------------|
| 2017 | 2017-03-31 23:24:58.403200 | 0.448374 | 0.036429 | 0.957139 | 0.926038 |
| 2017 | 2017-06-30 23:24:58.403200 | 0.083659 | 0.023015 | 0.116609 | 0.714057 |
| 2017 | 2017-09-30 23:24:58.403200 | 0.116687 | 0.338116 | 0.779541 | 0.241196 |
| 2017 | 2017-12-31 23:24:58.403200 | 0.659253 | 0.513945 | 0.368346 | 0.273506 |
| 2018 | 2018-03-31 23:24:58.403200 | 0.75673 | 0.928519 | 0.167211 | 0.950219 |
| 2018 | 2018-06-30 23:24:58.403200 | 0.011517 | 0.180494 | 0.65984 | 0.844283 |
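
Because the dummy data is random and anchored to the current date, every run produces a different table. If you want a reproducible report while experimenting, a seeded variant along these lines may help. It is a sketch under stated assumptions: the seed, the fixed quarter-end dates, and the name `df_fixed` are illustrative choices, not part of the original example.

```python
import numpy as np
import pandas as pd

# Assumption: fixed seed and fixed quarter-end dates, chosen only for reproducibility
rng = np.random.default_rng(seed=0)
quarter_ends = pd.to_datetime(['2017-03-31', '2017-06-30', '2017-09-30',
                               '2017-12-31', '2018-03-31', '2018-06-30'])

# same two-level index and column structure as the dummy data
index = pd.MultiIndex.from_arrays([quarter_ends.strftime('%Y'), quarter_ends],
                                  names=['Year', 'Quarter End'])
columns = pd.MultiIndex.from_tuples([('Group1', 'A'), ('Group1', 'B'),
                                     ('Group2', 'C'), ('Group2', 'D')],
                                    names=['Group', 'Series'])

df_fixed = pd.DataFrame(rng.random((6, 4)), index=index, columns=columns)
print(df_fixed.shape)
```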


# Create Presentation

```python
ppt = pptx.Presentation('blank_presentation_2016.pptx')
```

# Convenience Functions

For your convenience, a `Handler` class provides helper functions that make it easier to access the slides of a PPT template.

```python
Handler = mspandas.pandasPPT.Handler()
```

# Map Layouts

`mspandas` has a helper method that returns the slide layouts in the PowerPoint template as a dictionary keyed by layout name, for easy retrieval. The same logic is applied to the shapes on a layout.

```python
layout_map = Handler.map_layouts(ppt=ppt, verbose=True)
```

```
Title Slide
Title and Content
Section Header
Two Content
Two Content Modified
Comparison
Title Only
Blank
Content with Caption
Picture with Caption
Title and Vertical Text
Vertical Title and Text
```


# Add Content

**Repeat the following process for each slide you want to add:**

1. add a slide using an available layout
2. map the shapes available in layout
3. define placeholders using shape map
4. add content to placeholders

## Add a slide using an available layout

```python
slide = ppt.slides.add_slide(layout_map['Two Content Modified'])
```

## Map the shapes available in layout

```python
shape_map = Handler.map_shapes(layout_map['Two Content Modified'], verbose=True)
```

```
Title 1 index: 0, type: TITLE (1)
Date Placeholder 4 index: 10, type: DATE (16)
Footer Placeholder 5 index: 11, type: FOOTER (15)
Slide Number Placeholder 6 index: 12, type: SLIDE_NUMBER (13)
Table Placeholder 8 index: 13, type: TABLE (12)
Chart Placeholder 10 index: 14, type: CHART (8)
```


## Define placeholders using shape map

```python
title_shape = slide.placeholders[shape_map['Title 1']]
table_shape = slide.placeholders[shape_map['Table Placeholder 8']]
```

## Add content to placeholders

```python
title_shape.text = 'mspandas Demo'
```

```python
# np.float was removed in NumPy 1.24; it was an alias for the builtin float
table = Table(table_shape, df,
              header=True, index=True,
              dtype_format={np.datetime64:'%b %d, %Y',
                            float:'{:.2f}%'},
              font_size=9, header_font_color='#FFFFFF',
              column_totals=True, row_totals=True,
              keep_names='index')

# required call to convert DataFrame to Table
output_df = table.convert()
```
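
To see what the two `dtype_format` strings produce, they can be tried on single values with the standard library. This sketch assumes that the date string is applied with `strftime` and the float string with `str.format`, which is inferred from their syntax rather than from the mspandas source, and that the default English month names are in effect.

```python
from datetime import datetime

date_format = '%b %d, %Y'
float_format = '{:.2f}%'

# a quarter-end date rendered with the datetime format
print(datetime(2017, 3, 31).strftime(date_format))   # Mar 31, 2017

# the float format only appends a percent sign; it does not multiply by 100
print(float_format.format(0.448374))                 # 0.45%
print(float_format.format(0.448374 * 100))           # 44.84%
```

If the values in your DataFrame are fractions that should display as percentages, scaling them by 100 before building the table is one option.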

# Save Report

```python
ppt.save('example_report.pptx')
```

# Preview

![Example Slide](example_slide.png)