# `xlsxwriter` plus `pandas`

So far, we've been building Excel worksheets out of Python structures called *lists*:

```
my_list = [1,2,3]
my_other_list = ['Red','Blue','Green']
```

However, you are probably more used to seeing data stored in structures like this:

![Example DataFrame](images/pandas-dataframe.png)

*[Image source](https://www.oreilly.com/content/data-indexing-and-selection/)*

This is an example of *tabular* data, with *rows* and *columns*. You use it in spreadsheets all the time!

When you hear tabular data in Python, think `pandas`. 

We used `pandas` earlier to read data from Excel, operate on it, and write the results into another workbook.

Now, let's combine `xlsxwriter` with `pandas` to automate data analysis with Python and Excel.

First, let's create a small DataFrame of the land areas of the boroughs of New York City:

In [1]:
import pandas as pd
import xlsxwriter

# Create a DataFrame of land sizes of NYC boroughs
data = {'borough':['The Bronx', 'Brooklyn', 'Manhattan', 'Queens', 'Staten Island'],'land_area':[42.10,70.82,22.83,108.53,58.37]}

df = pd.DataFrame(data)

# Doesn't this look familiar?
df


|   | borough | land_area |
|---|---------|-----------|
| 0 | The Bronx | 42.10 |
| 1 | Brooklyn | 70.82 |
| 2 | Manhattan | 22.83 |
| 3 | Queens | 108.53 |
| 4 | Staten Island | 58.37 |


Let's say we wanted to load this DataFrame into a workbook and format the output with `xlsxwriter`.

Unfortunately, `pandas` DataFrames take a couple of extra steps to use with `xlsxwriter`. Here are our steps:

1. Set the `pandas` Excel-writing engine to `xlsxwriter` with `ExcelWriter()`.  
2. Convert the DataFrame into an `xlsxwriter` object with `to_excel()`.  
3. Create workbook and worksheet objects for the resulting output with `writer.book` and `writer.sheets`.

Let's take a look: 

In [None]:
# Set Pandas engine to xlsxwriter
writer = pd.ExcelWriter('nycland.xlsx', engine='xlsxwriter')

# Convert the dataframe to an XlsxWriter Excel object.
df.to_excel(writer, sheet_name='Sheet1')

# Get the xlsxwriter objects from the DataFrame writer object.
workbook  = writer.book
worksheet = writer.sheets['Sheet1']

If you closed the workbook now, you would see something like this:

![DataFrame index visible in Excel export](images/nyc-land-index.png)


-  By default, our DataFrame will be written starting in `A1` of the worksheet. To write it elsewhere, check out this [`pandas` documentation](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_excel.html).  


-  The numbers in column `A` represent the ***index*** of the DataFrame. Indexes are great for helping us access and manipulate data in `pandas`, but aren't so helpful in our finished Excel export. 
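To illustrate the first note, here is a minimal sketch of writing the same DataFrame somewhere other than `A1`, using the `startrow` and `startcol` parameters of `to_excel()`. Both are zero-based, so the values below place the top-left corner of the output in `C3`. The file name `nycland_offset.xlsx` and the temporary output folder are assumptions made for illustration only.

```python
import os
import tempfile

import pandas as pd

# Same borough data as above
data = {'borough': ['The Bronx', 'Brooklyn', 'Manhattan', 'Queens', 'Staten Island'],
        'land_area': [42.10, 70.82, 22.83, 108.53, 58.37]}
df = pd.DataFrame(data)

# Hypothetical output path in a temporary folder (an assumption for this sketch)
path = os.path.join(tempfile.mkdtemp(), 'nycland_offset.xlsx')

writer = pd.ExcelWriter(path, engine='xlsxwriter')

# startrow and startcol are zero-based: row 2 is Excel row 3, column 2 is column C
df.to_excel(writer, sheet_name='Sheet1', startrow=2, startcol=2)

# Closing the writer saves the file to disk
writer.close()
```

Opening the resulting file should show the table shifted two rows down and two columns to the right.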


We can leave the index out of the export by passing `index=False` to the `to_excel()` method.

Let's try this again:

In [None]:
# Set Pandas engine to xlsxwriter
writer = pd.ExcelWriter('nycland.xlsx', engine='xlsxwriter')

# Convert the dataframe to an XlsxWriter Excel object.
### index=False ###
df.to_excel(writer, sheet_name='Sheet1', index=False)

# Get the xlsxwriter objects from the dataframe writer object.
workbook  = writer.book
worksheet = writer.sheets['Sheet1']

workbook.close()
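As an aside, `pd.ExcelWriter` can also be used in a `with` block, which saves and closes the workbook automatically when the block ends. This is a minimal sketch of that alternative, not the approach used in the rest of this lesson; the file name `nycland_noindex.xlsx` and the temporary folder are hypothetical.

```python
import os
import tempfile

import pandas as pd

data = {'borough': ['The Bronx', 'Brooklyn', 'Manhattan', 'Queens', 'Staten Island'],
        'land_area': [42.10, 70.82, 22.83, 108.53, 58.37]}
df = pd.DataFrame(data)

# Hypothetical output path in a temporary folder (an assumption for this sketch)
path = os.path.join(tempfile.mkdtemp(), 'nycland_noindex.xlsx')

# The workbook is saved and closed when the with-block exits
with pd.ExcelWriter(path, engine='xlsxwriter') as writer:
    df.to_excel(writer, sheet_name='Sheet1', index=False)

    # The xlsxwriter objects are available inside the block for further changes
    workbook = writer.book
    worksheet = writer.sheets['Sheet1']
```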

# Drill

1. Place the steps in order for writing a `pandas` DataFrame into an `xlsxwriter` workbook. 

- Create workbook and worksheet objects for the resulting output with `writer.book` and `writer.sheets`.
- Convert the DataFrame into an `xlsxwriter` object with `to_excel()`.  
- Set the `pandas` Excel-writing engine to `xlsxwriter` with `ExcelWriter()`.  



2. Fill in the code below to write this DataFrame to a workbook named `hr.xlsx` and a worksheet named `leaders`. 

In [None]:
import pandas as pd
import xlsxwriter

# Create a DataFrame
data = {'player':['Barry', 'Hank', 'Babe', 'Alex', 'Willie'],'hr':[762,755,714,696,660]}
df = pd.DataFrame(data)

# Set Pandas engine to xlsxwriter
writer = pd.ExcelWriter(___, engine='xlsxwriter')

# Convert the dataframe to an XlsxWriter Excel object.
df.to_excel(writer, ___=___, index=False)

# Get the xlsxwriter objects from the dataframe writer object.
workbook  = writer.book
worksheet = writer.sheets[___]

# We can now make any changes to it 
### ADD IN A TABLE HERE ###

workbook.close()

## Customizing `pandas` output with `xlsxwriter`

You may remember that we were writing `pandas` DataFrames to Excel back at the beginning of this course. Why do it this newfangled way now?

The benefit of sending our DataFrame to `xlsxwriter` is that we can now add formatting and analysis to the workbook that would be difficult or impossible in `pandas` alone. 
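As a small illustration of that benefit, the sketch below widens both columns and applies a number format to the land areas using `workbook.add_format()` and `worksheet.set_column()`. The column widths, the number format, and the output path are assumptions chosen for the example.

```python
import os
import tempfile

import pandas as pd

data = {'borough': ['The Bronx', 'Brooklyn', 'Manhattan', 'Queens', 'Staten Island'],
        'land_area': [42.10, 70.82, 22.83, 108.53, 58.37]}
df = pd.DataFrame(data)

# Hypothetical output path in a temporary folder (an assumption for this sketch)
path = os.path.join(tempfile.mkdtemp(), 'nycland_formatted.xlsx')

writer = pd.ExcelWriter(path, engine='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1', index=False)

workbook = writer.book
worksheet = writer.sheets['Sheet1']

# A number format showing two decimal places
area_format = workbook.add_format({'num_format': '0.00'})

# Widen column A; widen column B and apply the number format to it
worksheet.set_column('A:A', 18)
worksheet.set_column('B:B', 12, area_format)

writer.close()
```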

You've already learned several useful methods for customizing workbooks from Python. Let's learn a few more. 

Once our data is set up, we'll cover:

- Conditional formatting 
- Applying filters
- Worksheet protection
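As a rough preview of the last two items, the sketch below adds filter buttons to the header row with `worksheet.autofilter()` and then turns on worksheet protection with `worksheet.protect()`. It is only one way these features might be applied; the output path is hypothetical and no password is set.

```python
import os
import tempfile

import pandas as pd

data = {'borough': ['The Bronx', 'Brooklyn', 'Manhattan', 'Queens', 'Staten Island'],
        'land_area': [42.10, 70.82, 22.83, 108.53, 58.37]}
df = pd.DataFrame(data)

# Hypothetical output path in a temporary folder (an assumption for this sketch)
path = os.path.join(tempfile.mkdtemp(), 'nycland_filtered.xlsx')

writer = pd.ExcelWriter(path, engine='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1', index=False)
worksheet = writer.sheets['Sheet1']

# Zero-based corners of the written range: header row 0 through the last data row
last_row = df.shape[0]       # 5 data rows sit below the header, so the last row index is 5
last_col = df.shape[1] - 1   # 2 columns, so the last column index is 1

# Add filter dropdowns to the header row (first_row, first_col, last_row, last_col)
worksheet.autofilter(0, 0, last_row, last_col)

# Turn on worksheet protection with the default options and no password
worksheet.protect()

writer.close()
```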

## Conditional formatting

In [14]:
import pandas as pd
import xlsxwriter

# Create a DataFrame of land sizes of NYC boroughs
data = {'borough':['The Bronx', 'Brooklyn', 'Manhattan', 'Queens', 'Staten Island'],'land_area':[42.10,70.82,22.83,108.53,58.37]}

df = pd.DataFrame(data)

# Set Pandas engine to xlsxwriter
writer = pd.ExcelWriter('nycland.xlsx', engine='xlsxwriter')

# Convert the dataframe to an XlsxWriter Excel object.
df.to_excel(writer, sheet_name='Sheet1', index=False)

# Get the xlsxwriter objects from the dataframe writer object.
workbook  = writer.book
worksheet = writer.sheets['Sheet1']

# Don't close yet; we're not done!
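With the workbook and worksheet objects in hand, a conditional format can be applied with `worksheet.conditional_format()`. The sketch below is self-contained and highlights land areas above 50; the threshold, the colors, and the output path are assumptions picked for illustration.

```python
import os
import tempfile

import pandas as pd

data = {'borough': ['The Bronx', 'Brooklyn', 'Manhattan', 'Queens', 'Staten Island'],
        'land_area': [42.10, 70.82, 22.83, 108.53, 58.37]}
df = pd.DataFrame(data)

# Hypothetical output path in a temporary folder (an assumption for this sketch)
path = os.path.join(tempfile.mkdtemp(), 'nycland_conditional.xlsx')

writer = pd.ExcelWriter(path, engine='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1', index=False)

workbook = writer.book
worksheet = writer.sheets['Sheet1']

# Format to apply when the rule matches: green fill with dark green text
highlight = workbook.add_format({'bg_color': '#C6EFCE', 'font_color': '#006100'})

# The land areas sit in B2:B6 (row 1 holds the header)
worksheet.conditional_format('B2:B6', {'type': 'cell',
                                       'criteria': '>',
                                       'value': 50,
                                       'format': highlight})

# Now we're done, so close and save
writer.close()
```

With this data, the rule should highlight Brooklyn, Queens, and Staten Island.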

# Drill

Take the DataFrame below again:


In [3]:

data = {'player':['Barry', 'Hank', 'Babe', 'Alex', 'Willie'],'hr':[762,755,714,696,660]}

df = pd.DataFrame(data)