# [Working with Pandas and NumPy](https://openpyxl.readthedocs.io/en/latest/pandas.html)


openpyxl can work with the popular libraries [`Pandas`](http://pandas.pydata.org) and [`NumPy`](http://numpy.org).


## NumPy Support


openpyxl has built-in support for the NumPy float, integer and boolean types.
Datetimes are supported through the Pandas `Timestamp` type.
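
As a minimal sketch of what this means in practice, the snippet below appends NumPy scalars and a Pandas `Timestamp` to a worksheet and reads the values back from the in-memory cells. The sample values are illustrative, and the snippet assumes NumPy, Pandas and openpyxl are all installed.

```python
import numpy as np
import pandas as pd
from openpyxl import Workbook

wb = Workbook()
ws = wb.active

# Illustrative values: NumPy float, integer and boolean scalars,
# plus a Pandas Timestamp for the datetime case.
stamp = pd.Timestamp("2020-01-31 12:00")
ws.append([np.float64(1.5), np.int64(7), np.bool_(True), stamp])

# Read the values back from the first worksheet row.
values = [cell.value for cell in ws[1]]
```

No manual conversion to built-in Python types is needed before the values are assigned to cells.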


## Working with Pandas Dataframes


The `openpyxl.utils.dataframe.dataframe_to_rows` function provides a
simple way to work with Pandas Dataframes:

In [1]:
import pandas as pd

raw_data = {
    'first_name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'],
    'last_name': ['Miller', 'Jacobson', 'Ali', 'Milner', 'Cooze'],
    'age': [42, 52, 36, 24, 73],
    'preTestScore': [4, 24, 31, 2, 3],
    'postTestScore': [25, 94, 57, 62, 70],
}
df = pd.DataFrame(raw_data, columns=['first_name', 'last_name', 'age', 'preTestScore', 'postTestScore'])
df

| | first_name | last_name | age | preTestScore | postTestScore |
|---|---|---|---|---|---|
| 0 | Jason | Miller | 42 | 4 | 25 |
| 1 | Molly | Jacobson | 52 | 24 | 94 |
| 2 | Tina | Ali | 36 | 31 | 57 |
| 3 | Jake | Milner | 24 | 2 | 62 |
| 4 | Amy | Cooze | 73 | 3 | 70 |


In [2]:
from openpyxl import Workbook
from openpyxl.utils.dataframe import dataframe_to_rows

wb = Workbook()
ws = wb.active

for r in dataframe_to_rows(df, index=True, header=True):
    ws.append(r)

wb.save("pandas1_openpyxl.xlsx")

While Pandas itself supports conversion to Excel, this approach gives client
code additional flexibility, including the ability to stream dataframes
straight to files.
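
One example of that flexibility, sketched here with a small illustrative dataframe (not the one used above): the `index` and `header` arguments control which rows `dataframe_to_rows` yields. With `index=False`, only the header row and the data rows are produced, so the worksheet holds nothing but column names and values.

```python
import pandas as pd
from openpyxl import Workbook
from openpyxl.utils.dataframe import dataframe_to_rows

# Small illustrative dataframe; the names and values are assumptions.
scores = pd.DataFrame({"name": ["Jason", "Molly"], "age": [42, 52]})

# index=False leaves out the index column; header=True keeps the column names.
rows = list(dataframe_to_rows(scores, index=False, header=True))

wb = Workbook()
ws = wb.active
for r in rows:
    ws.append(r)
```

Here `rows` holds one header row followed by one row per dataframe record.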

To convert a dataframe into a worksheet highlighting the header and index:

In [3]:
wb = Workbook()
ws = wb.active

for r in dataframe_to_rows(df, index=True, header=True):
    ws.append(r)
    
for cell in ws['A'] + ws[1]:
    cell.style = 'Pandas'

wb.save("pandas2_openpyxl.xlsx")

Alternatively, if you only want to convert the data, you can use write-only mode:

In [4]:
from openpyxl.cell.cell import WriteOnlyCell

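wb = Workbook(write_only=True)
ws = wb.create_sheet()

# One styled cell is reused for every header and index value.
cell = WriteOnlyCell(ws)
cell.style = 'Pandas'

def format_first_row(row, cell):
    # Yield the styled cell once per header value.
    for c in row:
        cell.value = c
        yield cell

rows = dataframe_to_rows(df)

# The first row from dataframe_to_rows is the header.
first_row = format_first_row(next(rows), cell)
ws.append(first_row)

for row in rows:
    row = list(row)
    # Style only the index value in the first column.
    cell.value = row[0]
    row[0] = cell
    ws.append(row)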

wb.save("openpyxl_stream.xlsx")

This code will work just as well with a standard workbook.


## Converting a worksheet to a Dataframe


To convert a worksheet to a Dataframe you can use the `values` property. This
is straightforward if the worksheet has no headers or indices, because every
row can be passed to Pandas as data:

In [5]:
wb = Workbook()
ws2 = wb.active

for r in dataframe_to_rows(df, index=True, header=True):
    ws2.append(r)

In [8]:
ws_df1 = pd.DataFrame(ws2.values)

If the worksheet does have headers or indices, such as one created by Pandas,
then a little more work is required:

In [10]:
from itertools import islice

data = ws2.values
cols = next(data)[1:]
data = list(data)
idx = [r[0] for r in data]
data = (islice(r, 1, None) for r in data)
ws_df2 = pd.DataFrame(data, index=idx, columns=cols)

ws_df2

| | first_name | last_name | age | preTestScore | postTestScore |
|---|---|---|---|---|---|
| | | | | | |
| 0.0 | Jason | Miller | 42.0 | 4.0 | 25.0 |
| 1.0 | Molly | Jacobson | 52.0 | 24.0 | 94.0 |
| 2.0 | Tina | Ali | 36.0 | 31.0 | 57.0 |
| 3.0 | Jake | Milner | 24.0 | 2.0 | 62.0 |
| 4.0 | Amy | Cooze | 73.0 | 3.0 | 70.0 |
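
The empty row and the float values in the output above appear to come from the extra row that `dataframe_to_rows` emits for the index names when `index=True`; here the index is unnamed, so the row is blank. The following is a minimal sketch, assuming a small illustrative dataframe, that drops fully empty rows before building the dataframe so that integer columns stay integers.

```python
import pandas as pd
from openpyxl import Workbook
from openpyxl.utils.dataframe import dataframe_to_rows

# Small illustrative dataframe; the names and values are assumptions.
source = pd.DataFrame({"name": ["Jason", "Molly"], "age": [42, 52]})

wb = Workbook()
ws = wb.active
for r in dataframe_to_rows(source, index=True, header=True):
    ws.append(r)

rows = ws.values
# The first row is the header; its first cell belongs to the index column.
cols = list(next(rows)[1:])

# Keep only rows that contain at least one value (skips the blank index-name row).
body = [r for r in rows if any(v is not None for v in r)]

clean = pd.DataFrame(
    [r[1:] for r in body],
    index=[r[0] for r in body],
    columns=cols,
)
```

Filtering on fully empty rows is a design choice for this sketch; it would also discard genuinely blank records, so skipping the second row explicitly may suit some worksheets better.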


---