# [Working with Pandas and NumPy](https://openpyxl.readthedocs.io/en/stable/pandas.html)
openpyxl can work with the popular libraries Pandas and NumPy.

## NumPy Support
openpyxl has built-in support for the NumPy types float, integer and boolean. DateTimes are supported using Pandas' Timestamp type.
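As a minimal sketch of the statement above, the following assigns NumPy scalars and a Pandas Timestamp directly to cells. The cell addresses and sample values are illustrative assumptions, and only the float, integer and Timestamp cases are exercised here.

```python
import numpy as np
import pandas as pd
from openpyxl import Workbook

wb = Workbook()
ws = wb.active

# NumPy scalars are assigned like ordinary Python numbers.
ws["A1"] = np.float64(1.5)
ws["A2"] = np.int64(3)

# A Pandas Timestamp is stored as a date/time value.
ws["A3"] = pd.Timestamp("2024-01-15 08:30:00")

float_value = ws["A1"].value
int_value = ws["A2"].value
is_date = ws["A3"].is_date  # whether openpyxl treats the cell as a date
```

No explicit conversion to built-in Python types is needed before assignment in this sketch.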

## Working with Pandas Dataframes
The openpyxl.utils.dataframe.dataframe_to_rows() function provides a simple way to work with Pandas Dataframes:

In [1]:
import pandas as pd
import numpy as np
from openpyxl import Workbook

In [2]:
data = np.array([[1, 2, 3], [4, 5, 6]])
df = pd.DataFrame(data)
df

|   | 0 | 1 | 2 |
|---|---|---|---|
| 0 | 1 | 2 | 3 |
| 1 | 4 | 5 | 6 |


In [3]:
from openpyxl.utils.dataframe import dataframe_to_rows
wb = Workbook()
ws = wb.active

for r in dataframe_to_rows(df, index=True, header=True):
    ws.append(r)

While Pandas itself supports conversion to Excel, this gives client code additional flexibility including the ability to stream dataframes straight to files.
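To make that flexibility concrete, it can help to look at what `dataframe_to_rows()` actually yields before anything is appended to a worksheet. The sketch below rebuilds a small `df` so it is self-contained, and compares the rows produced with and without the index; the variable names are illustrative.

```python
import pandas as pd
from openpyxl.utils.dataframe import dataframe_to_rows

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]])

# With index and header, extra rows precede the data rows.
rows_with_index = list(dataframe_to_rows(df, index=True, header=True))

# Without the index, only the header row precedes the data rows.
rows_without_index = list(dataframe_to_rows(df, index=False, header=True))

for row in rows_with_index:
    print(list(row))
```

Because each row is a plain sequence of values, client code can filter, reorder or restyle rows before passing them to `ws.append()`.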

To convert a dataframe into a worksheet highlighting the header and index:

In [4]:
wb = Workbook()
ws = wb.active

for r in dataframe_to_rows(df, index=True, header=True):
    ws.append(r)

for cell in ws['A'] + ws[1]:
    cell.style = 'Pandas'

wb.save("data/pandas_openpyxl.xlsx")

Alternatively, if you just want to convert the data you can use write-only mode:

In [5]:
from openpyxl.cell.cell import WriteOnlyCell
wb = Workbook(write_only=True)
ws = wb.create_sheet()

# A single styled cell that is reused for every header and index value
cell = WriteOnlyCell(ws)
cell.style = 'Pandas'

def format_first_row(row, cell):
    # Yield the styled cell once per header value
    for c in row:
        cell.value = c
        yield cell

rows = dataframe_to_rows(df)
first_row = format_first_row(next(rows), cell)
ws.append(first_row)

for row in rows:
    row = list(row)
    # Style only the first column, which holds the index
    cell.value = row[0]
    row[0] = cell
    ws.append(row)

wb.save("data/openpyxl_stream.xlsx")

This code will work just as well with a standard workbook.

## Converting a worksheet to a Dataframe
To convert a worksheet to a Dataframe you can use the values property. This is very easy if the worksheet has no headers or indices:

In [6]:
wb = Workbook()
ws = wb.active

for r in dataframe_to_rows(df, index=True, header=True):
    ws.append(r)

for cell in ws['A'] + ws[1]:
    cell.style = 'Pandas'

wb.save("data/pandas_openpyxl.xlsx")

In [7]:
df = pd.DataFrame(ws.values)
df

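|   | 0 | 1 | 2 | 3 |
|---|-----|-----|-----|-----|
| 0 |     | 0.0 | 1.0 | 2.0 |
| 1 |     |     |     |     |
| 2 | 0.0 | 1.0 | 2.0 | 3.0 |
| 3 | 1.0 | 4.0 | 5.0 | 6.0 |

Note that this example worksheet was written with a header and an index, so both appear here as ordinary data, together with the empty row that follows the header.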


If the worksheet does have headers or indices, such as one created by Pandas, then a little more work is required:

In [8]:
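from itertools import islice
data = ws.values
# The first row holds the column labels; skip its leading index cell
cols = next(data)[1:]
data = list(data)
# The first value of each remaining row is the index label
idx = [r[0] for r in data]
# Drop the index column from every row, keeping only the data values
data = (islice(r, 1, None) for r in data)
df = pd.DataFrame(data, index=idx, columns=cols)
df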

|     | 0   | 1   | 2   |
|-----|-----|-----|-----|
|     |     |     |     |
| 0.0 | 1.0 | 2.0 | 3.0 |
| 1.0 | 4.0 | 5.0 | 6.0 |
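For the simpler case of a worksheet that has a header row but no index column, a round trip might look like the sketch below. It is an illustration under stated assumptions rather than a recommended recipe: the column names `a`, `b`, `c` are made up, and the workbook is saved to an in-memory `BytesIO` buffer instead of a file so that the example needs no `data/` directory.

```python
from io import BytesIO

import pandas as pd
from openpyxl import Workbook, load_workbook
from openpyxl.utils.dataframe import dataframe_to_rows

original = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["a", "b", "c"])

# Write the header and data rows, leaving out the index.
wb = Workbook()
ws = wb.active
for row in dataframe_to_rows(original, index=False, header=True):
    ws.append(row)

# Save to an in-memory buffer and load it back.
buffer = BytesIO()
wb.save(buffer)
buffer.seek(0)
ws_loaded = load_workbook(buffer).active

# The first row supplies the column labels; the rest is data.
rows = ws_loaded.values
columns = next(rows)
restored = pd.DataFrame(list(rows), columns=list(columns))
```

Writing with `index=False` avoids both the index column and the empty row, so the first worksheet row can be used directly as the column labels.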

