Important
pyopenxlsx currently uses a specialized fork of the OpenXLSX library (v1.0.0+), which includes critical performance optimizations and functional enhancements (such as custom properties and improved hyperlink handling) not currently available in the upstream repository.
pyopenxlsx is a high-performance Python binding for the OpenXLSX C++ library. It aims to provide significantly faster read/write speeds compared to pure Python libraries like openpyxl, while maintaining a Pythonic API design.
- High Performance: Powered by the modern C++17 OpenXLSX library.
- Pythonic API: Intuitive interface with properties, iterators, and context managers.
- Async Support:
async/awaitsupport for key I/O operations. - Rich Styling: Comprehensive support for fonts, fills, borders, alignments, and number formats.
- Extended Metadata: Support for both standard and custom document properties.
- Advanced Content: Support for images, hyperlinks (external/internal), and comments.
- Memory Safety: Combines C++ efficiency with Python's automatic memory management.
| Component | Technology |
|---|---|
| C++ Core | OpenXLSX |
| Bindings | nanobind |
| Build System | scikit-build-core & CMake |
# Using pip
pip install pyopenxlsx
# Using uv
uv pip install pyopenxlsx# Using uv
uv pip install .
# Or using pip
pip install .uv pip install -e .from pyopenxlsx import Workbook
# Create a new workbook
with Workbook() as wb:
ws = wb.active
ws.title = "MySheet"
# Write data
ws["A1"].value = "Hello"
ws["B1"].value = 42
ws.cell(row=2, column=1).value = 3.14
# Save
wb.save("example.xlsx")from pyopenxlsx import Workbook
with Workbook() as wb:
# Set custom document properties
wb.custom_properties["Author"] = "Curry Tang"
wb.custom_properties["Project"] = "PyOpenXLSX"
wb.save("props.xlsx")from pyopenxlsx import Workbook
with Workbook() as wb:
ws = wb.active
ws["A1"].value = "Google"
# External link
ws.add_hyperlink("A1", "https://www.google.com", tooltip="Search")
# Internal link to another sheet
ws2 = wb.create_sheet("Data")
ws["A2"].value = "See Data"
ws.add_internal_hyperlink("A2", "Data!A1")
wb.save("links.xlsx")from pyopenxlsx import load_workbook
wb = load_workbook("example.xlsx")
ws = wb["MySheet"]
print(ws["A1"].value) # Output: Hello
wb.close()pyopenxlsx provides async/await support for all I/O-intensive operations, ensuring your event loop remains responsive.
import asyncio
from pyopenxlsx import Workbook, load_workbook_async, Font
async def main():
# 1. Async context manager for automatic cleanup
async with Workbook() as wb:
ws = wb.active
ws["A1"].value = "Async Data"
# 2. Async stylesheet creation
style_idx = await wb.add_style_async(font=Font(bold=True))
ws["A1"].style_index = style_idx
# 3. Async worksheet operations
new_ws = await wb.create_sheet_async("AsyncSheet")
await new_ws.append_async(["Dynamic", "Row", 123])
# 4. Async range operations
await new_ws.range("A1:C1").clear_async()
# 5. Async save
await wb.save_async("async_example.xlsx")
# 6. Async load
async with await load_workbook_async("async_example.xlsx") as wb:
ws = wb.active
print(ws["A1"].value)
# 7. Async protection
await ws.protect_async(password="secret")
await ws.unprotect_async()
asyncio.run(main())from pyopenxlsx import Workbook, Font, Fill, Border, Side, Alignment
wb = Workbook()
ws = wb.active
# Define styles using hex colors (ARGB) or names
# Hex colors can be 6-digit (RRGGBB) or 8-digit (AARRGGBB)
font = Font(name="Arial", size=14, bold=True, color="FF0000") # Red
fill = Fill(pattern_type="solid", color="FFFF00") # Yellow
border = Border(
left=Side(style="thin", color="000000"),
right=Side(style="thin"),
top=Side(style="thick"),
bottom=Side(style="thin")
)
alignment = Alignment(horizontal="center", vertical="center", wrap_text=True)
# Apply style
style_idx = wb.add_style(font=font, fill=fill, border=border, alignment=alignment)
ws["A1"].value = "Styled Cell"
ws["A1"].style_index = style_idx
wb.save("styles.xlsx")from pyopenxlsx import Workbook
wb = Workbook()
ws = wb.active
# Insert image at A1, automatically maintaining aspect ratio
# Requires Pillow: pip install pillow
ws.add_image("logo.png", anchor="A1", width=200)
# Or specify exact dimensions
ws.add_image("banner.jpg", anchor="B5", width=400, height=100)
wb.save("images.xlsx")Comments are automatically resized to fit their content by default.
from pyopenxlsx import Workbook
wb = Workbook()
ws = wb.active
# Simple or multiline comments - all will auto-size perfectly
ws["A1"].comment = "Short comment"
ws["B2"].comment = "Line 1: High performance\nLine 2: Pythonic API\nLine 3: Auto-sized by default!"
wb.save("comments.xlsx")from pyopenxlsx import (
# Core Classes
Workbook, Worksheet, Cell, Range,
load_workbook, load_workbook_async,
# Style Classes
Font, Fill, Border, Side, Alignment, Style, Protection,
# Enums & Constants
XLColor, XLSheetState, XLLineStyle, XLPatternType, XLAlignmentStyle,
XLProperty, XLUnderlineStyle, XLFontSchemeStyle, XLVerticalAlignRunStyle,
XLFillType,
)The top-level container for an Excel file.
Workbook(filename: str | None = None, force_overwrite: bool = True)filename: Optional. If provided, opens an existing file; otherwise, creates a new workbook.force_overwrite: Optional (default:True). IfTrue, overwrites existing files when creating or saving. IfFalse, an exception is raised if the file already exists.
| Property | Type | Description |
|---|---|---|
active |
Worksheet | None |
Get or set the currently active worksheet. |
sheetnames |
list[str] |
Returns a list of all worksheet names. |
properties |
DocumentProperties |
Access standard document metadata (title, author, etc.). |
custom_properties |
CustomProperties |
Access custom document properties (dict-like). |
styles |
XLStyles |
Access underlying style object (advanced usage). |
workbook |
XLWorkbook |
Access underlying C++ workbook object (advanced usage). |
| Method | Return Type | Description |
|---|---|---|
save(filename=None, force_overwrite=True) |
None |
Save the workbook. Saves to original path if filename is omitted. |
save_async(filename=None, force_overwrite=True) |
Coroutine |
Asynchronously save the workbook. |
close() |
None |
Close the workbook and release resources. |
close_async() |
Coroutine |
Asynchronously close the workbook. |
create_sheet(title=None, index=None) |
Worksheet |
Create a new worksheet. title defaults to "Sheet1", etc. |
create_sheet_async(...) |
Coroutine |
Asynchronously create a worksheet. |
remove(worksheet) |
None |
Delete the specified worksheet. |
remove_async(worksheet) |
Coroutine |
Asynchronously delete a worksheet. |
copy_worksheet(from_worksheet) |
Worksheet |
Copy a worksheet and return the copy. |
copy_worksheet_async(...) |
Coroutine |
Asynchronously copy a worksheet. |
add_style(...) |
int |
Create a new style and return its index. See below. |
add_style_async(...) |
Coroutine |
Asynchronously create a style. |
get_embedded_images() |
list[ImageInfo] |
Get list of all images embedded in the workbook. |
get_image_data(name) |
bytes |
Get binary data of an embedded image by its name or path. |
extract_images(out_dir) |
list[str] |
Extract all images to a directory. Returns list of file paths. |
extract_images_async(...) |
Coroutine |
Asynchronously extract all images. |
def add_style(
font: Font | int | None = None,
fill: Fill | int | None = None,
border: Border | int | None = None,
alignment: Alignment | None = None,
number_format: str | int | None = None,
protection: Protection | None = None,
) -> int:Returns: A style index (int) that can be assigned to Cell.style_index.
Example:
# Pass all styles via a Style object
from pyopenxlsx import Style, Font, Fill
style = Style(
font=Font(bold=True),
fill=Fill(color="C8C8C8"), # Hex color
number_format="0.00"
)
idx = wb.add_style(style)Represents a sheet within an Excel file.
| Property | Type | Description |
|---|---|---|
title |
str |
Get or set the worksheet name. |
index |
int |
Get or set the worksheet index (0-based). |
sheet_state |
str |
Visibility: "visible", "hidden", "very_hidden". |
max_row |
int |
Returns the maximum row index used. |
max_column |
int |
Returns the maximum column index used. |
rows |
Iterator |
Iterate over all rows with data. |
has_drawing |
bool |
True if the worksheet has a drawing (images, etc.). |
drawing |
XLDrawing |
Access underlying drawing object (advanced usage). |
merges |
MergeCells |
Access merged cells information. |
protection |
dict |
Get worksheet protection status (read-only). |
| Method | Return Type | Description |
|---|---|---|
cell(row, column, value=None) |
Cell |
Get cell by 1-based indices. Optionally set value. |
range(address) |
Range |
Get range by string, e.g., ws.range("A1:C3"). |
range(start, end) |
Range |
Get range by endpoints, e.g., ws.range("A1", "C3"). |
merge_cells(address) |
None |
Merge cells, e.g., ws.merge_cells("A1:B2"). |
merge_cells_async(...) |
Coroutine |
Asynchronously merge cells. |
unmerge_cells(address) |
None |
Unmerge cells. |
unmerge_cells_async(...) |
Coroutine |
Asynchronously unmerge cells. |
append(iterable) |
None |
Append a row of data after the last used row. |
append_async(iterable) |
Coroutine |
Asynchronously append a row. |
add_hyperlink(ref, url, tooltip="") |
None |
Add an external hyperlink to a cell. |
add_internal_hyperlink(ref, loc, ...) |
None |
Add an internal hyperlink (e.g., "Sheet2!A1"). |
set_column_format(col, style_idx) |
None |
Set default style for a column. col can be int or "A". |
set_row_format(row, style_idx) |
None |
Set default style for a row. |
column(col) |
Column |
Get column object for width adjustments. |
protect(...) |
None |
Protect the worksheet. |
protect_async(...) |
Coroutine |
Asynchronously protect the worksheet. |
unprotect() |
None |
Unprotect the worksheet. |
unprotect_async() |
Coroutine |
Asynchronously unprotect. |
add_image(...) |
None |
Insert an image. |
add_image_async(...) |
Coroutine |
Asynchronously insert an image. |
def add_image(
img_path: str,
anchor: str = "A1",
width: int | None = None,
height: int | None = None,
) -> None:img_path: Path to image (PNG, JPG, GIF).anchor: Top-left cell address.width,height: Pixel dimensions. Requires Pillow for auto-detection if not provided.
| Method | Description |
|---|---|
__getitem__(key) |
Get cell by address: ws["A1"] |
The fundamental unit of data in Excel.
| Property | Type | Description |
|---|---|---|
value |
Any |
Get/Set value. Supports str, int, float, bool, datetime. |
formula |
Formula |
Get/Set formula string (without initial =). |
style_index |
int |
Get/Set style index. |
style |
int |
Alias for style_index. |
is_date |
bool |
True if the cell has a date format. |
comment |
str | None |
Get/Set cell comment. Set None to remove. |
font |
XLFont |
Get font object (read-only). |
fill |
XLFill |
Get fill object (read-only). |
border |
XLBorder |
Get border object (read-only). |
alignment |
XLAlignment |
Get alignment object (read-only). |
If is_date is True, value automatically returns a Python datetime object.
# Write
ws["A1"].value = datetime(2024, 1, 15)
ws["A1"].style_index = wb.add_style(number_format=14) # Built-in date format
# Read
print(ws["A1"].value) # datetime.datetime(2024, 1, 15, 0, 0)
print(ws["A1"].is_date) # TrueNote: Formulas must be set via the formula property, not value.
# Correct
ws["A3"].formula = "SUM(A1:A2)"
# Incorrect (treated as string)
ws["A3"].value = "=SUM(A1:A2)"Represents a rectangular area of cells.
| Property | Type | Description |
|---|---|---|
address |
str |
Range address, e.g., "A1:C3". |
num_rows |
int |
Row count. |
num_columns |
int |
Column count. |
| Method | Return Type | Description |
|---|---|---|
clear() |
None |
Clear values in all cells of the range. |
clear_async() |
Coroutine |
Asynchronously clear range. |
for cell in ws.range("A1:B2"):
print(cell.value)Font(name="Arial", size=11, bold=False, italic=False, color=None)Fill(pattern_type="solid", color=None, background_color=None)Border(left=Side(), right=Side(), top=Side(), bottom=Side(), diagonal=Side())Side(style="thin", color=None)Styles: "thin", "thick", "dashed", "dotted", "double", "hair", "medium", "mediumDashed", "mediumDashDot", "mediumDashDotDot", "slantDashDot"
Alignment(horizontal="center", vertical="center", wrap_text=True)Options: "left", "center", "right", "general", "top", "bottom"
Accessed via wb.properties. Supports dict-like access.
- Metadata:
title,subject,creator,keywords,description,last_modified_by,category,company.
wb.properties["title"] = "My Report"
print(wb.properties["creator"])Accessed via ws.column(col_index) or ws.column("A").
| Property | Type | Description |
|---|---|---|
width |
float |
Get or set the column width. |
hidden |
bool |
Get or set whether the column is hidden. |
style_index |
int |
Get or set the default style index for the column. |
Accessed via cell.formula.
| Property | Type | Description |
|---|---|---|
text |
str |
Get or set the formula string. |
| Method | Return Type | Description |
|---|---|---|
clear() |
None |
Remove the formula from the cell. |
Accessed via ws.merges. Represents the collection of merged ranges in a worksheet.
| Method | Return Type | Description |
|---|---|---|
append(reference) |
None |
Create a merged range (e.g., "A1:B2"). |
delete(index) |
None |
Remove a merged range by its index. |
find(reference) |
int |
Find the index of a merged range. Returns -1 if not found. |
__len__() |
int |
Return the number of merged ranges. |
__getitem__(index) |
XLMergeCell |
Get a merged range object by index. |
__iter__() |
Iterator |
Iterate over all merged ranges. |
__contains__(ref) |
bool |
Check if a reference is within any merged range. |
Accessed via ws._sheet.comments().
| Method | Return Type | Description |
|---|---|---|
add_author(name) |
int |
Add a new author to the workbook. |
set(ref, text, author_id=0) |
None |
Set comment for a cell reference. |
get(ref_or_idx) |
str | XLComment |
Get comment text or object. |
shape(cell_ref) |
XLShape |
Get the VML shape object for the comment box. |
count() |
int |
Number of comments in the sheet. |
Represents the visual properties of a comment box.
| Method | Return Type | Description |
|---|---|---|
style() |
XLShapeStyle |
Access size, position, and visibility properties. |
client_data() |
XLShapeClientData |
Access Excel-specific anchor and auto-fill data. |
| Method | Return Type | Description |
|---|---|---|
set_width(val) |
None |
Set box width in points. |
set_height(val) |
None |
Set box height in points. |
show() / hide() |
None |
Set comment visibility. |
| Method | Return Type | Description |
|---|---|---|
set_anchor(str) |
None |
Set box position/size using grid coordinates. |
set_auto_fill(bool) |
None |
Enable/disable automatic box sizing. |
Returned by wb.get_embedded_images().
| Attribute | Type | Description |
|---|---|---|
name |
str |
Filename of the image. |
path |
str |
Internal path in the XLSX archive. |
extension |
str |
File extension (e.g., "png"). |
def load_workbook(filename: str) -> Workbook:
"""Open an existing Excel file."""async def load_workbook_async(filename: str) -> Workbook:
"""Asynchronously open an existing Excel file."""def is_date_format(format_code: int | str) -> bool:
"""
Check if a number format code (int) or string represents a date/time format.
Useful for determining if a cell value should be treated as a datetime.
"""pyopenxlsx is built for speed. By leveraging the C++ OpenXLSX engine and providing optimized bulk operations, it significantly outperforms pure-Python alternatives.
| Scenario | pyopenxlsx | openpyxl | Speedup |
|---|---|---|---|
| Read (20,000 cells) | ~7.2ms | ~145ms | 20.2x |
| Write (1,000 cells) | ~4.8ms | ~8.1ms | 1.7x |
| Write (50,000 cells) | ~169ms | ~305ms | 1.8x |
| Bulk Write (50,000 cells) | ~74ms | N/A | 4.1x |
| Iteration (20,000 cells) | ~80ms | ~150ms | 1.9x |
| Bulk Write (1,000,000 cells) | ~1.5s | ~6.2s | 4.1x |
| Library | Execution Time | Memory Delta | CPU Load |
|---|---|---|---|
| pyopenxlsx | ~1.5s | ~400 MB | ~99% |
| openpyxl | ~6.2s | ~600 MB* | ~99% |
Note
*Memory delta for openpyxl can be misleading due to Python's garbage collection timing during the benchmark. However, pyopenxlsx consistently shows lower memory pressure for bulk operations as data is handled primarily in C++.
- C++ Foundation: Core operations happen in highly optimized C++.
- Reduced Object Overhead:
pyopenxlsxminimizes the creation of many PythonCellobjects during bulk operations. - Efficient Memory Mapping: Leverages the memory-efficient design of OpenXLSX.
- Asynchronous I/O: Key operations are available as non-blocking coroutines to maximize throughput in concurrent applications.
# Run all tests
uv run pytest
# With coverage
uv run pytest --cov=src/pyopenxlsx --cov-report=term-missingBSD 3-Clause License. The underlying OpenXLSX library is licensed under the MIT License, and nanobind under a BSD-style license.