# Import data from Excel files

Load data from Excel spreadsheets (.xlsx) into Pixeltable tables.

## Problem

You have data in Excel format that needs to be loaded for AI processing: reports, inventory lists, or business data exported from other systems.

| Source | Rows | Use case |
|--------|------|----------|
| Sales report.xlsx | 10K | Analyze with AI |
| Inventory.xlsx | 5K | Enrich with descriptions |
| Survey results.xlsx | 1K | Sentiment analysis |

## Solution

**What's in this recipe:**

- Import Excel files directly into tables
- Handle multiple sheets
- Override column types when needed

Call `pxt.create_table()` with the Excel file path as the `source` parameter. Pixeltable infers the column types automatically.

### Setup

In [1]:
%pip install -qU pixeltable openpyxl pandas

In [2]:
import pixeltable as pxt
import pandas as pd
import tempfile
from pathlib import Path

### Create sample Excel file

In [3]:
# Create sample Excel file for demo
sample_data = pd.DataFrame(
    {
        'order_id': [1001, 1002, 1003, 1004, 1005],
        'customer': ['Alice', 'Bob', 'Carol', 'Dave', 'Eve'],
        'product': [
            'Widget A',
            'Gadget B',
            'Widget A',
            'Tool C',
            'Gadget B',
        ],
        'quantity': [2, 1, 5, 3, 2],
        'price': [29.99, 149.99, 29.99, 79.99, 149.99],
        'date': [
            '2024-01-15',
            '2024-01-16',
            '2024-01-16',
            '2024-01-17',
            '2024-01-18',
        ],
    }
)

# Save to temp Excel file
temp_dir = tempfile.mkdtemp()
excel_path = Path(temp_dir) / 'orders.xlsx'
sample_data.to_excel(excel_path, index=False)
sample_data

Unnamed: 0,order_id,customer,product,quantity,price,date
0,1001,Alice,Widget A,2,29.99,2024-01-15
1,1002,Bob,Gadget B,1,149.99,2024-01-16
2,1003,Carol,Widget A,5,29.99,2024-01-16
3,1004,Dave,Tool C,3,79.99,2024-01-17
4,1005,Eve,Gadget B,2,149.99,2024-01-18


### Import Excel file

In [4]:
# Create a fresh directory
pxt.drop_dir('excel_demo', force=True)
pxt.create_dir('excel_demo')

Connected to Pixeltable database at: postgresql+psycopg://postgres:@/pixeltable?host=/Users/pjlb/.pixeltable/pgdata
Created directory 'excel_demo'.


<pixeltable.catalog.dir.Dir at 0x30b6cb730>

In [5]:
# Import Excel file directly
orders = pxt.create_table(
    'excel_demo/orders',
    source=str(excel_path),
    source_format='excel',  # Hint for Excel format
)

Created table 'orders'.



Inserting rows into `orders`: 5 rows [00:00, 501.21 rows/s]


Inserted 5 rows with 0 errors.


In [6]:
# View imported data
orders.collect()

order_id,customer,product,quantity,price,date
1001,Alice,Widget A,2,29.99,2024-01-15
1002,Bob,Gadget B,1,149.99,2024-01-16
1003,Carol,Widget A,5,29.99,2024-01-16
1004,Dave,Tool C,3,79.99,2024-01-17
1005,Eve,Gadget B,2,149.99,2024-01-18


### Add computed columns

In [7]:
# Add computed column for order total
orders.add_computed_column(total=orders.quantity * orders.price)

Added 5 column values with 0 errors.


5 rows updated, 10 values computed.

In [8]:
# View with computed total
orders.select(
    orders.order_id,
    orders.customer,
    orders.product,
    orders.quantity,
    orders.price,
    orders.total,
).collect()

order_id,customer,product,quantity,price,total
1001,Alice,Widget A,2,29.99,59.98
1002,Bob,Gadget B,1,149.99,149.99
1003,Carol,Widget A,5,29.99,149.95
1004,Dave,Tool C,3,79.99,239.97
1005,Eve,Gadget B,2,149.99,299.98
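
As a quick cross-check of the computed column, the same multiplication can be reproduced in plain Python. This is only a sketch using the sample values from this recipe; rounding to two decimals is an assumption added here to sidestep floating-point noise, not something Pixeltable applies.

```python
# Sample values copied from the orders table above.
quantities = [2, 1, 5, 3, 2]
prices = [29.99, 149.99, 29.99, 79.99, 149.99]

# Same arithmetic as `orders.quantity * orders.price`, one row at a time.
# round(..., 2) is an assumption made for display; it is not part of the recipe.
totals = [round(q * p, 2) for q, p in zip(quantities, prices)]

for order_id, total in zip(range(1001, 1006), totals):
    print(order_id, total)
```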


## Explanation

**Import methods:**

| Method | Example |
|--------|---------|
| With source_format hint | `pxt.create_table('t', source=path, source_format='excel')` |
| Auto-detect from .xlsx | `pxt.create_table('t', source='data.xlsx')` |

**Excel-specific options:**

Pass pandas `read_excel` arguments via `extra_args`:

```python
pxt.create_table(
    'table_name',
    source='data.xlsx',
    source_format='excel',
    extra_args={'sheet_name': 'Sheet2', 'skiprows': 1}
)
```
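
When a workbook has several sheets, one possible approach is to create one table per sheet by repeating the call above with a different `sheet_name`. The sketch below assumes the sheet names are read with pandas `ExcelFile.sheet_names`. `sheet_table_name` and `import_all_sheets` are hypothetical helpers, not part of Pixeltable, and the name-cleaning rule assumes table names should look like Python identifiers.

```python
import re


def sheet_table_name(sheet_name: str) -> str:
    """Turn an Excel sheet name into an identifier-style table name (hypothetical helper)."""
    # Replace runs of non-word characters with underscores, then lowercase.
    name = re.sub(r'\W+', '_', sheet_name.strip()).strip('_').lower()
    # Identifiers cannot be empty or start with a digit, so add a prefix.
    if not name or name[0].isdigit():
        name = f'sheet_{name}'
    return name


def import_all_sheets(excel_path: str, directory: str) -> dict:
    """Create one table per sheet; returns {sheet name: table} (hypothetical helper)."""
    # Imported lazily so the name helper above can be used on its own.
    import pandas as pd
    import pixeltable as pxt

    # Read only the sheet names; the rows are loaded by create_table below.
    with pd.ExcelFile(excel_path) as workbook:
        sheet_names = workbook.sheet_names

    tables = {}
    for sheet in sheet_names:
        tables[sheet] = pxt.create_table(
            f'{directory}/{sheet_table_name(sheet)}',
            source=excel_path,
            source_format='excel',
            extra_args={'sheet_name': sheet},
        )
    return tables
```

For example, `import_all_sheets('data.xlsx', 'excel_demo')` would create `excel_demo/sheet2` for a sheet named `Sheet2`. Two sheets whose names clean to the same identifier would collide, so real workbooks may need a stricter naming scheme.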

**Common extra_args:**

| Option | Purpose |
|--------|---------|
| `sheet_name` | Select specific sheet |
| `skiprows` | Skip rows at the top of the sheet |
| `usecols` | Select specific columns |
| `dtype` | Force column types |
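
Because these options are forwarded to pandas, one way to try them out is to build the `extra_args` dictionary once and preview the result with `pd.read_excel` before creating a table. The helper names below (`excel_extra_args`, `preview_excel`) are hypothetical, and the sketch assumes `extra_args` reaches `read_excel` unchanged, as described above.

```python
def excel_extra_args(sheet_name=0, text_columns=(), keep_columns=None, skiprows=0):
    """Build a dict of pandas read_excel options (hypothetical helper)."""
    args = {'sheet_name': sheet_name}
    if skiprows:
        # Number of rows to skip at the top of the sheet.
        args['skiprows'] = skiprows
    if keep_columns is not None:
        # Read only these columns, selected by header name.
        args['usecols'] = list(keep_columns)
    if text_columns:
        # Read these columns as text, e.g. IDs that should not become numbers.
        args['dtype'] = {col: str for col in text_columns}
    return args


def preview_excel(path, extra_args, rows=5):
    """Show what pandas reads with the given options (hypothetical helper)."""
    # Imported lazily so the dict builder above has no dependencies.
    import pandas as pd

    return pd.read_excel(path, **extra_args).head(rows)


# Options for reading order_id as text and keeping two columns from 'Sheet2'.
args = excel_extra_args(
    sheet_name='Sheet2',
    text_columns=['order_id'],
    keep_columns=['order_id', 'price'],
)
```

Once the preview looks right, the same dictionary can be passed as `extra_args=args` in the `pxt.create_table()` call shown earlier.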

## See also

- [Import CSV files](https://docs.pixeltable.com/howto/cookbooks/data/data-import-csv) - CSV and tabular data
- [Import Parquet files](https://docs.pixeltable.com/howto/cookbooks/data/data-import-parquet) - Columnar data