📊 Convert Excel files (XLSX, XLS) to CSV format. Handle multiple sheets, large files, and batch processing. Perfect for ETL pipelines and data integration.
A robust Excel to CSV converter that:
- Converts XLSX and XLS files to standard CSV format
- Handles multiple sheets - extract all or specific sheets
- Processes large files - memory-efficient streaming
- Preserves data types - dates, numbers, text formatted correctly
- Batch processing - convert multiple files at once
| Use Case | Description |
|---|---|
| ETL Pipelines | Transform Excel exports for data warehouses |
| Data Migration | Convert legacy Excel databases |
| API Integration | Excel → CSV → JSON for APIs |
| Reporting | Standardize financial reports |
| Data Analysis | Prepare data for Pandas, R, SQL |
| Automation | Process daily Excel email attachments |
Drag and drop your Excel file into the Apify Console.
{
"fileUrl": "https://example.com/report.xlsx"
}
{
"fileUrls": [
"https://example.com/report-q1.xlsx",
"https://example.com/report-q2.xlsx",
"https://example.com/report-q3.xlsx"
]
}
| Parameter | Type | Description |
|---|---|---|
| file | string | Upload file directly |
| fileUrl | string | URL to Excel file |
| fileUrls | array | Multiple file URLs |
| Parameter | Type | Default | Description |
|---|---|---|---|
| allSheets | boolean | true | Convert all sheets |
| sheets | array | [] | Specific sheet names or indices |
| Parameter | Type | Default | Description |
|---|---|---|---|
| delimiter | string | `,` | Field separator, such as `,`, `;`, or `\t` |
| includeHeaders | boolean | true | First row is headers |
| dateFormat | string | YYYY-MM-DD | Date formatting (dayjs) |
| skipEmptyRows | boolean | true | Omit blank rows |
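When the resulting CSV is read back in Python, the dateFormat string has to be translated into a `strptime` pattern before the dates can be parsed. The following is a minimal sketch of that translation. It assumes only the common dayjs tokens listed in the mapping are used, and the helper name `dayjs_to_strftime` is hypothetical, not part of this converter.

```python
import re
from datetime import datetime

# Subset of dayjs tokens and their strptime equivalents.
# Assumption: the dateFormat value uses only these tokens.
_TOKEN_MAP = {
    "YYYY": "%Y",
    "YY": "%y",
    "MM": "%m",
    "DD": "%d",
    "HH": "%H",
    "mm": "%M",
    "ss": "%S",
}
# Longest alternatives first so "YYYY" is not read as two "YY" tokens.
_TOKEN_RE = re.compile("|".join(sorted(_TOKEN_MAP, key=len, reverse=True)))


def dayjs_to_strftime(fmt: str) -> str:
    """Translate a dayjs-style format string into a strftime/strptime pattern."""
    return _TOKEN_RE.sub(lambda m: _TOKEN_MAP[m.group(0)], fmt)


pattern = dayjs_to_strftime("YYYY-MM-DD")
parsed = datetime.strptime("2024-01-15", pattern).date()
print(pattern, parsed)
```

Formats containing other dayjs tokens (month names, AM/PM, escaped text) would need extra entries in the mapping.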
| Parameter | Type | Default | Description |
|---|---|---|---|
| outputToDataset | boolean | false | Also push rows as JSON |
{
"fileName": "sales-report.xlsx",
"sheetName": "Q1 Sales",
"sheetIndex": 0,
"rowCount": 1523,
"columnCount": 12,
"csvUrl": "https://api.apify.com/v2/key-value-stores/.../records/sales_q1.csv",
"status": "success",
"convertedAt": "2024-01-15T10:30:00.000Z"
}
Download CSV files directly from the Key-Value Store.
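Because each converted sheet produces one record like the one above, a run over a multi-sheet workbook can be summarized by aggregating those records. The sketch below shows one way to do that on already-fetched items. The function name `summarize_results` is hypothetical, and the `"error"` status in the sample data is an assumption, since only `"success"` is documented here.

```python
from typing import Any


def summarize_results(items: list[dict[str, Any]]) -> dict[str, Any]:
    """Aggregate per-sheet result records into one run-level summary."""
    succeeded = [it for it in items if it.get("status") == "success"]
    failed = [it for it in items if it.get("status") != "success"]
    return {
        "sheets_converted": len(succeeded),
        "sheets_failed": len(failed),
        "total_rows": sum(it.get("rowCount", 0) for it in succeeded),
        "csv_urls": [it["csvUrl"] for it in succeeded if "csvUrl" in it],
    }


# Sample records shaped like the output above.
# The second record and its "error" status are hypothetical.
sample_items = [
    {
        "fileName": "sales-report.xlsx",
        "sheetName": "Q1 Sales",
        "sheetIndex": 0,
        "rowCount": 1523,
        "columnCount": 12,
        "csvUrl": "https://api.apify.com/v2/key-value-stores/.../records/sales_q1.csv",
        "status": "success",
    },
    {"fileName": "sales-report.xlsx", "sheetName": "Q2 Sales", "sheetIndex": 1, "status": "error"},
]

summary = summarize_results(sample_items)
print(summary)
```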
- Upload your Excel file or enter URL
- Configure sheet and CSV options
- Click Start
- Download CSVs from Storage → Key-Value Store
curl -X POST "https://api.apify.com/v2/acts/YOUR_USERNAME~excel-to-csv/runs?token=YOUR_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"fileUrl": "https://example.com/data.xlsx",
"allSheets": true,
"delimiter": ","
}'
import { ApifyClient } from 'apify-client';
const client = new ApifyClient({ token: 'YOUR_TOKEN' });
const run = await client.actor('YOUR_USERNAME/excel-to-csv').call({
fileUrl: 'https://example.com/quarterly-report.xlsx',
allSheets: true,
delimiter: ',',
dateFormat: 'YYYY-MM-DD'
});
// Get conversion results
const { items } = await client.dataset(run.defaultDatasetId).listItems();
for (const item of items) {
console.log(`Sheet: ${item.sheetName}`);
console.log(`Rows: ${item.rowCount}`);
console.log(`Download: ${item.csvUrl}`);
}
from apify_client import ApifyClient
import pandas as pd
client = ApifyClient('YOUR_TOKEN')
# Convert Excel to CSV
run = client.actor('YOUR_USERNAME/excel-to-csv').call(run_input={
'fileUrl': 'https://example.com/data.xlsx'
})
# Get CSV URLs
items = client.dataset(run['defaultDatasetId']).list_items().items
# Load into pandas
for item in items:
    if item['status'] == 'success':
        df = pd.read_csv(item['csvUrl'])
        print(f"Loaded {item['sheetName']}: {len(df)} rows")
{
"fileUrl": "https://example.com/workbook.xlsx",
"allSheets": false,
"sheets": ["Summary", "Data", "0"]
}
Note: You can use sheet names or zero-based indices. In the example above, "0" refers to the first sheet in the workbook.
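How a mixed list of names and indices is resolved internally is not documented here. The sketch below shows one plausible interpretation, useful for checking a `sheets` value before submitting it: an exact name match wins, and otherwise an integer or digit-only string is treated as a zero-based index. The function `resolve_sheets` and this precedence rule are assumptions.

```python
def resolve_sheets(selectors: list, sheet_names: list[str]) -> list[str]:
    """Map a mixed list of sheet names and zero-based indices to sheet names."""
    resolved: list[str] = []
    for sel in selectors:
        if isinstance(sel, str) and sel in sheet_names:
            # Exact name match takes precedence (assumed rule).
            name = sel
        elif (isinstance(sel, int) and not isinstance(sel, bool)) or (
            isinstance(sel, str) and sel.isdigit()
        ):
            idx = int(sel)
            if not 0 <= idx < len(sheet_names):
                raise ValueError(f"Sheet index out of range: {sel!r}")
            name = sheet_names[idx]
        else:
            raise ValueError(f"Unknown sheet selector: {sel!r}")
        if name not in resolved:  # keep order, drop duplicates
            resolved.append(name)
    return resolved


# Hypothetical workbook layout for illustration.
workbook_sheets = ["Cover", "Summary", "Data"]
selected = resolve_sheets(["Summary", "Data", "0"], workbook_sheets)
print(selected)
```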
{
"fileUrl": "https://example.com/report.xlsx",
"delimiter": ";",
"dateFormat": "DD.MM.YYYY"
}
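A CSV produced with these options needs matching settings on the reading side. The sketch below parses semicolon-delimited text with `DD.MM.YYYY` dates using only the Python standard library. The sample text, the column name `SignupDate`, and the function `read_semicolon_csv` are illustrative assumptions, not actual converter output.

```python
import csv
import io
from datetime import datetime


def read_semicolon_csv(text: str, date_columns: list[str]) -> list[dict]:
    """Parse ';'-delimited CSV text and convert DD.MM.YYYY columns to dates."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    rows = []
    for row in reader:
        for col in date_columns:
            if row.get(col):  # leave empty cells untouched
                row[col] = datetime.strptime(row[col], "%d.%m.%Y").date()
        rows.append(row)
    return rows


# Hypothetical CSV content for illustration.
sample = "Name;SignupDate\nJohn Doe;15.01.2024\nJane Roe;03.02.2024\n"
rows = read_semicolon_csv(sample, ["SignupDate"])
for row in rows:
    print(row["Name"], row["SignupDate"].isoformat())
```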
{
"fileUrl": "https://example.com/customers.xlsx",
"outputToDataset": true,
"includeHeaders": true
}
This adds each row as a JSON object to the Dataset:
{
"_sheet": "Customers",
"_file": "customers.xlsx",
"Name": "John Doe",
"Email": "john@example.com",
"SignupDate": "2024-01-15"
}
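Since every row carries `_sheet` and `_file` fields, rows from several sheets or files can be separated again after download. The following sketch groups rows by source and strips the underscore-prefixed metadata. The function `group_rows_by_sheet` is hypothetical, and skipping records without a `_sheet` key is an assumption about what else the Dataset may contain.

```python
from collections import defaultdict


def group_rows_by_sheet(items: list[dict]) -> dict[tuple, list[dict]]:
    """Group row objects by (file, sheet) and drop the '_'-prefixed metadata keys."""
    grouped: dict[tuple, list[dict]] = defaultdict(list)
    for item in items:
        if "_sheet" not in item:
            continue  # not a row record (assumption: e.g. a per-sheet summary)
        data = {k: v for k, v in item.items() if not k.startswith("_")}
        grouped[(item.get("_file"), item["_sheet"])].append(data)
    return dict(grouped)


# Sample rows shaped like the object above; the second and third are hypothetical.
sample_rows = [
    {"_sheet": "Customers", "_file": "customers.xlsx", "Name": "John Doe",
     "Email": "john@example.com", "SignupDate": "2024-01-15"},
    {"_sheet": "Customers", "_file": "customers.xlsx", "Name": "Jane Roe",
     "Email": "jane@example.com", "SignupDate": "2024-02-03"},
    {"_sheet": "Archive", "_file": "customers.xlsx", "Name": "Old Account",
     "Email": "old@example.com", "SignupDate": "2019-06-30"},
]

grouped = group_rows_by_sheet(sample_rows)
for (file_name, sheet), rows in grouped.items():
    print(file_name, sheet, len(rows))
```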
| Format | Extension | Support |
|---|---|---|
| Excel 2007+ | .xlsx | ✅ Full |
| Excel 97-2003 | .xls | ✅ Full |
| OpenDocument | .ods | ✅ Full |
| CSV (input) | .csv | ✅ Full |
| Numbers | .numbers |
| File Size | Sheets | Approx. Time | Compute Units |
|---|---|---|---|
| 1 MB | 3 | ~5 seconds | ~0.002 |
| 10 MB | 5 | ~15 seconds | ~0.008 |
| 50 MB | 10 | ~45 seconds | ~0.03 |
| 100 MB | 20 | ~2 minutes | ~0.08 |
- Node.js: 22.x
- Library: SheetJS (xlsx)
- Max File Size: ~200MB recommended
- Memory: 512MB-2GB depending on file size
- Formulas: Values only (not formula text)
- Formatting: Lost in CSV conversion
- Merged Cells: Unmerged, value in first cell
- Images/Charts: Not extracted
- Password Protected: Not supported
Dates are converted using the dateFormat parameter (default: YYYY-MM-DD), which accepts dayjs format tokens.
Numbers are extracted as raw values. Currency symbols and formatting are removed.
No, password-protected Excel files are not currently supported.
The recommended maximum is ~200MB. Larger files may time out or run out of memory.
// Assumes `client` is an ApifyClient instance, created as in the JavaScript example.
// fetchLatestReport, database, and sendSlackNotification are placeholders for your own code.
// 1. Fetch Excel from email/S3/API
const excelUrl = await fetchLatestReport();
// 2. Convert to CSV and push each row to the Dataset
const convertRun = await client.actor('YOUR_USERNAME/excel-to-csv').call({
fileUrl: excelUrl,
outputToDataset: true
});
// 3. Load into database
const { items } = await client.dataset(convertRun.defaultDatasetId).listItems();
await database.insertMany(items);
// 4. Notify completion
await sendSlackNotification(`Imported ${items.length} rows`);
MIT License - see LICENSE for details.