A powerful CLI tool for importing CSV files into multiple database systems with automatic schema detection, type inference, and streaming support for large files.
- SQL Server (MSSQL) - with proper handling of special characters in column names
- PostgreSQL
- ClickHouse
- MongoDB
- ✨ Automatic type detection (string, number, boolean, date, datetime)
- 📊 Dynamic column detection from CSV headers
- 🚀 Streaming architecture - handles files of any size without memory issues
- 💾 Memory efficient - constant ~50MB memory usage regardless of file size
- 🎯 Progress tracking with visual feedback and ETA
- 🔍 Dry-run mode to preview schema without importing
- 🔄 Drop table option for fresh imports
- 🌐 Connection string support via CLI or environment variables
- 🔧 Special character handling - properly escapes brackets and special chars in column names
- 📦 Docker Compose included for quick database setup
git clone <repository>
cd csv-importer
npm install
# Start all databases
docker-compose up -d
# Or use the automated setup script
./setup-docker.sh
# Or use Make commands
make setup
# Copy example environment file
cp .env.docker .env
# Or configure manually (see Environment Variables section)
# Development mode
npm run dev -- import -f yourfile.csv -d mongodb -t your_table
# For large files (with increased memory)
npm run dev:large -- import -f largefile.csv -d postgres -t data -b 10000
# Production mode
npm run build
npm start import -f yourfile.csv -d mongodb -t your_table
npm install
npm run build
Or for development:
npm install
npm run dev -- import [options]
# Using connection string from .env
csv-import import -f data.csv -d postgres -t my_table
# Using connection string directly
csv-import import -f data.csv -d mssql -t my_table -c "Server=localhost;Database=mydb;..."
# Dry run to preview schema
csv-import import -f data.csv -d postgres -t my_table --dry-run
# Drop existing table before import
csv-import import -f data.csv -d mongodb -t my_collection --drop
# Custom batch size
csv-import import -f data.csv -d clickhouse -t my_table -b 5000
# For large files (increases memory limit)
npm run dev:large -- import -f large-data.csv -d mongodb -t big_data -b 10000
-f, --file <path> Path to CSV file (required)
-d, --database <type> Database type: mssql, postgres, clickhouse, mongodb (required)
-t, --table <name> Table/Collection name (required)
-c, --connection <string> Connection string (overrides .env)
-b, --batch-size <number> Batch size for inserts (default: 1000)
--drop Drop table/collection if exists before import
--dry-run Parse CSV and show schema without importing
The project includes a complete Docker Compose configuration for all databases.
# Start all services
docker-compose up -d
# Check status
docker-compose ps
# View logs
docker-compose logs -f
# See all available commands
make help
# Setup and start everything
make setup
# Import test data to all databases
make test-all
# Import to specific database
make test-mongodb
make test-postgres
make test-clickhouse
make test-mssql
# Access database shells
make shell-mongodb
make shell-postgres
make shell-clickhouse
make shell-mssql
# SQL Server troubleshooting
make diagnose-mssql
make fix-mssql
After starting Docker Compose, you can access:
- Adminer (SQL databases): http://localhost:8080
- Mongo Express (MongoDB): http://localhost:8081
Create a .env file in the project root:
# SQL Server
MSSQL_CONNECTION_STRING=Server=localhost,1433;Database=testdb;User Id=sa;Password=YourStrong@Passw0rd;Encrypt=true;TrustServerCertificate=true
# PostgreSQL
POSTGRES_CONNECTION_STRING=postgresql://postgres:postgres@localhost:5432/testdb
# ClickHouse
CLICKHOUSE_CONNECTION_STRING=http://default:@localhost:8123/default
# MongoDB
MONGODB_CONNECTION_STRING=mongodb://admin:password@localhost:27017/testdb?authSource=admin
The importer uses streaming to handle files of any size efficiently:
# For large files, use the dev:large script (4GB memory limit)
npm run dev:large -- import -f huge.csv -d postgres -t data -b 10000
# Or set memory manually
NODE_OPTIONS="--max-old-space-size=8192" npm run dev -- import -f huge.csv -d mongodb -t data -b 10000
Memory Usage: The importer only loads ~1000 rows for schema detection, then streams the rest. Memory usage stays constant regardless of file size.
| File Size | Rows | Memory Usage | Processing Time* |
|---|---|---|---|
| 10 MB | ~50K | ~50 MB | ~10 seconds |
| 100 MB | ~500K | ~50 MB | ~1-2 minutes |
| 1 GB | ~5M | ~50 MB | ~10-15 minutes |
| 10 GB | ~50M | ~50 MB | ~2-3 hours |
*Times are approximate and vary by database and hardware
The tool properly handles special characters, including square brackets, in column names (especially important for SQL Server).
CSV with special characters:
Time[h],I[A],U[V],P[W],Temp[°C],Δ_Value
1.5,2.3,12.0,27.6,25.5,0.12
These are automatically escaped correctly (see the sketch below):
- SQL Server: [Time[h]]], [I[A]]], [Temp[°C]]]
- PostgreSQL: "Time[h]", "I[A]", "Temp[°C]"
- Other databases handle them natively
The tool automatically detects column types from CSV data:
| Detected Type | SQL Server | PostgreSQL | ClickHouse | MongoDB |
|---|---|---|---|---|
| string | NVARCHAR(MAX) | TEXT | String | String |
| number | FLOAT | DOUBLE PRECISION | Float64 | Number |
| boolean | BIT | BOOLEAN | UInt8 | Boolean |
| date | DATE | DATE | Date | Date |
| datetime | DATETIME2 | TIMESTAMP | DateTime | Date |
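Inside the adapters this mapping is essentially a lookup table; a minimal sketch mirroring the table above (names are illustrative; MongoDB is schemaless, so it needs no DDL types):

```typescript
type DetectedType = "string" | "number" | "boolean" | "date" | "datetime";

// Column DDL per target database, mirroring the mapping table above.
const columnTypes: Record<string, Record<DetectedType, string>> = {
  mssql:      { string: "NVARCHAR(MAX)", number: "FLOAT", boolean: "BIT", date: "DATE", datetime: "DATETIME2" },
  postgres:   { string: "TEXT", number: "DOUBLE PRECISION", boolean: "BOOLEAN", date: "DATE", datetime: "TIMESTAMP" },
  clickhouse: { string: "String", number: "Float64", boolean: "UInt8", date: "Date", datetime: "DateTime" },
};
```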
id,name,age,is_active,created_at
1,John Doe,30,true,2024-01-15 10:30:00
2,Jane Smith,25,false,2024-01-16 14:20:00
npm run dev -- import -f users.csv -d mssql -t users
npm run dev -- import -f users.csv -d postgres -t users --drop
npm run dev -- import -f users.csv -d clickhouse -t users -b 10000
npm run dev -- import -f users.csv -d mongodb -t users
- Batch Size: Adjust based on your data and database
  - Small datasets: 1000 (default)
  - Large datasets: 5000-10000
  - Very large datasets: 10000+
- Network: For cloud databases, larger batch sizes reduce round trips
- Memory: Smaller batch sizes use less memory
- Database-specific recommendations:
  - SQL Server: 1,000-5,000 rows per batch
  - PostgreSQL: 1,000-5,000 rows per batch
  - ClickHouse: 10,000-50,000 rows per batch (handles very large batches well)
  - MongoDB: 5,000-10,000 rows per batch
| File Size | Recommended Batch Size |
|---|---|
| < 100 MB | 1,000 (default) |
| 100 MB - 1 GB | 5,000 |
| 1 GB - 10 GB | 10,000 |
| > 10 GB | 10,000-50,000 (ClickHouse) |
If you encounter out-of-memory errors:
# Use the large file script
npm run dev:large -- import -f file.csv -d mongodb -t data
# Or increase memory manually
NODE_OPTIONS="--max-old-space-size=8192" npm run dev -- import -f file.csv -d mongodb -t data
The tool provides clear error messages for common issues:
- Missing connection string
- Invalid CSV format
- Database connection failures
- Schema creation errors
- Insert failures
- Memory limit exceeded
Use the dry-run mode first to validate your CSV structure:
npm run dev -- import -f data.csv -d postgres -t my_table --dry-run
# 1. Clone and setup
git clone <repository>
cd csv-importer
npm install
# 2. Start databases with Docker
docker-compose up -d
# Wait 30 seconds for databases to start
sleep 30
# 3. Configure environment
cp .env.docker .env
# 4. Test with sample data
npm run dev -- import -f sample-data.csv -d mongodb -t employees
# 5. Import your actual data
npm run dev:large -- import -f yourdata.csv -d postgres -t your_table -b 5000
# 6. Verify import
make shell-postgres
# Then: SELECT COUNT(*) FROM your_table;
# Install dependencies
npm install
# Run in development mode
npm run dev -- import -f test.csv -d postgres -t test
# Run with increased memory (for large files)
npm run dev:large -- import -f large.csv -d postgres -t test
# Build for production
npm run build
# Run built version
npm start import -f test.csv -d postgres -t test
csv-importer/
├── src/
│ ├── index.ts # CLI entry point
│ ├── importer.ts # Main import logic with streaming
│ ├── parser.ts # CSV parsing & type detection
│ ├── types.ts # TypeScript interfaces
│ └── adapters/
│ ├── index.ts # Adapter factory
│ ├── mssql.ts # SQL Server adapter
│ ├── postgres.ts # PostgreSQL adapter
│ ├── clickhouse.ts # ClickHouse adapter
│ └── mongodb.ts # MongoDB adapter
├── docker-compose.yml # All databases setup
├── Makefile # Convenient commands
├── *.sh # Helper scripts
└── *.md # Documentation
Unlike traditional CSV importers that load entire files into memory, this tool:
- Reads CSV files as streams
- Processes data in configurable batches
- Maintains constant memory usage (~50MB)
- Can handle files of any size (limited only by disk space)
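In outline, the import loop looks something like the sketch below. It assumes the csv-parse package and an illustrative adapter interface; the project's actual code may differ:

```typescript
import { createReadStream } from "node:fs";
import { parse } from "csv-parse";

interface DatabaseAdapter {
  insertBatch(table: string, rows: Record<string, string>[]): Promise<void>;
}

// Stream the CSV and flush rows in fixed-size batches, so memory is bounded
// by the batch size rather than the file size.
async function streamImport(
  file: string,
  table: string,
  adapter: DatabaseAdapter,
  batchSize = 1000
): Promise<void> {
  const parser = createReadStream(file).pipe(parse({ columns: true }));
  let batch: Record<string, string>[] = [];

  for await (const row of parser) {
    batch.push(row);
    if (batch.length >= batchSize) {
      await adapter.insertBatch(table, batch);
      batch = [];
    }
  }
  if (batch.length > 0) {
    await adapter.insertBatch(table, batch); // flush the final partial batch
  }
}
```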
The tool analyzes the first 1,000 rows to detect:
- Numbers: Integer or float values
- Booleans: true/false, yes/no, 1/0
- Dates: YYYY-MM-DD format
- DateTimes: ISO 8601 and common formats
- Strings: Everything else
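A simplified version of that heuristic might look like the following (illustrative only; the real detection also aggregates results across the sampled rows):

```typescript
type DetectedType = "string" | "number" | "boolean" | "date" | "datetime";

// Classify a single cell value; the importer applies this to the sample rows
// and keeps the most specific type all values agree on (columns containing
// only 0/1 can also be treated as booleans).
function detectType(value: string): DetectedType {
  const v = value.trim();
  if (/^-?\d+(\.\d+)?$/.test(v)) return "number";
  if (/^(true|false|yes|no)$/i.test(v)) return "boolean";
  if (/^\d{4}-\d{2}-\d{2}$/.test(v)) return "date";
  if (!Number.isNaN(Date.parse(v))) return "datetime";
  return "string";
}
```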
- SQL Server: Multi-row parameterized INSERT with proper bracket escaping
- PostgreSQL: Batch INSERT with proper identifier quoting
- ClickHouse: Native batch insert (handles very large batches)
- MongoDB: Unordered insertMany for maximum throughput
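As an example of the last point, the MongoDB write path reduces to an unordered insertMany; a sketch using the official mongodb driver (the adapter's real signature may differ):

```typescript
import { Collection, Document } from "mongodb";

// Unordered inserts let MongoDB keep processing the batch past individual
// document errors, which maximizes throughput for bulk loads.
async function insertBatch(collection: Collection<Document>, rows: Document[]): Promise<void> {
  if (rows.length === 0) return;
  await collection.insertMany(rows, { ordered: false });
}
```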
MIT