Skip to content

Leptons1618/Ingesta

Repository files navigation

πŸš€ Ingesta - Intelligent Excel to Database Import System

A powerful Next.js application that intelligently transforms Excel files into database tables with automatic data type detection, schema generation, and multi-database support.

Next.js React TypeScript License

✨ Features

🎯 Core Functionality

  • πŸ“ Excel File Upload - Drag & drop or browse to upload multiple Excel files (.xlsx, .xls)
  • πŸ‘οΈ Live Preview - Preview Excel data with headers and sample rows before import
  • πŸ”Œ Multi-Database Support - PostgreSQL, MySQL, MSSQL, SQLite
  • πŸ”— Connection Management - Save and reuse database connections securely
  • πŸ“Š Sheet Selection - Choose which sheets to import from Excel files
  • πŸ€– Intelligent Type Detection - Automatic detection of data types (integers, decimals, dates, booleans, emails, URLs, JSON)
  • βš™οΈ Smart Table Configuration - Auto-generate table schemas with intelligent column suggestions
  • πŸ“ˆ Data Insights - Real-time analysis of data quality, null percentages, and primary key suggestions
  • βœ… Data Validation - Preview and validate data before insertion
  • 🎨 Dark Mode - Full dark mode support with system preference detection

🧠 Intelligent Features

  • Auto-Type Optimization - Database-specific type adaptation (e.g., VARCHAR β†’ NVARCHAR for MSSQL)
  • Primary Key Detection - Automatically suggests optimal primary keys based on uniqueness
  • Null Handling - Smart null value detection and conversion
  • Data Quality Insights - Identifies potential issues before table creation
  • Column Statistics - Shows unique values, null counts, and max lengths
  • Type Suggestions - Recommends optimal data types for each column

πŸ’Ύ Database Support

Database Status Features
PostgreSQL βœ… Full Support SERIAL, TIMESTAMP, JSONB, UUID support
MySQL βœ… Full Support TINYINT, ENUM, JSON support
MSSQL βœ… Full Support NVARCHAR, BIT, DATETIME2 support
SQLite βœ… Full Support INTEGER PRIMARY KEY, TEXT, REAL support

🎬 Quick Start

Prerequisites

  • Node.js 18 or higher
  • pnpm 10.15.0 or higher (recommended) or npm/yarn
  • Database server (PostgreSQL, MySQL, MSSQL, or SQLite)

Installation

  1. Clone the repository
git clone https://github.com/Leptons1618/Ingesta.git
cd Ingesta
  1. Install dependencies
# Using pnpm (recommended)
corepack enable
corepack pnpm install

# Or using npm
npm install
  1. Run the development server
# Using pnpm
corepack pnpm dev

# Or using npm
npm run dev
  1. Open your browser
http://localhost:3000

πŸ“– Usage Guide

1️⃣ Upload Excel Files

  • Drag and drop Excel files or click to browse
  • Supports multiple file uploads
  • Preview data before proceeding

2️⃣ Connect to Database

  • Click "Connect to Database"
  • Select database type (PostgreSQL, MySQL, MSSQL, SQLite)
  • Enter connection details (host, port, username, password, database name)
  • Save connection for future use

3️⃣ Select Sheets

  • Choose which sheets to import from your Excel files
  • Preview sheet data with headers
  • Select multiple sheets for batch processing

4️⃣ Configure Tables

The intelligent table configuration interface provides:

Automatic Features:

  • βœ… Auto-detected data types for each column
  • βœ… Primary key suggestions based on data uniqueness
  • βœ… Column statistics (unique values, null counts, max length)
  • βœ… Data quality insights with emoji indicators

Manual Controls:

  • Edit table names
  • Modify column names
  • Adjust data types (dropdown with database-specific types)
  • Toggle nullable constraints
  • Add/remove columns
  • Set primary keys

Smart Insights Panel:

  • πŸ“Š Data Quality: Percentage of non-null values
  • πŸ”‘ Primary Key Candidates: Columns suitable as primary keys
  • 🎯 Type Distribution: Breakdown of detected types
  • ⚠️ Warnings: Potential issues (high null percentage, no primary key)

5️⃣ Create Tables & Insert Data

  • Click "Create N Tables" button
  • Tables are created with optimized schemas
  • Data is inserted with proper type conversion
  • Null values handled intelligently
  • Progress shown for each table

6️⃣ Preview Created Tables

  • View all created tables with row counts
  • Preview first 10 rows of each table
  • Navigate between tables
  • Verify data insertion success

πŸ—οΈ Project Structure

ingesta/
β”œβ”€β”€ app/                          # Next.js App Router
β”‚   β”œβ”€β”€ layout.tsx               # Root layout with theme provider
β”‚   β”œβ”€β”€ page.tsx                 # Main application page (8-step workflow)
β”‚   β”œβ”€β”€ globals.css              # Global styles
β”‚   └── api/                     # API routes
β”‚       β”œβ”€β”€ database/            # Database operations
β”‚       β”‚   β”œβ”€β”€ connections/     # List saved connections
β”‚       β”‚   β”œβ”€β”€ create-table/    # Create table endpoint
β”‚       β”‚   β”œβ”€β”€ insert-data/     # Insert data endpoint
β”‚       β”‚   β”œβ”€β”€ preview-table/   # Preview table data
β”‚       β”‚   β”œβ”€β”€ test-connection/ # Test DB connection
β”‚       β”‚   └── get-tables/      # List tables in database
β”‚       └── excel/               # Excel operations
β”‚           β”œβ”€β”€ analyze/         # Analyze Excel files
β”‚           └── upload/          # Upload Excel files
β”œβ”€β”€ components/                   # React components
β”‚   β”œβ”€β”€ ui/                      # shadcn/ui components
β”‚   β”‚   β”œβ”€β”€ button.tsx
β”‚   β”‚   β”œβ”€β”€ card.tsx
β”‚   β”‚   β”œβ”€β”€ input.tsx
β”‚   β”‚   β”œβ”€β”€ select.tsx
β”‚   β”‚   β”œβ”€β”€ table.tsx
β”‚   β”‚   └── ...                  # Other UI components
β”‚   β”œβ”€β”€ database-connection-form.tsx
β”‚   β”œβ”€β”€ database-connection-list.tsx
β”‚   β”œβ”€β”€ excel-preview.tsx
β”‚   β”œβ”€β”€ file-upload-zone.tsx
β”‚   β”œβ”€β”€ sheet-selection-interface.tsx
β”‚   β”œβ”€β”€ table-creation-interface.tsx  # ⭐ Intelligent configuration
β”‚   β”œβ”€β”€ table-preview-interface.tsx
β”‚   β”œβ”€β”€ results-dashboard.tsx
β”‚   └── theme-provider.tsx
β”œβ”€β”€ lib/                         # Core business logic
β”‚   β”œβ”€β”€ database-manager.ts      # Database abstraction layer
β”‚   β”œβ”€β”€ excel-parser.ts          # Excel file parsing
β”‚   β”œβ”€β”€ data-type-detector.ts    # 🧠 Intelligent type detection
β”‚   β”œβ”€β”€ data-transformer.ts      # Data transformation utilities
β”‚   β”œβ”€β”€ data-cleaner.ts          # Data cleaning and validation
β”‚   β”œβ”€β”€ connection-storage.ts    # Connection persistence
β”‚   β”œβ”€β”€ operation-tracker.ts     # Operation history tracking
β”‚   └── utils.ts                 # Utility functions
β”œβ”€β”€ public/                      # Static assets
β”œβ”€β”€ styles/                      # Additional styles
β”œβ”€β”€ package.json                 # Dependencies
β”œβ”€β”€ tsconfig.json               # TypeScript configuration
β”œβ”€β”€ next.config.mjs             # Next.js configuration
β”œβ”€β”€ tailwind.config.ts          # Tailwind CSS configuration
└── README.md                   # This file

🎨 UI Components

Built with:

  • shadcn/ui - High-quality React components
  • Radix UI - Accessible component primitives
  • Tailwind CSS - Utility-first CSS framework
  • Lucide Icons - Beautiful icon set
  • next-themes - Dark mode support

πŸ”§ Configuration

Database Connection Storage

Connections are stored in localStorage with the following structure:

interface DatabaseConfig {
  id: string
  name: string
  type: 'postgresql' | 'mysql' | 'mssql' | 'sqlite'
  host: string
  port: number
  username: string
  password: string  // Stored securely in browser
  database: string
  ssl: boolean
}

Data Type Detection

The system uses intelligent algorithms to detect:

  • Integers (INT, BIGINT, TINYINT, SMALLINT)
  • Decimals (DECIMAL, FLOAT, REAL)
  • Strings (VARCHAR, TEXT, NVARCHAR)
  • Booleans (BOOLEAN, BIT)
  • Dates (DATE, DATETIME, DATETIME2, TIMESTAMP)
  • JSON (JSON, JSONB)
  • Special types (EMAIL, URL, UUID)

Type Adaptation by Database

Detected Type PostgreSQL MySQL MSSQL SQLite
Integer INTEGER INT INT INTEGER
Big Integer BIGINT BIGINT BIGINT INTEGER
Decimal DECIMAL(10,2) DECIMAL(10,2) DECIMAL(10,2) REAL
String (short) VARCHAR(50) VARCHAR(50) NVARCHAR(50) TEXT
String (long) TEXT TEXT NVARCHAR(MAX) TEXT
Boolean BOOLEAN BOOLEAN BIT INTEGER
Date DATE DATE DATE TEXT
DateTime TIMESTAMP DATETIME DATETIME2 TEXT
JSON JSONB JSON NVARCHAR(MAX) TEXT

πŸš€ Deployment

Build for Production

# Using pnpm
corepack pnpm build

# Or using npm
npm run build

Start Production Server

# Using pnpm
corepack pnpm start

# Or using npm
npm start

Environment Variables (Optional)

Create a .env.local file for environment-specific configuration:

# Optional: Set custom ports
PORT=3000

# Optional: Database connection strings (if pre-configuring)
DATABASE_URL=postgresql://user:password@localhost:5432/mydb

πŸ› Known Issues & Fixes

Recent Fixes (v2.5)

  • βœ… Fixed multiple table creation navigation bug
  • βœ… Fixed table preview component data structure mismatch
  • βœ… Fixed dark mode styling inconsistencies
  • βœ… Fixed font consistency across interfaces
  • βœ… Fixed null/undefined handling in row counts
  • βœ… Fixed button state during table creation

See MULTIPLE_TABLE_CREATION_FIX.md for detailed fix documentation.

πŸ“ API Endpoints

Database Operations

  • POST /api/database/test-connection - Test database connection
  • POST /api/database/create-table - Create table with schema
  • POST /api/database/insert-data - Insert data into table (supports batch-size tuning + cleaning controls)
  • POST /api/database/preview-table - Preview table data (first 10 rows)
  • POST /api/database/get-tables - List all tables in database

Excel Operations

  • POST /api/excel/analyze - Analyze Excel file structure
  • POST /api/excel/upload - Upload and parse Excel files

⚑ UX + Scalability Upgrade Highlights

Recent platform upgrades focus on two goals: making the import workflow easier for non-expert users and making large imports safer to run in production-like environments.

1) Workflow Guidance Layer (User Experience)

  • The main workflow now surfaces a readiness score, top blockers, and recommendations.
  • Guidance is computed from live workflow state (files analyzed, connection state, selected sheets, created tables, recent operations) to keep users on a high-confidence path.
  • This improves onboarding by telling users what to do next instead of relying on implicit UI knowledge.

2) Import Execution Controls (User + Operator Friendly)

  • Table creation now includes execution settings for:
    • batch size,
    • null handling strategy,
    • empty-row skipping,
    • string trimming,
    • optional type conversion,
    • optional continue-on-batch-error mode.
  • These controls allow the same product to serve both quick small imports and more careful enterprise-style runs.

3) Batch/Chunk-Oriented Insert Path (Scale)

  • /api/database/insert-data now processes incoming rows in bounded batches.
  • Batch metadata is returned (total/processed/failed batches, duration, chunk errors), giving users and integrators observability into import health.
  • Chunking reduces peak request-processing pressure and provides failure isolation for large datasets.

For the full API contract and examples, see /home/runner/work/Ingesta/Ingesta/docs/API_ROUTES.md.

🀝 Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Development Guidelines

  • Follow TypeScript best practices
  • Use meaningful variable and function names
  • Add comments for complex logic
  • Test with multiple database types
  • Ensure responsive design works on all screen sizes
  • Follow the existing code style and conventions

πŸ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ‘¨β€πŸ’» Author

Leptons1618

πŸ™ Acknowledgments

πŸ“Š Tech Stack

Frontend

  • Framework: Next.js 15.2.4 (App Router)
  • UI Library: React 19.0.0
  • Language: TypeScript 5.0
  • Styling: Tailwind CSS 4.1.9
  • Components: Radix UI + shadcn/ui
  • Icons: Lucide React
  • Theme: next-themes

Backend

  • Runtime: Node.js β‰₯18
  • API: Next.js API Routes
  • Databases: PostgreSQL, MySQL, MSSQL, SQLite
  • Excel Parsing: xlsx
  • Type Safety: TypeScript with strict mode

Development

  • Package Manager: pnpm 10.15.0
  • Build Tool: Next.js Compiler (Turbopack)
  • Linting: ESLint
  • Type Checking: TypeScript Compiler

🎯 Roadmap

Planned Features

  • Batch operations support (update, delete)
  • CSV import/export
  • Data transformation rules (custom mapping)
  • Scheduled imports
  • Import history and rollback
  • Advanced filtering and search
  • Column mapping templates
  • Database schema migration tools
  • RESTful API for programmatic access
  • Cloud database support (AWS RDS, Azure SQL)
  • Docker containerization
  • Multi-user support with authentication

πŸ”’ Security Notes

  • Database credentials are stored in browser localStorage
  • No server-side credential persistence
  • SSL/TLS support for secure database connections
  • Sanitized table and column names to prevent SQL injection
  • Input validation on all forms

⚠️ Important: For production use, implement proper authentication and credential management.

πŸ’‘ Tips & Best Practices

Excel File Preparation

  • Use clear, descriptive column headers in the first row
  • Avoid merged cells
  • Use consistent data formats within columns
  • Remove unnecessary formatting
  • Keep data types consistent within columns

Database Performance

  • Add indexes after bulk imports
  • Use appropriate data types for better storage efficiency
  • Consider table partitioning for large datasets
  • Regular maintenance and optimization

Common Workflows

Single Sheet Import:

  1. Upload Excel file
  2. Connect to database
  3. Select sheet
  4. Review auto-detected schema
  5. Create table & insert data
  6. Preview results

Multi-Sheet Batch Import:

  1. Upload Excel file with multiple sheets
  2. Connect to database
  3. Select all relevant sheets
  4. Configure tables in bulk
  5. Create all tables at once
  6. Review all table previews

🐞 Troubleshooting

Connection Issues

  • Verify database server is running
  • Check firewall rules
  • Ensure correct port numbers
  • Verify credentials

Import Errors

  • Check Excel file format (.xlsx supported)
  • Ensure data types are consistent
  • Review column names (no special characters)
  • Check for duplicate column names

Performance Issues

  • Reduce batch size for large files
  • Create indexes after import
  • Use appropriate data types
  • Consider database server resources

πŸ“š Documentation

Additional documentation:


Made with ❀️ by Leptons1618

⭐ Star this repo if you find it useful!

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors