A powerful, intuitive desktop application for splitting large CSV files efficiently
Process massive datasets, split files with precision, and streamline your data workflow ⚡
Built with: Python, Tkinter, Pandas
DataSplitter Pro is a robust desktop application designed to handle large CSV files with ease. Whether you're dealing with massive datasets for analysis, machine learning, or database management, this tool helps you split files into smaller, more manageable parts with precision and flexibility.
- ⚡ Lightning Fast Processing - Handle large CSV files efficiently
- 🎯 Flexible Splitting Options - Split by equal parts or custom row counts
- 🛠️ Customizable Output - Add custom headers and choose output formats
- 💻 User-Friendly GUI - Intuitive interface built with Tkinter
- 🔧 Batch Processing - Process multiple files with consistent settings
- Multiple File Support - Process various CSV file formats
- File Browser Integration - Easy file selection with native dialogs
- Output Directory Control - Choose custom output locations
- Format Preservation - Maintain CSV structure and integrity
- Equal Parts Splitting - Divide files into equal-sized chunks
- Custom Row Splitting - Specify exact number of rows per file
- Flexible Configuration - Switch between splitting methods easily
- Progress Tracking - Monitor splitting progress in real-time
- Header Addition - Add custom text/headers to each split file
- Multiple Header Lines - Support for multi-line custom headers
- Output Format Control - Ensure consistent CSV output formatting
- Batch Configuration - Apply same settings across multiple splits
- Clean Interface - Modern, distraction-free GUI design
- Intuitive Controls - Easy-to-understand options and settings
- Real-time Feedback - Immediate visual feedback on operations
- Error Handling - Comprehensive error messages and validations
- Python 3.8 or higher
- Required Python packages (installed from requirements.txt during setup)
- Clone the repository:
  git clone https://github.com/CodeWithTanim/Data-Splitter.git
  cd Data-Splitter
- Install dependencies:
  pip install -r requirements.txt
- Run the application:
  python DataSplitterPro.py
For users who prefer not to install Python:
- Download the latest release from the Releases page
- Run the executable - No installation required!
- Start splitting your TXT, XLSX, XML, and CSV files immediately
- Select Input File
  - Click "Browse" to select your TXT, XLSX, XML, or CSV file
  - Supported formats: TXT, XLSX, XML, CSV
- Choose Split Method
  - Option 1: Split into equal parts - Divide the file into N equal parts
  - Option 2: Split by rows per file - Specify the exact row count per output file
- Add Custom Headers (Optional)
  - Enter custom text in "Email Header Line 1" and "Line 2"
  - This text is added to the beginning of each split file
- Configure Output
  - Select the output format (CSV)
  - Choose the output directory using "Browse"
- Execute Split
  - Click "Split Data" to start the process
  - Monitor progress in the status area
- Custom Text Addition: Add specific headers or metadata to each split file
- Multiple Output Control: Process files in batches with same settings
- File Size Management: Handle files up to several GB in size
- Error Recovery: Continue processing even with malformed data
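The two split methods and the optional custom header lines described above can be sketched in a few lines of Python. This is a minimal illustration, not the application's actual code: it streams rows with the standard-library `csv` module instead of Pandas, and the function names (`count_data_rows`, `split_by_rows`, `split_into_parts`) and the `_partN.csv` naming scheme are assumptions made for the example. It also assumes the first row of the input holds column names and repeats that row in every part.

```python
import csv
import math
from pathlib import Path


def count_data_rows(src):
    """Count the rows after the column-name row, streaming the file once."""
    with open(src, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the column-name row
        return sum(1 for _ in reader)


def split_by_rows(src, out_dir, rows_per_file, header_lines=()):
    """Write consecutive parts holding at most rows_per_file data rows each.

    header_lines are optional custom text lines written at the top of every
    part, before the repeated column-name row. Returns the created paths.
    """
    if rows_per_file < 1:
        raise ValueError("rows_per_file must be at least 1")
    src, out_dir = Path(src), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths, out, writer, written = [], None, None, 0
    with open(src, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        columns = next(reader, None)
        if columns is None:
            return paths  # empty input: nothing to split
        try:
            for row in reader:
                # Start a new part for the first row and whenever one is full.
                if out is None or written == rows_per_file:
                    if out is not None:
                        out.close()
                    path = out_dir / f"{src.stem}_part{len(paths) + 1}.csv"
                    out = open(path, "w", newline="", encoding="utf-8")
                    for line in header_lines:
                        out.write(line + "\n")
                    writer = csv.writer(out, lineterminator="\n")
                    writer.writerow(columns)
                    paths.append(path)
                    written = 0
                writer.writerow(row)
                written += 1
        finally:
            if out is not None:
                out.close()
    return paths


def split_into_parts(src, out_dir, parts, header_lines=()):
    """Split into at most `parts` files of near-equal size."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    total = count_data_rows(src)
    rows_per_file = max(1, math.ceil(total / parts))
    return split_by_rows(src, out_dir, rows_per_file, header_lines)
```

For example, `split_by_rows("data.csv", "out", 50000, header_lines=["Batch export"])` would write parts of up to 50,000 data rows each. Because the equal-parts mode rounds the rows per file up, a very small input can yield fewer files than requested.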
Data-Splitter/
├── DataSplitterPro.py    # Main application entry point
├── requirements.txt      # Python dependencies
└── README.md             # Project documentation
- Split large datasets for parallel processing
- Prepare data for machine learning pipelines
- Create manageable chunks for analytical tools
- Prepare CSV files for database imports
- Handle large data migrations
- Create backup files in manageable sizes
- Process research data in batches
- Share subsets of data with collaborators
- Prepare data for statistical analysis
- Split customer data for targeted campaigns
- Process transaction records in chunks
- Manage large inventory datasets
- Memory Efficient - Processes files in chunks to handle large datasets
- Multi-threading - Non-blocking UI during file operations
- Progress Tracking - Real-time progress updates for large files
- Error Handling - Robust error recovery and user feedback
- CSV Files - Comma-separated values
- Text Files - Tab-delimited or custom delimited files
- Large Files - Handles files up to several GB in size
- Windows: Windows 7 or later
- macOS: macOS 10.12 or later
- Linux: Ubuntu 16.04 or equivalent
- RAM: 4GB minimum, 8GB recommended for large files
- Storage: Sufficient space for input and output files
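The non-blocking UI and real-time progress updates listed in the performance notes above are commonly achieved by running the file work on a worker thread and passing progress messages back through a queue. The sketch below shows that general pattern only; it is not taken from DataSplitterPro.py, and the names `run_in_background` and `drain` as well as the event tuples are hypothetical.

```python
import queue
import threading


def run_in_background(task, items):
    """Run task(item) for each item on a worker thread.

    Returns (thread, events). The worker queues ("progress", done, total)
    after each item, ("error", message) if the task raises, and always
    finishes with ("done", None).
    """
    events = queue.Queue()
    items = list(items)

    def worker():
        try:
            for done, item in enumerate(items, start=1):
                task(item)
                events.put(("progress", done, len(items)))
        except Exception as exc:
            # Report the failure instead of letting the thread die silently.
            events.put(("error", str(exc)))
        finally:
            events.put(("done", None))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread, events


def drain(events):
    """Return every event queued so far without blocking the caller."""
    collected = []
    while True:
        try:
            collected.append(events.get_nowait())
        except queue.Empty:
            return collected
```

In a Tkinter application, a function rescheduled with the widget method `after` (for instance every 100 ms) could call `drain(events)` and update a status label from the main thread, since Tkinter widgets should not be modified from worker threads.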
Full Stack Developer & Open Source Enthusiast
We welcome contributions from developers and data enthusiasts! Here's how you can help:
- Check existing issues before creating new ones
- Provide detailed descriptions including:
- File size and structure
- Splitting method used
- Error messages received
- Steps to reproduce
- Suggest new splitting methods or output formats
- Propose UI/UX improvements
- Request additional customization options
- Fork the repository
- Create a feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
- Add unit tests for new features
- Test with various file sizes and formats
- Verify cross-platform compatibility
- Python Community - For the versatile programming language
- Pandas Team - For powerful data manipulation capabilities
- Tkinter Developers - For the reliable GUI framework
- Open Source Contributors - For continuous improvements and feedback
Need help or have questions about DataSplitter Pro?
- 📧 Email: codewithtanim+support@gmail.com
- 🐛 Issues: GitHub Issues
- 💬 Discussions: GitHub Discussions
