
DataSplitter Pro - CSV File Splitter 🚀


📊 DataSplitter Pro - Your Ultimate CSV File Management Solution

A powerful, intuitive desktop application for splitting large CSV files efficiently.
Process massive datasets, split files with precision, and streamline your data workflow ⚡
Built with: Python, Tkinter, Pandas



🎯 Introduction

DataSplitter Pro is a robust desktop application designed to handle large CSV files with ease. Whether you're dealing with massive datasets for analysis, machine learning, or database management, this tool helps you split files into smaller, more manageable parts with precision and flexibility.

✨ Key Highlights

  • ⚑ Lightning Fast Processing - Handle large CSV files efficiently
  • 🎯 Flexible Splitting Options - Split by equal parts or custom row counts
  • πŸ› οΈ Customizable Output - Add custom headers and choose output formats
  • πŸ’» User-Friendly GUI - Intuitive interface built with Tkinter
  • πŸ”§ Batch Processing - Process multiple files with consistent settings

🚀 Features

📁 File Management

  • Multiple File Support - Process various CSV file formats
  • File Browser Integration - Easy file selection with native dialogs
  • Output Directory Control - Choose custom output locations
  • Format Preservation - Maintain CSV structure and integrity

🔪 Splitting Options

  • Equal Parts Splitting - Divide files into equal-sized chunks
  • Custom Row Splitting - Specify exact number of rows per file
  • Flexible Configuration - Switch between splitting methods easily
  • Progress Tracking - Monitor splitting progress in real-time
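
The two splitting methods listed above come down to choosing chunk boundaries. The following is a minimal sketch of that arithmetic, assuming the rows are already loaded into a list. The function names `split_equal_parts` and `split_by_rows` are hypothetical and are not taken from DataSplitterPro.py.

```python
def split_by_rows(rows, rows_per_file):
    """Return consecutive chunks holding at most rows_per_file rows each."""
    if rows_per_file < 1:
        raise ValueError("rows_per_file must be at least 1")
    return [rows[i:i + rows_per_file] for i in range(0, len(rows), rows_per_file)]


def split_equal_parts(rows, parts):
    """Return `parts` chunks whose sizes differ by at most one row."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    base, extra = divmod(len(rows), parts)
    chunks, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)  # the first `extra` chunks get one more row
        chunks.append(rows[start:start + size])
        start += size
    return chunks


rows = list(range(10))
print([len(c) for c in split_equal_parts(rows, 3)])  # [4, 3, 3]
print([len(c) for c in split_by_rows(rows, 4)])      # [4, 4, 2]
```

When the row count does not divide evenly, this sketch gives the leftover rows to the earliest parts. If `parts` exceeds the number of rows, some chunks come back empty, so a real tool would likely want to validate that case.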

📝 Customization

  • Header Addition - Add custom text/headers to each split file
  • Multiple Header Lines - Support for multi-line custom headers
  • Output Format Control - Ensure consistent CSV output formatting
  • Batch Configuration - Apply same settings across multiple splits
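
As an illustration of header addition, here is a minimal sketch that writes custom text lines at the top of an output part and then the CSV data. It uses the standard library `csv` module rather than Pandas to stay dependency-free. The helper name `write_part` is hypothetical, and repeating the column-name row after the custom lines is an assumption, not confirmed behavior of the application.

```python
import csv
import io


def write_part(stream, custom_lines, column_names, rows):
    """Write custom text lines, then the column-name row, then the data rows."""
    for line in custom_lines:
        stream.write(line + "\n")  # free text, written exactly as entered
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(column_names)  # assumption: every part repeats the column names
    writer.writerows(rows)


buffer = io.StringIO()
write_part(
    buffer,
    ["Report batch A", "Internal use only"],
    ["id", "name"],
    [[1, "Ana"], [2, "Ben"]],
)
print(buffer.getvalue())
```

Files that start with free-text lines are no longer plain CSV, so whatever reads them back has to skip those lines first (for example with the `skiprows` parameter of `pandas.read_csv`).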

🎨 User Experience

  • Clean Interface - Modern, distraction-free GUI design
  • Intuitive Controls - Easy-to-understand options and settings
  • Real-time Feedback - Immediate visual feedback on operations
  • Error Handling - Comprehensive error messages and validations

🛠️ Tech Stack

Core Technologies

  • Python
  • Pandas
  • Tkinter

Key Libraries

  • CSV Processing
  • File Handling
  • GUI Development


🚀 Getting Started

Prerequisites

  • Python 3.8 or higher
  • Required Python packages, listed in requirements.txt and installed with pip during setup

Installation & Setup

  1. Clone the repository:

    git clone https://github.com/CodeWithTanim/Data-Splitter.git
    cd Data-Splitter
  2. Install dependencies:

    pip install -r requirements.txt
  3. Run the application:

    python DataSplitterPro.py

Alternative Installation (Executable)

For users who prefer not to install Python:

  1. Download the latest release from the Releases page
  2. Run the executable - No installation required!
  3. Start splitting your TXT, XLSX, XML, CSV files immediately

🎯 How to Use

Basic File Splitting

  1. Select Input File

    • Click "Browse" to select your TXT, XLSX, XML, or CSV file
    • Supported formats: TXT, XLSX, XML, CSV
  2. Choose Split Method

    • Option 1: Split into equal parts - Divide file into N equal parts
    • Option 2: Split by rows per file - Specify exact row count per output file
  3. Add Custom Headers (Optional)

    • Enter custom text in "Email Header Line 1" and "Line 2"
    • This text will be added to the beginning of each split file
  4. Configure Output

    • Select output format (CSV)
    • Choose output directory using "Browse"
  5. Execute Split

    • Click "Split Data" to start the process
    • Monitor progress in the status area
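
One common way to keep a status area updating while a split runs is to do the work in a background thread and pass progress messages through a queue. The sketch below shows only that pattern, without any GUI code. The names `split_worker`, `start_split`, and `drain` are hypothetical, and whether DataSplitterPro.py is structured this way is an assumption.

```python
import queue
import threading


def split_worker(total_parts, progress):
    """Stand-in for the real split loop: reports each finished part on a queue."""
    for part in range(1, total_parts + 1):
        # ... one output file would be written here ...
        progress.put(("progress", part, total_parts))
    progress.put(("done", total_parts, total_parts))


def start_split(total_parts):
    """Start the worker in a background thread and return (thread, queue)."""
    progress = queue.Queue()
    thread = threading.Thread(
        target=split_worker, args=(total_parts, progress), daemon=True
    )
    thread.start()
    return thread, progress


def drain(progress):
    """Collect pending messages without blocking."""
    messages = []
    while True:
        try:
            messages.append(progress.get_nowait())
        except queue.Empty:
            return messages


thread, progress = start_split(3)
thread.join()  # a Tkinter app would poll instead, e.g. with widget.after(100, callback)
for kind, part, total in drain(progress):
    print(f"{kind}: {part}/{total}")
```

In a Tkinter application the main thread would call something like `drain` from a callback scheduled with `after`, and update the status label there, since widgets should only be touched from the main thread.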

Advanced Features

  • Custom Text Addition: Add specific headers or metadata to each split file
  • Multiple Output Control: Process files in batches with same settings
  • File Size Management: Handle files up to several GB in size
  • Error Recovery: Continue processing even with malformed data

Watch Video

See the software in action with a complete tutorial

🏗️ Project Structure

Data-Splitter/
├── DataSplitterPro.py         # Main application entry point
├── requirements.txt           # Python dependencies
└── README.md                  # Project documentation

🎯 Use Cases

📊 Data Analysis

  • Split large datasets for parallel processing
  • Prepare data for machine learning pipelines
  • Create manageable chunks for analytical tools

💾 Database Management

  • Prepare CSV files for database imports
  • Handle large data migrations
  • Create backup files in manageable sizes

🔬 Research & Academia

  • Process research data in batches
  • Share subsets of data with collaborators
  • Prepare data for statistical analysis

🏢 Business Applications

  • Split customer data for targeted campaigns
  • Process transaction records in chunks
  • Manage large inventory datasets

🔧 Technical Details

Performance Optimizations

  • Memory Efficient - Processes files in chunks to handle large datasets
  • Multi-threading - Non-blocking UI during file operations
  • Progress Tracking - Real-time progress updates for large files
  • Error Handling - Robust error recovery and user feedback
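
To make the "processes files in chunks" idea concrete, here is a minimal sketch that streams a CSV file and writes one output part at a time, so only `rows_per_file` rows are held in memory. It uses the standard library `csv` module. The function name `split_csv_streaming` and the `_partN` file naming are hypothetical, and this is not necessarily how DataSplitterPro.py does it.

```python
import csv
import itertools
from pathlib import Path


def split_csv_streaming(src, out_dir, rows_per_file):
    """Split src into numbered CSV files without loading the whole file."""
    if rows_per_file < 1:
        raise ValueError("rows_per_file must be at least 1")
    src, out_dir = Path(src), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with src.open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, None)
        if header is None:
            return written  # empty input: nothing to split
        for index in itertools.count(1):
            # Pull the next batch of rows from the reader; stop when it runs dry.
            chunk = list(itertools.islice(reader, rows_per_file))
            if not chunk:
                break
            target = out_dir / f"{src.stem}_part{index}.csv"
            with target.open("w", newline="", encoding="utf-8") as out:
                writer = csv.writer(out)
                writer.writerow(header)  # repeat the column names in every part
                writer.writerows(chunk)
            written.append(target)
    return written
```

With Pandas, a comparable effect can be had by passing `chunksize` to `pandas.read_csv`, which yields DataFrames of up to that many rows instead of reading the file in one go.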

Supported Formats

  • CSV Files - Comma-separated values
  • Text Files - Tab-delimited or custom delimited files
  • Large Files - Handles files up to several GB in size
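
Handling tab-delimited or custom-delimited text files requires knowing the delimiter before parsing. The sketch below guesses it from a sample using `csv.Sniffer` from the standard library. The helper names `detect_delimiter` and `parse_rows`, the candidate set, and the comma fallback are all assumptions made for illustration.

```python
import csv
import io


def detect_delimiter(sample, candidates=",\t;|"):
    """Guess the delimiter from a text sample; fall back to a comma."""
    try:
        return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
    except csv.Error:
        return ","  # the sniffer could not decide, so assume plain CSV


def parse_rows(text):
    """Parse delimited text into rows using the detected delimiter."""
    delimiter = detect_delimiter(text[:4096])  # a small sample is enough to guess
    return list(csv.reader(io.StringIO(text, newline=""), delimiter=delimiter))


print(parse_rows("name\tcity\nAna\tOslo\nBen\tRome\n"))
```

Sniffing is a heuristic and can guess wrong on short or irregular samples, so a GUI tool may still want to let the user override the detected delimiter.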

System Requirements

  • Windows: Windows 7 or later
  • macOS: macOS 10.12 or later
  • Linux: Ubuntu 16.04 or equivalent
  • RAM: 4GB minimum, 8GB recommended for large files
  • Storage: Sufficient space for input and output files

👨‍💻 Developer

MD SAMIUR RAHMAN TANIM

Full Stack Developer & Open Source Enthusiast



🤝 Contributing

We welcome contributions from developers and data enthusiasts! Here's how you can help:

🐛 Reporting Issues

  • Check existing issues before creating new ones
  • Provide detailed descriptions including:
    • File size and structure
    • Splitting method used
    • Error messages received
    • Steps to reproduce

💡 Feature Requests

  • Suggest new splitting methods or output formats
  • Propose UI/UX improvements
  • Request additional customization options

🔧 Development Contributions

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

🧪 Testing

  • Add unit tests for new features
  • Test with various file sizes and formats
  • Verify cross-platform compatibility

🙏 Acknowledgments

  • Python Community - For the versatile programming language
  • Pandas Team - For powerful data manipulation capabilities
  • Tkinter Developers - For the reliable GUI framework
  • Open Source Contributors - For continuous improvements and feedback

📞 Support

Need help or have questions about DataSplitter Pro?


⭐ If DataSplitter Pro helps you manage your data efficiently, please give it a star on GitHub!

Built with ❤️ using Python, Pandas, and Tkinter

Happy Data Processing! 🚀
