Data Helper is a Python application for cleaning and exploring data files. Through its graphical user interface (GUI), you can load a CSV or Excel file, remove rows that are missing a value in a key column you choose, and review summary statistics for the cleaned dataset.
- Seamless Data Ingestion: Support for popular file formats like CSV and Excel, ensuring compatibility with your existing data sources.
- Intelligent Data Cleaning: Automatically removes rows with missing values in the specified key column, ensuring data integrity.
- Comprehensive Summary Statistics: Gain valuable insights with overall descriptive statistics and column-wise detailed summaries.
- Data Visualization: Explore your cleaned dataset through an interactive tabular view with scrolling capabilities.
- User-Friendly GUI: Enjoy a smooth and intuitive experience with the sleek Tkinter-based interface.
- Leveraging Industry-Standard Libraries: Harness the power of Pandas, NumPy, and other renowned data analysis libraries.
- Robust and Extensible: Built with a modular architecture, ensuring scalability and future enhancements.
- Error Handling and Input Validation: Validates user input and handles errors for reliable operation.
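The cleaning and summary features listed above could look roughly like the following minimal sketch. The helper names `clean_by_key` and `summarize`, as well as the sample columns, are hypothetical and not taken from the project's source; the sketch only assumes the standard pandas calls `dropna`, `describe`, `isna`, and `nunique`.

```python
import pandas as pd


def clean_by_key(df: pd.DataFrame, key_column: str) -> pd.DataFrame:
    """Drop rows whose key column is missing (hypothetical helper)."""
    if key_column not in df.columns:
        raise KeyError(f"Column {key_column!r} not found in data")
    # Only the key column is checked; gaps in other columns are kept.
    return df.dropna(subset=[key_column]).reset_index(drop=True)


def summarize(df: pd.DataFrame) -> dict:
    """Return overall and per-column summaries (hypothetical helper)."""
    overall = df.describe(include="all")
    per_column = {
        name: {
            "dtype": str(df[name].dtype),
            "missing": int(df[name].isna().sum()),
            "unique": int(df[name].nunique()),
        }
        for name in df.columns
    }
    return {"overall": overall, "columns": per_column}


# Illustrative data only; not from the project.
raw = pd.DataFrame({"id": [1, None, 3], "city": ["Oslo", "Rome", None]})
cleaned = clean_by_key(raw, "id")
summary = summarize(cleaned)
```

Note that in this sketch a row is dropped only when the key column itself is empty, so the cleaned data may still contain missing values in other columns; the per-column `missing` count makes those visible.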
- Clone the repository: `git clone https://github.com/your-username/data-helper.git`
- Navigate to the project directory: `cd data-helper`
- Install the required dependencies: `pip install -r requirements.txt`
- Place your data files (CSV or Excel) in the `data` directory within the project.
- Run the application: `python main.py`
- The application will prompt you to select a file from the `data` directory.
- Choose the key column you want to use for data cleaning.
- The application will display the cleaned data summary, including overall and column-wise statistics, along with an interactive tabular view.
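The file-selection step above can be sketched as follows. This is an illustration under stated assumptions, not the project's actual code: the names `SUPPORTED_SUFFIXES`, `list_data_files`, and `load_data_file` are hypothetical, and the exact set of accepted extensions is a guess based on "CSV or Excel". Reading Excel files with pandas also requires an engine such as `openpyxl` to be installed.

```python
from pathlib import Path

# Assumed set of extensions; the project may accept a different list.
SUPPORTED_SUFFIXES = {".csv", ".xlsx", ".xls"}


def list_data_files(data_dir="data"):
    """Return supported files in the data directory, sorted by path."""
    directory = Path(data_dir)
    if not directory.is_dir():
        return []
    return sorted(
        p for p in directory.iterdir()
        if p.is_file() and p.suffix.lower() in SUPPORTED_SUFFIXES
    )


def load_data_file(path):
    """Load a CSV or Excel file into a pandas DataFrame."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix not in SUPPORTED_SUFFIXES:
        raise ValueError(f"Unsupported file type: {suffix or '(none)'}")
    # Imported here so that listing files works even without pandas.
    import pandas as pd
    if suffix == ".csv":
        return pd.read_csv(path)
    return pd.read_excel(path)
```

A GUI could populate its file picker from `list_data_files()` and pass the chosen path to `load_data_file()`, showing the `ValueError` message in a dialog if the user selects an unsupported file.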
Contributions are welcome! If you have any suggestions, bug reports, or feature requests, please open an issue or submit a pull request.
This project is licensed under the MIT License.
