Welcome to the Turing Data Cleaner App! This project is a web application, built with Python and Streamlit, for cleaning data from uploaded CSV and XLSX files.
- CSV & XLSX file upload: Upload CSV and XLSX files for processing.
- Detect & remove outliers: Detect outliers and remove them.
- Remove duplicates: Remove duplicate rows from the file.
- Drop missing values: Drop rows that contain missing values.
- Fill missing values: Fill missing values with the mean.
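The cleaning operations listed above can be sketched with pandas. This is a minimal illustration, not the app's actual code: the use of pandas, the function names, and the 1.5 × IQR rule for outliers are all assumptions, since this README does not say how the app implements each step.

```python
import pandas as pd


def remove_duplicates(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows that are exact duplicates of an earlier row."""
    return df.drop_duplicates().reset_index(drop=True)


def drop_missing(df: pd.DataFrame) -> pd.DataFrame:
    """Drop every row that has at least one missing value."""
    return df.dropna().reset_index(drop=True)


def fill_missing_with_mean(df: pd.DataFrame) -> pd.DataFrame:
    """Fill missing values in numeric columns with that column's mean."""
    out = df.copy()
    numeric_cols = out.select_dtypes(include="number").columns
    out[numeric_cols] = out[numeric_cols].fillna(out[numeric_cols].mean())
    return out


def remove_outliers_iqr(df: pd.DataFrame, column: str, k: float = 1.5) -> pd.DataFrame:
    """Remove rows whose `column` value lies outside [Q1 - k*IQR, Q3 + k*IQR].

    Assumption: the IQR rule is only one common choice; the app may use another.
    Rows where `column` is missing are kept so they can be handled separately.
    """
    q1 = df[column].quantile(0.25)
    q3 = df[column].quantile(0.75)
    iqr = q3 - q1
    in_range = df[column].between(q1 - k * iqr, q3 + k * iqr)
    return df[in_range | df[column].isna()].reset_index(drop=True)


if __name__ == "__main__":
    # Hypothetical sample data: one duplicate row, one missing value, one extreme value.
    sample = pd.DataFrame(
        {"id": [1, 2, 2, 3, 4, 5], "value": [10.0, 12.0, 12.0, None, 11.0, 500.0]}
    )
    cleaned = remove_outliers_iqr(drop_missing(remove_duplicates(sample)), "value")
    print(cleaned)
```

Each function returns a new DataFrame instead of modifying its input, so the steps can be chained in any order, much like toggling individual options in the app.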
Follow these steps to set up and run the application on your local machine.
Ensure you have the following installed:
- Python 3.8 or later
- pip (Python package manager)
- Clone the repository:
    git clone git@github.com:TuringCollegeSubmissions/avoito-AE.3.5.git turing-data-cleaner
    cd turing-data-cleaner
- Create a virtual environment:
    python -m venv venv
    source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
- Install the dependencies:
    pip install -r requirements.txt
- Configure secrets:
    Edit secrets.toml and add the API keys for the LLMs you want to use.
- Launch the Streamlit app:
    streamlit run cleaner.py
- Open your browser and go to: