Extracts metadata from dataset files like .csv, .xlsx, and more.
- Clone the repository:
git clone https://github.com/jraa1995/Metadata_Extractor
- Set up a virtual environment (venv):
1. python -m venv venv
2. venv\Scripts\Activate (Windows; on macOS/Linux, use source venv/bin/activate instead)
- Install dependencies (requirements.txt):
pip install -r requirements.txt
- Run the MCP pipeline, replacing the placeholder with the name of your data file:
python src/main.py data/YOUR_CSV_OR_DATAFILE_NAME.type
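To give a feel for what a command like the one above might do, here is a minimal sketch of a CSV metadata extractor using only the Python standard library. It is not the repository's actual `src/main.py`: the function name `extract_csv_metadata`, the metadata fields, and the JSON output are all assumptions made for illustration.

```python
import argparse
import csv
import json
import os
import sys


def extract_csv_metadata(path):
    """Return basic metadata for a CSV file (hypothetical helper, not from the repo)."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])           # first row is treated as column names
        row_count = sum(1 for _ in reader)  # every remaining row counts as data
    return {
        "file_name": os.path.basename(path),
        "size_bytes": os.path.getsize(path),
        "columns": header,
        "row_count": row_count,
    }


def main(argv=None):
    """Parse the file path from the command line and print its metadata as JSON."""
    parser = argparse.ArgumentParser(description="Print metadata for a CSV file.")
    parser.add_argument("path", help="path to the data file, e.g. data/example.csv")
    args = parser.parse_args(argv)
    print(json.dumps(extract_csv_metadata(args.path), indent=2))


# Only run the CLI when a path was actually supplied on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    main()
```

Saved under a name of your choosing, the sketch would be invoked the same way as the real entry point: the script name followed by the path to a CSV file. Other formats such as .xlsx would need their own readers.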
- Modularity - Separates concerns (CLI, logic, utils) for easier maintenance
- Testing - Dedicated 'tests/' folder for validation
- Scalability - Ready for new features, such as a web app in 'src/web/'
- Organization - Keeps data, logs, and code separate
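One way the modularity and scalability points above could look in code is a small registry that maps file extensions to extractor functions, so that supporting a new format means adding one function rather than editing the CLI. This is only a sketch under assumed names (`EXTRACTORS`, `register`, `extract_metadata`); the repository may organize its logic differently.

```python
import csv
import os

# Hypothetical registry: file extension -> function that extracts metadata.
EXTRACTORS = {}


def register(extension):
    """Decorator that registers an extractor for the given file extension."""
    def decorator(func):
        EXTRACTORS[extension.lower()] = func
        return func
    return decorator


@register(".csv")
def extract_csv(path):
    """Read only the header row and report the column names."""
    with open(path, newline="", encoding="utf-8") as handle:
        header = next(csv.reader(handle), [])
    return {"format": "csv", "columns": header}


def extract_metadata(path):
    """Look up the extractor for the file's extension and run it."""
    extension = os.path.splitext(path)[1].lower()
    try:
        extractor = EXTRACTORS[extension]
    except KeyError:
        raise ValueError(f"No extractor registered for {extension!r}") from None
    return extractor(path)
```

With this layout, a test in 'tests/' can call `extract_metadata` directly on a small fixture file without going through the command line.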