
Metadata Extractor extracts the metadata properties of the dataset you are working on so you can gather initial information about it. It is similar to .info or .describe, but a bit more robust.


jraa1995/Metadata_Extractor


Metadata_Extractor

Extracts metadata from dataset files like .csv, .xlsx, and more.

Setup

  1. Clone the repository:

    git clone https://github.com/jraa1995/Metadata_Extractor
    
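  2. Set up and activate a virtual environment (venv). The activation command below is for Windows; on macOS or Linux, use source venv/bin/activate instead.

    1. python -m venv venv
    2. venv\Scripts\Activate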
  3. Install dependencies (requirements.txt):

    pip install -r requirements.txt
  4. Run the MCP pipeline:

    python src/main.py data/YOUR_CSV_OR_DATAFILE_NAME.type
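
To give a rough idea of the kind of information a run like the one above might gather, here is a minimal sketch that summarizes a CSV using only the Python standard library. It is not the project's actual implementation: the function name extract_csv_metadata, the returned keys, and the sample data are all assumptions made for illustration.

```python
import csv
import io
from collections import Counter


def extract_csv_metadata(text: str) -> dict:
    """Summarize CSV content passed as a string (hypothetical helper, not from the repo)."""
    reader = csv.DictReader(io.StringIO(text))
    columns = reader.fieldnames or []
    row_count = 0
    missing = Counter()
    for row in reader:
        row_count += 1
        for name in columns:
            value = row.get(name)
            # Treat absent or whitespace-only cells as missing.
            if value is None or value.strip() == "":
                missing[name] += 1
    return {
        "columns": list(columns),
        "row_count": row_count,
        "missing_per_column": {name: missing[name] for name in columns},
    }


# Illustrative sample data (assumed, not shipped with the project).
sample = "id,name,score\n1,Ada,90\n2,,85\n3,Linus,\n"
metadata = extract_csv_metadata(sample)
print(metadata)
```

The sketch reports column names, a row count, and a per-column count of empty cells; the real tool may report different or additional properties.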

Structure

  • Modularity - Separates concerns (CLI, logic, utils) for easier maintenance
  • Testing - Dedicated 'tests/' folder for validation
  • Scalability - Ready for adding new features, such as a web app in 'src/web/'
  • Organization - Keeps data, logs, and code separate
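
As an illustration of the separation between CLI and logic described above, the following sketch keeps argument parsing and file-type detection in separate functions. Everything here is hypothetical: the names SUPPORTED, detect_format, build_parser, and main, and the extension mapping, are assumptions and may not match what src/main.py actually does.

```python
import argparse
from pathlib import Path

# Hypothetical mapping from file extension to a format label.
SUPPORTED = {".csv": "csv", ".xlsx": "excel"}


def detect_format(path: str) -> str:
    """Logic layer: map a file path to a format label by its extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in SUPPORTED:
        raise ValueError(f"Unsupported file type: {suffix or '(none)'}")
    return SUPPORTED[suffix]


def build_parser() -> argparse.ArgumentParser:
    """CLI layer: only defines and parses arguments, no extraction logic."""
    parser = argparse.ArgumentParser(description="Extract dataset metadata (sketch)")
    parser.add_argument("path", help="Path to the data file, e.g. data/example.csv")
    return parser


def main(argv=None) -> str:
    """Wire the CLI layer to the logic layer and return the detected format."""
    args = build_parser().parse_args(argv)
    return detect_format(args.path)


# Example call with an explicit argument list instead of sys.argv.
print(main(["data/example.csv"]))
```

Because detect_format takes a plain string, it can be exercised from a 'tests/' folder without going through the command line.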

