
Automate metadata extraction for Parquet & ORC datasets (schema, outliers, contextual metadata, skewness, semantic context) with this toolkit. Compatible with Google Gemma and Meta Llama models.


varunajmera0/MetaGenAI


MetaGenAI 👋

Where Data Meets Clarity

Overview

This repository hosts tools that automate metadata extraction from datasets to improve data understanding and management. Using AI models such as Google Gemma and Meta Llama, the tools support schema extraction, outlier identification, contextual metadata generation, skewness detection, and semantic context understanding for datasets stored in the Parquet and ORC file formats.

Features

  • Schema Extraction: Automatically extract schemas from datasets to understand their structure and organization.

  • Outlier Identification: Identify outliers within datasets to ensure data quality and reliability.

  • Contextual Metadata Generation: Generate rich contextual metadata to provide deeper insights into the data's meaning and context.

  • Skewness Detection: Detect skewness within datasets, enabling a better understanding of data distribution.

  • Semantic Context Understanding: Use AI models to understand the semantic context of data, enhancing interpretation and analysis.
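To make the outlier and skewness features more concrete, here is a minimal sketch of how a single numeric column could be profiled. It is not taken from this repository: the function names (`iqr_outliers`, `skewness`, `profile_column`), the choice of Tukey's IQR rule, and the sample column are all assumptions made for illustration. The column is assumed to have already been read from a Parquet or ORC file into a plain Python list.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule).

    Assumption: the IQR rule is one common choice; the project may use another.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def skewness(values):
    """Population skewness m3 / m2**1.5 (0.0 for a constant column)."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5 if m2 > 0 else 0.0


def profile_column(name, values):
    """Collect basic quality metadata for one numeric column."""
    return {
        "column": name,
        "count": len(values),
        "outliers": iqr_outliers(values),
        "skewness": skewness(values),
    }


# Hypothetical column values, standing in for data loaded from a file.
column = [10, 12, 11, 13, 12, 11, 10, 95]
report = profile_column("latency_ms", column)
print(report)
```

A positive skewness value indicates a long right tail, which in this hypothetical column is driven by the single large value that the IQR rule also flags.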

AI Models Used

  • Gemma by Google (google/gemma-1.1-7b-it)

  • Meta-Llama (meta-llama/Meta-Llama-3-70B-Instruct)

  • NLP
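As an illustration of how contextual metadata generation with an instruction-tuned model might be set up, the sketch below assembles a text prompt from a schema and a few sample values. This is an assumption about one possible approach, not the repository's actual code: `build_metadata_prompt`, the prompt wording, and the example table are hypothetical, and the step that sends the prompt to a model is intentionally left out.

```python
import json

# Model identifier as listed in this README; used here only as a label.
MODEL_ID = "google/gemma-1.1-7b-it"


def build_metadata_prompt(table_name, schema, samples, max_samples=3):
    """Build a prompt asking a model to describe each column.

    schema:  dict mapping column name -> type string
    samples: dict mapping column name -> list of example values
    """
    lines = [
        f"You are documenting the dataset '{table_name}'.",
        "For each column, give a one-sentence description of its meaning.",
        "Answer as a JSON object mapping column names to descriptions.",
        "",
        "Columns:",
    ]
    for column, dtype in schema.items():
        # Limit the number of sample values to keep the prompt short.
        examples = samples.get(column, [])[:max_samples]
        lines.append(f"- {column} ({dtype}), examples: {json.dumps(examples)}")
    return "\n".join(lines)


# Hypothetical schema and samples for illustration only.
schema = {"user_id": "int64", "signup_date": "date32", "country": "string"}
samples = {
    "user_id": [101, 102, 103, 104],
    "signup_date": ["2024-01-05", "2024-01-09"],
    "country": ["IN", "US", "DE"],
}
prompt = build_metadata_prompt("users", schema, samples)
print(prompt)
```

Keeping prompt construction separate from the model call makes it possible to reuse the same prompt with either of the listed models.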

MetaGenAI UI

[Screenshot: MetaGenAI]

MetaGenAI Granularity UI

[Screenshot: Granularity]

MetaGenAI Data Analysis UI

[Screenshot: Data Analysis]

Contributions

Contributions to this project are welcome! Whether it's bug fixes, feature enhancements, or documentation improvements, feel free to submit pull requests.

Best Regards,

Varun Ajmera
