🧾 Image-to-Structured-Menu Data Extractor

This project is a Streamlit web app that extracts structured data from menu images using Google Gemini (via LangChain) and outputs a clean, downloadable CSV file. It detects food items, categories, types (Veg/Non-Veg), descriptions, prices, and add-ons.

📌 Features

🔍 Extracts structured data (restaurant, category, item, price, type, add-ons) from menu images.
📦 Uses Google Gemini (Generative AI) via LangChain with a structured output schema.
🖼️ Supports image upload (JPG, PNG, JPEG).
📋 Displays extracted data as a table in the app.
💾 Exports the result as a CSV file for download.

🧠 Tech Stack

Python
Streamlit – Web UI
LangChain – LLM chaining and structured output
Google Gemini API – Image understanding and extraction
Pydantic – Schema validation for structured data
Pandas – Dataframe manipulation
dotenv – Environment variable handling

📁 Project Structure

.
├── app.py                  # Streamlit frontend
├── model.py                # Image processing, LLM calls, data transformation
├── schema.py               # Pydantic schema for structured output
├── requirements.txt        # Python dependencies
├── .env                    # Contains GOOGLE_API_KEY
├── data/                   # Folder to save downloaded CSVs
└── images/                 # Folder of menu images

🛠️ Setup Instructions

1. Clone the repository

git clone https://github.com/a2hishek/Image-Data-Extraction.git
cd Image-Data-Extraction

2. Create a virtual environment and install dependencies

python -m venv venv
.\venv\Scripts\activate  # Windows; on macOS/Linux use: source venv/bin/activate

pip install -r requirements.txt

3. Add your Google API Key

Create a .env file in the project root with the following content:

GOOGLE_API_KEY=your_google_genai_api_key

Alternatively, skip the .env file and the app will prompt you to enter the key at runtime.
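
The key lookup described above could be sketched roughly as follows. This is a minimal illustration, not the code in app.py: the function names resolve_api_key and load_env_file are hypothetical, the terminal prompt stands in for whatever input widget the app actually shows, and python-dotenv is assumed to be the "dotenv" package listed in the tech stack.

```python
import os
from getpass import getpass
from typing import Callable, Mapping, Optional


def load_env_file() -> bool:
    """Copy entries from .env into os.environ, if python-dotenv is installed."""
    try:
        from dotenv import load_dotenv  # assumed dependency
    except ImportError:
        return False
    return bool(load_dotenv())


def resolve_api_key(
    env: Optional[Mapping[str, str]] = None,
    prompt: Callable[[str], str] = getpass,
) -> str:
    """Return GOOGLE_API_KEY from the environment, or ask the user for it."""
    env = os.environ if env is None else env
    key = env.get("GOOGLE_API_KEY", "").strip()
    if not key:
        # Fallback: interactive prompt (hypothetical stand-in for the app's own input).
        key = prompt("Enter your Google API key: ").strip()
    if not key:
        raise ValueError("GOOGLE_API_KEY is required")
    return key
```

Calling load_env_file() once at startup and then resolve_api_key() gives the .env value priority and only prompts when nothing is set.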

▶️ Run the App

streamlit run app.py

🧪 How It Works

Upload Image – Menu image with food categories and items.
Gemini LLM Prompting – Image and context are sent to Gemini using LangChain with a Pydantic schema.
Structured Response – The LLM returns JSON that conforms to the schema.
Tabular View – Parsed JSON is converted to a DataFrame.
Download CSV – Export and save results.
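
As a rough illustration of the prompting step, the sketch below packs an uploaded image and an instruction into a text-plus-image content list. The helper names are hypothetical, and the exact payload shape expected by the Gemini integration is an assumption here; in LangChain, a list like this would typically be the content of a HumanMessage sent to a chat model configured with a structured output schema.

```python
import base64
import mimetypes


def image_to_data_url(image_bytes: bytes, filename: str) -> str:
    """Encode raw image bytes as a base64 data URL (JPG/JPEG/PNG only)."""
    mime, _ = mimetypes.guess_type(filename)
    if mime not in ("image/jpeg", "image/png"):
        raise ValueError(f"Unsupported image type: {filename}")
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def build_message_content(image_bytes: bytes, filename: str, instructions: str) -> list:
    """Build a text-plus-image content list for a multimodal chat message."""
    return [
        {"type": "text", "text": instructions},
        # Assumed shape: the image travels inline as a data URL.
        {"type": "image_url", "image_url": image_to_data_url(image_bytes, filename)},
    ]
```

Sending the image inline as a data URL avoids hosting the upload anywhere, at the cost of a larger request body.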

✅ Output Format

Each row in the CSV contains:

Restaurant Name & Area
Category ID & Name
Item ID, Name, Description, Price, Type (Veg/Non-Veg)
Add-on Name & Price (if any)
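
To make the row layout concrete, here is a minimal sketch that flattens a nested menu result into rows and renders them as CSV. It uses only the standard library as a stand-in for the Pydantic and Pandas code in the project. The column names, the dictionary keys, and the choice of one row per item/add-on pair are all assumptions for illustration, not the app's confirmed format.

```python
import csv
import io

# Hypothetical column names, derived from the fields listed above.
COLUMNS = [
    "restaurant_name", "area",
    "category_id", "category_name",
    "item_id", "item_name", "description", "price", "type",
    "addon_name", "addon_price",
]


def flatten_menu(menu: dict) -> list:
    """Turn a nested menu dict into one flat row per item/add-on pair."""
    rows = []
    for category in menu.get("categories", []):
        for item in category.get("items", []):
            base = {
                "restaurant_name": menu.get("restaurant_name", ""),
                "area": menu.get("area", ""),
                "category_id": category.get("id", ""),
                "category_name": category.get("name", ""),
                "item_id": item.get("id", ""),
                "item_name": item.get("name", ""),
                "description": item.get("description", ""),
                "price": item.get("price", ""),
                "type": item.get("type", ""),
            }
            # Items without add-ons still get one row, with empty add-on cells.
            for addon in item.get("addons") or [{}]:
                rows.append({
                    **base,
                    "addon_name": addon.get("name", ""),
                    "addon_price": addon.get("price", ""),
                })
    return rows


def rows_to_csv(rows: list) -> str:
    """Render flat rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Repeating the restaurant, category, and item fields on every add-on row keeps the file flat enough to open directly in a spreadsheet.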

✅ Output Results

Example 1 (task_menu_1):

Example 2 (task_menu_2):
