Shopping Assistant API

A Flask-based API that provides shopping assistance through audio and image inputs, powered by Google's Gemini AI model.

Features

Audio input processing with speech recognition
Image analysis for product identification
Structured product information responses
Support for multiple product categories (Food, Clothes, Shoes)
Detailed size and pricing information
Rack location tracking

Prerequisites

Python 3.8 or higher
Google Cloud API key for Gemini AI
Internet connection for API calls

Installation

Clone the repository:

git clone <repository-url>
cd <repository-name>

Install dependencies:

pip install -r requirements.txt

Set up environment variables: Create a .env file in the root directory with:

GOOGLE_API_KEY=your_api_key_here
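A missing key is easier to diagnose at startup than on the first request. The following is a minimal sketch of such a check, not the project's actual code: the helper name `load_api_key` is hypothetical, and it assumes the contents of .env have already been loaded into the process environment.

```python
import os


def load_api_key(env=None):
    """Return the Gemini API key, or raise if it is not configured.

    `env` defaults to os.environ; passing a dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    key = env.get("GOOGLE_API_KEY", "").strip()
    # Treat the placeholder from the example .env line as "not set".
    if not key or key == "your_api_key_here":
        raise RuntimeError("GOOGLE_API_KEY is not set; add it to your .env file")
    return key
```

Calling `load_api_key()` once before the server starts would turn a missing key into an immediate, readable error.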

Usage

Starting the Server

python app.py

The server listens on all network interfaces (0.0.0.0) on port 5000. From the same machine it is reachable at http://localhost:5000

API Endpoints

1. Audio Input (`/hardware/audio_input`)

Method: POST
Input: Audio file (WAV format)
Output Format:

{
    "transcription": "user's speech text",
    "text_response": {
        "Section": "section name",
        "Rack": "rack position",
        "Name": "product name",
        "Price": "price",
        "Size": "size information"
    }
}
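A client might unpack this response as sketched below. The `ProductInfo` class and the `parse_audio_response` function are hypothetical names introduced for illustration; only the JSON keys come from the output format above.

```python
import json
from dataclasses import dataclass


@dataclass
class ProductInfo:
    section: str
    rack: str
    name: str
    price: str
    size: str


def parse_audio_response(body):
    """Split a JSON response body into (transcription, ProductInfo)."""
    data = json.loads(body)
    fields = data["text_response"]
    product = ProductInfo(
        section=fields["Section"],
        rack=fields["Rack"],
        name=fields["Name"],
        price=fields["Price"],
        size=fields["Size"],
    )
    return data["transcription"], product
```

A `KeyError` from this function signals that the server returned something other than the documented structure, for example an error body.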

2. Image Input (`/hardware/image_input`)

Method: POST
Input: Base64 encoded image
Output Format:

{
    "text_response": {
        "Section": "section name",
        "Rack": "rack position",
        "Name": "product name",
        "Price": "price",
        "Size": "size information"
    }
}
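Preparing a request for this endpoint could look like the sketch below. It rests on assumptions the README does not confirm: that the server runs locally, that the body is JSON, and that the Base64 string is sent under a key named "image". Check hardware.py for the real field name before relying on it.

```python
import base64
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # assumption: server running on this machine


def build_image_request(image_bytes, field="image"):
    """Build (but do not send) a POST request carrying a Base64-encoded image.

    The JSON key name ("image" by default) is a guess, not a documented value.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    payload = json.dumps({field: encoded}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/hardware/image_input",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To send it, pass the returned object to `urllib.request.urlopen` and decode the JSON body of the reply.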

Product Categories

Food
- Available sizes: Small Pack (200g), Medium Pack (500g), Large Pack (1kg)
- Products: Biscuits, Snacks, etc.
Clothes
- Available sizes: S, M, L, XL, XXL
- Products: T-shirts, Shirts, etc.
Shoes
- Available sizes: US 6-12, UK 5-11, EU 39-45
- Products: Leather Shoes, Formal Shoes, Loafers
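The size lists above can be captured in a small lookup table, which is handy when validating a "Size" value on the client side. This is an illustrative sketch only: `is_valid_size` is a hypothetical helper, and it assumes shoe sizes are whole numbers written as a system prefix followed by a number, such as "US 8".

```python
# Fixed size labels per category, copied from the lists above.
SIZES = {
    "Food": ["Small Pack (200g)", "Medium Pack (500g)", "Large Pack (1kg)"],
    "Clothes": ["S", "M", "L", "XL", "XXL"],
}

# Shoe sizes are inclusive ranges per sizing system.
SHOE_RANGES = {"US": (6, 12), "UK": (5, 11), "EU": (39, 45)}


def is_valid_size(category, size):
    """Return True if `size` is one of the documented sizes for `category`."""
    if category == "Shoes":
        system, _, number = size.partition(" ")
        if system not in SHOE_RANGES or not number.isdigit():
            return False
        low, high = SHOE_RANGES[system]
        return low <= int(number) <= high
    return size in SIZES.get(category, [])
```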

Response Format

The product details in every response contain these five fields:

Section: [section name]
Rack: [rack position]
Name: [product name]
Price: [price]
Size: [size information]
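If product details arrive as plain text in this "Key: value" layout, they can be turned into the dictionary shape used by the endpoints with a few lines of parsing. The function below is a sketch under that assumption; `parse_text_response` is a hypothetical name, not a function from this repository.

```python
FIELDS = ("Section", "Rack", "Name", "Price", "Size")


def parse_text_response(text):
    """Turn 'Key: value' lines into a dict holding the five documented fields."""
    result = {}
    for line in text.splitlines():
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key in FIELDS:
            result[key] = value.strip()
    missing = [f for f in FIELDS if f not in result]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    return result
```

Lines that do not match one of the five field names are ignored, and an incomplete reply raises `ValueError` instead of producing a partial result.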

Error Handling

The API handles various error cases:

Invalid audio input
Unsupported image format
Missing API key
Network errors
Speech recognition errors
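The README does not say which HTTP status codes or response bodies accompany these errors. As a purely hypothetical illustration of how the five cases could be mapped to consistent responses, consider the sketch below; the error names, status codes, and body layout are all assumptions, not the API's documented behavior.

```python
# Hypothetical mapping from error case to HTTP status code.
ERROR_STATUS = {
    "invalid_audio": 400,             # malformed or unreadable audio upload
    "unsupported_image_format": 415,  # image type the server cannot decode
    "missing_api_key": 500,           # server-side configuration problem
    "network_error": 502,             # upstream AI service unreachable
    "speech_recognition_error": 422,  # audio was valid but could not be transcribed
}


def error_payload(kind, detail=""):
    """Return a (body, status) pair for an error case; unknown kinds map to 500."""
    status = ERROR_STATUS.get(kind, 500)
    return {"error": kind, "detail": detail}, status
```

A Flask view can return such a `(dict, status)` tuple directly, and Flask serializes the dictionary to JSON.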

Testing

Use the provided test scripts:

test_audio.py for testing audio input
test_image.py for testing image input

Contributing

Fork the repository
Create a feature branch
Commit your changes
Push to the branch
Create a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.
