AmitRajegaonkar/bazaarbot

Shopping Assistant API

A Flask-based API that provides shopping assistance through audio and image inputs, powered by Google's Gemini AI model.

Features

  • Audio input processing with speech recognition
  • Image analysis for product identification
  • Structured product information responses
  • Support for multiple product categories (Food, Clothes, Shoes)
  • Detailed size and pricing information
  • Rack location tracking

Prerequisites

  • Python 3.8 or higher
  • Google Cloud API key for Gemini AI
  • Internet connection for API calls

Installation

  1. Clone the repository:
git clone <repository-url>
cd <repository-name>
  2. Install dependencies:
pip install -r requirements.txt
  3. Set up environment variables. Create a .env file in the root directory with:
GOOGLE_API_KEY=your_api_key_here
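
A missing or placeholder key is easier to diagnose if it is checked once at startup. The following is a minimal sketch, not the project's actual code: `load_api_key` is a hypothetical helper, and it assumes the variable has already been exported or loaded from `.env` by whatever loader the project uses.

```python
import os
from typing import Mapping, Optional


def load_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return GOOGLE_API_KEY from the environment or raise a clear error.

    Hypothetical helper for illustration; pass a dict to test it
    without touching the real process environment.
    """
    env = os.environ if env is None else env
    key = env.get("GOOGLE_API_KEY", "").strip()
    # Treat the README placeholder value as "not configured".
    if not key or key == "your_api_key_here":
        raise RuntimeError(
            "GOOGLE_API_KEY is not set; add it to .env or the environment"
        )
    return key
```

Calling `load_api_key()` with no argument reads the real process environment.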

Usage

Starting the Server

python app.py

The server listens on all network interfaces (0.0.0.0) on port 5000. From the same machine it is reachable at http://localhost:5000

API Endpoints

1. Audio Input (/hardware/audio_input)

  • Method: POST
  • Input: Audio file (WAV format)
  • Output Format:
{
    "transcription": "user's speech text",
    "text_response": {
        "Section": "section name",
        "Rack": "rack position",
        "Name": "product name",
        "Price": "price",
        "Size": "size information"
    }
}
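
On the client side, the response above can be unpacked into a typed object. This is a sketch under the assumption that the server returns exactly the JSON shape shown. `ProductInfo` and `parse_audio_response` are illustrative names, not part of the API.

```python
import json
from dataclasses import dataclass

# Field names as they appear in the documented output format.
PRODUCT_FIELDS = ("Section", "Rack", "Name", "Price", "Size")


@dataclass
class ProductInfo:
    section: str
    rack: str
    name: str
    price: str
    size: str


def parse_audio_response(body: str):
    """Split the JSON body into (transcription, ProductInfo)."""
    data = json.loads(body)
    details = data["text_response"]
    missing = [f for f in PRODUCT_FIELDS if f not in details]
    if missing:
        raise ValueError(f"text_response is missing fields: {missing}")
    product = ProductInfo(*(str(details[f]) for f in PRODUCT_FIELDS))
    return data["transcription"], product
```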

2. Image Input (/hardware/image_input)

  • Method: POST
  • Input: Base64 encoded image
  • Output Format:
{
    "text_response": {
        "Section": "section name",
        "Rack": "rack position",
        "Name": "product name",
        "Price": "price",
        "Size": "size information"
    }
}
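
Preparing the Base64 input could look like the sketch below. The JSON field name `image` is an assumption, since this README does not name the request field, and `build_image_payload` is a hypothetical helper. Check app.py for the shape the server actually expects.

```python
import base64
import json


def build_image_payload(image_bytes: bytes, field: str = "image") -> str:
    """Return a JSON request body carrying the image as Base64 text.

    "image" is a hypothetical field name, not confirmed by the README.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({field: encoded})
```

The resulting string would be sent as the POST body to /hardware/image_input with a JSON content type.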

Product Categories

  1. Food

    • Available sizes: Small Pack (200g), Medium Pack (500g), Large Pack (1kg)
    • Products: Biscuits, Snacks, etc.
  2. Clothes

    • Available sizes: S, M, L, XL, XXL
    • Products: T-shirts, Shirts, etc.
  3. Shoes

    • Available sizes: US 6-12, UK 5-11, EU 39-45
    • Products: Leather Shoes, Formal Shoes, Loafers
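
The size lists above can be held in a small lookup table to validate a size against its category. This is only an illustration: `AVAILABLE_SIZES` and `is_valid_size` are hypothetical names, and the shoe ranges are assumed to be inclusive whole sizes.

```python
# Sizes per category, copied from the list above.
# Assumption: shoe ranges are inclusive and use whole sizes only.
AVAILABLE_SIZES = {
    "Food": ["Small Pack (200g)", "Medium Pack (500g)", "Large Pack (1kg)"],
    "Clothes": ["S", "M", "L", "XL", "XXL"],
    "Shoes": (
        [f"US {n}" for n in range(6, 13)]
        + [f"UK {n}" for n in range(5, 12)]
        + [f"EU {n}" for n in range(39, 46)]
    ),
}


def is_valid_size(category: str, size: str) -> bool:
    """Return True if the size is listed for the given category."""
    return size in AVAILABLE_SIZES.get(category, [])
```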

Response Format

The product details in every response carry the same five fields:

Section: [section name]
Rack: [rack position]
Name: [product name]
Price: [price]
Size: [size information]
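
If the model returns this structure as plain text, it has to be converted into the JSON object that the endpoints return. One possible conversion is sketched below. `parse_product_text` is a hypothetical helper, and whether the server parses the text this way is an assumption.

```python
EXPECTED_FIELDS = ("Section", "Rack", "Name", "Price", "Size")


def parse_product_text(text: str) -> dict:
    """Parse 'Key: value' lines into a dict with the five expected fields."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep and key.strip() in EXPECTED_FIELDS:
            fields[key.strip()] = value.strip()
    missing = [name for name in EXPECTED_FIELDS if name not in fields]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return fields
```

Lines that do not match an expected field are ignored, so extra commentary from the model does not break parsing.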

Error Handling

The API handles the following error cases:

  • Invalid audio input
  • Unsupported image format
  • Missing API key
  • Network errors
  • Speech recognition errors
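
As an illustration of the first case, invalid audio can be rejected before it reaches speech recognition by checking that the upload parses as WAV. This sketch uses only the standard-library `wave` module. `is_valid_wav` and `make_silent_wav` are hypothetical helpers, and how the server itself validates audio is not stated here.

```python
import io
import wave


def is_valid_wav(data: bytes) -> bool:
    """Return True if the bytes parse as a WAV file with at least one frame."""
    try:
        with wave.open(io.BytesIO(data), "rb") as wav:
            return wav.getnframes() > 0
    except (wave.Error, EOFError):
        # wave.Error: not a RIFF/WAVE file; EOFError: truncated or empty input.
        return False


def make_silent_wav(frames: int = 160, rate: int = 16000) -> bytes:
    """Build a short mono 16-bit silent WAV, handy as a test fixture."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * frames)
    return buf.getvalue()
```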

Testing

Use the provided test scripts:

  • test_audio.py for testing audio input
  • test_image.py for testing image input

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Commit your changes
  4. Push to the branch
  5. Create a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.
