5418XR/VIID

VIID

Visually Impaired Image Describer (VIID)

YouTube: Watch the video

Empowering the visually impaired with AI-powered image descriptions transformed into audible experiences.

Overview

VIID combines AI models such as MiniGPT-4 and GPT-4 with image recognition and text-to-speech technology so that visually impaired users can "hear" the visual world around them.

Features

  • Image Recognition: Accurate image recognition to capture essential details.
  • AI-Powered Descriptions: MiniGPT-4 picks out visual detail, and GPT-4 enriches the resulting textual description.
  • Blind-Friendly Textual Adaptation: Descriptions are refined so they work well for a visually impaired audience.
  • Text-to-Speech Transformation: Descriptions are converted into clear, natural-sounding audio using gTTS.
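
The features above form a chain: describe the image, adapt the text, then speak it. The following is a minimal sketch of that chain, not the project's actual code. The names `describe_aloud`, `simple_adapt`, and `speak_with_gtts` are hypothetical, the adaptation step is a stand-in for whatever refinement VIID applies, and the `describe` stage is left as a callable you supply because the model calls are not documented here. Only the `gTTS(text=..., lang=...).save(...)` call reflects the real gTTS library.

```python
from typing import Callable


def simple_adapt(raw: str) -> str:
    """Hypothetical stand-in for the adaptation step: tidy text for listening."""
    text = " ".join(raw.split())  # collapse line breaks and repeated spaces
    if not text:
        return "No description is available for this image."
    text = text[0].upper() + text[1:]
    if text[-1] not in ".!?":
        text += "."  # a closing period gives the speech engine a clean stop
    return text


def speak_with_gtts(text: str, out_path: str, lang: str = "en") -> str:
    """Save `text` as an MP3 with gTTS (requires the gTTS package and network access)."""
    from gtts import gTTS  # imported lazily so the other stages run without gTTS

    gTTS(text=text, lang=lang).save(out_path)
    return out_path


def describe_aloud(
    image_path: str,
    describe: Callable[[str], str],
    adapt: Callable[[str], str] = simple_adapt,
    speak: Callable[[str, str], str] = speak_with_gtts,
    out_path: str = "description.mp3",
) -> str:
    """Run the three stages in order and return the path of the audio file."""
    raw_description = describe(image_path)  # assumption: wraps the model calls
    listener_text = adapt(raw_description)
    return speak(listener_text, out_path)
```

Passing each stage as a callable keeps the sketch testable without models or network access: supply a stub `describe` and `speak` during development, then swap in the real model wrapper and `speak_with_gtts` afterwards.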

Installation

# Clone the repository
git clone https://github.com/your_username/VIID.git

# Navigate to the directory
cd VIID

# Install required packages (consider using a virtual environment)
pip install -r requirements.txt

# Run the application
python app.py

Usage

  1. Capture or upload an image.
  2. Let VIID process the image.
  3. Listen to the detailed audible description.

What We Learned

  • High-quality data is essential for accurate image recognition.
  • Iterative development and continuous user feedback are valuable.
  • A user-centric approach matters.
  • Real-time processing and compatibility raised technical challenges that had to be addressed.

Next Steps

  • Initiate a broader user trial phase for feedback.
  • Enhance voice output quality and options.
  • Expand multilingual and dialect support.

Contributing

Feel free to fork the project, submit pull requests, or create issues. We appreciate collaboration and feedback!

License

MIT License. See LICENSE for more information.

Devpost: QR code (image)
