Vivek-Kamboj/BTP_Project


Smart Assistant for Visual Impaired Person ✨

Brief Description

This is my B.Tech. project, titled Smart Assistant for Visual Impaired Person.

Visually impaired people face many problems in their daily lives. It would be valuable if they could also interact with their environment and make use of the latest technology. Using technologies such as Artificial Intelligence, Machine Learning, and image and text recognition, we can help visually impaired people get information about their surroundings and make their daily lives easier.

Objective and Functionalities 😃

The main idea of my project is to implement a web-based application that gives visually impaired people a way to interact with and understand their surroundings. It focuses on tools that can help them, which include:

  1. Object Detection:
    The idea is to build an application that detects the objects in front of the webcam or camera of a computer or smartphone and tells the user about them by voice.
  2. Voice assistant:
    The idea is to build an assistant that takes the user's input by voice and performs basic tasks such as searching the web or switching between features in the web app.
  3. Image to text/speech:
    The idea is to build a word recognition system that extracts words from an input image, displays them as text, and reads them out by voice.
  4. Speech to text:
    The idea is to build a speech recognition system that listens to the user's speech and converts it into text. This would help a visually impaired person write down information just by saying it.
  5. Text to Speech:
    The idea here is to convert text into speech. The user passes text to the application, and the application reads it aloud on the user's behalf.

These are some of the deliverables I have planned for this project. I plan to integrate these tools into a single web application, using the React.js framework for the frontend and Python with Flask for the backend. I am building a web application because it can run on any system in the default browser, without installing any additional software. However, it may need an internet connection to load the application in the browser.
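As an illustration of the object detection feature listed above, the step between the detector and the voice output could be sketched as follows. This is a minimal sketch, not the project's actual code: the function name `describe_detections` and the sentence wording are assumptions, and the input is assumed to be a plain list of class labels produced by the detector.

```python
from collections import Counter


def describe_detections(labels):
    """Turn a list of detected class labels into one sentence for text-to-speech."""
    if not labels:
        return "No objects detected."
    counts = Counter(labels)  # keeps labels in first-seen order
    parts = []
    for label, n in counts.items():
        # Naive pluralisation (assumption): append "s" when there is more than one.
        parts.append(f"{n} {label}" + ("s" if n > 1 else ""))
    if len(parts) == 1:
        listing = parts[0]
    else:
        listing = ", ".join(parts[:-1]) + " and " + parts[-1]
    return f"I can see {listing}."


print(describe_detections(["person", "chair", "person"]))
# I can see 2 persons and 1 chair.
```

The returned string can then be passed to whichever text-to-speech mechanism the frontend uses.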

ARCHITECTURE/ METHODOLOGY

My aim in this project is to build a smart virtual assistant that helps visually impaired people in their daily routine. The assistant would give users a different view of their surroundings. It would have the following modules:

image

  1. Object Detection
    It is done using the YOLO algorithm.
  2. Image to text/speech converter
    It is done using EasyOCR.
  3. Speech to text converter
    It is done using the Web Speech API.
  4. Text to Speech
    It is done using the Web Speech API.
  5. Voice assistant
    It is done using the Web Speech API.
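For the image to text module above, a backend helper might look roughly like this. It is a sketch under assumptions: the function names, the 0.4 confidence threshold, and the English-only language list are illustrative and not taken from the project. It relies on EasyOCR's `Reader.readtext()`, which by default returns a list of (bounding box, text, confidence) entries.

```python
def join_ocr_text(results, min_confidence=0.4):
    """Join recognised text fragments, skipping low-confidence ones.

    `results` is a list of (bounding_box, text, confidence) entries, the shape
    returned by easyocr's Reader.readtext() with default settings.
    """
    words = [text.strip() for _bbox, text, conf in results if conf >= min_confidence]
    return " ".join(w for w in words if w)


def image_to_text(image_path, languages=("en",)):
    """Run EasyOCR on an image file and return the recognised text."""
    import easyocr  # imported lazily: loading the OCR models is slow

    reader = easyocr.Reader(list(languages))
    return join_ocr_text(reader.readtext(image_path))


# Hand-written sample in the same shape as readtext() output (not real OCR data).
sample = [
    ([[0, 0], [10, 0], [10, 5], [0, 5]], "EXIT", 0.98),
    ([[0, 6], [10, 6], [10, 11], [0, 11]], "~~", 0.12),
    ([[0, 12], [10, 12], [10, 17], [0, 17]], "Gate 4", 0.87),
]
print(join_ocr_text(sample))  # EXIT Gate 4
```

Filtering by confidence before speaking the text keeps noise fragments from being read aloud; the right threshold would need tuning on real images.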

I plan to build a web application to integrate these modules, because a web application can run in any system's browser, whether on a desktop or a smartphone, without installing any software. The user only needs to open the website, and the application loads in the browser and starts working. However, an internet connection is required for this to work. I am considering the React.js framework for the frontend and Python for the backend to implement this project and run these models in the system's browser.

Tech Stack 💻

React.js (frontend) and Python with Flask (backend) have been used for the development of this website.

SetUp Steps

Prerequisites: Python, Flask, npm, pip, create-react-app, etc.

For Frontend

  • Run cd ./client in the terminal to go to the frontend folder.
  • Run npm install to install the dependencies.
  • Run npm start to start the development server.

For Backend

  • Install libraries such as numpy, pandas, pillow, easyocr, tensorflow, OpenCV (opencv-python), keras, matplotlib, etc. by running pip install <library_name>.
  • Download the YOLO weights and put them in model_data. For reference.
  • Run python ./server.py
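Before starting the server, it can help to check which of the backend libraries above are already installed. The script below is a minimal sketch using only the standard library. The mapping from pip package names to import names (for example pillow to PIL and opencv-python to cv2) follows the usual conventions for those packages, and the package list itself is an assumption based on the steps above, not the project's own requirements file.

```python
import importlib.util

# pip package name -> module name used in `import` statements (assumed list)
BACKEND_PACKAGES = {
    "numpy": "numpy",
    "pandas": "pandas",
    "pillow": "PIL",
    "easyocr": "easyocr",
    "tensorflow": "tensorflow",
    "opencv-python": "cv2",
    "keras": "keras",
    "matplotlib": "matplotlib",
    "flask": "flask",
}


def missing_packages(packages):
    """Return the pip names whose import module cannot be found."""
    return [
        pip_name
        for pip_name, module in packages.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages(BACKEND_PACKAGES)
    if missing:
        print("Missing packages, install with: pip install " + " ".join(missing))
    else:
        print("All backend packages are installed.")
```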

Hurray, your app is now running on port 3000 in your browser.

Demo 👨‍💻

Link : https://www.youtube.com/watch?v=0TPdp-As1Ac

Screenshots

Object Detection

image image

Image to Text

image

Text to Speech

image

Speech to Text

image

Assistant

  • It can perform tasks like go to, search on Google, play on YouTube, weather, etc.
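The way the assistant maps a recognised phrase to a task could be sketched as follows. This is only an illustration: the exact phrase patterns ("search ... on google", "play ... on youtube", "go to ..."), the action names, and the URLs are assumptions, and in the app itself this logic would run on the transcript produced by the speech recognition step.

```python
from urllib.parse import quote_plus


def parse_command(transcript):
    """Map a spoken command to an (action, target) pair.

    Action names and URLs are illustrative assumptions, not the app's routing.
    """
    text = transcript.strip().lower()
    if text.startswith("search ") and text.endswith(" on google"):
        query = text[len("search "):-len(" on google")]
        return "open_url", "https://www.google.com/search?q=" + quote_plus(query)
    if text.startswith("play ") and text.endswith(" on youtube"):
        query = text[len("play "):-len(" on youtube")]
        return "open_url", "https://www.youtube.com/results?search_query=" + quote_plus(query)
    if text.startswith("go to "):
        # Target is the name of the feature or page to switch to.
        return "switch_feature", text[len("go to "):]
    if "weather" in text:
        return "weather", None
    return "unknown", text


print(parse_command("Search cricket score on Google"))
# ('open_url', 'https://www.google.com/search?q=cricket+score')
```

Returning an "unknown" action for unmatched phrases lets the assistant ask the user to repeat the command instead of failing silently.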