Smart Assistant for Visual Impaired Person ✨

Brief Description

This is my B.Tech. project and its title is Smart Assistant for Visual Impaired Person.

Visually impaired people face many problems in their daily life. It would be valuable if they could also interact with their environment and benefit from the latest technology. Using technologies such as Artificial Intelligence, Machine Learning, and image and text recognition, we can help visually impaired people get information about their surroundings and make their daily life easier.

Objective and Functionalities 😃

The main idea of my project is to implement a web-based application that gives visually impaired people a way to interact with and understand their surroundings. It focuses on the following tools:

Object Detection:
The idea is to build an application that detects the objects in front of the webcam or camera of a computer or smartphone and describes them to the user by voice.
Voice assistant:
The idea is to build an assistant that takes the user's input by voice and performs basic tasks such as searching the web or switching between features in the web app.
Image to text/speech:
The idea is to build a word recognition system that extracts words from a given input image, displays them as text, and reads them aloud.
Speech to text:
The idea is to build a speech recognition system that listens to the user's speech and converts it into text. This lets a visually impaired person write down information just by saying it.
Text to Speech:
The idea here is to convert text into speech. The user passes text to the application, and the application reads it aloud on the user's behalf.
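
As an illustration of the step between object detection and voice output, the sketch below turns a list of detected class labels into one sentence that could be handed to a text-to-speech engine. It is a minimal sketch, not the project's actual code: the function name `describe_detections`, the wording of the sentence, and the naive plural rule (just adding "s") are all assumptions made for this example.

```python
from collections import Counter


def describe_detections(labels):
    """Turn detected class labels into one sentence for speech output.

    `labels` is assumed to be a plain list of class names, for example
    the labels produced by an object detector for one camera frame.
    """
    if not labels:
        return "I do not see any known objects."

    # Counter keeps labels in first-seen order (dict ordering, Python 3.7+).
    counts = Counter(labels)
    parts = []
    for label, n in counts.items():
        # Naive plural: append "s" when more than one instance was detected.
        parts.append(f"{n} {label}s" if n > 1 else f"1 {label}")

    if len(parts) == 1:
        listing = parts[0]
    else:
        listing = ", ".join(parts[:-1]) + " and " + parts[-1]
    return f"I can see {listing}."


print(describe_detections(["person", "chair", "person"]))
```

Grouping repeated labels keeps the spoken sentence short, which matters when the description is read aloud for every frame or every few seconds.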

These are some of the deliverables that I have planned for this project. I plan to integrate these tools into a single web application, using React.js for the frontend and Python (Flask) for the backend. I am building a web application because it can run on any system in its default browser, without installing any additional software. However, it may need an internet connection to load the application in the browser.
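
To make the frontend/backend split concrete, here is a rough sketch of how a Flask backend could accept an image uploaded by the React frontend. This is not taken from the project's server.py: the route `/api/upload`, the form field name `image`, the `uploads` folder, and the allowed extensions are assumptions chosen for illustration.

```python
import os

UPLOAD_DIR = "uploads"                        # assumed folder name
ALLOWED_EXTENSIONS = {"png", "jpg", "jpeg"}   # assumed set of image types


def allowed_file(filename):
    """Return True if the file name has one of the allowed image extensions."""
    _, ext = os.path.splitext(filename or "")
    return ext.lower().lstrip(".") in ALLOWED_EXTENSIONS


def create_app():
    """Build the Flask app; Flask is imported here so the helper above
    can be used and tested without Flask installed."""
    from flask import Flask, jsonify, request
    from werkzeug.utils import secure_filename

    app = Flask(__name__)

    @app.route("/api/upload", methods=["POST"])  # hypothetical route
    def upload():
        file = request.files.get("image")        # hypothetical field name
        if file is None or not allowed_file(file.filename):
            return jsonify({"error": "expected an image file"}), 400
        os.makedirs(UPLOAD_DIR, exist_ok=True)
        path = os.path.join(UPLOAD_DIR, secure_filename(file.filename))
        file.save(path)
        return jsonify({"saved_as": path})

    return app
```

With Flask installed, the app could be started with `create_app().run(port=5000)`; the port is also an assumption, and the React development server would then send its requests to that address.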

ARCHITECTURE/ METHODOLOGY

My aim in this project is to build a smart virtual assistant for visually impaired people that helps them in their daily routine. The assistant gives users a different way to perceive their surroundings. It has the following modules:

Object Detection
It is done using the YOLO algorithm.
Image to text/speech converter
It is done using EasyOCR.
Speech to text converter
It is done using the Web Speech API.
Text to Speech
It is done using the Web Speech API.
Voice assistant
It is done using the Web Speech API.
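
For the image to text module, a minimal sketch of how EasyOCR can be called from Python is shown below. It assumes English text and an image path on disk; the function names `image_to_text` and `join_ocr_lines` are made up for this example and are not the project's own.

```python
def join_ocr_lines(lines):
    """Collapse OCR fragments into one string for display or speech."""
    return " ".join(s.strip() for s in lines if s and s.strip())


def image_to_text(image_path, languages=("en",)):
    """Run EasyOCR on an image and return the recognised text as one string."""
    # Imported lazily: creating a Reader may download model files on first use.
    import easyocr

    reader = easyocr.Reader(list(languages))
    # detail=0 returns only the recognised strings, without boxes or scores.
    fragments = reader.readtext(image_path, detail=0)
    return join_ocr_lines(fragments)
```

The joined string can be shown in the page as text and passed to the text to speech module to be read aloud.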

I plan to integrate these modules into a web application because a web application can run in any system's browser, whether on a desktop or a smartphone, without installing any software. The user only needs to open the website, and the application loads in the browser and starts working. However, an internet connection is required for this to work. I am using the React.js framework for the frontend and Python for the backend to implement this project and run these models in the system's browser.

Tech Stack 💻

React.js has been used for the frontend and Python (Flask) for the backend of this website.

SetUp Steps

Prerequisites: Python, Flask, npm, pip, create-react-app, etc.

For Frontend

Run cd ./client in a terminal to go to the frontend folder.
Run npm install to install the dependencies.
Run npm start.

For Backend

Install libraries such as numpy, pandas, pillow, easyocr, tensorflow, opencv, keras, matplotlib, etc. by running pip install <library_name>.
Download the YOLO weights and put them in model_data. For reference.
Run python ./server.py

Hurray, your app is now running on port 3000 in your browser.

Demo 👨‍💻

Link : https://www.youtube.com/watch?v=0TPdp-As1Ac

Screenshots

Object Detection

Image to Text

Text to Speech

Speech to Text

Assistant

It can perform tasks such as go to, search on Google, play on YouTube, weather, etc.
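
The sketch below shows one way a recognised voice transcript could be mapped to an assistant action. In the app itself the speech is captured in the browser through the Web Speech API, so this Python version only illustrates the routing logic. The exact command phrases, the action names, and the idea that "go to" is followed by a feature name are assumptions based on the task list above.

```python
from urllib.parse import quote_plus

# (spoken prefix, action name, URL template or None); all values are assumptions.
COMMANDS = [
    ("search on google ", "open_url", "https://www.google.com/search?q={}"),
    ("play on youtube ", "open_url", "https://www.youtube.com/results?search_query={}"),
    ("go to ", "switch_feature", None),
    ("weather", "weather", None),
]


def route_command(transcript):
    """Map a spoken transcript to an (action, argument) pair."""
    text = transcript.strip().lower()
    for prefix, action, url_template in COMMANDS:
        if text.startswith(prefix):
            argument = text[len(prefix):].strip()
            if url_template is not None:
                # URL-encode the spoken query before placing it in the link.
                return (action, url_template.format(quote_plus(argument)))
            return (action, argument)
    return ("unknown", text)


print(route_command("Search on Google guide dogs"))
print(route_command("go to object detection"))
```

Keeping the commands in a table makes it easy to add a new phrase without touching the matching logic.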

Vivek-Kamboj/BTP_Project

Folders and files

Latest commit

History

Repository files navigation

Smart Assistant for Visual Impaired Person ✨

Brief Description

Objective and Functionalities 😃

ARCHITECTURE/ METHODOLOGY

Tech Stack 💻

SetUp Steps

For Frontend

For Backend

Demo 👨‍💻

Screenshots

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages