Conversational Robot

Robotics Club Summer Project 2020

Aim

The aim of this project was to build a talking bot: one that listens to the user's voice and generates meaningful, contextual responses according to their intent, much like a human conversation.

Ideation

The project was divided into three parts: speech recognition, response generation, and text-to-speech conversion.

Overall Pipeline of the Project
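As a rough illustration of how the three stages chain together, here is a minimal sketch of a single conversational turn. The function name `run_turn` and the three stage callables are hypothetical and are not taken from the repository; the real stages would wrap the speech recognizer, the response model, and the text-to-speech step.

```python
from typing import Callable, Optional


def run_turn(
    listen: Callable[[], Optional[str]],
    respond: Callable[[str], str],
    speak: Callable[[str], None],
) -> Optional[str]:
    """Run one conversational turn; return the reply, or None if nothing was heard."""
    transcript = listen()        # stage 1: speech -> text transcript
    if not transcript:
        return None              # nothing recognised, so skip this turn
    reply = respond(transcript)  # stage 2: transcript -> response text
    speak(reply)                 # stage 3: response text -> speech
    return reply


if __name__ == "__main__":
    # Stand-in stages so the sketch runs without a microphone or model weights.
    spoken = []
    run_turn(lambda: "hello there", lambda text: "you said: " + text, spoken.append)
    print(spoken)
```

Passing the stages in as callables keeps each part replaceable, which matches how the speech recognizer was later swapped for an API.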

Speech Recognition

We first built a Deep Speech 2-like architecture for this purpose. Eventually we used the google-speech-to-text (gstt) API to convert speech to text transcripts, with a WER (Word Error Rate) of 4.7%.
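For context, WER is conventionally computed as (substitutions + deletions + insertions) divided by the number of words in the reference transcript. The sketch below shows one way to compute it with a word-level edit distance. It is an illustration of the metric only, not the evaluation code used in this project, and the function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between the two strings, divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

A WER of 4.7% therefore means roughly one word-level error for every 21 reference words.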

Response Generation

The second step in our pipeline is generating a conversational response once the input speech has been recognised. We tried two distinct response generation models, both trained on a subset of the OpenSubtitles dataset:

- Seq2Seq with Message Attention
- Topic Aware Seq2Seq with Message Attention
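To give a feel for what attention over the input message does, the sketch below computes attention weights and a context vector for one decoder step. It is a simplified illustration under stated assumptions: the dot-product score is chosen for brevity and may not match the scoring function used in this project, the vectors are plain Python lists rather than learned hidden states, and the name `attention_context` is hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def attention_context(
    query: Sequence[float],
    encoder_states: Sequence[Sequence[float]],
) -> Tuple[List[float], List[float]]:
    """Return (weights, context) for one decoder step over the encoder states."""
    # Score each encoder state against the decoder query (dot product, an assumption).
    scores = [sum(q * h for q, h in zip(query, state)) for state in encoder_states]

    # Softmax over the scores; subtracting the max keeps exp() numerically stable.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Context vector: weighted sum of the encoder states.
    dim = len(encoder_states[0])
    context = [
        sum(w * state[d] for w, state in zip(weights, encoder_states))
        for d in range(dim)
    ]
    return weights, context


if __name__ == "__main__":
    weights, context = attention_context([1.0, 0.0], [[2.0, 0.0], [0.0, 2.0]])
    print(weights, context)
```

The topic-aware variant described in the referenced paper additionally attends over topic words, which this sketch does not cover.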

Text to speech conversion

We used the google-text-to-speech (gtts) API to convert the text of each response back to speech.
A temporary mp3 file is created from the Response Generator's textual response and played using pyglet.
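The step described above can be sketched roughly as follows, assuming the `gTTS` and `pyglet` packages are installed and that pyglet can decode mp3 on the machine (this may require an extra decoder such as FFmpeg). The helper names `make_temp_mp3_path` and `speak` are hypothetical and are not taken from the repository.

```python
import os
import tempfile
import time


def make_temp_mp3_path() -> str:
    """Create an empty temporary .mp3 file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".mp3")
    os.close(fd)
    return path


def speak(text: str, lang: str = "en") -> None:
    """Synthesise `text` to a temporary mp3 and play it (needs network access)."""
    # Imported here so the module still loads when the audio stack is absent.
    from gtts import gTTS
    import pyglet

    path = make_temp_mp3_path()
    try:
        gTTS(text=text, lang=lang).save(path)             # text -> mp3 file
        sound = pyglet.media.load(path, streaming=False)  # decode fully into memory
        sound.play()
        time.sleep(sound.duration or 0)                   # wait for playback to finish
    finally:
        os.remove(path)                                   # clean up the temporary file


if __name__ == "__main__":
    speak("Hello, how can I help you?")
```

Loading with `streaming=False` decodes the clip up front, so the temporary file can be deleted as soon as playback is over.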

Installation

Install the required dependencies:

$ cd ConversationalRobot/integration
$ pip install -r requirements.txt

Download the model weights and parameters from here.

Usage

usage: eval_script.py

The bot starts up and begins accepting speech input.

Documentation

Here's the documentation of the project.

Demonstration

Here's a video demonstrating the functioning of the bot as well as the use of a GUI in tkinter.

References

Deep Speech 2: End-to-End Speech Recognition in English and Mandarin
- Link : [https://arxiv.org/abs/1512.02595]
- Author(s)/Organization : Baidu Research – Silicon Valley AI Lab
- Tags : Speech Recognition
- Published : 8 Dec, 2015
Topic Aware Neural Response Generation
- Link : [https://arxiv.org/abs/1606.08340]
- Authors : Chen Xing, Wei Wu, Yu Wu, Jie Liu, Yalou Huang, Ming Zhou, Wei-Ying Ma
- Tags : Neural response generation; Sequence to sequence model; Topic aware conversation model; Joint attention; Biased response generation
- Published : 21 Jun 2016 (v1), 19 Sep 2016 (v2)
Topic Modelling and Event Identification from Twitter Textual Data
- Link : [https://arxiv.org/abs/1608.02519]
- Authors : Marina Sokolova, Kanyi Huang, Stan Matwin, Joshua Ramisch, Vera Sazonova, Renee Black, Chris Orwa, Sidney Ochieng, Nanjira Sambuli
- Tags : Latent Dirichlet Allocation; Topic Models; Statistical machine translation
- Published : 8 Aug 2016
OpenSubtitles (Dataset)
- Link : [http://opus.nlpl.eu/OpenSubtitles-v2018.php]

ShivenTripathi/ConversationalRobot
