Speech to Text | Meeting summarizer model

Description

The goal is to build a model that transcribes audio from meetings into a text file, which on completion is summarized into a short, concise list of bullet points covering the entire meeting. The initial idea is to implement it with pre-trained models, but accuracy and efficiency have to be taken care of.
There are a few considerations. First, it should ideally also support multilingual conversations, with English as the primary language and additional support for Hindi and at least the default regional language.
The end goal, however, is to wrap this model in a presentable web app with a graphical interface to start the recording and add the meeting administrators, who will receive the summary as a text file in their mailbox automatically, without manual intervention.
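
The automatic delivery step described above can be sketched with Python's standard library alone. This is only a minimal sketch under assumptions: the function names, the subject line, the attachment name `summary.txt`, and the SMTP host, port, and credentials are hypothetical placeholders and are not taken from the project.

```python
import smtplib
from email.message import EmailMessage


def build_summary_email(sender, recipients, summary_text, filename="summary.txt"):
    """Build an e-mail that carries the meeting summary as a text attachment."""
    msg = EmailMessage()
    msg["Subject"] = "Meeting summary"  # hypothetical subject line
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg.set_content("The summary of the meeting is attached as a text file.")
    # Passing a str makes the standard library create a text/plain attachment.
    msg.add_attachment(summary_text, subtype="plain", filename=filename)
    return msg


def send_summary(msg, host, port=587, username=None, password=None):
    """Send the message over SMTP with STARTTLS (host and port are assumptions)."""
    with smtplib.SMTP(host, port) as server:
        server.starttls()
        if username and password:
            server.login(username, password)
        server.send_message(msg)
```

Building the message is kept separate from sending it, so the web app could assemble and inspect the e-mail for every administrator without needing a mail server during development.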

Problem Statement

To create a machine learning model that can listen to the audio from meetings, convert the speech to text, and finally output a summary of the entire meeting in text format. This model can further be wrapped inside a graphical interface for easier access, where the summarized text is sent to the meeting administrators specified at the outset.

Course of Action

Setting the baseline

using Google's standard API to convert speech to text¹
- setting a baseline
- finding accuracy
- improving on the same
dumping the output to a text file
- that is then picked up for summarizing
- using ~~gensim~~² ³ sumy standard module⁴ ⁵
- improving on the same
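
The baseline steps above can be sketched roughly as follows. This is a hedged sketch, not the repository's actual code: it assumes the `SpeechRecognition` and `sumy` packages are installed (sumy's English tokenizer also needs NLTK's tokenizer data), and the function names, the `en-IN` language code, and the file names `dump.txt` and `summary.txt` are illustrative assumptions. The word error rate helper is one common way to put a number on "finding accuracy"; it needs a hand-made reference transcript to compare against.

```python
from pathlib import Path


def transcribe_file(wav_path, language="en-IN"):
    """Transcribe a WAV file with Google's web API via SpeechRecognition."""
    import speech_recognition as sr  # third-party: pip install SpeechRecognition

    recognizer = sr.Recognizer()
    with sr.AudioFile(str(wav_path)) as source:
        audio = recognizer.record(source)  # read the whole file into memory
    try:
        return recognizer.recognize_google(audio, language=language)
    except sr.UnknownValueError:
        return ""  # the speech could not be understood


def summarize_text(text, sentence_count=5):
    """Return the top sentences picked by sumy's LSA summarizer."""
    from sumy.nlp.tokenizers import Tokenizer
    from sumy.parsers.plaintext import PlaintextParser
    from sumy.summarizers.lsa import LsaSummarizer

    parser = PlaintextParser.from_string(text, Tokenizer("english"))
    return [str(s) for s in LsaSummarizer()(parser.document, sentence_count)]


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1] / max(len(ref), 1)


def run_baseline(wav_path, dump_path="dump.txt", summary_path="summary.txt"):
    """Transcribe, dump the text to a file, then summarize the dumped text."""
    Path(dump_path).write_text(transcribe_file(wav_path), encoding="utf-8")
    bullets = summarize_text(Path(dump_path).read_text(encoding="utf-8"))
    Path(summary_path).write_text(
        "\n".join(f"- {b}" for b in bullets), encoding="utf-8"
    )
    return bullets
```

For example, `word_error_rate("the meeting starts at ten", "the meeting start at ten")` gives `0.2`, one wrong word out of five. Tracking this number against a few hand-transcribed clips gives a concrete baseline to improve on.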

Improving on the baseline

having got the baseline in
- SpeechRecognition⁶ (with Google's API) for converting speech to text
- sumy's LsaSummarizer⁷ for summarizing the contents
to use them together and streamline the model
- bridge the gap between the two: preprocessing
- process the converted text to add proper punctuation and markings
- then run the summarizing models on the processed text block
thus, a middleware leveraging a natural language processing model is required
- current options: nltk⁸, openai⁹, spacy¹⁰
major shift in workflow, found a better option:¹¹
- vosk model for speech-to-text conversion
- preprocessing through a transformers model
- finally, summarizing through pipeline()
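
The revised workflow above might look roughly like the sketch below. It is an illustration under assumptions, not the project's implementation: it assumes the `vosk` and `transformers` packages are installed, that a Vosk model has been downloaded to a local directory (`model_dir` is a hypothetical path), that the audio is a 16-bit mono PCM WAV file, and that the installed transformers version provides the `summarization` pipeline task. The punctuation/preprocessing step is left out because the source does not name the model used for it. The word-based chunking is an added assumption to keep each input short enough for a summarization model.

```python
import json
import wave


def transcribe_with_vosk(wav_path, model_dir):
    """Transcribe a 16-bit mono PCM WAV file offline with a Vosk model."""
    from vosk import KaldiRecognizer, Model  # third-party: pip install vosk

    pieces = []
    with wave.open(str(wav_path), "rb") as wav:
        recognizer = KaldiRecognizer(Model(model_dir), wav.getframerate())
        while True:
            data = wav.readframes(4000)
            if not data:
                break
            # AcceptWaveform returns True when an utterance has been finalized.
            if recognizer.AcceptWaveform(data):
                pieces.append(json.loads(recognizer.Result()).get("text", ""))
        pieces.append(json.loads(recognizer.FinalResult()).get("text", ""))
    return " ".join(piece for piece in pieces if piece)


def chunk_words(text, max_words=400):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [
        " ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)
    ]


def summarize_with_pipeline(text, max_words=400):
    """Summarize each chunk with the transformers summarization pipeline."""
    from transformers import pipeline  # third-party: pip install transformers

    summarizer = pipeline("summarization")  # downloads a default model on first use
    bullets = []
    for chunk in chunk_words(text, max_words):
        result = summarizer(chunk, max_length=60, min_length=10, do_sample=False)
        bullets.append(result[0]["summary_text"].strip())
    return bullets
```

Each chunk yields one summary string, which maps naturally onto the bullet-point list the project aims for. The chunk size and the `max_length`/`min_length` values are placeholders to tune.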

Advanced Optimization and Improvement

Context-aware summary
- start a new paragraph when the context changes
- might require deep learning
- can separate two summaries: a general summary and a context-based summary
Multilingual speech optimization
Adapting to bandwidth, backup solutions
- recording the audio
- fall back to recording and transcribing the saved audio instead of real-time transcription
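
The bandwidth fallback described above can be sketched as a small wrapper. This is a minimal sketch under assumptions: `live_transcriber` and `offline_transcriber` are hypothetical callables standing in for whatever live and file-based transcription functions the project ends up using, and the set of exceptions treated as recoverable is a guess to adjust to the chosen speech library.

```python
def transcribe_with_fallback(
    live_transcriber,
    offline_transcriber,
    saved_audio_path,
    recoverable=(OSError,),
):
    """Try live transcription first; fall back to the saved recording.

    Returns a (text, mode) tuple where mode is "live" or "offline".
    """
    try:
        text = live_transcriber()
        if text.strip():
            return text, "live"
    except recoverable:
        # Network or device problems: fall through to the saved audio.
        pass
    return offline_transcriber(saved_audio_path), "offline"
```

`ConnectionError` and `TimeoutError` are subclasses of `OSError`, so the default already covers typical network failures. Library-specific errors can be added through the `recoverable` parameter.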

References & Bibliography

¹ GfG articles on speech-to-text Python: Con...
² Knowledge Base, turing.com
³ Developer's documentation for gensim
⁴ Sasha Bondar's blog post on reintech.io
⁵ Official PyPI documentation for sumy
⁶ Official PyPI documentation for SpeechRecognition
⁷ Official PyPI documentation for sumy
⁸ Official PyPI documentation for nltk
⁹ Official PyPI documentation for openai
¹⁰ Official PyPI documentation for spacy
¹¹ DataQuest's blog post, GitHub

ranpy13/stt-meeting-summary
