scikit-surgeryspeech

Author: Kim-Celine Kahl

scikit-surgeryspeech is part of the SciKit-Surgery software project, developed at the Wellcome EPSRC Centre for Interventional and Surgical Sciences, part of University College London (UCL).

scikit-surgeryspeech supports Python 3.6.

scikit-surgeryspeech runs the Python Speech Recognition API in the background, listening for a specific keyword. After saying the keyword you can say different commands, which are converted to Qt signals.

Speech recognition is done by the Google Cloud API. To use it you need credentials; alternatively, you can change the recognition service.

Keyword detection is done by the Porcupine API, which should have been installed automatically via the pvporcupine dependency.

Please explore the project structure, and implement your own functionality.

Example usage

To run an example, just start

sksurgeryspeech.py -c example_config.json

The config file should define the paths for the Porcupine library and, if you are using the Google Cloud API, the path to its credentials file.

You can then say the keyword, which depends on the Porcupine keyword file you chose, followed by a command. The command "quit" exits the application.

Note: after each command, you need to say the keyword again before the application will listen for the next command.
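The keyword-then-command flow can be sketched as a small state machine. This is an illustration only, not the project's implementation: the function name `process_utterances` is hypothetical, the keyword `jarvis` is taken from the example keyword file name, and the sketch works on already-recognised text rather than audio.

```python
def process_utterances(utterances, keyword="jarvis"):
    """Return the commands accepted from a sequence of recognised phrases.

    A phrase only counts as a command if the keyword was heard immediately
    before it. After each command the keyword must be said again.
    """
    accepted = []
    armed = False  # True once the keyword has been heard
    for phrase in utterances:
        text = phrase.strip().lower()
        if not armed:
            # Ignore everything until the keyword is heard.
            armed = (text == keyword)
            continue
        # The keyword was heard: treat this phrase as the command, then disarm.
        accepted.append(text)
        armed = False
        if text == "quit":
            break
    return accepted


if __name__ == "__main__":
    heard = ["next", "jarvis", "next", "previous", "jarvis", "quit", "jarvis", "next"]
    print(process_utterances(heard))  # prints ['next', 'quit']
```

The first "next" and the "previous" are ignored because no keyword preceded them, and nothing after "quit" is processed.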

Developing

Cloning

You can clone the repository using the following command:

git clone https://github.com/SciKit-Surgery/scikit-surgeryspeech

If you have problems running the application, you might need to install portaudio.

Mac :

brew install portaudio

Ubuntu :

sudo apt-get install libasound-dev portaudio19-dev

If you are going to try sphinx, you might also need to install swig and the PulseAudio development libraries.

Ubuntu :

sudo apt-get install swig libpulse-dev

Set up the Porcupine keyword detection

Set the following variables in the configuration file:

"porcupine dynamic library path" : ".tox/py37/lib/python3.7/site-packages/pvporcupine/lib/linux/x86_64/libpv_porcupine.so",
"porcupine model file path" : ".tox/py37/lib/python3.7/site-packages/pvporcupine/lib/common/porcupine_params.pv",
"porcupine keyword file" : [".tox/py37/lib/python3.7/site-packages/pvporcupine/resources/keyword_files/linux/jarvis_linux.ppn"],
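Because these entries are file paths that differ between machines and Python versions, it can help to check them before starting the application. The following is a minimal sketch using only the standard library. The helper names `load_config` and `missing_porcupine_paths` are hypothetical and not part of scikit-surgeryspeech; the key names are the ones shown above.

```python
import json
import os

# Configuration keys that are expected to hold paths to Porcupine files.
PATH_KEYS = (
    "porcupine dynamic library path",
    "porcupine model file path",
    "porcupine keyword file",
)


def load_config(filename):
    """Read a JSON configuration file into a dictionary."""
    with open(filename, "r", encoding="utf-8") as config_file:
        return json.load(config_file)


def missing_porcupine_paths(config):
    """Return (key, path) pairs for entries that are absent or not files."""
    missing = []
    for key in PATH_KEYS:
        value = config.get(key)
        if value is None:
            missing.append((key, None))
            continue
        # "porcupine keyword file" holds a list; the other keys hold a string.
        paths = value if isinstance(value, list) else [value]
        for path in paths:
            if not os.path.isfile(path):
                missing.append((key, path))
    return missing
```

For example, `missing_porcupine_paths(load_config("example_config.json"))` returns an empty list when every configured file exists, and otherwise lists the entries that need fixing.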

You can also generate your own keyword files.

If you are using the speech recognition service within your own application, you have to start a background thread which calls the method to listen to the keyword over and over again.

You can find an example of how to create such a thread in sksurgeryspeech_demo.py.
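As a rough illustration of the idea, independent of the demo, the sketch below uses the standard library `threading` module to call a listening function repeatedly until asked to stop. The class name `KeywordListenerThread` and the `listen_once` callable are hypothetical stand-ins for whatever method your application uses to listen for the keyword; the demo itself may use a different threading mechanism.

```python
import threading
import time


class KeywordListenerThread(threading.Thread):
    """Call a listener callable over and over until stop() is requested."""

    def __init__(self, listen_once, pause=0.0):
        # daemon=True so a forgotten thread does not block interpreter exit.
        super().__init__(daemon=True)
        self._listen_once = listen_once
        self._pause = pause
        self._stop_requested = threading.Event()

    def run(self):
        while not self._stop_requested.is_set():
            self._listen_once()
            # wait() returns early if stop() is called during the pause.
            self._stop_requested.wait(self._pause)

    def stop(self):
        """Ask the loop to finish after the current listen call returns."""
        self._stop_requested.set()


if __name__ == "__main__":
    calls = []
    # Stand-in for the real keyword listening call.
    worker = KeywordListenerThread(lambda: calls.append(time.time()), pause=0.01)
    worker.start()
    time.sleep(0.1)
    worker.stop()
    worker.join()
    print(f"listener was called {len(calls)} times")
```

Using an event rather than a plain boolean flag lets `stop()` interrupt the pause between calls, so shutdown does not have to wait for the full pause interval.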

Use the Google Cloud speech recognition service

To use the Google Cloud speech recognition service, you need to get the credentials first. After signing up, you should get a JSON file with your credentials. Download this file and set the following entry in the configuration file to the path of your JSON file:

"google credentials file" : "snappy-speech-6ff24bf3e262.json",

You should then be able to run the application.

Change speech recognition service

You can try different speech recognition services by changing the recogniser entry in the config file. sphinx, google and google_cloud have all been tested. Other options are possible but may not be implemented yet.

"recogniser" : "sphinx"
"recogniser" : "google" 
"recogniser" : "google_cloud"
"recogniser" : "wit"
"recogniser" : "bing"
"recogniser" : "azure"
"recogniser" : "houndify"
"recogniser" : "ibm"

Python development

This project uses tox. Start with a clean Python environment, then do:

pip install tox
tox

The commands that tox runs are defined in tox.ini.

Installing

You can pip install directly from the repository as follows:

pip install git+https://github.com/SciKit-Surgery/scikit-surgeryspeech

Contributing

Please see the contributing guidelines.

Useful links

Source code repository

Licensing and copyright

Acknowledgements

Supported by Wellcome and EPSRC.

