Ability to recognize users by voice #267

Open
aaronchantrill opened this issue May 3, 2020 · 2 comments · May be fixed by #367

Comments

@aaronchantrill
Contributor

Detailed Description

Naomi should be able to respond differently to different users. If a family member asks "do I have any emails", Naomi should not need to ask "who are you?" The user's voice would then act as a sort of authorization. As part of the speech-to-text training, I would ultimately like to train a separate acoustic model for each member of the family. Identifying the speaker by voice before selecting the acoustic model would make it possible to use a model optimized for that speaker, which should lead to better recognition overall.
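
To make the idea above concrete, here is a minimal sketch of "identify the speaker first, then pick the acoustic model". It is not Naomi code. It assumes some speaker-verification model has already turned each utterance into a fixed-length embedding vector, and every name in it (`identify_speaker`, `select_acoustic_model`, the `models/...` paths, the 0.75 threshold) is hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify_speaker(embedding, enrolled, threshold=0.75):
    """Return the best-matching enrolled name, or None if no match is close enough.

    `enrolled` maps a user name to a reference embedding (hypothetical format).
    The threshold is an arbitrary placeholder and would need tuning on real data.
    """
    best_name, best_score = None, -1.0
    for name, reference in enrolled.items():
        score = cosine_similarity(embedding, reference)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None


def select_acoustic_model(speaker, models, default="generic"):
    """Pick the per-speaker acoustic model, falling back to a shared one."""
    return models.get(speaker, models[default])


# Toy data: three-dimensional "embeddings" stand in for real model output.
enrolled = {"alice": [0.9, 0.1, 0.0], "bob": [0.1, 0.9, 0.2]}
models = {
    "alice": "models/alice",
    "bob": "models/bob",
    "generic": "models/generic",
}

speaker = identify_speaker([0.85, 0.15, 0.05], enrolled)
print(speaker, select_acoustic_model(speaker, models))
```

Returning `None` for an unrecognized voice keeps the fallback explicit: an unknown speaker gets the generic model, and anything that depends on identity (such as reading email) can still prompt "who are you?".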

Context

This could allow a database to be built up around each user, and also help improve speech recognition.

Possible Implementation

@aaronchantrill
Contributor Author

I'm looking at using this project for an initial test: https://github.com/Suhee05/Text-Independent-Speaker-Verification

NaomiSTTTrainer.py has allowed you to enter a name for a while now, so I have a database with a bunch of recordings labeled with my own name and just a few with other people's names. It would be interesting to see how many recordings are needed to differentiate between two individuals, and also how much audio is required to do a check.
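
One way the "how many recordings are needed" question could be tested is sketched below. This is only an assumed experiment design, not something taken from the linked project: it supposes each labeled recording has already been converted to an embedding vector, enrolls each speaker from the first few recordings, and measures how often the remaining recordings are attributed to the right person. The function names and the toy data are hypothetical.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def accuracy_for_enrollment_size(recordings, n_enroll, seed=0):
    """Enroll each speaker from n_enroll recordings and test on the rest.

    `recordings` maps a speaker label to a list of embeddings.
    Returns the fraction of held-out recordings assigned to the correct
    speaker, or None if nothing is left to test on.
    """
    if n_enroll < 1:
        raise ValueError("n_enroll must be at least 1")
    rng = random.Random(seed)
    profiles, held_out = {}, []
    for speaker, vectors in recordings.items():
        shuffled = list(vectors)
        rng.shuffle(shuffled)
        # The speaker profile is the average of the enrollment embeddings.
        profiles[speaker] = mean_vector(shuffled[:n_enroll])
        held_out.extend((speaker, v) for v in shuffled[n_enroll:])
    if not held_out:
        return None
    correct = 0
    for speaker, vector in held_out:
        guess = max(profiles, key=lambda name: cosine_similarity(vector, profiles[name]))
        if guess == speaker:
            correct += 1
    return correct / len(held_out)


# Toy, well-separated data standing in for real embeddings.
recordings = {
    "a": [[1.0, 0.0], [0.9, 0.1], [0.95, 0.05]],
    "b": [[0.0, 1.0], [0.1, 0.9], [0.05, 0.95]],
}
for n in (1, 2):
    print(n, accuracy_for_enrollment_size(recordings, n))
```

With a database as lopsided as the one described (many recordings for one name, a few for the others), the number of usable enrollment recordings is capped by the least-recorded speaker, so the sweep would only cover small values of `n_enroll` until more labeled audio is collected.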

@aaronchantrill
Contributor Author

aaronchantrill commented Aug 30, 2021

I've been working with Speaker-Verification-Toolkit and have a test project at python_speaker_verification_test. This package is easy to install on x86_64 systems (pip install speaker-verification-toolkit) but a pain on ARM (Raspberry Pi). To install it on ARM, you first need to install version 11 of llvm, which isn't really obvious from the error messages. Also, when building llvm from source it is important to build it with CMAKE_BUILD_TYPE=Release, or else you will run out of memory during the linking step.

$ wget https://github.com/llvm/llvm-project/releases/download/llvmorg-11.1.0/llvm-11.1.0.src.tar.xz
$ tar -xvf llvm-11.1.0.src.tar.xz
$ cd llvm-11.1.0.src/
$ mkdir build
$ cd build
$ cmake -DCMAKE_BUILD_TYPE=Release ..
$ make
$ sudo make install
$ pip install speaker-verification-toolkit

@aaronchantrill aaronchantrill linked a pull request Nov 6, 2022 that will close this issue
8 tasks