Precise Custom Wake Word mic to be used and additional data set training #53

Closed
tech-novic opened this issue Dec 29, 2018 · 17 comments

Comments

@tech-novic

I trained a model on a dataset of 200+ samples (about 15 samples from different individuals), using method 2 to take out false negatives. When tested, the model works well with the mic I used to collect the dataset. With other mics the results differ: with a laptop mic there are a lot of false positives, and with another mic it is very hard to get an activation. I also exported the model to TensorFlow and used it with a 6-mic array, and couldn't get good activation.
I have 2 questions:

  1. Is there any dependency on the mic used for capturing the dataset and what is the recommendation?
  2. How do I train the same model with new dataset?
@MatthewScholefield
Collaborator

MatthewScholefield commented Dec 29, 2018

I have indeed noticed a dependency on the mic used. This could be both because of the different noise frequencies it picks up, and because of the different volumes. Apart from recording samples on a few different mics, you can try recording a long audio of silence on a few other mics (almost like getting the "noise profile") and using precise-add-noise to create a dataset with the noise of all the mics.
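The per-mic noise collection described above could be organized along these lines. This is only a sketch: the folder names `noise-recordings/<mic-name>/` and `noise/` are hypothetical, `collect_noise` is a helper made up for this example, and it assumes each mic's silence recordings are already saved as wav files. It merges them into one folder, prefixing each file with the mic name so nothing is overwritten. That folder can then be passed to precise-add-noise as the noise folder.

```shell
#!/bin/sh
# Merge silence/noise recordings from several mics into one noise folder.
# Assumed (hypothetical) layout: noise-recordings/<mic-name>/*.wav

collect_noise() {
    src_root=$1   # folder with one subfolder per mic
    dest=$2       # combined noise folder
    mkdir -p "$dest"
    for mic_dir in "$src_root"/*/; do
        [ -d "$mic_dir" ] || continue
        mic=$(basename "$mic_dir")
        for wav in "$mic_dir"*.wav; do
            [ -e "$wav" ] || continue
            # Prefix with the mic name so files from different mics cannot collide
            cp "$wav" "$dest/${mic}-$(basename "$wav")"
        done
    done
}

# Example call; only runs if the (hypothetical) source folder exists
if [ -d noise-recordings ]; then
    collect_noise noise-recordings noise
fi
```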

However, this might not solve the volume issue. Thinking about it, I should probably add a feature to precise-add-noise to randomly vary the output volume.

Edit: And you can train the same model on a new dataset just like you would expect (passing the name of the model in the command, but passing in the folder with the new dataset).

Let me know if this helps.

@tech-novic
Author

Thanks Matthew, I will first try collecting noise from different mics, and I also plan to collect data from different mic sources.
I will keep you posted on the outcome.

@tech-novic
Author

tech-novic commented Dec 30, 2018

I have one question on precise-add-noise. Will the syntax be:
precise-add-noise <path to folder containing wake word dataset> <path to folder containing noise dataset> <path to folder to write output>

After this, I assume I should use precise-train <model.net> <path to output from above>

@MatthewScholefield
Collaborator

You can always run the command with --help to check how to use it:

$ precise-add-noise --help
usage: precise-add-noise [-h] [-tg TAGS_FILE] [-if INFLATION_FACTOR]
                         [-nl NOISE_RATIO_LOW] [-nh NOISE_RATIO_HIGH]
                         folder noise_folder output_folder

Create a duplicate dataset with added noise

positional arguments:
  folder                Folder containing source dataset
  noise_folder          Folder with wav files containing noise to be added
  output_folder         Folder to write the duplicate generated dataset

optional arguments:
  -h, --help            show this help message and exit
  -tg TAGS_FILE, --tags-file TAGS_FILE
                        Tags file to optionally load from. Default: -
  -if INFLATION_FACTOR, --inflation-factor INFLATION_FACTOR
                        The number of noisy samples generated per single
                        source sample. Default: 1
  -nl NOISE_RATIO_LOW, --noise-ratio-low NOISE_RATIO_LOW
                        Minimum random ratio of noise to sample. 1.0 is all
                        noise, no sample sound. Default: 0.0
  -nh NOISE_RATIO_HIGH, --noise-ratio-high NOISE_RATIO_HIGH
                        Maximum random ratio of noise to sample. 1.0 is all
                        noise, no sample sound. Default: 0.4

As you can see, the 3 required arguments, in order, are:

  folder                Folder containing source dataset
  noise_folder          Folder with wav files containing noise to be added
  output_folder         Folder to write the duplicate generated dataset

As you may now guess, this will create a duplicated dataset with the noise added in. You would then point the data folder in precise-train to this new, duplicated dataset.
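Put together, the two steps might look like the sketch below. The folder and model names (`hey-computer`, `noise`, `hey-computer-noisy`, `hey-computer.net`) are placeholders, and the `-if` and `-nh` values are arbitrary examples rather than recommendations. The small `run` helper is not part of Precise; it just prints the command instead of running it when the tool is not on the PATH.

```shell
#!/bin/sh
# Hypothetical names; substitute your own dataset, noise folder, and model
dataset=hey-computer
noise=noise
augmented=hey-computer-noisy
model=hey-computer.net

# Helper (not part of Precise): run the command if it is installed,
# otherwise just print what would have been run.
run() {
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# 1. Create a noisy copy of the dataset: 3 noisy samples per source sample,
#    with a noise ratio between 0.0 (the default minimum) and 0.3
run precise-add-noise -if 3 -nh 0.3 "$dataset" "$noise" "$augmented"

# 2. Train the existing model on the generated dataset
run precise-train "$model" "$augmented"
```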

@tech-novic
Author

Hi Matthew,
Update from using precise-add-noise.
I collected noise samples from different mics as recommended and generated the output wav files. I then trained, and when I tested the model with precise-listen I got a lot of false activations. I then did incremental training with data\random (method 2). This resolved the false activations, and when I tested with precise-listen using different mics it gave good results.
I then converted the model, but when I used it with mycroft-core it did not give the same results. I got very few activations, and only with heavy stress on the wake word.
I played around with the threshold value, the multiplier, and the energy ratio. This did not make much difference. I believe the model is now trained well, but the configuration of mycroft-core needs more tuning to use it. Is there any recommendation I can try here?

@MatthewScholefield
Collaborator

The problem might be that there's been a change with the audio processing library Precise uses. Mycroft Core I think is still using the old one. Sorry I just realized this, but I think your problem should be fixed if you modify vectorizer= in ListenerParams within precise/params.py to be vectorizer=Vectorizer.speechpy_mfccs before training a new model. You can also try making that code change and deleting the my-copied-model.net.params and retraining it since the audio inputs will be similar, just slightly different.
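As a rough sketch of that change, the snippet below rewrites the vectorizer setting and removes the cached params file. It assumes the setting appears in precise/params.py literally as `vectorizer=Vectorizer.<name>`, which may not hold for every version, so check the file by hand afterwards. `set_vectorizer` is a helper name made up for this example, and `my-copied-model.net.params` is the placeholder name used above.

```shell
#!/bin/sh
# Point the vectorizer setting in a params file at speechpy_mfccs.
# Assumes a line containing `vectorizer=Vectorizer.<name>` (not verified
# against any particular Precise version).
set_vectorizer() {
    file=$1
    cp "$file" "$file.orig"   # keep a backup of the unmodified file
    sed 's/vectorizer=Vectorizer\.[A-Za-z_]*/vectorizer=Vectorizer.speechpy_mfccs/' \
        "$file.orig" > "$file"
}

if [ -f precise/params.py ]; then
    set_vectorizer precise/params.py
    # Show the result for manual review
    grep -n 'vectorizer=' precise/params.py || echo "no vectorizer= line found"
fi

# Remove the stale params saved next to the copied model, then retrain
rm -f my-copied-model.net.params
```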

@tech-novic
Author

tech-novic commented Jan 1, 2019

Thanks Matthew, I will try as suggested and share results here.

Edit:
Quick update: I made the change to the code and retrained the model. I tried it with mycroft-core on my laptop with a headset and it worked like a charm. Tomorrow I will try with the mic array and share an update.

@el-tocino

I retrained my models with this setting as well. They definitely feel more correctly responsive. The training took a few more steps to get where I liked it.

@tech-novic
Author

Hi Matthew,
I tested with the mic array today and it is working well. Thanks for all your support.

@MatthewScholefield
Collaborator

Awesome to hear! Closing, but let me know if you have any other issues.

@EuphoriaCelestial

precise-add-noise

@MatthewScholefield I can't find any mention of "precise-add-noise" in the training tutorial.
Has it been removed?

@MatthewScholefield
Collaborator

@EuphoriaCelestial Not removed, it's just I don't think it's ever been documented. However, we'd love contributions. Feel free to create a new page on the wiki about it.

@EuphoriaCelestial

@EuphoriaCelestial Not removed, it's just I don't think it's ever been documented. However, we'd love contributions. Feel free to create a new page on the wiki about it.

Yeah, I never knew about that command until I reached this issue. Can you provide more information on what it does and how to use it in the training process?

@MatthewScholefield
Collaborator

@EuphoriaCelestial Is there any part you'd like me to expand on? What I explained from before covers most of it:

This would be most useful to do in cases where you don't have a lot of wakewords and want to generate more variations of the data.

@EuphoriaCelestial

EuphoriaCelestial commented Mar 30, 2021

Sorry I just realized this, but I think your problem should be fixed if you modify vectorizer= in ListenerParams within precise/params.py to be vectorizer=Vectorizer.speechpy_mfccs before training a new model.

@MatthewScholefield should I do this?

@MatthewScholefield
Collaborator

MatthewScholefield commented Mar 30, 2021

@EuphoriaCelestial First, just to clarify, this only pertains to Mycroft Core. Now, in order to make it work with Mycroft Core, I think it's actually more reliable (but still a bit hacky) to do a source install of Precise on the same platform you run Mycroft Core on. Then you can just link the source install's engine script to where Mycroft Core expects it:

default_engine_path=~/.mycroft/precise/precise-engine/precise-engine

# Back up default precise-engine
mv "$default_engine_path" "$default_engine_path.bak"

# Link source install to Mycroft Core
cd mycroft-precise/  # Source install location
ln -s "$(pwd)/.venv/bin/precise-engine" "$default_engine_path"

If you do this you would definitely not need to modify the vectorizer.
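To undo the link later, something like the following should work. It is a sketch that mirrors the steps above: `restore_engine` is a made-up helper name, and it only acts when the path is a symlink and the `.bak` copy exists.

```shell
#!/bin/sh
# Undo the symlink and put the backed-up precise-engine back in place.
restore_engine() {
    engine_path=$1
    if [ -L "$engine_path" ] && [ -e "$engine_path.bak" ]; then
        rm "$engine_path"                      # removes only the link
        mv "$engine_path.bak" "$engine_path"   # restore the original engine
    else
        echo "nothing to restore at $engine_path" >&2
        return 1
    fi
}

default_engine_path="$HOME/.mycroft/precise/precise-engine/precise-engine"
restore_engine "$default_engine_path" || true
```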

@EuphoriaCelestial

@MatthewScholefield I encountered this error when running precise-add-noise:
WARNING: Found 676 wavs but no tags file specified! Data: <TrainData wake_words=0 not_wake_words=0 test_wake_words=0 test_not_wake_words=0> Done!

I don't know what happened. Just yesterday it was still working fine, and I generated hundreds of files using this command.
Now, with the same command, same PC, and same environment, it doesn't work anymore, even with old files that I had successfully added noise to before. It's really confusing.
I tried adding tags using VLC, but it doesn't work.
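One thing worth checking, offered as a guess rather than a diagnosis: the summary shows zero samples in every category, which is what one would expect if the wav files sit directly in the folder passed to the command instead of in the subfolders Precise reads. The sketch below counts wav files in each location, assuming the usual dataset layout of `wake-word/`, `not-wake-word/`, `test/wake-word/`, and `test/not-wake-word/`. The function names and `my-dataset` are invented for this example, and the count is not recursive.

```shell
#!/bin/sh
# Count *.wav files directly inside a folder (non-recursive).
count_wavs() {
    n=0
    for f in "$1"/*.wav; do
        if [ -e "$f" ]; then n=$((n + 1)); fi
    done
    echo "$n"
}

# Print how many wavs sit in each place of a dataset folder.
check_dataset() {
    root=$1
    echo "top level: $(count_wavs "$root")"
    for sub in wake-word not-wake-word test/wake-word test/not-wake-word; do
        echo "$sub: $(count_wavs "$root/$sub")"
    done
}

# Example call with a hypothetical dataset folder
if [ -d my-dataset ]; then
    check_dataset my-dataset
fi
```

If nearly all files show up under "top level", moving them into the matching subfolders before running precise-add-noise would be the first thing to try.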


4 participants