Train own model with Ubuntu 18? #7

TDHTTTT · 2021-02-17T06:07:42Z

I tried to bypass the version check in snowboy_pmdl.py by changing the condition to:

elif platform.linux_distribution()[0] == "Ubuntu" and platform.linux_distribution()[1] in ("16.04", "18.04"):

When I run generate_pmdl.py, I get this output (no error):

template cut
personal enroll
channels: 1, sample rate: 16000, bits: 16
processing xxx.wav
processing xxx.wav
processing xxx.wav
saving file to hotword.pmdl
finished

But when I try to use hotword.pmdl with the demo, it doesn't work:

terminate called after throwing an instance of 'std::runtime_error'
  what():  ERROR (ReadToken():snowboy-io.cc:131) Fail to read token in ReadToken(), position -1

[stack trace: ]
/home/tdhttt/workspace/snowboy/examples/Python/_snowboydetect.so(_ZN7snowboy13GetStackTraceEv+0x35) [0x7f627ff5d1d5]
/home/tdhttt/workspace/snowboy/examples/Python/_snowboydetect.so(_ZN7snowboy13SnowboyLogMsgD1Ev+0x47a) [0x7f627ff5d7ba]
/home/tdhttt/workspace/snowboy/examples/Python/_snowboydetect.so(_ZN7snowboy9ReadTokenEbPSsPSi+0x270) [0x7f627ff69980]
/home/tdhttt/workspace/snowboy/examples/Python/_snowboydetect.so(_ZN7snowboy14PipelineDetect14ClassifyModelsERKSsPSsS3_+0x1f4) [0x7f627ff4b7d4]
/home/tdhttt/workspace/snowboy/examples/Python/_snowboydetect.so(_ZN7snowboy14PipelineDetect8SetModelERKSs+0x15a) [0x7f627ff4bd3a]
.
.
.
python(PyRun_FileExFlags+0x82) [0x56536f768222]
python(PyRun_SimpleFileExFlags+0x18d) [0x56536f767c4d]
python(Py_Main+0x616) [0x56536f716a86]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7) [0x7f6281fc3b97]
python(_start+0x2a) [0x56536f71637a]

Aborted (core dumped)

This issue makes me think that the pmdl is corrupted, i.e. it was not generated properly in the first place. How can I make this work on Ubuntu 18? What is the difference between Ubuntu 16 and 18? Maybe I could try Docker?
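
For reference, the version check quoted at the top of this post relies on platform.linux_distribution(), which was deprecated in Python 3.5 and removed in Python 3.8. The sketch below shows one way to express the same kind of check by parsing /etc/os-release instead. It is not the code from snowboy_pmdl.py. The function names are hypothetical, and the default of accepting only "16.04" is an assumption based on this thread, where a model generated on 18.04 failed to load.

```python
def parse_os_release(text):
    """Parse the KEY=value lines of an os-release file into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in single or double quotes.
        info[key] = value.strip().strip('"').strip("'")
    return info


def is_supported(info, versions=("16.04",)):
    """Return True for Ubuntu releases listed in `versions`.

    The default tuple is an assumption taken from this thread, not from
    the Snowboy sources.
    """
    return info.get("ID") == "ubuntu" and info.get("VERSION_ID") in versions


def read_os_release(path="/etc/os-release"):
    """Read and parse the os-release file of the running system."""
    with open(path) as handle:
        return parse_os_release(handle.read())
```

Calling is_supported(read_os_release()) on the host shows whether the script would be running on a release from the accepted list. Widening the list only skips the check; as the crash above suggests, it does not make the generated model valid.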

dschnabel · 2021-02-19T08:22:15Z

I had the same problem so I made a Dockerfile:

FROM ubuntu:16.04

RUN apt update && apt --yes --force-yes install wget unzip build-essential python python-dev virtualenv portaudio19-dev
RUN wget https://github.com/seasalt-ai/snowboy/archive/master.zip && unzip master.zip

RUN cd snowboy-master/ && \
    virtualenv -p python2 venv/snowboy && \
    . venv/snowboy/bin/activate && \
    cd examples/Python && \
    pip install -r requirements.txt

RUN apt -y remove wget unzip build-essential portaudio19-dev && apt -y autoremove && apt clean && rm -rf /var/lib/apt/lists/*

CMD cd snowboy-master/ && \
    . venv/snowboy/bin/activate && \
    cd examples/Python && \
    python generate_pmdl.py -r1=model/record1.wav -r2=model/record2.wav -r3=model/record3.wav -lang=en -n=model/hotword.pmdl

Save the above in a file called Dockerfile and from the same directory build a docker image like this:

docker build -t snowboy-pmdl .

This will create an image which you can run to train your personal model. For this to work, you'll need to create a directory called model on your host machine (Ubuntu 18 or whatever) and place your three audio files in it. The directory should look like this (note: the wav files must have exactly the names shown below or it won't work):

$ ls model/
record1.wav  record2.wav  record3.wav

Finally, you can call docker (note: you need to be in the parent directory of model):

docker run -it -v $(pwd)/model:/snowboy-master/examples/Python/model snowboy-pmdl

This command mounts the model directory in the docker container and runs a script which calls generate_pmdl.py.

If everything went well, you should now have a file called hotword.pmdl in your model directory.
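
Before running the container, it can help to check the model directory on the host. The following is a minimal sketch, not part of the Dockerfile or the Snowboy repo. The function names are hypothetical, and the expected format (1 channel, 16000 Hz, 16 bits) is an assumption taken from the generate_pmdl.py output shown in the first post.

```python
import wave
from pathlib import Path

# File names required by the CMD line of the Dockerfile above.
REQUIRED = ("record1.wav", "record2.wav", "record3.wav")

# Assumed format, taken from the log line
# "channels: 1, sample rate: 16000, bits: 16" earlier in this thread.
EXPECTED_FORMAT = (1, 16000, 16)


def check_recordings(model_dir):
    """Return a list of problems found in model_dir (empty if none)."""
    problems = []
    for name in REQUIRED:
        path = Path(model_dir) / name
        if not path.is_file():
            problems.append(f"{name}: missing")
            continue
        try:
            with wave.open(str(path), "rb") as wav:
                fmt = (wav.getnchannels(), wav.getframerate(),
                       wav.getsampwidth() * 8)
        except (wave.Error, EOFError) as exc:
            problems.append(f"{name}: not a readable WAV file ({exc})")
            continue
        if fmt != EXPECTED_FORMAT:
            problems.append(
                f"{name}: expected (channels, rate, bits) "
                f"{EXPECTED_FORMAT}, got {fmt}")
    return problems


def docker_run_command(model_dir, image="snowboy-pmdl"):
    """Build the docker run command from the post as an argument list."""
    host_dir = Path(model_dir).resolve()
    mount = f"{host_dir}:/snowboy-master/examples/Python/model"
    return ["docker", "run", "-it", "-v", mount, image]
```

If check_recordings("model") returns an empty list, the list from docker_run_command("model") can be passed to subprocess.run to start the training.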

Hemanshu-Bhargav · 2021-02-19T10:00:50Z

Hi Daniel Schnabel, I'm not using the Ubuntu version, but it seems the API is not functional at the moment. I see that your Dockerfile creates a personal model, but is this personal model successfully trained on new audio samples? Just to confirm: does your Dockerfile work for personal models trained after the API was shut down on December 31st, 2020?

dschnabel · 2021-02-19T18:34:34Z

My dockerfile does not use the API, it uses the python script to train a personal model. See this portion from the Dockerfile which does the job:

python generate_pmdl.py -r1=model/record1.wav -r2=model/record2.wav -r3=model/record3.wav -lang=en -n=model/hotword.pmdl

Hemanshu-Bhargav · 2021-02-20T07:10:40Z

> My dockerfile does not use the API, it uses the python script to train a personal model.

Thanks for your reply. Do you know if this script performs the same function as the API (without using training_service)? @chenguoguo Can you maybe clarify? Thanks!

dschnabel · 2021-02-21T03:21:35Z

@Hemanshu-Bhargav I asked a similar question in #5

Hemanshu-Bhargav · 2021-02-21T04:29:31Z

@dschnabel
Ah thanks, I didn't know any of the original model training was still functional for new "hotwords". Pre-existing models trained before the shutdown, however, remain functional.

@chenguoguo @hs79hs
If this new script is intended to replace the API, then, for the sake of example, in the Python demo, training_service.py has been removed in favour of the new script, correct? However, as referenced in #3, what would the execution flow look like for Android/iOS?

chenguoguo · 2021-04-26T17:33:17Z

@Hemanshu-Bhargav Yes, the new script is meant to replace the old training_service.py script. For Android/iOS, you can set up your own service and then call that service from the Android/iOS app.

@dschnabel would you like to add your dockerfile to the repo? By the way it looks great!
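
As a rough illustration of the "own service" idea for Android/iOS, the core of such a service could be a function that receives three uploaded recordings and runs generate_pmdl.py on them. This is only a sketch under assumptions: train_personal_model and its parameters are hypothetical, the command-line flags are the ones used in the Dockerfile earlier in this thread, and the script is assumed to run in an environment where it works (for example the Ubuntu 16.04 container). The HTTP layer that a mobile app would call is left out.

```python
import subprocess
import tempfile
from pathlib import Path


def train_personal_model(recordings, script="generate_pmdl.py", lang="en",
                         workdir=None, run=subprocess.run):
    """Train a personal model from three WAV payloads and return its bytes.

    recordings: list of three bytes objects, each one WAV file.
    script:     path to generate_pmdl.py (assumed to be runnable here).
    workdir:    directory to run the script in, e.g. examples/Python.
    run:        injectable runner, defaults to subprocess.run.
    """
    if len(recordings) != 3:
        raise ValueError("expected exactly three recordings")
    with tempfile.TemporaryDirectory() as tmp:
        tmp_dir = Path(tmp)
        args = ["python", script]
        for index, data in enumerate(recordings, start=1):
            wav_path = tmp_dir / f"record{index}.wav"
            wav_path.write_bytes(data)
            # Same -r1=/-r2=/-r3= flags as in the Dockerfile's CMD.
            args.append(f"-r{index}={wav_path}")
        model_path = tmp_dir / "hotword.pmdl"
        args += [f"-lang={lang}", f"-n={model_path}"]
        # check=True raises CalledProcessError if the script fails.
        run(args, check=True, cwd=workdir)
        return model_path.read_bytes()
```

A web framework handler would pass the three uploaded files to this function and send the returned bytes back to the app as the hotword.pmdl download.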

Hemanshu-Bhargav · 2021-04-26T18:53:45Z

@chenguoguo Thanks for the confirmation. I had another question related to Ubuntu if you don't mind. I can open another issue if that keeps things organized.

Although I believe that support for the Raspberry Pi was later added to the original repository by other collaborators, I've been experiencing an issue with SciPy on the Raspberry Pi and I wanted to ask if you've perhaps had a similar experience.

I've tried different versions of SciPy, Python virtual environments, and @dschnabel's Dockerfile on both Ubuntu and Raspbian, but they all fail, reporting either that SciPy is not available or that Ubuntu 16.04 is required. The Dockerfile works without issue on Ubuntu running on any other architecture.

Any thoughts?

dschnabel · 2021-04-26T19:40:23Z

@chenguoguo I created a PR: #14

rrsaikat · 2022-10-15T01:47:45Z

Hi @dschnabel @chenguoguo
The Dockerfile does exactly what I wanted, but I'm facing a major issue with the size of the generated pmdl file. I used 3 recordings of different sizes (245 KB, 350 KB and 300 KB), but the output is only 35 KB. I also tried other recordings to check whether the result would be the same, and the script always produces a file between 30 and 35 KB.
Because of that, detection is not very good. Can you suggest what I can do?

Thank you

codakkk · 2022-10-24T11:24:55Z

I'm having the same issue and cannot find a way to make it work.
Is there any way to fix it?

hanzala123 mentioned this issue Apr 26, 2021

add support for more than three audio files. #12

Closed

dschnabel mentioned this issue Apr 26, 2021

Build your own personal models (using Docker) #14

Merged
