homeassistant-core wyoming enabled voice assistant add-ons #481

Merged


ms1design
Contributor

Hi @dusty-nv,

I'm pushing this in the hope that you could build the PyTorch wheels needed by the wyoming-piper container (mentioned here) on your build farm and upload them to your pip repo.

It's also a good starting point for testing all the integrations working together over the Wyoming protocol. I don't expect it to work out of the box; I can test it on my devices over the weekend.

If anyone wants to try this, here's the docker-compose file I'm using for testing:

name: home-assistant
version: "3.9"
services:
  home-assistant:
    image: ms1design/homeassistant-core:latest-r36.2.0-cu124
    restart: unless-stopped
    runtime: nvidia
    privileged: true
    network_mode: host
    container_name: home-assistant
    hostname: home-assistant
    ports:
      - "8123:8123"
    devices:
      - /dev/snd:/dev/snd
      - /dev/bus/usb
    volumes:
      - config:/config
      - /etc/localtime:/etc/localtime:ro
      - /etc/timezone:/etc/timezone:ro
    environment:
      TZ: Europe/Amsterdam
    stdin_open: true
    tty: true
    healthcheck:
      test: curl -s -o /dev/null -w "%{http_code}" http://localhost:8123 || exit 1
      interval: 1m
      timeout: 30s
      retries: 3

  openwakeword:
    image: ms1design/wyoming-openwakeword:latest-r36.2.0-cu124
    restart: unless-stopped
    runtime: nvidia
    network_mode: host
    container_name: openwakeword
    hostname: openwakeword
    ports:
      - "10400:10400/tcp"
    devices:
      - /dev/snd:/dev/snd
      - /dev/bus/usb
    volumes:
      - openwakeword_models:/share/openwakeword
      - /etc/localtime:/etc/localtime:ro
      - /etc/timezone:/etc/timezone:ro
    environment:
      TZ: Europe/Amsterdam
    stdin_open: true
    tty: true
    healthcheck:
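      # CMD-SHELL runs the check through a shell, so the pipes and || are interpreted (exec-form CMD would pass them to echo as literal arguments)
      test: ["CMD-SHELL", "echo '{ \"type\": \"describe\" }' | nc -w 1 localhost 10400 | grep -iq openWakeWord || exit 1"]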
      interval: 1m
      timeout: 30s
      retries: 3

  faster-whisper:
    image: ms1design/wyoming-whisper:r36.2.0-cu124
    restart: unless-stopped
    runtime: nvidia
    network_mode: host
    container_name: faster-whisper
    hostname: faster-whisper
    ports:
      - "10300:10300/tcp"
    devices:
      - /dev/snd:/dev/snd
      - /dev/bus/usb
    volumes:
      - whisper_models:/share/whisper
      - whisper_data:/data
      - /etc/localtime:/etc/localtime:ro
      - /etc/timezone:/etc/timezone:ro
    environment:
      TZ: Europe/Amsterdam
    stdin_open: true
    tty: true

  assist-microphone:
    image: ms1design/wyoming-assist-microphone:r36.2.0-cu124
    restart: unless-stopped
    network_mode: host
    container_name: assist-microphone
    hostname: assist-microphone
    depends_on:
      - openwakeword
    ports:
      - "10700:10700/tcp"
    devices:
      - /dev/snd:/dev/snd
      - /dev/bus/usb
    volumes:
      - assist_microphone_share:/share
      - /etc/localtime:/etc/localtime:ro
      - /etc/timezone:/etc/timezone:ro
    environment:
      TZ: Europe/Amsterdam
    stdin_open: true
    tty: true

volumes:
  config:
    name: ha-config
  openwakeword_models:
    name: ha-openwakeword-models
  whisper_models:
    name: ha-whisper-models
  whisper_data:
    name: ha-whisper-data
  assist_microphone_share:
    name: ha-assist-microphone-share
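
The openwakeword healthcheck in the compose file above probes the service by sending a Wyoming `describe` event over TCP. A minimal Python sketch of the same probe is given here, assuming the service replies with a newline-terminated JSON header. The function name `wyoming_describe` and the 1-second timeout are illustrative choices for this sketch, not part of any Wyoming library.

```python
import json
import socket


def wyoming_describe(host: str = "localhost", port: int = 10400, timeout: float = 1.0) -> dict:
    """Send a `describe` event and return the first JSON header line of the reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # Events are sent as one JSON object per line.
        sock.sendall(b'{"type": "describe"}\n')
        with sock.makefile("rb") as reader:
            line = reader.readline()
    if not line:
        raise ConnectionError("server closed the connection without replying")
    return json.loads(line)
```

Calling `wyoming_describe("localhost", 10400)` against a running openwakeword container should return the reply header as a dictionary. The same call with port 10300 would probe the whisper service from the compose file.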

@dusty-nv
Owner

Cool @ms1design! Okay, I see that in your docker-compose you are building everything against CUDA 12.4, which is why it is going after PyTorch 2.3rc. That isn't really stable yet (I added it mostly for preliminary TRT-LLM work). For now, until I can merge your PR, I will build and upload the PyTorch 2.2 wheel for Python 3.11 and CUDA 12.2. Then when PyTorch 2.3 is actually released (it is currently up to RC12), I will build that.

@ms1design
Contributor Author

ms1design commented Apr 18, 2024

Ahhh that makes sense @dusty-nv, dunno why I missed that ;) Let me just rebuild those against cuda:12.2 👍

I need to spend some time updating my CI/CD env to support the latest improvements here :)

@ms1design ms1design marked this pull request as draft April 18, 2024 15:37
@dusty-nv
Owner

I think it will still make you build pytorch 2.2 for python 3.11 because I don't have that up yet...kicking it off now

@dusty-nv
Owner

OK, pytorch 2.2 wheel for Python 3.11 and CUDA 12.2 is up: http://jetson.webredirect.org/jp6/cu122/torch/2.2.0

Will look at merging this PR shortly!

@johnnynunez

OK, pytorch 2.2 wheel for Python 3.11 and CUDA 12.2 is up: http://jetson.webredirect.org/jp6/cu122/torch/2.2.0

Will look at merging this PR shortly!

All packages should be on Python 3.11. It is the standard on desktop now, so it would be good for people coming to Jetson to find Python 3.11 packages.

@dusty-nv
Owner

@johnnynunez I am not yet changing the default Python version for all containers away from the default on that version of Ubuntu (so Python 3.10 on Ubuntu 22.04), but containers that need a different version can specify which Python they need.

@johnnynunez

I am not yet changing the default Python version for all containers away from the default on that version of Ubuntu (so Python 3.10 on Ubuntu 22.04), but containers that need a different version can specify which Python they need.

Yes yes, your solution is great

@ms1design
Contributor Author

Nice @dusty-nv, will try that tomorrow 🙌

@ms1design
Contributor Author

ms1design commented Apr 19, 2024

Fixed the wyoming-piper wrapper container for piper-tts; unfortunately it's still using the CPU 🕯️ We probably just need to patch something for now...

The good news is that I managed to get all the required containers working with Home Assistant, so we can configure a full Voice Assistant pipeline in HA running on Jetson:

Screenshot 2024-04-19 at 18 11 54

We can choose the languages and models:
Screenshot 2024-04-19 at 18 12 35

The wyoming-assist-microphone container lets you connect your mic/speaker (e.g. the much-loved Anker S330 over USB), and it also supports native controls, as shown below. In addition, it does not use VAD; it relies on the wyoming-openwakeword container to detect the wake word.

Screenshot 2024-04-19 at 18 17 25

There are still quite a lot of TODOs in this PR, but it's ready for initial testing if anyone is interested :) Questions welcome.

@dusty-nv
Owner

Oh wow, that's amazing you got all the add-ons built, loading, and running! Huge step, thanks @ms1design !!

Maybe wyoming-piper needs to specify use_cuda=True when it loads PiperVoice:

self.model = PiperVoice.load(model_path, config_path=config_path, use_cuda=True)
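
Rather than hardcoding the flag, the CUDA switch could be driven by an environment variable so the same image can run on CPU or GPU. The sketch below assumes a loader with the same call shape as the `PiperVoice.load(...)` line above. The variable name `PIPER_USE_CUDA` and the helpers `env_flag` and `load_voice` are hypothetical and do not exist in wyoming-piper.

```python
import os
from typing import Any, Callable


def env_flag(name: str, default: bool = False) -> bool:
    """Interpret an environment variable as a boolean switch."""
    value = os.environ.get(name)
    if value is None:
        return default
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_voice(loader: Callable[..., Any], model_path: str, config_path: str) -> Any:
    """Call a PiperVoice.load-style loader, enabling CUDA when PIPER_USE_CUDA is set.

    PIPER_USE_CUDA is a hypothetical name chosen for this sketch.
    """
    use_cuda = env_flag("PIPER_USE_CUDA")
    return loader(model_path, config_path=config_path, use_cuda=use_cuda)
```

With piper-tts installed, `load_voice(PiperVoice.load, model_path, config_path)` would forward the flag to the real loader. Passing the loader as an argument keeps the sketch importable on machines without piper.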

In parallel with your efforts, I have been integrating PiperTTS into the NanoLLM agents today. It's sounding good!

@ms1design
Contributor Author

Correct @dusty-nv, if you follow my mention on rhasspy/wyoming-piper#5 you will find the required changes there as a diff (at the bottom of every PR page on GitHub). I'm not sure that would be enough, but I took a break to play around with other things ;)

@ms1design
Contributor Author

ms1design commented Apr 20, 2024

Update:

  • wyoming-piper now reuses the espeak-ng data from the piper-tts container
  • the wyoming-piper and piper-tts containers share the same download directory for models
  • the wyoming-piper container now uses CUDA:
wyoming-piper-gpu.mov

_piper-tts_logs.txt

@ms1design
Contributor Author

ms1design commented Apr 22, 2024

Update:

  • added config.py to most of the containers
  • add-ons have their own root filesystem for better customisation
  • wyoming-assist-microphone is currently HARDCODED to support Anker S330 USB Mic/Speaker (for testing purposes only)

Known Issues

  • wyoming-openwakeword detects the wake word through the wyoming-assist-microphone container, but the pipeline never starts listening for the voice command after activation. Log from wyoming-assist-microphone: WARNING:root:Event(type='error', data={'text': 'speech-to-text failed', 'code': 'stt-stream-failed'}, payload=None)

@ms1design
Contributor Author

Update

  • Fully working Home Assistant Voice Pipeline is finally here 🌟

Known Issues

  • 🤷

TODO

  • Check & sort out the docker volumes
  • wyoming-openwakeword is still on tflite/cpu (need to build on onnxruntime-gpu)
  • Documentation

docker-compose.yaml

name: home-assistant-jetson
version: "3.9"
services:
  home-assistant:
    image: ms1design/homeassistant-core:latest-r36.2.0-cu122
    restart: unless-stopped
    init: false
    privileged: true
    network_mode: host
    container_name: home-assistant
    hostname: home-assistant
    ports:
      - "8123:8123"
    devices:
      - /dev/snd:/dev/snd
      - /dev/bus/usb
    volumes:
      - config:/config
      - /etc/localtime:/etc/localtime:ro
      - /etc/timezone:/etc/timezone:ro
    stdin_open: true
    tty: true

  openwakeword:
    image: ms1design/wyoming-openwakeword:latest-r36.2.0-cu122
    restart: unless-stopped
    runtime: nvidia
    network_mode: host
    container_name: openwakeword
    hostname: openwakeword
    init: false
    depends_on:
      - faster-whisper
    ports:
      - "10400:10400/tcp"
    volumes:
      - openwakeword_models:/share/openwakeword
      - /etc/localtime:/etc/localtime:ro
      - /etc/timezone:/etc/timezone:ro
    stdin_open: true
    tty: true

  faster-whisper:
    image: ms1design/wyoming-whisper:latest-r36.2.0-cu122
    restart: unless-stopped
    runtime: nvidia
    network_mode: host
    container_name: faster-whisper
    hostname: faster-whisper
    init: false
    ports:
      - "10300:10300/tcp"
    volumes:
      - whisper_models:/share/whisper
      - whisper_data:/data
      - /etc/localtime:/etc/localtime:ro
      - /etc/timezone:/etc/timezone:ro
    stdin_open: true
    tty: true

  assist-microphone:
    image: ms1design/wyoming-assist-microphone:latest-r36.2.0-cu122
    restart: unless-stopped
    network_mode: host
    container_name: assist-microphone
    hostname: assist-microphone
    init: false
    depends_on:
      - openwakeword
    ports:
      - "10700:10700/tcp"
    devices:
      - /dev/snd:/dev/snd
      - /dev/bus/usb
    volumes:
      - assist_microphone_share:/share
      - /etc/localtime:/etc/localtime:ro
      - /etc/timezone:/etc/timezone:ro
    environment:
      AUDIO_DEVICE: "plughw:CARD=S330,DEV=0"
    stdin_open: true
    tty: true

  piper-tts:
    image: ms1design/wyoming-piper:master-r36.2.0-cu122
    restart: unless-stopped
    network_mode: host
    runtime: nvidia
    container_name: piper-tts
    hostname: piper-tts
    init: false
    ports:
      - "10200:10200/tcp"
    devices:
      - /dev/snd:/dev/snd
      - /dev/bus/usb
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /etc/timezone:/etc/timezone:ro
    stdin_open: true
    tty: true

volumes:
  config:
    name: ha-config
  openwakeword_models:
    name: ha-openwakeword-models
  whisper_models:
    name: ha-whisper-models
  whisper_data:
    name: ha-whisper-data
  assist_microphone_share:
    name: ha-assist-microphone-share

@ms1design ms1design marked this pull request as ready for review April 23, 2024 18:09
@dusty-nv dusty-nv merged commit 01c2223 into dusty-nv:dev Apr 23, 2024
@dusty-nv
Owner

🥳 🙌 🎉 thanks @ms1design!, trying to build these now - will push to dockerhub if successful.

Are there additional steps that need to be documented for others who want to test this? Or do you basically just need to change "plughw:CARD=S330,DEV=0" to your desired audio device in your docker-compose.yml?

@ms1design
Contributor Author

ms1design commented Apr 23, 2024

Hi @dusty-nv,

Basically, one should look at all the ENV variables declared in each of the wyoming container Dockerfiles. Some things, like sound or mic volume, are only configurable through those variables. That's because we skipped the HA Supervisor, which exposes a UI for setting not only add-on options but also the default audio device. Instead, we need to use env variables for now ;)

Edit:
And yes, you need to pass your AUDIO_DEVICE as shown in the docker-compose.yaml example above.
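
To find a value for AUDIO_DEVICE on another machine, the ALSA PCM names can be listed on the host with `arecord -L` from alsa-utils. The sketch below filters that listing for `plughw:` entries, assuming the usual layout where device names start in the first column and their descriptions are indented. The function names are hypothetical helpers for this sketch.

```python
import subprocess


def parse_pcm_names(listing: str, prefix: str = "plughw:") -> list[str]:
    """Extract PCM names from `arecord -L` / `aplay -L` style output.

    Device names start in column 0; description lines are indented.
    """
    names = []
    for line in listing.splitlines():
        if line and not line[0].isspace() and line.startswith(prefix):
            names.append(line.strip())
    return names


def list_capture_devices() -> list[str]:
    """Run `arecord -L` and return the plughw PCM names it reports."""
    result = subprocess.run(
        ["arecord", "-L"], capture_output=True, text=True, check=True
    )
    return parse_pcm_names(result.stdout)
```

Any name returned by `list_capture_devices()` is a candidate for the `AUDIO_DEVICE` variable of the assist-microphone service.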

@dusty-nv
Owner

OK gotcha - eventually we will have an entire section on Jetson AI Lab with easy-to-follow HomeAssistant tutorials for setting up the AI services, but for now it would be nice to have lower-level notes for people on the forums etc. who want to try this. Chugging through the builds now!

@ms1design
Contributor Author

That’s understandable, I’m going to work on that in the coming days 🙌

@dusty-nv
Owner

I'm hitting an issue where these containers use Python 3.11 and also use tensorrt (at least the piper one does), but tensorrt only installs Python bindings for the default version of Python (i.e. 3.10). I believe I can get around this by having the tensorrt container build its bindings from source, but I'm curious how you got around this in your builds?

@ms1design
Contributor Author

I'm curious how you got around this in your builds?

I think it fails when you run the tests, right? It's not the build that fails.

@dusty-nv
Owner

Yea, the tests...I just disabled that test for now, because I don't believe tensorrt python module is actually used (onnxruntime links against the TensorRT C++ libs). Will revisit that later...
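
A post-build test could report a missing binding instead of failing outright when the module is not installed for the interpreter in use. The sketch below checks importability with the standard library only. `binding_available` and `describe_binding` are hypothetical helpers and are not part of the existing test scripts.

```python
import importlib.util
import sys


def binding_available(module_name: str) -> bool:
    """Return True if a top-level module can be found by the running interpreter."""
    return importlib.util.find_spec(module_name) is not None


def describe_binding(module_name: str = "tensorrt") -> str:
    """Summarise whether the module is installed for this Python version."""
    version = f"{sys.version_info.major}.{sys.version_info.minor}"
    state = "available" if binding_available(module_name) else "missing"
    return f"{module_name} is {state} for Python {version}"
```

Running `describe_binding()` under both `python3.10` and `python3.11` in the same container would show which interpreter has the tensorrt module installed.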

@ms1design
Contributor Author

I had the same feeling so tbh I skipped that and later forgot about it completely 🤓

@dusty-nv
Owner

@ms1design do you see potential side-effects in your docker-compose setup if I change ENTRYPOINT ["/init"] to CMD /init ? I am working through the wyoming-piper container and that entrypoint is causing the post-build tests to fail (which does illustrate an issue with how I invoke the tests...but yea. been one of those days 🥴)

@ms1design
Contributor Author

ms1design commented Apr 23, 2024

I haven't tried that. What I tried was setting the entrypoints from the docker-compose.yaml file, but I ran into some PID issues.

edit: @dusty-nv I had issues when using --init, that's why I explicitly set it to false in docker-compose
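
The difference between ENTRYPOINT ["/init"] and CMD /init matters for tests that append a command to `docker run <image>`. With an ENTRYPOINT the test command is handed to /init as arguments, while with only a CMD it replaces /init. The sketch below models Docker's documented rule for exec-form instructions. `effective_argv` is a hypothetical helper for illustration, not a Docker API.

```python
from typing import Optional, Sequence


def effective_argv(
    entrypoint: Sequence[str],
    cmd: Sequence[str],
    run_args: Sequence[str] = (),
    entrypoint_override: Optional[Sequence[str]] = None,
) -> list[str]:
    """Model how exec-form ENTRYPOINT and CMD combine with `docker run` arguments."""
    # `docker run --entrypoint ...` replaces ENTRYPOINT and clears the image CMD.
    if entrypoint_override is not None:
        return list(entrypoint_override) + list(run_args)
    # Arguments after the image name replace CMD; ENTRYPOINT stays in front.
    args = list(run_args) if run_args else list(cmd)
    return list(entrypoint) + args
```

For example, `effective_argv(["/init"], [], ["python3", "test.py"])` models the ENTRYPOINT case and `effective_argv([], ["/init"], ["python3", "test.py"])` models the CMD case. Note that the shell form `CMD /init` is stored by Docker as `["/bin/sh", "-c", "/init"]`, which the sketch would take as its `cmd` argument.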

@dusty-nv
Owner

OK, here they are!

dustynv/wyoming-openwakeword:r36.2.0
dustynv/wyoming-assist-microphone:r36.2.0
dustynv/wyoming-whisper:r36.2.0
dustynv/wyoming-piper:r36.2.0
