Support for ESP32-S3-BOX peripherals + voice_assistant #2239

Open
9 of 18 tasks
rpatel3001 opened this issue May 16, 2023 · 108 comments

Comments

@rpatel3001

rpatel3001 commented May 16, 2023

Describe the problem you have/What new integration you would like

Main features: support for peripherals on the ESP32-S3-BOX dev kit:

To get voice_assistant working:

  • voice_assistant is not compatible with docker bridge networking, must use macvlan/ipvlan/host mode
  • get media_player to work with .raw streams (speaker component works fine)
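For the networking point, a minimal Docker Compose sketch that runs Home Assistant with host networking. The service name and the config path are placeholders, and this is only one way to meet the requirement (macvlan/ipvlan would also do).

```yaml
# docker-compose.yml (sketch; service name and volume path are placeholders)
services:
  homeassistant:
    image: ghcr.io/home-assistant/home-assistant:stable
    # Host networking exposes every port Home Assistant opens, including the
    # one used for voice_assistant audio. With bridge networking only
    # explicitly published ports are reachable from the ESP device.
    network_mode: host
    volumes:
      - ./config:/config
    restart: unless-stopped
```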

Architectural changes to support wakeword and esp-idf framework (probably out of scope here and will be transferred to a new issue or 3 once the S3-BOX works for on-demand voice commands):

  • Debug bootloops when using esp-idf framework
    • No longer happens
  • Add esp-adf, skainet, etc
    • esp-adf added, wake word done remotely
  • Update i2s_audio_media_player
    • Refactor current library to only handle audio streams; move i2s setup into esphome proper
    • Possibly switch to an audio library that doesn't require arduino framework?

Please describe your use case for this integration and alternatives you've tried:

Use the peripherals on the board. Working on-demand voice_assistant.

Additional context

This device has recently had a bit of attention due to posts about Willow on Hacker News and elsewhere. Willow is fantastic, but I'd like to be able to use the full extent of existing esphome components, and I bet others would too. Adding hardware peripherals is the smallest part of this; wake word detection is the major feature missing to make esphome a viable alternative (out of scope for this feature request though).

Reference links:
https://github.com/espressif/esp-box
https://github.com/toverainc/willow
https://github.com/hugobloem/esp-ha-speech
espressif/esp-dev-kits#24 (comment)
https://components.espressif.com/components/espressif/es8311
https://components.espressif.com/components/espressif/es7210
https://github.com/espressif/esp-bsp/
https://github.com/espressif/esp-adf/

@kroimon

kroimon commented May 17, 2023

Thanks for this overview!

I have created a component for the touchscreen already in esphome/esphome#4793 which is working fine on my Box and is ready for review.

I am currently working on the I2C control component for the ES8311 (no PR yet, trying to figure out the best solution for MCLK).

@kroimon

kroimon commented May 17, 2023

Also, the ILI9342C driver requires some additions to allow enabling x-mirroring for the ESP32-S3-BOX. I have started implementing that, a PR will also follow. For now, you can check out my sample config linked in esphome/esphome#4793 (comment) which I will update periodically.

@rpatel3001
Author

Awesome, nice progress. There is a very rough implementation for the ES8388 here, not sure how helpful it is as the register map is quite different and MCLK is currently hard-coded.

Also, it's worth looking into how willow and the default firmware handle MCLK. I think the codec has a mode which can derive its LRCLK and BCLK from its MCLK/SCLK and distribute them to the ADC on the board. That may be required if MCLK has to be synchronous to LRCLK and BCLK? I'm not too familiar with I2S or the ESP32/esphome implementation of it.

Also, is it worth trying to get I2S support for the esp-idf framework as well, to make hacking in wake word stuff with esp-adf/esp-sr later easier? I haven't looked into this much but maybe not, since I think I saw an arduino framework wrapper for esp-adf somewhere.

@kroimon

kroimon commented May 17, 2023

Yeah the ES8311 can theoretically work without a dedicated MCLK by generating it internally from the SCLK, but as the ESP32-S3-BOX has an MCLK wired to GPIO2 anyway, we should figure out the best way to implement that in esphome, I guess.
Maybe @jesserockz already has plans for that?

My ES8311 branch is at https://github.com/kroimon/esphome/tree/es8311 if you're interested, but it's still a WIP and a few days away from a proper PR.

Also, getting the whole I2S stuff working on esp-idf would be great, because we could probably integrate libraries such as WakeNet much easier. However, I could not even get a very simple esphome config to run on my S3, because it kept resetting due to some watchdog. I did not debug this any further because using esp-idf wasn't of much use without the I2S components anyway.

@rpatel3001
Author

Adding MCLK to i2s_audio seems to me like the most straightforward path for that, is there any case where two devices might share an LRCLK and BCLK but have different MCLKs? I simply added MCLK as an optional param for i2s_audio: esphome/esphome@dev...rpatel3001:esphome:add_i2s_mclk
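
A sketch of what such an optional parameter could look like in YAML. The key name `i2s_mclk_pin` is an assumption (the linked branch may name it differently). GPIO2 is the MCLK pin mentioned above for the ESP32-S3-BOX, and the LRCLK and BCLK pins are placeholders to be checked against the board schematic.

```yaml
i2s_audio:
  i2s_lrclk_pin: GPIO47   # placeholder, verify against the schematic
  i2s_bclk_pin: GPIO17    # placeholder, verify against the schematic
  # Hypothetical optional key: leave it out for codecs that do not need MCLK
  i2s_mclk_pin: GPIO2
```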

@rpatel3001
Author

Also I forked your box.yaml gist to add the RGB LED that comes with the kit, invert the sense of the settings button, and add my MCLK change and your ES8311 components

@rpatel3001
Author

I've successfully gotten Home Assistant to stream TTS and radio audio to the ESP-BOX using the config in my gist. The volume is quite low, though I expect your work on the codec interface will help with that.

@kroimon

kroimon commented May 17, 2023

I probably won't have time to look into it before Friday afternoon, but that already sounds awesome!

@rpatel3001
Author

rpatel3001 commented May 19, 2023

started some ADC code at https://github.com/rpatel3001/esphome/tree/es7210

This I2S stuff makes very little sense to me right now: the frequencies I measure on the pins are not at all what the i2s components appear to configure. It's difficult to debug the ADC without access to the raw audio; sending it to the Home Assistant pipeline with whisper actually causes an error in whisper, so it's clearly doing something wrong.

Also, the ADC datasheet is terrible. I could only find a register map on some sketchy Chinese site by googling, and it's version 2.0, compared to the most recent version 23 (without registers).

@rpatel3001
Author

Dumping some thoughts here stream-of-consciousness style: I think ideally i2s_audio would have options for MCLK frequency and sample rate, and those would be pulled into i2s_audio_media_player and i2s_audio_microphone to set up the i2s peripheral in the same way the pin numbers are currently pulled in. The DAC and ADC I2C components would need options as well, to set up the chips correctly based on the clock settings.

it's unclear to me why i2s_audio_microphone and i2s_audio_speaker are calling esp-idf i2s functions but i2s_audio_media_player is not. The media player library is handling it internally? how do these two components work together?

@ssieb
Member

ssieb commented May 19, 2023

The media player library is a little difficult that way. It's kind of a black box right now. We tell it what to play and it just does it.
And yes, if devices need an MCLK signal, then that should be added to the i2s audio component as an optional parameter.
Someone asked about that a while back, but the easier solution was to change the device setting to not require it. But he was doing the wiring, so that was easy to do.

@jesserockz
Member

it's unclear to me why i2s_audio_microphone and i2s_audio_speaker are calling esp-idf i2s functions but i2s_audio_media_player is not

This is because the Audio library handles the streaming, decoding and playing to i2s. It's not the best solution, but it was the easiest at the time given the timeframe I had. The weird thing is that the library actually supports calling a function to hand the i2s data to instead of sending it out, but it still needs to set up the i2s peripheral itself 🤦

@kroimon

kroimon commented May 19, 2023

I mean, we could probably make changes to the Audio library, as it is already a modified fork. The question is how close to upstream you want it to be.
I think the main benefit of using the library in the first place is its audio format decoders. The I2S stuff could be implemented in native esphome code to be able to better integrate different external codec chips.

@kroimon

kroimon commented May 20, 2023

I spent some more time learning the inner workings of I2S and how the different components use it right now. The following is a list of findings and 'challenges' I ran into:

The main issue we have is that there is currently no central instance that controls the parameters of the I2S bus.
The i2s_audio platform merely acts as a container for the pin configuration, but the actual calls to i2s_driver_install() and i2s_set_pin() are done in i2s_audio_microphone.cpp, i2s_audio_speaker.cpp and i2s_audio_media_player.cpp. In addition to that, the external ESP32-audioI2S component calls i2s_set_sample_rates depending on the media being played.

This makes it very hard to implement external ADCs and DACs whose configuration depends on the current clock speeds and sampling rates. Those audio codec components need a central instance to register with for configuration change events, so the new settings can be forwarded to the external controllers.

With the current architecture, there is also no way for full-duplex operation of the same I2S port. The Mutex in the i2s_audio component only allows exclusive access to an I2S port. However, the ESP32-S3-BOX and ESP32-S3-Korvo-2 boards share the same I2S port (MCLK, SCLK, LRCK pins) for both audio input and output.

In general, full-duplex operation can only work if both input and output use the same clock parameters. The microphone and speaker components currently use fixed 16000 Hz sampling rates at 16 bits per sample. The media player switches the sampling rates based on the currently played files/streams.
So I don't really see a way to use a media player together with a microphone and/or speaker component for a voice assistant right now, at least not at the same time. It might be possible to implement a priority-based switching logic that allows them to coexist.
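
To make the clock constraint concrete, here is the arithmetic for the fixed microphone/speaker format, assuming standard stereo I2S framing (two 16-bit slots per frame) and an MCLK of 256 times the sample rate. Both are assumptions for illustration; nothing above states the slot layout or the MCLK multiple.

```latex
\begin{align*}
f_{\mathrm{LRCLK}} &= f_s = 16\,000\ \mathrm{Hz} \\
f_{\mathrm{BCLK}}  &= f_s \times 16\ \text{bits} \times 2\ \text{slots} = 512\,000\ \mathrm{Hz} \\
f_{\mathrm{MCLK}}  &= 256 \times f_s = 4\,096\,000\ \mathrm{Hz}
\end{align*}
```

Under the same assumptions a 44 100 Hz stream needs a BCLK of 1 411 200 Hz, so a media player that retunes the port also changes the clocks a microphone on that port sees.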

ESP-IDF 5.0 introduced the concept of 'channels' in the new i2s driver which would make full-duplex operation a somewhat easier task. (For reference, the latest currently available version of arduino-esp32 2.0.9 is based on ESP-IDF 4.4.4).

In summary, I think we need a major refactoring of the i2s_audio platform and its microphone, speaker and media_player components:

  • Create an interface for DAC and ADC components that need to react to clock timing and sampling rate changes by reconfiguring attached DAC/ADC controllers.
  • Allow full-duplex operation over a single I2S port.
  • Investigate: Could media_player use speaker to output audio? (Would make sense from an architecture point of view)

@nagyrobi
Member

See how many ideas are out there for media player in ESPHome:
integration: media_player
There's no other topic so hot imho...

@kroimon

kroimon commented May 20, 2023

@rpatel3001 I found the full datasheet for the ES7210 here (Backup).
Unfortunately I was still unable to locate the corresponding user guide, but this should be enough information to get it working, together with the existing implementations in esp-bsp and esp-adf.
I feel like the esp-adf implementation is even more helpful as it shows all the bits and pieces required for mic selection.

I continued a bit on your work over in my branch, mostly formatting and cleanup for now.

@guillempages

I made the "mistake" of trying to save some bucks and bought the ESP32-S3-Box-Lite instead of the full one. That one does not have touchscreen, but three additional buttons, and it has (apparently) an ST7789v display instead of an ILI9342C one.

For some (to me yet unexplained) reason, I can show things on the display by using the ILI9342C configuration from @kroimon (https://gist.github.com/kroimon/f6692879f9c00702990801ae9dfa433b); it just doesn't need the mirroring, but the colors are somehow offset (e.g. Red is (255, 255, 0), Green is (255, 0, 255) and Blue is (0, 255, 255), while White and Black are the expected colors). I haven't managed to show anything useful using the standard st7789 component. Does anyone have an idea why this would be?

Is it worth tracking the S3-Box-Lite support here as well, or would it be better to create a separate feature request (since most of the components would be the same anyway)?

@mattkasa

Seems like the peripherals of the ESP32-S3-Korvo-1 are really similar to ESP32-S3-BOX as well.

One main difference is the ES7210 is on a different I2S bus from the ES8311.

I have an ESP32-S3-Korvo-1 running this config and LED ring and buttons are working, audio not working at all yet so I'm not sure I have the two I2S buses configured correctly or maybe two i2s_audio buses aren't supported yet.

Waiting on an ESP32-S3-BOX to be able to do more testing, but the Korvo is currently in stock on Amazon for 50USD if anyone else is curious about it.

@rpatel3001
Author

rpatel3001 commented May 23, 2023

@guillempages I can add the Lite's display to the top post, but can't promise anyone will work on it as I don't have a Lite to play with. You'll probably get more visibility/help by creating a bug report for the st7789 component.

@mattkasa I think two I2S buses ought to work, but not totally sure. Does the codec work by itself if you comment out the ADC config? The current tip of the ES8311 PR sets the volume to 0, try an earlier commit or my es8311 branch for now.
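
For reference, a sketch of how two buses might be declared, one per codec. All pin numbers are placeholders rather than the Korvo-1's real wiring, and the IDs other than `external_speaker` (used in the snippet in this thread) are made up for illustration. Whether both buses behave correctly at the same time is exactly the open question here.

```yaml
i2s_audio:
  - id: i2s_out              # bus wired to the ES8311 (speaker side)
    i2s_lrclk_pin: GPIO41    # placeholder
    i2s_bclk_pin: GPIO40     # placeholder
  - id: i2s_in               # bus wired to the ES7210 (microphone side)
    i2s_lrclk_pin: GPIO9     # placeholder
    i2s_bclk_pin: GPIO10     # placeholder

speaker:
  - platform: i2s_audio
    id: external_speaker
    i2s_audio_id: i2s_out    # bind the speaker to the first bus
    dac_type: external
    i2s_dout_pin: GPIO39     # placeholder

microphone:
  - platform: i2s_audio
    id: external_mic         # hypothetical ID
    i2s_audio_id: i2s_in     # bind the microphone to the second bus
    adc_type: external
    i2s_din_pin: GPIO11      # placeholder
```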

@mattkasa

@rpatel3001 I'm testing like this, but I have no idea how speaker.play: is supposed to look:

    on_press:
      - output.turn_on: pa_ctrl
      - speaker.play:
          id: external_speaker
          data: [64, 64, 0, 0, 128, 128, 0, 0, 64, 64, 0, 0, 128, 128, 0, 0, 64, 64, 0, 0, 128, 128, 0, 0, 64, 64, 0, 0, 128, 128, 0, 0, 64, 64, 0, 0, 128, 128, 0, 0]
      - output.turn_off: pa_ctrl

Not getting any audible sound, but logs look like:

[02:23:19][C][es8311:167]: ES8311 Audio Codec:
[02:23:19][C][es8311:168]:   Use MCLK: YES
[02:23:49][D][sensor:094]: 'button_adc': Sending state 1.63600 V with 2 decimals of accuracy
[02:23:49][D][binary_sensor:036]: 'Korvo 1 Play': Sending state ON
[02:23:49][D][esp-idf:000]: I (38072) I2S: DMA Malloc info, datalen=blocksize=4092, dma_buf_count=8
[02:23:49][D][esp-idf:000]: I (38074) I2S: I2S0, MCLK output by GPIO42
[02:23:50][D][esp-idf:000]: I (38239) I2S: DMA queue destroyed

So I wonder if it's just my speaker.play data 🤔

@rpatel3001
Author

rpatel3001 commented May 23, 2023

hm, I can't say about speaker.play, I've been using home assistant to send audio to the media_player component. Do you at least get clicks when the PA is muted/unmuted? Maybe try the media player component also, the I2S code is different.

@mattkasa

mattkasa commented May 23, 2023

Ah yeah, I'm using the esp-idf framework, so media_player isn't supported, my thinking has been to use esp-idf to make it easier to build a component that uses esp-sr for wakeword since it seems like that's probably where all of this is headed :)

edit: I tried building with arduino to test with media_player and it panics and boot loops, there is something the bootloader doesn't like, I'll keep looking at it to see if I can get it running with arduino.

@rpatel3001
Author

I did some testing with i2s_audio_speaker and it seems to be partially working (on Arduino). With a much longer data vector (8k samples = half a second, a full second crashed the board when played) I mostly just hear clicks but occasionally the tone plays for a fraction of the duration. Interestingly the tone is twice the frequency it should be, which is maybe a clue about what's wrong.

I also tried compiling a barebones config for esp-idf but it bootloops. Fixed the bootloop with

esphome:
  platformio_options:
    board_build.flash_mode: dio

but then it just hangs after booting. Haven't found a fix for that, it does this even with the most recent esp-idf version/platform_version.

@mattkasa

@rpatel3001 for esp-idf try:

esp32:
  board: esp32s3box
  framework:
    type: esp-idf
  variant: ESP32S3

I was able to get arduino working on the Korvo with this:

esp32:
  board: esp32-s3-devkitc-1
  variant: esp32s3
  framework:
    type: arduino

And media_player tries to work, but no sound, not even clicks, so I don't have something right with the I2S bus.

[05:36:40][D][media_player:059]: 'Korvo 1 Media Player' - Setting
[05:36:40][D][media_player:066]:   Media URL: https://homeassistant.local/api/tts_proxy/726c76553e1a3fdea29134f36e6af2ea05ec5cce_en-us_a877e2b3bf_tts.piper.wav

@rpatel3001
Author

Adding the variant and/or changing the board didn't change anything unfortunately.

@Hellis81

I just got my ESP Box, how do I get it in to flash mode?
I can't find it in ESP-Home or ESP-Home flasher.

I tried holding the boot button when connecting, but it won't work. Do I need some special drivers for this?

@KTibow

KTibow commented Jul 28, 2023

Generally, the process is like

  1. Plug the thing into the device with the ESPHome dashboard, via a USB cable with data
  2. Press upload, choose plug into device with dashboard, choose the device

@Hellis81

Major facepalm, the cable I have been using for an hour or so does not have data.
I (wrongly) assumed they had stopped making cables without data.

@snechiporenko

Work in progress? esphome/esphome#5230

@rpatel3001
Author

Nice, ADF is a great stepping stone to wake word and the advanced audio stuff willow is doing (combining multiple mics, DAC cancelation). Will monitor that PR.

@qJake

qJake commented Sep 2, 2023

I just got my ESP32-S3-BOX in the mail. I'm using the box.yaml from @rpatel3001 (thanks!).

I've added the ESPHome device into Home Assistant, and I can see in HA when I press the mic mute or "Assistant" top-left button. But I can't seem to get any audio output or get it to register my voice into STT even if I manually press the assistant button.

The screen is also blank (white), but that's a separate issue. I mostly bought this for the mic+speakers.

What step did I miss to get this to behave like a voice assistant in HA? I do have Piper and Whisper set up and if I try it with my browser, it works just fine. But I can't seem to get it to work with the ESP32 device.

@KTibow

KTibow commented Sep 2, 2023

What step did I miss to get this to behave like a voice assistant in HA?

read latest comments, rpatel3001 is still working on getting the audio sent to home assistant

@rpatel3001
Author

Not working on it, my particular issue was resolved. It seems that you might be having a similar issue, @qJake. Your home assistant instance needs to be accessible on all ports because it uses a random port number to transfer audio samples.

@KTibow

KTibow commented Sep 3, 2023

huh it was resolved?
what dependencies do you need to upgrade to resolve it?

@rpatel3001
Author

#2239 (comment)

@qJake

qJake commented Sep 5, 2023

@rpatel3001 I use HA OS, so I have Home Assistant running in a VM on my hypervisor. I can send audio to the ESP32 box (cloud TTS works, as you pointed out, Piper does not work yet, plus it's about 5x as slow anyway).

If my HA instance is accessible over the internet, is there a port I need to forward / hard-code somewhere to get it to work?

@jesserockz
Member

Just to chime in here. I have been working on getting the s3-box working as a Voice Assistant for Home Assistant.
The ongoing work is here esphome/esphome#5229 and esphome/esphome#5230.
There is an example YAML file here: https://github.com/esphome/firmware/blob/main/voice-assistant/esp32-s3-box.yaml

@llamaonaskateboard

llamaonaskateboard commented Sep 16, 2023

@jesserockz Great work, seems to work well in VAD mode (I'm assuming remote wake word needs a dev build of HA?).

I tried to get the tt21100 touchscreen working at the same time but it seems there's a conflict between the i2c component and esp-adf where the microphone or speaker don't provide/produce any audio:

[11:48:48][C][i2c.idf:017]: Setting up I2C bus...
[11:48:48][I][i2c.idf:233]: Performing I2C bus recovery
[11:48:48][V][esp-idf:000]: I (26) gpio: GPIO[18]| InputEn: 1| OutputEn: 1| OpenDrain: 1| Pullup: 1| Pulldown: 0| Intr:0
[11:48:48][V][esp-idf:000]: I (26) gpio: GPIO[8]| InputEn: 1| OutputEn: 1| OpenDrain: 1| Pullup: 1| Pulldown: 0| Intr:0
[11:48:48][V][i2c.idf:056]: Scanning i2c bus for active devices...
[11:48:48][V][esp-idf:000]: I (56) gpio: GPIO[4]| InputEn: 0| OutputEn: 1| OpenDrain: 0| Pullup: 0| Pulldown: 0| Intr:0
[11:48:48][V][esp-idf:000]: I (56) gpio: GPIO[48]| InputEn: 0| OutputEn: 1| OpenDrain: 0| Pullup: 0| Pulldown: 0| Intr:0
[11:48:48][V][esp-idf:000]: I (57) gpio: GPIO[5]| InputEn: 0| OutputEn: 1| OpenDrain: 0| Pullup: 0| Pulldown: 0| Intr:0
[11:48:48][V][esp-idf:000]: I (383) gpio: GPIO[1]| InputEn: 1| OutputEn: 0| OpenDrain: 0| Pullup: 0| Pulldown: 0| Intr:0
[11:48:48][I][esp_adf:015]: Start codec chip
[11:48:48][V][esp-idf:000]: E (384) i2c: i2c driver install error
[11:48:48][V][esp-idf:000]: E (385) I2C_BUS: components/esp_peripherals/driver/i2c_bus/i2c_bus.c:89 (i2c_bus_write_bytes):Handle error

@fredyolha

@llamaonaskateboard I managed to get the wake word working simply by installing the openwakeword addon (https://github.com/rhasspy/hassio-addons), without any issues.
If I omit the display and touch, it works perfectly; otherwise I get the same issue as you.

@ChristophCaina

ChristophCaina commented Oct 13, 2023

hm... so I have to decide now between using the Wakeword - or the Display...
That's unfortunate...
And... it seems that if Bluetooth-Proxy is enabled, the Wakeword detection / PushToTalk does not work?

@netweaver1970

I seem to have the same problem (BLE proxy + VAD/wakeword coexistence) on an M5Stack echo. I wanted to make a super-duper sensor controller out of it; I have it running fine as a BLE proxy + mmwave sensor node. But when additionally adding the VAD/wakeword stuff (and avoiding the switch behaviour clash), the sensor either doesn't do anything or goes haywire, needing a physical reboot. Sad ... :(

@qJake

qJake commented Nov 4, 2023

Adding in my experience here as well...

I'm using (nearly verbatim) the example provided by ESPHome to get the ESP32-S3-Box working as a voice assistant.

Here's what works as of Nov 2023:

  • Wake word, including custom wake word trained on the wake word Colab
  • Speech to text recognition with good speed and accuracy
  • Text to speech using any TTS provider (Piper, HA Cloud, etc)
  • Buttons and LED status lights

Here's what I'm struggling to get working:

  • TTS audio output is broken using the speaker component - my console frequently gets flooded with: [W][voice_assistant:293]: Speaker buffer full.
  • I can't use the ESP32 device as both a speaker and a media_player - sometimes I want to play TTS audio from Home Assistant outside the context of voice recognition (e.g. an automation announcement)
  • Display does not work / can't be used at the same time
  • Wake word is spotty because it seems like it shuts off the microphone after ~5s of no audio detected, and then activates again once it hears anything - which means if you go from a completely silent room to just saying the wake word, it doesn't work the first time.

@anth-dinosaur

anth-dinosaur commented Nov 11, 2023

I have all of the same results as you. I get [voice_assistant:293]: Speaker buffer full. after 1-2 responses back from the assistant. I also notice that responses stop playing a little early, about 1-2 seconds before the end of the response. Also, more critically, on power cycle most of the configuration is lost:

  • The hostname is back to ESPHome Web XXXXX
  • The api encryption key is gone so HA can't talk to it
  • No switches/entities/etc are configured, and it is running the web_server component even though that is not in the below config
  • It seems like a full wipe back to when I first "prepared" it on ESPHome Web, except that it has remembered its network settings (which were not configured with ESPHome Web, only upon flashing the config)

Used standard config as linked by HA docs (+ my wifi info): https://github.com/esphome/firmware/blob/main/voice-assistant/esp32-s3-box.yaml

Would be curious if others have the same issue?

@shyney7

shyney7 commented Nov 12, 2023

Once all features are implemented will this also work with the newest model: ESP32-S3-Box3? https://github.com/espressif/esp-box/blob/master/docs/hardware_overview/esp32_s3_box_3/hardware_overview_for_box_3.md

@sammcj

sammcj commented Nov 13, 2023

I have the new ESP32 S3 Box 3, currently have Willow installed but would much rather use ESPHome.

I'd be happy to try a build on it and provide feedback if it helps.

@pauln

pauln commented Dec 9, 2023

@rpatel3001 I've been testing your es7210 component on a Lilygo T-Embed ESP32-S3; it's been working well for me (I've had some issues with other aspects of the T-Embed, but mic input via the ES7210 has been working without issue). Would you mind putting up a PR for the es7210 component? It'd be nice to have it in ESPHome rather than having to use your fork as an external component.
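
For anyone else wanting to try it in the meantime, pulling a component from a fork looks roughly like this. The branch name is the one linked earlier in the thread; the component name `es7210` is an assumption about the directory name in that branch.

```yaml
external_components:
  # Fetch only the named component from the fork's branch; everything else
  # still comes from the installed ESPHome release.
  - source: github://rpatel3001/esphome@es7210
    components: [es7210]   # assumed component directory name
```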

@rpatel3001
Author

@pauln I think the ES8311 and ES7210 components are somewhat redundant now that esp-adf is being integrated. The ES8311 PR went stale and was closed. Your best bet is probably to make a PR adding the Lyra to the esp-adf supported boards after testing that it works.

In other news I was unable to get the esp-adf example yaml to work, it complained at runtime about needing a patch to be applied to some freertos files. The ES8311 and ES7210 external components worked fine however, with both the arduino and esp-idf frameworks. I anticipate this is an issue with my local install since others have reported it working.

@formatBCE

Interesting. My box device works incredibly well with that voice assistant example. Well, excluding the fact that there's not much headroom processor- and memory-wise. Audio responses are somewhat cut out, and there's no way to use it for announcements because of the lack of a media_player component for adf - but it's still usable. I put a clock instead of the static icon in the idle state, and have focused on the Assist capabilities themselves so far.

@pauln

pauln commented Dec 11, 2023

@rpatel3001 I've been keeping half an eye on the esp-adf PR, but it seems to be taking its time landing. If it'll be usable for all boards with these audio devices, having one component that handles them all rather than one per audio device does sound a bit cleaner - but my understanding is that it (currently, at least) uses board configs from esp-adf, of which there isn't one for the T-Embed. None of the various Lyra configs (nor any of the others) seem to be particularly close matches, either - was there a specific one you thought might be suitable?

@rpatel3001
Author

Hm, no, there wasn't. I didn't realize esp-adf had a specific subset of supported boards. The ES7210 and ES8311 PRs in this issue basically reimplement the drivers in esp-adf. There's probably a way to directly instantiate the esp-adf one instead if you're inclined to figure out how and make your own PR.

@janstadt

Interesting. My box device works incredibly well with that voice assistant example. Well, excluding the fact that there's not much headroom processor- and memory-wise. Audio responses are somewhat cut out, and there's no way to use it for announcements because of the lack of a media_player component for adf - but it's still usable. I put a clock instead of the static icon in the idle state, and have focused on the Assist capabilities themselves so far.

Curious if you could share your config to get the clock displayed instead of the static icon.

@formatBCE

Created a fork and added my changes. See THIS commit, but first read the following:

  1. In this commit there's an error in the color string: it should be without #. I fixed it in the following commit.
  2. For this to work, you have to create a text sensor (template/helper) sensor.local_clock in HA to prepare the time string for you and update it each minute. It's easier to do in HA than to handle on the ESP. I use this template:
{{ now().timestamp() | timestamp_custom('%H:%M') }}
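
One way to create such a sensor is a template sensor in Home Assistant's configuration.yaml. This is only a sketch: the name "Local Clock" is chosen so that the entity ID should come out as sensor.local_clock, which is an assumption about how the helper was actually set up.

```yaml
# Home Assistant configuration.yaml (not the ESPHome device config)
template:
  - sensor:
      - name: "Local Clock"
        # Templates that use now() are re-evaluated at the start of each minute
        state: "{{ now().timestamp() | timestamp_custom('%H:%M') }}"
```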

Cheers!

@janstadt

@formatBCE thanks a ton! I was able to modify the existing default firmware by extending some of the properties, so I didn't have to fork the repo.

substitutions:
  clock_font_color: "AA6600"

color:
  - id: clock_color
    hex: ${clock_font_color}

font:
  - file: 
      type: gfonts
      family: Roboto
      weight: 500
    glyphs: "0123456789:"
    id: font_large
    size: 110

text_sensor:
  - id: text_time
    platform: homeassistant
    entity_id: sensor.local_clock
    on_value:
      then:
        - component.update: s3_box_lcd

display:
  - id: !extend s3_box_lcd
    pages:
      - id: !extend idle_page
        lambda: |-
          it.fill(id(idle_color));
          it.printf(160, 120, id(font_large), id(clock_color), TextAlign::CENTER, "%s", id(text_time).state.c_str());
