Add delay until STT start media finishes playing #11

relust · 2024-02-20T06:35:05Z

Hello. Great job. I was waiting for wake word support in Stream Assist and I'm glad you managed to do it. My problem is that for "STT start media" I want to use personalized random answers like "Yes, I'm listening", "How can I assist you?", etc. Because VAD is too aggressive, it also records part of the answer ("Yes, I'm listening"), so Assist returns an error response saying it did not understand the request. I tried an automation that turns off the microphone switch for a second when the wake word is detected and then turns it on again, but it doesn't start listening again. Can you make it possible to set a delay between wake word detection and STT listening?

AlexxIT · 2024-02-20T07:04:57Z

StreamAssist uses the default Assist Pipeline component. It has some settings, but I don't really understand them :)
https://github.com/home-assistant/core/blob/54d005a3b8a5beaaf912a37b89ceab78694bd9db/homeassistant/components/assist_pipeline/pipeline.py#L447-L457

Also, detecting that the player has finished playing can be a problem, because it has to work for all kinds of media players.

relust · 2024-02-20T07:37:55Z

The Assist Microphone add-on and Wyoming Satellite on a Raspberry Pi do not have this problem. They wait for the wake response to finish playing and then start listening. So there is something like that in the code, but we have to figure out where. Also, the satellite on the ESP32 has three levels of end-of-speech detection (Default, Relaxed and Aggressive).

AlexxIT · 2024-02-20T09:36:56Z

End-of-speech detection is a setting of VoiceCommandSegmenter. Unfortunately, it is not possible to change these params for the Pipeline integration.

https://github.com/home-assistant/core/blob/2f026ca9631d13bf3e04349dfc27909105977e9f/homeassistant/components/assist_pipeline/vad.py#L118-L119

https://github.com/home-assistant/core/blob/2f026ca9631d13bf3e04349dfc27909105977e9f/homeassistant/components/assist_pipeline/vad.py#L14-L31
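
To illustrate why the end-of-speech setting matters for this issue, here is a toy sketch. It is not the Home Assistant VoiceCommandSegmenter: the class name ToySegmenter, the SILENCE_MS table and its millisecond values are assumptions chosen only for illustration. It shows how a shorter silence threshold cuts a command off at a brief pause, while a longer one waits for the speaker to continue.

```python
from dataclasses import dataclass

# Hypothetical presets: milliseconds of silence that end a command.
# The level names come from the thread; the values are illustrative only.
SILENCE_MS = {"aggressive": 250, "default": 700, "relaxed": 1000}


@dataclass
class ToySegmenter:
    """Toy end-of-speech detector (an assumption, not the real implementation)."""

    silence_ms: int            # silence needed to end the command
    speech_ms: int = 300       # speech needed before the command counts as started
    in_command: bool = False
    _speech: int = 0
    _silence: int = 0

    def process(self, chunk_ms: int, is_speech: bool) -> bool:
        """Feed one audio chunk; return False once the command has ended."""
        if not self.in_command:
            if is_speech:
                self._speech += chunk_ms
                if self._speech >= self.speech_ms:
                    self.in_command = True
                    self._silence = 0
            else:
                self._speech = 0
            return True
        if is_speech:
            self._silence = 0  # any speech resets the silence counter
            return True
        self._silence += chunk_ms
        return self._silence < self.silence_ms


def command_length(segmenter: ToySegmenter, frames: list[bool], chunk_ms: int = 100) -> int:
    """Return how many chunks were consumed before the command ended."""
    for index, is_speech in enumerate(frames, start=1):
        if not segmenter.process(chunk_ms, is_speech):
            return index
    return len(frames)
```

With a speech pattern that contains a 400 ms pause in the middle, ToySegmenter(SILENCE_MS["aggressive"]) stops inside the pause, while ToySegmenter(SILENCE_MS["relaxed"]) keeps going until the long silence at the end.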

AlexxIT · 2024-02-20T09:39:24Z

I get the idea. I don't know if I'll have time to implement this.

relust · 2024-02-22T06:46:24Z

I found a possible solution to this problem:

In stream_assist/core/__init__.py, play the wake sound on WAKE_WORD_END instead of STT_START, because when a continuous conversation starts I don't want it to say the same thing as when I call it by name.
Then create an asynchronous task that pauses for 0.1 seconds after giving the command to play the wake sound, with await asyncio.sleep(0.1), followed by a blocking pause with time.sleep() that stops all the code until the wake sound finishes playing.
The problem that still needs to be solved is that VAD is too aggressive; we should find a way to tell Home Assistant to use Relaxed VAD. I think this can be done where the settings are sent to the Pipeline.

# 2. Setup Pipeline Run
#...
        if event.type == PipelineEventType.WAKE_WORD_END:
            if player_entity_id and (media_id := data.get("stt_start_media")):
                # We schedule the execution of the asynchronous function in the background
                asyncio.create_task(async_play_media_and_pause(hass, player_entity_id, media_id))
#... at the bottom of the script
async def async_play_media_and_pause(hass, player_entity_id, media_id):
    play_media(hass, player_entity_id, media_id, "audio")
    await asyncio.sleep(0.1)  # We add an asynchronous pause of 100 ms
    time.sleep(5)  # We add a blocking pause that can be adjusted to how long the awake sentence is

AlexxIT · 2024-02-22T07:06:20Z

Blocking the event loop is a very bad idea. You are blocking the whole of Hass.
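
The difference can be shown with plain asyncio, outside Home Assistant. This is a minimal sketch: the ticker coroutine is a hypothetical stand-in for everything else that shares the event loop, and count_ticks is an illustrative helper, not part of StreamAssist. While time.sleep() runs, nothing else on the loop can make progress; await asyncio.sleep() yields control, so other tasks keep running.

```python
import asyncio
import time


async def ticker(ticks: list[float], period: float = 0.01) -> None:
    """Stand-in for every other task running on the same event loop."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(period)


async def count_ticks(blocking: bool, pause: float = 0.2) -> int:
    """Count how often the ticker ran while this coroutine paused."""
    ticks: list[float] = []
    task = asyncio.create_task(ticker(ticks))
    await asyncio.sleep(0)  # give the ticker a chance to start
    before = len(ticks)
    if blocking:
        time.sleep(pause)  # freezes the whole loop: the ticker cannot run
    else:
        await asyncio.sleep(pause)  # yields to the loop: the ticker keeps running
    during = len(ticks) - before
    task.cancel()
    return during
```

Running asyncio.run(count_ticks(True)) reports no ticks during the pause, whereas asyncio.run(count_ticks(False)) reports many.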

I know what can be done. I can stop forwarding the audio stream from the source to the pipeline for some time.

relust · 2024-02-22T07:22:09Z

I didn't realize that it blocks the whole of Hass. Anyway, it doesn't really work: for some reason it starts recording as soon as the wake word is detected, then blocks and delays the VAD, and it doesn't recognize the commands. Stopping audio stream forwarding would be a much better solution.

relust · 2024-02-26T19:05:00Z

@AlexxIT, could you please find a solution to this problem? I want to add visual responses instead of beeps to this integration, and unless I can mute the microphone or delay listening I can't use such responses, because it records them and no longer recognizes commands.

AlexxIT · 2024-02-27T04:16:09Z

I don't have time for this in the near future.

relust · 2024-03-05T20:39:34Z

I added a browser mod popup with a GIF, and I need the player status to close the popup when the response finishes playing, but I'm not getting the "player_entity_id" from the args. @AlexxIT, can you tell me how I could do it?

        elif event.type == PipelineEventType.TTS_END:
            if player_entity_id:
                tts = event.data["tts_output"]
                play_media(hass, player_entity_id, tts["url"], tts["mime_type"])
            if player_entity_id and (media_id := data.get("speech_gif")):
                show_popup(hass, player_entity_id, media_id, "picture", browser_id)
            if player_entity_id:
                asyncio.create_task(async_delay_close_popup(hass, player_entity_id, browser_id))
                
   
######################################################

async def async_delay_close_popup(hass, player_entity_id, browser_id):
    await asyncio.sleep(1)

    while True:
        player_state = hass.states.get(player_entity_id).state
        if player_state == "idle":
            break

        await asyncio.sleep(0.1)

    close_popup(hass, player_entity_id, browser_id)

##################################################
def close_popup(hass: HomeAssistant, player_entity_id: str, browser_id: str):
    service_data = {
        "entity_id": player_entity_id,
        "browser_id": browser_id,
    }

    coro = hass.services.async_call("browser_mod", "close_popup", service_data)
    hass.async_create_background_task(coro, "stream_assist_close_popup")

If I use the name of the player directly, it works, but not when I take it from the args:
player_state = hass.states.get("media_player.ha_display2_browser").state
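
One thing worth ruling out: in Home Assistant, hass.states.get(entity_id) returns None when no entity with that ID exists, so calling .state on the result raises AttributeError, and inside a background task that error is easy to miss. The sketch below is a more defensive polling loop under that assumption. wait_until_idle and its get_state parameter are hypothetical names, and the function is written against a plain callable so it can be tried outside Home Assistant.

```python
import asyncio
from typing import Callable, Optional


async def wait_until_idle(
    get_state: Callable[[str], Optional[str]],
    entity_id: str,
    timeout: float = 30.0,
    poll: float = 0.1,
) -> bool:
    """Poll until the entity reports "idle"; return False if the timeout expires.

    get_state returns the state string, or None when the entity is unknown.
    An unknown entity simply never becomes idle, so the call times out
    instead of raising inside a background task.
    """

    async def _poll() -> None:
        while get_state(entity_id) != "idle":
            await asyncio.sleep(poll)

    try:
        await asyncio.wait_for(_poll(), timeout)
    except asyncio.TimeoutError:
        return False
    return True
```

Inside the integration, get_state could be something like lambda eid: (s := hass.states.get(eid)) and s.state. Logging the value of player_entity_id before the loop would also show whether the expected entity ID is actually being passed in.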

AlexxIT · 2024-03-06T04:16:26Z

I'm not sure what args you're talking about. I have never used browser mod. I don't understand your code.

relust · 2024-03-06T04:40:40Z

I just need to get the name of the player that is selected in the GUI, the one the responses are played on, so that the popup closes when the response is done playing.
I need to replace the player name that I hardcoded, which works, player_state = hass.states.get("media_player.ha_display2_browser").state, with the name of the player set in the graphical interface, so that the player selector works: player_state = hass.states.get(player_entity_id).state
I don't know why it doesn't pick up the name of the player, or maybe the name doesn't arrive in a format that works in this expression. player_entity_id comes from the function arguments (hass, player_entity_id, media_id, "picture")

AlexxIT · 2024-03-06T04:59:23Z

I don't understand where you're trying to get the player_entity_id var from.

janstadt · 2024-09-09T14:35:22Z

Did this ever get taken care of? I noticed the VAD is way too aggressive as well, and depending on how quickly the mp3 you play as start media finishes, VAD is already over and the conversation agent cancels the request.

AlexxIT added the "question" (Further information is requested) label on Feb 20, 2024

AlexxIT added the "enhancement" (New feature or request) label and removed the "question" (Further information is requested) label on Feb 20, 2024

relust mentioned this issue Feb 24, 2024

Question about Android audio only #12

Closed

AlexxIT mentioned this issue Mar 7, 2024

Add visual custom responses with gifs and tts #16

Open
