🚀 v2026.6.0 - Intercom Native polish and Espressif GMF audio stack

Hotfix after initial 2026.6.0 publication

Published on May 30, 2026 after field testing the first 2026.6.0 build.

esp_afe now uses a compile-time split between single-mic and dual-mic targets.
Single-mic AFE profiles use the official ESP-SR direct feed/fetch path instead of the GMF AFE element path.
Dual-mic AFE profiles keep the GMF manager/element path and raw-output selection behavior.
Spotpear full AFE TCP/UDP profiles keep AFE VAD restore disabled by default to avoid restoring an unstable VAD state at boot.
The full MWW safe-start logic now uses the Intercom API idle condition instead of string-matching the state name.

This hotfix keeps the public YAMLs pointed at main and removes local debug/telemetry from production profiles.

This release is the next major step after the 2026.5.0 PBX-lite migration.

2026.5.x introduced the new call model: ESP devices as independent extensions, Home Assistant as a peer/bridge, unified phonebook, TCP/UDP routing and browser softphone support.

2026.6.0 keeps that model and rebuilds the audio foundation under it.

🏠 Home Assistant / Intercom Native

The Home Assistant side has been cleaned up around the unified PBX-lite event model.

🔁 Unified call event model

The integration and Lovelace card now use the unified `intercom_native.call_event` event shape for session, bridge and forward updates.

This gives automations and the card a more consistent view of:

call scope
event type
call state
hangup / decline / failure reason
bridge and forward lifecycle

The older split event behavior is no longer the preferred model.

📵 Better unavailable-device handling

The card now handles unavailable ESP devices more explicitly instead of showing stale call controls as if the device were still reachable.

This should make dashboard state clearer when an ESP is offline, rebooting, being flashed, or temporarily disconnected from Home Assistant.

⚡ Safer fast hangup / redial behavior

The browser softphone path has been hardened for fast user actions.

If a call is ended and another call starts immediately after, browser audio cleanup no longer tears down the new call's microphone/audio path by mistake.

This fixes a class of "second call has no browser audio" problems.
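One common way to get this kind of protection is to tag the audio path with the call that currently owns it and to ignore cleanup requests from any other call. The sketch below illustrates that general technique in Python. It is not the card's actual code (the real softphone runs in the browser), and `AudioPath`, `start` and `cleanup` are hypothetical names used only for illustration.

```python
class AudioPath:
    """Hypothetical stand-in for the browser microphone/audio path."""

    def __init__(self):
        self.owner_call_id = None  # call that currently owns the path
        self.is_open = False

    def start(self, call_id):
        # A new call takes ownership, even if an old cleanup is still pending.
        self.owner_call_id = call_id
        self.is_open = True

    def cleanup(self, call_id):
        # Ignore late cleanup from a call that no longer owns the path.
        if call_id != self.owner_call_id:
            return False
        self.is_open = False
        self.owner_call_id = None
        return True


path = AudioPath()
path.start("call-1")
path.start("call-2")              # fast redial before call-1 cleanup has run
late = path.cleanup("call-1")     # stale cleanup is rejected
still_open = path.is_open         # call-2 keeps its audio path
```

With a guard like this, the order in which hangup cleanup and the next call start arrive no longer matters.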

📱 Mobile notification answer flow

The documented mobile flow now supports real Answer / Decline actions:

Answer opens the dashboard view containing intercom-card with ?intercom_answer=1
the card requests microphone permission and starts the full-duplex browser/app audio path
Decline stays in Home Assistant automation logic and calls intercom_native.decline

This is the supported way to answer an ESP-originated call from the Home Assistant Companion app.
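As a small illustration of the Answer link, the sketch below appends `intercom_answer=1` to a dashboard path while keeping any query parameters that are already there. Only the `intercom_answer=1` parameter comes from these notes; the dashboard path `/lovelace/intercom` and the helper name `answer_url` are assumptions for the example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def answer_url(dashboard_path: str) -> str:
    """Return dashboard_path with intercom_answer=1 set in its query string."""
    parts = urlsplit(dashboard_path)
    # Keep existing parameters so other dashboard options are not lost.
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query["intercom_answer"] = "1"
    return urlunsplit(parts._replace(query=urlencode(query)))


# "/lovelace/intercom" is a hypothetical dashboard view path.
plain = answer_url("/lovelace/intercom")
with_existing = answer_url("/lovelace/intercom?kiosk=1")
```

The resulting string is what the notification's Answer action would open. Applying the helper twice gives the same URL, so it is safe to call on a link that is already prepared.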

🧹 Versioned card cache behavior

The card is registered with a versioned frontend URL derived from the installed integration version.

After upgrading, hard-refresh the dashboard page or clear the Companion app cache if the card still shows an old version.
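The idea behind a versioned URL is that the browser treats each version as a different resource, so an upgrade naturally bypasses the old cached file. A minimal sketch of that idea follows. The file path, the `v` parameter name and the helper `versioned_card_url` are assumptions; these notes do not specify the exact URL scheme the integration uses.

```python
from urllib.parse import quote


def versioned_card_url(base_url: str, version: str) -> str:
    """Append the integration version as a cache-busting query parameter."""
    separator = "&" if "?" in base_url else "?"
    return f"{base_url}{separator}v={quote(version, safe='')}"


# Hypothetical card path; the version string would come from the installed integration.
old_url = versioned_card_url("/intercom_native/intercom-card.js", "2026.5.0")
new_url = versioned_card_url("/intercom_native/intercom-card.js", "2026.6.0")
```

Because `old_url` and `new_url` differ, a cached copy of the old card is not reused for the new version. A client that still holds the old dashboard page in memory will keep requesting the old URL until it reloads, which is why the hard refresh is still recommended.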

✅ Minimum Versions

This release requires:

ESPHome: 2026.5.x or newer
Home Assistant Core: 2026.5.0 or newer

HACS metadata now declares the Home Assistant minimum version accordingly.
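For anyone scripting a pre-upgrade check, the minimums above can be compared as version tuples. This is a rough sketch that assumes "2026.5.x or newer" means 2026.5.0 or later and that version strings follow the `year.month.patch` pattern. `parse_version`, `MINIMUMS` and `meets_minimum` are illustrative names, not part of ESPHome or Home Assistant.

```python
def parse_version(text: str) -> tuple[int, int, int]:
    """Parse 'YYYY.M.P' into a tuple, ignoring suffixes such as 'b2' or '-dev'."""
    numbers = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        numbers.append(int(digits) if digits else 0)
    while len(numbers) < 3:
        numbers.append(0)
    return (numbers[0], numbers[1], numbers[2])


# Assumption: "2026.5.x or newer" is read as >= 2026.5.0.
MINIMUMS = {
    "esphome": (2026, 5, 0),
    "homeassistant": (2026, 5, 0),
}


def meets_minimum(component: str, version: str) -> bool:
    """Return True if the installed version satisfies this release's minimum."""
    return parse_version(version) >= MINIMUMS[component]


ok = meets_minimum("esphome", "2026.5.1")
too_old = meets_minimum("homeassistant", "2026.4.3")
```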

⚠️ Breaking Changes

Custom YAMLs that still use the old audio component/package layout need to be migrated.

Main migration points:

maintained YAMLs now use esp_audio_stack
old i2s_audio_duplex packages are no longer the supported path
some YAML options were renamed:
- speaker_volume -> master_volume
- mic_attenuation -> input_gain
- frame_buffers_in_psram -> buffers_in_psram
- audio_stack_in_psram -> audio_task_stack_in_psram
Generic full profiles are split into AEC and AFE variants
full audio/LVGL profiles include OTA maintenance handling
old copied Lovelace card files should be replaced by the bundled card
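The four option renames above are mechanical, so a small script can apply them to a custom YAML. The sketch below only rewrites keys whose names match exactly and leaves comments and values alone. It does not restructure components or packages, and it assumes the renamed options keep compatible values, which these notes do not state, so review each value afterwards (`mic_attenuation` and `input_gain` in particular may not share a scale). `rename_options` and the `some_component` key in the sample are hypothetical.

```python
import re

# Old option name -> new option name, as listed in the migration points.
RENAMED_OPTIONS = {
    "speaker_volume": "master_volume",
    "mic_attenuation": "input_gain",
    "frame_buffers_in_psram": "buffers_in_psram",
    "audio_stack_in_psram": "audio_task_stack_in_psram",
}

# Matches an optional indent / list dash, then a YAML key followed by a colon.
_KEY_RE = re.compile(r"^(\s*(?:-\s+)?)([A-Za-z_][A-Za-z0-9_]*)(\s*:)")


def rename_options(yaml_text: str) -> str:
    """Rename old option keys line by line, leaving everything else untouched."""
    out = []
    for line in yaml_text.splitlines(keepends=True):
        match = _KEY_RE.match(line)
        if match and match.group(2) in RENAMED_OPTIONS:
            new_key = RENAMED_OPTIONS[match.group(2)]
            line = match.group(1) + new_key + line[match.end(2):]
        out.append(line)
    return "".join(out)


sample = (
    "some_component:\n"
    "  speaker_volume: 0.8\n"
    "  mic_attenuation: 2\n"
    "  # speaker_volume: old note\n"
    "  name: speaker_volume\n"
)
migrated = rename_options(sample)
```

Running the function on already migrated text changes nothing, so it is safe to re-run.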

After upgrading, clear ESPHome build caches once before compiling.

find . -type d -name .esphome -prune -exec rm -rf {} +
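On systems without `find` (for example a plain Windows shell), roughly the same cleanup can be done with Python's standard library. This is a sketch of an equivalent, not an official tool; `clear_esphome_caches` is a hypothetical helper name. Like the command above, it deletes every `.esphome` directory under the given root, so run it from your ESPHome config folder only.

```python
import shutil
from pathlib import Path


def clear_esphome_caches(root) -> list[Path]:
    """Delete every .esphome directory below root and return the removed paths."""
    root = Path(root)
    removed = []
    # Collect and sort first so parents are handled before anything nested in them.
    for path in sorted(root.rglob(".esphome")):
        # Skip symlinks and entries already removed along with a parent.
        if path.is_symlink() or not path.is_dir():
            continue
        shutil.rmtree(path)
        removed.append(path)
    return removed
```

Calling `clear_esphome_caches(".")` from the config folder mirrors the shell command. The returned list shows which build caches were removed.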

🎧 Audio Stack Migration

The biggest internal change in 2026.6.0 is the migration from the old custom duplex audio path to the new `esp_audio_stack` backend.

This replaces the maintained i2s_audio_duplex path.

The goal is not just a component rename. The new backend is built around Espressif / ESP-IDF audio components that are designed to work together:

esp_driver_i2s for official I2S channel ownership
esp_codec_dev for codec-backed devices
gmf_io / io_codec_dev for codec IO
esp_audio_effects for rate, bit-depth and layout conversion
esp-sr for Acoustic Echo Cancellation
gmf_ai_audio / esp_gmf_afe_manager for the full Audio Front-End pipeline

This means the project now carries less custom audio infrastructure and relies more directly on the Espressif audio ecosystem.

💡 Why This Matters

Earlier versions had custom code for a lot of low-level audio work:

I2S lifecycle
speaker/microphone glue
AEC reference routing
rate conversion
bit-depth conversion
channel layout conversion
ring buffers
processor feed/fetch timing
codec-specific assumptions

That worked, but it created too much maintenance pressure and too many board-specific edge cases.

With esp_audio_stack, the project is closer to the native ESP-IDF audio model while still exposing normal ESPHome surfaces above it:

microphone
speaker
media player
mixer
Voice Assistant
Micro Wake Word
intercom API
Home Assistant entities

🧩 Supported Audio Shapes

The maintained profiles now cover these layouts through the new stack:

single-bus codec boards
single-bus no-codec boards
dual-bus MEMS mic + I2S amplifier boards
ES8311 stereo playback-reference boards
ES7210 + ES8311 TDM reference boards
dual-mic AFE boards
lightweight AEC-only Generic S3 profiles
full AFE profiles for larger flash/RAM layouts

Codec-backed devices use esp_codec_dev.

No-codec devices use official esp_driver_i2s channels directly, avoiding unnecessary codec/GMF IO dependencies on smaller builds.

🎙️ AEC and AFE Profiles

Profiles are now split more clearly.

🪶 `esp_aec`

Use this for lightweight echo cancellation.

It is the default direction for:

intercom-only devices
Generic S3 full-experience profiles that need to fit smaller flash layouts
users who want Acoustic Echo Cancellation without the full Audio Front-End cost

🧠 `esp_afe`

Use this for the full Espressif Audio Front-End path.

It adds:

Acoustic Echo Cancellation
Noise Suppression
Automatic Gain Control
Voice Activity Detection
Speech Enhancement / Blind Source Separation on supported dual-mic boards

It is heavier, but it is the right direction for boards with enough flash/RAM and for full voice-device profiles.

📦 Generic Profile Split

Generic S3 full-experience YAMLs are now split by intended target:

generic-s3-full-aec-*
- lightweight path
- intended for 4 MB-friendly builds
- uses standalone esp_aec
- uses the lighter previous_frame reference
generic-s3-full-afe-*
- full Audio Front-End path
- intended for larger flash layouts
- uses esp_afe
- uses TYPE2-style software reference

This avoids pretending one Generic YAML can fit every board and every flash layout.

🔊 Better AEC Reference Handling

Echo cancellation quality depends heavily on the playback reference.

The new stack handles reference routing per topology:

ES8311 boards can use stereo digital feedback
ES7210 TDM boards can use a hardware TDM reference slot
no-codec Generic AEC profiles can use previous_frame
Generic AFE profiles can use TYPE2-style software reference

This is one of the main reasons for the audio migration. AEC quality depends on reference timing, channel layout and conversion path, not only on enabling a library.
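A toy calculation shows why reference timing matters. Here the "microphone" hears nothing but the playback signal delayed by two samples, and the reference is simply subtracted from it. This is a deliberately simplified model with made-up sample values, not how esp-sr works internally (real AEC uses adaptive filtering), and `residual_energy` is an illustrative helper.

```python
def residual_energy(mic, reference, delay):
    """Subtract the reference, shifted by `delay` samples, and sum the squared leftover."""
    total = 0.0
    for n, sample in enumerate(mic):
        index = n - delay
        ref = reference[index] if 0 <= index < len(reference) else 0.0
        total += (sample - ref) ** 2
    return total


# Made-up playback signal and a microphone that hears it two samples later.
playback = [0.0, 1.0, -1.0, 0.5, -0.5, 0.25, 0.0, 0.0]
true_delay = 2
mic = [playback[n - true_delay] if n >= true_delay else 0.0
       for n in range(len(playback))]

aligned = residual_energy(mic, playback, delay=true_delay)   # reference lines up
misaligned = residual_energy(mic, playback, delay=0)         # reference arrives too early
```

With the correct delay the echo cancels completely in this toy case. With the wrong delay a large residual remains, which is the effect a well-matched reference path is meant to avoid.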

🧠 Runtime and Memory Improvements

The migration also cleaned up runtime behavior:

large buffers and task stacks are allocated earlier
repeated heap churn during call/media transitions has been reduced
microphone and speaker wrapper loops wake on real events instead of spinning
intercom_api parks its loop when idle
intercom TX uses lower-copy reads where possible
full profiles place selected buffers/stacks in PSRAM
full LVGL/audio profiles enter OTA maintenance mode before flashing

This helps demanding full-experience devices where media playback, Piper TTS, Micro Wake Word, Voice Assistant, AFE/AEC and intercom all coexist.

🧭 Maintained Board Direction

Current maintained baseline:

Waveshare ESP32-S3 Audio Board: full AFE, dual mic, TDM reference
Spotpear Ball v2: codec-backed AFE/intercom profiles
Generic S3 AEC: lightweight 4 MB-friendly full-experience profiles
Generic S3 AFE: larger flash full AFE profiles
Generic dual-bus: maintained intercom profiles
Waveshare P4 Touch: present and improving, still board-specific/experimental

🧪 Validation

Before this release, the public YAMLs were switched to remote release mode so users can download only the YAML and let ESPHome fetch packages, assets and external components from main.

Validation performed:

HACS validation passes
hassfest validation passes
generic-s3-full-afe-tcp.yaml compiles successfully with ESPHome 2026.5.1
ESPHome fetches this repository from main
Espressif managed components resolve and build correctly

Generic full AFE firmware size from the validation build is about 2.1 MB.

⬆️ Upgrade Notes

Recommended upgrade path:

Update the Home Assistant integration through HACS.
Restart Home Assistant.
Hard-refresh the dashboard page containing intercom-card.
Clear ESPHome build cache once.
Recompile from the updated YAMLs.
Flash the ESP firmware.

If you maintain custom YAMLs, start from the closest maintained profile and reapply only your board-specific changes.
