Summary
When a Feishu channel's connection_mode is websocket, the bot can send messages via the REST API, but user → bot messages often never reach process_feishu_event, so the agent appears to ignore DMs. Outbound traffic still works because it does not depend on the event pipeline.
Root cause
In backend/app/services/feishu_ws.py, the Lark SDK invokes the registered handle_message callback from a worker thread, where there is usually no running asyncio event loop.
The code calls asyncio.get_running_loop() and then loop.create_task(self._async_handle_message(...)). On the SDK worker thread, get_running_loop() raises RuntimeError in the common case. The fallback tries to guess the main loop via asyncio.all_tasks()[0].get_loop(), which is non-deterministic and unreliable, so the coroutine may never run on the FastAPI loop.
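The failure mode can be reproduced without the Lark SDK. The sketch below is a minimal stand-in, not the code from feishu_ws.py: try_get_loop is a hypothetical callback run on a plain threading.Thread while an event loop is active on the main thread.

```python
import asyncio
import threading


def try_get_loop(result: dict) -> None:
    # Runs on a plain worker thread, as an SDK callback would.
    try:
        asyncio.get_running_loop()
        result["outcome"] = "loop found"
    except RuntimeError as exc:
        # No loop is running on *this* thread, even though one runs elsewhere.
        result["outcome"] = f"RuntimeError: {exc}"


async def main() -> dict:
    result: dict = {}
    worker = threading.Thread(target=try_get_loop, args=(result,))
    worker.start()
    # The worker does not need the loop, so a short blocking join is safe here.
    worker.join()
    return result


outcome = asyncio.run(main())
print(outcome["outcome"])
```

The running loop is tracked per thread, so a loop running on the main thread is invisible to get_running_loop() called from the worker.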
Expected behavior
Every im.message.receive_v1 event should be serialized and executed on the same event loop that runs FastAPI (e.g. via a loop captured at start_client / start_all and asyncio.run_coroutine_threadsafe from the SDK callback thread). The payload should be normalized to the same dict shape as the HTTP webhook before dispatch.
Suggested fix (high level)
- Store a reference to the main loop when starting the WS client from async startup.
- In the sync WS callback, parse the event into a body_dict (same shape as webhook).
- Dispatch with asyncio.run_coroutine_threadsafe(_async_handle_message(agent_id, body_dict), main_loop) and log exceptions from the returned Future.
Optionally normalize connection_mode with .strip().lower() so values like WebSocket still enable WS mode.
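The steps above could look roughly like the following sketch. It is self-contained and makes several assumptions: WsBridge, _log_failure, is_websocket_mode, and the body layout in demo are hypothetical names and placeholders, and the real callback would receive an SDK event object that must first be converted to body_dict. Only asyncio.run_coroutine_threadsafe and the loop capture mirror the proposal directly.

```python
import asyncio
import logging
import threading
from concurrent.futures import Future
from typing import Optional

logger = logging.getLogger("feishu_ws_sketch")


def is_websocket_mode(value: Optional[str]) -> bool:
    # Tolerates values such as " WebSocket " in stored config.
    return (value or "").strip().lower() == "websocket"


class WsBridge:
    """Hypothetical stand-in for the WS manager in feishu_ws.py."""

    def __init__(self) -> None:
        self._main_loop: Optional[asyncio.AbstractEventLoop] = None
        self.handled: list = []

    async def start_client(self) -> None:
        # Called from async startup, so the running loop is the app's loop.
        self._main_loop = asyncio.get_running_loop()

    async def _async_handle_message(self, agent_id: str, body_dict: dict) -> None:
        # Placeholder for the real call into process_feishu_event.
        self.handled.append((agent_id, body_dict))

    def handle_message(self, agent_id: str, body_dict: dict) -> None:
        # Sync callback, invoked on the SDK worker thread.
        if self._main_loop is None:
            logger.error("main loop not captured; dropping event")
            return
        future = asyncio.run_coroutine_threadsafe(
            self._async_handle_message(agent_id, body_dict), self._main_loop
        )
        future.add_done_callback(self._log_failure)

    @staticmethod
    def _log_failure(future: Future) -> None:
        # Surfaces exceptions that would otherwise be lost with the Future.
        if future.cancelled():
            return
        exc = future.exception()
        if exc is not None:
            logger.error("Feishu WS handler failed", exc_info=exc)


async def demo() -> list:
    bridge = WsBridge()
    await bridge.start_client()
    # Assumed placeholder payload; the real dict should match the webhook body.
    body = {"header": {"event_type": "im.message.receive_v1"}, "event": {}}
    worker = threading.Thread(target=bridge.handle_message, args=("agent-1", body))
    worker.start()
    worker.join()
    await asyncio.sleep(0.05)  # give the loop time to run the scheduled coroutine
    return bridge.handled


handled = asyncio.run(demo())
print(handled)
```

Capturing the loop once at startup removes the guesswork of the all_tasks() fallback, and the done callback keeps handler exceptions from disappearing silently.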
Environment
- Feishu / Lark app: event subscription via long connection (WebSocket), not HTTP callback.
- Backend: FastAPI + lark-oapi WS client as in this repo.
Happy to open a PR with the above if maintainers agree with the direction.