fix(apps): resolve streaming error handling for 2-minute timeout & content stream not allowed by Athosone · Pull Request #453 · microsoft/teams.py

Athosone · 2026-06-06T21:22:50Z

PR has evolved due to a number of raised issues wrt streaming error handling. This PR solely focuses on handling the 403 for the 2-minute timeout, and surfacing the "Content stream is not allowed" error.

This table highlights all the different error codes, we are concerned with the 403s:
https://learn.microsoft.com/en-us/microsoftteams/platform/bots/streaming-ux?tabs=csharp#error-codes

Content stream is not allowed -> Error is surfaced, not as a cancelled error
Content stream is not allowed on an already completed streamed message -> Not possible with the SDK
Content stream finished due to exceeded streaming time. -> Handled in this PR. A new regular message is sent instead.
Content stream was canceled by user -> Already handled today
Request streamed content should contain the previously streamed content -> Not possible with the SDK

TO BE DISCUSSED:

Bursty informative updates - continued discussion in HttpStream sends streaming activities faster than the 1 request/second limit #452 to reach an ideal solution
Message size too large -> To be discussed in smoother error handling for "message size too large" on stream #488

OLD DESCRIPTION:

Fixes #452

Problem

HttpStream can send streaming activities faster than the Teams 1 request/second
per-stream limit. Teams then throttles the stream and returns
403 ContentStreamNotAllowed, which _send converts to StreamCancelledError
and the user sees as a red error toast.

Two causes in http_stream.py:

_flush() sent every queued informative update back to back, then the text
chunk. A burst of update() calls became a burst of POSTs in one flush.
The only pacing was a call_later(0.5, ...) reschedule, armed only when a
backlog survived a flush. Once the queue drained, _timeout was None, so the
next emit() fired an immediate flush with no pacing.

Fix

A per-stream leaky-slot limiter (make_limiter in utils/limiter.py), gating
every chunk send in _send_activity. Its next_slot is instance state shared
across flushes, so it paces both the in-flush burst and the next emit() after
the queue drains. The first send is not delayed, so the progress bar still
appears promptly.

min_send_interval on HttpStream.__init__ (default 1.0s) lets callers buffer
toward the docs' 1.5 to 2s advice.
coalesce_informative_updates defaults to True: a burst of informative
updates in one flush collapses to the latest, matching the docs' "one
informative message, reused for each update". Set it False to pace out every
update at 1 req/s instead (see Notes).

Tests

Limiter spacing and input validation; rapid update() burst coalesces to the
latest by default; with coalesce_informative_updates=False every update is sent
in order, none dropped, paced at the interval; emits are paced across flushes (the
post-drain case); the interval is configurable; the flag never drops text.
Existing test_http_stream.py behaviour is unchanged. Full packages/apps suite
passes, ruff and pyright clean.

Notes

The limiter is acquired in _send, so every HTTP attempt is paced: retries and
close()'s final send included. acquire() only waits when a send would
actually be too soon, so close pays no latency unless it lands within the
interval of the last chunk. Pass min_send_interval=0 to disable pacing.
With coalesce_informative_updates=False, _flush holds its lock across the
paced sleeps, so a very long informative burst could keep the lock long enough
that close() times out. The default (coalesce on) avoids this.
The same gap likely exists in the sibling TS and .NET SDKs; not addressed here.

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Introduces a reusable async rate limiter and integrates it into HttpStream to pace outbound activity sends, with additional tests covering pacing and coalescing behavior.

Changes:

Added make_limiter() utility for async rate limiting.
Updated HttpStream to enforce a per-stream minimum send interval and optionally coalesce informative updates.
Added/updated tests to validate pacing, ordering, and coalescing semantics.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
packages/apps/src/microsoft_teams/apps/utils/limiter.py	New async limiter utility based on monotonic time and `asyncio.sleep()`.
packages/apps/src/microsoft_teams/apps/utils/init.py	Exposes `make_limiter` via package exports.
packages/apps/src/microsoft_teams/apps/http_stream.py	Adds pacing/coalescing knobs and enforces pacing via limiter in `_send_activity`.
packages/apps/tests/test_limiter.py	New unit test validating limiter spacing semantics.
packages/apps/tests/test_http_stream.py	Updates existing tests to use shorter intervals; adds new pacing/coalescing tests.

Athosone · 2026-06-06T22:35:35Z

+from typing import Awaitable, Callable
+
+
+def make_limiter(rate: int, period: float = 1.0) -> Callable[[], Awaitable[None]]:


Already handled: make_limiter raises ValueError for rate < 1 and period < 0, so rate == 0 (ZeroDivisionError) and negative periods fail fast at construction. period == 0 is allowed on purpose and means "no pacing" (interval 0), which is how callers disable the limiter.

HttpStream could send streaming activities faster than the Teams 1 request/second per-stream limit, which gets the stream throttled and returned as 403 ContentStreamNotAllowed (surfaced to the user as a red error toast). Two causes: _flush() sent every queued informative update back to back plus the text chunk, and the only pacing was a 0.5s reschedule that was dropped once the queue drained, so the next emit() fired immediately. Add a per-stream leaky-slot limiter (make_limiter) and gate every chunk send through it in _send_activity. Because the limiter state is shared across flushes, it paces both the in-flush burst and the next post-drain emit. min_send_interval is exposed (default 1.0s) so callers can buffer toward the docs' 1.5-2s advice. A burst of informative updates in one flush coalesces to the latest by default (coalesce_informative_updates, matching the docs' one-reused-message guidance); set it False to pace out every update instead.

Athosone · 2026-06-06T22:30:19Z

@microsoft-github-policy-service agree company="Michelin"

Acquire the per-stream limiter in _send instead of _send_activity so every HTTP attempt is paced, including retries and close()'s final send. This keeps a retry or the final message from landing within the interval of the last chunk and tripping the Teams 1 req/s throttle. acquire() only waits when a send would actually be too soon, so close() pays no latency after an idle gap; min_send_interval=0 disables pacing. Tests that don't assert pacing construct with min_send_interval=0 to stay fast. Addresses Copilot review feedback on microsoft#453.

Athosone · 2026-06-10T14:29:19Z

Thanks a lot for taking the time to review @lilyydu I will come back to you soon.

Btw I noticed two other issues while playing with it:

2 minutes timeout hard-coded for the streaming response. This is a problem for us, we are running a chat bot that helps our support user to resolve incident, doing RCA etc. sometimes it can takes some times to gather all the required info, would you mind if I proposed a new issue/pr to address this? The fix would be to allow the user to configure the timeout via the app config?
When we send a payload to the stream if it is too big we receive a 413 entity too large, The fix I would like to propose is that the httpstream chunk the payload and send it in batch of "MAX_SIZE". I can also make a PR/issue if you are ok with it?

However I wonder if the httpstream should carry that responsability alone, are these constraints only tight to it?

…ault, simpler limiter - Plumb stream_min_send_interval and stream_coalesce_informative_updates through AppOptions -> ActivitySender -> HttpStream - Flip coalesce_informative_updates default to False so informative updates are paced, not silently dropped; coalescing is now opt-in - Simplify make_limiter(rate, period) to make_limiter(interval) and reword its docstring as a fixed-interval gate (token bucket of size 1)

heyitsaamir · 2026-06-10T18:50:23Z

Thanks for the push here!

Some thoughts:

Let’s not expose configurable flags for this. Informative updates should be coalesced by default because they’re status replacements, not cumulative content. If we skip intermediate informative updates, let’s log only the count at debug level.
I think HttpStream should have two modes:
- informative mode: before text streaming starts, send the latest pending informative update
- text mode: once text streaming starts, do not send informative updates anymore

Can we still use the same retry path and the same send pacing/limiter for informative updates, text chunks, retries, and final close.

heyitsaamir · 2026-06-10T19:11:06Z

Thanks a lot for taking the time to review @lilyydu I will come back to you soon.

Btw I noticed two other issues while playing with it:

2 minutes timeout hard-coded for the streaming response. This is a problem for us, we are running a chat bot that helps our support user to resolve incident, doing RCA etc. sometimes it can takes some times to gather all the required info, would you mind if I proposed a new issue/pr to address this? The fix would be to allow the user to configure the timeout via the app config?

When we send a payload to the stream if it is too big we receive a 413 entity too large, The fix I would like to propose is that the httpstream chunk the payload and send it in batch of "MAX_SIZE". I can also make a PR/issue if you are ok with it?

However I wonder if the httpstream should carry that responsability alone, are these constraints only tight to it?

Yeah these are outlined in the official streaming docs. Admittedly, the streaming system here could help by just closing the stream and just doing regular updates.
Were you thinking something along the same lines?

I reached out to the folks that implemented streaming to ask what their preferred recommendation would be for these cases.

Athosone · 2026-06-10T21:33:50Z

Hi !
Thanks for your feedback,

If this is a platform limit that you cannot increase then yes we could fallback updating the message (this is what I am doing on my app at the moment but the experience is not as nice as with a stream :p).

For your comment on this PR I totally agree and will update the code, removing the flag in app config and coalesce the informative update and applying the limiter to different code path that emit bytes to the platform.

It matches my experience in the sense of, the user doesnt really care for the next narrative update of an agent or the result of a tool call if you already have the answer ready to be sent.

…stream modes Informative updates are status replacements, not cumulative content, so a burst now always collapses to the latest one (skipped count logged at debug level) instead of being paced out one-by-one or gated behind a flag. The stream now has two explicit modes: informative mode sends the latest pending informative update per flush; once text streaming starts the stream permanently switches to text mode and informative updates are dropped. Text landing in the same flush as informative updates supersedes them. Removes the stream_min_send_interval / stream_coalesce_informative_updates plumbing from AppOptions and ActivitySender; pacing stays fixed at the Teams 1 req/s limit and all sends keep going through the same limiter/retry path.

…eping An informative-only flush after text streaming starts used to fall through and re-send the unchanged cumulative text, burning a paced limiter slot under the flush lock for a no-op request. The chunk send is now guarded on text actually having been added by this flush. Also from the cleanup pass: track the latest informative update with a scalar and counter instead of building a list that only ever yields its last element, drop the unreachable start_length branch, use the typed ChannelData.stream_type field instead of getattr, trim a comment that restated the limiter docs, and deduplicate test boilerplate behind _await_pending_flush/_track_send_times helpers (tests with no timing assertions now run with pacing disabled).

lilyydu · 2026-06-11T21:38:12Z

Hey @Athosone,

We're still waiting on guidance from internal folks, but I'm gonna test out your implementation in the meantime. Thanks for responding to our requests!

lilyydu · 2026-06-15T23:06:38Z

Hey @Athosone,

Can you share what error message you get back for the 403 you are receiving from the 1 request /sec limitation? There's a couple that fall into that bucket here https://learn.microsoft.com/en-us/microsoftteams/platform/bots/streaming-ux?tabs=csharp#error-codes and want to make sure we cover our bases

lilyydu · 2026-06-23T16:16:08Z

Hey @Athosone,

Can you share what error message you get back for the 403 you are receiving from the 1 request /sec limitation? There's a couple that fall into that bucket here https://learn.microsoft.com/en-us/microsoftteams/platform/bots/streaming-ux?tabs=csharp#error-codes and want to make sure we cover our bases

@Athosone soft bump so we can unblock you!

Athosone · 2026-06-23T17:57:39Z

SOrry about that, totally missed the notification,

I got different error messages:

The bot failed to complete the streaming process within the strict time limit of two minutes

And

The bot sent a message that exceeds the current [message size](https://learn.microsoft.com/en-us/microsoftteams/platform/bots/how-to/format-your-bot-messages) restriction

For the first one I morph the stream to a message

For the second I implemented a chunker so that my adapter abstract the complexity to my main code.

lilyydu · 2026-06-24T22:27:51Z

Hey @Athosone ,

First, thanks for all the investigation and work you've done here! Streaming can definitely be improved so we highly appreciate your feedback and suggestions. A number of issues have been raised so we think it's best to isolate each into its own respective PR

In regard to the bursty informative updates, I repro'd the code snippets you attached, and it does result in 429s but these are handled via our retry logic. However, we do agree that we should have some mechanism to control the bursty behaviour. I tried out the fix you're proposing here with 100 informative updates but it coalesced into one single msg - not sure if that's the ideal experience too. Let's continue bouncing ideas in #452 for this

The 403s you were receiving are due to the 2 errors you attached above (2 minute limitation and chunk size). We currently return all 403s as "cancelled" which does not properly showcase the error table I linked above. I'm going to update this PR to surface the 2 minute error instead and then open up additional PRs following after to surface the other errors.

Apologies for the delay with resolving this but glad we are surfacing and inching closer to a better streaming exp :)

Removes the per-stream leaky-bucket limiter and associated changes, resetting the affected files to their main baseline. This provides a clean starting point for a new implementation. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

lilyydu · 2026-06-26T19:37:06Z

Updated the description:

PR has evolved due to a number of raised issues wrt streaming error handling. This PR solely focuses on handling the 403 for the 2-minute timeout, and surfacing the "Content stream is not allowed" error.

This table highlights all the different error codes, we are concerned with the 403s:
https://learn.microsoft.com/en-us/microsoftteams/platform/bots/streaming-ux?tabs=csharp#error-codes

Content stream is not allowed -> Error is surfaced, not as a cancelled error
Content stream is not allowed on an already completed streamed message -> Not possible with the SDK
Content stream finished due to exceeded streaming time. -> Handled in this PR. A new regular message is sent instead.
Content stream was canceled by user -> Already handled today
Request streamed content should contain the previously streamed content -> Not possible with the SDK

TO BE DISCUSSED:

Bursty informative updates - continued discussion in HttpStream sends streaming activities faster than the 1 request/second limit #452 to reach an ideal solution
Message size too large -> To be discussed in smoother error handling for "message size too large" on stream #488

heyitsaamir · 2026-06-10T18:36:01Z

+        stream_min_send_interval: float = 1.0,
+        stream_coalesce_informative_updates: bool = False,


Let's stream_coalesce_informative_updates should be true always. let's not expose this.

heyitsaamir · 2026-06-10T18:36:14Z

+    def __init__(
+        self,
+        client: Client,
+        stream_min_send_interval: float = 1.0,


also don't think we need this.

heyitsaamir · 2026-06-26T21:39:08Z

+                if message != "Content stream was cancelled by user.":
+                    if message == "Content stream finished due to exceeded streaming time.":
+                        self._timed_out = True
+                    logger.warning("Teams encountered an error while streaming. Sending as a regular message.")
+                    raise StreamCancelledError(message) from e


This will raise the StreamCancelled error to the handler. We should probably raise a different error here.

Can we just do a switch statement here for various types of canceled errors? Also point to the learn docs that describe these.

heyitsaamir · 2026-06-26T21:39:27Z

            if e.response.status_code == 403:
+                error = e.response.json().get("error", {})
+                message = error.get("message", "")
+                if message != "Content stream was cancelled by user.":


let's also make sure that the backend has tests for these particular strings

(and maybe a more resilient error could be to just check for phrases like "canceled", "exceeded streaming time" etc.)

Copilot AI review requested due to automatic review settings June 6, 2026 21:22

Athosone mentioned this pull request Jun 6, 2026

HttpStream sends streaming activities faster than the 1 request/second limit #452

Open

Copilot AI reviewed Jun 6, 2026

View reviewed changes

Athosone force-pushed the fix/http-stream-rate-limit branch from 813b322 to 402286b Compare June 6, 2026 21:34

lilyydu reviewed Jun 8, 2026

View reviewed changes

Comment thread packages/apps/src/microsoft_teams/apps/http_stream.py Outdated

lilyydu reviewed Jun 8, 2026

View reviewed changes

Comment thread packages/apps/src/microsoft_teams/apps/http_stream.py Outdated

lilyydu reviewed Jun 8, 2026

View reviewed changes

Comment thread packages/apps/src/microsoft_teams/apps/http_stream.py Outdated

lilyydu reviewed Jun 8, 2026

View reviewed changes

Comment thread packages/apps/src/microsoft_teams/apps/utils/limiter.py Outdated

Athosone and others added 3 commits June 10, 2026 21:42

Merge branch 'main' into fix/http-stream-rate-limit

0099f06

Merge branch 'main' into fix/http-stream-rate-limit

dbb885a

Merge branch 'main' into fix/http-stream-rate-limit

c7f1ac5

lilyydu and others added 2 commits June 24, 2026 15:36

fix stream timeout 403 error

3c1c34c

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

lilyydu changed the title ~~fix(apps): rate-limit HttpStream to 1 req/s streaming limit~~ fix(apps): resolve streaming error handling for 2-minute timeout Jun 26, 2026

lilyydu changed the title ~~fix(apps): resolve streaming error handling for 2-minute timeout~~ fix(apps): resolve streaming error handling for 2-minute timeout & content stream not allowed Jun 26, 2026

lilyydu mentioned this pull request Jun 26, 2026

smoother error handling for "message size too large" on stream #488

Open

heyitsaamir reviewed Jun 26, 2026

View reviewed changes

		from typing import Awaitable, Callable


		def make_limiter(rate: int, period: float = 1.0) -> Callable[[], Awaitable[None]]:

		stream_min_send_interval: float = 1.0,
		stream_coalesce_informative_updates: bool = False,

Uh oh!

Conversation

Athosone commented Jun 6, 2026 • edited by lilyydu Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Problem

Fix

Tests

Notes

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Athosone Jun 6, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Athosone commented Jun 6, 2026

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Athosone commented Jun 10, 2026

Uh oh!

heyitsaamir commented Jun 10, 2026

Uh oh!

heyitsaamir commented Jun 10, 2026

Uh oh!

Athosone commented Jun 10, 2026

Uh oh!

lilyydu commented Jun 11, 2026

Uh oh!

lilyydu commented Jun 15, 2026

Uh oh!

lilyydu commented Jun 23, 2026

Uh oh!

Athosone commented Jun 23, 2026

Uh oh!

lilyydu commented Jun 24, 2026

Uh oh!

lilyydu commented Jun 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

heyitsaamir Jun 10, 2026

Choose a reason for hiding this comment

Uh oh!

heyitsaamir Jun 10, 2026

Choose a reason for hiding this comment

Uh oh!

heyitsaamir Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

heyitsaamir Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

heyitsaamir Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Athosone commented Jun 6, 2026 •

edited by lilyydu

Loading

lilyydu commented Jun 26, 2026 •

edited

Loading