Skip to content

Retain strong refs to update-processing tasks in AsyncTeleBot#2588

Merged
Badiboy merged 2 commits intoeternnoir:masterfrom
Dramex:fix/2572-retain-update-processing-tasks
Apr 19, 2026
Merged

Retain strong refs to update-processing tasks in AsyncTeleBot#2588
Badiboy merged 2 commits intoeternnoir:masterfrom
Dramex:fix/2572-retain-update-processing-tasks

Conversation

@Dramex
Copy link
Copy Markdown
Contributor

@Dramex Dramex commented Apr 17, 2026

Closes #2572.

What

AsyncTeleBot._process_polling dispatches incoming update batches via asyncio.create_task(...) without retaining a reference to the resulting Task. Per the Python asyncio docs, the event loop only keeps weak references to tasks, so an unreferenced task may be garbage-collected mid-execution and updates silently dropped.

Change

  • Added self._pending_tasks: set to AsyncTeleBot.__init__ to hold strong references to in-flight background tasks.
  • In _process_polling, captured the Task returned by asyncio.create_task(...), added it to _pending_tasks, and attached task.add_done_callback(self._pending_tasks.discard) so the set self-cleans on completion.

This is the exact pattern suggested by the asyncio docs and matches the minimal fix proposed by the reporter (@fshp971).

Why this approach vs the alternatives

@All-The-Foxes suggested two alternatives in the thread:

  1. Process updates via an internal queue — larger architectural change; unclear benefit if process_new_updates is fast.
  2. Just await process_new_updates directly — simplest, but changes concurrency semantics (polling would block on handler execution). That's a behavior change for existing bots.

This PR takes the minimal-risk route: preserves existing fire-and-forget semantics while fixing the GC hazard. Happy to rework toward either alternative if that's the preferred direction.

Tests

Added tests/test_async_telebot.py with a regression test that:

  • Drives _process_polling with stubbed get_updates / process_new_updates / get_me / close_session
  • Verifies the in-flight processing task is present in _pending_tasks during execution
  • Verifies _pending_tasks is emptied after the task completes (discard callback fires)

Full test run: 49 passed, 72 skipped (skipped tests require TOKEN/CHAT_ID env vars, unchanged from before this PR).

asyncio only keeps weak references to tasks returned from create_task,
so a fire-and-forget task with no external reference may be garbage
collected mid-execution, silently dropping updates. This adds a
per-bot _pending_tasks set that holds each processing task until it
completes, mirroring the pattern documented in the asyncio docs.

Closes eternnoir#2572
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a potential asyncio GC hazard in AsyncTeleBot polling by retaining strong references to fire-and-forget tasks created for update processing, and adds a regression test to ensure tasks are tracked and cleaned up.

Changes:

  • Add self._pending_tasks to keep strong references to in-flight update-processing tasks.
  • Track each asyncio.create_task(self.process_new_updates(...)) in _process_polling and discard it on completion.
  • Add a self-contained unit test covering task tracking and cleanup behavior.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
telebot/async_telebot.py Retains strong refs to background tasks created during polling and discards them when done.
tests/test_async_telebot.py Adds regression test validating _pending_tasks tracking during task execution and cleanup after completion.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread tests/test_async_telebot.py Outdated
await bot._process_polling(non_stop=True, interval=0, timeout=0)
# Allow the fire-and-forget task to finish plus one yield for the
# add_done_callback discard to run.
await process_completed.wait()
Copy link

Copilot AI Apr 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test awaits process_completed.wait() with no timeout. If the regression reappears (or the stubbed polling loop changes), this can hang the entire test run. Consider wrapping the wait in asyncio.wait_for(..., timeout=...) so the test fails fast instead of blocking indefinitely.

Suggested change
await process_completed.wait()
await asyncio.wait_for(process_completed.wait(), timeout=1)

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, applied in the latest push (ea13186). The guard now wraps the wait in asyncio.wait_for(..., timeout=1) so a miswired stub can't hang the test run.

Comment thread telebot/async_telebot.py
Comment on lines +463 to +466
# Retain a strong reference so the task isn't GC'd mid-execution.
task = asyncio.create_task(self.process_new_updates(updates))
self._pending_tasks.add(task)
task.add_done_callback(self._pending_tasks.discard)
Copy link

Copilot AI Apr 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The done callback currently only discards the task from _pending_tasks. If process_new_updates raises, the task’s exception is never retrieved/observed, which can lead to noisy Task exception was never retrieved warnings. Consider extending the done-callback to call task.exception() (and possibly log via _handle_exception/logger) before discarding, so failures aren’t silently ignored.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intentionally left out of this PR. Adding task.exception() / _handle_exception into the done callback is a real behaviour change: today an unhandled exception in process_new_updates surfaces through Python's default unobserved-exception handler, which routes to whatever logging the user has configured. Moving that into the library would mean deciding whether to swallow, re-raise, or log at a particular level, and whether to reuse the polling loop's _handle_exception. That feels like a separate design conversation rather than a follow-on to the strong-ref fix. Happy to open a follow-up if @eternnoir wants to pull it in.

Comment thread telebot/async_telebot.py Outdated
# Strong references to background tasks created via asyncio.create_task().
# asyncio only keeps weak references, so unreferenced tasks can be GC'd
# mid-execution; see https://docs.python.org/3/library/asyncio-task.html#asyncio.create_task
self._pending_tasks: set = set()
Copy link

Copilot AI Apr 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self._pending_tasks is annotated as a bare set, which loses the element type in this typed package. Consider using a parameterized type like set[asyncio.Task] (or set[asyncio.Task[None]] if appropriate) so static type checkers and IDEs can reason about what’s stored here.

Suggested change
self._pending_tasks: set = set()
self._pending_tasks: set[asyncio.Task[Any]] = set()

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Applied in the latest push as set[asyncio.Task[Any]]. Kept Any rather than None since process_new_updates returns None today but the set is general enough that a future caller might hand it a task with a different return.

Apply review feedback:
- Parameterise _pending_tasks as set[asyncio.Task[Any]] so type
  checkers have visibility into the element type.
- Wrap the test's process_completed.wait() in asyncio.wait_for(...,
  timeout=1) so a regression or stub-rewiring can't cause the test
  to hang the whole run.
@Badiboy
Copy link
Copy Markdown
Collaborator

Badiboy commented Apr 19, 2026

Okay, thank you. I'm absolutely out of asynk, so I left the responsibility on you and Copilot )

@Badiboy Badiboy merged commit 37c97af into eternnoir:master Apr 19, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Potential Bug: asyncio.create_task lacks strong references in telebot.AsyncTeleBot._process_polling

3 participants