Preflight access token validation before pipeline execution#2
thaitien280401-stack merged 4 commits into MAIN
…er diagnostics

- Add ThreadsAPIError exception class for clear error typing
- Add validate_token() method to check token via /me endpoint before fetching
- Add automatic token refresh attempt when validation fails
- Add retry logic with backoff for transient connection/timeout failures
- Add API error body detection (Meta API can return 200 with error in body)
- Add request timeout (30s) to prevent hanging
- Fix keyword filtering to fall back to unfiltered list when all are filtered out
- Add detailed diagnostic messages with user_id, permission hints, and URLs
- Handle ThreadsAPIError separately in main.py for targeted troubleshooting guidance
- Remove unused Authorization header (Threads API uses access_token query param)

Agent-Logs-Url: https://github.com/thaitien280401-stack/RedditVideoMakerBot/sessions/0fff9f19-a7aa-44c2-a703-9e5a7ec6d880
Co-authored-by: thaitien280401-stack <271128961+thaitien280401-stack@users.noreply.github.com>
…row exception catch Agent-Logs-Url: https://github.com/thaitien280401-stack/RedditVideoMakerBot/sessions/0fff9f19-a7aa-44c2-a703-9e5a7ec6d880 Co-authored-by: thaitien280401-stack <271128961+thaitien280401-stack@users.noreply.github.com>
… and scheduler Agent-Logs-Url: https://github.com/thaitien280401-stack/RedditVideoMakerBot/sessions/adc9d93e-b8a2-4b45-8f6c-50427edeee51 Co-authored-by: thaitien280401-stack <271128961+thaitien280401-stack@users.noreply.github.com>
Agent-Logs-Url: https://github.com/thaitien280401-stack/RedditVideoMakerBot/sessions/adc9d93e-b8a2-4b45-8f6c-50427edeee51 Co-authored-by: thaitien280401-stack <271128961+thaitien280401-stack@users.noreply.github.com>
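The commit notes above mention that the Meta API can return HTTP 200 with an error in the body. A minimal sketch of that kind of check follows. It assumes the body is already decoded JSON and that failures appear under an `error` key; the helper name `raise_for_error_body`, the `code` attribute, and the stand-in `ThreadsAPIError` class are illustrative and are not taken from the PR.

```python
class ThreadsAPIError(Exception):
    """Stand-in for the PR's exception class; the ``code`` field is assumed."""

    def __init__(self, message, code=None):
        super().__init__(message)
        self.code = code


def raise_for_error_body(payload):
    """Raise if a decoded JSON body carries an ``error`` entry, else return it."""
    error = payload.get("error") if isinstance(payload, dict) else None
    if error:
        if isinstance(error, dict):
            message = error.get("message", "unknown error")
            code = error.get("code")
        else:
            # Some failures may carry a bare string instead of an object.
            message, code = str(error), None
        raise ThreadsAPIError(f"Threads API error: {message}", code=code)
    return payload
```

Such a check would run after the HTTP status check, so a response that looks successful at the transport level can still surface as a typed error.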
Pull request overview
Adds a Threads access-token “preflight” validation step to surface expired/invalid token issues early (instead of failing later with opaque “No threads found” errors), and improves Threads API error handling.
Changes:
- Introduces `utils/check_token.py` to validate `/me`, attempt refresh, and print actionable diagnostics before Threads work starts.
- Enhances `threads/threads_client.py` with structured `ThreadsAPIError`, retry/backoff in `_get()`, and explicit token validate/refresh helpers.
- Runs preflight validation before Threads execution paths (CLI startup and scheduled pipeline runs) and improves top-level error messaging.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| `utils/check_token.py` | New preflight token checker (validate `/me`, attempt refresh, config sanity checks). |
| `threads/threads_client.py` | Adds `ThreadsAPIError`, retrying `_get()`, token validate/refresh methods, and improved thread selection behavior. |
| `main.py` | Runs preflight before Threads modes and prints targeted troubleshooting for `ThreadsAPIError`. |
| `scheduler/pipeline.py` | Runs preflight at the start of each pipeline run (scheduled jobs). |
```python
            ThreadsAPIError: If the token is invalid or has expired.
        """
        try:
            return self._get("me", params={"fields": "id,username"})
        except (ThreadsAPIError, requests.HTTPError) as exc:
            raise ThreadsAPIError(
                "Access token is invalid or has expired. "
                "Please update access_token in config.toml. "
                "Guide: https://developers.facebook.com/docs/threads/get-started\n"
                f"Details: {exc}"
            ) from exc
```
validate_token() currently only wraps ThreadsAPIError and requests.HTTPError, but _get() can also raise requests.Timeout / requests.ConnectionError after the final retry. That means callers may unexpectedly see raw requests exceptions despite the docstring saying ThreadsAPIError is raised. Consider catching requests.RequestException (or at least Timeout/ConnectionError) here and re-raising a ThreadsAPIError with a clearer network diagnostic.
```diff
-            ThreadsAPIError: If the token is invalid or has expired.
-        """
-        try:
-            return self._get("me", params={"fields": "id,username"})
-        except (ThreadsAPIError, requests.HTTPError) as exc:
-            raise ThreadsAPIError(
-                "Access token is invalid or has expired. "
-                "Please update access_token in config.toml. "
-                "Guide: https://developers.facebook.com/docs/threads/get-started\n"
-                f"Details: {exc}"
-            ) from exc
+            ThreadsAPIError: If the token is invalid, has expired, or a network error occurs.
+        """
+        try:
+            return self._get("me", params={"fields": "id,username"})
+        except ThreadsAPIError as exc:
+            raise ThreadsAPIError(
+                "Access token is invalid or has expired. "
+                "Please update access_token in config.toml. "
+                "Guide: https://developers.facebook.com/docs/threads/get-started\n"
+                f"Details: {exc}"
+            ) from exc
+        except requests.RequestException as exc:
+            raise ThreadsAPIError(
+                "Could not check the access token due to a network or connection error reaching the Threads API. "
+                "Please try again later.\n"
+                f"Details: {exc}"
+            ) from exc
```
```python
    except ThreadsAPIError:
        # Token is invalid → try refresh
        print_substep(
            "⚠️ Token may have expired, attempting refresh...",
```
get_threads_posts() only catches ThreadsAPIError from validate_token(), but validate_token() can currently propagate requests.Timeout / requests.ConnectionError (after retries) which will bypass this handler and skip the refresh / targeted messaging. Either have validate_token() always wrap request failures into ThreadsAPIError, or broaden this except to include requests.RequestException and handle it explicitly.
```diff
-    except ThreadsAPIError:
-        # Token is invalid → try refresh
-        print_substep(
-            "⚠️ Token may have expired, attempting refresh...",
+    except (ThreadsAPIError, requests.RequestException) as validate_err:
+        # Token is invalid or a connection error occurred during validation → try refresh
+        print_substep(
+            "⚠️ Could not validate the current token, attempting refresh...",
```
```python
        response.raise_for_status()
        return response.json()

        last_exception: Optional[Exception] = None
```
last_exception is assigned but never used. Please remove it, or use it to raise a final ThreadsAPIError with retry context once all attempts are exhausted.
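One way to act on this comment is to keep `last_exception` and attach it to the final error once retries run out. The sketch below is an assumption about how that could look, not the PR's `_get()`: the function name `get_with_retry`, its parameters, and the stand-in `ThreadsAPIError` are hypothetical, and it catches the built-in `ConnectionError` and `TimeoutError` where the real client would presumably catch the `requests` equivalents.

```python
import time


class ThreadsAPIError(Exception):
    """Stand-in for the PR's exception class."""


def get_with_retry(send, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call ``send()`` until it succeeds or ``max_attempts`` is exhausted."""
    last_exception = None
    for attempt in range(1, max_attempts + 1):
        try:
            return send()
        except (ConnectionError, TimeoutError) as exc:
            last_exception = exc
            if attempt < max_attempts:
                # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
                sleep(base_delay * 2 ** (attempt - 1))
    # All attempts failed: report the retry context and chain the last cause.
    raise ThreadsAPIError(
        f"Request failed after {max_attempts} attempts: {last_exception}"
    ) from last_exception
```

Raising a single typed error here would also mean callers such as `validate_token()` never see a raw transport exception.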
```python
    # Step 0: validate the token before calling the API
    print_substep("Checking access token...")
    try:
        user_info = client.validate_token()
```
Token validation/refresh is now performed in preflight_check() (called from main.py / scheduler.pipeline). Doing another validate+refresh here adds extra API calls and duplicates error-handling logic. Consider relying on preflight_check() for the pipeline path and keeping this as an optional/standalone safeguard (e.g., behind a flag or only when get_threads_posts() is used outside the normal entrypoints).
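The "behind a flag" option could be sketched roughly as follows. Everything here is hypothetical: the `validate_token` keyword argument, the `fetch_posts` method, and the `FakeClient` class exist only to show the call order and do not reflect the real `get_threads_posts()` signature.

```python
class FakeClient:
    """Hypothetical stand-in that records which calls were made."""

    def __init__(self):
        self.calls = []

    def validate_token(self):
        self.calls.append("validate_token")
        return {"id": "1", "username": "demo"}

    def fetch_posts(self):  # hypothetical method name
        self.calls.append("fetch_posts")
        return [{"id": "p1"}]


def get_threads_posts(client, validate_token=False):
    """Fetch posts; validate first only when the caller has not run preflight."""
    if validate_token:
        client.validate_token()
    return client.fetch_posts()
```

With the default, the normal entrypoints rely on the preflight check alone; a caller outside those paths can pass `validate_token=True` to get the extra safeguard.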
```python
    # Preflight: check the access token before calling the API
    from utils.check_token import preflight_check

    preflight_check()
```
preflight_check() calls sys.exit(1) on failure (raises SystemExit). In scheduled mode, this will terminate the whole APScheduler process because SystemExit is not caught by scheduled_job() (it only catches Exception). Consider catching SystemExit around preflight_check() here and converting it into a handled failure (e.g., log + return None / raise a regular Exception) so the scheduler can continue running future jobs.
```diff
-    preflight_check()
+    try:
+        preflight_check()
+    except SystemExit as e:
+        print_substep("❌ Preflight check failed.", style="bold red")
+        raise RuntimeError("Preflight check failed") from e
```
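The reason `except Exception` does not help is that `SystemExit` derives from `BaseException`, not `Exception`. The sketch below demonstrates the difference with a stand-in `preflight_check()` that always exits; the two `run_job_*` functions are hypothetical and only approximate what a scheduled job wrapper might do.

```python
import sys


def preflight_check():
    """Stand-in that always fails, mimicking an exit on a bad token."""
    sys.exit(1)


def run_job_unguarded():
    try:
        preflight_check()
    except Exception:
        # Never reached: SystemExit is not an Exception subclass.
        return "handled"


def run_job_guarded():
    try:
        try:
            preflight_check()
        except SystemExit as exc:
            # Convert the exit into an ordinary, catchable failure.
            raise RuntimeError("Preflight check failed") from exc
    except Exception as exc:
        return f"handled: {exc}"
```

Calling `run_job_unguarded()` lets `SystemExit` escape to the caller, while `run_job_guarded()` returns normally, which is the behaviour a long-running scheduler needs.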
```python
def _call_me_endpoint(access_token: str) -> dict:
    """GET /me?fields=id,username&access_token=… with minimal retry."""
```
The docstring says _call_me_endpoint() uses “minimal retry”, but the implementation makes a single request with no retry/backoff. Please either implement the retry (consistent with threads_client._get) or adjust the docstring to match behavior.
```diff
-    """GET /me?fields=id,username&access_token=… with minimal retry."""
+    """GET /me?fields=id,username&access_token=… with a single request."""
```
The bot crashes with an opaque "No threads found" error when the Threads API access token is expired or invalid. The root cause is never surfaced: the token silently fails, the API returns empty data, and the code raises a generic `ValueError` deep in the pipeline.

Changes

New: `utils/check_token.py`

Standalone preflight checker that runs before any Threads pipeline work begins:

- Checks that `access_token` and `user_id` are present in config
- Calls the `/me` endpoint to verify token validity
- Attempts a refresh via the `th_refresh_token` grant if validation fails
- Compares the configured `user_id` against the token owner (warns on mismatch)

Can also run standalone: `python -m utils.check_token`

`threads/threads_client.py`

- `ThreadsAPIError` exception class for structured API error handling
- `_get()`: retry with backoff for transient `ConnectionError`/`Timeout`, request timeout, detection of error-in-200-body (Meta Graph API pattern), explicit 401/403 handling
- `validate_token()` and `refresh_token()` methods on `ThreadsClient`
- Removes the unused `Authorization` header (Threads API authenticates via query param)

`main.py`

- Calls `preflight_check()` before mode dispatch (Threads mode only, skipped for `--reddit`)
- `ThreadsAPIError` caught separately in top-level handler with targeted troubleshooting steps

`scheduler/pipeline.py`

- Calls `preflight_check()` at pipeline start so scheduled jobs validate the token each run
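The preflight steps listed for `utils/check_token.py` can be sketched as a small function. This is an illustration under stated assumptions, not the PR's code: the name `preflight`, the injected `call_me` and `refresh` callables, and the stand-in `ThreadsAPIError` are hypothetical, and the sketch raises on failure where the real checker is described as exiting the process.

```python
class ThreadsAPIError(Exception):
    """Stand-in for the PR's exception class."""


def preflight(config, call_me, refresh):
    """Return ``(token, warnings)`` or raise ThreadsAPIError.

    ``call_me(token)`` returns the /me payload or raises ThreadsAPIError;
    ``refresh(token)`` returns a new token or raises ThreadsAPIError.
    """
    token = config.get("access_token")
    user_id = config.get("user_id")
    if not token or not user_id:
        raise ThreadsAPIError("access_token and user_id must be set in config")

    try:
        me = call_me(token)
    except ThreadsAPIError:
        # Validation failed: try one refresh, then validate the new token.
        token = refresh(token)
        me = call_me(token)

    warnings = []
    if str(me.get("id")) != str(user_id):
        # A mismatch is reported as a warning, not a hard failure.
        warnings.append(
            f"user_id {user_id} does not match token owner {me.get('id')}"
        )
    return token, warnings
```

Passing the network calls in as arguments keeps the sketch runnable without credentials and makes each branch easy to exercise in a test.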