feat: Threads Keyword Search API — search by keyword and topic tag #7
- Add keyword_search() method to ThreadsClient with full parameter support (q, search_type, search_mode, media_type, since, until, limit, author_username)
- Add keyword_search constants (SEARCH_TYPE_*, SEARCH_MODE_*, SEARCH_MEDIA_*, KEYWORD_SEARCH_FIELDS)
- Add _get_keyword_search_content() helper for video pipeline integration
- Add 'keyword_search' as new content source option with fallback chain
- Add search config fields to .config.template.toml (search_query, search_type, search_mode, search_media_type)
- Update module and class docstrings

Agent-Logs-Url: https://github.com/thaitien280401-stack/RedditVideoMakerBot/sessions/c6cae05a-91f1-4ab3-abd3-22dba1f74f6d
Co-authored-by: thaitien280401-stack <271128961+thaitien280401-stack@users.noreply.github.com>

- Change limit validation from 0-100 to 1-100
- Fix negative slice index when search_query exceeds title length
- Use hashlib.md5 for deterministic fallback ID generation
- Simplify redundant None check for results

Agent-Logs-Url: https://github.com/thaitien280401-stack/RedditVideoMakerBot/sessions/c6cae05a-91f1-4ab3-abd3-22dba1f74f6d
Co-authored-by: thaitien280401-stack <271128961+thaitien280401-stack@users.noreply.github.com>
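
The deterministic-ID fix in this commit can be illustrated in isolation. The sketch below is not the PR's code: `fallback_thread_id` and the `kw_` prefix are hypothetical names. It shows why `hashlib.md5` yields a stable ID where Python's built-in `hash()` would not, since string hashes are randomized per process unless `PYTHONHASHSEED` is fixed.

```python
import hashlib


def fallback_thread_id(text: str, prefix: str = "kw_") -> str:
    """Derive a stable ID from post text (hypothetical helper, not the PR's code)."""
    # md5 serves as a content fingerprint here, not as a security measure.
    digest = hashlib.md5(text.encode("utf-8")).hexdigest()
    # Twelve hex characters keep the ID short; the length is an arbitrary choice.
    return f"{prefix}{digest[:12]}"


if __name__ == "__main__":
    # The same input produces the same ID on every run and on every machine.
    print(fallback_thread_id("example post text"))
```

Because the ID depends only on the text, re-running the pipeline on the same post maps to the same history entry, which is presumably what the deduplication logic relies on.
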
Pull request overview
Adds support for the official Threads Keyword Search API as an additional content source, enabling the pipeline to fetch and build videos from public Threads posts discovered via keyword or topic tag search.
Changes:
- Introduces `ThreadsClient.keyword_search()` with parameter validation for enum fields, limit bounds, and `author_username` normalization.
- Adds a new `keyword_search` content source path (`_get_keyword_search_content`) and integrates it into `get_threads_posts()` with fallback to `trending` then `user`.
- Updates the config template to expose `keyword_search` source selection and related search parameters.
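
The validation described in the overview (enum fields, limit bounds, `author_username` normalization) might look roughly like the following. This is a minimal sketch under assumptions: `build_search_params` and the three constant sets are hypothetical stand-ins for the PR's `SEARCH_TYPE_*`, `SEARCH_MODE_*` and `SEARCH_MEDIA_*` constants, and the real method may validate differently.

```python
from typing import Dict, Optional

# Hypothetical stand-ins for the PR's SEARCH_TYPE_*, SEARCH_MODE_* and
# SEARCH_MEDIA_* constants; the allowed values are taken from the PR description.
SEARCH_TYPES = frozenset({"TOP", "RECENT"})
SEARCH_MODES = frozenset({"KEYWORD", "TAG"})
MEDIA_TYPES = frozenset({"TEXT", "IMAGE", "VIDEO"})


def build_search_params(
    q: str,
    search_type: str = "TOP",
    search_mode: str = "KEYWORD",
    media_type: Optional[str] = None,
    limit: Optional[int] = None,
    author_username: Optional[str] = None,
) -> Dict[str, str]:
    """Validate inputs and return query parameters (illustrative only)."""
    if not q or not q.strip():
        raise ValueError("q must be a non-empty search term")
    if search_type not in SEARCH_TYPES:
        raise ValueError(f"search_type must be one of {sorted(SEARCH_TYPES)}")
    if search_mode not in SEARCH_MODES:
        raise ValueError(f"search_mode must be one of {sorted(SEARCH_MODES)}")

    params = {"q": q.strip(), "search_type": search_type, "search_mode": search_mode}

    if media_type is not None:
        if media_type not in MEDIA_TYPES:
            raise ValueError(f"media_type must be one of {sorted(MEDIA_TYPES)}")
        params["media_type"] = media_type

    if limit is not None:
        # The follow-up commit tightened the lower bound from 0 to 1.
        if not 1 <= limit <= 100:
            raise ValueError("limit must be between 1 and 100")
        params["limit"] = str(limit)

    if author_username:
        # Normalize "@alice" to "alice"; skip the filter if nothing is left.
        name = author_username.strip().lstrip("@")
        if name:
            params["author_username"] = name

    return params
```

Failing fast on bad enum values keeps an invalid request from spending one of the daily search queries.
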
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| utils/.config.template.toml | Adds keyword_search source option and config keys (search_query, search_type, search_mode, search_media_type). |
| threads/threads_client.py | Implements Keyword Search API client method and pipeline integration to select a thread + fetch replies, with fallback behavior. |

    # Keyword search rate limit: 2,200 queries / 24 hours
    _KEYWORD_SEARCH_RATE_LIMIT = 2200

_KEYWORD_SEARCH_RATE_LIMIT is declared but never used anywhere in the module. If you don’t plan to enforce/track quota client-side, consider removing this constant to avoid dead code; otherwise, wire it into a quota-check/logging mechanism so it stays accurate and discoverable.
Suggested change:

    - # Keyword search rate limit: 2,200 queries / 24 hours
    - _KEYWORD_SEARCH_RATE_LIMIT = 2200

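
If the constant is kept rather than removed, one way to wire it into a client-side check is a small sliding-window counter. The sketch below is an assumption-heavy illustration: `KeywordSearchQuota` is a hypothetical class, and it assumes the 2,200-query limit applies to a rolling 24-hour window, which the PR does not confirm.

```python
import time
from collections import deque
from typing import Callable, Deque

_KEYWORD_SEARCH_RATE_LIMIT = 2200  # value quoted in the PR
_WINDOW_SECONDS = 24 * 60 * 60     # assumption: rolling 24-hour window


class KeywordSearchQuota:
    """Track recent queries in memory (hypothetical helper, not in the PR)."""

    def __init__(
        self,
        limit: int = _KEYWORD_SEARCH_RATE_LIMIT,
        window: float = _WINDOW_SECONDS,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._limit = limit
        self._window = window
        self._clock = clock
        self._calls: Deque[float] = deque()

    def _prune(self) -> None:
        # Drop timestamps that have aged out of the window.
        cutoff = self._clock() - self._window
        while self._calls and self._calls[0] <= cutoff:
            self._calls.popleft()

    def remaining(self) -> int:
        self._prune()
        return self._limit - len(self._calls)

    def record(self) -> None:
        """Register one query, or raise if the local budget is exhausted."""
        self._prune()
        if len(self._calls) >= self._limit:
            raise RuntimeError("keyword search quota exhausted for this window")
        self._calls.append(self._clock())
```

A caller would invoke `record()` just before each search request. The counter lives in memory only, so it resets when the process restarts; persisting it would need extra storage.
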

    since: Start date (Unix timestamp or strtotime string).
        Must be >= 1688540400 and < ``until``.
    until: End date (Unix timestamp or strtotime string).
        Must be <= the current time and > ``since``.

The since/until parameters are typed as Optional[str], but the docstring says they can be Unix timestamps (ints) or strtotime-like strings, and also documents numeric constraints (e.g., >= 1688540400, comparisons with until). Either update the type hints to accept both (e.g., Union[int, str]) and validate when numeric, or adjust the docstring to avoid stating constraints that the method doesn’t enforce.
Suggested change:

    - since: Start date (Unix timestamp or strtotime string).
    -     Must be >= 1688540400 and < ``until``.
    - until: End date (Unix timestamp or strtotime string).
    -     Must be <= the current time and > ``since``.
    + since: Start time for filtering results, passed as a string
    +     and sent unchanged to the API.
    + until: End time for filtering results, passed as a string
    +     and sent unchanged to the API.


    # results is guaranteed non-empty here (checked earlier)
    thread = results[0]

If none of the search results pass the filtering criteria (non-empty text, not blocked, not a reply, title not used), the code falls back to results[0]. That can reintroduce blocked/duplicate/reply content and contradicts the earlier selection logic. Consider returning None here (to trigger the configured fallback chain) or only falling back to a result that still satisfies the filters.
Suggested change:

    - # results is guaranteed non-empty here (checked earlier)
    - thread = results[0]
    + print_substep(
    +     "⚠️ No thread matched after applying the Keyword Search filters.",
    +     style="bold yellow",
    + )
    + return None

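
The selection behaviour the comment asks for, returning the first result that passes every filter and `None` otherwise, can be sketched independently of the pipeline. `pick_thread` is a hypothetical function, and the `text` and `is_reply` keys are assumed field names for a search result, not confirmed by the PR.

```python
from typing import Dict, Iterable, List, Optional, Set


def pick_thread(
    results: List[Dict],
    blocked_words: Iterable[str],
    used_titles: Set[str],
) -> Optional[Dict]:
    """Return the first result that passes all filters, else None (illustrative)."""
    blocked = [word.lower() for word in blocked_words]
    for thread in results:
        text = (thread.get("text") or "").strip()
        if not text:
            continue  # empty or missing text
        if thread.get("is_reply"):
            continue  # replies are not used as the main thread
        if any(word in text.lower() for word in blocked):
            continue  # contains a blocked word
        if text in used_titles:
            continue  # already turned into a video
        return thread
    # No unfiltered fallback to results[0]: the caller moves on to the next source.
    return None
```

Returning `None` rather than `results[0]` keeps the filters authoritative and lets the configured fallback chain take over.
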
Integrates the official Threads Keyword Search API (`GET /{user-id}/threads_keyword_search`) as a new content source, enabling search of public Threads posts by keyword or topic tag with full parameter support.

API method
- `ThreadsClient.keyword_search()`: params `q`, `search_type` (TOP/RECENT), `search_mode` (KEYWORD/TAG), `media_type` (TEXT/IMAGE/VIDEO), `since`/`until`, `limit` (1–100), `author_username`
- Strips `@` from username
- Requires `threads_basic` + `threads_keyword_search` permissions; 2,200 queries/24h rate limit

Pipeline integration
- `_get_keyword_search_content()`: selects the best thread from search results (dedup via title history, blocked word filtering), then fetches replies via the Conversation API
- `keyword_search` added as a content source in `get_threads_posts()` with fallback chain: `keyword_search → trending → user`

Config
New fields in `[threads.thread]`:
- `source` now accepts `"keyword_search"`
- `search_query`: the search term (required when source is `keyword_search`)
- `search_type`: `TOP` (default) or `RECENT`
- `search_mode`: `KEYWORD` (default) or `TAG`
- `search_media_type`: optional filter, one of `TEXT`, `IMAGE`, `VIDEO`
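
The fallback chain named above can be pictured as trying each source in order until one returns content. The following is a rough sketch, not the PR's `get_threads_posts()`: `first_available`, `KEYWORD_SEARCH_CHAIN` and the fetcher callables are hypothetical, and it assumes each source signals failure by returning `None` or an empty value.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

# Order taken from the PR description: keyword_search, then trending, then user.
KEYWORD_SEARCH_CHAIN = ("keyword_search", "trending", "user")

Fetcher = Callable[[], Optional[dict]]


def first_available(
    chain: Sequence[str],
    fetchers: Dict[str, Fetcher],
) -> Optional[Tuple[str, dict]]:
    """Try each source in order; return (source_name, content) for the first hit."""
    for name in chain:
        fetcher = fetchers.get(name)
        if fetcher is None:
            continue  # source not configured
        content = fetcher()
        if content:
            return name, content
    return None  # every source came back empty


if __name__ == "__main__":
    # Stand-in fetchers: keyword search finds nothing, trending succeeds.
    demo = {
        "keyword_search": lambda: None,
        "trending": lambda: {"thread_id": "demo"},
        "user": lambda: {"thread_id": "never reached"},
    }
    print(first_available(KEYWORD_SEARCH_CHAIN, demo))
```

Returning the source name alongside the content makes it easy to log which step of the chain actually produced the video.
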