/research <topic> fans out to 20 sources in parallel: arXiv · Semantic Scholar · OpenAlex · HuggingFace Papers · alphaXiv · Google Scholar · HackerNews · GitHub · Reddit · StackOverflow · Google News · Polymarket · SEC EDGAR · Tavily · Brave · Twitter/X · Zhihu (知乎) · Bilibili (B站) · Weibo (微博) · Xiaohongshu (小红书). 13 sources work zero-config; 7 are optional (they need keys or cookies).
Engagement-weighted ranking: each source's native signal (HN points, GitHub stars, Reddit upvotes, citations, HF upvotes, Bilibili plays, Weibo likes, Xiaohongshu likes, Twitter likes, Polymarket USD volume) is log-normalized against a per-source calibration to a shared 0-1 scale, then blended with a recency bonus that has a 14-day half-life. Cross-source dedup by URL keeps the highest-engagement entry among duplicates.
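The scoring described above can be sketched roughly as follows. This is a minimal illustration, not the project's ranker: the `CALIBRATION` values, the `recency_weight` blend factor, the `Result` fields, and the choice to compare duplicates by normalized (rather than raw) engagement are all assumptions made for the example. Only the log-normalization, the 14-day half-life, and the keep-the-best-duplicate rule come from the text.

```python
import math
from dataclasses import dataclass

HALF_LIFE_DAYS = 14.0  # stated in the feature description

# Hypothetical calibration: the raw engagement that maps to 1.0 for each source.
CALIBRATION = {"hackernews": 1000, "github": 50000, "reddit": 5000}


@dataclass
class Result:
    source: str
    url: str
    engagement: float  # HN points, GitHub stars, Reddit upvotes, ...
    age_days: float


def normalized_engagement(r: Result) -> float:
    """Log-normalize raw engagement onto 0-1 using the source's calibration."""
    cap = CALIBRATION[r.source]
    return min(1.0, math.log1p(max(r.engagement, 0.0)) / math.log1p(cap))


def recency_bonus(age_days: float) -> float:
    """Exponential decay: 1.0 for a brand-new item, 0.5 after one half-life."""
    return 0.5 ** (max(age_days, 0.0) / HALF_LIFE_DAYS)


def score(r: Result, recency_weight: float = 0.3) -> float:
    """Blend engagement and recency; the 0.3 weight is an assumed value."""
    return ((1 - recency_weight) * normalized_engagement(r)
            + recency_weight * recency_bonus(r.age_days))


def dedup_by_url(results: list[Result]) -> list[Result]:
    """Keep one entry per URL, preferring the higher normalized engagement."""
    best: dict[str, Result] = {}
    for r in results:
        key = r.url.rstrip("/")  # naive URL canonicalization (assumption)
        if key not in best or normalized_engagement(r) > normalized_engagement(best[key]):
            best[key] = r
    return list(best.values())
```

Normalizing before comparing duplicates avoids treating 4,000 Reddit upvotes and 4,000 GitHub stars as equivalent, which is the reason a per-source calibration is needed in the first place.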
Time range filter — --range 1d|3d|7d|14d|30d|60d|90d|6m|1y|2y|5y|all (or natural 30days, 6months, 2years) and explicit --since YYYY-MM-DD --until YYYY-MM-DD. Each source translates the window to its native filter: arXiv submittedDate:[...], Semantic Scholar year=LO-HI, OpenAlex from_publication_date:..., HN numericFilters=created_at_i>..., GitHub pushed:>..., Reddit t=hour|day|week|month|year|all, StackOverflow fromdate=/todate=, Google News after:/before:, SEC EDGAR dateRange=custom, Tavily start_published_date, Brave freshness=pd|pw|pm|py, Twitter v2 start_time/end_time, Google Scholar client-side year filter, HuggingFace / Bilibili / Weibo client-side. Polymarket and Zhihu have no date filter API and are documented as exceptions.
Cross-platform attention table — every brief renders a Markdown table: per-platform result count · top engagement label · median result age · domain. Skipped/failed sources appear too with clear reasons. The LLM synthesis prompt copies this table verbatim and adds 2-3 sentences comparing attention distribution (academic-heavy vs. social-heavy vs. news-heavy).
Publication trend sparkline + 12-month bar chart — a compact Unicode sparkline (▁▂▃▄▅▆▇█) across the last 24 months in the brief header; a full per-month bar chart lower down. Built from ALL dated results across academic/news/social sources, giving a single-glance view of where the buzz has moved.
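A sparkline of this kind can be built in a few lines. The sketch below is one plausible way to do it, assuming results carry a `date`; the function names and the scale-to-peak rule are illustrative choices rather than the tool's confirmed behavior.

```python
from collections import Counter
from datetime import date

BARS = "▁▂▃▄▅▆▇█"


def month_buckets(dates: list[date], today: date, months: int = 24) -> list[int]:
    """Count results per calendar month for the last `months` months, oldest first."""
    counts = Counter((d.year, d.month) for d in dates)
    out = []
    y, m = today.year, today.month
    for _ in range(months):
        out.append(counts.get((y, m), 0))
        m -= 1
        if m == 0:  # step back across a year boundary
            y, m = y - 1, 12
    return out[::-1]


def sparkline(values: list[int]) -> str:
    """Map each count to one of eight bar glyphs, scaled to the largest count."""
    peak = max(values, default=0)
    if peak == 0:
        return BARS[0] * len(values)
    return "".join(BARS[round(v / peak * (len(BARS) - 1))] for v in values)


dates = [date(2025, 1, 5), date(2025, 1, 20), date(2024, 12, 1)]
print(sparkline(month_buckets(dates, today=date(2025, 1, 31))))
```

Scaling to the peak month keeps the shape readable regardless of how many results a topic returns, at the cost of hiding absolute volume, which the full bar chart is there to show.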
Notable-citer analysis (--citations) — secondary Semantic Scholar calls on top academic results, pulling citing-paper authors and filtering to those with ≥10k total citations (configurable via --citation-threshold). Surfaces a table with name · affiliation · total cites · h-index · which papers they cited. Adds 2-10 API calls per run; recommended to pair with SEMANTIC_SCHOLAR_API_KEY to escape the anonymous 100-req/5-min limit.
Entity extraction — offline, zero-LLM pattern-matching that scans every pulled result for frequent named entities across four categories: models (GPT-5, Claude-Opus-5, Llama-4, Gemini-2.5-Pro, GLM-5.1, Qwen-3, DeepSeek-V3, Grok, Mistral, Phi, Yi, Kimi, …), benchmarks (MMLU, MMLU-Pro, GSM8K, MATH, HumanEval, HumanEval+, SWE-bench, LiveCodeBench, MMMU, MathVista, GAIA, AgentBench, WebArena, Arena-Hard, FrontierMath, ARC-AGI, GPQA-Diamond, HLE, C-Eval, CMMLU, RULER, LongBench, …), orgs (OpenAI, Anthropic, Google DeepMind, Meta, xAI, Mistral AI, DeepSeek, Moonshot, Alibaba, Zhipu, Tencent, ByteDance, Hugging Face, NVIDIA, 01.AI, AI2, Mila, Stanford, MIT, Berkeley, CMU, Tsinghua, …), and people (from academic result author fields). Counts dedupe within a single result so one spammy abstract doesn't skew the ranking. Renders as a "Top mentioned entities" section directly beneath the heat table — one glance answers "what's everyone talking about?" without the LLM round-trip.
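The per-result dedup rule is the key detail here, and a small sketch makes it concrete. The vocabularies below are a tiny subset chosen for illustration, and the regex-based matching is an assumption about how the pattern-matching might be done; the real lists and matcher are not shown in the text.

```python
import re
from collections import Counter

# Tiny illustrative vocabularies. Longer names come first in each alternation
# so that "MMLU-Pro" is not matched as plain "MMLU".
PATTERNS = {
    "models": re.compile(r"\b(GPT-5|Llama-4|Qwen-3|DeepSeek-V3)\b"),
    "benchmarks": re.compile(r"\b(MMLU-Pro|MMLU|GSM8K|SWE-bench)\b"),
}


def count_entities(texts: list[str]) -> dict[str, Counter]:
    """Count how many results mention each entity (at most once per result)."""
    counts = {category: Counter() for category in PATTERNS}
    for text in texts:
        for category, pattern in PATTERNS.items():
            # set() collapses repeated mentions inside one result
            for name in set(pattern.findall(text)):
                counts[category][name] += 1
    return counts


results = [
    "GPT-5 beats GPT-5 on MMLU-Pro",
    "MMLU and GPT-5",
]
top = count_entities(results)
print(top["models"].most_common(3))
```

Because each result contributes at most one count per entity, the totals measure how many results mention a name rather than how often one abstract repeats it.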
Multi-query expansion (--expand or --expand N) — asks the active model to propose 2-6 sibling subqueries (different angles — theory vs. tooling vs. industry deployment vs. controversy — not paraphrases), then runs each in parallel across all sources with proportionally reduced per-source limits. Results merge into the main pipeline (dedup + rank + synth). Example: /research --expand "frontier LLM benchmarks" auto-expands to LLM evaluation methodology, benchmark saturation and contamination, capability measurement frontier models, human preference benchmarks evaluation. Coverage jumps several-fold for broad topics.
Side-by-side compare: /research compare "topic A" vs "topic B" [vs "topic C"] runs 2 or 3 independent research queries in parallel and produces a unified comparative brief: verdict at a glance · side-by-side heat tables · shared themes · unique strengths per topic · open questions. Citations use prefixed [A-N] / [B-N] / [C-N] markers so readers can trace every claim back to the right topic's evidence pool. Falls back to a deterministic no-LLM rendering with every topic's heat table and entity table when no model is configured.
Auto-save to ~/.cheetahclaws/research_reports/ — every /research and /research compare run writes two files: <YYYY-MM-DD_HHMMSS>-<slug>.md (rendered brief) + .json sidecar (serialized Brief + notable citers + entities). Opt out with --no-save. Explicit export via --save-as PATH. New /reports command: list (50 most recent) · open <id> (print) · path <id> (print file path) · delete <id>.
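The file naming scheme above could be produced along these lines. The slug rule (lowercase ASCII letters and digits joined by hyphens, capped at 60 characters) and the function names are assumptions for the example; only the directory, the timestamp layout, and the `.md` plus `.json` pairing come from the text.

```python
import re
from datetime import datetime
from pathlib import Path

REPORT_DIR = Path.home() / ".cheetahclaws" / "research_reports"


def slugify(topic: str, max_len: int = 60) -> str:
    """Hypothetical slug rule: ASCII alphanumerics joined by single hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")
    # Topics with no ASCII alphanumerics fall back to a fixed name.
    return slug[:max_len].rstrip("-") or "report"


def report_paths(topic: str, when: datetime) -> tuple[Path, Path]:
    """Return (markdown_path, json_sidecar_path) sharing one timestamped stem."""
    stem = f"{when:%Y-%m-%d_%H%M%S}-{slugify(topic)}"
    return REPORT_DIR / f"{stem}.md", REPORT_DIR / f"{stem}.json"


md_path, json_path = report_paths("Frontier LLM benchmarks!", datetime(2025, 1, 2, 3, 4, 5))
print(md_path.name, json_path.name)
```

Sharing the stem between the two files means a report id can be resolved to both the rendered brief and its sidecar without an index.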
Weekly trend tracking via /monitor — new topic prefix research:<query> (or research:<range>:<query> — e.g. research:30d:RLHF) dispatches to the full 20-source pipeline each scheduled run. Supports daily/weekly/12h/... schedules and --telegram/--slack/console channels. Each invocation: pulls all 20 sources · filters by the subscription's time window · renders the cross-platform heat table + sparkline as the first digest item · writes a full report · pushes to configured channels. Subscribe via /subscribe research:<topic> weekly or the /monitor wizard's new "Trend tracker" option.
/ssj wizard integration — 3 new menu items for zero-flag operation:
16. 🔍 Research — asks topic → time range (1-5) → citations y/N → runs /research with right flags
17. 📊 Trend Track — asks topic → tracking window → frequency → creates the /subscribe research:<range>:<topic> subscription
Weibo (微博): m.weibo.cn getIndex endpoint; requires WEIBO_COOKIE (browser-extracted SUB; SUBP); returns posts with like/repost/comment (赞/转/评) engagement. Parses relative Chinese time forms: 刚刚 (just now), 5分钟前 (5 minutes ago), 2小时前 (2 hours ago), 今天 HH:MM (today HH:MM), and MM-DD.
Xiaohongshu (小红书): edith.xiaohongshu.com notes search; requires XHS_COOKIE (and often XHS_X_S); returns notes with like/comment/favorite (赞/评/收藏) engagement. Note: Xiaohongshu's anti-bot measures are aggressive, and cookies may expire hourly. Fallback: use --sources tavily with <query> site:xiaohongshu.com.
Architecture:
research/ package: __init__.py, types.py, time_range.py, http.py, cache.py (24h SQLite at ~/.cheetahclaws/research_cache.db, keyed on source + query + limit + time range), classifier.py (keyword-based topic→domain routing, zero latency, zero LLM), ranker.py, aggregator.py, synthesizer.py, citations.py, entities.py, reports.py, sources/ (20 modules).