caiopmed13/research-agent


research-agent — Autonomous AI Research Pipeline

A self-running agent that scans GitHub, arXiv, Hacker News, Reddit and awesome-lists on a schedule, filters everything through an LLM against my actual projects, and delivers a short, ranked, actionable report — with ready-to-paste prompts for each finding.



The problem

Staying current across CV, ML, sports analytics, LLM agents and freelance leads means drowning in 100–200 new repos, papers and threads a week. Reading them is a part-time job. Most are noise.

research-agent does the reading. It collects the firehose, has an LLM score every item against my specific projects and goals, and hands me 3–15 things actually worth my time — each with a one-paragraph "why it matters" and a copy-paste prompt to act on it. Human stays in the loop on decisions; the machine does the triage.

How it works

┌──────────────┐   ┌───────────────┐   ┌──────────────┐   ┌────────────────┐   ┌──────────┐
│ fetch_sources │ → │ prepare_prompt │ → │ invoke_claude │ → │ ingest_findings │ → │ notify   │
│  (collect)    │   │  (+ memory)    │   │  (LLM score)  │   │ (parse+persist) │   │ (toast)  │
└──────────────┘   └───────────────┘   └──────────────┘   └────────────────┘   └──────────┘
       │                   │                                        │
   GitHub API          user profile                          reports/<date>.md
   arXiv Atom          active topics                         memory/*.md updated
   HN (Algolia)        dismissed list  ← anti-repetition
   Reddit JSON
   awesome diffs
  1. Collect: fetch_sources.py queries each source (GitHub Search, arXiv Atom, HN Algolia, Reddit JSON, awesome-list README diffs), applying per-source filters (minimum stars, recency, keyword in abstract, subreddit allow-list). This yields roughly 100–200 raw candidates per week.
  2. Contextualize: prepare_prompt.py builds the LLM prompt from a persistent memory of my projects, weighted topics, and a dismissed list, so the model does not re-surface things I already rejected.
  3. Score: Claude ranks every candidate from 0 to 5 (Is it relevant? Novel? Maintained? Does it ship code?) and writes a three-tier report (High / Medium / Low). Each High finding includes a ready-to-paste prompt for a downstream coding session.
  4. Persist: ingest_findings.py files the report, appends new findings to the inbox, appends rejections to the dismissed list (feeding step 2 on the next run), and logs telemetry.
  5. Notify: notify.ps1 fires a Windows toast and auto-opens the report when there is a high-priority hit.
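
The anti-repetition part of steps 2 and 4 can be sketched roughly as follows. This is a minimal illustration, not the code in prepare_prompt.py: the Candidate record, the helper names (normalize_url, drop_dismissed, build_prompt) and the prompt layout are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    # Hypothetical record shape; the real fetch_sources.py output may differ.
    title: str
    url: str
    source: str


def normalize_url(url: str) -> str:
    """Lowercase and strip a trailing slash so near-identical URLs compare equal."""
    return url.strip().lower().rstrip("/")


def drop_dismissed(candidates, dismissed_urls):
    """Return candidates whose URL is not on the dismissed list, preserving order."""
    seen = {normalize_url(u) for u in dismissed_urls}
    kept = []
    for c in candidates:
        key = normalize_url(c.url)
        if key not in seen:
            seen.add(key)  # also removes duplicates within the same run
            kept.append(c)
    return kept


def build_prompt(profile: str, candidates, dismissed_urls) -> str:
    """Assemble a plain-text scoring prompt from profile, rejections and candidates."""
    lines = ["## Profile", profile.strip(), ""]
    lines.append("## Previously dismissed (do not resurface)")
    lines += [f"- {u}" for u in dismissed_urls]
    lines += ["", "## Candidates (score each 0-5)"]
    lines += [
        f"{i}. [{c.source}] {c.title} <{c.url}>"
        for i, c in enumerate(candidates, 1)
    ]
    return "\n".join(lines)
```

Filtering before the prompt is built keeps rejected items out of the token budget, while still listing them in the prompt lets the model recognise near-duplicates that arrive under a different URL.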

Design decisions worth calling out

  • Near-zero dependencies. One library (PyYAML). Everything else — HTTP, JSON, regex, dates, file I/O — is Python stdlib. Nothing to break, nothing to patch, trivial to run anywhere.
  • Memory as plain Markdown. The agent's "state" is six human-readable .md files (profile, topics, pending, approved, dismissed, session log). I can read and edit them directly; no database, no opaque store.
  • Anti-repetition loop. Rejections are fed back into the next prompt, so the signal-to-noise ratio improves over time instead of repeating the same junk.
  • Cadence-aware collection. Heavy sources (GitHub + arXiv) run Monday, trending (HN + Reddit) Friday, awesome-list diffs monthly — so each run stays cheap and focused.
  • Encoding-hardened ingest. Detects UTF-8, UTF-16 (LE or BE, with BOM) and latin-1, because the > redirection operator in Windows PowerShell writes UTF-16. Left unhandled, that is the kind of bug that silently corrupts a pipeline.
  • Cost-controlled. $0.30–0.80 per run on Sonnet ($5–12/month); one config flag swaps to Haiku for ~$2/month.
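
As an illustration of the encoding-hardened ingest described above, BOM-first detection can be done with the standard library alone. The function name decode_llm_output is hypothetical, and the fallback order (BOM, then strict UTF-8, then latin-1) is an assumption rather than the project's confirmed logic.

```python
import codecs


def decode_llm_output(raw: bytes) -> str:
    """Decode bytes by BOM first, then strict UTF-8, then latin-1 as a last resort."""
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):].decode("utf-8")
    if raw.startswith(codecs.BOM_UTF16_LE) or raw.startswith(codecs.BOM_UTF16_BE):
        # The "utf-16" codec reads the BOM, picks the byte order and strips it.
        return raw.decode("utf-16")
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # latin-1 maps every byte to a code point, so this branch never raises.
        return raw.decode("latin-1")
```

Reading the file in binary mode and passing the bytes through a function like this avoids relying on the platform default encoding, which differs between Windows and other systems.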

Tech stack

Piece             Tech
Orchestration     Python 3.10+ (stdlib), Windows batch
Scheduling        Windows Task Scheduler (4 triggers)
LLM               Claude (via CLI), prompt-engineered scoring
Notifications     PowerShell + BurntToast (toast), MessageBox fallback
Config & memory   YAML sources, Markdown memory
Sources           GitHub Search API, arXiv Atom, HN Algolia, Reddit JSON, awesome-list git diffs
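
The cadence-aware collection (heavy sources on Monday, trending sources on Friday, awesome-list diffs monthly) could also be expressed in Python rather than only in Task Scheduler triggers. The sketch below is an assumption-laden illustration: the function name sources_due, the source labels, and the choice of the first day of the month for the monthly run are all hypothetical.

```python
from datetime import date


def sources_due(today: date) -> list[str]:
    """Return the source groups that should run on the given date."""
    due: list[str] = []
    if today.weekday() == 0:  # Monday: heavy sources
        due += ["github", "arxiv"]
    if today.weekday() == 4:  # Friday: trending sources
        due += ["hackernews", "reddit"]
    if today.day == 1:  # assumed monthly slot: first day of the month
        due.append("awesome")
    return due


if __name__ == "__main__":
    print(sources_due(date.today()))
```

Keeping the rule in one pure function makes it easy to check a given date without waiting for the scheduler to fire.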

Repository layout

memory/    persistent agent state (profile, topics, pending, approved, dismissed, log)
sources/   per-source YAML configs (topics, keywords, filters)
scripts/   the pipeline (fetch → prepare → invoke → ingest → notify)
reports/   dated, human-readable output reports
config.yaml  models, token caps, paths, flags

See OPERATIONS.md for full setup, scheduling and troubleshooting.

Running it

pip install -r requirements.txt     # just PyYAML
scripts/run_daily.bat               # run the full pipeline now
python scripts/fetch_sources.py --dry-run   # preview what would be collected

Built by Caio — Data Analyst & Data Lead. Runs unattended on a schedule; designed to save hours of manual research every week.

About

Autonomous, self-improving research agent: scans arXiv / GitHub Trending / HackerNews / Reddit on a schedule, triages findings with Claude, keeps a persistent file-based memory. Near-zero deps.
