Popular repositories
-
rag-eval-harness
Public. End-to-end RAG over Paul Graham essays with a hand-curated eval harness: pgvector, cross-encoder rerank, LLM-as-judge scoring. Includes A/B/C eval results and discussion of why rerank didn't help …
Python
-
langgraph-research-agent
Public. Stateful research agent: LangGraph + Tavily + Claude. Plans, searches, reads, drafts, self-critiques. Eval harness vs Sonnet+web_search baseline.
Python
-
mcp-automations
Public. Production-grade MCP server (FastMCP, Pydantic-typed tools, Resources, Prompts) with Claude Desktop + Streamlit playground integration. Deployable to Fly.io.
Python