- San Francisco Bay Area
- (UTC -07:00)
- in/achachlouei
Pinned
-
mrm-prompt-bench (Public)
A Python benchmark for comparing LLM prompting strategies on instruction-following document generation for Model Risk Management validation reports. It evaluates techniques like zero-shot, few-shot…
Jupyter Notebook
-
evidence-graph (Public)
A multi-agent AI research system that treats research as graph construction, explicitly surfacing contradictions to produce grounded, evidence-backed reports.
Python
-
eval-fabric (Public)
Pluggable evaluation orchestration framework for LLM and agentic systems. Defines versioned eval contracts, async runners, evaluator/judge plugins, reproducible traces, typed judgments, and OpenTel…
Python 1
-
rag-comparison (Public)
Educational comparison of KG-RAG, Hybrid RAG, and Agentic RAG patterns: same corpus, same questions, three pipelines.
Jupyter Notebook
