ARCS

Agentic Retrieval-Augmented Code Synthesis with Iterative Refinement

© 2025. Triad National Security, LLC. All rights reserved. Produced under U.S. Government contract 89233218CNA000001 for Los Alamos National Laboratory (LANL), operated by Triad National Security, LLC, for the U.S. Department of Energy / National Nuclear Security Administration. LANL Open-Source Disclosure Reference: O4936. See LICENSE for the full notice and the GPLv3 license text.

ARCS turns a question and a codebase into runnable, tested code. It indexes your library with a vector store, breaks the question into function-sized steps, generates code one step at a time with the relevant context retrieved from the index, and then executes a self-written test suite in an isolated sandbox — refining the code until the tests pass.
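
The generate, test, refine cycle described above can be sketched as a small loop. This is a minimal illustration, not the ARCS orchestrator: `refine_until_pass`, `RefineOutcome`, and the toy `generate` / `run_tests` callables are hypothetical names, and the toy test runner uses `exec` in-process purely for brevity, whereas ARCS executes tests in an isolated sandbox.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class RefineOutcome:
    """Hypothetical result type for this sketch (not the ARCS RunResult)."""
    code: str
    passed: bool
    iterations: int


def refine_until_pass(
    generate: Callable[[str, Optional[str]], str],
    run_tests: Callable[[str], Tuple[bool, str]],
    question: str,
    max_iterations: int = 3,
) -> RefineOutcome:
    """Generate code, test it, and feed failure reports back until tests pass."""
    feedback: Optional[str] = None
    code = ""
    for iteration in range(1, max_iterations + 1):
        code = generate(question, feedback)
        passed, report = run_tests(code)
        if passed:
            return RefineOutcome(code, True, iteration)
        feedback = report  # the next attempt sees the failure report
    return RefineOutcome(code, False, max_iterations)


# Toy stand-ins: the "model" fixes its bug once it receives feedback.
def toy_generate(question: str, feedback: Optional[str]) -> str:
    if feedback is None:
        return "def add(a, b):\n    return a - b\n"  # deliberately wrong
    return "def add(a, b):\n    return a + b\n"


def toy_run_tests(code: str) -> Tuple[bool, str]:
    namespace: dict = {}
    exec(code, namespace)  # illustration only; never exec untrusted code
    ok = namespace["add"](2, 3) == 5
    return ok, ("" if ok else "add(2, 3) should be 5")


outcome = refine_until_pass(toy_generate, toy_run_tests, "Write add(a, b)")
print(outcome.passed, outcome.iterations)
```

Capping the loop with `max_iterations` keeps a model that never converges from running indefinitely; the cap of 3 is an arbitrary choice for the sketch.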


Highlights

  • Three workflow presets: small (lookup), medium (generate), large (generate + test + refine).
  • Pluggable LLMs — OpenAI / OpenAI-compatible gateways (LiteLLM, SambaNova Cloud, vLLM) or local HuggingFace transformers.
  • Pluggable vector stores — embedded ChromaDB by default; Milvus via the milvus extra. Add your own by implementing one interface.
  • Sandboxed Python execution — per-collection venv, opt-in dependency installation behind a package allow-list, matplotlib artifact capture, 60s timeout.
  • Multi-language runners — C/C++, Java, Fortran, JavaScript, Python — bundled as LangChain tools for research workflows.
  • HTTP + WebSocket API — FastAPI server with the original /api/start, /api/upload, /api/collections, /ws contract.
  • React frontend under apps/web preserved verbatim.
  • CLI for one-shot runs, server launches, and codebase ingestion.
  • First-class evaluation — JSONL benchmark runner + assertion scorer used in the original paper.
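
To make the sandboxing bullet concrete, here is a rough sketch of two of its ingredients: a package allow-list check and a child-process run with a timeout. The names `ALLOWED_PACKAGES`, `filter_allowed`, and `run_with_timeout` are hypothetical and are not the functions in `execution/`; the allow-list contents are invented for illustration. A plain child process is also not a security boundary on its own, so treat this as the shape of the idea only.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical allow-list; the real list used by ARCS is not shown in this README.
ALLOWED_PACKAGES = {"numpy", "matplotlib"}


def filter_allowed(requested):
    """Split requested package names into (allowed, rejected) lists."""
    allowed = [p for p in requested if p.lower() in ALLOWED_PACKAGES]
    rejected = [p for p in requested if p.lower() not in ALLOWED_PACKAGES]
    return allowed, rejected


def run_with_timeout(source, timeout_s=60.0, python=sys.executable):
    """Run source in a child interpreter; return (exit_code, stdout, stderr).

    `python` would point at a per-collection venv interpreter in a real
    setup; here it defaults to the current interpreter.
    """
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "snippet.py"
        script.write_text(source, encoding="utf-8")
        try:
            proc = subprocess.run(
                [python, str(script)],
                cwd=workdir,           # keep relative file writes inside workdir
                capture_output=True,
                text=True,
                timeout=timeout_s,     # the child is killed when this expires
            )
        except subprocess.TimeoutExpired:
            return -1, "", f"timed out after {timeout_s}s"
        return proc.returncode, proc.stdout, proc.stderr


allowed, rejected = filter_allowed(["NumPy", "requests"])
exit_code, stdout, stderr = run_with_timeout("print(6 * 7)")
```

Returning a sentinel exit code of -1 on timeout is a choice made for this sketch; raising an exception would work equally well.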

Quick start

git clone https://github.com/lanl/arcs.git
cd arcs
python -m venv .venv && source .venv/bin/activate

# Pick the extras you need. The package core is just pydantic +
# python-dotenv + typing-extensions; everything else is optional.
pip install -e ".[langchain,chroma,ingest,server]"

cp .env.example .env                      # then edit OPENAI_API_KEY
arcs serve                                # FastAPI on :8000

In another terminal, launch the React UI:

cd apps/web
npm install
npm run dev                               # http://localhost:5173

Or use the CLI directly:

arcs ingest ./my_repo --collection my_repo --requirements
arcs run "Implement an LRU cache" --collection my_repo --system medium

Or embed the library:

import asyncio
from arcs import ARCs, Settings

async def main():
    arcs = ARCs.from_settings(Settings.load())
    result = await arcs.run(
        query="Use the joke library to print 5 random jokes",
        collection="jokes",
        system="medium",
    )
    print(result.final_output)

asyncio.run(main())

result is a typed RunResult (final_output, history, artifacts, events, iterations, saved_to).

Repository layout

arcs/
├── src/arcs/
│   ├── __init__.py              # Lazy public API: ARCs, Config, Settings,
│   │                            # Message, SocketMessage, RunResult, LLMClient
│   ├── version.py               # __version__
│   ├── config.py                # Settings (env-driven runtime config)
│   ├── cli.py                   # `arcs` console script (serve / run / ingest)
│   │
│   ├── core/                    # Orchestrator + typed contracts
│   │   ├── orchestrator.py      # ARCs.process(config) -> RunResult
│   │   ├── routes.py            # small / medium / large
│   │   ├── config.py            # Config / Message / SocketMessage
│   │   └── result.py            # RunResult
│   │
│   ├── workflows/               # LangGraph stages
│   │   ├── question_breakdown.py
│   │   ├── code_generation.py
│   │   ├── code_testing.py
│   │   └── agentic_selection.py # async worker pool
│   │
│   ├── llm/                     # LLMClient protocol + adapters
│   │   ├── protocol.py          # LLMClient (Protocol), LangChainLLM
│   │   ├── openai_chat.py       # SimpleChatOpenAI
│   │   ├── local.py             # LocalLLM, LocalFunctionCaller
│   │   └── factory.py           # build_llm(settings)
│   │
│   ├── data/                    # Connectors + memory + metadata generator
│   │   ├── connectors/          # BaseConnector + Chroma + Milvus
│   │   │                        # + LegacyClassmethodAdapter
│   │   ├── memory/              # InMemoryStore, MemoryConnector,
│   │   │                        # CodeMetadataGenerator (lazy)
│   │   ├── distributor.py       # DataDistributor (lazy)
│   │   └── factory.py           # build_connector(settings)
│   │
│   ├── execution/               # Sandboxed code execution
│   │   ├── venv.py              # create_venv, install_into_venv
│   │   ├── python_runner.py     # run_py_no_tool, ensure_dependencies
│   │   └── archive.py           # safe_extract (zip-slip safe)
│   │
│   ├── messaging/               # Event stream + WebSocket manager
│   ├── prompts/                 # prompts.json + typed PromptRegistry
│   ├── framework/               # Research agentic graph (Node/FunctionNode/Edge/AgenticSystem)
│   ├── cot/                     # Chain-of-Thought system (generate / generate_half)
│   ├── research/                # Multi-lang runners + cpp_analyzer + on-prem transformer
│   ├── tools/                   # Curated re-exports of LLM-callable tools
│   ├── evaluation/              # JSONL benchmark runner + assertion scorer
│   └── server/                  # FastAPI app (create_app, run)
│
├── apps/web/                    # React frontend (Vite + TS + Tailwind)
├── examples/                    # 5 runnable end-to-end demos
├── tests/                       # 38 pytest cases
├── docs/                        # Index, getting-started, architecture, api, deployment
├── Dockerfile                   # Multi-stage production image
├── docker-compose.yml           # Local dev convenience
├── Makefile                     # Common dev targets
├── pyproject.toml               # Packaging (extras: langchain, chroma, milvus,
│                                # ingest, server, onprem, analysis, dev, all)
├── requirements.txt             # Mirrors the `dev` extra
├── .env.example                 # Configuration template
├── CHANGELOG.md  CONTRIBUTING.md  LICENSE
└── .github/workflows/ci.yml     # CI: Python (3.10/3.11/3.12) + Frontend + Docker

A concise architecture overview is in docs/architecture.md.
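
The layout above describes `execution/archive.py` as zip-slip safe. The sketch below shows one common way to get that property, assuming a hypothetical `safe_extract_sketch` helper rather than the repository's actual `safe_extract`: resolve every member path and reject the archive if any member would land outside the destination. Python's `zipfile.extractall` already strips `..` components and leading slashes; validating up front rejects a suspicious archive instead of silently rewriting its paths.

```python
import io
import tempfile
import zipfile
from pathlib import Path


def safe_extract_sketch(archive: zipfile.ZipFile, dest: Path) -> list:
    """Extract archive into dest, refusing members that would escape it."""
    dest_root = dest.resolve()
    for member in archive.namelist():
        target = (dest_root / member).resolve()
        # A safe target is dest_root itself or something beneath it.
        if target != dest_root and dest_root not in target.parents:
            raise ValueError(f"unsafe path in archive: {member!r}")
    archive.extractall(dest_root)
    return archive.namelist()


def make_zip(names):
    """Build an in-memory zip whose members all contain the text 'data'."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in names:
            zf.writestr(name, "data")
    buf.seek(0)
    return zipfile.ZipFile(buf)


with tempfile.TemporaryDirectory() as tmp:
    extracted = safe_extract_sketch(make_zip(["ok/hello.txt"]), Path(tmp))
    content = (Path(tmp) / "ok" / "hello.txt").read_text()
    try:
        safe_extract_sketch(make_zip(["../evil.txt"]), Path(tmp))
        rejected_traversal = False
    except ValueError:
        rejected_traversal = True
```

Checking all members before extracting anything means a malicious archive leaves no partial output behind.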

Configuration

ARCS reads configuration from environment variables (and optionally a .env file). The full set is documented in .env.example; the most important keys are:

| Key | Purpose |
| --- | --- |
| ARCS_LLM_PROVIDER | openai (default) or local |
| OPENAI_API_KEY | OpenAI / OpenAI-compatible token |
| ARCS_LOCAL_BASE_URL | URL of your on-prem gateway when provider is local |
| ARCS_VECTOR_BACKEND | chroma (default) or milvus |
| ARCS_LIBRARIES_DIR | Where indexed codebases live (default: ./runtime/Libraries) |
| ARCS_OUTPUTS_DIR | Where saved runs go (default: ./runtime/Outputs) |
| ARCS_PORT | HTTP port (default: 8000) |
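
As an illustration of how the keys above could map onto a typed object, here is a minimal sketch of env-driven configuration. `SketchSettings` and `load_settings` are hypothetical and are not the `Settings` class in `config.py`; only the key names and defaults are taken from the table, and the validation rules are an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class SketchSettings:
    """Hypothetical stand-in for arcs.Settings, for illustration only."""
    llm_provider: str
    vector_backend: str
    libraries_dir: str
    outputs_dir: str
    port: int
    openai_api_key: Optional[str]
    local_base_url: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> SketchSettings:
    """Read the documented keys from env, applying the documented defaults."""
    provider = env.get("ARCS_LLM_PROVIDER", "openai")
    if provider not in {"openai", "local"}:
        raise ValueError(f"unknown ARCS_LLM_PROVIDER: {provider!r}")
    backend = env.get("ARCS_VECTOR_BACKEND", "chroma")
    if backend not in {"chroma", "milvus"}:
        raise ValueError(f"unknown ARCS_VECTOR_BACKEND: {backend!r}")
    return SketchSettings(
        llm_provider=provider,
        vector_backend=backend,
        libraries_dir=env.get("ARCS_LIBRARIES_DIR", "./runtime/Libraries"),
        outputs_dir=env.get("ARCS_OUTPUTS_DIR", "./runtime/Outputs"),
        port=int(env.get("ARCS_PORT", "8000")),
        openai_api_key=env.get("OPENAI_API_KEY"),
        local_base_url=env.get("ARCS_LOCAL_BASE_URL"),
    )


defaults = load_settings({})  # an empty mapping yields the defaults
custom = load_settings({"ARCS_VECTOR_BACKEND": "milvus", "ARCS_PORT": "9000"})
print(defaults.port, custom.vector_backend, custom.port)
```

Passing the mapping explicitly, rather than always reading `os.environ`, makes the loader easy to test without touching the process environment.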

Workflows

| Route | What it does | Use when |
| --- | --- | --- |
| small | vector search + one LLM call | docs lookup, "explain this" |
| medium | breakdown + iterative code generation | execution unsafe / not needed |
| large | breakdown + generation + test + refine | end-to-end implementation |

Each route publishes the same Message events to the in-process event stream. Transports then forward those events to the frontend (over WebSocket), the CLI (as JSON lines), or your own subscriber.
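
The publish-and-forward pattern described here can be sketched as a small in-process publisher with pluggable subscribers. `SketchMessage`, `EventStream`, and `json_lines_transport` are hypothetical names, and the message fields are invented for the example; the real `Message` type and messaging package may look quite different.

```python
import json
from dataclasses import asdict, dataclass
from typing import Callable, List


@dataclass
class SketchMessage:
    """Hypothetical event shape; not the ARCS Message contract."""
    route: str
    stage: str
    content: str


class EventStream:
    """In-process fan-out: every subscriber sees every published message."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[SketchMessage], None]] = []

    def subscribe(self, handler: Callable[[SketchMessage], None]) -> None:
        self._subscribers.append(handler)

    def publish(self, message: SketchMessage) -> None:
        # Iterate over a copy so handlers may subscribe during delivery.
        for handler in list(self._subscribers):
            handler(message)


json_lines: List[str] = []   # stands in for the CLI's JSON-lines output
stages_seen: List[str] = []  # stands in for a custom subscriber


def json_lines_transport(message: SketchMessage) -> None:
    json_lines.append(json.dumps(asdict(message)))


stream = EventStream()
stream.subscribe(json_lines_transport)
stream.subscribe(lambda m: stages_seen.append(m.stage))

stream.publish(SketchMessage("medium", "breakdown", "split into 2 steps"))
stream.publish(SketchMessage("medium", "generation", "step 1 done"))
print("\n".join(json_lines))
```

Because routes only publish and transports only subscribe, a new consumer can be added without touching the workflow code.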

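Documentation

The docs/ directory contains the documentation index plus the getting-started, architecture, API, and deployment guides.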

Development

make dev        # install with dev extras
make test       # pytest
make lint       # ruff
make type       # mypy (advisory)
make fmt        # ruff format + import sort
make docker     # build the production image
make web        # launch the Vite dev server

.[dev] intentionally excludes the Chroma / sentence-transformer stack so unit-test setup stays lightweight. Install .[chroma] separately when you need to exercise the embedded vector store.

The CI workflow (.github/workflows/ci.yml) runs the same targets across Python 3.10 / 3.11 / 3.12 plus the frontend build and a Docker image build.

Contributing

Contributions welcome — see CONTRIBUTING.md for the workflow.

License

ARCS is released under the GNU General Public License, version 3 or later (GPLv3+). See LICENSE for the full text.

Copyright notice

ARCS: Agentic Retrieval-Augmented Code Synthesis with Iterative Refinement © 2025. Triad National Security, LLC. All rights reserved.

This program was produced under U.S. Government contract 89233218CNA000001 for Los Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC for the U.S. Department of Energy / National Nuclear Security Administration. All rights in the program are reserved by Triad National Security, LLC, and the U.S. Department of Energy / National Nuclear Security Administration. The Government is granted for itself and others acting on its behalf a nonexclusive, paid-up, irrevocable worldwide license in this material to reproduce, prepare derivative works, distribute copies to the public, perform publicly and display publicly, and to permit others to do so.

(End of Notice)

LANL Open-Source Disclosure Reference: O4936 — please retain this reference in any redistributed README so copyright assertion can be confirmed.

How to cite ARCS?

@article{bhattarai2025arcs,
  title={ARCS: Agentic Retrieval-Augmented Code Synthesis with Iterative Refinement},
  author={Bhattarai, Manish and Cordova, Miguel and Vu, Minh and Santos, Javier and Boureima, Ismael and O'Malley, Dan},
  journal={arXiv preprint arXiv:2504.20434},
  year={2025}
}

Acknowledgements

Los Alamos National Lab (LANL), T-1