refactor: reorganize project layout for phase 13 #440
Move Dockerfile -> infra/docker/api.Dockerfile and Dockerfile.ray -> infra/docker/ray.Dockerfile, and entrypoint.sh -> infra/scripts/. The build context stays at the repo root, so the COPY for entrypoint becomes infra/scripts/entrypoint.sh and a build-context comment is added to each Dockerfile.
Relocate docker-compose.yaml, .env.example, .env.ollama, the Grafana and Prometheus configs (from openrag_metrics/) and milvus.yaml (from vdb/) under infra/compose/. The standalone metrics stack becomes monitoring.docker-compose.yaml. Update the compose file's internal paths: build context -> ../.. with dockerfile infra/docker/api.Dockerfile, the milvus include -> milvus/milvus.yaml, extern includes and bind-mount sources -> ../../ to resolve from the new location.
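The compose rewiring above can be sanity-checked with plain path arithmetic, since relative paths in the compose file resolve from the compose file's directory by default (that is, without a `--project-directory` override). A minimal sketch, assuming the file sits at `infra/compose/docker-compose.yaml`; `resolve_from_compose` and the `extern/example.yaml` include are hypothetical names used only for illustration, not part of the PR.

```python
import posixpath

# Assumed location of docker-compose.yaml after the move.
COMPOSE_DIR = "infra/compose"


def resolve_from_compose(relative: str, compose_dir: str = COMPOSE_DIR) -> str:
    """Return the repo-root-relative path that a compose-relative path points at."""
    return posixpath.normpath(posixpath.join(compose_dir, relative))


# Build context "../.." climbs two levels and lands on the repo root.
build_context = resolve_from_compose("../..")

# The milvus include is now a sibling directory of the compose file.
milvus_include = resolve_from_compose("milvus/milvus.yaml")

# Hypothetical extern include: "../../" takes it back to the repo root first.
extern_include = resolve_from_compose("../../extern/example.yaml")
```

Here `build_context` evaluates to `.` (the repo root), which is why the `dockerfile` entry has to be spelled out as `infra/docker/api.Dockerfile` relative to that root.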
Relocate ansible/ -> infra/ansible/, charts/ -> infra/charts/ and quick_start/ -> infra/quick_start/. quick_start is self-contained (its own vdb/ and extern/ subtrees), so its internal relative paths are unaffected.
Relocate the prompt templates from prompts/example1/*.txt to the package at openrag/prompts/templates/*.txt and add openrag/prompts/__init__.py, which reads them into DEFAULT_SEEDS (keyed by filename stem) at import time as the first-boot fallback for DB-stored per-partition overrides. The PathsConfig.prompts_dir default now resolves package-relative via __file__, so it works regardless of CWD. Drop the hardcoded prompts_dir from conf/config.yaml and the PROMPTS_DIR overrides from the Helm values and the api-test harness so the bundled default applies; repoint pytest's PROMPTS_DIR to openrag/prompts/templates. Both Dockerfiles no longer COPY prompts/ — the templates ship inside the package via COPY openrag/.
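The import-time seeding described above can be sketched as follows. This is an illustration under stated assumptions, not the actual contents of `openrag/prompts/__init__.py`: the helper name `load_default_seeds` is hypothetical, and only the behaviour named in the commit message (read `*.txt`, key by filename stem, resolve the directory from `__file__`) is taken from the source.

```python
from pathlib import Path


def load_default_seeds(templates_dir: Path) -> dict[str, str]:
    """Read every *.txt template in templates_dir into a dict keyed by filename stem."""
    return {
        path.stem: path.read_text(encoding="utf-8")
        for path in sorted(templates_dir.glob("*.txt"))
    }


# Inside a package __init__.py the directory would be resolved relative to the
# module file rather than the current working directory, for example:
#
#   DEFAULT_SEEDS = load_default_seeds(Path(__file__).parent / "templates")
```

Resolving from `__file__` is what makes the default independent of the CWD: the lookup follows the installed package, whether it runs from a checkout, a wheel, or a container image.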
Provide an ergonomic top-level ui/ path for the admin frontend. The indexer-ui submodule stays registered at its .gitmodules path under extern/; ui/ is a relative symlink so it resolves on any checkout.
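To illustrate why a relative symlink resolves on any checkout, here is a small sketch that creates one programmatically. The function name `link_ui` is hypothetical; the PR itself simply commits the symlink, and the sketch assumes a platform where unprivileged symlink creation is allowed.

```python
import os
from pathlib import Path


def link_ui(repo_root: Path, target: str = "extern/indexer-ui", name: str = "ui") -> Path:
    """Create repo_root/name as a relative symlink pointing at target."""
    link = repo_root / name
    if link.is_symlink():
        link.unlink()
    # The stored target is the relative string itself, interpreted from the
    # link's own directory, so no absolute checkout path is baked in.
    os.symlink(target, link, target_is_directory=True)
    return link
```

Because only the string `extern/indexer-ui` is stored, cloning the repository to a different location leaves the link valid as long as the submodule is initialised.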
Exclude infra/ and tests/load/ from ruff, and add a package-data entry so the prompt templates under openrag/prompts/templates/ (read at import time) ship inside the built wheel/sdist rather than being skipped as non-.py files.
Point the build workflows at infra/docker/api.Dockerfile and infra/docker/ray.Dockerfile, the helm workflow at infra/charts/openrag-stack, and the api-test harness build at infra/docker/api.Dockerfile. The integration/unit/lint/layer-guard workflows need no change: they target self-contained compose stacks or paths that did not move.
The Ray cluster config is deployment infrastructure; relocate it alongside the helm charts. The other root-level deployment files (Dockerfile, Dockerfile.ray, docker-compose.yaml, entrypoint.sh, .env.example, .env.ollama) were already relocated when infra/ was created.
Move the standalone data_indexer.py CLI from utility/ to scripts/ and drop the redundant utility/requirements.txt (httpx and loguru are already project dependencies). Repoint the helm .tgz ignore at infra/charts/ after the 13A move.
Add a Project Layout section to CLAUDE.md and point its Docker, prompt template, migration, and test-config references at the new locations. Update the README getting-started flow for infra/compose/.env.example and infra/quick_start, fix the helm/ansible and ray-cluster example paths, and mark PROMPTS_DIR optional in the doc env samples now that templates ship in the package.
📝 Walkthrough
This PR reorganizes the OpenRAG repository to consolidate infrastructure files under an …
Changes
Infrastructure Reorganization and Prompt Template Bundling
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
docs/content/docs/documentation/deploy_ray_cluster.md (1)
121-121: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
Update the Ray image build-file reference to the new Dockerfile path.
Line 121 still points to `Dockerfile.ray`, but this refactor moved it to `infra/docker/ray.Dockerfile`. Keeping the old name will send users to a non-existent file.
As per coding guidelines: “Docker build files in `infra/docker/` must be named with the `.Dockerfile` extension and built from the repository root”.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/content/docs/documentation/deploy_ray_cluster.md` at line 121, the docs still reference the old Dockerfile name "Dockerfile.ray"; update that string to the new filename "ray.Dockerfile" located under the infra/docker area so the docs point to the correct build file and follow the repo convention; locate and replace the "Dockerfile.ray" reference in the deploy_ray_cluster.md content (around the base image instruction) with "ray.Dockerfile" and ensure the text states the image must be built from the repository root as per guidelines.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@CLAUDE.md`:
- Around line 11-27: The fenced code block in CLAUDE.md lacks a language tag
which triggers markdownlint MD040; update the opening triple-backtick for that
block to include a language identifier (e.g., change ``` to ```text) so the
block becomes ```text ... ```, preserving the existing content and closing
backticks unchanged.
---
Outside diff comments:
In `@docs/content/docs/documentation/deploy_ray_cluster.md`:
- Line 121: The docs still reference the old Dockerfile name "Dockerfile.ray";
update that string to the new filename "ray.Dockerfile" located under the
infra/docker area so the docs point to the correct build file and follow the
repo convention; locate and replace the "Dockerfile.ray" reference in the
deploy_ray_cluster.md content (around the base image instruction) to
"ray.Dockerfile" and ensure the text states the image must be built from the
repository root as per guidelines.
⛔ Files ignored due to path filters (1)
- infra/charts/openrag-stack/Chart.lock is excluded by !**/*.lock
📒 Files selected for processing (65)
- .github/workflows/build.yml
- .github/workflows/build_dev.yml
- .github/workflows/helm.yaml
- .gitignore
- CLAUDE.md
- README.md
- conf/config.yaml
- docs/assets/env_example.env
- docs/assets/env_linux_gpu.env
- docs/content/docs/documentation/deploy_ray_cluster.md
- infra/ansible/README.md
- infra/ansible/ansible.cfg
- infra/ansible/deploy.sh
- infra/ansible/inventory.ini.example
- infra/ansible/playbooks/docker.yml
- infra/ansible/playbooks/main-playbook.yml
- infra/ansible/playbooks/nvidia-drivers-toolkit.yml
- infra/ansible/playbooks/openrag.yml
- infra/charts/openrag-stack/.helmignore
- infra/charts/openrag-stack/Chart.yaml
- infra/charts/openrag-stack/templates/_helpers.tpl
- infra/charts/openrag-stack/templates/configmap-env.yaml
- infra/charts/openrag-stack/templates/indexer-ui.yaml
- infra/charts/openrag-stack/templates/infinity.yaml
- infra/charts/openrag-stack/templates/ingress.yaml
- infra/charts/openrag-stack/templates/openrag.yaml
- infra/charts/openrag-stack/templates/pvc.yaml
- infra/charts/openrag-stack/templates/raycluster.yaml
- infra/charts/openrag-stack/templates/secrets-env.yaml
- infra/charts/openrag-stack/values.yaml
- infra/cluster.yaml
- infra/compose/.env.example
- infra/compose/.env.ollama
- infra/compose/docker-compose.yaml
- infra/compose/grafana/dashboards/gpu-metrics.json
- infra/compose/grafana/dashboards/openrag-http.json
- infra/compose/grafana/dashboards/system-overview.json
- infra/compose/grafana/provisioning/dashboards/dashboard.yml
- infra/compose/grafana/provisioning/datasources/datasource.yml
- infra/compose/milvus/milvus.yaml
- infra/compose/monitoring.docker-compose.yaml
- infra/compose/prometheus/prometheus.yml
- infra/docker/api.Dockerfile
- infra/docker/ray.Dockerfile
- infra/quick_start/docker-compose.yaml
- infra/quick_start/extern/infinity.yaml
- infra/quick_start/extern/vllm/Dockerfile.cpu
- infra/quick_start/vdb/milvus.yaml
- infra/scripts/entrypoint.sh
- openrag/core/config/infrastructure.py
- openrag/core/prompts/query_rewriter.py
- openrag/prompts/__init__.py
- openrag/prompts/templates/chunk_contextualizer_tmpl.txt
- openrag/prompts/templates/hyde.txt
- openrag/prompts/templates/image_captioning_tmpl.txt
- openrag/prompts/templates/multi_query_pmpt_tmpl.txt
- openrag/prompts/templates/query_contextualizer_tmpl.txt
- openrag/prompts/templates/spoken_style_answer_tmpl.txt
- openrag/prompts/templates/sys_prompt_tmpl.txt
- pyproject.toml
- pytest.ini
- scripts/data_indexer.py
- tests/api_tests/api_run/docker-compose.yaml
- ui
- utility/requirements.txt
💤 Files with no reviewable changes (1)
- utility/requirements.txt
…fresh docs
The infra/compose/.env.example still set PROMPTS_DIR=../prompts/example1 after the templates moved into the package; copying it to .env (the documented getting-started step) injected a path that resolves to /app/prompts/example1 in the container, which no longer exists, so load_template raised FileNotFoundError and the API crashed on startup. Comment it out and document that the bundled templates are used unless PROMPTS_DIR overrides them.
Move the pytest configuration from pytest.ini into pyproject.toml [tool.pytest.ini_options] and delete pytest.ini, preserving testpaths/pythonpath/python_files/env/markers and adding --strict-markers. Asyncio stays in strict mode since the suite uses explicit markers.
Repoint the env_vars and milvus_migration docs at openrag/prompts/templates and infra/compose/milvus/milvus.yaml.
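The PROMPTS_DIR failure described in this commit comes down to an override taking precedence over the bundled default even when it points at a directory that no longer exists. A minimal sketch of that precedence with an early existence check; `resolve_prompts_dir` is a hypothetical helper, not the project's `PathsConfig` or `load_template` code, and treating a blank value as "unset" is an assumption of this sketch.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_prompts_dir(
    bundled_default: Path, env: Optional[Mapping[str, str]] = None
) -> Path:
    """Return PROMPTS_DIR if it is set and non-blank, else the bundled default."""
    env = os.environ if env is None else env
    override = env.get("PROMPTS_DIR", "").strip()
    chosen = Path(override) if override else bundled_default
    if not chosen.is_dir():
        # Surface the stale path at startup instead of at the first template load.
        raise FileNotFoundError(f"prompts directory does not exist: {chosen}")
    return chosen
```

With the variable commented out in the env file, the lookup falls through to the bundled directory; a leftover value such as `../prompts/example1` would raise immediately and name the offending path.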
- fix stale package paths in CLAUDE.md (components/routers/.hydra_config
-> core/services/api/di + conf/config.yaml); refresh test command docs
- move automatic-evaluation-pipeline/ into tests/load/
- remove orphaned root vdb/ and stale openrag/{components,config,utils,tests}
leftovers; ignore db/, vdb/, coverage artifacts
- add asyncio_mode=auto and --tb=short to pytest config
- add tests/unit/conftest.py (mock ports) and tests/unit/api/conftest.py
(ASGI client) prescribed by the phase 13 plan
- skip the pydub audio test on python 3.13 (audioop removed)
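The last bullet's version-gated skip can be sketched as below. It uses the standard library's `unittest.skipIf` to stay self-contained; in a pytest suite the same condition would typically go into `pytest.mark.skipif`. The helper `pydub_unsupported` and the test class are hypothetical names, and the real test in the repository may gate on the import itself instead of the version.

```python
import sys
import unittest


def pydub_unsupported(version_info=sys.version_info) -> bool:
    """True on interpreters where the stdlib audioop module is gone (3.13+)."""
    return tuple(version_info[:2]) >= (3, 13)


@unittest.skipIf(
    pydub_unsupported(),
    "audioop was removed from the stdlib in Python 3.13; pydub cannot import",
)
class AudioSmokeTest(unittest.TestCase):
    def test_placeholder(self) -> None:
        # Stand-in body; the real test would exercise the pydub-backed code path.
        self.assertTrue(True)
```

Gating on the interpreter version keeps the skip reason explicit in test reports, at the cost of needing an update if a compatible `audioop` replacement is installed later.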
Phase 13 — Project Layout
Restructures the top-level project from the flat post-Phase-12 layout to a clean, predictable structure. Mostly moves + path rewiring; no application logic changes.
This branch now covers all of Phase 13 (13A–13F), combining both tracks.
13A — Deployment infra → `infra/`
- `Dockerfile` / `Dockerfile.ray` → `infra/docker/api.Dockerfile` / `ray.Dockerfile` (build context stays repo root; `COPY` for entrypoint repointed).
- `docker-compose.yaml` + service configs → `infra/compose/` (grafana, prometheus from `openrag_metrics/`; `vdb/milvus.yaml` → `infra/compose/milvus/`; standalone metrics stack → `monitoring.docker-compose.yaml`). All internal compose paths rewired (`context: ../..`, includes, bind mounts).
- `ansible/`, `charts/`, `quick_start/`, `cluster.yaml` → `infra/`.
13B — Scripts out of the package
- `openrag/scripts/{backup,restore,embed,filter-logs,check_file_counts}.py` → top-level `scripts/` (`filter-logs.py` renamed `filter_logs.py`); `utility/data_indexer.py` → `scripts/`.
- `openrag/services/persistence/migrations/`.
- `scripts/_bootstrap.py` sys.path shim (interim, until the package-wide `openrag.*` import migration).
13C — Test restructure
- `tests/`: `tests/unit/` (mirrors the package), `tests/integration/{api,repos,robot}`, `tests/load/` (was `benchmarks/`).
- `pytest.ini` moved into `pyproject.toml [tool.pytest.ini_options]` with `--strict-markers`; `pytest.ini` deleted.
13D — Prompts into the package
- `prompts/example1/*.txt` → `openrag/prompts/templates/*.txt`; added `openrag/prompts/__init__.py` loading them into `DEFAULT_SEEDS` (keyed by stem) at import.
- `PathsConfig.prompts_dir` default now package-relative via `__file__`; dropped the hardcoded `conf/config.yaml` value and the dangling `PROMPTS_DIR` overrides (helm, api-test harness, `infra/compose/.env.example`).
- Dockerfiles no longer `COPY prompts/`; `package-data` ships the templates in the wheel/sdist.
13E — UI symlink
- `ui/` → `extern/indexer-ui` (relative symlink). `extern/` left untouched (simple/recommended approach).
13F — pyproject, CI, root cleanup, docs
- Excluded `infra/` / `tests/load/` from ruff; added `package-data`.
- Build workflows → `infra/docker/*.Dockerfile`; helm → `infra/charts/openrag-stack`; api-test harness build path → `infra/docker/api.Dockerfile`.
- Project Layout section added to `CLAUDE.md`; README getting-started, env/prompt/milvus references repointed at the new locations.
Verification
- `uv sync`, layer guard, `docker compose config`, and `DEFAULT_SEEDS` (=7) all pass.
- `infra/docker/*`.
- `pyproject.toml` (configfile confirmed); unit suite collects.
Known follow-ups (out of scope)
- `CLAUDE.md` Architecture section still references deleted `openrag/components/...` paths.
- `tests/unit/.../audio/test_openai.py` fails to collect under Python 3.13 (`audioop` removed from stdlib, breaks `pydub`); unrelated to the layout move.
- `tests/unit/` run is the joint final check.
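For readers unfamiliar with the kind of interim sys.path shim listed under 13B, a generic sketch follows. It is not the contents of `scripts/_bootstrap.py`: the function name `ensure_on_sys_path` is hypothetical, and which directory the real shim adds is an assumption left open here.

```python
import sys
from pathlib import Path


def ensure_on_sys_path(directory: Path) -> bool:
    """Prepend directory to sys.path once; return True if it was added."""
    resolved = str(directory.resolve())
    if resolved in sys.path:
        return False
    # Prepending makes this directory win over same-named installed modules.
    sys.path.insert(0, resolved)
    return True


# A script would call this before its project imports, for example with a
# directory computed from its own location:
#
#   ensure_on_sys_path(Path(__file__).resolve().parent.parent)
```

A shim like this is easy to delete once imports become package-qualified, which matches the "interim" framing in the description.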