docs: finalize AnythingLLM docker-compose and deployment guide#11
mshahid538 wants to merge 1 commit into
Conversation
romandidomizio
left a comment
═══════════════════════════════════════════════════════════════════════════════
🏐 #ContextVolley | @rmn → @shd | 2026-03-16 | W12
═══════════════════════════════════════════════════════════════════════════════
📐 CCC Metadata
| Field | Value |
|---|---|
| CCC-ID | RMN_2026-W12_035 |
| CCC Version | 3.1.3.1 |
| Context Type | 📋 TASK-BLOCKER |
| Target | @shd (DevOps) |
| Priority | 🔴 P0 |
| Handoff Protocol | SEEK:CONFIRM → BUILD → PR-UPDATE |
🎯 ISSUE SUMMARY
Your PR in weownnetwork/ai/anythingllm/docker contains ethdenver-specific config with only basic documentation, an .env.example that doesn't match our target variables, and no deployment script. This blocks team replication: we need a repeatable, generic deployment template that works for all future instances.
📦 REQUIRED FIXES
1. Remove Instance-Specific Naming
| Current | Required |
|---|---|
| anythingllm_ethdenver | <CONTAINER_NAME> (user input) |
| /root/ethdenver_storage | <HOST_PATH> (user input) |
| ETHDenver.CCC.bot | <DOMAIN> (user input) |
2. Create Unified Deploy Script (deploy.sh)
Script must handle:
# Pre-flight Checks
✓ Docker Engine installed on target droplet
✓ Docker Compose available
✓ User authenticated to correct droplet (SSH/DO API)
# Interactive Prompts
✓ Container name (no defaults)
✓ Host storage path
✓ Domain for exposure (Caddy/Traefik)
✓ LLM Provider (default: openrouter, NOT openai)
✓ LLM Model (user selection)
✓ Embedding Model (user selection)
✓ API Key (secure injection via Infisical or prompt)
✓ Port mapping
# Secure Env Handling
✓ Generate .env from prompts (no hardcoded values)
✓ Sanitized .env.example for GitHub
✓ Inject secrets at runtime (not baked into compose)
# Deployment Execution
✓ docker compose up -d
✓ SSL verification (Caddy auto-HTTPS)
✓ Health check confirmation
3. Reference Pattern
Use our AnythingLLM Helm Chart deploy script as template logic. Adapt for Docker Droplet context (no K8s, single-node for docker).
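The pre-flight checks listed for deploy.sh could be sketched roughly as follows. This is a minimal sketch, not the Helm chart script's actual logic: the function names `require_cmd` and `preflight` are hypothetical, and the droplet authentication check (SSH/DO API) is left out because it depends on how the team connects to the droplet.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helpers for deploy.sh (names are illustrative).

# Succeeds if the given command is available on PATH.
require_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Runs the Docker pre-flight checks and stops at the first failure.
preflight() {
  if ! require_cmd docker; then
    echo "✗ Docker Engine not found on this host" >&2
    return 1
  fi
  echo "✓ Docker Engine installed"

  # `docker compose version` exits non-zero when the Compose plugin is missing.
  if ! docker compose version >/dev/null 2>&1; then
    echo "✗ Docker Compose plugin not available" >&2
    return 1
  fi
  echo "✓ Docker Compose available"
}
```

A deploy script would typically call `preflight || exit 1` before showing any prompts, so a missing dependency is reported before the user has typed anything.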
4. File Structure
anythingllm/docker/
├── docker-compose.yml # Generic, no hardcoded names
├── .env.example # Sanitized template
├── deploy.sh # Unified deployment script
├── README.md # Usage instructions + pre-reqs
└── .env # Runtime only (NOT committed)
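Because .env is runtime-only and must never be committed, deploy.sh could generate it from the prompt answers with owner-only permissions. The sketch below assumes a hypothetical `write_env` helper; the variable names in the usage note are placeholders, not the project's agreed variable list.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write_env FILE KEY=VALUE...
# Creates (or truncates) FILE with 0600 permissions, then writes one pair per line.
write_env() {
  local file="$1"
  shift
  local old_umask
  old_umask="$(umask)"
  umask 077            # a newly created file is readable by the owner only
  : > "$file"
  umask "$old_umask"
  chmod 600 "$file"    # tighten permissions if the file already existed
  local pair
  for pair in "$@"; do
    printf '%s\n' "$pair" >> "$file"
  done
}
```

Usage might look like `write_env .env "CONTAINER_NAME=$CONTAINER_NAME" "LLM_PROVIDER=$LLM_PROVIDER"`, with `.env` also listed in `.gitignore` so the generated file stays out of the repository.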
🚫 WHAT TO REMOVE
| File | Issue | Action |
|---|---|---|
| docker-compose.yml | Hardcoded anythingllm_ethdenver | Replace with variable injection |
| .env.example | LLM_PROVIDER=openai | Change default to openrouter |
| README.md | ethdenver-specific docs | Generalize for any instance |
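One way to confirm the removals above is a quick scan of the committed files for the instance name. This is a sketch under assumptions: `find_hardcoded` is a hypothetical helper, and the file list in the usage note simply mirrors the table.

```shell
#!/usr/bin/env bash
# Hypothetical helper: find_hardcoded PATTERN FILE...
# Prints matching lines (file:line:text) and returns 0 if any match is found,
# 1 if the files are clean. The extra /dev/null operand makes grep print the
# file name even when only one file is given.
find_hardcoded() {
  local pattern="$1"
  shift
  grep -n -i -- "$pattern" "$@" /dev/null
}
```

For example, `if find_hardcoded ethdenver docker-compose.yml .env.example README.md; then echo "instance-specific names remain" >&2; exit 1; fi` would fail a pre-merge check while any reference is still present.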
✅ ACCEPTANCE CRITERIA
| # | Criteria | Status |
|---|---|---|
| 1 | Script runs on any DO Docker droplet | ⏳ |
| 2 | No hardcoded instance names in config | ⏳ |
| 3 | Prompts for all required variables | ⏳ |
| 4 | Secure API key handling (Infisical-ready) | ⏳ |
| 5 | OpenRouter as default provider | ⏳ |
| 6 | README covers pre-reqs + usage | ⏳ |
| 7 | Single script (no fragmented commands) | ⏳ |
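Criteria 3 and 5 imply two kinds of prompt: values with no default (such as the container name) and values with a default (the provider, defaulting to openrouter). A minimal sketch, assuming hypothetical helpers `prompt_required` and `prompt_default` that print the prompt on stderr and the answer on stdout:

```shell
#!/usr/bin/env bash
# Hypothetical prompt helpers (names are illustrative).

# prompt_required LABEL: keeps asking until a non-empty answer is given.
prompt_required() {
  local label="$1" value=""
  while [ -z "$value" ]; do
    printf '%s: ' "$label" >&2
    IFS= read -r value || return 1   # stop on EOF instead of looping forever
  done
  printf '%s\n' "$value"
}

# prompt_default LABEL DEFAULT: an empty answer selects DEFAULT.
prompt_default() {
  local label="$1" default="$2" value=""
  printf '%s [%s]: ' "$label" "$default" >&2
  IFS= read -r value || true
  printf '%s\n' "${value:-$default}"
}
```

Usage could be `CONTAINER_NAME="$(prompt_required 'Container name')"` and `LLM_PROVIDER="$(prompt_default 'LLM provider' openrouter)"`. The API key would need a non-echoing prompt or Infisical injection instead, which this sketch does not cover.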
📝 README.md ADDENDUM — SYSTEM UPDATES & CONFIG
Purpose: Standardize update procedures, resource requirements, and extension configs (MCP/Env).
Location: Append to README.md or create docs/DEPLOYMENT.md.
🔄 System Updates & Configuration
1. Automated Update Script
To ensure all services are pulled and restarted cleanly, use the provided update script.
#!/bin/bash
# scripts/update.sh
# Exit on the first failed command so a failed pull never triggers a restart.
set -euo pipefail
echo "🔄 Pulling latest changes..."
git pull origin main
echo "🐳 Restarting containers..."
docker compose down
docker compose up -d --pull always
echo "✅ Update complete."
Usage:
chmod +x scripts/update.sh
./scripts/update.sh
2. AnythingLLM (Docker Self-Hosted)
We utilize AnythingLLM for document retrieval and agent context with offloaded inference and embedding. Ensure your host meets minimum requirements before deployment.
| Resource | Minimum | Recommended |
|---|---|---|
| RAM | 2 GB | 4 GB+ |
| CPU | 1 Core | 2 Cores+ |
| Storage | 10 GB | 50 GB+ (SSD) |
📖 Official Docs: AnythingLLM Docker Requirements
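The minimums in the table could be checked on a Linux droplet before deployment. The sketch below is an assumption-heavy illustration: `check_min` and `check_host_resources` are hypothetical names, the RAM threshold is set slightly below 2048 MB because a nominal 2 GB host usually reports a little less in /proc/meminfo, and the storage figure is interpreted here as free space on the target path.

```shell
#!/usr/bin/env bash
# Hypothetical resource checks (Linux only: relies on /proc/meminfo and nproc).

# check_min LABEL ACTUAL REQUIRED: integer comparison with a readable message.
check_min() {
  if [ "$2" -ge "$3" ]; then
    echo "✓ $1: $2 (minimum $3)"
  else
    echo "✗ $1: $2 is below the minimum of $3" >&2
    return 1
  fi
}

# check_host_resources [PATH]: compares the host against the table's minimums.
check_host_resources() {
  local path="${1:-/}" ram_mb cpus disk_gb status=0
  ram_mb="$(awk '/^MemTotal:/ {print int($2 / 1024)}' /proc/meminfo)"
  cpus="$(nproc)"
  disk_gb="$(df -Pk "$path" | awk 'NR == 2 {print int($4 / 1024 / 1024)}')"
  check_min "RAM (MB)" "$ram_mb" 1900 || status=1        # assumed tolerance for "2 GB"
  check_min "CPU cores" "$cpus" 1 || status=1
  check_min "Free storage (GB)" "$disk_gb" 10 || status=1
  return "$status"
}
```

Calling `check_host_resources /root` (or whichever host storage path the user entered) reports every shortfall rather than stopping at the first one.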
3. Environment Variables
Configure core system behavior via .env. Key variables include, for example:
# Community Hub Configuration
# Enable agent skill imports from AnythingLLM Hub
# "1" = Allow verified/private items only (recommended for enterprise)
# "allow_all" = Allow all items including unverified (not recommended)
COMMUNITY_HUB_BUNDLE_DOWNLOADS_ENABLED="1" # Enterprise security: verified items only
# MCP Configuration
MCP_SERVER_ENABLED=true
MCP_CONFIG_PATH=/etc/mcp/servers.json
4. Custom MCP Servers
To add custom Model Context Protocol (MCP) servers, edit the MCP configuration file, for example:
File: /etc/mcp/servers.json
{
  "mcpServers": {
    "custom-tool": {
      "command": "node",
      "args": ["/app/tools/custom-server.js"],
      "env": {
        "API_KEY": "${YOUR_API_KEY}"
      }
    }
  }
}
Restart Required: After modifying MCP configs, restart the agent service:
docker compose restart agent
📬 HANDOFF
Action Required:
- Review this volley
- Update PR with generic config + deploy.sh
- Remove ethdenver-specific references
- Test against fresh droplet (not ethdenver instance)
- Tag @rmn for review before merge
Blocker Status: 🔴 PR cannot merge until corrected
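For the fresh-droplet test, the "health check confirmation" step required of deploy.sh could be a simple retry loop around an HTTP probe. This is a sketch, not an agreed implementation: `retry` and `wait_for_healthy` are hypothetical names, and the URL in the usage note is a placeholder that assumes the instance exposes a ping-style endpoint.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for a post-deploy health check.

# retry ATTEMPTS DELAY_SECONDS COMMAND...: reruns COMMAND until it succeeds
# or ATTEMPTS is exhausted, sleeping DELAY_SECONDS between tries.
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# wait_for_healthy URL [ATTEMPTS] [DELAY_SECONDS]: succeeds once URL answers
# with a non-error HTTP status.
wait_for_healthy() {
  retry "${2:-30}" "${3:-5}" \
    curl --fail --silent --max-time 5 --output /dev/null "$1"
}
```

A deploy run might end with `wait_for_healthy "https://$DOMAIN/api/ping" || { echo "✗ health check failed" >&2; exit 1; }`, which also exercises the Caddy certificate because the probe goes over HTTPS.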
═══════════════════════════════════════════════════════════════════════════════
Resolved all remaining issues from PR #5 Copilot review:
Issue #1 - Workflow branch triggers:
- Added explicit branch patterns: maintenance, feature/*, fix/*, docs/*, hotfix/*
- Excluded experimental/* branches to prevent unintended PRs
- Maintains security while supporting defined branching strategy
Issue #2 - Dynamic repository values:
- Changed hardcoded 'WeOwnNetwork' to ${{ github.repository_owner }}
- Changed hardcoded 'ai' to ${{ github.event.repository.name }}
- Enables workflow portability across forks and repos
Issue #3 - Improved PR title fallback:
- Added commit count when available
- Uses latest commit subject as additional hint
- Provides context: 'Merge branch into main (X commits)'
- Falls back gracefully through multiple options
Issue #4 - Copilot date context:
- Updated to current date: January 26, 2026 (Sunday)
- Clarified Copilot cannot use web search during reviews
- Focus on format validation vs exact date calculation
Issue #5 & #9 - Version format clarity:
- Clarified 3.4.0 as SEASON.WEEK.DAY with DAY=0, VERSION omitted
- Updated special cases table with explicit component breakdowns
- Added note explaining shorthand format vs full 4-part format
Issue #6 - CI/CD dry-run validation:
- Removed '|| true' error suppression
- Allows failures to propagate and fail pipeline
- Aligns with quality gates (blocking on K8s failures)
Issue #7 - README absolute paths:
- Changed ../docs/ to /docs/ for HELM_VALUE_MANAGEMENT.md
- Ensures links work across all documentation contexts
Issue #11 - Example day inconsistency:
- Fixed Jan 25, 2026 from Saturday (6) to Sunday (7)
- Provided complete example version: 2.5.7.1
Issue #12 - CHANGELOG date:
- Updated from 2026-01-25 to 2026-01-26 (current date)
Issue #14 - WordPress version clarity:
- Clarified as 'WordPress application version 3.2.5'
- Distinguishes from WeOwnVer chart versioning
Issue #15 - Security consistency:
- Pinned all actions/checkout@v4 to specific SHA
- Added comment: # v4.1.5 for version tracking
- Consistent with other pinned actions in workflow
All paths now use absolute /docs/ references, all version format ambiguities resolved, security controls enforced consistently.
#36)
* feat(anythingllm-docker): add INT-P01 (ai.weown.agency) DOKS->Docker migration plan
Generates anythingllm-docker/sites/ai.weown.agency/ from the existing copier template (project_name=int-p01, WeOwnLLM hardened image) and adds the tooling to migrate INT-P01 off DOKS via a parallel-build + DNS-cutover pattern:
- scripts/migrate-from-doks.sh - one-shot bridge that kubectl-execs into the live DOKS pod, streams /app/server/storage out as a tarball, and wraps it in the same skinny-backup layout the template's restore.sh already understands. Optional --upload-to-spaces stages the artifact at s3://weown-backups/int-p01/ for redundancy.
- MIGRATION_RUNBOOK.md - phased runbook: inventory/freeze, staging droplet provision (temporary hostname), DOKS extraction, restore, Jason/Yonks staging validation, production cutover, 7-day soak, rollback path.
- anythingllm-docker/sites/README.md - directory-level explainer matching the existing keycloak-docker/sites/ convention.
- anythingllm-docker/sites/.gitignore - blocks terraform state, real tfvars, backup tarballs, and stray .env files from being committed.
Source plan: D383 / Tuleap A174 (#1238). Trigger: Signal #WeOwn.Dev ask from Jason 2026-05-21 (SearXNG broken on DOKS for the Calhoun MetaAgent). DOKS instance is never modified during the migration - rollback is a DNS flip until decommission (T+7 days post-cutover).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs(adr): add ADR-005 for INT-P01 DOKS retirement; runbook + Caddyfile updates
Folds in feedback on the migration plan:
- Caddyfile + cloud-init: dual-hostname from first boot ('ai-stage.weown.agency, ai.weown.agency' in one site block) so the production cutover is a DNS A-record swap on the same droplet - no re-deploy of compose or Caddyfile required at cutover.
- Runbook: replaces the previous int-p01-new.ccc.bot staging hostname with ai-stage.weown.agency (same parent zone), simplifies Phase 6 accordingly, and adds Phase 1.5 - an optional local-laptop dry-run that round-trips the DOKS backup through restore.sh against a throwaway docker container before any droplet exists.
- Image-path open question removed: always reg.mini.dev/anythingllm:latest, with a note that 'mini_key' is an API key fragment that must come from Infisical (A126) or DOCR (D341), never embedded in the URL.
- ADR-005 (Proposed): decision record for the retirement, the parallel-build + DNS-cutover pattern, two human validation gates (Phase 4 Jason/Yonks soak, Phase 6 CTO cutover approval), and compliance mappings across NIST CSF 2.0, SOC 2, ISO/IEC 27001:2022, ISO/IEC 42001:2023, CIS Controls v8. Status flips to Accepted at the close of the 7-day post-cutover soak.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor(int-p01): adopt Path C + Layer 2 standard; address all Copilot PR #36 findings
Re-bases the INT-P01 site on the s004 reference pattern (Path C slim cloud-init + Layer 2 bootstrap-secret rotation; see docs/INFRA_BOOTSTRAP_PATTERN.md), and closes every inline review comment left by copilot-pull-request-reviewer on PR #36 (15 comments, all fixed).
Path C + Layer 2 adoption (single biggest change):
- terraform/templates/cloud-init.yaml: now ONLY handles first-boot bootstrap - Docker, Infisical CLI (artifacts-cli.infisical.com apt repo, not the legacy install-cli.sh capped at v0.38), the v1 → v2 Machine Identity rotation via Infisical Universal Auth API (revokes the v1 secret embedded in terraform state + DO droplet metadata within minutes of provisioning), and a .bootstrap-complete marker. Compose.yaml + Caddyfile + backup.sh + cron NO LONGER ship with cloud-init - they live in ansible/deploy.yml.
- ansible/deploy.yml (new): owns all post-bootstrap state on the droplet. Asserts .bootstrap-complete + .infisical-auth.env exist, uploads compose+Caddyfile+backup.sh, installs daily backup cron with logrotate, runs docker compose up under infisical run, pulls images via community.docker.docker_image_pull (no SDK required on the droplet), updates DO tags (commit-<sha> + skinny-backup) via scripts/tag-droplet.sh, waits for /api/ping health. Idempotent - re-runnable any time without tofu taint.
- scripts/deploy.sh: now a thin `ansible-playbook` wrapper requiring INFISICAL_PROJECT_ID env var; installs community.docker:==3.13.0 collection if missing.
- terraform/backend.tf + init.sh (new): DO Spaces remote tofu state backend (SSE-C encrypted, S3-compatible). init.sh reads spaces_* credentials from terraform.tfvars and forwards them to `tofu init -backend-config=`.
- terraform/main.tf: adds lifecycle ignore_changes = [user_data, tags] so the runtime tag mutations from ansible + bootstrap scripts stick.
- terraform/variables.tf: adds spaces_access_key, spaces_secret_key, spaces_encryption_key, ssh_source_cidrs.
- docker/compose.prod.yaml: bind-mounts /var/log/caddy into Caddy so the otel-agent filelog receiver can ship logs and they survive container recreation.
Caddyfile dual-hostname preserved across the refactor: ai-stage.weown.agency, ai.weown.agency { … } Production cutover (Phase 6) is a pure DNS A-record swap on the same droplet - Caddy already has the cert for both names in one site block.
Copilot review findings (PR #36) all addressed:
#1, #2, #4, #5 Empty Infisical values in cloud-init/deploy/restore → fixed by Path C; cloud-init now uses HCL templatefile substitution (${infisical_*}) that resolves at tofu apply time from var.* - no more pre-baked empty strings.
#3, #6, #7 Floating image tags (reg.mini.dev/anythingllm:latest) → pinned to :1.7.2 (same as s004; documented as the WeOwnLLM hardened-image version Shahid verified on s004.ccc.bot).
#8 ADR #WeOwnVer mis-computed → v3.5.5.1 → v3.4.5.1. May = month 4 of S3, ISO W22 - W18 + 1 = offset 5, iteration 1 → v3.4.5.1. Math shown inline in the Version line per VERSIONING_WEOWNVER.md.
#9 Broken link to private notes/Perpetuator/... in a public repo → replaced with in-repo references (Tuleap A174 / #1238 + the in-repo runbook).
#10 Path inconsistency /opt/int-p01/ vs /opt/intp01/ → unified on /opt/int_p01_anythingllm/ throughout README, runbook, scripts, cloud-init. project_name (terraform var) is `int-p01-anythingllm` (hyphenated); underscore form for paths/volumes is `int_p01_anythingllm`. Matches s004's convention.
#11 Bash-incompatible `read -rs "VAR?prompt"` (zsh-only) → switched to the canonical zsh-first + bash-fallback pattern: `read -rs "VAR?prompt" 2>/dev/null || read -rsp …` (matches global CLAUDE.md secrets pattern).
#12, #13 .terraform.lock.hcl gitignored → unignored at both the sites/.gitignore root and the per-site .gitignore. Lock file is now tracked for reproducible provider versions across machines + CI runs.
#14 backup.sh remote mode not wrapped in infisical run → adopted s004's backup.sh which sources the droplet's .infisical-auth.env over SSH and re-execs itself under `infisical run` so SPACES_* are injected for the S3 upload step. Requires INFISICAL_PROJECT_ID env var in remote mode.
#15 README restore example used `anythingllm-ai_backup_…` template placeholder → replaced the entire "Migration from Helm/Kubernetes" section with a pointer to MIGRATION_RUNBOOK.md (which uses the real `int-p01-anythingllm_backup_<TS>` naming).
Migration artifacts preserved + updated:
- MIGRATION_RUNBOOK.md: replaced ssh + manual infisical-run restore invocations with the new Path C flow (INFISICAL_PROJECT_ID=<id> ./scripts/deploy.sh for app layer, then ./scripts/restore.sh for the DOKS data swap). Added explicit Layer 2 rotation verification step. Phase 1.5 local-laptop dry-run pinned to :1.7.2 and renamed volumes/networks to match production (int_p01_anythingllm_*).
- scripts/migrate-from-doks.sh: PROJECT_NAME → int-p01-anythingllm so the produced tarball matches what restore.sh on the droplet expects. End-of-script "next step" instructions updated to call the new restore.sh wrapper instead of raw ssh + infisical run.
CHANGELOG resolution from rebase: merged the otel-agent additions from main with the INT-P01 + ADR-005 entries; ordered newest-first.
Rebased onto origin/main (commit 455be2a). The branch is now a linear 2-commit history: feat (site) + docs (ADR), with this third commit on top covering the full refactor + review-feedback round. User said "we will squash later" so commits are kept granular.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Nik <nik.cimino@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
No description provided.