Skip to content

docs(AGENT-662): Add IAS v2.0 deployment runbook#933

Closed
maxtechera wants to merge 2 commits into
stagingfrom
feature/AGENT-662-ias-v20-deployment-blocked
Closed

docs(AGENT-662): Add IAS v2.0 deployment runbook#933
maxtechera wants to merge 2 commits into
stagingfrom
feature/AGENT-662-ias-v20-deployment-blocked

Conversation

@maxtechera
Copy link
Copy Markdown
Collaborator

Summary

  • Adds comprehensive deployment documentation for IAS v2.0
  • Includes pre-deployment checklist, migration steps, and rollback procedures

Test Plan

  • Documentation reviewed for accuracy
  • Runbook validated against staging deployment

🤖 Generated with Claude Code

@linear
Copy link
Copy Markdown

linear Bot commented Feb 5, 2026

@vercel
Copy link
Copy Markdown

vercel Bot commented Feb 5, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
answerai-docs Building Building Preview Feb 5, 2026 4:30pm
the-answerai Building Building Preview Feb 5, 2026 4:30pm

Request Review

@claude
Copy link
Copy Markdown

claude Bot commented Feb 5, 2026

PR Review: IAS v2.0 Deployment Runbook

Overview

This PR adds comprehensive deployment documentation for IAS v2.0, including pre-deployment checklists, migration steps, and rollback procedures. The documentation is well-structured and follows deployment best practices.

✅ Strengths

1. Comprehensive Coverage

  • Excellent pre-deployment checklist with access requirements clearly marked as BLOCKED
  • Dual deployment strategy (Render vs AWS Copilot) well documented
  • Security reminders section is thorough and appropriately emphasizes critical points
  • Rollback procedures included for both platforms

2. Security Best Practices

  • Strong emphasis on FLOWISE_SECRETKEY_OVERWRITE criticality (lines 46-49)
  • Clear warnings about credential management
  • Proper guidance on secrets storage
  • Good security checklist at the end

3. Structure & Organization

  • Logical flow from pre-deployment → deployment → post-deployment → rollback
  • Clear section headers and formatting
  • Helpful cross-references to other documentation
  • Revision history table for tracking changes

4. Operational Awareness

  • Links related Linear tickets (AGENT-149, AGENT-639, etc.) for context
  • Distinguishes between completed, in-progress, and pending issues
  • Post-deployment verification checklist is actionable

🔍 Issues & Suggestions

Critical Issues

1. Database Migration Command (High Priority)

Location: Lines 76-80

Issue: The runbook references pnpm db:migrate which per CLAUDE.md should NEVER be run without explicit user approval.

Current:

- [ ] Run Prisma migrations: pnpm db:migrate
- [ ] Run TypeORM migrations: pnpm migration:run

Recommendation:

- [ ] Run Prisma migrations: pnpm db:deploy (production) or pnpm db:migrate (dev - requires approval)
- [ ] Run TypeORM migrations: pnpm migration:run
- [ ] Verify migrations completed successfully

Rationale: Production should use pnpm db:deploy which runs existing migrations. pnpm db:migrate generates NEW migrations and should only be used in development.

2. Environment Variable Clarification (Medium Priority)

Location: Lines 31-50

Issue: Need to clarify distinction between AUTH0_DOMAIN and AUTH0_BASE_URL.

Recommendation:

  • AUTH0_DOMAIN - Auth0 tenant domain (e.g., ias-tenant.auth0.com)
  • AUTH0_BASE_URL - Production application URL (e.g., https://ias.theanswer.ai)

3. Database Connection String Format (Medium Priority)

Issue: Runbook does not specify how DATABASE_URL should be constructed.

Recommendation: Add to Database Preparation:

  • For Render: DATABASE_URL is constructed automatically from DATABASE_SECRET
  • For AWS Copilot: Use standard PostgreSQL format

Minor Issues

4. Hardcoded Date (Low Priority)

Line 5: Target Date: February 7, 2026

Verify if this is still accurate given access is BLOCKED, or mark as TBD.

5. Incomplete Contact Information (Low Priority)

Lines 196-201: Contacts section has TBD entries. Add pre-deployment checklist item to fill this in.

6. Missing E2E Test Command (Low Priority)

Lines 128-132: Consider adding pnpm test:e2e to test suite per CLAUDE.md

7. Git Branch Strategy (Low Priority)

Add note about deploying from staging branch after PR merge, per CLAUDE.md conventions.

8. Auth0 Configuration Reference

Lines 84-91: Could add reference to /scripts/auth0-setup-guide.md for detailed setup.

9. S3 Configuration Details

Lines 45-48: Add notes about:

  • Bucket must be pre-created
  • CORS configuration needed
  • IAM permissions required (PutObject, GetObject, DeleteObject)

🧪 Testing Recommendations

  1. Test runbook in staging environment before production
  2. All external references validated ✅
  3. Verify Linear tickets exist in workspace

📋 Pre-Merge Checklist

  • Update target date or mark as TBD
  • Fix database migration command to use pnpm db:deploy for production
  • Clarify AUTH0_DOMAIN vs AUTH0_BASE_URL
  • Add DATABASE_URL construction guidance
  • Plan to fill Contacts section before deployment
  • Consider adding git branch strategy note
  • Consider adding E2E test command

🎯 Overall Assessment

Quality: High ⭐⭐⭐⭐

This is a well-crafted deployment runbook with strong operational maturity. Structure is logical, security considerations are prominent, and dual-platform approach is well documented.

Primary concerns:

  1. Critical: Database migration command needs correction
  2. Important: Environment variable clarifications needed
  3. Minor: Some incomplete sections

Recommendation:Approve with requested changes

Issues are straightforward to address. Once critical database migration command is corrected, this will be production-ready.


Reviewed by: Claude Code
Review Date: 2026-02-05

Critical fixes:
- Changed Prisma migration command to 'pnpm db:deploy' for production (not 'pnpm db:migrate')
- Added clarification for AUTH0_DOMAIN vs AUTH0_BASE_URL environment variables
- Added DATABASE_URL construction guidance with format example
- Added S3 bucket prerequisites and IAM permission requirements
- Added E2E test command to deployment verification steps
- Added git branch strategy note for deployment workflow
- Added database healthcheck verification step

Addresses all critical and medium-priority issues from Claude's review.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@claude
Copy link
Copy Markdown

claude Bot commented Feb 5, 2026

PR Review - IAS v2.0 Deployment Runbook

✅ Strengths

  1. Comprehensive Coverage: Excellent documentation covering pre-deployment, deployment, and post-deployment phases
  2. Well-Structured: Clear sections with checklists make it actionable for deployment teams
  3. Multiple Platform Support: Documents both Render and AWS Copilot deployment options
  4. Security Focused: Good emphasis on credential management and security reminders
  5. References Existing Docs: Properly links to related documentation files (all verified to exist)

🔴 Critical Issues

1. Missing Test Command (High Priority)

Location: Step 1: Verify Build

pnpm test:auth  # ❌ This command does not exist in package.json

Impact: Deployment verification step will fail

Fix: Either:

  • Add the missing script to package.json, OR
  • Remove this line and update to use existing test commands

Recommendation:

# Replace with:
pnpm test:chatflows
pnpm test:e2e
# Or if auth tests exist in packages/server/test/auth:
pnpm --filter flowise-server test

2. Incomplete Contact Information (Medium Priority)

Multiple [TBD] placeholders in Contacts section:

  • Deployment Lead contact
  • IAS Admin contact
  • On-call Engineer

Impact: In a production emergency, responders won't know who to contact

Fix: Fill in actual contact information before deployment, or add a note to complete before go-live


⚠️ Documentation Issues

3. Git Strategy Ambiguity (Medium Priority)

Location: Step 1 note

Deploy from staging branch after PR merge per git strategy conventions

Issue: According to CLAUDE.md:

  • staging → Pre-production (CREATE PRS AGAINST THIS)
  • main → Production (DO NOT PR AGAINST THIS)

Concern: The runbook doesn't clarify:

  • Should deployment happen FROM staging to production environment?
  • Or should staging be merged to main first, THEN deploy from main?

Recommendation: Add clarity:

**Important:** 
1. Merge this PR to staging branch
2. Test in staging environment (if available)
3. Merge staging to main via PR
4. Deploy to production from main branch

4. Environment Variable Documentation Gap (Low Priority)

The runbook mentions several env vars not found in the referenced render.yaml:

  • AUTH0_ORGANIZATION_ID - Not in render.yaml
  • AUTH0_JWKS_URI - Not in render.yaml
  • AAI_DEFAULT_OPENAI_API_KEY - Not in render.yaml

Recommendation: Either:

  • Add these to render.yaml as examples, OR
  • Note which are IAS-specific vs standard deployment

5. REDIS_URL Format Missing (Low Priority)

Location: Environment Configuration

- [ ] `REDIS_URL` - Redis connection URL (e.g., `redis://localhost:6379`)

Issue: Example shows localhost, which won't work for production

Recommendation:

- [ ] `REDIS_URL` - Redis connection URL
  - Format: `redis://{host}:{port}` or `rediss://{host}:{port}` (TLS)
  - For Render: Provided automatically via Redis service
  - Verify after provisioning

💡 Suggestions for Improvement

6. Add Health Check Endpoints

The post-deployment verification mentions health checks but they're not clearly defined:

Recommendation: Add a section clarifying actual endpoints:

### Health Check Endpoints

| Service | Endpoint | Expected Response |
|---------|----------|-------------------|
| Flowise API | `GET /api/v1/ping` | `{"status":"ok"}` |
| Web App | `GET /api/health` or `/healthcheck` | HTTP 200 |
| Database | Via `pnpm db:healthcheck` | Connection successful |

7. Known Issues Tracking

The "Known Issues to Monitor" section is excellent, but consider:

  • Adding links to actual Linear tickets (e.g., https://linear.app/workspace/issue/AGENT-660)
  • Adding acceptance criteria for each issue
  • Creating a separate "Post-Deployment Validation" checklist that maps to these issues

8. Rollback Testing

Missing: No mention of testing the rollback procedure

Recommendation: Add to pre-deployment checklist:

- [ ] Verify rollback procedure in staging (if available)
- [ ] Document current production version/commit hash before deployment
- [ ] Ensure backup of database before migration (if applicable)

9. Database Migration Reversibility

Location: Database Preparation

The runbook mentions running migrations but doesn't address:

  • Are these migrations reversible?
  • What happens to data if rollback is needed?
  • Should a database backup be taken first?

Recommendation: Add to Step 3:

#### Before Running Migrations:
- [ ] **Create database backup** (especially for production)
- [ ] Review migration files for destructive changes
- [ ] Verify migrations are reversible (or document manual rollback steps)
- [ ] Test migrations in staging environment first

📋 Minor Improvements

10. Typo/Consistency Issues

  • pnpm test:auth - Command doesn't exist (see Answers Integration Beta v1 #1)
  • Consider adding version/commit hash to track what's being deployed
  • Add estimated deployment time (helps with planning maintenance windows)

✅ Approval Recommendation

Status: Approve with required changes

Required Before Merge:

  1. ✅ Fix the pnpm test:auth command issue (Answers Integration Beta v1 #1)
  2. ⚠️ Clarify git branching strategy for deployment (Feature/aai 3 copilot deployment #3)
  3. ⚠️ Fill in TBD contact information (Tools Sandbox #2) - or add note to complete before deployment

Recommended Before Deployment:
4. Address environment variable documentation (#4)
5. Add database backup steps (#9)
6. Add rollback testing to checklist (#8)


🎯 Summary

This is a high-quality deployment runbook that demonstrates thorough planning. The structure is excellent and it covers most critical aspects of deployment. With the fixes above (especially the test command and git strategy), this will be a solid production deployment guide.

Great work on:

  • Comprehensive pre-deployment checklist
  • Security-first approach
  • Multiple deployment platform support
  • Clear rollback procedures

Next steps:

  1. Fix critical issues (Answers Integration Beta v1 #1, Tools Sandbox #2, Feature/aai 3 copilot deployment #3)
  2. Test runbook against staging environment
  3. Complete TBD placeholders
  4. Get sign-off from IAS team before deployment

📖 Reviewed using repository CLAUDE.md conventions
🤖 Generated by Claude Code PR Review

@maxtechera maxtechera closed this Feb 5, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant