# Technical Challenge - Code Review and Deployment Pipeline Orchestration

**Format:** Structured interview with whiteboarding/documentation  
**Assessment Focus:** Problem decomposition, AI prompting strategy, system design

**Please Fill in your Responses in the Response markdown boxes**

---

## Challenge Scenario

You are tasked with creating an AI-powered system that can handle the complete lifecycle of code review and deployment pipeline management for a mid-size software company. The system needs to:

**Current Pain Points:**
- Manual code reviews take 2-3 days per PR
- Inconsistent review quality across teams
- Deployment failures due to missed edge cases
- Security vulnerabilities slip through reviews
- No standardized deployment process across projects
- Rollback decisions are manual and slow

**Business Requirements:**
- Reduce review time to <4 hours for standard PRs
- Maintain or improve code quality
- Catch 90%+ of security vulnerabilities before deployment
- Standardize deployment across 50+ microservices
- Enable automatic rollback based on metrics
- Support multiple environments (dev, staging, prod)
- Handle both new features and hotfixes
---

## Part A: Problem Decomposition (25 points)

**Question 1.1:** Break this challenge down into discrete, manageable steps that could be handled by AI agents or automated systems. Each step should have:
- Clear input requirements
- Specific output format
- Success criteria
- Failure handling strategy

**Question 1.2:** Which steps can run in parallel? Which are blocking? Where are the critical decision points?

**Question 1.3:** Identify the key handoff points between steps. What data/context needs to be passed between each phase?

## Response Part A

# Question 1.1
### Step 1: PR Creation
**Input Requirements:-** Git branch (feature_name), target environment (dev, staging), PR title/description

**Output Format:-** JSON object with PR metadata (ID, author, context)

**Success Criteria:-** PR branch and target branch are valid. Author is valid.

**Failure Handling:-** Fail immediately. Notify the author to fix the PR, and notify the CTO through a separate messaging agent.

### Step 2: AI SAST (Static Analysis & Security Scan)
**Input Requirements:-** Changed code and Full Codebase snapshot

**Output Format:-** JSON object (problems, each with a criticality {high, mid, low}, locations, description, suggested fix)

**Success Criteria:-** Zero problems with high criticality. Code complexity is not excessive, judged relative to the full codebase snapshot rather than against a fixed threshold value.

**Failure Handling:-** Block merge. Leave a meaningful comment on the PR highlighting all the problems. Notify the author and CTO through another agent.

### Step 3: Automated Testing (Unit and Integration)
**Input Requirements:-** Test Suite (ideally, no code should be allowed without proper tests in place) 

**Output Format:-** JSON object (Tested: number of tests performed, Passed, Failed, Summary)

**Success Criteria:-** >95% of tests pass. 100% of tests marked as 'mandatory' pass.

**Failure Handling:-** Block merge. Leave comment on PR and inform author and CTO with full description of Test Failures. 
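
The Step 3 pass criteria could be expressed as a small gate function. This is a minimal sketch, not a definitive implementation: `CaseResult`, its `mandatory` flag, and `evaluate_test_gate` are hypothetical names, and the test runner is assumed to report one pass/fail result per test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaseResult:
    name: str
    passed: bool
    mandatory: bool = False  # assumed flag marking 'mandatory' tests


def evaluate_test_gate(results: list[CaseResult], min_pass_rate: float = 0.95) -> dict:
    """Build the Step 3 summary object and decide whether to block the merge."""
    total = len(results)
    passed = sum(r.passed for r in results)
    failed_mandatory = [r.name for r in results if r.mandatory and not r.passed]
    pass_rate = passed / total if total else 0.0
    # Gate: strictly more than min_pass_rate overall and no mandatory failures.
    # An empty test run is treated as a failure.
    ok = total > 0 and pass_rate > min_pass_rate and not failed_mandatory
    return {
        "tested": total,
        "passed": passed,
        "failed": total - passed,
        "failed_mandatory": failed_mandatory,
        "block_merge": not ok,
    }
```

Note that a single failed mandatory test blocks the merge even when the overall pass rate is above 95%.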

### Step 4: AI Review and suggestion engine
**Input Requirements:-** Changed Code, SAST report, original code file(for comparison) 

**Output Format:-** Markdown file: a full report on the code changes, suggested improvements and optimizations, and a score out of 10 for every change made.

**Success Criteria:-** Risk Score is below a pre-decided threshold, say 80%. PR Quality (average of the scores for every change) is higher than a decided minimum value, say 8.5.

**Failure Handling:-** Reject the PR. Notify the Author and CTO. 

### Step 5: Human Review, Assignment and Approval
**Input Requirements:-** AI Risk Score, AI Summary and List of relevant Team Members.  

**Output Format:-** Approval Status (Accepted/Declined) and Remarks, most likely as JSON.

**Success Criteria:-** Approval Status: Accepted

**Failure Handling:-** Reject the PR and notify the author and CTO, as usual. Because this step is human-in-the-loop, the reason for rejection must also be passed back to the AI as context.

### Step 6: Staging Environment Deployment
**Input Requirements:-** Docker image/build instructions

**Output Format:-** JSON Object (URL of deployed instance, Build Status)

**Success Criteria:-** Build succeeds and /health (or any such endpoint) returns a 200 status code.

**Failure Handling:-** Retry Build based on the feedback and inform author and CTO.

### Step 7: Canary Deployment and Metric Monitoring (PROD only)
**Input Requirements:-** Percentage of users to deploy the service on. Configure monitoring tools (through a different agent)

**Output Format:-** Real-time stream of golden signals (latency, traffic, error rate, etc.)

**Success Criteria:-** Error rate stays under a defined baseline for a defined time period.

**Failure Handling:-** Trigger Step 8: Automatic Rollback. Alert all related team members.
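
The Step 7 check could be sketched as a small stateful monitor that consumes error-rate samples from the metrics stream. Several things here are assumptions for illustration rather than part of the design above: the class name `CanaryMonitor`, comparing the window average (rather than every sample) against the baseline, and the `abort_factor` multiplier that triggers an early rollback on a disastrous reading.

```python
from dataclasses import dataclass, field


@dataclass
class CanaryMonitor:
    baseline: float            # maximum acceptable error rate, e.g. 0.01 for 1%
    window_size: int           # number of samples that make up the monitoring period
    abort_factor: float = 5.0  # assumed multiplier for an immediate rollback
    samples: list[float] = field(default_factory=list, init=False)

    def observe(self, error_rate: float) -> str:
        """Feed one sample; return 'continue', 'promote' or 'rollback'."""
        if error_rate > self.baseline * self.abort_factor:
            # Disastrous reading: do not wait for the window to finish.
            return "rollback"
        self.samples.append(error_rate)
        if len(self.samples) < self.window_size:
            return "continue"
        average = sum(self.samples) / len(self.samples)
        return "promote" if average < self.baseline else "rollback"
```

The returned string is what would be handed to Step 8 once it is no longer `"continue"`.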

### Step 8: Automated Rollback/Promotion
**Input Requirements:-** Canary Success/Failure Boolean Value. 

**Output Format:-** JSON Object: {"canary_result": {"pr_id": "XXXX", "status": "success", "description": "new version fully promoted"}} or {"canary_result": {"pr_id": "XXXX", "status": "failed", "description": "reverted to previous state"}}

**Success Criteria:-** status is success

**Failure Handling:-** Manual SRE Intervention Required. Lock the promotion service and redeploy the previous state. 
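
As an illustration of the Step 8 output, the result object could be produced by a helper like the one below. The function name `build_canary_result` is hypothetical; the keys and description strings follow the output format given for this step.

```python
import json


def build_canary_result(pr_id: str, canary_ok: bool) -> str:
    """Serialize the Step 8 outcome as a JSON string."""
    if canary_ok:
        status, description = "success", "new version fully promoted"
    else:
        status, description = "failed", "reverted to previous state"
    payload = {
        "canary_result": {
            "pr_id": pr_id,
            "status": status,
            "description": description,
        }
    }
    return json.dumps(payload)
```

Building the object with `json.dumps` rather than by hand guarantees that keys and strings are quoted correctly for downstream parsers.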

# Question 1.2
### Parallel Steps (Non Blocking with the CI/CD Phase)
**1.** Step 2(SAST, Static Application Security Testing) and Step 3(Automated Testing) can run in parallel immediately after step 1.

**2.** Step 4(AI Review) and Step 5(Human Review Assignment) can run in parallel.
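
Running Step 2 and Step 3 side by side could look roughly like this. It is a sketch under stated assumptions: `run_parallel_gates`, `fake_sast`, and `fake_tests` are hypothetical names, and the two placeholder functions stand in for the real SAST and test agents.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def run_parallel_gates(pr: dict, gates: Dict[str, Callable[[dict], dict]]) -> Dict[str, dict]:
    """Run independent gates concurrently and collect their results by name."""
    with ThreadPoolExecutor(max_workers=len(gates) or 1) as pool:
        futures = {name: pool.submit(fn, pr) for name, fn in gates.items()}
        # result() re-raises any exception from a gate, so failures are not lost.
        return {name: future.result() for name, future in futures.items()}


def fake_sast(pr: dict) -> dict:
    # Placeholder for the Step 2 agent.
    return {"passed": True, "findings": []}


def fake_tests(pr: dict) -> dict:
    # Placeholder for the Step 3 agent.
    return {"passed": True, "failed": 0}


results = run_parallel_gates({"pr_id": "PR-1234"}, {"sast": fake_sast, "tests": fake_tests})
can_proceed = all(r["passed"] for r in results.values())
```

Both gates must report `passed` before the pipeline moves on, which matches the PASS/FAIL gate described for Steps 2 and 3.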

### Blocking Steps
**1.** Step 1(Initial Commit/PR). The code must be available before any testing or analysis can begin. 

**2.** Step 5(Human Approval) ---> Step 6(Staging Deployment). Ideally, deployment, even to Staging, shouldn't be initiated before a human review of the code changes.

**3.** Step 7(Monitoring) --> Step 8(Rollback/Promotion). A fixed monitoring window is set. Step 8 shouldn't be performed until that window ends, unless the results are disastrous and an immediate rollback is needed.

### Critical Decision Points
**1.** Step 2 and 3(PASS/FAIL gate). Do our code changes meet the minimum required security and quality benchmarks?

**2.** Step 5(Human Approval). As mentioned before, no code should ever be released without human approval.

**3.** Step 7(Monitoring). This is the heart of the problem. Do our changes qualify for production? Should they be promoted or rolled back?

# Question 1.3
The system must ensure all necessary context, not just simple pass/fail flags, is maintained throughout the workflow. There are 6 handoff points in our 8 step workflow. 

**1.** Step 1 --> Step 2 and Step 3. Full Git PR, the previous code and the new code changes. 

**2.** Step 2 and Step 3 --> Step 4 and Step 5. Structured JSON of Security Findings, Test Reports and PASS/FAIL Status

**3.** Step 4 ---> Step 5. AI Review to Human Review. AI-generated Risk Score (say, 1-10), Quality Score (say, 1-10), markdown summary of changes, locations and comments.

**4.** Step 5 --> Step 6. Human Approval to Staging Deployment. Approval status and remarks, plus the Docker image/build instructions for the approved change.

**5.** Step 6 --> Step 7. Deployment to Monitoring. Real-time updates on the status of the deployed code for a fixed time period: URL, metric values and time to monitor.

**6.** Step 7 -> Step 8 Monitoring to Final Decision(Rollback/Promote). Boolean signal. 
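
One way to carry context across these handoffs is a single serializable object that each step enriches. The sketch below is illustrative only: `PipelineContext` and all of its field names are hypothetical and simply mirror the step outputs described above.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class PipelineContext:
    """Context accumulated across handoffs; field names are illustrative."""
    pr_id: str
    sast_findings: list = field(default_factory=list)  # from Step 2
    test_report: dict = field(default_factory=dict)    # from Step 3
    risk_score: Optional[int] = None                   # from Step 4
    review_summary_md: str = ""                        # from Step 4
    approval: Optional[dict] = None                    # from Step 5
    staging_url: Optional[str] = None                  # from Step 6

    def to_json(self) -> str:
        """Serialize for transport between agents."""
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, raw: str) -> "PipelineContext":
        """Rebuild the context on the receiving side."""
        return cls(**json.loads(raw))
```

A usage example: Step 4 would set `risk_score` and `review_summary_md`, call `to_json()`, and Step 5 would rebuild the same object with `from_json()`.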

## Part B: AI Prompting Strategy (30 points)

**Question 2.1:** For 2 consecutive major steps you identified, design specific AI prompts that would achieve the desired outcome. Include:
- System role/persona definition
- Structured input format
- Expected output format
- Examples of good vs bad responses
- Error handling instructions

**Question 2.2:** How would you handle the following challenging scenarios with your AI prompts:
- **Code that uses obscure libraries or frameworks**
- **Security reviews for code**
- **Performance analysis of database queries**
- **Legacy code modifications**

**Question 2.3:** How would you ensure your prompts are working effectively and getting consistent results?

## Response Part B:

# Question 2.1: AI Prompt Design
## AI Static Analysis and Security Scan 
**SYSTEM ROLE/PERSONA**

You are "Sentinel," a Level 5 Security Auditor. Your primary function is to enforce strict security policies and identify common vulnerabilities (e.g., OWASP Top 10) and language-specific flaws in the provided code diff. Do not suggest stylistic or minor optimization changes.

**STRUCTURED INPUT FORMAT**

JSON containing: { "pr_id": "PR-1234", "file_path": "/src/api/user_service.py", "language": "Python", "base_branch_code": "<FULL CODE OF FILE BEFORE CHANGE>", "new_code_diff": "<NEW CHANGES>" }

**EXPECTED OUTPUT FORMAT**

JSON Array of findings. If no issues are found, return []. [ { "severity": "HIGH" (or MEDIUM/CRITICAL), "vulnerability_type": "SQL Injection", "line_number": 45, "description": "User input is directly concatenated into a SQL query.", "suggested_fix": "Use prepared statements or an ORM for all database interactions.", "is_blocking": true } ]

**EXAMPLE GOOD RESPONSE**

[ { "severity": "CRITICAL", "vulnerability_type": "Hardcoded Secret", "line_number": 12, "description": "The API key for the payment gateway is exposed as a string literal.", "suggested_fix": "Move the key to a secure vault (e.g., HashiCorp Vault or AWS Secrets Manager) and access it via environment variables.", "is_blocking": true } ]

**EXAMPLE BAD RESPONSE**

"The code looks fine, but maybe change the variable name user_data to user_information." (Incorrect persona/focus). OR "SQL query error line 45." (Lacks structured format/detail/fix).

**ERROR HANDLING INSTRUCTIONS**

If input is corrupted or the diff is too large (>500 lines), return a simplified JSON: { "status": "Error", "message": "Diff size exceeds processing limit or input is invalid. Require human review for security." } The system should then alert the SRE team and proceed to Step 4 with a HIGH Risk Score flag.
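
The pipeline side of this prompt could guard both the input and the model's reply along these lines. This is a sketch under assumptions: `precheck_diff` and `parse_findings` are hypothetical helper names, and the required keys are taken from the expected output format above.

```python
import json
from typing import List, Optional

MAX_DIFF_LINES = 500  # limit stated in the error-handling instructions
REQUIRED_KEYS = {
    "severity", "vulnerability_type", "line_number",
    "description", "suggested_fix", "is_blocking",
}


def precheck_diff(diff: str) -> Optional[dict]:
    """Return the error object if the diff is empty or too large, else None."""
    if not diff.strip() or len(diff.splitlines()) > MAX_DIFF_LINES:
        return {
            "status": "Error",
            "message": "Diff size exceeds processing limit or input is invalid. "
                       "Require human review for security.",
        }
    return None


def parse_findings(raw: str) -> List[dict]:
    """Parse the model reply; raise ValueError if it is not a findings array."""
    data = json.loads(raw)  # JSONDecodeError is a subclass of ValueError
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of findings")
    for item in data:
        if not isinstance(item, dict):
            raise ValueError("each finding must be a JSON object")
        missing = REQUIRED_KEYS - set(item)
        if missing:
            raise ValueError(f"finding is missing keys: {sorted(missing)}")
    return data
```

A reply such as the "bad response" example fails in `parse_findings`, which gives the pipeline a clear signal to retry or escalate instead of passing free text downstream.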

## AI Review and Suggestion Engine
**SYSTEM ROLE/PERSONA**

You are "The Mentor," an experienced Principal Engineer. Your goal is to provide concise, constructive, and educational feedback. Summarize the changes and prioritize architectural, complexity, and maintainability improvements, incorporating findings from the prior SAST step.

**STRUCTURED INPUT FORMAT**

JSON containing: { "pr_id": "PR-1234", "diff": "<unified diff format of changes>", "sast_findings": <Output JSON from the SAST prompt above>, "test_coverage_report": "92%", "base_risk_score": 3, "loc_changed": 55 }

**EXPECTED OUTPUT FORMAT**

Markdown output with a "Reviewer Summary" section and an "Inline Comments" section. Reviewer Summary must include the final calculated RISK_SCORE (1-10) and PRIORITY (High/Medium/Low).

**EXAMPLE GOOD RESPONSE**

RISK_SCORE: 7 (High)... Summary: 55 lines changed in critical user service. SAST found 1 CRITICAL error (hardcoded secret) which must be addressed. Performance in the new process_data function is O(n²). Inline Comment (File: user_service.py: line 78): "Consider refactoring this nested loop to a hashmap lookup to reduce complexity to O(n)."

**EXAMPLE BAD RESPONSE**

"Looks good. Ship it." (Lacks analysis/summary). OR A response that ignores the SAST findings.

**ERROR HANDLING INSTRUCTIONS**

If the SAST findings are invalid or missing, calculate the Risk Score based purely on LOC and file criticality (set minimum Risk Score to 5). State clearly in the summary: "WARNING: SAST results were unavailable. Risk Score may be under-estimated."







# Question 2.2: Handling Challenging Scenarios
We handle challenging scenarios by adjusting the System Role/Persona and injecting specific Contextual Data into the prompt.

**1. Obscure Libraries and frameworks:** Strategy: Augment the prompt's Contextual Data with documentation snippets. Prompt Adjustment: Prepend the diff with: "CONTEXT: The developer is using the obscure_data_lib library. The relevant documentation snippet is: <Paste relevant API documentation on state/side effects here>. Analyze the code specifically for correct usage and common pitfalls related to this context."

**2. Security Reviews:** Strategy: Hardcode the SAST tool's priorities (OWASP, internal policy) and strict formatting. Prompt Adjustment: (As seen in the SAST prompt above) Define the persona as a Level 5 Security Auditor and explicitly instruct it to focus only on severity and blocking issues, preventing it from being distracted by style issues.

**3. Performance Analysis of DB queries:** Strategy: Require an Execution Plan in the input and shift the focus to complexity/indexing. Prompt Adjustment: Adjust Input to include: "query_plan": "<Raw output of EXPLAIN ANALYZE for the query>". Instruct the AI: "Analyze the provided SQL query and its execution plan. Identify missing indexes, full table scans, or joins that could lead to O(n²) complexity. Suggest changes to the query or required index additions."

**4. Legacy Code modifications:** Strategy: Emphasize compatibility, testing, and side effects. Prompt Adjustment: Modify the Mentor's persona: "You are an expert in Legacy System Preservation. Focus your review on backward compatibility, unintended side effects, and verify that appropriate tests were added to cover the modified legacy logic. Suggest ways to decouple the new code if possible."


# Question 2.3: Ensuring Prompt Effectiveness and Consistency
In short: structured outputs, iteration, and benchmarking.

**Strict Output Schemas (JSON/Markdown):** Requiring the AI to always return a predictable structure (JSON or strict Markdown) reduces free-form or malformed output and ensures downstream systems (like the CI/CD pipeline) can reliably parse the results.

**Golden Set Benchmarking:** Create a "Golden Set" of 50 PRs (including good, bad, and edge-case examples) with human-verified optimal review findings. The AI's performance is tested against this set after every prompt modification or model update. Consistency is measured by the change in the F1 score for detecting known issues.

**Temperature/Determinism Control:** Run the AI model with a low temperature (≈0.1-0.3) to make the output more deterministic and less creative, so that security findings are consistent rather than varied.

**Feedback Loop and Fine-Tuning:** Collect data where human reviewers override the AI's Risk Score or suggestions. Use this data to fine-tune the model or refine the prompt with specific examples of "false positives" or "missed critical issues."
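
The F1 measurement used for Golden Set benchmarking could be computed per PR as below. This is a sketch: the function name `f1_for_pr` is hypothetical, and findings are assumed to be comparable as string identifiers (for example `"file:line:type"`), which is an illustrative convention rather than something defined above.

```python
def f1_for_pr(expected: set[str], predicted: set[str]) -> float:
    """F1 score of predicted findings against the human-verified findings."""
    if not expected and not predicted:
        return 1.0  # nothing to find and nothing reported counts as a perfect match
    true_pos = len(expected & predicted)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(predicted)
    recall = true_pos / len(expected)
    return 2 * precision * recall / (precision + recall)
```

Averaging this value over the Golden Set before and after a prompt change gives a single number to compare between runs.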

---

## Part C: System Architecture & Reusability (25 points)

**Question 3.1:** How would you make this system reusable across different projects/teams? Consider:
- Configuration management
- Language/framework variations
- Different deployment targets (cloud providers, on-prem)
- Team-specific coding standards
- Industry-specific compliance requirements

**Question 3.2:** How would the system get better over time based on:
- False positive/negative rates in reviews
- Deployment success/failure patterns
- Developer feedback
- Production incident correlation

## Response Part C:
# Question 3.1: Achieving System Reusability

Reusability is achieved through decoupling core logic from configuration and utilizing standardized, parameterized templates. This approach ensures the same underlying engine can serve 50+ microservices across diverse stacks.

**1. Configuration Management (Policy-as-Code)**

- Centralized Repository: Maintain a central repository for all project configurations, defining policies rather than specific steps.
- Configuration Files: Use standardized files (e.g., .iac-config.yaml or .ai-review-policy.json) for each service. These files specify the language, the required test coverage threshold, the target cloud provider, and the maximum allowed risk score for merging.
- Policy Enforcement: The AI/Automation engine reads this configuration file at the start of the pipeline to dynamically load relevant tools and rules.
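
Loading a per-service policy file at pipeline start could be sketched as follows. The key names (`language`, `min_test_coverage`, `cloud_provider`, `max_risk_score`) and the default values are assumptions made for illustration; only the kinds of settings come from the description above.

```python
import json

# Hypothetical defaults applied when a service does not override a setting.
DEFAULT_POLICY = {
    "language": "python",
    "min_test_coverage": 0.80,
    "cloud_provider": "aws",
    "max_risk_score": 7,
}


def load_policy(text: str) -> dict:
    """Parse a JSON policy document and fill in defaults for missing keys."""
    overrides = json.loads(text)
    unknown = set(overrides) - set(DEFAULT_POLICY)
    if unknown:
        # Reject typos early instead of silently ignoring them.
        raise ValueError(f"unknown policy keys: {sorted(unknown)}")
    return {**DEFAULT_POLICY, **overrides}
```

For example, `load_policy('{"language": "typescript"}')` keeps the default coverage and risk settings while switching the language-specific tooling.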

**2. Language/Framework Variations**

- Modular Tooling Agents: The system is built with pluggable agents. The core pipeline logic simply calls an abstract function (run_linter()).
    - If the config specifies language: python, the agent uses Black (formatting) and Bandit (security linting).
    - If the config specifies language: typescript, the agent uses ESLint with typescript-eslint (TSLint itself is deprecated).
- Language-Specific LLM Fine-Tuning: The AI Review Engine (The Mentor) maintains separate skill-sets or fine-tuned models for different languages (e.g., Java vs. Go) to provide contextually accurate suggestions.
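
The pluggable-agent idea behind `run_linter()` could be sketched with a simple registry. Everything except the `run_linter` name is hypothetical, and the two registered functions are stand-in checks: a real agent would invoke the team's actual tooling rather than scan for substrings.

```python
from typing import Callable, Dict, List

Linter = Callable[[str], List[str]]
_LINTERS: Dict[str, Linter] = {}


def register_linter(language: str) -> Callable[[Linter], Linter]:
    """Decorator that registers a linter agent for one language."""
    def decorator(fn: Linter) -> Linter:
        _LINTERS[language] = fn
        return fn
    return decorator


def run_linter(language: str, source: str) -> List[str]:
    """Dispatch to the agent registered for the configured language."""
    try:
        linter = _LINTERS[language]
    except KeyError:
        raise ValueError(f"no linter agent registered for {language!r}") from None
    return linter(source)


@register_linter("python")
def lint_python(source: str) -> List[str]:
    # Stand-in check only; not a real Python linter.
    return ["avoid eval()"] if "eval(" in source else []


@register_linter("typescript")
def lint_typescript(source: str) -> List[str]:
    # Stand-in check only; not a real TypeScript linter.
    return ["avoid 'any'"] if ": any" in source else []
```

Adding a new language then means registering one more function, with no change to the core pipeline logic.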

**3. Different Deployment Targets**

- Abstract Deployment Logic (Pipelined Steps): Implement the deployment process as a series of abstract steps that use Infrastructure-as-Code (IaC) templates.
- Cloud Agnostic IaC: Use tools like Terraform or Pulumi with parameterized modules that abstract cloud provider differences. The configuration dictates the provider (provider: aws vs. provider: azure) and the IaC module handles the provider-specific syntax (e.g., creating an EKS cluster vs. an AKS cluster).
- Standardized Artifacts: All services must build standardized artifacts (e.g., Docker images or language-agnostic packages) to ensure the deployment step remains consistent regardless of the target.

**4. Team-Specific Coding Standards**

- Hierarchical Rulesets: Allow standards to be defined at two levels:
    - Global Standard: Company-wide security and quality rules (mandatory for all).
    - Team Override: Team-specific rules (e.g., team-alpha.linter-rules.json) that are loaded after the global standard, allowing for variations in indentation, naming conventions, etc.
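
The two-level ruleset could be resolved with a merge that lets teams override style rules but not mandatory global ones. This is a sketch: `merge_rules`, the `locked` parameter, and the rule names in the usage example are all hypothetical.

```python
from typing import FrozenSet


def merge_rules(global_rules: dict, team_rules: dict,
                locked: FrozenSet[str] = frozenset()) -> dict:
    """Apply team overrides on top of global rules, skipping locked keys."""
    merged = dict(global_rules)
    for key, value in team_rules.items():
        if key in locked:
            continue  # mandatory global rule: the team override is ignored
        merged[key] = value
    return merged


# Usage with made-up rule names.
company = {"indent": 4, "forbid_hardcoded_secrets": True}
team_alpha = {"indent": 2, "forbid_hardcoded_secrets": False}
effective = merge_rules(company, team_alpha, locked=frozenset({"forbid_hardcoded_secrets"}))
```

Here `effective` keeps the company-wide secret check while adopting the team's indentation preference.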

**5. Industry-Specific Compliance Requirements**

- Compliance Check Agent: Introduce a dedicated, mandatory pipeline step called the Compliance Agent. It uses the project's configuration (e.g., compliance: pci-dss) to check for specific rules:
    - SAST Integration: Ensures specific checks for known vulnerabilities related to the compliance standard are prioritized.
    - Audit Trail: Enforces that specific data (e.g., review comments, approval IDs) are logged and immutable.

# Question 3.2: System Improvement over Time (Feedback Loops)
The system is designed to be a self-improving, closed-loop mechanism utilizing MLOps principles to continuously refine the AI models and automation thresholds.

**1. Learning from False Positive/Negative Rates (AI Review Refinement)**

- Human Feedback Loop: Every time a human reviewer accepts an AI suggestion, it's a True Positive (TP). Every time they dismiss a suggestion, it's a False Positive (FP). Every time they add a comment that the AI missed, it's a False Negative (FN).
- Model Retraining: This structured feedback data (TP/FP/FN) is collected and used to retrain the AI Review Engine (The Mentor) on a weekly or bi-weekly cycle. The goal is to maximize the precision of suggestions (reducing FPs) and recall (reducing FNs).
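
Turning reviewer actions into TP/FP/FN counts, and from there into precision and recall, could be sketched as follows. The event names (`accepted`, `dismissed`, `human_added`) and the function `summarize_feedback` are hypothetical labels for the three reviewer actions described above.

```python
from collections import Counter
from typing import List

# Assumed mapping from reviewer action to outcome label.
LABELS = {"accepted": "TP", "dismissed": "FP", "human_added": "FN"}


def summarize_feedback(events: List[str]) -> dict:
    """Count outcomes and derive precision and recall for one review period."""
    counts = Counter(LABELS[event] for event in events)
    tp, fp, fn = counts["TP"], counts["FP"], counts["FN"]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"TP": tp, "FP": fp, "FN": fn, "precision": precision, "recall": recall}
```

Tracking these two ratios per retraining cycle shows whether a new model version trades fewer false positives for more missed issues.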

**2. Learning from Deployment Success/Failure Patterns (Rollback Thresholds)**

- Adaptive Thresholds: The Automatic Rollback AI Model uses the continuous stream of deployment results (Step 7/8).
    - If a deployment failed when metrics were only 1σ off the baseline, the system tightens the rollback threshold for that specific microservice.
    - If a deployment succeeded despite metrics being 3σ off, the system loosens the threshold.
- ML Model Training: Historical data points (Metrics → Human Decision/Auto Rollback) are continuously fed into a supervised learning model to optimize the predictive decision logic for triggering a rollback, moving beyond simple static metric rules.
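
Before any ML model is involved, the tighten/loosen rule for adaptive thresholds could be approximated by a simple update step. This is only a sketch of the idea: the function name, the step size of 0.25σ, and the 1σ to 4σ bounds are all assumed values, not part of the design above.

```python
def adjust_threshold(threshold_sigma: float, observed_sigma: float,
                     deployment_failed: bool, step: float = 0.25,
                     lower: float = 1.0, upper: float = 4.0) -> float:
    """Return the new rollback threshold (in sigmas) for one microservice."""
    if deployment_failed and observed_sigma < threshold_sigma:
        # A bad deployment stayed under the threshold: tighten it.
        return max(lower, threshold_sigma - step)
    if not deployment_failed and observed_sigma > threshold_sigma:
        # A healthy deployment exceeded the threshold: loosen it.
        return min(upper, threshold_sigma + step)
    return threshold_sigma
```

The bounds keep a run of unusual deployments from pushing the threshold to a value where rollback would either always or never trigger.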

**3. Learning from Developer Feedback**

- Sentiment Analysis: The system performs sentiment analysis on developer responses to AI comments. If the feedback is consistently negative (e.g., "The AI is spamming irrelevant suggestions"), it indicates a prompt or model decay that requires immediate attention and prompt engineering adjustments.
- Usability Metrics: Track the time developers spend interacting with the AI comments. If the interaction time is high but the acceptance rate is low, the AI's suggestions are likely poorly formatted or confusing.

**4. Learning from Production Incident Correlation**

- Root Cause Analysis (RCA) Integration: When a production incident occurs, the system traces the failure back to the last deployed PR.
- Backtesting the Review: The system re-runs the final code against the original SAST and AI Review models.
    - If the AI failed to flag the line of code that caused the incident, the failure is labeled as a Critical Miss. This specific code segment becomes an immediate, high-priority addition to the Golden Set Benchmarking data to ensure the model learns from the mistake and prevents recurrence.



---

## Part D: Implementation Strategy (20 points)

**Question 4.1:** Prioritize your implementation. What would you build first? Create a 6-month roadmap with:
- MVP definition (what's the minimum viable system?)
- Pilot program strategy
- Rollout phases
- Success metrics for each phase

**Question 4.2:** Risk mitigation. What could go wrong and how would you handle:
- AI making incorrect review decisions
- System downtime during critical deployments
- Integration failures with existing tools
- Resistance from development teams
- Compliance/audit requirements

**Question 4.3:** Tool selection. What existing tools/platforms would you integrate with or build upon:
- Code review platforms (GitHub, GitLab, Bitbucket)
- CI/CD systems (Jenkins, GitHub Actions, GitLab CI)
- Monitoring tools (Datadog, New Relic, Prometheus)
- Security scanning tools (SonarQube, Snyk, Veracode)
- Communication tools (Slack, Teams, Jira)

## Response Part D:


# Question 4.1: Implementation Roadmap
The priority has to be tackling the biggest time sinks and quality issues first: manual code reviews and inconsistent quality. The Minimum Viable Product (MVP) will focus solely on the pre-merge gates. Here's a 6-month roadmap.

**(MONTH 1-2) PHASE I: MVP**
### FOCUS AND GOAL
Implement Automated Code Review (Steps 1-3) on one pilot team/microservice. This focuses on blocking broken/insecure code before deployment.
### SUCCESS METRICS
Test success rate >95%. 100% of Critical security issues caught by SAST. Average review time reduced by >50% for the pilot service.

**(MONTH 3-4) PHASE II: CI/CD Standard + AI Assistant**
### FOCUS AND GOAL
Standardize the CI/CD pipeline template across 10 key microservices. Introduce the AI Review/Suggestion Engine (Step 4) as a non-blocking helper tool.
### SUCCESS METRICS
Adoption Rate: 10 microservices using the standard pipeline. AI Acceptance Rate: >60% of AI's suggestions are accepted by reviewers.

**(MONTH 5-6) PHASE III: Production Automation**
### FOCUS AND GOAL
Implement Canary Deployment (Step 7) and Automatic Rollback (Step 8) on 3 high-traffic, non-critical services. Finalize IaC templates for all environments.
### SUCCESS METRICS
Rollback Success Rate: 100% of failed Canaries result in successful automated rollback. 0 production incidents caused by the 3 pilot services during this phase.

## Pilot Program Strategy

We’ll start with one small, non-critical microservice owned by a tech-forward, cooperative team (the "Early Adopters").

**Baseline Measurement:** Collect 1 month of data: average review time, deployment frequency, and security scan reports.

**MVP Rollout:** Implement Phase I tools and run in parallel with the old process for 2 weeks.

**Go-Live:** Switch the pilot service entirely to the new AI-powered workflow.

**Feedback:** Collect daily feedback on friction points (e.g., "The AI kept failing my build for a minor style error"). Use this feedback to rapidly tune the AI prompts and linter rules before rolling out Phase II.


# Question 4.2: Risk Mitigation
Some possible weak points and how we would handle them. 

**1. AI making incorrect review decisions (false positives/negatives):** The "Human Override" Safety Net: In the MVP, the AI is a helper, not a gate. Give human reviewers the power to override the AI Risk Score. Collect every override and feed it back for immediate prompt engineering refinement or model retraining. Start softly: don't let the AI block merges in the first month.

**2. System downtime during deployments:** Decouple and Redundancy: Use managed, highly available CI/CD infrastructure (e.g., cloud-based GitHub Actions/GitLab CI) instead of self-hosted Jenkins. The core rollback logic (Step 8) should be external and idempotent—a simple, small service running on highly reliable infrastructure that only needs the service ID and version to execute a rollback command.

**3. Integration Failures with existing tools:** Wrapper Layer: Build small, well-tested API wrappers/micro-services for each legacy tool (e.g., a "Security Scanner Wrapper"). If the underlying tool fails, the wrapper returns a graceful error instead of crashing the pipeline, allowing the rest of the flow to proceed with a HIGH RISK flag.

**4. Resistance from development teams:** Sell the Time Savings: Market the system as "Less Boilerplate, Faster Reviews." Emphasize that the AI handles 90% of the boring, time-consuming style and minor bug checks, freeing them up for interesting architectural reviews. Make the AI non-blocking initially to build trust.

**5. Compliance requirements:** Audit Trail by Default: Every action and decision (SAST result, human approval, AI Risk Score, rollback trigger) is automatically logged to a WORM (Write Once, Read Many) audit log. This ensures all compliance checks (e.g., separation of duties) are non-repudiable and easily extractable for auditors.
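
The wrapper layer described for integration failures (item 3) could behave like the sketch below. The function name `run_with_fallback` and the shape of the returned dictionary are assumptions for illustration; the point is that a tool failure becomes data with a HIGH risk flag instead of an exception that stops the pipeline.

```python
from typing import Callable


def run_with_fallback(tool_name: str, tool: Callable[[], dict]) -> dict:
    """Run an external tool call and convert any failure into a flagged result."""
    try:
        return {"tool": tool_name, "status": "ok", "result": tool(), "risk_flag": None}
    except Exception as exc:  # deliberately broad: any tool failure must not crash the pipeline
        return {
            "tool": tool_name,
            "status": "error",
            "error": str(exc),
            "risk_flag": "HIGH",
        }


def broken_scanner() -> dict:
    # Stand-in for a scanner whose backend is unreachable.
    raise TimeoutError("scanner did not respond")


outcome = run_with_fallback("security-scanner", broken_scanner)
```

Downstream steps can then check `risk_flag` and route the PR to mandatory human review rather than treating the missing scan as a pass.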

# Question 4.3: Tool Selection (I used Gemini for this)

We should integrate and build upon best-of-breed tools to accelerate implementation. We won't try to build a CI/CD system from scratch!

| Tool Category | Recommended Platform(s) | Role in the System |
| :--- | :--- | :--- |
| **Code Review Platform** | **GitHub Enterprise/GitLab (whichever is primary)** | **The Core Interface:** This is where the AI posts comments, the Human Approval gate resides, and the PR merge is triggered. |
| **CI/CD Systems** | **GitHub Actions / GitLab CI** | **The Pipeline Orchestrator:** Use their native **Pipeline-as-Code** features for standardized, reusable YAML templates. This is the backbone for Steps 1-8. |
| **Monitoring Tools** | **Prometheus (for metrics) & Grafana (for dashboards)** | **The Rollback Brain:** Centralized collection of Golden Signals. The AI Rollback Agent (Step 7) queries Prometheus for real-time metric deviations. |
| **Security Scanning Tools** | **SonarQube (Code Quality) & Snyk (Dependency/SAST)** | **The Pre-Merge Gate Keepers (Step 2):** SonarQube for consistent code quality/smells; Snyk for fast, integrated dependency and security scanning. |
| **Communication Tools** | **Slack/Teams** | **Alerting & Escalation:** Immediate notifications for high-risk AI scores, deployment failures, and successful automated rollbacks. Use dynamic channels based on the microservice owner. |
| **AI/LLM Backend** | **Vertex AI / Azure AI / OpenAI (with private deployment)** | **The Brains (Steps 2 & 4):** Host the specialized fine-tuned models (Sentinel and The Mentor) for security analysis and constructive code review suggestions. |

