# Day 2 - Lab 2: Documenting Key Decisions with ADRs

**Objective:** Use an LLM as a research assistant to compare technical options and synthesize the findings into a formal, version-controlled Architectural Decision Record (ADR).

**Estimated Time:** 60 minutes

**Introduction:**
Great architectural decisions are based on research and trade-offs. A critical practice for healthy, long-lived projects is documenting *why* these decisions were made. In this lab, you will use an LLM to research a key technical choice for our application and then generate a formal ADR to record that decision for the future.

For definitions of key terms used in this lab, please refer to the [GLOSSARY.md](../../GLOSSARY.md).

## Step 1: Setup

We'll start by making sure the environment is ready and adding the project root to `sys.path` so that the `utils.py` helper can be imported reliably.

**Model Selection:**
For research and synthesis tasks, models with large context windows and strong reasoning abilities are ideal. `gpt-4.1`, `gemini-2.5-pro`, or `meta-llama/Llama-3.3-70B-Instruct` would be excellent choices.

**Helper Functions Used:**
- `setup_llm_client()`: To configure the API client.
- `get_completion()`: To send prompts to the LLM.
- `load_artifact()`: To read the ADR template.
- `save_artifact()`: To save the generated ADR template and the final ADR.

In [1]:
import sys
import os

# Add the project's root directory (two levels above the current working directory)
# to the Python path so 'utils' can be imported.
project_root = os.path.abspath(os.path.join(os.getcwd(), '..', '..'))

if project_root not in sys.path:
    sys.path.insert(0, project_root)

from utils import setup_llm_client, get_completion, save_artifact, load_artifact

client, model_name, api_provider = setup_llm_client(model_name="gemini-2.5-pro")

2025-10-29 13:06:41,198 ag_aisoftdev.utils INFO LLM Client configured provider=google model=gemini-2.5-pro latency_ms=None artifacts_path=None
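The setup cell assumes the notebook runs from a directory two levels below the project root. If that assumption does not hold in your environment, a more defensive approach is to walk upward until a marker file is found. The sketch below is one possible way to do that; `find_project_root` and `add_to_sys_path` are hypothetical helper names, and using `utils.py` as the marker is an assumption based on the import in this lab.

```python
from pathlib import Path
import sys


def find_project_root(start: Path, marker: str = "utils.py") -> Path:
    """Return the nearest directory at or above `start` that contains `marker`."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).is_file():
            return candidate
    raise FileNotFoundError(f"No {marker!r} found at or above {start}")


def add_to_sys_path(root: Path) -> None:
    """Put `root` at the front of sys.path unless it is already present."""
    root_str = str(root)
    if root_str not in sys.path:
        sys.path.insert(0, root_str)
```

In a notebook this would be called as `add_to_sys_path(find_project_root(Path.cwd()))` before the `from utils import ...` line.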


## Step 2: The Challenges

### Challenge 1 (Foundational): The ADR Template

**Task:** A good ADR follows a consistent format. Your first task is to prompt an LLM to generate a clean, reusable ADR template in markdown.

**Instructions:**
1.  Write a prompt that asks the LLM to generate a markdown template for an Architectural Decision Record.
2.  The template should include sections for: `Title`, `Status` (e.g., Proposed, Accepted, Deprecated), `Context` (the problem or forces at play), `Decision` (the chosen solution), and `Consequences` (the positive and negative results of the decision).
3.  Save the generated template to `templates/adr_template.md`.

In [2]:
# Write a prompt to generate a markdown ADR template.
adr_template_prompt = """You are a principal engineer creating a reusable Architectural Decision Record template for a cross-functional product team. Produce polished markdown with clear headings and bullet placeholders. Include the following sections in this order: Title, Status, Context (with bullet prompts for problem, driving forces, constraints), Decision (with rationale line), and Consequences (sub-bullets for positive outcomes, negative trade-offs, and future work). Output pure markdown only—no surrounding code fences, no commentary."""

print("--- Generating ADR Template ---")
adr_template_content = get_completion(
    adr_template_prompt,
    client,
    model_name,
    api_provider,
    temperature=0.2,
)
print(adr_template_content)

# Save the artifact
if adr_template_content:
    save_artifact(adr_template_content, "templates/adr_template.md", overwrite=True)

--- Generating ADR Template ---
# [Title of ADR]

## Status

[Proposed | Accepted | Deprecated | Superseded]

## Context

*   **Problem:** Describe the problem or challenge that this decision addresses. What is the user story, business need, or technical debt being solved?
*   **Driving Forces:** List the key factors influencing this decision. Examples include non-functional requirements (performance, security, scalability), team skills, strategic alignment, or product goals.
    *   -
*   **Constraints:** List any constraints or limitations that must be considered. Examples include budget, timeline, existing technology stack, legal/compliance requirements, or company policies.
    *   -

## Decision

We will [describe the chosen solution or approach in a clear, concise statement].

**Rationale:** [Explain why this decision was made. Reference the context, compare it to alternatives considered, and justify the choice based on the driving forces and constraints.]

## Consequences

*   *
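Because the template is generated text, it may be worth confirming that every required section is present before saving it. The following is a minimal sketch of such a check, not part of the lab's `utils.py`; `missing_sections` and `REQUIRED_SECTIONS` are hypothetical names. The title is not checked by name because the template renders it as the top-level heading.

```python
import re

# Section headings the lab instructions ask for (the title is the H1 itself).
REQUIRED_SECTIONS = ("Status", "Context", "Decision", "Consequences")


def missing_sections(markdown: str, required=REQUIRED_SECTIONS) -> list[str]:
    """Return the required section names that have no matching markdown heading."""
    # Collect heading text from lines such as "## Status" or "### Context".
    headings = {
        match.group(1).strip().lower()
        for match in re.finditer(
            r"^#{1,6}[ \t]+(.+?)[ \t]*$", markdown, flags=re.MULTILINE
        )
    }
    return [name for name in required if name.lower() not in headings]
```

A call such as `missing_sections(adr_template_content)` returning an empty list would suggest the template has the expected structure; a non-empty list is a cue to re-run or adjust the prompt.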

### Challenge 2 (Intermediate): AI-Assisted Research

**Task:** Use the LLM to perform unbiased research on a key technical decision for our project: choosing a database for semantic search.

**Instructions:**
1.  Write a prompt instructing the LLM to perform a technical comparison.
2.  Ask it to compare and contrast two technical options: **"Using PostgreSQL with the `pgvector` extension"** versus **"Using a specialized vector store such as ChromaDB or a FAISS-based index"** (FAISS is a similarity-search library rather than a full database).
3.  The prompt should ask for a balanced view for the specific use case of our new hire onboarding tool.
4.  Store the output in a variable for the next step.

> **Tip:** To get a balanced comparison, explicitly ask the LLM to 'act as an unbiased research assistant' and to list the 'pros and cons for each approach.' This makes the model less likely to simply recommend the more popular option and encourages a more critical analysis.
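One way to apply this tip consistently is to build comparison prompts from a small function, so the "unbiased" framing and the pros-and-cons request are never left out. This is only an illustrative sketch; `build_comparison_prompt` is a hypothetical helper and the wording is an assumption, not the prompt used in the cell below.

```python
def build_comparison_prompt(option_a: str, option_b: str, use_case: str) -> str:
    """Assemble a comparison prompt that asks for pros and cons on both sides."""
    return (
        "Act as an unbiased research assistant.\n"
        f"Compare these two options for the following use case: {use_case}.\n"
        f"Option A: {option_a}\n"
        f"Option B: {option_b}\n"
        "For each option, list the pros and cons under separate headings, "
        "giving both options the same level of detail.\n"
        "Finish with criteria for when each option is preferable, "
        "without declaring a single winner."
    )
```

The returned string can be passed to `get_completion()` in the same way as any hand-written prompt.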

In [3]:
# Write a prompt to research database options.
db_research_prompt = """Act as an unbiased staff-level research assistant evaluating database options for OnboardPro’s semantic search and analytics needs. Compare using PostgreSQL with the pgvector extension versus adopting a specialized vector database such as ChromaDB or FAISS-backed service. Provide a structured markdown response with: (1) summary of the workload requirements (based on a new-hire onboarding SaaS handling secure HR data, moderate scale, need for analytics/dashboarding); (2) pros, cons, and operational risks for PostgreSQL + pgvector; (3) pros, cons, and operational risks for a specialized vector database; and (4) a balanced recommendation criteria list highlighting when each option is preferable. Avoid code fences and keep tone objective."""

print("--- Researching Database Options ---")
db_research_output = get_completion(
    db_research_prompt,
    client,
    model_name,
    api_provider,
    temperature=0.2,
)
print(db_research_output)

--- Researching Database Options ---
Of course. Here is an evaluation of database options for OnboardPro's semantic search and analytics needs, presented in the requested format.

***

**To:** Engineering Leadership
**From:** Staff Research Assistant
**Date:** October 26, 2023
**Subject:** Evaluation of Database Options for Semantic Search and Analytics

This document provides an unbiased comparison between using PostgreSQL with the pgvector extension and adopting a specialized vector database to meet OnboardPro's product requirements.

### 1. Summary of Workload Requirements

The OnboardPro platform requires a database solution to support two primary functions: semantic search and user analytics. The specific characteristics of our workload are:

*   **Semantic Search:** The core feature is enabling new hires to ask natural language questions and find relevant information within a corpus of company policies, onboarding guides, and knowledge base articles. This requires storing vector 
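The response above opens with a conversational preamble and a memo-style header before the first real section. If that framing is unwanted in the context passed to the next step, it can be trimmed first. The sketch below is one simple approach under the assumption that the useful content starts at the first markdown heading; `strip_preamble` is a hypothetical helper.

```python
def strip_preamble(text: str) -> str:
    """Drop everything before the first markdown heading.

    If the text contains no heading, it is returned unchanged.
    """
    lines = text.splitlines()
    for index, line in enumerate(lines):
        if line.lstrip().startswith("#"):
            return "\n".join(lines[index:])
    return text
```

For example, `db_research_output = strip_preamble(db_research_output)` would leave the text starting at the "Summary of Workload Requirements" heading.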

### Challenge 3 (Advanced): Synthesizing the ADR

**Task:** Provide the LLM with your research from the previous step and have it formally document the decision.

**Instructions:**
1.  Load the `adr_template.md` you created in the first challenge.
2.  Create a new prompt instructing the LLM to act as a Staff Engineer.
3.  Provide the `db_research_output` as context.
4.  Instruct the LLM to populate the ADR template, formally documenting the decision to **use PostgreSQL with pgvector** and justifying the choice based on the synthesized pros and cons.
5.  Save the final, completed ADR as `artifacts/adr_001_database_choice.md`.

In [4]:
adr_template = load_artifact("templates/adr_template.md")

# Write a prompt to synthesize the final ADR.
synthesis_prompt = f"""You are a staff engineer documenting an Architectural Decision Record for the OnboardPro onboarding platform. Populate the provided ADR template using the research summary while finalizing the decision to adopt PostgreSQL with the pgvector extension as the primary vector store. Requirements:
- Set Status to Accepted and date the context to the current quarter generically (e.g., Q4 2025).
- Reference key findings from the research when justifying the decision, noting trade-offs and mitigation strategies.
- Keep tone concise and professional, suitable for version-controlled documentation.
- Use only plain markdown and preserve the template structure without extra sections or commentary.
\n\nADR TEMPLATE\n--------------\n{adr_template}\n\nRESEARCH SUMMARY\n----------------\n{db_research_output}\n"""

print("--- Synthesizing Final ADR ---")
if adr_template and db_research_output:
    final_adr = get_completion(
        synthesis_prompt,
        client,
        model_name,
        api_provider,
        temperature=0.2,
    )
    print(final_adr)
    save_artifact(final_adr, "artifacts/adr_001_database_choice.md", overwrite=True)
else:
    print("Skipping ADR synthesis because template or research is missing.")

--- Synthesizing Final ADR ---
# ADR-001: Vector Store for Semantic Search and Analytics

## Status

Accepted

## Context

*   **Problem:** The OnboardPro platform requires a backend to support two core features: 1) A semantic search capability allowing users to ask natural language questions against a corpus of onboarding documents, and 2) Analytics dashboards for HR administrators to track search usage and effectiveness. This solution must store vector embeddings alongside structured metadata and support complex, filtered queries.
*   **Driving Forces:** List the key factors influencing this decision.
    *   - **Rich, Filtered Queries:** Search queries must be filterable by structured metadata (e.g., department, location, hire date) in a single, atomic operation.
    *   - **Unified Analytics:** The system must support complex analytical queries that join search activity data with relational HR data for dashboarding.
    *   - **Operational Simplicity:** Minimizing new infrastructur
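Before committing a generated ADR, a quick automated check can confirm that the status was set and that no bracketed template placeholders were left behind. The sketch below is an assumption about what such a check might look like; `extract_status`, `unfilled_placeholders`, and `ALLOWED_STATUSES` are hypothetical names. It will not catch leftover guidance sentences copied from the template, so a human read-through is still needed.

```python
import re
from typing import Optional

# Status values listed in the generated template.
ALLOWED_STATUSES = {"Proposed", "Accepted", "Deprecated", "Superseded"}


def extract_status(adr_markdown: str) -> Optional[str]:
    """Return the first non-blank line under the '## Status' heading, if any."""
    lines = adr_markdown.splitlines()
    for index, line in enumerate(lines):
        if line.strip().lower() == "## status":
            for following in lines[index + 1:]:
                if following.strip().startswith("#"):
                    return None  # reached the next heading without a value
                if following.strip():
                    return following.strip()
    return None


def unfilled_placeholders(adr_markdown: str) -> list[str]:
    """Return bracketed placeholders such as '[Title of ADR]' left in the text."""
    # A placeholder is bracketed text that is not the label of an inline link.
    return re.findall(r"\[[^\[\]\n]+\](?!\()", adr_markdown)
```

A reasonable gate before `save_artifact()` would be `extract_status(final_adr) in ALLOWED_STATUSES and not unfilled_placeholders(final_adr)`.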

## Lab Conclusion

Well done! You have used an LLM to speed up a tedious but critical part of the architectural process: first as a research aid to compare options, then as a synthesis tool to turn that analysis into a formal, structured document. The `adr_001_database_choice.md` file now records why the decision was made, for anyone who works on this project in the future.

> **Key Takeaway:** The pattern of **Research -> Synthesize -> Format** is a powerful workflow. You can use an LLM to gather unstructured information and then use it again to pour that information into a structured template, creating high-quality, consistent documentation with minimal effort.
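The two LLM calls in this lab can be expressed as one small function. The sketch below is illustrative only: `research_then_synthesize` is a hypothetical name, and `complete` stands in for any callable that maps a prompt string to a response string, which keeps the sketch independent of a particular provider.

```python
from typing import Callable


def research_then_synthesize(
    complete: Callable[[str], str],
    research_prompt: str,
    template: str,
    decision: str,
) -> str:
    """Run the research step, then ask for the template to be filled from it."""
    # Step 1: gather the unstructured comparison.
    research = complete(research_prompt)
    # Step 2: feed the research and the template back in for synthesis.
    synthesis_prompt = (
        f"Populate the ADR template below to document this decision: {decision}\n\n"
        f"ADR TEMPLATE\n{template}\n\n"
        f"RESEARCH SUMMARY\n{research}\n"
    )
    return complete(synthesis_prompt)
```

With this lab's helpers, `complete` could be `lambda p: get_completion(p, client, model_name, api_provider, temperature=0.2)`. Passing the callable in also makes the workflow easy to exercise with a stub instead of a live model.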