# Day 2 - Lab 2: Documenting Key Decisions with ADRs

**Objective:** Use an LLM as a research assistant to compare technical options and synthesize the findings into a formal, version-controlled Architectural Decision Record (ADR).

**Estimated Time:** 60 minutes

**Introduction:**
Great architectural decisions are based on research and trade-offs. A critical practice for healthy, long-lived projects is documenting *why* these decisions were made. In this lab, you will use an LLM to research a key technical choice for our application and then generate a formal ADR to record that decision for the future.

For definitions of key terms used in this lab, please refer to the [GLOSSARY.md](../../GLOSSARY.md).

## Step 1: Setup

We'll start by ensuring our environment is ready and adding the project root to `sys.path` so that the `utils.py` helper module can be imported reliably.

**Model Selection:**
For research and synthesis tasks, models with large context windows and strong reasoning abilities are ideal. `gpt-4.1`, `gemini-2.5-pro`, or `meta-llama/Llama-3.3-70B-Instruct` would be excellent choices.

**Helper Functions Used:**
- `setup_llm_client()`: To configure the API client.
- `get_completion()`: To send prompts to the LLM.
- `load_artifact()`: To read the ADR template.
- `save_artifact()`: To save the generated ADR template and the final ADR.
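
The real helpers live in the course's `utils.py`, whose implementation is not shown in this lab. As a rough mental model only, the two artifact helpers could behave like the hypothetical stand-ins below, which read and write UTF-8 text files relative to a base directory. The names `save_text_artifact` and `load_text_artifact` and the `base_dir` parameter are illustrative assumptions, not the actual `utils.py` API.

```python
from pathlib import Path
from typing import Optional


def save_text_artifact(content: str, relative_path: str, base_dir: str = ".") -> Path:
    """Hypothetical stand-in: write text to base_dir/relative_path, creating folders."""
    target = Path(base_dir) / relative_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content, encoding="utf-8")
    return target


def load_text_artifact(relative_path: str, base_dir: str = ".") -> Optional[str]:
    """Hypothetical stand-in: return the file's text, or None if it does not exist."""
    target = Path(base_dir) / relative_path
    if not target.is_file():
        return None
    return target.read_text(encoding="utf-8")
```

Returning `None` for a missing file, rather than raising, mirrors the way the lab code checks `if adr_template` before using a loaded artifact.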

In [1]:
import sys
import os

# Add the project's root directory to the Python path so that 'utils' can be imported.
# Assumes this notebook runs from a folder two levels below the project root.
project_root = os.path.abspath(os.path.join(os.getcwd(), '..', '..'))

if project_root not in sys.path:
    sys.path.insert(0, project_root)

from utils import setup_llm_client, get_completion, save_artifact, load_artifact

client, model_name, api_provider = setup_llm_client(model_name="gemini-2.5-pro")

2025-10-28 10:48:42,456 ag_aisoftdev.utils INFO LLM Client configured provider=google model=gemini-2.5-pro latency_ms=None artifacts_path=None


## Step 2: The Challenges

### Challenge 1 (Foundational): The ADR Template

**Task:** A good ADR follows a consistent format. Your first task is to prompt an LLM to generate a clean, reusable ADR template in markdown.

**Instructions:**
1.  Write a prompt that asks the LLM to generate a markdown template for an Architectural Decision Record.
2.  The template should include sections for: `Title`, `Status` (e.g., Proposed, Accepted, Deprecated), `Context` (the problem or forces at play), `Decision` (the chosen solution), and `Consequences` (the positive and negative results of the decision).
3.  Save the generated template to `templates/adr_template.md`.
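
Because the template comes from an LLM, it can be worth confirming that the required sections are actually present before saving it. The sketch below is one possible check, not part of the lab's helpers: `missing_sections` and `REQUIRED_SECTIONS` are hypothetical names, and the function assumes the title is expressed as a top-level `#` heading while the other sections appear as headings containing their names.

```python
import re

# Section names taken from the instructions above; Title is checked separately.
REQUIRED_SECTIONS = ["Status", "Context", "Decision", "Consequences"]


def missing_sections(template_text: str) -> list:
    """Return the names of required ADR sections that have no matching heading."""
    # Capture (level, text) for every ATX heading line such as "## Status".
    headings = re.findall(r"^(#{1,6})[ \t]+(.*)$", template_text, flags=re.MULTILINE)
    missing = []
    # The title counts as present if there is any top-level heading.
    if not any(level == "#" for level, _ in headings):
        missing.append("Title")
    heading_text = " ".join(text.lower() for _, text in headings)
    missing += [name for name in REQUIRED_SECTIONS if name.lower() not in heading_text]
    return missing
```

An empty list means every section was found; anything else is a cue to adjust the prompt and regenerate.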

In [2]:
# TODO: Write a prompt to generate a markdown ADR template.
adr_template_prompt = """
You are a technical documentation expert tasked with creating a reusable Architectural Decision Record (ADR) template.

**Your Task:**
Generate a clean, professional markdown template for documenting architectural decisions. This template will be used by engineers to record important technical choices throughout a project's lifecycle.

**Required Sections:**
1. **Title**: A clear, concise title in the format "ADR-XXX: [Decision Title]"
2. **Status**: The current state of the decision (e.g., Proposed, Accepted, Rejected, Deprecated, Superseded)
3. **Date**: When the decision was made or proposed
4. **Context**: 
   - What is the issue or problem we're addressing?
   - What technical, business, or organizational forces are at play?
   - What constraints or requirements must we consider?
5. **Decision**: 
   - What solution or approach have we chosen?
   - Be specific and clear about what will be implemented
6. **Consequences**:
   - **Positive**: What benefits does this decision bring?
   - **Negative**: What drawbacks, risks, or trade-offs does this decision create?
   - **Neutral**: What are the side effects or implications?

**Format Requirements:**
- Use markdown formatting with clear headers
- Include placeholder text in each section to guide the author
- Use italics for instructional text that should be replaced
- Make it professional and easy to follow
- Keep it concise but comprehensive

Generate the ADR template now:
"""

print("--- Generating ADR Template ---")
adr_template_content = get_completion(adr_template_prompt, client, model_name, api_provider)
print(adr_template_content)

# Save the artifact
if adr_template_content:
    save_artifact(adr_template_content, "templates/adr_template.md")

--- Generating ADR Template ---
Of course. Here is a clean, professional, and reusable markdown template for an Architectural Decision Record (ADR).

---

```markdown
# ADR-XXX: [Short, descriptive title of the decision]

*Replace XXX with the ADR number (e.g., 001, 002).*

## Status

**Status**: Proposed

*Can be: Proposed, Accepted, Rejected, Deprecated, Superseded by [ADR-YYY].*

## Date

**Date**: YYYY-MM-DD

## Context

*This section describes the "why" of the decision. What is the problem, situation, or business need we are addressing? What are the constraints and requirements?*

**Problem Statement**
*Describe the issue, problem, or challenge we are addressing. For example, "Our current authentication system does not support single sign-on (SSO), which is a key requirement for our new enterprise clients."*

**Drivers & Constraints**
*List the forces influencing this decision. These can be technical, business, or operational.*
*   **Driver:** *e.g., Improve system performance to
```

### Challenge 2 (Intermediate): AI-Assisted Research

**Task:** Use the LLM to perform unbiased research on a key technical decision for our project: choosing a database for semantic search.

**Instructions:**
1.  Write a prompt instructing the LLM to perform a technical comparison.
2.  Ask it to compare and contrast two technical options: **"Using PostgreSQL with the `pgvector` extension"** versus **"Using a specialized vector store such as ChromaDB (a vector database) or FAISS (a similarity-search library)"**.
3.  The prompt should ask for a balanced view for the specific use case of our new hire onboarding tool.
4.  Store the output in a variable for the next step.

> **Tip:** To get a balanced comparison, explicitly ask the LLM to "act as an unbiased research assistant" and to list the "pros and cons for each approach." This discourages the model from simply recommending the more popular option and encourages a more critical analysis.
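
The tip above can be turned into a small reusable prompt builder. This is a minimal sketch under stated assumptions: `build_comparison_prompt` is a hypothetical helper, and the exact wording it emits is only one way to phrase the request.

```python
def build_comparison_prompt(options, use_case):
    """Assemble a comparison prompt that asks for the same analysis of every option."""
    option_lines = "\n".join(
        f"- Option {i}: {name}" for i, name in enumerate(options, start=1)
    )
    return (
        "Act as an unbiased research assistant.\n"
        f"Use case: {use_case}\n\n"
        "Compare the following options:\n"
        f"{option_lines}\n\n"
        "For each option, list the pros and cons for that approach, "
        "using the same criteria for every option.\n"
        "Do not recommend an option based on popularity alone."
    )
```

Generating the option list from data keeps the framing identical for each candidate, which makes it harder to bias the comparison through uneven descriptions.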

In [3]:
# TODO: Write a prompt to research database options.
db_research_prompt = """
You are an unbiased technical research assistant tasked with comparing database solutions for semantic search functionality in a new hire onboarding application.

**Project Context:**
We are building the Momentum Onboarding Platform, a web application that helps companies streamline their new hire onboarding process. The platform will include features like:
- Centralized onboarding hub with company resources, documentation, and FAQs
- Task management and tracking for new hires, managers, and HR
- Searchable knowledge base where new hires can find answers to common questions
- Resource library with policies, handbooks, and training materials

**Technical Decision:**
We need to implement semantic search capabilities to allow new hires to search the onboarding hub using natural language queries (e.g., "How do I request time off?" or "Who is my HR contact?"). This requires storing and querying vector embeddings of our content.

**Your Task:**
Provide a balanced, objective comparison of two technical approaches for implementing this semantic search capability:

**Option 1: PostgreSQL with pgvector extension**
- PostgreSQL is a mature, open-source relational database
- pgvector is an extension that adds vector similarity search capabilities

**Option 2: Specialized vector databases (ChromaDB or FAISS)**
- ChromaDB is a purpose-built vector database designed for AI applications
- FAISS is a library for efficient similarity search maintained by Meta

**Analysis Requirements:**
For each option, provide:
1. **Pros**: What are the advantages and strengths of this approach?
2. **Cons**: What are the drawbacks, limitations, or challenges?
3. **Use Case Fit**: How well does it align with our specific needs (onboarding tool with moderate scale, relatively small team, need for reliability)?
4. **Operational Considerations**: What are the infrastructure, maintenance, and skill requirements?

**Important Guidelines:**
- Be objective and balanced - do not favor one option simply because it's more popular
- Consider practical factors like: operational complexity, cost, learning curve, ecosystem maturity
- Think about the specific context: this is an enterprise onboarding tool, not a large-scale consumer application
- We need a solution that balances capability with simplicity

Provide your detailed comparison now:
"""

print("--- Researching Database Options ---")
db_research_output = get_completion(db_research_prompt, client, model_name, api_provider)
print(db_research_output)

--- Researching Database Options ---
Of course. Here is a balanced, objective comparison of PostgreSQL with pgvector and specialized vector databases for the Momentum Onboarding Platform's semantic search needs.

***

### **Executive Summary**

This analysis compares two primary approaches for implementing semantic search in the Momentum Onboarding Platform.

*   **Option 1 (PostgreSQL + pgvector):** Integrates vector search capabilities directly into our primary relational database. This approach prioritizes architectural simplicity, unified data management, and operational stability by leveraging a mature, well-understood technology.
*   **Option 2 (Specialized Vector Databases):** Introduces a purpose-built database (like ChromaDB) or library (like FAISS) dedicated solely to high-performance vector search. This approach prioritizes cutting-edge performance and a developer experience tailored for AI/ML workflows, at the cost of increased architectural complexity.

For the Momentum On

### Challenge 3 (Advanced): Synthesizing the ADR

**Task:** Provide the LLM with your research from the previous step and have it formally document the decision.

**Instructions:**
1.  Load the `adr_template.md` you created in the first challenge.
2.  Create a new prompt instructing the LLM to act as a Staff Engineer.
3.  Provide the `db_research_output` as context.
4.  Instruct the LLM to populate the ADR template, formally documenting the decision to **use PostgreSQL with pgvector** and justifying the choice based on the synthesized pros and cons.
5.  Save the final, completed ADR as `artifacts/adr_001_database_choice.md`.
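
In the Challenge 1 output, the model wrapped the template in a conversational preamble and a markdown code fence. If that raw response was saved as the template, it may help to strip the wrapper before passing the template to the next prompt. The following is a sketch of one way to do that; `extract_markdown_block` is a hypothetical helper, and it assumes the template itself contains no nested code fences.

```python
import re

# Built from single backticks so this example does not contain a literal fence.
FENCE = "`" * 3
_BLOCK_PATTERN = re.compile(FENCE + r"(?:markdown|md)[ \t]*\n(.*?)\n" + FENCE, re.DOTALL)


def extract_markdown_block(response_text: str) -> str:
    """Return the body of the first markdown-tagged fence, else the stripped text."""
    match = _BLOCK_PATTERN.search(response_text)
    if match:
        return match.group(1).strip()
    # No fence found: assume the whole response is the document.
    return response_text.strip()
```

In this lab it could be applied as `adr_template = extract_markdown_block(adr_template)` right after loading the template.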

In [4]:
adr_template = load_artifact("templates/adr_template.md")

# TODO: Write a prompt to synthesize the final ADR.
synthesis_prompt = f"""
You are a Staff Engineer at Momentum, responsible for making and documenting key architectural decisions for the Onboarding Platform project.

**Your Task:**
Based on the technical research provided below, you need to create a formal Architectural Decision Record (ADR) that documents the decision to **use PostgreSQL with the pgvector extension** for implementing semantic search capabilities in our application.

**Context:**
After careful consideration of the trade-offs, the engineering team has decided to go with PostgreSQL + pgvector rather than a specialized vector database. Your job is to formally document this decision using the ADR template provided.

**Instructions:**
1. Populate the ADR template with the following details:
   - **Title**: "ADR-001: Database Choice for Semantic Search"
   - **Status**: "Accepted"
   - **Date**: Use today's date (October 28, 2025)
   
2. Write the **Context** section by:
   - Explaining the need for semantic search in the onboarding platform
   - Describing the key forces and constraints (team size, operational complexity, need for reliability, budget)
   - Summarizing the options that were considered
   
3. Write the **Decision** section by:
   - Clearly stating the chosen solution: PostgreSQL with pgvector extension
   - Briefly explaining what this means technically
   - Providing 2-3 key reasons why this option was selected over the alternatives
   
4. Write the **Consequences** section by extracting insights from the research:
   - **Positive**: List 3-4 concrete benefits of this decision (drawn from the research pros)
   - **Negative**: List 2-3 honest drawbacks or limitations we accept (drawn from the research cons)
   - **Neutral**: List 1-2 neutral implications or things to be aware of

**Research Findings:**
{db_research_output}

**ADR Template to Populate:**
{adr_template}

**Output Requirements:**
- Use professional, technical language appropriate for a formal architectural document
- Be specific and concrete - avoid vague statements
- Ensure the document will be valuable to future engineers who need to understand this decision
- Remove any placeholder text or instructional comments from the template
- Output the completed ADR in markdown format

Generate the completed ADR now:
"""

print("--- Synthesizing Final ADR ---")
# The f-string above already requires db_research_output to be defined,
# so only empty or missing values need to be checked here.
if adr_template and db_research_output:
    final_adr = get_completion(synthesis_prompt, client, model_name, api_provider)
    print(final_adr)
    # Save only if the model returned content.
    if final_adr:
        save_artifact(final_adr, "artifacts/adr_001_database_choice.md")
else:
    print("Skipping ADR synthesis because template or research is missing.")

--- Synthesizing Final ADR ---
# ADR-001: Database Choice for Semantic Search

## Status

**Status**: Accepted

## Date

**Date**: 2025-10-28

## Context

### Problem Statement
The Momentum Onboarding Platform requires a semantic search capability to allow users to find relevant information within our knowledge base (e.g., HR policies, technical documentation, setup guides) using natural language queries. A simple keyword search is insufficient as it fails to capture the user's intent. For example, a query for "how do I get paid?" should match documents titled "Understanding Your Paystub" or "Setting Up Direct Deposit." This requires storing and querying high-dimensional vector embeddings generated from our documents.

### Drivers & Constraints
*   **Driver:** Implement a powerful semantic search feature to improve user experience and information discovery.
*   **Driver:** Enable hybrid search, allowing users to filter search results by metadata (e.g., department, author, date) in addi

## Lab Conclusion

Well done! You have used an LLM to automate a complex but critical part of the architectural process. You leveraged its vast knowledge base for research and then used it again for synthesis, turning raw analysis into a formal, structured document. This `adr_001_database_choice.md` file now serves as a permanent, valuable record for anyone who works on this project in the future.

> **Key Takeaway:** The pattern of **Research -> Synthesize -> Format** is a powerful workflow. You can use an LLM to gather unstructured information and then use it again to pour that information into a structured template, creating high-quality, consistent documentation with minimal effort.
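
The workflow in the takeaway can be sketched as a single function that chains two LLM calls. This is an illustration under assumptions rather than the lab's implementation: `research_synthesize_format` is a hypothetical name, and `complete` stands for any callable that takes a prompt string and returns the model's text.

```python
from typing import Callable


def research_synthesize_format(
    question: str, template: str, complete: Callable[[str], str]
) -> str:
    """Run two LLM calls: gather research, then pour it into the template."""
    # Step 1 (Research): collect unstructured findings.
    research = complete(
        f"Act as an unbiased research assistant.\n\nQuestion: {question}"
    )
    # Steps 2 and 3 (Synthesize, Format): fill the structured template.
    synthesis_prompt = (
        "Populate the template below using only the research provided.\n\n"
        f"Research:\n{research}\n\n"
        f"Template:\n{template}"
    )
    return complete(synthesis_prompt)
```

With the helpers used in this lab, `complete` could be `lambda p: get_completion(p, client, model_name, api_provider)`. Passing the callable in also makes the pipeline easy to exercise with a fake model during testing.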