# Day 1 - Lab 1: AI-Powered Requirements & User Stories (Solution)

**Objective:** Use a Large Language Model (LLM) to decompose a vague problem statement into structured features, user personas, and Agile user stories, culminating in a machine-readable JSON artifact.

**Introduction:**
This notebook contains the complete solution for Lab 1. It demonstrates how to use an LLM to systematically break down a problem, generate structured requirements, and programmatically validate the output. Each step includes explanations of the code and the reasoning behind the prompts.

For definitions of key terms used in this lab, please refer to the [GLOSSARY.md](../../GLOSSARY.md).

## Step 1: Setup

**Purpose:** This initial block of code prepares our environment for the lab. It adds the project root to the system path to ensure our `utils` package can be imported, and then initializes the LLM API client.

**Model Selection:**
Our `utils` package is configured to work with multiple AI providers. You can change the `model_name` parameter in the `setup_llm_client()` function to any of the models listed in the `RECOMMENDED_MODELS` dictionary in `utils`. For example, to use a Hugging Face model, you could change the line to: `client, model_name, api_provider = setup_llm_client(model_name="meta-llama/Llama-3.3-70B-Instruct")`

**Libraries Explained:**
- **`os`**, **`sys`**: Standard Python libraries for interacting with the file system and Python's path, ensuring our modules are discoverable.
- **`json`**: A standard library for working with JSON data. We use `json.loads` to parse the LLM's text output into a Python dictionary or list, and `json.dumps` to format Python objects into a pretty-printed JSON string for saving.
- **`utils`**: Our custom helper script. 
  - `setup_llm_client()`: Handles reading the `.env` file and initializing the API client.
  - `get_completion()`: Simplifies the process of sending a prompt to the LLM and receiving a text response.
  - `save_artifact()`: Ensures our project artifacts are stored consistently in the `artifacts` directory.
  - `clean_llm_output()`: A new standardized function to remove markdown fences from LLM outputs.
  - `prompt_enhancer()`: An advanced meta-prompt system that takes raw user input and optimizes it using prompt engineering best practices, including role assignment, context grounding, and structural organization.
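
As a quick illustration of the `json` round trip described above, here is a minimal sketch. The sample string is made up for this example and only stands in for an LLM response.

```python
import json

# Stand-in for text returned by an LLM (illustrative data, not real model output).
llm_text = '[{"id": 1, "persona": "New Hire"}]'

# json.loads turns JSON text into Python objects (here, a list of dictionaries).
stories = json.loads(llm_text)

# json.dumps turns Python objects back into text; indent=2 pretty-prints it.
pretty = json.dumps(stories, indent=2)
print(pretty)
```

Parsing the pretty-printed text again yields the same Python objects, which is why the indented form is safe to save as an artifact.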

In [1]:
import sys
import os
import json
import tqdm as notebook_tqdm

# Add the project's root directory to the Python path to ensure 'utils' can be imported.
# Assumes the notebook lives two levels below the project root (e.g. 'labs/Day_01_.../').
# os.path.abspath and os.path.join do not raise IndexError, so no try/except is needed here.
project_root = os.path.abspath(os.path.join(os.getcwd(), '..', '..'))

if project_root not in sys.path:
    sys.path.insert(0, project_root)

from utils import setup_llm_client, get_completion, save_artifact, clean_llm_output, recommended_models_table, prompt_enhancer

In [2]:
recommended_models_table()

| Model | Provider | Text | Vision | Image Gen | Image Edit | Audio Transcription | Context Window | Max Output Tokens |
|---|---|---|---|---|---|---|---|---|
| Qwen/Qwen-Image | huggingface | ❌ | ❌ | ✅ | ❌ | ❌ | - | - |
| Qwen/Qwen-Image-Edit | huggingface | ❌ | ❌ | ❌ | ✅ | ❌ | - | - |
| black-forest-labs/FLUX.1-Kontext-dev | huggingface | ❌ | ❌ | ❌ | ✅ | ❌ | - | - |
| claude-opus-4-1-20250805 | anthropic | ✅ | ✅ | ❌ | ❌ | ❌ | 200,000 | 100,000 |
| claude-opus-4-20250514 | anthropic | ✅ | ✅ | ❌ | ❌ | ❌ | 200,000 | 100,000 |
| claude-sonnet-4-20250514 | anthropic | ✅ | ✅ | ❌ | ❌ | ❌ | 1,000,000 | 100,000 |
| dall-e-3 | openai | ❌ | ❌ | ✅ | ❌ | ❌ | - | - |
| deepseek-ai/DeepSeek-V3.1 | huggingface | ✅ | ❌ | ❌ | ❌ | ❌ | 128,000 | 100,000 |
| gemini-1.5-flash | google | ✅ | ✅ | ❌ | ❌ | ❌ | 1,000,000 | 8,192 |
| gemini-1.5-pro | google | ✅ | ✅ | ❌ | ❌ | ❌ | 2,000,000 | 8,192 |
| gemini-2.0-flash-exp | google | ✅ | ✅ | ❌ | ❌ | ❌ | 1,048,576 | 8,192 |
| gemini-2.0-flash-preview-image-generation | google | ❌ | ❌ | ✅ | ❌ | ❌ | 32,000 | 8,192 |
| gemini-2.5-flash | google | ✅ | ✅ | ❌ | ❌ | ❌ | 1,048,576 | 65,536 |
| gemini-2.5-flash-image-preview | google | ❌ | ❌ | ✅ | ❌ | ❌ | 32,768 | 32,768 |
| gemini-2.5-flash-lite | google | ✅ | ✅ | ❌ | ❌ | ❌ | 1,048,576 | 65,536 |
| gemini-2.5-pro | google | ✅ | ✅ | ❌ | ❌ | ❌ | 1,048,576 | 65,536 |
| gemini-live-2.5-flash-preview | google | ❌ | ❌ | ❌ | ❌ | ❌ | 1,048,576 | 8,192 |
| gpt-4.1 | openai | ✅ | ✅ | ❌ | ❌ | ❌ | 1,000,000 | 32,768 |
| gpt-4.1-mini | openai | ✅ | ✅ | ❌ | ❌ | ❌ | 1,000,000 | 32,000 |
| gpt-4.1-nano | openai | ✅ | ✅ | ❌ | ❌ | ❌ | 1,000,000 | 32,000 |
| gpt-4o | openai | ✅ | ✅ | ❌ | ❌ | ❌ | 128,000 | 16,384 |
| gpt-4o-mini | openai | ✅ | ✅ | ❌ | ❌ | ❌ | 128,000 | 16,384 |
| gpt-5-2025-08-07 | openai | ✅ | ✅ | ❌ | ❌ | ❌ | 400,000 | 128,000 |
| gpt-5-mini-2025-08-07 | openai | ✅ | ✅ | ❌ | ❌ | ❌ | 400,000 | 128,000 |
| gpt-5-nano-2025-08-07 | openai | ✅ | ✅ | ❌ | ❌ | ❌ | 400,000 | 128,000 |
| meta-llama/Llama-3.3-70B-Instruct | huggingface | ✅ | ❌ | ❌ | ❌ | ❌ | 8,192 | 4,096 |
| meta-llama/Llama-4-Maverick-17B-128E-Instruct | huggingface | ✅ | ❌ | ❌ | ❌ | ❌ | 1,000,000 | 100,000 |
| meta-llama/Llama-4-Scout-17B-16E-Instruct | huggingface | ✅ | ❌ | ❌ | ❌ | ❌ | 10,000,000 | 100,000 |
| mistralai/Mistral-7B-Instruct-v0.3 | huggingface | ✅ | ❌ | ❌ | ❌ | ❌ | 32,768 | 8,192 |
| o3 | openai | ✅ | ✅ | ❌ | ❌ | ❌ | 200,000 | 100,000 |
| o4-mini | openai | ✅ | ✅ | ❌ | ❌ | ❌ | 200,000 | 100,000 |
| stabilityai/stable-diffusion-3.5-large | huggingface | ❌ | ❌ | ✅ | ❌ | ❌ | - | - |
| tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.5 | huggingface | ✅ | ❌ | ❌ | ❌ | ❌ | 4,096 | 1,024 |
| veo-3.0-fast-generate-preview | google | ❌ | ❌ | ❌ | ❌ | ❌ | 1,024 | - |
| veo-3.0-generate-preview | google | ❌ | ❌ | ❌ | ❌ | ❌ | 1,024 | - |
| whisper-1 | openai | ❌ | ❌ | ❌ | ❌ | ✅ | - | - |

In [3]:
# Initialize the LLM client. You can change the model here.
# For example: setup_llm_client(model_name="gemini-2.5-flash")
brainstormed_features_client, brainstormed_features_model_name, brainstormed_features_api_provider = setup_llm_client(model_name="gemini-2.5-pro")
user_personas_client, user_personas_model_name, user_personas_api_provider = setup_llm_client(model_name="deepseek-ai/DeepSeek-V3.1")

2025-09-11 15:37:39,061 ag_aisoftdev.utils INFO LLM Client configured provider=google model=gemini-2.5-pro latency_ms=None artifacts_path=None
2025-09-11 15:37:39,146 ag_aisoftdev.utils INFO LLM Client configured provider=huggingface model=deepseek-ai/DeepSeek-V3.1 latency_ms=None artifacts_path=None


## Step 2: The Problem Statement

We define our starting point, a simple, high-level problem statement, as a Python variable so that it is easy to reuse in multiple prompts.

In [4]:
problem_statement = "We need a tool to help our company's new hires get up to speed."

## Step 3: The Challenges

Here are the complete solutions for each challenge.

### Challenge 1 (Foundational): Brainstorming Features

**Explanation:**
This first challenge demonstrates the power of prompt enhancement. We start with simple, raw prompts for brainstorming features and identifying user personas. However, instead of using these basic prompts directly, we pass them through our `prompt_enhancer` function, which applies advanced prompt engineering techniques to optimize them.

The `prompt_enhancer` automatically:
- Assigns appropriate expert personas (e.g., "You are a Senior Product Manager")
- Provides structured context and grounding
- Defines clear task instructions with assertive action verbs
- Sets explicit output format expectations
- Organizes the prompt with clear structural delimiters

**Key Efficiency Features:**
- We reuse the LLM clients initialized in Step 1 for the generation calls, avoiding duplicate setup
- Different models can be used for different tasks (e.g., Gemini for features, DeepSeek for personas)
- The personas prompt includes the brainstormed features as context for better coherence

This enhancement process transforms simple requests into highly optimized prompts that produce more focused, detailed, and useful outputs. The goal is to generate a broad set of high-quality ideas (features and personas) that will serve as the foundation for the more structured tasks to follow.
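
To make the idea of structural delimiters concrete, the sketch below assembles a prompt from a persona, context, numbered instructions, and an output format. It is not the implementation of `prompt_enhancer` (which delegates the rewriting to an LLM); `build_structured_prompt` and the `<output_format>` tag name are hypothetical, chosen only to mirror the `<persona>`, `<context>`, and `<instructions>` sections visible in the enhanced prompt output.

```python
def build_structured_prompt(persona, context, instructions, output_format):
    """Assemble a prompt with explicit XML-style sections (hypothetical helper)."""
    # Number the instructions so the model can follow them in order.
    steps = "\n".join(f"{n}. {step}" for n, step in enumerate(instructions, start=1))
    sections = [
        ("persona", persona),
        ("context", context),
        ("instructions", steps),
        ("output_format", output_format),  # tag name is an assumption
    ]
    body = "\n".join(f"<{tag}>\n{text}\n</{tag}>" for tag, text in sections)
    return f"<prompt>\n{body}\n</prompt>"


problem_statement = "We need a tool to help our company's new hires get up to speed."
prompt = build_structured_prompt(
    persona="You are a Senior Product Manager.",
    context=f"Problem statement: {problem_statement}",
    instructions=[
        "Brainstorm potential features for a new hire onboarding tool.",
        "Keep each feature to one sentence.",
    ],
    output_format="A simple markdown list.",
)
print(prompt)
```

Keeping each concern in its own delimited section makes it easier to see which part of a prompt to adjust when the output drifts.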

In [5]:
# Challenge 1: Brainstorming Features and User Personas
print("=" * 60)
print("CHALLENGE 1: AI-POWERED REQUIREMENTS GENERATION")
print("=" * 60)

# Step 1: Enhance Features Prompt
print("\n--- STEP 1: ENHANCING FEATURES PROMPT ---")
raw_features_prompt = f"Based on the problem statement: '{problem_statement}', brainstorm a list of potential features for a new hire onboarding tool. Format the output as a simple markdown list."

enhanced_features_prompt = prompt_enhancer(raw_features_prompt)
print("Brainstorm Enhanced prompt\n", enhanced_features_prompt)

# Step 2: Generate Brainstormed Features
print("\n--- STEP 2: GENERATING BRAINSTORMED FEATURES ---")
brainstormed_features = get_completion(
    enhanced_features_prompt,
    brainstormed_features_client,
    brainstormed_features_model_name,
    brainstormed_features_api_provider
)
print(brainstormed_features)

# Step 3: Enhance Personas Prompt
print("\n--- STEP 3: ENHANCING PERSONAS PROMPT ---")
raw_personas_prompt = f"Based on the problem statement: '{problem_statement}' and the following brainstormed features: {brainstormed_features}, identify and describe three distinct user personas who would interact with this tool. For each persona, describe their role and main goal."

enhanced_personas_prompt = prompt_enhancer(raw_personas_prompt)
print("Personas Enhanced prompt\n", enhanced_personas_prompt)

# Step 4: Generate User Personas
print("\n--- STEP 4: GENERATING USER PERSONAS ---")
user_personas = get_completion(
    enhanced_personas_prompt,
    user_personas_client,
    user_personas_model_name,
    user_personas_api_provider
)
print(user_personas)

print("\n" + "=" * 60)
print("CHALLENGE 1 COMPLETED SUCCESSFULLY!")
print("=" * 60)

CHALLENGE 1: AI-POWERED REQUIREMENTS GENERATION

--- STEP 1: ENHANCING FEATURES PROMPT ---


2025-09-11 15:37:39,420 ag_aisoftdev.utils INFO LLM Client configured provider=openai model=o3 latency_ms=None artifacts_path=None


Brainstorm Enhanced prompt
 <prompt>
    <persona>
        You are a Senior HR Technology Consultant and Onboarding Specialist with extensive experience designing digital solutions that accelerate new-hire productivity and engagement.
    </persona>

    <context>
        Our company plans to build an internal onboarding tool whose primary goal is to help new employees get up to speed quickly and confidently. The target audience includes new hires, their managers, HR administrators, and IT support staff. The tool should address knowledge transfer, cultural immersion, compliance, feedback loops, and analytics while delivering a smooth, engaging user experience.
    </context>

    <instructions>
        1. Think step by step about the onboarding journey from multiple stakeholder perspectives (new hire, manager, HR, IT) and identify capabilities that remove friction, provide clarity, and foster engagement.  
        2. Brainstorm a comprehensive yet concise set of potential features for 

2025-09-11 15:38:21,107 ag_aisoftdev.utils INFO LLM Client configured provider=openai model=o3 latency_ms=None artifacts_path=None


- Personalized onboarding journeys and checklists tailored to role, department, and location.
- A pre-start date "Welcome Hub" with first-day logistics, team intros, and digital paperwork.
- Dedicated manager dashboard with new hire progress tracking, task lists, and coaching resources.
- Automated IT and equipment provisioning workflow triggered by HRIS integration.
- Interactive company directory with an org chart, employee bios, and a "who-to-ask-about-what" guide.
- Collaborative 30-60-90 day goal-setting module for new hires and their managers.
- Automated "Onboarding Buddy" or mentor matching system with conversation starters.
- Scheduled pulse check surveys (e.g., at 7, 30, and 90 days) to gauge sentiment and belonging.
- Gamified learning modules with progress tracking, badges, and milestone celebrations.
- A centralized, searchable knowledge base including a company acronym and jargon buster.
- Analytics dashboard measuring key metrics like time-to-productivity, engagement sco

### Challenge 2 (Intermediate): Generating Formal User Stories

**Explanation:**
This challenge represents a significant increase in complexity and value. We are no longer asking for simple text; we are demanding a specific, structured data format (JSON). 

The prompt is carefully engineered:
1.  **Persona:** `You are a Senior Product Manager...` tells the LLM the role it should adopt.
2.  **Context:** We provide the previous outputs (`problem_statement`, `brainstormed_features`, `user_personas`) inside `<context>` tags to give the LLM all the necessary information.
3.  **Format:** The `OUTPUT REQUIREMENTS` section is extremely explicit. It tells the LLM to *only* output JSON, defines the exact keys for each object, and specifies the format for nested data (like the array of Gherkin strings). This strictness is key to getting reliable, machine-readable output.
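4.  **Parsing:** The `try...except` block is a crucial step. It strips any markdown fences with `clean_llm_output` and then attempts to parse the LLM's string output into a Python list of dictionaries. If it succeeds, we know the output is syntactically valid JSON; whether each story has the right fields is checked in Challenge 3. If it fails, we print a preview of the raw output to help debug the prompt and fall back to an empty list so that later cells still run.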
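
The fence-stripping and parsing step can be sketched on its own as follows. This is an assumption about the general technique, not the code of `clean_llm_output`; `strip_markdown_fences` and `parse_json_array` are hypothetical names used only for this example.

```python
import json
import re

# Three backticks, built indirectly so this example nests safely inside markdown.
FENCE = "`" * 3

# Matches an opening fence with an optional language tag, then captures up to the closing fence.
_FENCED_BLOCK = re.compile(FENCE + r"[\w+-]*[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def strip_markdown_fences(text):
    """Return the content of the first fenced block, or the stripped text if there is none."""
    match = _FENCED_BLOCK.search(text)
    return match.group(1).strip() if match else text.strip()


def parse_json_array(raw):
    """Parse raw LLM text into a list; return None if it is not a JSON array."""
    try:
        data = json.loads(strip_markdown_fences(raw))
    except (json.JSONDecodeError, TypeError):
        return None
    return data if isinstance(data, list) else None


# Illustrative input: a fenced JSON array, similar in shape to the raw output shown later.
fenced = FENCE + 'json\n[\n  {"id": 1}\n]\n' + FENCE
print(parse_json_array(fenced))
print(parse_json_array("Sorry, I cannot help with that."))
```

Returning `None` instead of raising keeps the calling cell simple: it can fall back to an empty list whenever the result is not a list.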

In [6]:
# Challenge 2: Generating Formal User Stories
print("=" * 60)
print("CHALLENGE 2: GENERATING FORMAL USER STORIES")

raw_json_prompt = f"""
You are a Senior Product Manager creating a product backlog for a new hire onboarding tool.

Based on the following context:
<context>
Problem Statement: {problem_statement}
Potential Features: {brainstormed_features}
User Personas: {user_personas}
</context>

Your task is to generate a list of 5 detailed user stories.

**OUTPUT REQUIREMENTS**:
- You MUST output a valid JSON array. Your response must begin with [ and end with ]. Do not include any text or markdown before or after the JSON array.
- Each object in the array must represent a single user story.
- Each object must have the following keys: 'id' (an integer), 'persona' (a string from the personas), 'user_story' (a string in the format 'As a [persona], I want [goal], so that [benefit].'), and 'acceptance_criteria' (an array of strings, with each string in Gherkin format 'Given/When/Then').
"""

# Step 1: Enhance the JSON prompt
enhanced_json_prompt = prompt_enhancer(raw_json_prompt)
print("JSON User Stories Enhanced prompt\n", enhanced_json_prompt)

# Step 2: Initialize LLM client for JSON user stories
print("\n--- STEP 2: INITIALIZING LLM CLIENT FOR JSON USER STORIES ---")
client, model_name, api_provider = setup_llm_client(model_name="gpt-4o")
print(f"Using model: {model_name} provider: {api_provider}")

# Step 3: Generate User Stories as JSON
print("\n--- STEP 3: GENERATING USER STORIES AS JSON ---")
# We set a lower temperature to encourage the LLM to stick to the requested format.
json_output_str = get_completion(enhanced_json_prompt, client, model_name, api_provider, temperature=0.2)

print(f"Raw LLM response length: {len(json_output_str)} characters")
print("First 200 characters of response:")
print(repr(json_output_str[:200]))

# Attempt to parse the string output into a Python list.
try:
    # Use our new standardized cleaning function from utils
    cleaned_json_str = clean_llm_output(json_output_str, language='json')
    print(f"\nCleaned JSON length: {len(cleaned_json_str)} characters")
    
    user_stories_json = json.loads(cleaned_json_str)
    print("✅ Successfully parsed LLM output as JSON.")
    print(f"Number of user stories generated: {len(user_stories_json)}")
    
    # Pretty-print the first user story to verify its structure
    print("\n--- Sample User Story ---")
    print(json.dumps(user_stories_json[0], indent=2))
    
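except (json.JSONDecodeError, TypeError, IndexError, KeyError) as e:
    # IndexError/KeyError cover valid JSON that is an empty array or an object rather than a non-empty array.
    print(f"❌ Error: Failed to parse LLM output as a non-empty JSON array. Error: {e}")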
    print("\n--- DEBUGGING INFO ---")
    print("Raw LLM Output preview (truncated to 1000 chars):")
    if 'json_output_str' in locals() and json_output_str:
        print("-" * 60)
        print(json_output_str[:1000])
        print("-" * 60)
    else:
        print("  <No raw LLM output (json_output_str) available>")

    print("\n⚠️  Setting user_stories_json to an empty list to prevent downstream errors.")
    user_stories_json = []

2025-09-11 15:38:38,850 ag_aisoftdev.utils INFO LLM Client configured provider=openai model=o3 latency_ms=None artifacts_path=None


CHALLENGE 2: GENERATING FORMAL USER STORIES


2025-09-11 15:38:51,219 ag_aisoftdev.utils INFO LLM Client configured provider=openai model=gpt-4o latency_ms=None artifacts_path=None


JSON User Stories Enhanced prompt
 <prompt>
  <persona>
    You are a veteran Agile Product Owner and user-story writing expert. You think through requirements methodically, but you only reveal final answers—never your private reasoning chain.
  </persona>

  <context>
    Problem Statement: We need a tool to help our company's new hires get up to speed.

    Potential Features:
    - Personalized onboarding journeys and checklists tailored to role, department, and location.
    - A pre-start date “Welcome Hub” with first-day logistics, team intros, and digital paperwork.
    - Dedicated manager dashboard with new-hire progress tracking, task lists, and coaching resources.
    - Automated IT and equipment provisioning workflow triggered by HRIS integration.
    - Interactive company directory with an org chart, employee bios, and a “who-to-ask-about-what” guide.
    - Collaborative 30-60-90 day goal-setting module for new hires and their managers.
    - Automated “Onboarding Buddy” or 

### Challenge 3 (Advanced): Programmatic Validation and Artifact Creation

**Explanation:**
This is the final and most critical step. We treat the LLM's output as untrusted input and subject it to programmatic validation. This ensures that the artifact we create is reliable and can be consumed by other automated tools in later stages of the SDLC without causing errors. 

The `validate_and_save_stories` function acts as a gatekeeper. It checks for the correct data types (a list of objects) and ensures that all required fields are present in each object. Only if all checks pass do we proceed to save the file using `save_artifact`. This creates a trustworthy `day1_user_stories.json` file that can be confidently used as an input for other automated processes in our SDLC.
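
If stricter gatekeeping is wanted, the same idea extends to value types and text formats. The sketch below is not part of the lab's `validate_and_save_stories`; `find_story_problems` and the two regular expressions are hypothetical and deliberately loose, shown only to illustrate per-field checks on a single story.

```python
import re

# Expected Python type for each required key (an assumption based on the prompt's output requirements).
REQUIRED_TYPES = {"id": int, "persona": str, "user_story": str, "acceptance_criteria": list}
STORY_PATTERN = re.compile(r"^As .+, I want .+, so that .+", re.IGNORECASE)
GHERKIN_PATTERN = re.compile(r"\bGiven\b.+\bWhen\b.+\bThen\b", re.IGNORECASE | re.DOTALL)


def find_story_problems(story):
    """Return a list of human-readable problems; an empty list means the story looks valid."""
    if not isinstance(story, dict):
        return ["story is not a JSON object"]
    problems = []
    for key, expected in REQUIRED_TYPES.items():
        if key not in story:
            problems.append(f"missing key '{key}'")
        elif not isinstance(story[key], expected):
            problems.append(f"'{key}' should be {expected.__name__}")
    if problems:
        return problems  # format checks below assume keys and types are correct
    if not STORY_PATTERN.search(story["user_story"]):
        problems.append("user_story does not follow 'As a ..., I want ..., so that ...'")
    if not story["acceptance_criteria"]:
        problems.append("acceptance_criteria is empty")
    for n, criterion in enumerate(story["acceptance_criteria"]):
        if not isinstance(criterion, str) or not GHERKIN_PATTERN.search(criterion):
            problems.append(f"acceptance_criteria[{n}] is not in Given/When/Then form")
    return problems


good_story = {
    "id": 1,
    "persona": "The Anxious New Hire",
    "user_story": "As The Anxious New Hire, I want a personalized onboarding journey, "
                  "so that I feel prepared and welcomed from day one.",
    "acceptance_criteria": [
        "Given my role, department, and location, When I access the onboarding tool, "
        "Then I see a personalized checklist tailored to my profile."
    ],
}
bad_story = {"id": 2, "persona": "Manager", "user_story": "Make onboarding better", "acceptance_criteria": []}

print(find_story_problems(good_story))
print(find_story_problems(bad_story))
```

Collecting problems in a list rather than stopping at the first one gives a fuller picture of what to fix in the prompt.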

In [7]:
def validate_and_save_stories(stories_data):
    """Validates the structure of the user stories data and saves it if valid."""
    if not isinstance(stories_data, list) or not stories_data:
        print("Validation Failed: Data is not a non-empty list.")
        return False

    required_keys = ['id', 'persona', 'user_story', 'acceptance_criteria']
    all_stories_valid = True

    # Loop through each story object in the list.
    for i, story in enumerate(stories_data):
        # Each story must be a dictionary; `key in story` is not a key lookup on other types.
        if not isinstance(story, dict):
            print(f"Validation Failed: Story at index {i} is not a dictionary (found {type(story).__name__}).")
            all_stories_valid = False
            continue

        # Check for the presence of all required keys.
        if not all(key in story for key in required_keys):
            print(f"Validation Failed: Story at index {i} is missing one or more required keys.")
            print(f"   Expected keys: {required_keys}")
            print(f"   Found keys: {list(story.keys())}")
            all_stories_valid = False
            continue  # Skip further checks for this invalid story
        
        # Check that the acceptance criteria is a list with at least one item.
        ac = story.get('acceptance_criteria')
        if not isinstance(ac, list) or not ac:
            print(f"Validation Failed: Story at index {i} (ID: '{story.get('id')}') has invalid or empty acceptance criteria.")
            print("   Expected: list with at least one item")
            print(f"   Found: {type(ac)} with value {ac}")
            all_stories_valid = False

    # Only save the artifact if all stories in the list are valid.
    if all_stories_valid:
        print(f"\n✅ All {len(stories_data)} user stories passed validation.")
        artifact_path = "artifacts/day1_user_stories.json"
        
        # Use the helper function to save the file, creating the 'artifacts' directory if needed.
        # We use json.dumps with an indent to make the saved file human-readable.
        save_artifact(json.dumps(stories_data, indent=2), artifact_path)
        return True
    else:
        print("\n❌ Validation failed for one or more stories. Artifact not saved.")
        return False



In [8]:
# Diagnostic: Check the current state of user_stories_json
print("=== DIAGNOSTIC INFO ===")
if 'user_stories_json' in locals():
    print(f"user_stories_json exists: {type(user_stories_json)}")
    print(f"Length: {len(user_stories_json) if hasattr(user_stories_json, '__len__') else 'N/A'}")
    if user_stories_json:
        print("Sample content:", user_stories_json[0] if len(user_stories_json) > 0 else "Empty list")
    else:
        print("user_stories_json is empty or falsy")
        print("This means JSON parsing likely failed in the Challenge 2 cell.")
        print("Check the raw LLM output above for formatting issues.")
else:
    print("user_stories_json variable does not exist")
    print("This means the Challenge 2 cell never executed successfully")

# Also check if we have the raw output
if 'json_output_str' in locals():
    print(f"\nRaw LLM output length: {len(json_output_str)} characters")
    print("First 200 characters of raw output:")
    print(repr(json_output_str[:200]))
else:
    print("json_output_str not available")
print("========================")

=== DIAGNOSTIC INFO ===
user_stories_json exists: <class 'list'>
Length: 5
Sample content: {'id': 1, 'persona': 'The Anxious New Hire', 'user_story': 'As The Anxious New Hire, I want a personalized onboarding journey, so that I feel prepared and welcomed from day one.', 'acceptance_criteria': ['Given my role, department, and location, When I access the onboarding tool, Then I see a personalized checklist tailored to my profile.', 'Given a personalized checklist, When I complete an item, Then it is marked as completed and I receive positive feedback.', 'Given my onboarding journey, When I log in for the first time, Then I am greeted with a welcome message and first-day logistics.']}

Raw LLM output length: 3245 characters
First 200 characters of raw output:
'```json\n[\n  {\n    "id": 1,\n    "persona": "The Anxious New Hire",\n    "user_story": "As The Anxious New Hire, I want a personalized onboarding journey, so that I feel prepared and welcomed from day on'


In [9]:
# Run the validation function on the data we parsed from the LLM.
print("=== VALIDATION STEP ===")

if 'user_stories_json' not in locals():
    print("❌ ERROR: user_stories_json variable not found.")
    print("   Make sure to run the Challenge 2 cell that generates user stories.")
elif not user_stories_json:
    print("❌ ERROR: user_stories_json is empty or None.")
    print("   This usually means JSON parsing failed in the previous step.")
    print("   Solutions:")
    print("   1. Check that your API keys are correctly configured")
    print("   2. Re-run the previous cell to generate user stories")
    print("   3. Examine the raw LLM output for formatting issues")
    
    # Try to re-parse if we have the raw output
    if 'json_output_str' in locals() and json_output_str.strip():
        print("\n🔄 Attempting to re-parse the JSON...")
        try:
            cleaned_json_str = clean_llm_output(json_output_str, language='json')
            user_stories_json = json.loads(cleaned_json_str)
            print("✅ Re-parsing successful! Proceeding with validation...")
            validate_and_save_stories(user_stories_json)
        except (json.JSONDecodeError, TypeError) as e:
            print(f"❌ Re-parsing failed: {e}")
            print("Raw output that failed to parse:")
            print("-" * 50)
            print(json_output_str)
            print("-" * 50)
else:
    print(f"✅ Found user_stories_json with {len(user_stories_json)} stories")
    validate_and_save_stories(user_stories_json)

=== VALIDATION STEP ===
✅ Found user_stories_json with 5 stories

✅ All 5 user stories passed validation.


## Lab Conclusion

Congratulations! You have completed the first lab. You started with a vague, one-sentence problem and finished with a structured, validated, machine-readable requirements artifact. This is the critical first step in an AI-assisted software development lifecycle. The `day1_user_stories.json` file you created will be the direct input for our next lab, where we will generate a formal Product Requirements Document (PRD).

> **Key Takeaway:** The single most important skill demonstrated in this lab is turning unstructured ideas into structured, machine-readable data (JSON). This transformation is what enables automation and integration with other tools later in the SDLC.
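
As a small illustration of why machine-readable output matters, the sketch below groups story IDs by persona, the kind of step a downstream tool might perform. The data is an illustrative stand-in for the contents of `day1_user_stories.json` (the "Hiring Manager" persona is made up), and `group_stories_by_persona` is a hypothetical helper, not part of `utils`.

```python
import json
from collections import defaultdict


def group_stories_by_persona(stories):
    """Map each persona to the list of story IDs written for it."""
    grouped = defaultdict(list)
    for story in stories:
        grouped[story["persona"]].append(story["id"])
    return dict(grouped)


# Illustrative stand-in for the saved artifact text (not real lab output).
artifact_text = json.dumps([
    {"id": 1, "persona": "The Anxious New Hire"},
    {"id": 2, "persona": "Hiring Manager"},
    {"id": 3, "persona": "The Anxious New Hire"},
])

stories = json.loads(artifact_text)
by_persona = group_stories_by_persona(stories)
print(by_persona)
```

Because the stories are plain dictionaries, a few lines of code are enough to reorganize them; the same would be much harder with free-form text.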