# Part 5: The Analyst Report

After you have successfully deployed your pipeline and run the **Burst** profile (500 messages) in the test apparatus, you need to extract the results and answer a few questions.

We use `boto3` to scan the DynamoDB table, following pagination until every item has been read, and then convert DynamoDB's `Decimal` numbers into standard Python floats.
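The pagination loop can be sketched on its own, without touching AWS. The snippet below is a minimal illustration, not the notebook's actual cell: `scan_all` is a hypothetical helper, `FakeTable` is a made-up in-memory stand-in for a boto3 `Table`, and `opportunity_id` is an assumed field name used only for the demo.

```python
def scan_all(table, **scan_kwargs):
    """Collect every item by following LastEvaluatedKey across pages."""
    items = []
    kwargs = dict(scan_kwargs)
    while True:
        response = table.scan(**kwargs)
        items.extend(response.get("Items", []))
        last_key = response.get("LastEvaluatedKey")
        if last_key is None:
            break  # no more pages
        kwargs["ExclusiveStartKey"] = last_key
    return items


class FakeTable:
    """Hypothetical stand-in for a boto3 Table; returns rows in fixed-size pages."""

    def __init__(self, rows, page_size):
        self.rows = rows
        self.page_size = page_size

    def scan(self, ExclusiveStartKey=None):
        start = 0 if ExclusiveStartKey is None else ExclusiveStartKey["offset"]
        end = start + self.page_size
        page = {"Items": self.rows[start:end]}
        if end < len(self.rows):
            # Signal that another page exists, as DynamoDB does
            page["LastEvaluatedKey"] = {"offset": end}
        return page


# Seven made-up rows split across three pages
demo_rows = [{"opportunity_id": f"opp-{i}"} for i in range(7)]
demo_items = scan_all(FakeTable(demo_rows, page_size=3))
print(len(demo_items))
```

Because `scan_all` only relies on a `scan` method that accepts `ExclusiveStartKey`, the same function should also work when passed a real `dynamodb.Table(...)` object.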

## Setup: Configure Your Student ID
Replace `YOURID` below with the exact student ID you used for deployment.

In [None]:
STUDENT_ID = "YOURID"  # <--- Change this
TABLE_NAME = f"adflow-{STUDENT_ID}-results"
REGION = "us-east-1"
print(f"Target Table: {TABLE_NAME}")

## Step 1: Export Data from DynamoDB
This cell connects to your DynamoDB table, downloads all records, and converts the Decimal values back to standard floats.
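boto3's resource API returns DynamoDB numbers as `decimal.Decimal`, which is why the conversion is needed before doing math or plotting. As a rough sketch of the idea, the hypothetical helper `to_plain` below converts every `Decimal` it finds, including ones nested in maps or lists. The sample record and its values are made up for illustration.

```python
from decimal import Decimal


def to_plain(value):
    """Recursively replace Decimal values with floats inside dicts and lists."""
    if isinstance(value, Decimal):
        return float(value)
    if isinstance(value, dict):
        return {k: to_plain(v) for k, v in value.items()}
    if isinstance(value, list):
        return [to_plain(v) for v in value]
    return value  # strings, booleans, None, etc. pass through unchanged


# Hypothetical record shaped like a scan result; the values are invented
raw = {
    "winning_advertiser_id": "adv-demo",
    "winning_bid_amount": Decimal("2.50"),
    "winning_score": Decimal("3.75"),
}
plain = to_plain(raw)
print(plain)
```

The cell in this notebook takes the simpler route of converting three known top-level fields in place. A recursive version like this one is only worth the extra code if your records contain nested numeric attributes.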

In [None]:
import boto3
from decimal import Decimal
from collections import Counter

# Note: This uses your active AWS credentials (from `aws configure` or exported environment variables)
dynamodb = boto3.resource("dynamodb", region_name=REGION)
table = dynamodb.Table(TABLE_NAME)

results = []
response = table.scan()
results.extend(response.get("Items", []))

# A single scan call returns at most 1 MB of data; keep paging until no LastEvaluatedKey is returned
while "LastEvaluatedKey" in response:
    response = table.scan(ExclusiveStartKey=response["LastEvaluatedKey"])
    results.extend(response.get("Items", []))

print(f"\nLoaded {len(results)} records from DynamoDB.")

# Convert Decimal types to Python floats for easier math/plotting
for item in results:
    for key in ["winning_bid_amount", "winning_score", "score_margin"]:
        if key in item and isinstance(item[key], Decimal):
            item[key] = float(item[key])

if results:
    print("\nSample record:")
    print(results[0])

## Section 1: Pipeline Evidence
Print the total records and a quick count of auction wins per advertiser across the entire dataset to prove your pipeline successfully routed messages.
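As a small illustration of the counting pattern, the sketch below runs `collections.Counter` over a handful of made-up records. The advertiser IDs (`adv-a`, `adv-b`) are placeholders, not values from your table, and the `"unknown"` bucket is an assumption about how you might want to treat records that lack the field.

```python
from collections import Counter

# Hypothetical records; the real ones come from the DynamoDB scan
sample = [
    {"winning_advertiser_id": "adv-a"},
    {"winning_advertiser_id": "adv-b"},
    {"winning_advertiser_id": "adv-a"},
    {},  # a record missing the field is counted under "unknown"
]

wins = Counter(r.get("winning_advertiser_id", "unknown") for r in sample)

# most_common() yields (advertiser, count) pairs, highest count first
for advertiser, count in wins.most_common():
    print(f"{advertiser}: {count}")
```

Using `.get()` with a default keeps one malformed record from raising a `KeyError` partway through the count.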

In [None]:
# TODO: Print the total number of records
print(f"Total pipeline records: {len(results)}")

# TODO: Compute and print the auction wins per advertiser (overall)
# Hint: Use collections.Counter on the 'winning_advertiser_id' field


**Evidence Requirement:** Don't forget to push a screenshot of the **Test Apparatus** (showing a completed Burst run) to a `screenshots/` directory in this repo when submitting.

---
## Q1: Results Analysis

**Question:** Which advertiser won the most auctions overall? Which advertiser won the most in the `sports` content category specifically? Why do the overall and sports-specific rankings differ? Explain in 2–3 sentences, referencing the relevance multiplier table.
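One possible way to compare overall and per-category rankings side by side is a `defaultdict` of `Counter` objects keyed by category. The sketch below uses invented records: the advertiser IDs and the `news` category are placeholders, and the counts say nothing about what your own data will show.

```python
from collections import Counter, defaultdict

# Hypothetical records, invented purely to demonstrate the grouping
sample = [
    {"content_category": "sports", "winning_advertiser_id": "adv-a"},
    {"content_category": "sports", "winning_advertiser_id": "adv-b"},
    {"content_category": "sports", "winning_advertiser_id": "adv-b"},
    {"content_category": "news", "winning_advertiser_id": "adv-a"},
    {"content_category": "news", "winning_advertiser_id": "adv-a"},
]

# One Counter of advertiser wins per content category
wins_by_category = defaultdict(Counter)
for r in sample:
    wins_by_category[r.get("content_category")][r.get("winning_advertiser_id")] += 1

# Adding the per-category Counters together gives the overall tally
overall = sum(wins_by_category.values(), Counter())

top_overall = overall.most_common(1)[0]
top_sports = wins_by_category["sports"].most_common(1)[0]
print(f"Top overall: {top_overall}")
print(f"Top in sports: {top_sports}")
```

In this toy data the overall leader and the sports leader are different advertisers, which is the kind of divergence the question asks you to explain for your real results.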

In [None]:
# TODO: Find the top winner in the 'sports' category
sports_results = [r for r in results if r.get("content_category") == "sports"]
print(f"Sports records: {len(sports_results)}")


**Your Answer (Q1):**

*Double-click this text to write your answer here...*

---
## Q2: Code Reflection

Answer **one** of the following (your choice):
 
* **Option A (Scale & Limits):** The test apparatus sent messages in small batches. If traffic suddenly spiked from 10 opportunities a second to 10,000 a second, what specific components of our current pipeline (SQS limits, Lambda concurrency, DynamoDB throughput) would become bottlenecks first, and what AWS settings would you adjust to handle the load?
* **Option B (The Distributed Process):** Writing code for an event-driven, queue-based pipeline is very different from writing a single local script. What was the most challenging part of getting SQS, Lambda, and DynamoDB to communicate correctly, or the most confusing bug you encountered, and what did it teach you about distributed architecture?

A well-argued two-paragraph response is sufficient for either option.

**Your Answer (Q2):**

*Double-click this text to write your answer here...*