# Final: LLM Prompt Checkpoint

```{tip}
This is a great time to use few-shot prompting.

Structured outputs are probably overkill, since we just want a string.
```

1. Design an LLM system prompt that converts a sentence requesting weather into the location format `wttr.in` needs.
2. Design several test cases.
3. Evaluate your prompt; iterate if necessary.
4. Upload this completed notebook to Gradescope.
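
To make the target format in step 1 concrete, here is a minimal sketch of how the string produced by the LLM might be dropped into a `wttr.in` URL. The helper name `build_wttr_url` is hypothetical and not part of the assignment template; the sketch only builds the URL and does not make a network request.

```python
from urllib.parse import quote


def build_wttr_url(location: str) -> str:
    """Build a wttr.in URL from an already formatted location string.

    Hypothetical helper for illustration. '+' is kept as-is because the
    location string uses it in place of spaces.
    """
    return f"https://wttr.in/{quote(location.strip(), safe='+')}"


print(build_wttr_url("New+York+City"))  # https://wttr.in/New+York+City
```

A location that still contains a space would be percent-encoded by `quote`, which is one reason the prompt should return `+` instead of spaces.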


In [1]:
%pip install -q ollama

Note: you may need to restart the kernel to use updated packages.


## Template

Complete this and submit to Gradescope.

### LLM Setup and Prompt

In [None]:
"""This script evaluates an LLM prompt for processing text so that it can be used for the wttr.in API"""

from ollama import Client

LLM_MODEL: str = "gemma3:1b"  # Change this to be the model you want
client: Client = Client(
    host="http://localhost:11434"  # Change this to be the URL of your LLM
)

few_shot_prompt: str = """
Given a weather query, return a formatted string that includes the location for the wttr.in API. This API takes the following formats:

1. Cities
2. 3-letter airport codes

If a popular attraction/geographical location is mentioned, output the city that the attraction is located in. 
Anytime the city is more than one word, replace the spaces between the words with '+' and capitalize all words. For 3-letter airport codes, all three letters are lowercase.

Examples:

Input: What is the weather in Vegas?
Output: Las+Vegas

Input: What's the weather near the Eiffel Tower?
Output: Paris

Input: Please give me the weather in HNL.
Output: Honolulu

Input: I'm at JFK right now. What's the weather?
Output: New+York+City

Input: What's the weather in New York City?
Output: New+York+City

Input: Forecast for Rio Rancho?
Output: Rio+Rancho


Please note how New York City and Las Vegas are each two or more words, so the output includes a '+' where the space between the words should be.
These two examples mentioned above are not the only instances where a '+' is necessary. This rule applies to ANY city that is more than one word.
Therefore, do not give me an output with any spaces. There are '+' instead of spaces in the output.

"""

def llm_parse_for_wttr(prompt: str) -> str:
    """Send a weather query to the LLM and return the wttr.in location string."""
    response = client.chat(
        model=LLM_MODEL,
        messages=[
            {"role": "system", "content": few_shot_prompt},
            {"role": "user", "content": prompt},
        ],
    )

    # Strip surrounding whitespace from the reply (used AI for this line)
    output = response["message"]["content"].strip()

    return output

### Test cases and function

In [3]:
# Test cases
test_cases: list[dict[str, str]] = [
    {"input": "What's the weather in Rio Rancho?", "expected": "Rio+Rancho"},
    {"input": "What's the weather in PHX?", "expected": "Phoenix"},
    {"input": "What's the weather in New York City?", "expected": "New+York+City"},
    {"input": "What's the weather at the Eiffel Tower? ", "expected": "Paris"},
    {"input": "Give me the weather in Colorado Springs", "expected": "Colorado+Springs"},
]

In [4]:
def run_tests(test_cases: list[dict[str, str]]) -> None:
    """Run each test case against the LLM and report the results."""
    num_passed = 0

    for i, test in enumerate(test_cases, 1):
        raw_input = test["input"]
        expected_output = test["expected"]

        print(f"\nTest {i}: {raw_input}")
        try:
            result = llm_parse_for_wttr(raw_input).strip()
            expected = expected_output.strip()

            print("LLM Output  :", result)
            print("Expected    :", expected)

            if result == expected:
                print("✅ PASS")
                num_passed += 1
            else:
                print("❌ FAIL")

        except Exception as e:
            print("💥 ERROR:", e)

    print(f"\nSummary: {num_passed} / {len(test_cases)} tests passed.")
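
Because an LLM's output can vary from run to run, a single pass or fail may not tell the whole story. The following is a rough sketch, not part of the template, of measuring a per-test pass rate over several trials. The names `pass_rate` and `fake_parse` are hypothetical, and `fake_parse` is a stand-in so the example runs without an LLM server.

```python
from collections.abc import Callable


def pass_rate(
    parse_fn: Callable[[str], str],
    test_cases: list[dict[str, str]],
    trials: int = 3,
) -> dict[str, float]:
    """Return the fraction of trials in which each input produced the expected output."""
    rates: dict[str, float] = {}
    for test in test_cases:
        expected = test["expected"].strip()
        # Count how many of the repeated calls match exactly.
        hits = sum(
            parse_fn(test["input"]).strip() == expected for _ in range(trials)
        )
        rates[test["input"]] = hits / trials
    return rates


def fake_parse(query: str) -> str:
    """Stand-in for llm_parse_for_wttr(); always answers 'Paris'."""
    return "Paris"


sample_cases = [
    {"input": "Weather at the Eiffel Tower?", "expected": "Paris"},
    {"input": "Weather in Vegas?", "expected": "Las+Vegas"},
]
print(pass_rate(fake_parse, sample_cases))
```

In the notebook, `llm_parse_for_wttr` could be passed in place of `fake_parse`, assuming the Ollama server is reachable.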

### Execute tests

In [None]:
run_tests(test_cases=test_cases)
# These results use the 1B model; a larger model would likely be more accurate.


Test 1: What's the weather in Rio Rancho?
LLM Output  : Rio+Rancho
Expected    : Rio+Rancho
✅ PASS

Test 2: What's the weather in PHX?
LLM Output  : Phoenix
Expected    : Phoenix
✅ PASS

Test 3: What's the weather in New York City?
LLM Output  : New+York+City
Expected    : New+York+City
✅ PASS

Test 4: What's the weather at the Eiffel Tower? 
LLM Output  : Paris
Expected    : Paris
✅ PASS

Test 5: Give me the weather in Colorado Springs
LLM Output  : Colorado Springs
Expected    : Colorado+Springs
❌ FAIL

Summary: 4 / 5 tests passed.
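
Test 5 fails only because the model returned a space where a `+` was expected. One possible way to iterate, besides revising the prompt or switching to a larger model, is a small deterministic clean-up step applied to the LLM output. The sketch below is an assumption about how that could look; `normalize_location` is a hypothetical helper and it works around the formatting miss rather than fixing the prompt itself.

```python
import re


def normalize_location(raw: str) -> str:
    """Keep the first line of the reply and replace whitespace runs with '+'."""
    lines = raw.strip().splitlines()
    first_line = lines[0].strip() if lines else ""
    return re.sub(r"\s+", "+", first_line)


print(normalize_location("Colorado Springs"))  # Colorado+Springs
```

Outputs that are already correct, such as `Paris` or `Rio+Rancho`, pass through unchanged.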
