# DX 704 Week 11 Project

In this project, you will develop and test prompts asking a language model to classify text from a home services query and match it to an appropriate category of home services.

The full project description and a template notebook are available on GitHub: [Project 11 Materials](https://github.com/bu-cds-dx704/dx704-project-11).


## Example Code

You may find it helpful to refer to these GitHub repositories of Jupyter notebooks for example code.

* https://github.com/bu-cds-omds/dx601-examples
* https://github.com/bu-cds-omds/dx602-examples
* https://github.com/bu-cds-omds/dx603-examples
* https://github.com/bu-cds-omds/dx704-examples

Any calculations demonstrated in code examples or videos may be found in these notebooks, and you are allowed to copy this example code in your homework answers.

## Part 1: Design a Short Prompt

The provided file "queries.txt" contains sample text from requests that homeowners made by email or phone.
These queries need to be classified as requesting electrical, plumbing, or roofing services.
The provided file has columns query_id, query, and target_category.
Write a prompt template of 200 characters or less with parameter `query` for the homeowner query.
Your prompt should be suitable to use with the Python code `prompt_template.format(query=query)`.
Test your prompt with the model `gemini-2.0-flash` and suitable parsing code.
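The length limit and the `query` parameter can both be checked locally before any API calls are made. The following is a minimal sketch under stated assumptions: `example_template` is a hypothetical template written for illustration (it is not the prompt evaluated in this notebook), and `check_template` is a hypothetical helper name.

```python
# Hypothetical template for illustration only; it is not the prompt evaluated in this notebook.
example_template = (
    "Classify this home services request as roofing, plumbing, or electrical. "
    "Answer with one lowercase word only.\n\nRequest: {query}"
)

MAX_TEMPLATE_CHARS = 200  # limit stated in Part 1


def check_template(template, max_chars=MAX_TEMPLATE_CHARS):
    """Return True if the template fits the length limit and substitutes `query`."""
    if len(template) > max_chars:
        return False
    try:
        # A made-up probe string confirms the placeholder is actually substituted.
        return "PROBE" in template.format(query="PROBE")
    except (KeyError, IndexError, ValueError):
        # str.format raises these when the template has stray or unknown placeholders.
        return False


sample_query = "Need toilet unclogged ASAP"  # one of the provided sample queries
template_ok = check_template(example_template)
filled_prompt = example_template.format(query=sample_query)
```

A template that passes this check can be used with `prompt_template.format(query=query)` as the assignment requires. Whether it classifies well still has to be measured against the model.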

In [1]:
import google.genai as genai
from google.colab import userdata

import pandas as pd
import numpy as np
import json
import os

In [2]:
client = genai.Client(api_key=userdata.get('GEMINI_API_KEY'))

In [3]:
model_name = 'gemini-2.0-flash'

In [4]:
def get_response(contents):
    response = client.models.generate_content(model=model_name,
                                              contents=contents)
    return response.text

In [5]:
path = 'https://raw.githubusercontent.com/bu-cds-dx704/dx704-project-11/main/queries.txt'
df = pd.read_csv(path, sep="\t")

In [6]:
df.head()

|   | query_id | query | target_category |
|---|---|---|---|
| 0 | 1 | Hi. Melissa came by and wrecked my roof. Can y... | roofing |
| 1 | 2 | Hi there. This is Jack. I’m looking for someon... | plumbing |
| 2 | 3 | Can you install an automated spotlight by my d... | electrical |
| 3 | 4 | Pest control just cleared out a raccoon that t... | roofing |
| 4 | 5 | Need toilet unclogged ASAP | plumbing |


In [7]:
def evaluate_prompt(df_to_evaluate, prompt_template):
    correct_count = 0
    results = []

    for row in df_to_evaluate.itertuples(index=False):
        query_id = row.query_id
        query = row.query
        target = row.target_category

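        # Use str.format when the template has a {query} placeholder;
        # otherwise append the query to the end of the prompt.
        if "{query}" in prompt_template:
            full_prompt = prompt_template.format(query=query)
        else:
            full_prompt = f"{prompt_template} {query}"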
        response = get_response(full_prompt)
        predicted_category = response.strip().lower()

        results.append({
            'query_id': query_id,
            'predicted_category': predicted_category
        })

        if predicted_category == target:
            correct_count += 1

    df_predict = pd.DataFrame(results)
    grade = (correct_count / len(df_to_evaluate)) * 100
    return grade, df_predict

In [8]:
# Define the short prompt template (200 characters or less)
prompt_template = "Classify the following query as either roofing, plumbing, or electrical"

grade, df_predictions = evaluate_prompt(df, prompt_template)
print(f"Grade: {grade:.2f}%")
display(df_predictions.head())

Grade: 82.00%


|   | query_id | predicted_category |
|---|---|---|
| 0 | 1 | roofing |
| 1 | 2 | plumbing |
| 2 | 3 | electrical |
| 3 | 4 | roofing |
| 4 | 5 | plumbing |


Save your prompt template in a file "short-prompt.txt".
Save the results of your prompt testing in "short-output.tsv" with columns `query_id` and `predicted_category`.

In [9]:
# Save the prompt and the prompt output
df_predictions.to_csv('short-output.tsv', sep='\t', index=False)

with open("short-prompt.txt", "w") as f:
    f.write(prompt_template)

Submit "short-prompt.txt" and "short-output.tsv" in Gradescope.

Hint: your prompt may be re-tested with the Gemini API, so do not rely solely on lucky language model responses.
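One way to reduce dependence on the exact wording of a response is to normalize it before comparing it with the target. The sketch below is an assumption about what "suitable parsing code" could look like, not the method used to compute the grades in this notebook, which rely on the exact-match comparison in `evaluate_prompt`. The names `CATEGORIES` and `parse_category` are hypothetical.

```python
import re

CATEGORIES = ("electrical", "plumbing", "roofing")


def parse_category(response_text):
    """Map a raw model response to one category, or 'unknown' if it is ambiguous."""
    text = response_text.strip().lower()
    # Collect every category that appears as a whole word in the response.
    found = [c for c in CATEGORIES if re.search(rf"\b{c}\b", text)]
    # Accept the answer only when exactly one category is mentioned.
    return found[0] if len(found) == 1 else "unknown"
```

With this approach, a response such as `Roofing.` or `**Plumbing**` maps to a clean label, while a response that names two categories is reported as `unknown` rather than silently counted for one of them.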

## Part 2: Find Short Prompt Mistakes

Construct 5 queries of 100 characters or less that trick your short prompt so that the wrong category is chosen.
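The 100-character limit is easy to verify before sending anything to the model. This is a minimal sketch; `too_long` is a hypothetical helper name, and the two candidate strings are copied from the trick queries used in this notebook.

```python
MAX_QUERY_CHARS = 100  # limit stated in Part 2


def too_long(queries, max_chars=MAX_QUERY_CHARS):
    """Return the queries that exceed the character limit."""
    return [q for q in queries if len(q) > max_chars]


candidate_queries = [
    "My toilet just electrocuted me.",
    "There's a toilet on my roof.",
]
over_limit = too_long(candidate_queries)  # an empty list means every query fits
```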


In [10]:
# Trick query 1
trick_query_1 = "My toilet just electrocuted me."
trick_cat_1 = "plumbing"

# Trick query 2
trick_query_2 = "I found shingles in my sink's garbage disposal."
trick_cat_2 = "plumbing"

# Trick query 3
trick_query_3 = "There's a toilet on my roof."
trick_cat_3 = "roofing"

# Trick query 4
trick_query_4 = "The satellite dish fell off the roof and damaged my toilet."
trick_cat_4 = "electrical"

# Trick query 5
trick_query_5 = "There's an exposed wire on my roof that could electrocute someone sitting on the toilet."
trick_cat_5 = "electrical"

# Create dataframe from the queries
df_trick = pd.DataFrame({
    'query_id': [0,1,2,3,4],
    'query': [trick_query_1, trick_query_2, trick_query_3, trick_query_4, trick_query_5],
    'target_category': [trick_cat_1, trick_cat_2, trick_cat_3, trick_cat_4, trick_cat_5]
})


In [11]:
df_trick

|   | query_id | query | target_category |
|---|---|---|---|
| 0 | 0 | My toilet just electrocuted me. | plumbing |
| 1 | 1 | I found shingles in my sink's garbage disposal. | plumbing |
| 2 | 2 | There's a toilet on my roof. | roofing |
| 3 | 3 | The satellite dish fell off the roof and damag... | electrical |
| 4 | 4 | There's an exposed wire on my roof that could ... | electrical |


In [12]:
# Define the prompt template used previously
prompt_template = "Classify the following query as either roofing, plumbing, or electrical"

grade, df_predictions_trick = evaluate_prompt(df_trick, prompt_template)
print(f"Grade: {grade:.2f}%")
display(df_predictions_trick.head())

Grade: 0.00%


|   | query_id | predicted_category |
|---|---|---|
| 0 | 0 | electrical |
| 1 | 1 | plumbing.\n\nthe presence of shingles (roofing... |
| 2 | 2 | this is a tricky one! on the surface, it sound... |
| 3 | 3 | this query is a combination of **roofing** and... |
| 4 | 4 | this query is a combination of **roofing** and... |


Save your 5 queries in a file "mistakes.tsv" with columns `query`, `target_category` and `predicted_category`.

In [13]:
# Add the query and target columns to the predictions
df_predictions_trick['query'] = df_trick['query']
df_predictions_trick['target_category'] = df_trick['target_category']

# Keep only the required columns, in the required order (this drops query_id)
df_predictions_trick = df_predictions_trick[['query', 'target_category', 'predicted_category']]
# index=False keeps the pandas index out of the file so it has exactly three columns
df_predictions_trick.to_csv('mistakes.tsv', sep="\t", index=False)

Submit "mistakes.tsv" in Gradescope.

## Part 3: Design a Long Prompt

Repeat Part 1 with a length limit of 5000 characters.
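A larger character budget leaves room for explicit output instructions and a few labeled examples. The sketch below shows one possible way to assemble such a template. It is an assumption, not the long prompt evaluated in this notebook, and it has not been tested against the model. The examples in `EXAMPLES` are hypothetical and were written for illustration rather than taken from "queries.txt"; `build_few_shot_template` and `escape_braces` are hypothetical helper names.

```python
# Hypothetical few-shot examples written for illustration; they are not from queries.txt.
EXAMPLES = [
    ("Water is dripping from the pipe under my kitchen sink.", "plumbing"),
    ("Half the outlets in my garage stopped working.", "electrical"),
    ("Several shingles blew off during last night's storm.", "roofing"),
]

MAX_LONG_CHARS = 5000  # limit stated in Part 3


def escape_braces(text):
    """Double any braces so str.format treats them as literal characters."""
    return text.replace("{", "{{").replace("}", "}}")


def build_few_shot_template(examples):
    """Assemble instructions, labeled examples, and a trailing {query} slot."""
    lines = [
        "Classify the homeowner request as roofing, plumbing, or electrical.",
        "Reply with exactly one of those words in lowercase and nothing else.",
        "",
    ]
    for text, label in examples:
        lines.append(f"Request: {escape_braces(text)}")
        lines.append(f"Category: {label}")
        lines.append("")
    # The only unescaped placeholder is the slot for the query being classified.
    lines.append("Request: {query}")
    lines.append("Category:")
    return "\n".join(lines)


few_shot_template = build_few_shot_template(EXAMPLES)
few_shot_prompt = few_shot_template.format(query="Need toilet unclogged ASAP")
```

Escaping braces in the example text matters because any literal `{` or `}` in the template would otherwise be interpreted by `str.format` as a placeholder.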

In [14]:
# Create a longer prompt and repeat part 1
long_prompt_template = "Classify the following query, using only a single word, as either roofing, plumbing, or electrical"

grade_long, df_predictions_long = evaluate_prompt(df, long_prompt_template)
print(f"Grade: {grade_long:.2f}%")
display(df_predictions_long.head())

Grade: 100.00%


|   | query_id | predicted_category |
|---|---|---|
| 0 | 1 | roofing |
| 1 | 2 | plumbing |
| 2 | 3 | electrical |
| 3 | 4 | roofing |
| 4 | 5 | plumbing |


Save your longer prompt template in a file "long-prompt.txt".
Save the results of your prompt testing in "long-output.tsv".
Both files should use the same columns as Part 1.

In [15]:
# Save the results
df_predictions_long.to_csv('long-output.tsv', sep='\t', index=False)

with open("long-prompt.txt", "w") as f:
    f.write(long_prompt_template)

Submit "long-prompt.txt" and "long-output.tsv" in Gradescope.

## Part 4: Code

Please submit a Jupyter notebook that can reproduce all your calculations and recreate the previously submitted files.
You do not need to provide code for data collection if you did that manually.

## Part 5: Acknowledgements

If you discussed this assignment with anyone, please acknowledge them here.
If you did this assignment completely on your own, simply write none below.

If you used any libraries not mentioned in this module's content, please list them with a brief explanation of what you used them for. If you did not use any other libraries, simply write none below.

If you used any generative AI tools, please add links to your transcripts below, and any other information that you feel is necessary to comply with the generative AI policy. If you did not use any generative AI tools, simply write none below.

In [16]:
with open("acknowledgements.txt", "w") as f:
    f.write("None")