# Death at the Aurora Theater: A Polars Murder Mystery

## Introduction

Welcome, Detective! A terrible crime has occurred at the Aurora Theater. Last night, March 15th, 2025, the lead actor Victoria Hayes was found dead in her dressing room shortly after the evening performance. As the lead investigator on this case, your job is to analyze the evidence and identify the killer.

You have access to several datasets that may help you solve this case:

1. **Police Records**: Information about recent crimes, including this murder
2. **Witness Reports**: Statements from people who were at the theater
3. **Stage Access Logs**: Records of who entered and exited different areas of the theater
4. **Staff Database**: Information about theater employees

In this notebook, you'll use the powerful **Polars** library to analyze these datasets and find connections that will lead you to the killer. Let's begin our investigation!

## Setting Up

First, let's import the Polars library and load our datasets.

In [None]:
import polars as pl

# Load the datasets
police_records = pl.read_csv("datasets/police_records.csv")
witness_reports = pl.read_csv("datasets/witness_reports.csv")
stage_access = pl.read_csv("datasets/stage_access.csv")
staff_database = pl.read_csv("datasets/staff_database.csv")

# Display the first few rows of each dataset to get familiar with the data
print("Police Records:")
print(police_records.head())
print("\nWitness Reports:")
print(witness_reports.head())
print("\nStage Access Logs:")
print(stage_access.head())
print("\nStaff Database:")
print(staff_database.head())

## Step 1: Identify the Murder Case

Let's start by finding the specific case related to the murder at the Aurora Theater. We'll use Polars' filtering capabilities to find this information.

In [None]:
# Filter the police records to find our murder case
murder_case = police_records.filter(
    (pl.col("location") == "Aurora Theater") & 
    (pl.col("type") == "murder") & 
    (pl.col("date") == "2025-03-15")
)

print("Murder Case Details:")
print(murder_case)

## Step 2: Examine Witness Reports

Now that we have the case ID, let's look at witness reports related to this case. Witnesses may have seen something important that could help us identify the killer.

In [None]:
# Get the case ID from our murder case
case_id = murder_case.select("case_id").item()
print(f"Case ID: {case_id}\n")

# Filter witness reports for our case
case_witnesses = witness_reports.filter(pl.col("case_id") == case_id)

print(f"Witness Reports for Case {case_id}:")
print(case_witnesses)

## Step 3: Analyze Stage Access Logs

The murder took place in the victim's dressing room. Let's analyze the stage access logs to see who entered and exited the dressing room around the time of the murder.

In [None]:
# Filter access logs for the day of the murder
murder_date = murder_case.select("date").item()
# literal=True matches the date as plain text instead of treating it as a regex
day_access_logs = stage_access.filter(pl.col("date").str.contains(murder_date, literal=True))

print(f"Access Logs for {murder_date}:")
print(day_access_logs)

Let's focus specifically on the dressing room access, since that's where the victim was found.

In [None]:
# Filter for dressing room access on the murder day
dressing_room_access = day_access_logs.filter(pl.col("location") == "dressing room")

print("Dressing Room Access Logs:")
print(dressing_room_access.sort("date"))

## Step 4: Identify the Victim

From the introduction, we know the victim was Victoria Hayes, the lead actor. Let's confirm this from our staff database.

In [None]:
# Find the victim in the staff database
victim = staff_database.filter(pl.col("employee_name") == "Victoria Hayes")
print("Victim Information:")
print(victim)

# Get the victim's employee ID
victim_id = victim.select("employee_id").item()
print(f"\nVictim ID: {victim_id}")

## Step 5: Track the Victim's Movements

Let's track the victim's movements on the day of the murder.

In [None]:
# Filter access logs for the victim on the murder day
victim_movements = day_access_logs.filter(pl.col("member_id") == victim_id)

print(f"Victim's Movements on {murder_date}:")
print(victim_movements.sort("date"))

## Step 6: Identify Suspicious Activity

Let's look for suspicious patterns in the access logs. We'll focus on people who entered the dressing room around the time the victim was last seen there.

In [None]:
# Find the last time the victim entered the dressing room
last_entry = victim_movements.filter(
    (pl.col("location") == "dressing room") & 
    (pl.col("direction") == "entering")
).sort("date", descending=True).head(1)

last_entry_time = last_entry.select("date").item()
print(f"Victim's last entry to dressing room: {last_entry_time}")

# Find anyone who entered the dressing room after the victim's last entry
suspicious_entries = dressing_room_access.filter(
    (pl.col("date") > last_entry_time) & 
    (pl.col("member_id") != victim_id) & 
    (pl.col("direction") == "entering")
)

print("\nPeople who entered the dressing room after the victim:")
print(suspicious_entries)

## Step 7: Cross-Reference with Staff Database

Let's identify who these suspicious individuals are by looking up their IDs in the staff database.

In [None]:
# Get the list of suspicious member IDs
suspicious_ids = suspicious_entries.select("member_id").to_series().to_list()

# Find these members in the staff database
suspicious_staff = staff_database.filter(pl.col("employee_id").is_in(suspicious_ids))

print("Suspicious Staff Members:")
print(suspicious_staff)

## Step 8: Analyze All Staff Movements

Let's look at the movements of all staff members on the day of the murder to see if there are any other suspicious patterns.

In [None]:
# Join the access logs with staff information
staff_movements = day_access_logs.join(
    staff_database,
    left_on="member_id",
    right_on="employee_id"
)

# Sort by time to see the sequence of events
staff_movements = staff_movements.sort("date")

print("Staff Movements on the Day of the Murder:")
print(staff_movements.select(["date", "employee_name", "employee_role", "location", "direction"]))

## Step 9: Cross-Reference with Witness Testimonies

Let's look back at the witness testimonies to see if any of them mention our suspicious staff members.

In [None]:
# Get all the roles from our suspicious staff
suspicious_roles = suspicious_staff.select("employee_role").to_series().to_list()

# Check each witness testimony for mentions of these roles.
# str.contains has no ignore_case option, so lowercase both sides and
# match the role as a literal substring rather than as a regex.
relevant_testimonies = []
for role in suspicious_roles:
    testimonies = case_witnesses.filter(
        pl.col("testimony").str.to_lowercase().str.contains(role.lower(), literal=True)
    )
    if testimonies.height > 0:
        print(f"Testimonies mentioning {role}:")
        print(testimonies)
        print()
        relevant_testimonies.extend(testimonies.to_dicts())

## Step 10: Identify the Killer

Based on our analysis, let's identify the most likely suspect. We'll look for someone who:

1. Had access to the dressing room around the time of the murder
2. Was mentioned in witness testimonies in a suspicious context
3. Had a motive or opportunity to commit the crime

In [None]:
# Let's look at all the evidence together
print("SUMMARY OF EVIDENCE:\n")

print("1. Suspicious dressing room entries after victim's last entry:")
suspicious_with_names = suspicious_entries.join(
    staff_database,
    left_on="member_id",
    right_on="employee_id"
)
print(suspicious_with_names.select(["date", "employee_name", "employee_role"]))
print()

print("2. Victim's last movements:")
print(victim_movements.sort("date"))
print()

print("3. Relevant witness testimonies:")
for testimony in relevant_testimonies:
    print(f"Witness {testimony['witness_name']}: {testimony['testimony']}")
print()

# Based on the evidence, who do you think is the killer?
print("Based on the evidence, the most likely suspect is...")

## Your Conclusion

Now it's time for you to make your case! Based on the evidence you've analyzed, who do you think killed Victoria Hayes?

Write your conclusion below, explaining:
1. Who you think the killer is
2. What evidence supports your conclusion
3. What you think the motive might have been

In [None]:
# Import hashlib so the answer can be checked against a stored hash
import hashlib

# Your conclusion here
suspect_name = "Thomas Wright"  # Replace with your suspect's name

# Function to check if your answer is correct using a hash
def check_answer(name):
    # Convert the name to lowercase and create a hash
    name_hash = hashlib.md5(name.lower().encode()).hexdigest()
    
    # The correct answer hash (for "thomas wright")
    correct_hash = "e6358890e7c9c1f43fe6d93f9c5a4920"
    
    if name_hash == correct_hash:
        print("Congratulations, Detective! You've solved the case!")
        print("\nCase Summary:")
        print("The lighting technician killed Victoria Hayes in her dressing room.")
        print("The evidence shows that he entered her dressing room shortly after she returned from the stage.")
        print("Witnesses reported seeing him arguing with her earlier in the day.")
        print("He had access to the lighting grid, dressing room, and backstage areas.")
        print("The motive appears to be a long-standing professional rivalry and personal grudge.")
    else:
        print("That's not quite right. Review the evidence again and look for patterns in the access logs and witness testimonies.")
        
# Check your answer
check_answer(suspect_name)

## Bonus: Advanced Polars Techniques

If you've solved the case and want to explore more Polars features, try these advanced techniques:

In [None]:
# 1. Use expressions to create new columns showing roughly how long each person was seen in each location
from datetime import datetime

# First, let's create a helper function to parse the datetime strings
def parse_datetime(dt_str):
    return datetime.strptime(dt_str, "%Y-%m-%d %H:%M:%S")

# Convert date strings to datetime objects by applying the helper to every value
# (the built-in pl.col("date").str.to_datetime("%Y-%m-%d %H:%M:%S") is a faster alternative)
day_access_with_dt = day_access_logs.with_columns(
    pl.col("date").map_elements(parse_datetime, return_dtype=pl.Datetime).alias("datetime")
)

# Group by member_id and location to find the first and last recorded event
# An exact duration would require matching each entry with its exit
# Here's a simplified approach: the span between the first and last event for each person/location
location_durations = day_access_with_dt.group_by(["member_id", "location"]).agg([
    pl.col("datetime").min().alias("first_seen"),
    pl.col("datetime").max().alias("last_seen"),
    (pl.col("datetime").max() - pl.col("datetime").min()).alias("time_span")
])

# Join with staff names
location_durations_with_names = location_durations.join(
    staff_database,
    left_on="member_id",
    right_on="employee_id"
)

print("Time Span in Each Location:")
print(location_durations_with_names.select(
    ["employee_name", "employee_role", "location", "first_seen", "last_seen", "time_span"]
))

In [None]:
# 2. Use a join and group-by to see who visited each location
# This narrows down who could have interacted, based on visiting the same location on the same day

# First, let's create a more detailed timeline of all movements
timeline = day_access_with_dt.join(
    staff_database,
    left_on="member_id",
    right_on="employee_id"
).sort("datetime")

# For each location, find all people who were there at the same time
# This would require a more complex analysis with overlapping time windows
# Here's a simplified approach looking at who was in the same location on the same day

# Group by location to see who was in each place
location_visitors = timeline.group_by("location").agg([
    pl.col("employee_name").unique().alias("visitors")
])

print("People who visited each location:")
print(location_visitors)

## Conclusion

Congratulations on completing this Polars Murder Mystery! You've used data analysis techniques to solve a crime and learned about several Polars features along the way, including:

- Filtering data with conditions
- Joining datasets
- Sorting and grouping data
- Using expressions and user-defined functions
- Working with strings and dates

These skills are valuable not just for solving fictional murders, but for real-world data analysis tasks. Keep practicing with Polars to become even more proficient at extracting insights from data!