### Step 1: Retrieve and Load Data
Before starting the analysis, you will need both crime reports and historical temperature data for Charlottesville. If you have access to the UVA Open Data Portal, you may download the most recent crime CSV. Otherwise, use the provided file Crime_Data.csv stored in the DATA folder of this repository.

In [None]:
# Install Meteostat library if running outside local environment
!pip install meteostat

# Import required packages
from datetime import datetime
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from meteostat import Point, Daily
from scipy import stats
import numpy as np

# Load the crime dataset
crime_df = pd.read_csv("DATA/Crime_Data.csv")

print("Crime dataset loaded. Number of rows:", len(crime_df))
crime_df.head()


### Step 1.1: Retrieve Daily Weather Data
We now gather daily temperature observations for Charlottesville over the same time period as the crime data.

In [None]:
# Convert timestamps for analysis
crime_df["DateReported"] = pd.to_datetime(crime_df["DateReported"], errors="coerce")

# Determine the full date range represented in the crime data
start_date = crime_df["DateReported"].min()
end_date = crime_df["DateReported"].max()

print("Crime data range:", start_date, "to", end_date)

# Coordinates for Charlottesville, VA
charlottesville = Point(38.03, -78.48)

# Request daily weather information via Meteostat
weather_df = Daily(charlottesville, start_date, end_date).fetch()

print("Weather dataset retrieved. Sample:")
weather_df.head()


### Step 2: Data Preparation and Visual Exploration
In this step, you will process both datasets so they can be merged and compared. You will also produce at least two visualizations to begin exploring trends in crime and temperature.
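To make the aggregate-then-merge idea concrete, here is a minimal sketch on toy data. The frames `toy_crime` and `toy_weather` and their values are invented for illustration and only mimic the column names used in this notebook (`DateReported`, `Date`, `tavg`).

```python
import pandas as pd

# Hypothetical toy data standing in for crime_df and weather_df
toy_crime = pd.DataFrame({
    "DateReported": pd.to_datetime([
        "2023-01-01 08:15", "2023-01-01 21:40",
        "2023-01-02 13:05", "2023-01-04 02:30",
    ])
})
toy_weather = pd.DataFrame({
    "Date": pd.to_datetime(["2023-01-01", "2023-01-02", "2023-01-03"]),
    "tavg": [3.1, 5.4, 4.0],
})

# Count reports per calendar day; normalize() sets the time of day to midnight
toy_daily = (
    toy_crime
    .groupby(toy_crime["DateReported"].dt.normalize())
    .size()
    .rename_axis("Date")              # name the group key "Date" for the merge
    .reset_index(name="CrimeCount")
)

# An inner join keeps only the dates present in both tables
toy_merged = pd.merge(toy_daily, toy_weather, on="Date", how="inner")
print(toy_merged)
```

Note that a day with no reports never appears in the daily counts, so the inner join cannot produce rows with a crime count of zero. Whether that matters depends on how often such days occur in the real data.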

In [None]:
# Aggregate crime counts by calendar day
crime_daily = (
    crime_df
    .groupby(crime_df["DateReported"].dt.date)
    .size()
    .rename_axis("Date")  # without this, the key column is named "DateReported"
    .reset_index(name="CrimeCount")
)

crime_daily["Date"] = pd.to_datetime(crime_daily["Date"])

# Prepare weather dataframe with matching date format
weather_df = weather_df.reset_index()
weather_df["Date"] = pd.to_datetime(weather_df["time"].dt.date)

# Merge crime and temperature into a single dataset
merged_df = pd.merge(crime_daily, weather_df, on="Date", how="inner")

print("Merged dataset shape:", merged_df.shape)
merged_df.head()


### TODO: Create Visualization 1
Suggested ideas:


*   count of crimes by offense type
*   trend of crime over time



In [None]:
# TODO: Add your first visualization here

### TODO: Create Visualization 2
Suggested ideas:


*   temperature over time
*   overlay crime and temperature
*   bar chart of most common offenses
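As one possible starting point for the overlay idea, the sketch below draws crime counts and temperature on two y-axes that share a date axis. The helper `plot_overlay` is a hypothetical name, and the data are synthetic. The sketch assumes a frame with `Date`, `CrimeCount`, and `tavg` columns, and it assumes `tavg` is in degrees Celsius.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

def plot_overlay(df):
    """Plot CrimeCount and tavg against Date on two separate y-axes."""
    fig, ax_crime = plt.subplots(figsize=(10, 4))
    ax_crime.plot(df["Date"], df["CrimeCount"], color="tab:blue")
    ax_crime.set_xlabel("Date")
    ax_crime.set_ylabel("Daily crime count", color="tab:blue")

    ax_temp = ax_crime.twinx()  # second y-axis sharing the same x-axis
    ax_temp.plot(df["Date"], df["tavg"], color="tab:red")
    ax_temp.set_ylabel("Average temperature (°C, assumed)", color="tab:red")

    fig.tight_layout()
    return fig, ax_crime, ax_temp

# Synthetic demonstration data (not the Charlottesville data)
rng = np.random.default_rng(1)
demo = pd.DataFrame({
    "Date": pd.date_range("2023-01-01", periods=90, freq="D"),
    "CrimeCount": rng.poisson(12, size=90),
    "tavg": 10 + 8 * np.sin(np.arange(90) / 15),
})
fig, ax_crime, ax_temp = plot_overlay(demo)
```

Two axes are used because counts and temperatures are on different scales. Smoothing the daily counts, for example with a rolling mean, may make any seasonal pattern easier to see.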



In [None]:
# TODO: Add your second visualization here

### Step 3: Correlation Analysis
Now you will calculate the Pearson correlation coefficient to determine whether higher temperatures are associated with higher daily crime counts in Charlottesville.
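For orientation, here is a small sketch of a Pearson test using `scipy.stats.pearsonr` on synthetic numbers. The arrays `temp` and `counts` are invented and are built with a positive relationship, so the result says nothing about the real data.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in data (not the Charlottesville data)
rng = np.random.default_rng(0)
temp = rng.normal(15, 8, size=200)                      # made-up temperatures
counts = 10 + 0.2 * temp + rng.normal(0, 3, size=200)   # made-up daily counts

# pearsonr returns the correlation coefficient and a two-sided p-value
r_demo, p_demo_two_sided = stats.pearsonr(temp, counts)

# One-sided p-value for the alternative r > 0:
# half the two-sided p when r is positive, otherwise 1 minus that half
if r_demo > 0:
    p_demo_one_sided = p_demo_two_sided / 2
else:
    p_demo_one_sided = 1 - p_demo_two_sided / 2

print(f"r = {r_demo:.3f}")
print(f"two-sided p = {p_demo_two_sided:.3g}")
print(f"one-sided p = {p_demo_one_sided:.3g}")
```

The one-sided conversion relies on the test being symmetric around zero. A small p-value indicates an association between the two series, not that temperature causes crime.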

In [None]:
# Filter to only the columns needed for correlation
corr_data = merged_df[["CrimeCount", "tavg"]].dropna()

# TODO: Compute overall Pearson correlation and p-value
r_val, p_two_sided = None, None  # Replace these with real calculations

# One-sided p-value for the hypothesis: crime increases as temp increases
# (p/2 when r > 0, otherwise 1 - p/2)
p_one_sided = None
if r_val is not None:
    p_one_sided = p_two_sided / 2 if r_val > 0 else 1 - p_two_sided / 2

print(f"[OVERALL] Pearson r = {r_val}")
print(f"[OVERALL] Two-sided p = {p_two_sided}")
print(f"[OVERALL] One-sided p (r > 0) = {p_one_sided}")


### Step 3.1: Offense-Specific Correlations
You will now repeat the correlation procedure for individual offense types. Before choosing offenses, it may help to visualize which ones occur most frequently.
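One way to see which offenses dominate is a frequency table. The sketch below uses invented records, and the column name `Offense` is an assumption, so check `crime_df.columns` for the actual name in your file.

```python
import pandas as pd

# Hypothetical records; the column name "Offense" is an assumption
toy = pd.DataFrame({
    "Offense": ["Larceny", "Vandalism", "Larceny",
                "Assault", "Larceny", "Vandalism"]
})

# value_counts() sorts categories from most to least frequent
offense_counts = toy["Offense"].value_counts()

# Keep the names of the two most frequent offenses
top_demo = offense_counts.head(2).index.tolist()

print(offense_counts)
print(top_demo)
```

Calling `offense_counts.plot(kind="bar")` would turn the same table into a bar chart.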


In [None]:
# TODO: Choose which offenses you want to analyze
top_offenses = None  # replace with your selection

### TODO: Compute Pearson correlation for each selected offense


In [None]:
# Example structure (fill in your selections and calculations)

# for offense in top_offenses:
#     offense_daily = ...
#     r, p = ...
#     print(f"{offense}: r = {r}, one-sided p = {p/2 if r > 0 else 1 - p/2}")


### Final Step
Once finished, transfer your results into results.md and write a short explanation of your findings.