# Geographic Distribution of Hate Crimes in California (2012–2022)

### Introduction
This page explores the geographic distribution of hate crimes across California's counties from 2012 to 2022. An interactive map allows users to visualize the hate crime counts for each county by selecting a specific year.

Hate crime data includes incidents motivated by race, ethnicity, religion, sexual orientation, gender, and other factors. The visualizations aim to highlight regional disparities and trends over time.

### Loading Geographic Data

This section loads California county boundaries from the `CA_Counties.shp` shapefile using `geopandas`. The shapefile contains geographic features (county shapes) and attribute data (e.g., county names), which are stored in a GeoDataFrame (`gdf`) for mapping and spatial analysis.


In [4]:
import geopandas as gpd

# Specify the path to the .shp file
shapefile_path = "C:/Users/Ilike/CSCI385/Final_Project/ca_counties/CA_Counties.shp"

# Load the shapefile into a GeoDataFrame
gdf = gpd.read_file(shapefile_path)

### Visualizing Hate Crimes Across California Counties

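The following sections build an interactive map of hate crimes in California by county for each year from 2012 to 2022. The incident data is first aggregated by county and year, then joined to the county boundaries and plotted.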

### Data Processing
This section processes the hate crime data to calculate yearly hate crime counts for three categories:
1. Racist Hate Crimes
2. LGBTQ Hate Crimes
3. Religion-Based Hate Crimes

The processing includes:
- Filtering data by bias type (e.g., Race/Ethnicity/Ancestry, Sexual Orientation).
- Merging with NCIC jurisdiction data to associate county names.
- Aggregating the hate crime counts for each county and year.
- Combining the results into a single dataset with columns:
  - `RacistHateCrimeCount`
  - `LGBTQHateCrimeCount`
  - `ReligiousHateCrimeCount`

This processed dataset will be used for visualizing hate crime trends.
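
The steps listed above can be sketched on a handful of invented rows before running them on the real files. The two small DataFrames here are stand-ins for the CSV files; their values, and the assumption that the incident file stores the county as a numeric code, are illustrative only. The column names follow the ones used in this notebook.

```python
import pandas as pd

# Toy stand-ins for the two CSV files; all values are invented for illustration.
incidents = pd.DataFrame({
    "RecordId": [1, 2, 3, 4, 5],
    "ClosedYear": [2021, 2022, 2022, 2022, 2022],
    "County": [1, 1, 1, 2, 2],  # assumed to be a county code, not a name
    "MostSeriousBiasType": [
        "Religion", "Religion", "Religion", "Sexual Orientation", "Religion",
    ],
})
jurisdictions = pd.DataFrame({
    "CntyCode": [1, 2, 3],
    "County": ["Alameda County", "Alpine County", "Amador County"],
})

# 1. Filter by bias type.
religion = incidents[incidents["MostSeriousBiasType"] == "Religion"]

# 2. Attach county names. Both frames have a 'County' column, so pandas
#    adds the suffixes _x (incident code) and _y (county name).
named = religion.merge(jurisdictions, left_on="County", right_on="CntyCode", how="left")

# 3. Count incidents per county and year.
counts = (
    named.groupby(["County_y", "ClosedYear"])["RecordId"]
    .count()
    .reset_index()
    .rename(columns={"County_y": "County", "RecordId": "ReligiousHateCrimeCount"})
)

# 4. Keep one year and left-join onto the full county list so that
#    counties with no incidents appear with a count of 0.
year_counts = counts[counts["ClosedYear"] == 2022]
result = jurisdictions[["County"]].merge(year_counts, on="County", how="left")
result["ReligiousHateCrimeCount"] = result["ReligiousHateCrimeCount"].fillna(0).astype(int)
result = result[["County", "ReligiousHateCrimeCount"]]
print(result)
```

The left join in step 4 is what keeps counties without any recorded incident in the table, which matters later when every county polygon needs a value to be shaded.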

In [13]:
import pandas as pd

def preprocess_all_data(year):
    # Define bias types and their respective column names
    bias_types = {
        'Race/Ethnicity/Ancestry': 'RacistHateCrimeCount',
        'Sexual Orientation': 'LGBTQHateCrimeCount',
        'Religion': 'ReligiousHateCrimeCount'
    }

    # Load datasets
    hate_crime_data = pd.read_csv('Hate Crime 2012 - 2022.csv')
    ncic_data = pd.read_csv('NCIC Code Jurisdiction List.csv')

    # Initialize an empty DataFrame for the final result
    final_data = pd.DataFrame()

    for bias_type, column_name in bias_types.items():
        # Filter data for the current bias type
        filtered_data = hate_crime_data[hate_crime_data['MostSeriousBiasType'] == bias_type]

        # Merge with NCIC data
        merged_data = pd.merge(filtered_data, ncic_data, left_on='County', right_on='CntyCode', how='left')

        # Group by County and Year and count occurrences
        county_yearly_hate_crimes = merged_data.groupby(['County_y', 'ClosedYear']).agg({'RecordId': 'count'}).reset_index()

        # Rename columns for clarity
        county_yearly_hate_crimes.rename(columns={'RecordId': column_name, 'County_y': 'County'}, inplace=True)

        # Filter data for the specified year
        hate_crimes_year = county_yearly_hate_crimes[county_yearly_hate_crimes['ClosedYear'] == year]

        # Merge with NCIC data to ensure all counties are included
        all_counties_year = pd.merge(ncic_data, hate_crimes_year, how='left', left_on='County', right_on='County')

        # Replace NaN values in the count column with 0
        all_counties_year[column_name] = all_counties_year[column_name].fillna(0).astype(int)

        # Keep only relevant columns
        all_counties_year = all_counties_year[['County', column_name]]

        # Merge the current bias type data with the final dataset
        if final_data.empty:
            final_data = all_counties_year
        else:
            final_data = pd.merge(final_data, all_counties_year, on='County', how='outer')

    # Add the year column
    final_data['Year'] = year

    # Reset index for clarity
    final_data = final_data.reset_index(drop=True)
    return final_data
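
`preprocess_all_data` reads both CSV files on every call and loops over the bias types one at a time. As a possible alternative, the sketch below counts all three bias types in a single pass with `pd.crosstab`. The function `count_by_bias` is hypothetical and not part of the original notebook; it assumes the two DataFrames have already been loaded and use the same column names as above. The toy rows are invented.

```python
import pandas as pd

BIAS_COLUMNS = {
    "Race/Ethnicity/Ancestry": "RacistHateCrimeCount",
    "Sexual Orientation": "LGBTQHateCrimeCount",
    "Religion": "ReligiousHateCrimeCount",
}

def count_by_bias(incidents, jurisdictions, year):
    """Hypothetical one-pass variant: count all bias types for one year."""
    # One row per county code, in case the lookup lists several agencies per county.
    lookup = jurisdictions[["CntyCode", "County"]].drop_duplicates()
    subset = incidents[
        (incidents["ClosedYear"] == year)
        & incidents["MostSeriousBiasType"].isin(list(BIAS_COLUMNS))
    ]
    named = subset.merge(lookup, left_on="County", right_on="CntyCode", how="left")
    # Rows are county names, columns are bias types, values are incident counts.
    table = pd.crosstab(named["County_y"], named["MostSeriousBiasType"])
    table = table.rename(columns=BIAS_COLUMNS)
    # Make sure every county and every count column is present, filling gaps with 0.
    table = table.reindex(
        index=lookup["County"].unique(),
        columns=list(BIAS_COLUMNS.values()),
        fill_value=0,
    )
    table.index.name = "County"
    table.columns.name = None
    result = table.reset_index()
    result["Year"] = year
    return result

# Invented rows for illustration only.
toy_incidents = pd.DataFrame({
    "RecordId": [1, 2, 3, 4, 5, 6],
    "ClosedYear": [2021, 2022, 2022, 2022, 2022, 2022],
    "County": [1, 1, 1, 2, 2, 3],
    "MostSeriousBiasType": [
        "Religion", "Religion", "Religion",
        "Sexual Orientation", "Religion", "Race/Ethnicity/Ancestry",
    ],
})
toy_jurisdictions = pd.DataFrame({
    "CntyCode": [1, 2, 3],
    "County": ["Alameda County", "Alpine County", "Amador County"],
})
toy_result = count_by_bias(toy_incidents, toy_jurisdictions, 2022)
print(toy_result)
```

Loading the CSV files once and passing them in would also avoid re-reading them each time the year slider moves in the interactive map.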

### Processed Data for 2022
This step processes hate crime data for 2022 using `preprocess_all_data`, calculating counts for:
- **Racist Hate Crimes**
- **LGBTQ Hate Crimes**
- **Religion-Based Hate Crimes**

The output shows the first 5 rows, with columns for each bias type and the year.

In [22]:
# Example: Process data for the year 2022
processed_data = preprocess_all_data(2022)

# Display processed data
print(processed_data.head(5))

             County  RacistHateCrimeCount  LGBTQHateCrimeCount  \
0    Alameda County                    68                   28   
1     Alpine County                     0                    0   
2     Amador County                     0                    1   
3      Butte County                     4                    4   
4  Calaveras County                     0                    0   

   ReligiousHateCrimeCount  Year  
0                       11  2022  
1                        0  2022  
2                        0  2022  
3                        2  2022  
4                        0  2022  


### Visualization
This section visualizes the processed hate crime data on a map of California. 
The user can select:
- **Bias Type**: Racist, LGBTQ, or Religion-based hate crimes.
- **Year**: A specific year from 2012 to 2022.

The interactive map dynamically updates to show the selected data, with counties shaded based on hate crime counts.

In [24]:
import geopandas as gpd
import matplotlib.pyplot as plt
from ipywidgets import interact, widgets

# Load shapefile
shapefile_path = r'C:\Users\Ilike\CSCI385\Final_Project\ca_counties\CA_Counties.shp'
ca_counties = gpd.read_file(shapefile_path)

# Visualization function
def visualize_hate_crime_by_bias(bias, year):
    # Process data for the selected year
    all_data = preprocess_all_data(year)

    # Select the column to plot based on bias type
    column_map = {
        "Racist Hate Crimes": "RacistHateCrimeCount",
        "LGBTQ Hate Crimes": "LGBTQHateCrimeCount",
        "Religion-Based Hate Crimes": "ReligiousHateCrimeCount"
    }
    column_to_plot = column_map[bias]

    # Merge with shapefile data
    merged_geo = ca_counties.merge(all_data, left_on='NAMELSAD', right_on='County', how='left')
    merged_geo[column_to_plot] = merged_geo[column_to_plot].fillna(0).astype(int)

    # Plot the data
    fig, ax = plt.subplots(1, 1, figsize=(12, 8))
    merged_geo.plot(column=column_to_plot,
                    cmap='OrRd',
                    linewidth=0.8,
                    ax=ax,
                    edgecolor='0.8',
                    legend=True)
    ax.set_title(f'{bias} by County in California ({year})', fontsize=16)
    ax.axis('off')
    plt.show()

# Interactive widgets
bias_dropdown = widgets.Dropdown(
    options=["Racist Hate Crimes", "LGBTQ Hate Crimes", "Religion-Based Hate Crimes"],
    value="Racist Hate Crimes",
    description="Bias Type:"
)

# Interact with visualization (uncomment to render the widgets in a live notebook)
#interact(visualize_hate_crime_by_bias, bias=bias_dropdown, year=(2012, 2022))
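
Because the plotting function fills missing counts with 0 after the merge, a county whose name is spelled differently in the shapefile and in the processed data would be drawn as having no hate crimes rather than raising an error. One way to check for this, sketched here on invented rows, is an outer merge with `indicator=True`. The plain DataFrame `shapes` stands in for the attribute table of the shapefile, and the mismatched name "Amador" is made up to show what a failure looks like.

```python
import pandas as pd

# Stand-in for the shapefile's attribute table (only the name column is needed).
shapes = pd.DataFrame({
    "NAMELSAD": ["Alameda County", "Alpine County", "Amador County"],
})
# Stand-in for processed counts; "Amador" is deliberately misspelled.
counts = pd.DataFrame({
    "County": ["Alameda County", "Alpine County", "Amador"],
    "RacistHateCrimeCount": [68, 0, 0],
})

# An outer merge keeps rows from both sides; the _merge column records
# whether each row was found on the left, the right, or both.
check = shapes.merge(
    counts, left_on="NAMELSAD", right_on="County", how="outer", indicator=True
)
unmatched = check[check["_merge"] != "both"]
print(unmatched[["NAMELSAD", "County", "_merge"]])
```

An empty `unmatched` table would indicate that every county in the shapefile found a matching row in the processed data, and vice versa.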


![Interactive county map](HateCrime_Interactive_long.gif)