# Serial Killers Analysis

This project analyzes the "Serial Killers Dataset" from Kaggle to visualize and understand patterns related to serial killers across different regions.

**Disclaimer**: This analysis is purely for academic and educational purposes. The subject matter is sensitive, and the data is presented objectively without any intent to glorify or sensationalize crime.

In [None]:
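# Import libraries
from pathlib import Path

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Set plot style
sns.set_style("whitegrid")
plt.rcParams["figure.figsize"] = (12, 8)

# Create the output folder used by the plt.savefig calls in later cells;
# savefig raises FileNotFoundError if the folder does not exist.
Path("serial_killer_figures").mkdir(exist_ok=True)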

## Data Loading

Load the serial killers dataset from a local copy of the Kaggle CSV file.

In [None]:
# Load the dataset
# df = pd.read_csv("serial_killers_data.csv")
# df.head()
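
The later cells rely on a handful of column names, so it can help to check for them at load time. The following is a minimal sketch, not the notebook's actual loader: `load_killers` and `EXPECTED_COLUMNS` are hypothetical names, the column list is inferred from the commented-out cells in this notebook, and the in-memory CSV holds placeholder rows rather than real data.

```python
import io

import pandas as pd

# Column names inferred from the commented-out cells in this notebook.
# Whether the real Kaggle file uses exactly these names is an assumption.
EXPECTED_COLUMNS = {
    "name", "country", "proven_victims",
    "year_first_kill", "method", "age_first_kill",
}
NUMERIC_COLUMNS = ("proven_victims", "year_first_kill", "age_first_kill")


def load_killers(source):
    """Read the CSV and fail early if an expected column is missing."""
    df = pd.read_csv(source)
    missing = EXPECTED_COLUMNS - set(df.columns)
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    # Coerce numeric fields; unparseable entries become NaN instead of raising.
    for col in NUMERIC_COLUMNS:
        df[col] = pd.to_numeric(df[col], errors="coerce")
    return df


# Toy stand-in for serial_killers_data.csv (placeholder rows, not real data).
toy_csv = io.StringIO(
    "name,country,proven_victims,year_first_kill,method,age_first_kill\n"
    "Person A,Country X,12,1974,Method 1,28\n"
    "Person B,Country Y,unknown,1981,Method 2,33\n"
)
sample = load_killers(toy_csv)
```

With the real file, the call would be `load_killers("serial_killers_data.csv")`, assuming the file sits next to the notebook.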

## Top 10 Most Deadly Serial Killers

Visualization of serial killers with the highest number of proven victims.

In [None]:
# Code for top 10 most deadly serial killers visualization
# plt.figure(figsize=(12, 6))
# top_killers = df.sort_values("proven_victims", ascending=False).head(10)
# sns.barplot(x="proven_victims", y="name", data=top_killers)
# plt.title("Top 10 Most Deadly Serial Killers by Victim Count", fontsize=16)
# plt.tight_layout()
# plt.savefig("serial_killer_figures/top10_deadliest_killers.png")

## Top 5 Serial Killers in the US

Visualization focusing on the deadliest serial killers in the United States.

In [None]:
# Code for top 5 US serial killers
# us_killers = df[df["country"] == "United States"]
# top_us = us_killers.sort_values("proven_victims", ascending=False).head(5)
# plt.figure(figsize=(12, 6))
# sns.barplot(x="proven_victims", y="name", data=top_us)
# plt.title("Top 5 Most Deadly Serial Killers in the US", fontsize=16)
# plt.tight_layout()
# plt.savefig("serial_killer_figures/top5_us_killers.png")
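
The top 10 and top 5 US rankings above use the same sort-and-truncate step, so one helper could serve both. This is a sketch under assumptions: `top_by_victims` is a hypothetical helper, the column names follow the commented-out cells, and the rows are placeholders. It uses `DataFrame.nlargest` as an alternative to `sort_values(...).head(...)` and drops rows with a missing victim count first.

```python
import pandas as pd


def top_by_victims(df, n=10, country=None, column="proven_victims"):
    """Return the n rows with the largest values in `column`.

    If `country` is given, only rows from that country are ranked.
    """
    if country is not None:
        df = df[df["country"] == country]
    # Drop rows without a victim count so they cannot appear in the ranking.
    return df.dropna(subset=[column]).nlargest(n, column)


# Placeholder rows, not real data.
toy = pd.DataFrame({
    "name": ["Person A", "Person B", "Person C", "Person D"],
    "country": ["Country X", "Country Y", "Country X", "Country X"],
    "proven_victims": [5, 20, None, 13],
})

top_overall = top_by_victims(toy, n=2)
top_in_x = top_by_victims(toy, n=2, country="Country X")
```

The result can be passed to `sns.barplot(x="proven_victims", y="name", data=...)` in the same way as in the cells above.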

## Top 5 Countries with Serial Killers

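Visualization showing the five countries with the most serial killers in the dataset. The pie chart percentages are shares within these five countries, not shares of the full dataset.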

In [None]:
# Code for top 5 countries with serial killers
# country_counts = df["country"].value_counts().head(5)
# plt.figure(figsize=(10, 10))
# plt.pie(country_counts, labels=country_counts.index, autopct='%1.1f%%', startangle=90)
# plt.title("Top 5 Countries with Serial Killers", fontsize=16)
# plt.axis('equal')
# plt.tight_layout()
# plt.savefig("serial_killer_figures/top5_countries.png")
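
Because `plt.pie` normalizes the values it receives, a pie of the top five counts shows shares within those five countries only. The sketch below contrasts that with each country's share of the whole dataset. The country labels are placeholders, and the sketch uses a top 2 rather than a top 5 to keep the toy data small.

```python
import pandas as pd

# Placeholder country labels, not real data.
countries = pd.Series(["X"] * 5 + ["Y"] * 3 + ["Z"] * 2, name="country")

counts = countries.value_counts()
top = counts.head(2)

# Share of every row in the dataset.
share_of_all = top / counts.sum()
# Share within the selected top group, which is what the pie chart displays.
share_within_top = top / top.sum()
```

Here "X" accounts for half of all rows but 62.5% of the top-2 group, so the two figures should not be read interchangeably.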

## Victims by Decade

Analysis of how the number of proven victims changes across decades. All of a killer's victims are attributed to the decade of that killer's first kill.

In [None]:
# Code for victims by decade analysis
# Create a decade column
# df["decade"] = (df["year_first_kill"] // 10) * 10
# decade_victims = df.groupby("decade")["proven_victims"].sum().reset_index()
# plt.figure(figsize=(12, 6))
# sns.lineplot(x="decade", y="proven_victims", data=decade_victims, marker='o')
# plt.title("Number of Serial Killer Victims by Decade", fontsize=16)
# plt.xlabel("Decade")
# plt.ylabel("Number of Victims")
# plt.grid(True)
# plt.tight_layout()
# plt.savefig("serial_killer_figures/victims_by_decade.png")
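
The decade column comes from floor division: `1979 // 10 * 10` gives `1970`. A small sketch of that bucketing on placeholder rows follows. Column names are taken from the cell above, and dropping rows with an unknown first-kill year is an assumption about how missing values should be handled.

```python
import pandas as pd

# Placeholder rows, not real data.
toy = pd.DataFrame({
    "year_first_kill": [1969, 1970, 1979, 1980, None],
    "proven_victims": [3, 5, 7, 2, 4],
})

bucketed = (
    toy.dropna(subset=["year_first_kill"])  # unknown year: no decade to assign
       .assign(decade=lambda d: (d["year_first_kill"] // 10 * 10).astype(int))
)

# Sum victims per decade of first kill.
decade_victims = bucketed.groupby("decade")["proven_victims"].sum()
```

In this toy data the 1970s bucket collects the rows for 1970 and 1979, while 1969 and 1980 fall into the neighbouring decades.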

## Killing Methods Distribution

Examination of the most common methods used by serial killers.

In [None]:
# Code for killing methods distribution
# method_counts = df["method"].value_counts().head(10)
# plt.figure(figsize=(12, 6))
# sns.barplot(x=method_counts.values, y=method_counts.index)
# plt.title("Most Common Killing Methods Used by Serial Killers", fontsize=16)
# plt.xlabel("Count")
# plt.tight_layout()
# plt.savefig("serial_killer_figures/methods_distribution.png")
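
`value_counts` treats each distinct string as its own category, so a cell that lists several methods, or differs only in case or whitespace, would be counted separately. Whether the dataset's `method` column looks like this is an assumption. If it does, a normalization step along these lines could run before counting. The comma delimiter and the rows are hypothetical.

```python
import pandas as pd

# Placeholder rows; the comma-separated format is an assumption.
toy = pd.DataFrame({
    "method": ["Method 1", "Method 1, Method 2", "method 2 ", "Method 1", None],
})

methods = (
    toy["method"]
    .dropna()            # skip rows with no recorded method
    .str.split(",")      # one list of methods per row
    .explode()           # one row per individual method
    .str.strip()         # remove stray whitespace
    .str.title()         # unify capitalization
)
method_counts = methods.value_counts()
```

After this step a killer with two recorded methods contributes to both categories, so the counts are method mentions rather than killers.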

## Age at First Kill Distribution

Analysis of the age distribution when serial killers committed their first murder.

In [None]:
# Code for age at first kill distribution
# plt.figure(figsize=(12, 6))
# sns.histplot(df["age_first_kill"], bins=20, kde=True)
# plt.title("Age Distribution at First Kill", fontsize=16)
# plt.xlabel("Age")
# plt.ylabel("Count")
# plt.tight_layout()
# plt.savefig("serial_killer_figures/age_distribution.png")
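
A histogram shows the shape of the distribution, and the share of ages inside a given range can be computed directly as a complement. The sketch below uses `Series.between`, which includes both endpoints by default. The ages are placeholder values and the 25 to 35 range is only an example.

```python
import pandas as pd

# Placeholder ages, not real data.
ages = pd.Series([19, 25, 30, 35, 42, None], name="age_first_kill")

known_ages = ages.dropna()
in_range = known_ages.between(25, 35)  # inclusive on both ends

# Mean of a boolean Series is the fraction of True values.
share_in_range = in_range.mean()
```

Three of the five known ages fall in the range here, so `share_in_range` is 0.6.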

## Key Findings

- Harold Shipman (UK) and Luis Garavito (Colombia) stand out with the highest number of proven victims
- Within the US, Gary Ridgway, John Wayne Gacy, and Ted Bundy are among the most deadly serial killers
- The United States has the highest number of documented serial killers in the dataset
- The 1970s appears to be the decade with the highest number of serial killer victims, counted by the decade of each killer's first kill
- Shooting and strangulation are the most common methods used by serial killers
- Most serial killers in the dataset committed their first murder between the ages of 25 and 35