# **Title: ByteWise Fellowship - ML/DL**

<h1 style="font-family: 'poppins'; font-weight: bold; color: #90EE90;">👨‍💻Author: Muhammad Uzair Afridi</h1>

[![GitHub](https://img.shields.io/badge/GitHub-Profile-blue?style=for-the-badge&logo=github)](https://github.com/uzairafridi00) 
[![Kaggle](https://img.shields.io/badge/Kaggle-Profile-blue?style=for-the-badge&logo=kaggle)](https://www.kaggle.com/muhammaduzairafridi) 
[![LinkedIn](https://img.shields.io/badge/LinkedIn-Profile-blue?style=for-the-badge&logo=linkedin)](https://www.linkedin.com/in/uzair-afridi00/)

____

# **Analyzing Crime Data in Los Angeles**

In [1]:
# Import required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

In [4]:
# Read in and preview the dataset
crimes = pd.read_csv("crimes.csv", parse_dates=["Date Rptd", "DATE OCC"], dtype={"TIME OCC": str})
crimes.head()

Unnamed: 0,Year,Population,Murder,Rape,Robbery,Assault,Burglary,CarTheft
0,1965,18073000,836,2320,28182,27464,183443,58452
1,1966,18258000,882,2439,30098,29142,196127,64368
2,1967,18336000,996,2665,40202,31261,219157,83775
3,1968,18113000,1185,2527,59857,34946,250918,104877
4,1969,18321000,1324,2902,64754,36890,248477,115400


## Which hour has the highest frequency of crimes? Store as an integer variable called `peak_crime_hour`

In [None]:
# Extract the first two digits from "TIME OCC", representing the hour,
# and convert to integer data type
crimes["HOUR OCC"] = crimes["TIME OCC"].str[:2].astype(int)
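
The slice works because `TIME OCC` was read in as a string. A minimal sketch on a few made-up values, assuming (as the two-character slice implies) that times are zero-padded 24-hour `HHMM` strings; `sample` is a hypothetical stand-in for `crimes`:

```python
import pandas as pd

# Hypothetical sample values, assuming zero-padded "HHMM" strings
sample = pd.DataFrame({"TIME OCC": ["0005", "1230", "2359"]})

# The first two characters are the hour; cast them to integers
sample["HOUR OCC"] = sample["TIME OCC"].str[:2].astype(int)

print(sample["HOUR OCC"].tolist())  # [0, 12, 23]
```

If the column had been read as integers instead, leading zeros would be lost and the slice would return the wrong digits, which is why `dtype={"TIME OCC": str}` matters in `read_csv`.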

In [None]:
# Preview the DataFrame to confirm the new column is correct
crimes.head()

In [None]:
# Produce a countplot to find the largest frequency of crimes by hour
sns.countplot(data=crimes, x="HOUR OCC")
plt.show()

In [None]:
# The countplot peaks at hour 12 (midday), so record that as the peak crime hour
peak_crime_hour = 12
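
The value above is read off the plot by eye. As a cross-check, the peak could also be derived from the counts directly. The sketch below uses a small made-up Series (`hours`, a hypothetical stand-in for `crimes["HOUR OCC"]`) rather than the real data:

```python
import pandas as pd

# Hypothetical hours standing in for crimes["HOUR OCC"]
hours = pd.Series([12, 12, 12, 18, 18, 3], name="HOUR OCC")

# value_counts() tallies each hour; idxmax() returns the hour with the highest tally
peak_hour = int(hours.value_counts().idxmax())

print(peak_hour)  # 12
```

On the real data the same pattern would be `int(crimes["HOUR OCC"].value_counts().idxmax())`, which keeps the result an integer as the task requires.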

## Which area has the largest frequency of night crimes (crimes committed between 10pm and 3:59am)? Save as a string variable called `peak_night_crime_location`

In [None]:
# Filter for the night-time hours
# 0 = midnight; 3 = crimes between 3am and 3:59am, i.e., don't include 4
night_time = crimes[crimes["HOUR OCC"].isin([22,23,0,1,2,3])]

In [None]:
# Group by "AREA NAME" and count occurrences, filtering for the largest value and saving the "AREA NAME"
peak_night_crime_location = (
    night_time.groupby("AREA NAME", as_index=False)["HOUR OCC"]
    .count()
    .sort_values("HOUR OCC", ascending=False)
    .iloc[0]["AREA NAME"]
)
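
A shorter route to the same answer is to count the area names directly. The sketch below is an alternative to the group-by chain, not the notebook's own method, and runs on a few made-up rows (`toy` and its contents are hypothetical):

```python
import pandas as pd

# Hypothetical rows standing in for the crimes DataFrame
toy = pd.DataFrame({
    "AREA NAME": ["Central", "Central", "Hollywood", "Central", "Hollywood", "Wilshire"],
    "HOUR OCC": [23, 1, 22, 14, 9, 3],
})

# Keep only 10pm to 3:59am, i.e. hours 22, 23, 0, 1, 2 and 3
night = toy[toy["HOUR OCC"].isin([22, 23, 0, 1, 2, 3])]

# Count rows per area and take the most frequent area name
top_area = night["AREA NAME"].value_counts().idxmax()

print(top_area)  # Central
```

Both approaches return one area name; if two areas tie for first place, which one is returned depends on the sort order, so a tie would need an explicit rule.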

In [None]:
# Print the peak night crime location
print(f"The area with the largest volume of night crime is {peak_night_crime_location}")

## Identify the number of crimes committed against victims by age group (0-17, 18-25, 26-34, 35-44, 45-54, 55-64, 65+). Save as a pandas Series called `victim_ages`


In [None]:
# Create bins and labels for victim age ranges
age_bins = [0, 17, 25, 34, 44, 54, 64, np.inf]
age_labels = ["0-17", "18-25", "26-34", "35-44", "45-54", "55-64", "65+"]

In [None]:
# Add a new column using pd.cut() to bin values into discrete intervals
crimes["Age Bracket"] = pd.cut(crimes["Vict Age"], bins=age_bins, labels=age_labels)
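
By default `pd.cut()` builds right-closed intervals, so with these bins the first bracket is (0, 17]: ages 1 to 17 are labelled "0-17", while an age of exactly 0 (or any negative value) falls outside every bin and becomes `NaN`. The sketch below illustrates this on made-up ages; whether age 0 should be counted depends on how the dataset encodes unknown ages, which this notebook does not state.

```python
import numpy as np
import pandas as pd

age_bins = [0, 17, 25, 34, 44, 54, 64, np.inf]
age_labels = ["0-17", "18-25", "26-34", "35-44", "45-54", "55-64", "65+"]

# Hypothetical ages chosen to sit on the bin edges
ages = pd.Series([0, 1, 17, 18, 25, 26, 64, 65, -1])

# Default behaviour: intervals are open on the left, closed on the right
brackets = pd.cut(ages, bins=age_bins, labels=age_labels)
print(brackets.tolist())

# include_lowest=True makes the first interval include its left edge (age 0)
inclusive = pd.cut(ages, bins=age_bins, labels=age_labels, include_lowest=True)
print(inclusive.tolist())
```

The first print shows `nan` for the ages 0 and -1; the second assigns age 0 to "0-17" and still leaves -1 unbinned.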

In [None]:
# Count crimes per age bracket (value_counts() sorts from most to least frequent)
victim_ages = crimes["Age Bracket"].value_counts()
print(victim_ages)
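
`value_counts()` orders the result by frequency. If the brackets should instead appear in age order, `sort_index()` follows the category order that `pd.cut()` defines. A minimal sketch on made-up labels (`toy` and `labels` are hypothetical):

```python
import pandas as pd

labels = ["0-17", "18-25", "26-34"]

# Hypothetical bracket values as an ordered categorical, like pd.cut() produces
toy = pd.Series(pd.Categorical(
    ["26-34", "18-25", "26-34", "0-17", "26-34", "18-25"],
    categories=labels,
    ordered=True,
))

by_freq = toy.value_counts()       # most frequent bracket first
by_bracket = by_freq.sort_index()  # youngest bracket first

print(by_freq)
print(by_bracket)
```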