# ANALYSIS OF CAR CRASHES IN NIGERIA

## Overview:
This dataset offers a detailed examination of road traffic crashes in Nigeria, covering the period from Q4 2020 to Q1 2024. It includes quarterly data on the total number of crashes, injuries, fatalities, and vehicles involved, along with key contributing factors such as speed violations, driving under the influence, and poor weather conditions.

## Stakeholders
* Policymakers
* Traffic safety analysts


## Data source
The data is sourced from official traffic records and provides insights into the factors influencing road safety in Nigeria.

## Tools used for analysis
* Python was used for data cleaning
* SQL was used for further analysis of part of the data
* Tableau was used for visualization


## Features:
#### Quarter
Description: The quarter in which the data is recorded (e.g., Q4 2020, Q1 2021). This field serves as the temporal reference for the dataset.
#### State
Description: The Nigerian state where the traffic crashes occurred. This variable allows for regional analysis of crash data.
#### Total Crashes
Description: The total number of road traffic crashes reported per quarter for each state.
#### Number Injured
Description: The total number of individuals injured in road traffic crashes per quarter. This metric indicates the severity of the crashes.
#### Number Killed
Description: The total number of fatalities resulting from road traffic crashes per quarter.
#### Total Vehicles Involved
Description: The total number of vehicles involved in the crashes per quarter. This variable can be used to analyze traffic volume and crash rates.
#### Speed Violation (SPV)
Description: The number of crashes attributed to speed violations. This factor is critical in understanding the role of speeding in road traffic crashes.
#### Driving Under Alcohol/Drug Influence (DAD)
Description: The number of crashes where driving under the influence of alcohol or drugs was a contributing factor.
#### Poor Weather (PWR)
Description: The number of crashes that occurred under poor weather conditions, providing insights into weather-related risks.
#### Fatigue (FTQ)
Description: The number of crashes attributed to driver fatigue. This variable highlights the impact of driver alertness on road safety.

## Case study

The following questions will guide my analysis:
* Which state has the highest number of car crashes?
* What is the trend of car crashes over time?
* What are the top causes of car crashes in Nigeria?
* Which periods of the year have high numbers of car crashes?

## Below are the SQL queries, Python code, and visualizations from the analysis of the dataset

Import libraries to be used in the analysis

In [1]:
import pandas as pd

Load the data from the csv file

In [2]:
data = pd.read_csv('Nigerian_Road_Traffic_Crashes_2020_2024.csv')

### Getting some basic information about the dataset

### 1. head()

In [3]:
data.head(2)

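|   | Quarter | State | Total_Crashes | Num_Injured | Num_Killed | Total_Vehicles_Involved | SPV | DAD | PWR | FTQ | Other_Factors |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Q4 2020 | Abia | 30 | 146 | 31 | 37 | 19 | 0 | 0 | 0 | 18 |
| 1 | Q4 2020 | Adamawa | 77 | 234 | 36 | 94 | 57 | 0 | 0 | 0 | 37 |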


### 2. tail()

In [4]:
data.tail(2)

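|   | Quarter | State | Total_Crashes | Num_Injured | Num_Killed | Total_Vehicles_Involved | SPV | DAD | PWR | FTQ | Other_Factors |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 516 | Q1 2024 | Yobe | 39 | 234 | 13 | 55 | 38 | 0 | 0 | 0 | 17 |
| 517 | Q1 2024 | Zamfara | 13 | 61 | 14 | 16 | 14 | 0 | 0 | 0 | 2 |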


### 3. shape

In [5]:
data.shape

(518, 11)

### 4. size

In [6]:
data.size

5698

### 5. columns

In [7]:
data.columns

Index(['Quarter', 'State', 'Total_Crashes', 'Num_Injured', 'Num_Killed',
       'Total_Vehicles_Involved', 'SPV', 'DAD', 'PWR', 'FTQ', 'Other_Factors'],
      dtype='object')

### 6. data types

In [8]:
data.dtypes

Quarter                    object
State                      object
Total_Crashes               int64
Num_Injured                 int64
Num_Killed                  int64
Total_Vehicles_Involved     int64
SPV                         int64
DAD                         int64
PWR                         int64
FTQ                         int64
Other_Factors               int64
dtype: object

### 7. info()

In [9]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 518 entries, 0 to 517
Data columns (total 11 columns):
 #   Column                   Non-Null Count  Dtype 
---  ------                   --------------  ----- 
 0   Quarter                  518 non-null    object
 1   State                    518 non-null    object
 2   Total_Crashes            518 non-null    int64 
 3   Num_Injured              518 non-null    int64 
 4   Num_Killed               518 non-null    int64 
 5   Total_Vehicles_Involved  518 non-null    int64 
 6   SPV                      518 non-null    int64 
 7   DAD                      518 non-null    int64 
 8   PWR                      518 non-null    int64 
 9   FTQ                      518 non-null    int64 
 10  Other_Factors            518 non-null    int64 
dtypes: int64(9), object(2)
memory usage: 44.6+ KB


### Are there any duplicate records in the dataset? If yes, then remove the duplicate records.

In [13]:
data[data.duplicated()].sum()

Quarter                    0
State                      0
Total_Crashes              0
Num_Injured                0
Num_Killed                 0
Total_Vehicles_Involved    0
SPV                        0
DAD                        0
PWR                        0
FTQ                        0
Other_Factors              0
dtype: object

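The sum over the rows flagged by `data.duplicated()` is 0 for every column because no rows are flagged, so there are no duplicated records in our data and nothing needs to be removed.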
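A more direct check is to count the rows that `duplicated()` flags and, if there are any, drop them with `drop_duplicates()`. The sketch below uses a small made-up frame so that the non-zero case is visible; the `toy` frame and its rows are illustrative only and are not taken from the dataset.

```python
import pandas as pd

# Toy frame with one repeated row; the values are illustrative only.
toy = pd.DataFrame(
    {
        "Quarter": ["Q4 2020", "Q4 2020", "Q4 2020"],
        "State": ["Abia", "Adamawa", "Abia"],
        "Total_Crashes": [30, 77, 30],
    }
)

# duplicated() marks every repeat of an earlier row as True,
# so summing the boolean Series gives the number of duplicate rows.
dup_count = int(toy.duplicated().sum())

# drop_duplicates() keeps the first occurrence of each row.
deduped = toy.drop_duplicates().reset_index(drop=True)

print(dup_count)
print(len(deduped))
```

On the real data the same two calls would be applied to `data` instead of `toy`.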

### Are there any NULL values present in any columns?

In [14]:
data.isnull().sum()

Quarter                    0
State                      0
Total_Crashes              0
Num_Injured                0
Num_Killed                 0
Total_Vehicles_Involved    0
SPV                        0
DAD                        0
PWR                        0
FTQ                        0
Other_Factors              0
dtype: int64

From the above output, no column contains NULL values.

## Below are the SQL queries used to produce the tables required for visualization in Tableau

In [None]:
-- This is to get a quick overview of the data
select * from nigerian_car_crashes.traffic_crashes;

In [None]:
-- This is to view the columns in our dataset
show columns in nigerian_car_crashes.traffic_crashes;

### Crashes per state

In [None]:
-- crashes per state
select State , sum(Total_Crashes) as crashes_per_state
from nigerian_car_crashes.traffic_crashes
group by State
order by crashes_per_state  desc;

![crashes per state](crashes_per_state.png)
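The per-state ranking can also be cross-checked in pandas with a `groupby`. This is a minimal sketch on made-up rows, not the author's workflow: the `toy` frame is hypothetical, and on the real data the same chain would be applied to `data`.

```python
import pandas as pd

# Illustrative rows only; the real analysis would use the full `data` frame.
toy = pd.DataFrame(
    {
        "Quarter": ["Q4 2020", "Q4 2020", "Q1 2021", "Q1 2021"],
        "State": ["Abia", "Adamawa", "Abia", "Adamawa"],
        "Total_Crashes": [30, 77, 25, 60],
    }
)

# Equivalent of: select State, sum(Total_Crashes) ... group by State order by ... desc
crashes_per_state = (
    toy.groupby("State")["Total_Crashes"]
    .sum()
    .sort_values(ascending=False)
)

print(crashes_per_state)
```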

### Crashes per quarter

In [None]:
-- crashes per quarter
select Quarter, sum(Total_Crashes) as crashes_per_quarter
from nigerian_car_crashes.traffic_crashes
group by Quarter
order by crashes_per_quarter;

![crashes per quarter](crashes_per_quarter.png)

### Fatalities per state

In [None]:
-- fatalities per state
select State , sum(Num_Killed) as fatalities_per_state
from nigerian_car_crashes.traffic_crashes
group by State
order by fatalities_per_state desc;

![fatalities per state](fatalities_per_state2.png)

### Injuries per state

In [None]:
-- injuries per state
select State , sum(Num_Injured) as injuries_per_state
from nigerian_car_crashes.traffic_crashes
group by State
order by injuries_per_state desc;

![injuries per state](<injuries per state2.png>)

### Trend of crashes over time

In [None]:
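-- Trend Analysis Over Time
-- Quarter is text such as 'Q4 2020', so order by year first and quarter second
-- to get chronological rather than alphabetical order
SELECT Quarter, SUM(Total_Crashes) AS total_crashes, SUM(Num_Killed) AS total_deaths
FROM nigerian_car_crashes.traffic_crashes
GROUP BY Quarter
ORDER BY RIGHT(Quarter, 4), LEFT(Quarter, 2);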

![trend of crashes over time](crashes_trend.png)
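Because `Quarter` is stored as text such as `Q4 2020`, a plain string sort puts every Q1 label before every Q2 label regardless of year, which matters for any trend chart. The sketch below shows the difference on a handful of labels; `quarter_key` is a hypothetical helper and assumes every label has the form `Qn YYYY`.

```python
def quarter_key(label: str) -> tuple[int, int]:
    """Turn a label such as 'Q4 2020' into (year, quarter) for sorting."""
    quarter, year = label.split()
    return int(year), int(quarter[1:])


labels = ["Q1 2021", "Q1 2024", "Q2 2021", "Q4 2020", "Q4 2023"]

# Plain string sorting groups all Q1 labels first, regardless of year.
alphabetical = sorted(labels)

# Sorting on (year, quarter) restores calendar order.
chronological = sorted(labels, key=quarter_key)

print(alphabetical)
print(chronological)
```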

### Total crashes by quarter of the year

In [None]:
-- To show total crashes for grouped quarters
select
case when Quarter like '%Q1%' then 'Q1'
     when Quarter like '%Q2%' then 'Q2'
     when Quarter like '%Q3%' then 'Q3'
     when Quarter like '%Q4%' then 'Q4'
     else 'Unknown_Quarter'
     end as Specific_Quarter,
     sum(Total_Crashes) as total_crashes
from nigerian_car_crashes.traffic_crashes
group by Specific_Quarter;



![share of crashes per quarter of the year](pie_chart_crashes_per_quarter.png)
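The dataset runs from Q4 2020 to Q1 2024, so Q1 and Q4 each occur four times while Q2 and Q3 occur three times. Raw totals therefore favor Q1 and Q4, and dividing each total by the number of periods observed gives a fairer comparison. The sketch below illustrates the idea with Python's built-in `sqlite3` module on made-up rows; the unqualified table name, the `periods_observed` and `avg_per_period` aliases, and the numbers are assumptions for illustration, and SQLite's `substr` stands in for the `case` expression above.

```python
import sqlite3

# Toy rows (Quarter, Total_Crashes); every period has the same made-up count
# so that any difference in the totals comes only from how often a quarter occurs.
rows = [
    ("Q4 2020", 100), ("Q1 2021", 100), ("Q2 2021", 100),
    ("Q3 2021", 100), ("Q4 2021", 100), ("Q1 2022", 100),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE traffic_crashes (Quarter TEXT, Total_Crashes INTEGER)")
conn.executemany("INSERT INTO traffic_crashes VALUES (?, ?)", rows)

query = """
SELECT substr(Quarter, 1, 2)                               AS specific_quarter,
       SUM(Total_Crashes)                                  AS total_crashes,
       COUNT(DISTINCT Quarter)                             AS periods_observed,
       SUM(Total_Crashes) * 1.0 / COUNT(DISTINCT Quarter)  AS avg_per_period
FROM traffic_crashes
GROUP BY specific_quarter
ORDER BY specific_quarter
"""
result = conn.execute(query).fetchall()
conn.close()

for row in result:
    print(row)
```

In this toy case Q1 and Q4 show twice the total of Q2 and Q3, yet the per-period average is identical for all four.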

### Car crashes causes

In [None]:
-- Car crash causes
select sum(SPV) as speeding, sum(DAD) as drunk_driving, sum(PWR) as poor_weather,
 sum(FTQ) as fatigue, sum(Other_Factors) as other_factors
from nigerian_car_crashes.traffic_crashes;

![causes of car crashes](crashes_cause.png)
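Converting the summed causes into percentage shares makes them easier to compare. The sketch below does this for only the four sample rows displayed in the `head()` and `tail()` output earlier, so the resulting shares illustrate the calculation and are not the result for the full dataset.

```python
# The four sample rows shown in the head()/tail() output:
# (SPV, DAD, PWR, FTQ, Other_Factors)
sample_rows = [
    (19, 0, 0, 0, 18),  # Abia, Q4 2020
    (57, 0, 0, 0, 37),  # Adamawa, Q4 2020
    (38, 0, 0, 0, 17),  # Yobe, Q1 2024
    (14, 0, 0, 0, 2),   # Zamfara, Q1 2024
]
causes = ["speeding", "drunk_driving", "poor_weather", "fatigue", "other_factors"]

# Column-wise totals, mirroring the sum(...) expressions in the SQL query.
totals = {cause: sum(row[i] for row in sample_rows) for i, cause in enumerate(causes)}

# Share of each cause as a percentage of all recorded causes.
grand_total = sum(totals.values())
shares = {cause: round(100 * count / grand_total, 1) for cause, count in totals.items()}

print(totals)
print(shares)
```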

## Key Takeaways
* The FCT and Kaduna State have the highest number of car crashes
* Kaduna State has the highest number of fatalities and injuries
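* The 4th and 1st quarters have the highest total number of car crashes. Note that Q1 and Q4 each occur four times between Q4 2020 and Q1 2024, while Q2 and Q3 occur three times, so the raw totals favor Q1 and Q4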
* Speeding was the most frequently recorded cause of car crashes
* Car crashes appear to be decreasing over time

## Recommendations
1. **FCT and Kaduna State**: High Number of Car Crashes

   **Recommendation: Improve Road Safety Infrastructure**
   - **FCT and Kaduna State:** Focus on upgrading road safety infrastructure in these regions. This could include installing more traffic lights, improving road signs, and enhancing street lighting to reduce accidents, especially in high-risk areas.
   - **Increased Law Enforcement:** Strengthen traffic law enforcement, particularly in accident-prone areas, to ensure compliance with road safety regulations.

2. **Kaduna State**: Highest Number of Fatalities and Injuries

   **Recommendation: Emergency Response Enhancements**
   - **Rapid Response Services:** Improve emergency response services, including quicker ambulance dispatch and better-equipped hospitals to handle trauma cases. This could potentially reduce the severity of injuries and fatalities.
   - **Public Awareness Campaigns:** Launch campaigns focused on the importance of using seatbelts, helmets, and other safety gear, which can significantly reduce the impact of accidents.

3. **1st and 4th Quarters**: High Number of Car Crashes

   **Recommendation: Seasonal Road Safety Measures**
   - **Targeted Safety Campaigns:** Implement road safety awareness campaigns before and during the 1st and 4th quarters, focusing on the specific risks associated with these periods (e.g., festive seasons, end-of-year rush).
   - **Enhanced Traffic Management:** Increase the presence of traffic officers during these quarters to manage congestion and enforce road safety laws. Implement temporary measures such as speed limits in high-risk areas during these periods.

4. **Speeding**: Leading Cause of Car Crashes

   **Recommendation: Speed Control Measures**
   - **Speed Cameras and Radar:** Install speed cameras and use radar guns to monitor and penalize speeding drivers, particularly in areas with a high incidence of crashes.
   - **Speed Bumps and Reduced Speed Limits:** Introduce speed bumps and lower speed limits in residential areas, school zones, and other high-risk areas to reduce the likelihood of speeding.
   - **Driver Education:** Launch educational programs targeting drivers, highlighting the dangers of speeding and promoting adherence to speed limits.

### **Summary of Implementation Strategy:**
- **Short-Term:** Begin with targeted public awareness campaigns and enhanced law enforcement during high-risk quarters and in FCT and Kaduna.
- **Medium-Term:** Invest in speed control infrastructure and improve emergency response services, particularly in Kaduna, where the impact of crashes is most severe.
- **Long-Term:** Continuously assess the effectiveness of these interventions and adjust strategies accordingly, ensuring sustained reductions in car crashes, fatalities, and injuries.


## Things to consider

The following are the key variables and factors to take into account for each of the recommendations above:

### 1. **FCT and Kaduna State: High Number of Car Crashes**
**Recommendation: Improve Road Safety Infrastructure**
   - **Variables to Consider:**
     - **Accident Hotspots:** Identify and map locations with the highest accident rates within FCT and Kaduna (e.g., intersections, highways, urban vs. rural roads).
     - **Road Conditions:** Assess the current state of road infrastructure, including the presence of potholes, road markings, and signage quality.
     - **Traffic Volume:** Analyze traffic density and patterns in different areas, as high-traffic zones may require more infrastructure upgrades.
     - **Existing Traffic Control Measures:** Evaluate the current traffic control measures (traffic lights, signs, speed bumps) to identify gaps and inefficiencies.
     - **Lighting Conditions:** Measure the adequacy of street lighting, particularly in areas with frequent nighttime accidents.
     - **Law Enforcement Data:** Review traffic violation records to understand the common types of infractions in these areas.

**Recommendation: Increased Law Enforcement**
   - **Variables to Consider:**
     - **Resource Allocation:** Determine the availability and distribution of traffic law enforcement officers in high-risk areas.
     - **Response Times:** Measure the response times of law enforcement to accidents and traffic violations.
     - **Enforcement Technologies:** Assess the current use of technology (e.g., speed cameras, breathalyzers) in traffic enforcement.
     - **Public Compliance:** Monitor public compliance with traffic laws and the impact of enforcement on reducing accidents.

### 2. **Kaduna State: Highest Number of Fatalities and Injuries**
**Recommendation: Emergency Response Enhancements**
   - **Variables to Consider:**
     - **Emergency Response Time:** Analyze the time taken from the occurrence of an accident to the arrival of emergency services.
     - **Accessibility of Healthcare Facilities:** Evaluate the proximity of trauma centers and hospitals to accident-prone areas.
     - **Ambulance Availability:** Consider the number and distribution of ambulances and their ability to respond quickly to accidents.
     - **Equipment and Training:** Assess the equipment and training levels of emergency response teams, focusing on trauma care.
     - **Mortality and Injury Rates:** Study the correlation between response times and mortality/injury rates to identify critical intervention points.

**Recommendation: Public Awareness Campaigns**
   - **Variables to Consider:**
     - **Demographic Data:** Identify the target demographics most affected by accidents (e.g., young drivers, motorcyclists).
     - **Current Awareness Levels:** Conduct surveys or use existing data to gauge public awareness of road safety measures.
     - **Campaign Reach:** Measure the effectiveness of past public awareness campaigns in reaching and influencing the target audience.
     - **Behavioral Change:** Monitor changes in driver behavior (e.g., seatbelt usage, helmet wearing) as a result of awareness campaigns.

### 3. **1st and 4th Quarters: High Number of Car Crashes**
**Recommendation: Seasonal Road Safety Measures**
   - **Variables to Consider:**
     - **Seasonal Traffic Patterns:** Analyze traffic volume trends during the 1st and 4th quarters, particularly around holidays and festive seasons.
     - **Weather Conditions:** Consider the impact of weather (e.g., harmattan dust, rainfall) on road safety during these quarters.
     - **Event-Driven Traffic:** Identify major events (e.g., religious festivals, public holidays) that contribute to higher traffic and accident rates.
     - **Campaign Timing:** Optimize the timing of road safety campaigns to precede peak accident periods.
     - **Law Enforcement Deployment:** Review the deployment of traffic officers and their impact during these high-risk quarters.

**Recommendation: Enhanced Traffic Management**
   - **Variables to Consider:**
     - **Traffic Flow Analysis:** Study traffic flow and congestion patterns during peak periods in high-risk areas.
     - **Temporary Traffic Measures:** Evaluate the effectiveness of temporary measures such as speed limits and road closures during peak periods.
     - **Public Compliance:** Monitor public adherence to temporary traffic measures and the resulting impact on accident rates.

### 4. **Speeding: Leading Cause of Car Crashes**
**Recommendation: Speed Control Measures**
   - **Variables to Consider:**
     - **Speeding Hotspots:** Identify locations with the highest incidences of speeding and correlate with accident data.
     - **Effectiveness of Speed Cameras:** Analyze the effectiveness of existing speed cameras in reducing speeding and accidents.
     - **Driver Demographics:** Consider the age, gender, and driving experience of individuals most likely to speed.
     - **Road Design:** Assess whether road design encourages or discourages speeding (e.g., long straight stretches vs. curved roads).
     - **Public Perception:** Gauge public perception and acceptance of speed control measures like cameras and speed bumps.

**Recommendation: Driver Education**
   - **Variables to Consider:**
     - **Target Audience:** Identify high-risk groups (e.g., young drivers, commercial drivers) for targeted education.
     - **Current Knowledge Levels:** Survey drivers to understand their current knowledge and attitudes toward speeding and road safety.
     - **Education Delivery Methods:** Evaluate the effectiveness of different educational approaches (e.g., workshops, media campaigns, school programs).
     - **Behavioral Impact:** Monitor changes in driving behavior as a result of education programs.

### **Summary of Data Considerations:**
- **Geospatial Data:** For mapping accident hotspots and infrastructure needs.
- **Traffic Volume and Patterns:** For understanding where and when interventions are needed.
- **Demographic Data:** To tailor public awareness campaigns and driver education programs.
- **Healthcare Access and Response Times:** For improving emergency response services.
- **Weather and Seasonal Data:** To inform seasonal safety campaigns.
- **Enforcement and Compliance Data:** To assess the effectiveness of law enforcement and public adherence to traffic laws. 

These variables will help design more targeted and effective interventions based on the specific conditions and risks in each region.