# Dataset Features:

### Weather: The impact of weather conditions on the likelihood of accidents.

Clear: No adverse weather conditions.

Rainy: Rainy conditions increase the chance of accidents.

Foggy: Foggy conditions reduce visibility, increasing accident chances.

Snowy: Snow can cause slippery roads and higher accident probability.

Stormy: Stormy weather can create hazardous driving conditions.


### Road_Type: The type of road, influencing the probability of accidents.

Highway: High-speed roads with higher chances of severe accidents.

City Road: Roads within city limits, typically with more traffic and lower speeds.

Rural Road: Roads outside urban areas, often with fewer vehicles and lower speeds.

Mountain Road: Roads with curves and elevation changes, increasing accident risk.

### Time_of_Day: The time of day when the accident occurs.

Morning: The period between sunrise and noon.

Afternoon: The period between noon and evening.

Evening: The period just before sunset.

Night: The nighttime, often associated with reduced visibility and higher risk.

### Traffic_Density: The level of traffic on the road.

0: Low density (few vehicles).

1: Moderate density.

2: High density (many vehicles).

### Speed_Limit: The maximum allowed speed on the road.

### Number_of_Vehicles: The number of vehicles involved in the accident, documented as ranging from 1 to 5 (the raw data contains values up to 14).

### Driver_Alcohol: Whether the driver consumed alcohol.

0: No alcohol consumption.

1: Alcohol consumption (which increases the likelihood of an accident).

### Accident_Severity: The severity of the accident.

Low: Minor accident.

Moderate: Moderate accident with some damage or injuries.

High: Severe accident with significant damage or injuries.

### Road_Condition: The condition of the road surface.

Dry: Dry roads with minimal risk.

Wet: Wet roads due to rain, increasing the risk of accidents.

Icy: Ice on the road, significantly increasing the risk of accidents.

Under Construction: Roads under construction, which may have obstacles or poor road quality.

### Vehicle_Type: The type of vehicle involved in the accident.

Car: A regular passenger car.

Truck: A large vehicle used for transporting goods.

Motorcycle: A two-wheeled motor vehicle.

Bus: A large vehicle used for public transportation.

### Driver_Age: The age of the driver. Values range from 18 to 70 years old.

### Driver_Experience: The driver's years of driving experience, documented as ranging from 0 to 50 (the raw data contains values from 9 to 69).

### Road_Light_Condition: The lighting conditions on the road.

Daylight: Daytime, when visibility is typically good.

Artificial Light: Road is illuminated with streetlights.

No Light: Road is not illuminated, typically during the night in poorly lit areas.
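
The category lists above can be checked against the loaded data. The following is a minimal sketch, not part of the original notebook: the helper name `unexpected_categories` and the toy frame are hypothetical, and the allowed values are copied from the descriptions in this section.

```python
import pandas as pd

# Category values as listed in the feature descriptions above.
DOCUMENTED_VALUES = {
    "Weather": {"Clear", "Rainy", "Foggy", "Snowy", "Stormy"},
    "Road_Type": {"Highway", "City Road", "Rural Road", "Mountain Road"},
    "Time_of_Day": {"Morning", "Afternoon", "Evening", "Night"},
    "Accident_Severity": {"Low", "Moderate", "High"},
    "Road_Condition": {"Dry", "Wet", "Icy", "Under Construction"},
    "Vehicle_Type": {"Car", "Truck", "Motorcycle", "Bus"},
    "Road_Light_Condition": {"Daylight", "Artificial Light", "No Light"},
}


def unexpected_categories(frame: pd.DataFrame) -> dict:
    """Return, per column, the non-null values that are not documented."""
    problems = {}
    for column, allowed in DOCUMENTED_VALUES.items():
        if column not in frame.columns:
            continue  # tolerate frames that hold only some of the columns
        observed = set(frame[column].dropna().unique())
        extra = observed - allowed
        if extra:
            problems[column] = sorted(extra)
    return problems


# Toy frame invented for illustration only (not rows from the dataset).
sample = pd.DataFrame({
    "Weather": ["Clear", "Rainy", "Hail", None],
    "Road_Type": ["Highway", "City Road", "Rural Road", "Mountain Road"],
})
print(unexpected_categories(sample))
```

On the real data the same call would be `unexpected_categories(df)`. An empty dictionary would suggest that every categorical column matches its description.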

# Analysis Goal:
---
Identify the main causes and factors that affect the likelihood of road traffic accidents. The analysis examines the influence of variables such as weather conditions, road type, speed limit, and others.

# Data Size:
---
Dataset size: 840 rows and 14 columns.

# Descriptive Statistics of the Dataset
---

In [100]:
df.describe()

| | Traffic_Density | Speed_Limit | Number_of_Vehicles | Driver_Alcohol | Driver_Age | Driver_Experience | Accident |
|---|---|---|---|---|---|---|---|
| count | 798.0 | 798.0 | 798.0 | 798.0 | 798.0 | 798.0 | 798.0 |
| mean | 1.001253 | 71.050125 | 3.286967 | 0.160401 | 43.259398 | 38.981203 | 0.299499 |
| std | 0.784894 | 32.052458 | 2.017267 | 0.367208 | 15.129856 | 15.273201 | 0.458326 |
| min | 0.0 | 30.0 | 1.0 | 0.0 | 18.0 | 9.0 | 0.0 |
| 25% | 0.0 | 50.0 | 2.0 | 0.0 | 30.0 | 26.0 | 0.0 |
| 50% | 1.0 | 60.0 | 3.0 | 0.0 | 43.0 | 39.0 | 0.0 |
| 75% | 2.0 | 80.0 | 4.0 | 0.0 | 56.0 | 52.75 | 1.0 |
| max | 2.0 | 213.0 | 14.0 | 1.0 | 69.0 | 69.0 | 1.0 |
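
The summary shows maxima that exceed the documented ranges, for example 14 for `Number_of_Vehicles` and 69 for `Driver_Experience`. A minimal sketch of how such values could be counted is given below. The helper `count_out_of_range` and the toy frame are hypothetical, and the bounds are taken from the feature descriptions rather than from any cleaning rule in the notebook.

```python
import pandas as pd

# Inclusive bounds copied from the feature descriptions (an assumption about
# what counts as valid; adjust if the documentation changes).
DOCUMENTED_RANGES = {
    "Number_of_Vehicles": (1, 5),
    "Driver_Age": (18, 70),
    "Driver_Experience": (0, 50),
}


def count_out_of_range(frame: pd.DataFrame) -> dict:
    """Count non-missing values that fall outside the documented bounds."""
    counts = {}
    for column, (low, high) in DOCUMENTED_RANGES.items():
        if column not in frame.columns:
            continue
        values = frame[column]
        # Missing values are excluded here; they are a separate issue.
        outside = values.notna() & ~values.between(low, high)
        counts[column] = int(outside.sum())
    return counts


# Toy frame invented for illustration only (not rows from the dataset).
sample = pd.DataFrame({
    "Number_of_Vehicles": [1.0, 5.0, 14.0, None],
    "Driver_Age": [18.0, 69.0, 43.0, 30.0],
    "Driver_Experience": [9.0, 69.0, 39.0, None],
})
print(count_out_of_range(sample))
```

Whether out-of-range rows should be dropped, clipped, or kept is a separate decision; this sketch only reports how many there are.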


# Columns:
---

In [104]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 840 entries, 0 to 839
Data columns (total 14 columns):
 #   Column                Non-Null Count  Dtype  
---  ------                --------------  -----  
 0   Weather               798 non-null    object 
 1   Road_Type             798 non-null    object 
 2   Time_of_Day           798 non-null    object 
 3   Traffic_Density       798 non-null    float64
 4   Speed_Limit           798 non-null    float64
 5   Number_of_Vehicles    798 non-null    float64
 6   Driver_Alcohol        798 non-null    float64
 7   Accident_Severity     798 non-null    object 
 8   Road_Condition        798 non-null    object 
 9   Vehicle_Type          798 non-null    object 
 10  Driver_Age            798 non-null    float64
 11  Driver_Experience     798 non-null    float64
 12  Road_Light_Condition  798 non-null    object 
 13  Accident              798 non-null    float64
dtypes: float64(7), object(7)
memory usage: 92.0+ KB
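
The output above reports 840 entries but only 798 non-null values per column, so each column has 840 − 798 = 42 missing values (5%). The sketch below shows one way to tabulate this directly. The helper `missing_summary` and the toy frame are hypothetical; `isna`, `sum`, and `mean` are standard pandas calls.

```python
import pandas as pd


def missing_summary(frame: pd.DataFrame) -> pd.DataFrame:
    """Return the count and share of missing values for every column."""
    summary = pd.DataFrame({
        "missing": frame.isna().sum(),   # number of NaN/None cells
        "share": frame.isna().mean(),    # fraction of rows that are missing
    })
    return summary.sort_values("missing", ascending=False)


# Toy frame invented for illustration only (not rows from the dataset).
sample = pd.DataFrame({
    "Weather": ["Clear", None, "Rainy", "Foggy"],
    "Speed_Limit": [60.0, 80.0, None, None],
})
print(missing_summary(sample))
```

Applied to the real data as `missing_summary(df)`, the `missing` column should agree with the difference between the row count and the non-null counts listed by `df.info()`.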


# Imports

In [52]:
import pandas as pd 
import numpy as np
import matplotlib.pyplot as plt

# Dataset

In [95]:
df = pd.read_csv('dataset_traffic_accident_prediction1.csv')
df

| | Weather | Road_Type | Time_of_Day | Traffic_Density | Speed_Limit | Number_of_Vehicles | Driver_Alcohol | Accident_Severity | Road_Condition | Vehicle_Type | Driver_Age | Driver_Experience | Road_Light_Condition | Accident |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Rainy | City Road | Morning | 1.0 | 100.0 | 5.0 | 0.0 | NaN | Wet | Car | 51.0 | 48.0 | Artificial Light | 0.0 |
| 1 | Clear | Rural Road | Night | NaN | 120.0 | 3.0 | 0.0 | Moderate | Wet | Truck | 49.0 | 43.0 | Artificial Light | 0.0 |
| 2 | Rainy | Highway | Evening | 1.0 | 60.0 | 4.0 | 0.0 | Low | Icy | Car | 54.0 | 52.0 | Artificial Light | 0.0 |
| 3 | Clear | City Road | Afternoon | 2.0 | 60.0 | 3.0 | 0.0 | Low | Under Construction | Bus | 34.0 | 31.0 | Daylight | 0.0 |
| 4 | Rainy | Highway | Morning | 1.0 | 195.0 | 11.0 | 0.0 | Low | Dry | Car | 62.0 | 55.0 | Artificial Light | 1.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 835 | Clear | Highway | Night | 2.0 | 30.0 | 4.0 | 0.0 | Low | Dry | Car | 23.0 | 15.0 | Daylight | 0.0 |
| 836 | Rainy | Rural Road | Evening | 2.0 | 60.0 | 4.0 | 0.0 | Low | Dry | Motorcycle | 52.0 | 46.0 | Daylight | 1.0 |
| 837 | Foggy | Highway | Evening | NaN | 30.0 | 4.0 | 1.0 | High | Dry | Car | NaN | 34.0 | Artificial Light | NaN |
| 838 | Foggy | Highway | Afternoon | 2.0 | 60.0 | 3.0 | 0.0 | Low | Dry | Car | 25.0 | 19.0 | Artificial Light | 0.0 |


# Removing unnecessary columns and cleaning data
