# Guided Project: Finding Heavy Traffic Indicators on I-94

- The goal is to analyze a dataset on westbound traffic on the I-94 interstate highway. The main challenge in this analysis is to identify a few indicators of heavy traffic on I-94. These indicators could be the weather type, the time of day, the time of the week, and so on. For example, we may find that traffic is usually heavier in summer or when it snows.

### The I-94 Traffic Dataset

For more details, check the [dataset data dictionary](https://archive.ics.uci.edu/dataset/492/metro+interstate+traffic+volume).

In [None]:
import pandas as pd
df = pd.read_csv('Metro_Interstate_Traffic_Volume.csv')
df.info()

In [None]:
# Analyzing traffic volume

import matplotlib.pyplot as plt
plt.hist(df['traffic_volume'])
plt.show()
df['traffic_volume'].describe()

In [None]:
# Traffic volume: day vs. night

df['date_time'] = pd.to_datetime(df['date_time'])
# Filter first, then copy, so that columns added later go to an independent DataFrame.
filter_day = df[(df['date_time'].dt.hour >= 7) & (df['date_time'].dt.hour < 19)].copy()
print(filter_day.shape)
filter_night = df[(df['date_time'].dt.hour >= 19) | (df['date_time'].dt.hour < 7)].copy()
print(filter_night.shape)

# The night filter uses the condition (df['date_time'].dt.hour >= 19) | (df['date_time'].dt.hour < 7).
# The logical OR operator (|) is needed because the night period wraps around midnight:
# it includes hours greater than or equal to 19 as well as hours less than 7.
# Using OR ensures that any hour from 19:00 onward, or any hour before 07:00, counts as night.
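
The difference between the two masks is easy to check on a small example. The sketch below uses a hypothetical 24-row frame (one timestamp per hour, not the real dataset) and compares the AND mask for the day, the OR mask for the night, and what would happen if the night mask used AND by mistake.

```python
import pandas as pd

# Hypothetical sample: one timestamp per hour of a single day (not the real dataset).
sample = pd.DataFrame({
    'date_time': pd.to_datetime([f'2016-10-02 {h:02d}:00:00' for h in range(24)])
})
hour = sample['date_time'].dt.hour

# Daytime is one contiguous interval, so both bounds must hold (AND).
day_mask = (hour >= 7) & (hour < 19)

# Night wraps around midnight, so satisfying either bound is enough (OR).
night_mask = (hour >= 19) | (hour < 7)

# With AND, no hour can be both >= 19 and < 7, so nothing is selected.
wrong_night_mask = (hour >= 19) & (hour < 7)

print(day_mask.sum(), night_mask.sum(), wrong_night_mask.sum())  # 12 12 0
```

The parentheses around each comparison matter: `|` and `&` bind more tightly than `>=` and `<` in Python, so omitting them changes the meaning of the expression.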

In [None]:
plt.figure(figsize=(11,3.5))

plt.subplot(1, 2, 1)
plt.hist(filter_day['traffic_volume'])
plt.xlim(-100, 7500)
plt.ylim(0, 8000)
plt.title('Traffic Volume: Day')
plt.ylabel('Frequency')
plt.xlabel('Traffic Volume')

plt.subplot(1, 2, 2)
plt.hist(filter_night['traffic_volume'])
plt.xlim(-100, 7500)
plt.ylim(0, 8000)
plt.title('Traffic Volume: Night')
plt.ylabel('Frequency')
plt.xlabel('Traffic Volume')

plt.show()

In [None]:
filter_day['traffic_volume'].describe()

In [None]:
filter_night['traffic_volume'].describe()

In [None]:
# Note: from this cell onward, the code may misbehave in the Jupyter extension for VS Code. Running it in JupyterLab is suggested.

filter_day['month'] = filter_day['date_time'].dt.month
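# numeric_only=True skips text columns, which pandas 2.x can no longer average silently.
by_month = filter_day.groupby('month').mean(numeric_only=True)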
by_month['traffic_volume'].plot.line()
plt.show()

In [None]:
filter_day['year'] = filter_day['date_time'].dt.year
only_july = filter_day[filter_day['month'] == 7]
only_july.groupby('year').mean(numeric_only=True)['traffic_volume'].plot.line()
plt.show()

In [None]:
filter_day['dayofweek'] = filter_day['date_time'].dt.dayofweek
by_dayofweek = filter_day.groupby('dayofweek').mean(numeric_only=True)  # 0 == Monday, 6 == Sunday
by_dayofweek['traffic_volume'].plot.line()
plt.show()

In [None]:
filter_day['hour'] = filter_day['date_time'].dt.hour
business_days = filter_day[filter_day['dayofweek'] <= 4].copy()  # 4 == Friday
weekend = filter_day[filter_day['dayofweek'] >= 5].copy()  # 5 == Saturday
by_hour_business = business_days.groupby('hour').mean(numeric_only=True)
by_hour_weekend = weekend.groupby('hour').mean(numeric_only=True)


plt.figure(figsize=(11,3.5))

plt.subplot(1, 2, 1)
by_hour_business['traffic_volume'].plot.line()
plt.xlim(6,20)
plt.ylim(1500,6500)
plt.title('Traffic Volume By Hour: Monday–Friday')

plt.subplot(1, 2, 2)
by_hour_weekend['traffic_volume'].plot.line()
plt.xlim(6,20)
plt.ylim(1500,6500)
plt.title('Traffic Volume By Hour: Weekend')

plt.show()

## Weather Indicators

In [None]:
# Correlate only the numeric columns with traffic volume.
filter_day.corr(numeric_only=True)['traffic_volume']

In [None]:
filter_day.plot.scatter('traffic_volume', 'temp')
plt.ylim(230, 320)  # two erroneous 0 K temperature readings would otherwise distort the y-axis
plt.show()
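
An alternative to clipping the axis is to drop the implausible readings before analyzing or plotting. The sketch below is a minimal illustration on a hypothetical five-row frame (the values are made up, not taken from the dataset), and the 230 K cutoff is an assumption borrowed from the `plt.ylim` call above rather than a documented sensor limit.

```python
import pandas as pd

# Hypothetical miniature of the daytime data; temperatures are in kelvin.
toy = pd.DataFrame({
    'temp': [0.0, 265.3, 288.1, 0.0, 301.7],
    'traffic_volume': [4200, 5100, 4800, 3900, 5300],
})

# Keep only physically plausible readings (assumed cutoff: 230 K).
valid = toy[toy['temp'] > 230]

# Pearson correlation between temperature and traffic volume, without the bad rows.
r = valid['temp'].corr(valid['traffic_volume'])

print(len(valid))  # 3
```

Filtering the rows, rather than only narrowing the axis, also keeps the 0 K readings out of any correlation or mean computed afterwards.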

In [None]:
by_weather_main = filter_day.groupby('weather_main').mean(numeric_only=True)
by_weather_main['traffic_volume'].plot.barh()
plt.show()

In [None]:
by_weather_description = filter_day.groupby('weather_description').mean(numeric_only=True)
by_weather_description['traffic_volume'].plot.barh(figsize=(5,10))
plt.show()