<a href="https://colab.research.google.com/github/GuiBatalhoti/Dados_ANTT/blob/main/notebook.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Analysis of ANTT open data

The data used in this notebook comes from the Open Data page of ANTT (Agência Nacional de Transportes Terrestres, Brazil's National Land Transport Agency), specifically from the "Rodovias" (highways) group, available at this [link](https://dados.antt.gov.br/group/rodovias). To learn more about the ANTT Open Data Portal, see this [link](https://dados.antt.gov.br/about).

## Objective

The main objective of this notebook is to create visualizations for checking the consistency of the data that the concessionaires of each highway stretch provide to ANTT, for example by checking the highway layouts.

# Analyses and Visualizations

## Notes on the notebook

Because the notebook is being developed in Google Colab, the data is stored in Google Drive.

In [5]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
import pandas as pd
import geopandas as gpd
import numpy as np
# import matplotlib.pyplot as plt
# import seaborn as sns
# import plotly.express as px
import folium
import shapely.geometry as geom

In [None]:
import warnings
warnings.filterwarnings("ignore", category=FutureWarning)

In [None]:
layout_df = pd.read_csv('data/layout/dados_tracado.csv', sep=';', encoding='latin-1')
layout_df.head()

In [None]:
layout_df["ano_do_pnv_snv"] = layout_df["ano_do_pnv_snv"].astype(int)

# These columns are read as strings with a decimal comma; convert them to float.
decimal_cols = ["km_m_inicial", "km_m_final", "latitude_inicial", "longitude_inicial", "latitude_final", "longitude_final"]
for col in decimal_cols:
    layout_df[col] = layout_df[col].str.replace(',', '.').astype(float)
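
Since the file stores numbers with a decimal comma, the conversion above could alternatively be delegated to the parser. The following is a minimal sketch, assuming the `decimal` option of `pd.read_csv`; `sample_csv` and every value in it are made up for illustration and are not taken from the ANTT files.

```python
import io

import pandas as pd

# Hypothetical two-row sample using ';' as separator and ',' as decimal mark.
sample_csv = (
    "rodovia;km_m;latitude;longitude\n"
    "BR-000;10,5;-23,550;-46,630\n"
    "BR-000;11,0;-23,560;-46,640\n"
)

# decimal="," makes the parser read "10,5" as the float 10.5.
sample_df = pd.read_csv(io.StringIO(sample_csv), sep=";", decimal=",")
print(sample_df.dtypes)
```

With this option the numeric columns arrive as floats, so no `str.replace` step is needed. Whether it handles every column of the real files depends on how consistently they are formatted.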

In [None]:
def multipoint(row):
    # shapely expects (x, y) order, i.e. (longitude, latitude).
    return geom.MultiPoint([(row['longitude_inicial'], row['latitude_inicial']), (row['longitude_final'], row['latitude_final'])])

layout_df['geometry'] = layout_df.apply(multipoint, axis=1)
# The coordinates are plain latitude/longitude, hence EPSG:4326 (WGS 84).
layout_gdf = gpd.GeoDataFrame(layout_df, geometry='geometry', crs="EPSG:4326")
layout_gdf = layout_gdf.drop(columns=['latitude_inicial', 'longitude_inicial', 'latitude_final', 'longitude_final'])
layout_gdf.head()

In [None]:
layout_gdf.explore()

This dataset only contains the start and end points of the curves, so we need the data for the whole road.

In [None]:
road_df = pd.read_csv('data/highwayKM/dados_dos_quilometro_principal.csv', sep=';', encoding='latin-1')
road_df.head(10)

In [None]:
road_df.info()

Convert the columns to the appropriate data types.

In [None]:
road_df['km_m'] = road_df['km_m'].str.replace(',', '.').astype(float)
road_df['ano_do_pnv_snv'] = road_df['ano_do_pnv_snv'].astype(int)
road_df['latitude'] = road_df['latitude'].str.replace(',', '.').astype(float)
road_df['longitude'] = road_df['longitude'].str.replace(',', '.').astype(float)

road_df.info()  # info() prints its report itself and returns None
road_df.head(10)

Convert to a GeoDataFrame.

In [None]:
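def get_road_lat_long(df: pd.DataFrame) -> gpd.GeoDataFrame:
    # Builds one LineString per consecutive run of rows with the same 'rodovia'.
    df_return = {
        "concessionaria": [],
        "rodovia": [],
        "ano_do_pnv_snv": [],
        "km" : [],
        "sentido": [],
        "geometry": []
    }
    # Highest km marker of each road, computed once instead of on every road change.
    max_km = df.groupby('rodovia')['km_m'].max()

    def add_road(road, dealership, year, direction, line_points):
        df_return['concessionaria'].append(dealership)
        df_return['rodovia'].append(road)
        df_return['ano_do_pnv_snv'].append(year)
        df_return['km'].append(max_km[road])
        df_return['sentido'].append(direction)
        df_return['geometry'].append(geom.LineString(line_points))

    # Positional access: after sort_values, label 0 is not necessarily the first row.
    road = df['rodovia'].iloc[0]
    dealership = df['concessionaria'].iloc[0]
    year = df['ano_do_pnv_snv'].iloc[0]
    direction = df['sentido'].iloc[0]
    line_points = []
    # Iterate over the argument, not over the global road_df.
    for _, row in df.iterrows():
        if row['rodovia'] != road:
            add_road(road, dealership, year, direction, line_points)

            road = row['rodovia']
            dealership = row['concessionaria']
            year = row['ano_do_pnv_snv']
            direction = row['sentido']
            line_points = []

        line_points.append((row['longitude'], row['latitude']))

    # The loop only emits a road when the next one starts, so add the last road here.
    add_road(road, dealership, year, direction, line_points)

    return gpd.GeoDataFrame(df_return, geometry='geometry')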
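
The core idea of the function above, splitting the rows into consecutive runs that share the same `rodovia`, can also be sketched with `itertools.groupby` from the standard library. This is only an illustration: the `rows` list, the road names `BR-000` and `BR-001`, and the coordinates are all made up.

```python
from itertools import groupby

# Hypothetical rows in file order: (rodovia, longitude, latitude).
rows = [
    ("BR-000", -46.63, -23.55),
    ("BR-000", -46.64, -23.56),
    ("BR-001", -43.20, -22.90),
    ("BR-001", -43.21, -22.91),
]

# groupby yields one group per consecutive run of equal keys,
# including the final run, so no special handling is needed after the loop.
runs = [
    (road, [(lon, lat) for _, lon, lat in run])
    for road, run in groupby(rows, key=lambda r: r[0])
]

for road, points in runs:
    print(road, points)
```

Each `points` list is in (longitude, latitude) order, which is the order `shapely.geometry.LineString` expects.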

In [None]:
road_gdf = get_road_lat_long(road_df)
road_gdf.crs = "EPSG:4326"
road_gdf.head(10)

Now let's put the roads on a map.

In [None]:
road_gdf.explore()

The data is not ordered in a way that lets us plot the roads: the path along each road is not continuous. For example, a road goes from point 1 to point 2 in the "ascending" direction and from point 2 back to point 1 in the "descending" direction.

So we sort the data before converting it to a GeoDataFrame.
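
To illustrate how the sort order affects continuity, here is a small sketch of one possible alternative ordering, in which each direction of a road is kept together and ordered by kilometre. The sample frame, the road name `BR-000`, and the `sentido` values `Crescente` and `Decrescente` are assumptions made for the example; the real files may use other labels.

```python
import pandas as pd

# Hypothetical sample: one road, two directions, rows interleaved as in the raw data.
sample = pd.DataFrame({
    "rodovia": ["BR-000", "BR-000", "BR-000", "BR-000"],
    "sentido": ["Crescente", "Decrescente", "Crescente", "Decrescente"],
    "km_m": [1.0, 1.0, 0.0, 0.0],
})

# Sorting by direction before kilometre keeps each direction contiguous.
ordered = sample.sort_values(by=["rodovia", "sentido", "km_m"])

# Each (road, direction) pair can then become its own line.
for (road, direction), part in ordered.groupby(["rodovia", "sentido"]):
    print(road, direction, part["km_m"].tolist())
```

With this ordering, a line built per (road, direction) pair does not alternate between the two directions at every kilometre marker.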

In [None]:
road_df_group = road_df.sort_values(by=['rodovia', 'km_m', 'sentido'])
road_df_group.head()

In [None]:
road_gdf_group = get_road_lat_long(road_df_group)
road_gdf_group.crs = "EPSG:4326"  # same CRS as road_gdf, needed to draw over a basemap
road_gdf_group.head(10)

In [None]:
road_gdf_group.explore()
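
The visual check on the map can be complemented by a numeric one that flags suspiciously long steps between consecutive points of a road. The sketch below uses the haversine great-circle distance; the helper names `haversine_km` and `find_jumps`, the 5 km threshold, and the sample coordinates are all hypothetical choices for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def find_jumps(points, max_gap_km):
    """Return each index i where the step from points[i] to points[i + 1] exceeds max_gap_km."""
    return [
        i
        for i in range(len(points) - 1)
        if haversine_km(*points[i], *points[i + 1]) > max_gap_km
    ]


# Made-up (latitude, longitude) sequence with one large gap in the middle.
sample_points = [(-23.550, -46.630), (-23.551, -46.631), (-22.900, -43.200), (-22.901, -43.201)]
jumps = find_jumps(sample_points, max_gap_km=5.0)
print(jumps)
```

A suitable threshold depends on the spacing of the kilometre markers in the real data, so the value used here is only a placeholder.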

In [None]:
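# Named road_map to avoid shadowing the built-in map().
road_map = folium.Map(location=[-15.788497, -47.879873], zoom_start=4)

# PolyLine expects CSS colors, so the marker-only names 'lightred' and 'darkpurple' are replaced.
colors = ['blue', 'red', 'green', 'purple', 'orange', 'darkred', 'lightcoral', 'darkblue', 'darkgreen', 'cadetblue', 'indigo', 'pink', 'lightblue', 'lightgreen', 'gray', 'black']

def draw_road(road, dealership, line_points):
    # Draw one road as a polyline with a randomly chosen color.
    color = colors[np.random.randint(0, len(colors))]
    folium.PolyLine(locations=line_points, tooltip=f"Road: {road}, Concessionaire: {dealership}", color=color, weight=5).add_to(road_map)

road = road_df_group['rodovia'].iloc[0]
dealership = road_df_group['concessionaria'].iloc[0]
line_points = []
for _, row in road_df_group.iterrows():
    if row['rodovia'] != road:
        draw_road(road, dealership, line_points)
        line_points = []
        road = row['rodovia']
        dealership = row['concessionaria']
    # folium expects (latitude, longitude) order.
    line_points.append((row['latitude'], row['longitude']))

# Draw the last road, which the loop itself never emits.
draw_road(road, dealership, line_points)

road_map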