# Synthetic European road freight transport flow data based on ETISplus

## Description

This dataset describes estimated European truck traffic flows between 1,675 regions across Europe and is based on the publicly available ETISplus project from 2010 (DOI: 10.13140/RG.2.2.16768.25605). The project collected Europe-wide freight volumes and calibrated the resulting origin-destination matrices against real-world traffic flows. For the current dataset, the truck results of the ETISplus project were updated using current Eurostat data (https://ec.europa.eu/eurostat/web/transport/data/database), and a forecast for 2030 was added. Finally, the freight flows were assigned to the European highway network using Dijkstra's algorithm. The dataset therefore provides a synthetically generated truck traffic volume for each road section.
The dataset can serve as a basis for developing, planning and sizing future road infrastructure, such as charging infrastructure for electric trucks.
The dataset consists of four files:

- 01_Trucktrafficflow,
- 02_NUTS-3-Regions,
- 03_network-nodes,
- 04_network-edges.

All of them are stored as comma-separated values (CSV) files with commas as column separators and dots as decimal separators.

The main dataset 01_Trucktrafficflow describes 1,514,573 directed transport flows in fifteen columns: 

- (1) ID origin region, 
- (2) name origin region, 
- (3) ID destination region, 
- (4) name destination region, 
- (5) shortest path in the modeled E-road network, 
- (6) distance from origin region to the E-road network, 
- (7) distance within the E-road network, 
- (8) distance from the E-road network to the destination region, 
- (9) total distance, 
- (10) road freight flow in tons for 2010, 
- (11) road freight flow in tons for 2019, 
- (12) road freight flow in tons for 2030, 
- (13) truck traffic flow in number of vehicles for 2010, 
- (14) truck traffic flow in number of vehicles for 2019, 
- (15) truck traffic flow in number of vehicles for 2030. 
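
As a minimal sketch of how the main file could be read with pandas: the column names below are the ones used by the processing code later in this notebook, their order is assumed to follow the list above, and the two inline rows are made-up placeholders (including the format of the path column), not real data.

```python
import io

import pandas as pd

# Column names as used later in this notebook; the order is an assumption.
COLUMNS = [
    "ID_origin_region", "Name_origin_region",
    "ID_destination_region", "Name_destination_region",
    "Edge_path_E_road",
    "Distance_from_origin_region_to_E_road", "Distance_within_E_road",
    "Distance_from_E_road_to_destination_region", "Total_distance",
    "Traffic_flow_tons_2010", "Traffic_flow_tons_2019", "Traffic_flow_tons_2030",
    "Traffic_flow_trucks_2010", "Traffic_flow_trucks_2019", "Traffic_flow_trucks_2030",
]

# Two hypothetical rows standing in for 01_Trucktrafficflow.csv.
sample_csv = io.StringIO(
    ",".join(COLUMNS) + "\n"
    + '101010101,Region A,012345678,Region B,"[1, 2, 3]",'
    + "5.0,120.5,7.0,132.5,1360.0,1496.0,1700.0,100.0,110.0,125.0\n"
    + '012345678,Region B,101010101,Region A,"[3, 2, 1]",'
    + "7.0,120.5,5.0,132.5,680.0,748.0,850.0,50.0,55.0,62.5\n"
)

# Read region IDs as strings so that any leading zeros survive.
flows = pd.read_csv(
    sample_csv,
    sep=",",
    decimal=".",
    converters={"ID_origin_region": str, "ID_destination_region": str},
)
print(flows.shape)
```

Reading the IDs as strings mirrors the `converters` argument used in the loading function further down; whether the real IDs contain leading zeros is not stated in the description.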


02_NUTS-3-Regions contains a list with the regions under investigation. 

03_network-nodes and 04_network-edges describe the highway network.

The first contains the following information on each network node as columns: 

- (1) node ID, 
- (2) longitude of the location, 
- (3) latitude of the location, 
- (4) ID of the corresponding NUTS-3 region, 
- (5) country code. 

The second contains information on the edges: 

- (1) edge ID, 
- (2) information whether the edge is manually added or part of the original ETISplus dataset, 
- (3) length of the edge, 
- (4) ID endpoint A, 
- (5) ID endpoint B, 
- (6) number of trucks in 2019 (both directions), 
- (7) number of trucks in 2030.
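
The edge file can be turned into a graph in a few lines. The sketch below uses the column names visible in the 04_network-edges sample output at the end of this notebook and two rows copied from that sample. Treating the network as undirected and the `Distance` column as kilometres are assumptions.

```python
import networkx as nx
import pandas as pd

# Two rows copied from the 04_network-edges sample shown later in this notebook.
edges = pd.DataFrame(
    {
        "Network_Edge_ID": [2502601, 1035523],
        "Distance": [0.601, 4.918],
        "Network_Node_A_ID": [251861, 123861],
        "Network_Node_B_ID": [251862, 107164],
        "Traffic_flow_trucks_2019": [68396, 2512770],
        "Traffic_flow_trucks_2030": [110048, 3177830],
    }
)

# Undirected graph (assumption): endpoints A and B become nodes,
# the remaining columns are stored as edge attributes.
graph = nx.from_pandas_edgelist(
    edges,
    source="Network_Node_A_ID",
    target="Network_Node_B_ID",
    edge_attr=[
        "Network_Edge_ID",
        "Distance",
        "Traffic_flow_trucks_2019",
        "Traffic_flow_trucks_2030",
    ],
)
print(graph.number_of_nodes(), graph.number_of_edges())
```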

# Information

### Data Source: 
- The European Transport policy Information System (ETISplus) (DOI: 10.13140/RG.2.2.16768.25605, data: https://ftp.demis.nl/outgoing/etisplus/datadeliverables ) serves as the basis for the described dataset.
In a first step, the growth of national and international freight transport volumes between 2010 and 2019 was determined for each country from Eurostat data (https://ec.europa.eu/eurostat/web/transport/data/database).

### Data Year:

- Subsequently, the country-specific growth rates were used to scale the ETISplus data from 2010 to 2019. Since projections for 2030 vary widely, the same growth was assumed through 2030 as in the preceding years.
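
The scaling step can be illustrated with a small sketch. The growth factor of 1.2 is hypothetical, and the description does not say exactly how "the same growth" is carried forward. The code assumes a constant compound annual rate derived from the nine years 2010 to 2019 and applied for the eleven years 2019 to 2030.

```python
def scale_to_2019(flow_2010: float, growth_factor: float) -> float:
    """Scale a 2010 flow with a country-specific 2010-2019 growth factor."""
    return flow_2010 * growth_factor


def extrapolate_to_2030(flow_2010: float, flow_2019: float) -> float:
    """Continue the 2010-2019 growth to 2030 (assumption: constant annual rate)."""
    if flow_2010 == 0:
        return 0.0  # no base flow, nothing to extrapolate
    annual_factor = (flow_2019 / flow_2010) ** (1 / 9)
    return flow_2019 * annual_factor ** 11


flow_2019 = scale_to_2019(1000.0, 1.2)  # hypothetical growth factor
flow_2030 = extrapolate_to_2030(1000.0, flow_2019)
print(round(flow_2019, 1), round(flow_2030, 1))
```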

The freight volumes were then converted into vehicle trips using an average loading factor of 13.6 t and an empty trip share of 25%.
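
A minimal sketch of the conversion follows. The aggregated sample rows printed later in this notebook are consistent with trucks = tons / 13.6. How the 25% empty trip share enters the published figures is not spelled out in the description, so it is not modelled separately here.

```python
AVERAGE_LOAD_TONS = 13.6  # average loading factor from the description


def tons_to_trucks(tons: float) -> float:
    """Convert a freight volume in tons into a number of truck trips."""
    return tons / AVERAGE_LOAD_TONS


# Values taken from the first sample row shown later in this notebook:
# 3044.519751 tons (2010) correspond to about 223.861746 trucks (2010).
trucks_2010 = tons_to_trucks(3044.519751)
print(trucks_2010)
```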

The road network is also based on the ETISplus project. The original network was filtered for highways, four-lane roads and smaller roads that are part of the European road (E-road) network. To ensure that all E-roads are part of the dataset, the network was compared by hand with the current E-road network and missing edges were added manually. The network thus maps the routes relevant for long-distance traffic, where public refueling or charging infrastructure will increasingly be needed in the future.


Then, the vehicle trips were mapped to the road network. Using Dijkstra's algorithm as implemented in the Python library NetworkX, shortest paths between the origin and destination regions were determined. Regional traffic was excluded, since the regional resolution is not high enough to represent it properly: the distance within a region, which cannot be estimated cleanly, would be higher than the distance in the defined road network. Therefore, routes whose origin and destination lie in the same region were excluded. In addition, routes that do not have a network node in the origin or destination region and whose origin and destination are less than 50 km apart or directly adjacent were deleted.
Finally, the number of trucks using each individual edge in the network was calculated.
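
The routing and edge-loading steps can be sketched on a toy network. Node names, edge lengths and flows below are made up, and regions are identified with single network nodes for simplicity. The original mapping from regions to network nodes is not reproduced here.

```python
from collections import Counter

import networkx as nx

# Hypothetical toy network; lengths are in km.
G = nx.Graph()
G.add_weighted_edges_from(
    [("A", "B", 10.0), ("B", "C", 10.0), ("A", "C", 30.0), ("C", "D", 5.0)],
    weight="length",
)

# Hypothetical origin-destination flows: (origin, destination, trucks).
od_flows = [("A", "C", 100), ("A", "D", 40), ("B", "B", 999)]

edge_load = Counter()
for origin, destination, trucks in od_flows:
    if origin == destination:
        continue  # intra-regional traffic is excluded
    # Shortest path by length using Dijkstra's algorithm.
    path = nx.dijkstra_path(G, origin, destination, weight="length")
    # Add the flow to every edge along the path (direction ignored).
    for u, v in zip(path, path[1:]):
        edge_load[frozenset((u, v))] += trucks

for edge, trucks in edge_load.items():
    print(sorted(edge), trucks)
```

Keying the counter by `frozenset` sums both directions of travel on the same edge, which matches the "both directions" note on the 2019 edge counts. Whether the 2030 column is aggregated the same way is an assumption.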

# Institutions

Fraunhofer-Institut für System- und Innovationsforschung

# Categories

Transport Infrastructure, Road Transportation, Infrastructure, Road Freight, Road Network

In [1]:
import os
import sys
import numpy as np
import pandas as pd
import geopandas as gpd
from zoomin.data.constants import countries_dict
import matplotlib.pyplot as plt
from typing import Any

In [2]:
source = 'Mendeley Data'
year = 'Data scaled from years 2010-2019'

In [3]:
cwd = os.getcwd()
DATA_PATH = os.path.join(cwd, '..', '..', '..', 'data', 'input')

RAW_DATA_PATH = os.path.join(DATA_PATH, 'raw')
PROCESSED_DATA_PATH = os.path.join(DATA_PATH, 'processed') 

In [4]:
territorial_unit = input(
    'Please enter a territorial unit (LAU, NUTS3, NUTS2, NUTS1, NUTS0 or Europe): ')

# Create data directory

In [5]:
# for country_tag in countries_dict.values():
#     making_directory_path = os.path.join(PROCESSED_DATA_PATH, 'freight_traffic_flow_ETISPLUS_EUROSTAT', 'countries', f"{country_tag}")
#     os.mkdir(making_directory_path)

## Traffic Flow 

In [6]:
def set_up_traffic_flow_dataframe():
    """Prepare the traffic flow dataframe."""
    traffic_flow_PATH = os.path.join(RAW_DATA_PATH, 'freight_flow', 
                                    'Synthetic_European_road_freight_transport_flow_data_based_on_ETISplus_Mendeley Data', 
                                    "01_Trucktrafficflow.csv"
                                    )
    traffic_flow_df = pd.read_csv(traffic_flow_PATH, converters={'ID_origin_region': str, 'ID_destination_region': str})
    traffic_flow_df = traffic_flow_df.drop(
        [
        col
        for col in traffic_flow_df.columns
        if "ID_origin_region" not in col and "Name_origin_region" not in col and "ID_destination_region" not in col
        and "Name_destination_region" not in col and "Edge_path_E_road" not in col
        and "Distance_from_origin_region_to_E_road" not in col and "Distance_within_E_road" not in col
        and "Distance_from_E_road_to_destination_region" not in col and "Total_distance" not in col 
        and "Traffic_flow_trucks_2010" not in col and "Traffic_flow_trucks_2019" not in col 
        and "Traffic_flow_trucks_2030" not in col and "Traffic_flow_tons_2010" not in col 
        and "Traffic_flow_tons_2019" not in col and "Traffic_flow_tons_2030" not in col
        ],
        axis=1,
        )
    return traffic_flow_df
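
The drop-by-substring pattern above recurs throughout this notebook. As a sketch only, it could be factored into a small helper; `keep_columns_containing` is a hypothetical name and not part of the original code.

```python
import pandas as pd


def keep_columns_containing(df: pd.DataFrame, substrings: list) -> pd.DataFrame:
    """Return df restricted to columns whose name contains any given substring."""
    keep = [col for col in df.columns if any(s in col for s in substrings)]
    return df[keep]


# Tiny illustrative frame; "Unnamed: 0" mimics a stray index column from to_csv.
demo = pd.DataFrame(
    {"Unnamed: 0": [0], "Total_distance": [1.0], "Traffic_flow_tons_2010": [2.0]}
)
result = keep_columns_containing(demo, ["Total_distance", "Traffic_flow"])
print(list(result.columns))
```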

In [7]:
def set_up_traffic_flow_mean():
    """Get the mean of each numeric column over the entire dataframe."""
    traffic_flow_df = set_up_traffic_flow_dataframe()
    traffic_flow_df_mean = traffic_flow_df.drop(
        [
        col
        for col in traffic_flow_df.columns
        if "Distance_from_origin_region_to_E_road" not in col and "Distance_within_E_road" not in col
        and "Distance_from_E_road_to_destination_region" not in col and "Total_distance" not in col 
        and "Traffic_flow_trucks_2010" not in col and "Traffic_flow_trucks_2019" not in col 
        and "Traffic_flow_trucks_2030" not in col and "Traffic_flow_tons_2010" not in col 
        and "Traffic_flow_tons_2019" not in col and "Traffic_flow_tons_2030" not in col
        ],
        axis=1,
        )
    # Column-wise mean, reshaped into a one-row dataframe of integers
    traffic_flow_df_mean = traffic_flow_df_mean.mean().to_frame().T.astype(int)
    return traffic_flow_df_mean

In [8]:
# traffic_flow_df_mean = set_up_traffic_flow_mean()
# traffic_flow_df_mean.head(2)

## NUTS-3 Regions

In [9]:
def set_up_NUTS3_regions_dataframe():
    """Set up the dataframe with the NUTS3 georeference; regions are referenced by ETISPlus_Zone_ID."""
    ETISplus_zone_ID_gdf_destination = os.path.join(
        PROCESSED_DATA_PATH, 'freight_traffic_flow_ETISPLUS_EUROSTAT', 'countries',
        "ETISPLUS_Zone_ID_data.csv"  
    )
    NUTS_3_regions_PATH = os.path.join(RAW_DATA_PATH, 'freight_flow', 
                                       'Synthetic_European_road_freight_transport_flow_data_based_on_ETISplus_Mendeley Data', 
                                       "02_NUTS-3-Regions.csv"
                                       ) 
    if not os.path.exists(ETISplus_zone_ID_gdf_destination):
        NUTS3_gdf_eu27 = pd.read_csv(NUTS_3_regions_PATH, converters={'ETISPlus_Zone_ID': str})
        NUTS3_gdf_eu27 = NUTS3_gdf_eu27.drop(
            [
            col
            for col in NUTS3_gdf_eu27.columns
            if "ETISPlus_Zone_ID" not in col and "Country" not in col
            and "Geometric_center" not in col 
            ],
            axis=1,
            )
        NUTS3_gdf_eu27 = NUTS3_gdf_eu27.drop(
            ["Geometric_center_X", "Geometric_center_Y"], 
            axis=1,
            )
        NUTS3_gdf_eu27.to_csv(ETISplus_zone_ID_gdf_destination)
    else:
        NUTS3_gdf_eu27 = pd.read_csv(ETISplus_zone_ID_gdf_destination, converters={'ETISPlus_Zone_ID': str})
        NUTS3_gdf_eu27.reset_index(drop=True, inplace=True)
        NUTS3_gdf_eu27 = NUTS3_gdf_eu27.drop(
            [
            col
            for col in NUTS3_gdf_eu27.columns
            if "ETISPlus_Zone_ID" not in col and "Country" not in col
            and "Geometric_center" not in col 
            ],
            axis=1,
            )
    return NUTS3_gdf_eu27

In [10]:
def set_up_ETISPLUS_Zone_ID_regions_dataframe_per_country(country_tag):
    """Set up the NUTS3 dataframe for one particular country."""
    NUTS3_gdf_destination = os.path.join(
        PROCESSED_DATA_PATH, 'freight_traffic_flow_ETISPLUS_EUROSTAT', 'countries', f"{country_tag}", 
        f"ETISPLUS_Zone_ID_{country_tag}_data.csv"  
    )
    NUTS3_gdf_eu27 = set_up_NUTS3_regions_dataframe()
    if not os.path.exists(NUTS3_gdf_destination): 
        NUTS3_gdf = NUTS3_gdf_eu27[NUTS3_gdf_eu27["Country"].str.contains(f"{country_tag}")]
        NUTS3_gdf = NUTS3_gdf.drop(
            [
            col
            for col in NUTS3_gdf.columns
            if "ETISPlus_Zone_ID" not in col and "Country" not in col
            and "Geometric_center" not in col
            ],
            axis=1,
            )
        NUTS3_gdf.to_csv(NUTS3_gdf_destination) 
    else:
        NUTS3_gdf = pd.read_csv(NUTS3_gdf_destination, converters={'ETISPlus_Zone_ID': str})
        NUTS3_gdf.reset_index(drop=True, inplace=True)
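        # Keep only the ID, country and geometric-center columns. The result
        # must be assigned, otherwise drop() has no effect.
        NUTS3_gdf = NUTS3_gdf.drop(
            [
            col
            for col in NUTS3_gdf.columns
            if "ETISPlus_Zone_ID" not in col and "Country" not in col
            and "Geometric_center" not in col
            ],
            axis=1,
            )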
    return NUTS3_gdf

# Join the traffic flow dataset with the geometric centers of the origin regions

In [11]:
def setup_polygon_for_point(territorial_unit: Any, country_tag: str) -> gpd.GeoDataFrame:
    """Get the polygon geodataframe for a country at a given territorial unit."""
    polygon_shp_path = os.path.join(
        PROCESSED_DATA_PATH, "shapefiles", f"{territorial_unit}.shp"
    )
    # `converters` is a pandas.read_csv argument, not a documented read_file option
    polygon_gdf = gpd.read_file(polygon_shp_path)
    polygon_gdf = polygon_gdf[polygon_gdf["prnt_code"].str.contains(f"{country_tag}")]
    polygon_gdf.drop(
        [
            col
            for col in polygon_gdf.columns
            if "geometry" not in col and "code" not in col
        ],
        axis=1,
        inplace=True,
    )
    polygon_gdf.drop(
        [col for col in polygon_gdf.columns if col.startswith("prnt")],
        axis=1,
        inplace=True,
    )
    polygon_gdf.rename(columns={"code": "region_code"}, inplace=True)
    polygon_gdf.reset_index(drop=True, inplace=True)
    print(f"The number of polygon at {territorial_unit} level in {country_tag} are: ", len(polygon_gdf))
    return polygon_gdf

In [12]:
def get_point_for_freight_traffic_flow(country_tag, traffic_flow_df):
    """Join the traffic flow data set for each country with its geospatial ID"""
    NUTS3_gdf = set_up_ETISPLUS_Zone_ID_regions_dataframe_per_country(country_tag)
    join_df = NUTS3_gdf.set_index('ETISPlus_Zone_ID').join(traffic_flow_df.set_index('ID_origin_region'))
    join_origin_df = join_df.rename(columns = {'Country':'Country_origin', 'Geometric_center':'Geometric_center_origin'})
    # join_df = join_origin_df.set_index('ID_destination_region').join(NUTS3_gdf_eu27.set_index('ETISPlus_Zone_ID'))
    # join_df = join_df.rename(columns = {'Country':'Country_destination', 'Geometric_center':'Geometric_center_destination'})
    join_df = join_origin_df.drop(
        [
        col
        for col in join_origin_df.columns
        if "Country_origin" not in col and "Geometric_center_origin" not in col
        and "Distance_from_origin_region_to_E_road" not in col and "Distance_within_E_road" not in col
        and "Distance_from_E_road_to_destination_region" not in col 
        and "Total_distance" not in col and "Traffic_flow_trucks_2010" not in col
        and "Traffic_flow_trucks_2019" not in col and "Traffic_flow_trucks_2030" not in col
        and "Traffic_flow_tons_2010" not in col and "Traffic_flow_tons_2019" not in col 
        and "Traffic_flow_tons_2030" not in col 
        # and "Country_destination" not in col 
        # and "Geometric_center_destination" not in col
        ],
        axis=1,
        )
    join_df["Geometric_center_origin"] = gpd.GeoSeries.from_wkt(join_df["Geometric_center_origin"])
    point_gdf = gpd.GeoDataFrame(join_df, geometry="Geometric_center_origin")
    # point_gdf["Geometric_center_destination"] = gpd.GeoSeries.from_wkt(point_gdf["Geometric_center_destination"])
    # point_gdf = gpd.GeoDataFrame(point_gdf, geometry="Geometric_center_destination")
    return point_gdf

In [13]:
def overlap_point_and_polygon(
    territorial_unit: str, country_tag: str, traffic_flow_df
) -> gpd.GeoDataFrame:
    """Overlap Origin point over polygon data."""
    overlap_gdf_path_destination = os.path.join(
        PROCESSED_DATA_PATH,
        "freight_traffic_flow_ETISPLUS_EUROSTAT",
        "countries",
        f"{country_tag}",
        f"freight_traffic_flow_Overlap_df_{country_tag}_{territorial_unit}.csv",
    )
    if not os.path.exists(overlap_gdf_path_destination):
        point_gdf = get_point_for_freight_traffic_flow(
        country_tag,
        traffic_flow_df
        )
        point_gdf = point_gdf.set_crs(epsg=4326)
        polygon_gdf = setup_polygon_for_point(
        territorial_unit, 
        country_tag
        )
        if polygon_gdf.crs != 4326:
            polygon_gdf = polygon_gdf.to_crs(epsg=4326)
        overlap_gdf = gpd.sjoin(
            point_gdf, polygon_gdf, how="left", predicate="intersects"
        )
        print(
            f'The total number of overlaped points in "{country_tag}" are: ',
            len(overlap_gdf),
        )
        overlap_gdf = overlap_gdf.drop(
        [
        col
        for col in overlap_gdf.columns
        if "region_code" not in col 
        and "Distance_from_origin_region_to_E_road" not in col and "Distance_within_E_road" not in col
        and "Distance_from_E_road_to_destination_region" not in col 
        and "Total_distance" not in col and "Traffic_flow_trucks_2010" not in col
        and "Traffic_flow_trucks_2019" not in col and "Traffic_flow_trucks_2030" not in col
        and "Traffic_flow_tons_2010" not in col and "Traffic_flow_tons_2019" not in col 
        and "Traffic_flow_tons_2030" not in col 
        ],
        axis=1,
        )
        overlap_gdf_groupped = overlap_gdf.groupby(['region_code']).mean()
        overlap_gdf_groupped.reset_index(inplace=True)
        gdf = overlap_gdf_groupped.rename(columns = {'index':'region_code'})
        gdf.rename(columns={"region_code": f"{territorial_unit}_region_code"}, inplace=True)
        gdf.to_csv(overlap_gdf_path_destination)
    else:
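        # The column was saved as "<territorial_unit>_region_code", so convert that one
        gdf = pd.read_csv(overlap_gdf_path_destination, converters={f"{territorial_unit}_region_code": str})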
        gdf = gdf.drop(
        [
        col
        for col in gdf.columns
        if "region_code" not in col 
        and "Distance_from_origin_region_to_E_road" not in col and "Distance_within_E_road" not in col
        and "Distance_from_E_road_to_destination_region" not in col 
        and "Total_distance" not in col and "Traffic_flow_trucks_2010" not in col
        and "Traffic_flow_trucks_2019" not in col and "Traffic_flow_trucks_2030" not in col
        and "Traffic_flow_tons_2010" not in col and "Traffic_flow_tons_2019" not in col 
        and "Traffic_flow_tons_2030" not in col 
        ],
        axis=1,
        )
    print(
        f'The total number of "{territorial_unit}" with origin freight traffic flow in "{country_tag}" are: ',
        len(gdf),
    )
    return gdf

In [14]:
def get_freight_traffic_flow_eu27(
    territorial_unit: str, country_tag: str
) -> gpd.GeoDataFrame:
    """Get the freight traffic flow data for the eu countries."""
    overlap_df_path_destination = os.path.join(
        PROCESSED_DATA_PATH,
        "freight_traffic_flow_ETISPLUS_EUROSTAT",
        "countries",
        f"freight_traffic_flow_Overlap_df_{territorial_unit}.csv",
    )
    if not os.path.exists(overlap_df_path_destination):
        overlap_df_list = []
        for country_tag in countries_dict.values():
            overlap_df_path_source = os.path.join(
                PROCESSED_DATA_PATH,
                "freight_traffic_flow_ETISPLUS_EUROSTAT",
                "countries",
                f"{country_tag}",
                f"freight_traffic_flow_Overlap_df_{country_tag}_{territorial_unit}.csv",
            )
            if os.path.exists(overlap_df_path_source):
                overlap_df = pd.read_csv(overlap_df_path_source)
                overlap_df_list.append(overlap_df)
        overlap_df = pd.concat(overlap_df_list)
        print(
            f'The total number of "{territorial_unit}" regions mapped in the EU27 are: ',
            len(overlap_df),
        )
        overlap_df = overlap_df.drop(
        [
        col
        for col in overlap_df.columns
        if "region_code" not in col 
        and "Distance_from_origin_region_to_E_road" not in col and "Distance_within_E_road" not in col
        and "Distance_from_E_road_to_destination_region" not in col 
        and "Total_distance" not in col and "Traffic_flow_trucks_2010" not in col
        and "Traffic_flow_trucks_2019" not in col and "Traffic_flow_trucks_2030" not in col
        and "Traffic_flow_tons_2010" not in col and "Traffic_flow_tons_2019" not in col 
        and "Traffic_flow_tons_2030" not in col 
        ],
        axis=1,
        )
        overlap_df.to_csv(overlap_df_path_destination)
    else:
        overlap_df = pd.read_csv(overlap_df_path_destination)
        overlap_df = overlap_df.drop(
        [
        col
        for col in overlap_df.columns
        if "region_code" not in col 
        and "Distance_from_origin_region_to_E_road" not in col and "Distance_within_E_road" not in col
        and "Distance_from_E_road_to_destination_region" not in col 
        and "Total_distance" not in col and "Traffic_flow_trucks_2010" not in col
        and "Traffic_flow_trucks_2019" not in col and "Traffic_flow_trucks_2030" not in col
        and "Traffic_flow_tons_2010" not in col and "Traffic_flow_tons_2019" not in col 
        and "Traffic_flow_tons_2030" not in col 
        ],
        axis=1,
        )
        print(
            f'The total number of "{territorial_unit}" regions with origin traffic flow trajectories in the EU27 are: ',
            len(overlap_df),
        )
    return overlap_df

In [15]:
traffic_flow_df = set_up_traffic_flow_dataframe()
for country_tag in countries_dict.values():
    gdf = overlap_point_and_polygon(territorial_unit, country_tag, traffic_flow_df)

The number of polygon at LAU level in BE are:  589
The total number of overlaped points in "BE" are:  56047
The total number of "LAU" with origin freight traffic flow in "BE" are:  44
The number of polygon at LAU level in EL are:  6134
The total number of overlaped points in "EL" are:  0
The total number of "LAU" with origin freight traffic flow in "EL" are:  0
The number of polygon at LAU level in LT are:  60
The total number of overlaped points in "LT" are:  11883
The total number of "LAU" with origin freight traffic flow in "LT" are:  10
The number of polygon at LAU level in PT are:  3092
The total number of overlaped points in "PT" are:  24396
The total number of "LAU" with origin freight traffic flow in "PT" are:  30
The number of polygon at LAU level in BG are:  265
The total number of overlaped points in "BG" are:  23301
The total number of "LAU" with origin freight traffic flow in "BG" are:  28
The number of polygon at LAU level in ES are:  8131
The total number of overlaped po

In [16]:
gdf.sample(15)

Unnamed: 0,LAU_region_code,Distance_from_origin_region_to_E_road,Distance_within_E_road,Distance_from_E_road_to_destination_region,Total_distance,Traffic_flow_trucks_2010,Traffic_flow_trucks_2019,Traffic_flow_trucks_2030,Traffic_flow_tons_2010,Traffic_flow_tons_2019,Traffic_flow_tons_2030
59,1006201390103,11.0,1727.823285,124.558559,1863.381843,223.861746,304.393624,467.993763,3044.519751,4139.753292,6364.715177
25,1003021011203,19.0,1276.557613,125.527435,1421.085048,751.555213,969.889403,1381.017661,10221.150892,13190.495885,18781.840192
16,1002301581003,14.0,1370.422995,125.244145,1509.66714,662.073279,807.962207,1047.181512,9004.196593,10988.286018,14241.668559
61,1007141291701,11.0,1487.602797,126.855944,1625.458741,312.103147,398.423077,557.658217,4244.602797,5418.553846,7584.151748
36,1004221410102,35.0,1486.893437,124.711362,1646.604799,900.651905,1094.314573,1407.176253,12248.865914,14882.678193,19137.597036
52,1006061121410,8.0,1579.96813,126.689802,1714.657932,337.46813,421.009207,567.256551,4589.566572,5725.725212,7714.689093
56,1006181360804,29.0,1574.201797,125.818245,1729.020041,196.826192,272.059433,427.563062,2676.836213,3700.008293,5814.857636
12,1001241511002,2.0,1381.947777,125.695836,1509.643613,685.284051,852.776994,1143.990826,9319.863091,11597.767114,15558.275229
29,1003021056401,4.0,1286.453015,126.159187,1416.612202,983.409888,1194.957924,1536.718969,13374.374474,16251.42777,20899.37798
33,1004041070402,5.0,1477.287805,125.441115,1607.72892,842.387631,1041.531359,1381.77439,11456.471777,14164.826481,18792.131707


In [17]:
overlap_gdf = get_freight_traffic_flow_eu27(territorial_unit, country_tag)
print(overlap_gdf.sample(30))

The total number of "LAU" regions mapped in the EU27 are:  1137
    LAU_region_code  Distance_from_origin_region_to_E_road  \
5           1053095                                    5.0   
11            35005                                    3.0   
2                39                                    8.0   
18           GM0984                                    3.0   
156         7231012                                    7.0   
19           179551                                    7.0   
26           GM1721                                    4.0   
22            60350                                    7.0   
55    1006181351606                                    3.0   
41            91114                                    1.0   
22            23091                                    4.0   
261         9363000                                    2.0   
190         8128139                                    6.0   
232         9176165                                   10.0   
14    

In [18]:
len(overlap_gdf)

1137

In [19]:
# merged_gdf_origin_DE_all_dest = merged_gdf_origin_DE_all_dest.apply('mean')
# merged_gdf_origin_DE_all_dest = merged_gdf_origin_DE_all_dest.to_frame()
# merged_gdf_origin_DE_all_dest = merged_gdf_origin_DE_all_dest.T
# merged_gdf_origin_DE_all_dest = merged_gdf_origin_DE_all_dest.astype(int)
# merged_gdf_origin_DE_all_dest.head(10)

In [20]:
# merged_gdf_1.rename(columns = {'Distance_from_origin_region_to_E_road':'Distance_from_origin_region_to_E_road_DE',
#                                               'Distance_within_E_road':'Distance_within_E_road_DE',
#                                               'Distance_from_E_road_to_destination_region':'Distance_from_E_road_to_destination_region_DE',
#                                               'Total_distance':'Total_distance_DE',
#                                               'Traffic_flow_trucks_2010':'Traffic_flow_trucks_2010_DE',
#                                               'Traffic_flow_trucks_2019':'Traffic_flow_trucks_2019_DE',
#                                               'Traffic_flow_trucks_2030':'Traffic_flow_trucks_2030_DE',
#                                               'Traffic_flow_tons_2010':'Traffic_flow_tons_2010_DE',
#                                               'Traffic_flow_tons_2019':'Traffic_flow_tons_2019_DE',
#                                               'Traffic_flow_tons_2030':'Traffic_flow_tons_2030_DE'}, inplace = True)

In [21]:
# merged_gdf_1.head()

In [22]:
# traffic_flow_df_distance_DE = merged_gdf_1.drop(merged_gdf_1.columns[[4, 5, 6, 7, 8, 9]], axis=1)
# traffic_flow_df_distance_DE.head()

In [23]:
# distance_df = pd.concat([traffic_flow_df_distance, traffic_flow_df_distance_DE], axis=1)

In [24]:
# distance_df.head()

In [25]:
# plotdata = distance_df.sort_values('Total_distance')
# plotdata.plot.bar(rot=0, figsize =(8, 10))
# plt.title("Regional transport distance EU & DE")
# y = ['Distance_from_origin_region_to_E_road']
# plt.legend(bbox_to_anchor =(0.5,-0.27), loc='lower center')
# # plotdata.xaxis.set_visible(False) # same for y axis.
# # plt.legend(bbox_to_anchor =(0.65, 1.25))
# # plotdata.legend(title='Locations',title_fontsize=30,loc='center left', bbox_to_anchor=(1, 0.5))
# # plt.xlabel("Locations")
# plt.ylabel("Distance in [km]")

In [26]:
# traffic_flow_df_trucks_DE = merged_gdf_1.drop(merged_gdf_1.columns[[0, 1, 2, 3, 7, 8, 9]], axis=1)
# traffic_flow_df_trucks_DE.head()

In [27]:
# truffick_flow_trucks_total = pd.concat([traffic_flow_df_trucks, traffic_flow_df_trucks_DE], axis=1)
# truffick_flow_trucks_total.head()

In [28]:
# plt.rcParams["figure.figsize"] = [7.50, 3.50]
# plt.rcParams["figure.autolayout"] = True

In [29]:
# plotdata = truffick_flow_trucks_total.sort_values('Traffic_flow_trucks_2030')
# plotdata.plot.bar(rot=0, figsize =(7, 9), align='center', width=0.4)
# plt.title("Regional transport traffic flow EU & DE")
# # y = ['# of trucks']
# plt.xlabel("Traffic Flow EU-DE -> 2010/2019/2030")
# plt.ylabel("# of Trucks")

In [30]:
# traffic_flow_df_tones_DE = merged_gdf_1.drop(merged_gdf_1.columns[[0, 1, 2, 3, 4, 5, 6]], axis=1)
# traffic_flow_df_tones_DE.head()

In [31]:
# traffic_flow_tones = pd.concat([traffic_flow_df_tones, traffic_flow_df_tones_DE], axis=1)
# traffic_flow_tones.head()

In [32]:
# plotdata = traffic_flow_tones.sort_values('Traffic_flow_tons_2030')
# plotdata.plot.bar(rot=0, figsize =(7, 9), align='center', width=0.4)
# plt.title("Regional transport traffic flow EU & DE")
# # y = ['Tones of freight transport']
# plt.xlabel("Traffic Flow EU-DE -> 2010/2019/2030")
# plt.ylabel("Tones of freight transport")

MERGED Data Set with ORIGIN NUTS-3 regions 


In [33]:
# merged_gdf_2 = NUTS_3_regions_gdf.set_index('ETISPlus_Zone_ID').join(traffic_flow_df.set_index('ID_destination_region'))
# merged_gdf_2.head()

In [34]:
# len(merged_gdf_2)

In [35]:
# merged_gdf_2 = merged_gdf_2.drop(merged_gdf_2.columns[[0, 1, 2, 3, 4, 5, 6]], axis=1)

In [36]:
# av_column_2 = merged_gdf_2.mean(axis=0)
# print(av_column_2)

## Network Nodes

In [37]:
cwd = os.getcwd()

Networks_nodes_PATH = os.path.join(RAW_DATA_PATH, 'freight_flow', 'Synthetic_European_road_freight_transport_flow_data_based_on_ETISplus_Mendeley Data', "03_network-nodes.csv") 

Networks_nodes_df = pd.read_csv(Networks_nodes_PATH, converters={'ETISplus_Zone_ID': str})

Networks_nodes_df.sample(n=10)

Unnamed: 0.1,Unnamed: 0,Network_Node_ID,Network_Node_X,Network_Node_Y,ETISplus_Zone_ID,Country
15337,15337,198670,19.634474,43.360544,148011800,RS
406,406,102830,3.935428,43.609631,112080103,FR
2573,2573,116642,5.772,52.7113,124020300,NL
7550,7550,105165,10.474823,43.834418,118140102,IT
1127,1127,100930,-2.683552,42.860768,110020101,ES
2083,2083,101481,1.307266,41.661401,110050103,ES
9590,9590,106308,15.603674,46.398812,130000102,SI
7720,7720,125578,13.830327,45.660789,118130404,IT
15454,15454,121613,24.127921,56.937837,122000006,LV
2753,2753,101800,-1.245704,37.980449,110060200,ES


## Network Edges

In [38]:
cwd = os.getcwd()

Networks_edges_PATH = os.path.join(RAW_DATA_PATH, 'freight_flow', 'Synthetic_European_road_freight_transport_flow_data_based_on_ETISplus_Mendeley Data', "04_network-edges.csv") 

Networks_edges_df = pd.read_csv(Networks_edges_PATH)

Networks_edges_df.sample(n=10)

Unnamed: 0.1,Unnamed: 0,Network_Edge_ID,Manually_Added,Distance,Network_Node_A_ID,Network_Node_B_ID,Traffic_flow_trucks_2019,Traffic_flow_trucks_2030
1459,1459,2502601,0,0.601,251861,251862,68396,110048
8583,8583,1035523,0,4.918,123861,107164,2512770,3177830
5440,5440,1018645,0,3.82,197008,112873,2366742,3095662
5777,5777,1300376,0,1.774,196523,196526,0,0
17320,17320,1002059,0,6.045,101714,101710,1058596,1347866
17192,17192,1002313,0,6.275,101897,101898,1045619,1468620
7930,7930,1008325,0,8.473,126415,105997,330198,243580
7529,7529,1000263,0,8.119,180317,100985,1937258,2592271
15026,15026,1008189,0,4.66,106150,106149,0,0
11343,11343,1027452,0,3.685,115395,198439,0,0
