# Istanbul Traffic Density Map

This project uses the Istanbul Municipality Data Service, specifically the 'Hourly Traffic Density Data Set'.

My aims in this project:

- Calculate the average speed and the average number of vehicles for each location.
- Find the locations with the highest and lowest average speeds and the locations with the highest average number of vehicles.
- Create a heat map of traffic density in Istanbul from the Istanbul Municipality's Traffic Data Set.

In [1]:
import pandas as pd
import folium
from folium.plugins import HeatMap
from IPython.display import IFrame

In [2]:
api_url = "https://data.ibb.gov.tr/dataset/3ee6d744-5da2-40c8-9cd6-0e3e41f1928f/resource/7b9a35a7-dc9c-4044-b117-1c0003104630/download/traffic_density_202309.csv"
# pd.read_csv already returns a DataFrame, so no extra conversion is needed.
traffic_data = pd.read_csv(api_url)

traffic_data

Unnamed: 0,DATE_TIME,LONGITUDE,LATITUDE,GEOHASH,MINIMUM_SPEED,MAXIMUM_SPEED,AVERAGE_SPEED,NUMBER_OF_VEHICLES
0,2023-09-01 00:00:00,29.317017,40.921326,sxkbg1,8,121,65,95
1,2023-09-01 00:00:00,29.163208,40.915833,sxk8z8,4,84,29,18
2,2023-09-01 00:00:00,29.130249,41.130066,sxk9yz,48,118,70,15
3,2023-09-01 00:00:00,29.086304,41.009216,sxk9mc,4,157,56,158
4,2023-09-01 00:00:00,28.811646,40.992737,sxk3pw,13,111,67,110
...,...,...,...,...,...,...,...,...
1640764,2023-09-30 23:00:00,29.020386,41.206970,sxkdkm,32,50,40,5
1640765,2023-09-30 23:00:00,28.855591,41.042175,sxk92x,2,39,18,44
1640766,2023-09-30 23:00:00,29.317017,40.992737,sxkc5n,61,153,92,28
1640767,2023-09-30 23:00:00,29.020386,40.998230,sxk9hr,7,50,22,14


In the code below, I grouped the rows by the "GEOHASH" column. For each group, the first aggregation keeps the first "LONGITUDE" and "LATITUDE" values, and the mean aggregation averages the "AVERAGE_SPEED" and "NUMBER_OF_VEHICLES" columns. This reduced the data to one row per geohash.

I also called .reset_index() so that "GEOHASH" becomes a regular column again instead of the index.

I stored the result in "aggregated_traffic_data".

In [3]:
aggregated_traffic_data = traffic_data.groupby('GEOHASH').agg({
    'LONGITUDE': 'first',
    'LATITUDE': 'first',
    'AVERAGE_SPEED': 'mean',
    'NUMBER_OF_VEHICLES': 'mean'
}).reset_index()

aggregated_traffic_data

Unnamed: 0,GEOHASH,LONGITUDE,LATITUDE,AVERAGE_SPEED,NUMBER_OF_VEHICLES
0,sx7chk,27.965698,40.981750,84.105187,12.887608
1,sx7chm,27.965698,40.987244,70.066092,24.155172
2,sx7cht,27.976685,40.987244,36.829023,21.455460
3,sx7chw,27.976685,40.992737,67.688218,27.090517
4,sx7chx,27.976685,40.998230,70.951149,27.387931
...,...,...,...,...,...
2451,sxm41s,29.602661,41.157532,71.744409,6.143770
2452,sxm41u,29.613647,41.157532,58.456486,5.213465
2453,sxm445,29.624634,41.152039,73.248214,4.132143
2454,sxm44h,29.624634,41.157532,69.869410,4.334526
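
Taking the first LONGITUDE and LATITUDE of each group is only safe if a geohash always maps to the same coordinate pair. The sketch below is a quick check of that idea, assuming the GEOHASH column uses the standard base-32 geohash encoding. `geohash_center` is a hypothetical helper, not part of the notebook. It decodes a geohash to the centre of its cell, which for the first row of the raw table reproduces the listed coordinates.

```python
# Standard geohash base-32 alphabet (the letters a, i, l and o are not used).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_center(geohash):
    """Return the (latitude, longitude) centre of a geohash cell."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    is_lon_bit = True  # bits alternate, starting with longitude
    for char in geohash:
        value = BASE32.index(char)
        for shift in range(4, -1, -1):  # 5 bits per character, high bit first
            bit = (value >> shift) & 1
            rng = lon_range if is_lon_bit else lat_range
            mid = (rng[0] + rng[1]) / 2
            if bit:
                rng[0] = mid  # keep the upper half of the interval
            else:
                rng[1] = mid  # keep the lower half of the interval
            is_lon_bit = not is_lon_bit
    lat = (lat_range[0] + lat_range[1]) / 2
    lon = (lon_range[0] + lon_range[1]) / 2
    return lat, lon


# First raw row: GEOHASH sxkbg1, LATITUDE 40.921326, LONGITUDE 29.317017
lat, lon = geohash_center("sxkbg1")
print(round(lat, 6), round(lon, 6))  # 40.921326 29.317017
```

If the same holds for the other rows, the coordinates are a function of the geohash alone, so first, last, or mean would all give the same LONGITUDE and LATITUDE per group.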


This function returns the 25 rows with the smallest or largest values in a given column, renumbered from 1 to 25.

In [4]:
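def extremes_finder(data_frame, column_name, is_smallest=True):
    # Pick the 25 rows with the smallest or largest values in column_name.
    if is_smallest:
        result = data_frame.nsmallest(25, column_name)
    else:
        result = data_frame.nlargest(25, column_name)

    # Keep the original row labels in an "index" column and rank from 1.
    # Using len(result) avoids a length mismatch when fewer than 25 rows exist.
    result = result.reset_index()
    result.index = range(1, len(result) + 1)
    return result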

In the next three code blocks, I called the function to find the highest average numbers of vehicles, the highest average speeds, and the lowest average speeds.

In [5]:
highest_avg_vehicles = extremes_finder(aggregated_traffic_data, 'NUMBER_OF_VEHICLES', is_smallest=False)

highest_avg_vehicles

Unnamed: 0,index,GEOHASH,LONGITUDE,LATITUDE,AVERAGE_SPEED,NUMBER_OF_VEHICLES
1,591,sxk3xe,28.811646,41.064148,49.714491,432.842181
2,1521,sxk9pq,29.152222,40.992737,40.444763,423.500717
3,1902,sxkbgk,29.328003,40.937805,63.734577,411.017217
4,590,sxk3xd,28.811646,41.058655,49.022956,406.098996
5,1755,sxkb6p,29.273071,40.866394,45.856528,396.345768
6,1897,sxkbge,29.338989,40.932312,67.466284,392.691535
7,1171,sxk985,28.833618,41.064148,53.645624,392.296987
8,1500,sxk9nx,29.119263,40.99823,58.395983,384.121951
9,410,sxk3k8,28.67981,41.003723,48.308465,377.347202
10,463,sxk3py,28.822632,40.992737,53.578192,374.870875


In [6]:
highest_average_speed = extremes_finder(aggregated_traffic_data, 'AVERAGE_SPEED', is_smallest=False)

highest_average_speed

Unnamed: 0,index,GEOHASH,LONGITUDE,LATITUDE,AVERAGE_SPEED,NUMBER_OF_VEHICLES
1,2117,sxkc84,29.185181,41.058655,115.186275,3.862745
2,2114,sxkc81,29.185181,41.053162,111.0767,17.068017
3,1723,sxk9x3,29.152222,41.053162,109.841642,7.304985
4,2119,sxkc90,29.229126,41.047668,109.438218,49.344828
5,2120,sxkc92,29.240112,41.047668,108.837644,59.109195
6,2118,sxkc88,29.207153,41.047668,107.906609,14.170977
7,2068,sxkc3p,29.229126,41.042175,107.781609,18.507184
8,2058,sxkc2x,29.207153,41.042175,107.53592,42.704023
9,2116,sxkc83,29.196167,41.053162,106.664275,32.032999
10,2059,sxkc2z,29.21814,41.042175,106.054598,52.251437


In [7]:
lowest_average_speed = extremes_finder(aggregated_traffic_data, 'AVERAGE_SPEED', is_smallest=True)

lowest_average_speed

Unnamed: 0,index,GEOHASH,LONGITUDE,LATITUDE,AVERAGE_SPEED,NUMBER_OF_VEHICLES
1,1150,sxk97b,28.998413,41.003723,12.0,1.0
2,1143,sxk973,28.97644,41.009216,12.280172,13.45977
3,1144,sxk974,28.965454,41.014709,13.932253,8.063328
4,2301,sxkdht,29.031372,41.163025,14.748634,1.180328
5,2248,sxkd79,28.987427,41.184998,15.358974,1.866667
6,1091,sxk93p,28.877563,41.042175,15.446839,34.725575
7,1367,sxk9hk,29.020386,40.98175,15.705357,5.471726
8,1009,sxk90p,28.833618,40.99823,16.024425,47.883621
9,1351,sxk9gb,28.998413,41.091614,16.040172,34.404591
10,134,sx7ghs,27.976685,41.333313,16.074803,1.374016
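
The first row of the table above has an average speed of exactly 12.0 and an average of 1.0 vehicles, which may mean that the cell was observed only a few times, although the output alone does not confirm this. The sketch below shows one way to guard against such cases: count the observations per geohash and filter on that count before ranking. The toy values, the OBSERVATIONS column, and the MIN_OBSERVATIONS threshold are assumptions for illustration, not part of the notebook.

```python
import pandas as pd

# Toy stand-in for traffic_data; the values are made up for illustration.
toy = pd.DataFrame({
    "GEOHASH": ["aaa", "aaa", "aaa", "bbb", "ccc", "ccc"],
    "AVERAGE_SPEED": [30, 40, 50, 12, 20, 22],
    "NUMBER_OF_VEHICLES": [100, 120, 80, 1, 15, 25],
})

# Named aggregation: same means as before, plus a row count per geohash.
summary = toy.groupby("GEOHASH").agg(
    AVERAGE_SPEED=("AVERAGE_SPEED", "mean"),
    NUMBER_OF_VEHICLES=("NUMBER_OF_VEHICLES", "mean"),
    OBSERVATIONS=("AVERAGE_SPEED", "count"),
).reset_index()

MIN_OBSERVATIONS = 2  # hypothetical threshold, to be tuned for the real data
reliable = summary[summary["OBSERVATIONS"] >= MIN_OBSERVATIONS]

# Without the filter "bbb" (one observation, speed 12) would rank first.
slowest = reliable.nsmallest(1, "AVERAGE_SPEED")
print(slowest)
```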


In the code block below, I selected the columns needed for the heat map from the traffic_data DataFrame.

In [8]:
heat_map_data = traffic_data[['LONGITUDE','LATITUDE','NUMBER_OF_VEHICLES']]

To decide where the map is centred, I averaged the LATITUDE and LONGITUDE columns and stored the result in map_center as [latitude, longitude].

In [9]:
map_center = [heat_map_data['LATITUDE'].mean(), heat_map_data['LONGITUDE'].mean()]

Then I created the map with the folium library. I passed the map_center variable defined above as the location argument. The zoom_start argument sets the zoom level when the map is first opened. I tried the values 6, 8, 9, 10, and 12 one by one, and 10 was the value that best framed the whole of Istanbul.

In [10]:
heat_map = folium.Map(location = map_center, zoom_start = 10)
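
As an alternative to trying zoom levels by hand, the bounding box of the data can be computed and handed to the map, which then picks a zoom that fits it. The sketch below computes the box on a toy frame with made-up coordinates. It assumes the same LATITUDE and LONGITUDE column names as the notebook, and the folium call is left as a comment so the block runs with pandas alone.

```python
import pandas as pd

# Toy coordinates for illustration; the real input would be heat_map_data.
toy = pd.DataFrame({
    "LATITUDE": [40.92, 41.13, 41.01],
    "LONGITUDE": [29.32, 29.13, 28.81],
})

south, north = float(toy["LATITUDE"].min()), float(toy["LATITUDE"].max())
west, east = float(toy["LONGITUDE"].min()), float(toy["LONGITUDE"].max())

# folium expects [[south, west], [north, east]].
bounds = [[south, west], [north, east]]
print(bounds)  # [[40.92, 28.81], [41.13, 29.32]]

# With the notebook's map object this could be applied as:
# heat_map.fit_bounds(bounds)
```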

The following code traverses the heat_map_data DataFrame row by row and creates a sublist with the LATITUDE, LONGITUDE, and NUMBER_OF_VEHICLES values of each row. Each sublist represents one data point, with NUMBER_OF_VEHICLES acting as its weight. The result is the list of data points that the heat map is built from.

In [11]:
heat_data = [[row['LATITUDE'], row['LONGITUDE'], row['NUMBER_OF_VEHICLES']] for index, row in heat_map_data.iterrows()]
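
Iterating with iterrows() over roughly 1.6 million rows can be slow. A minimal sketch of a vectorised alternative, shown on a toy frame with made-up values, selects the columns in the required order and converts them in one step. It should produce the same nested list as the loop.

```python
import pandas as pd

# Toy stand-in for heat_map_data; the values are made up for illustration.
toy = pd.DataFrame({
    "LONGITUDE": [29.32, 29.16],
    "LATITUDE": [40.92, 40.91],
    "NUMBER_OF_VEHICLES": [95, 18],
})

# HeatMap expects [latitude, longitude, weight], so reorder the columns first.
columns = ["LATITUDE", "LONGITUDE", "NUMBER_OF_VEHICLES"]
heat_data_fast = toy[columns].to_numpy().tolist()

# Same result as the row-by-row version used in the notebook.
heat_data_loop = [
    [row["LATITUDE"], row["LONGITUDE"], row["NUMBER_OF_VEHICLES"]]
    for _, row in toy.iterrows()
]

print(heat_data_fast)  # [[40.92, 29.32, 95.0], [40.91, 29.16, 18.0]]
```

The same conversion works on aggregated_traffic_data, which has about 2,455 rows instead of roughly 1.6 million and would likely give a much smaller HTML file, at the cost of showing averages rather than every hourly record.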

This code creates a heat map layer from the list created above. The .add_to(heat_map) call adds the layer to the map created earlier.

In [12]:
HeatMap(heat_data).add_to(heat_map)

<folium.plugins.heat_map.HeatMap at 0x22f3e7ae750>

Then I saved the map as an HTML file.

In [13]:
heat_map.save("heatmap.html")

Since I had problems displaying the map on GitHub, the result of this notebook can be viewed at the following link. Istanbul Traffic Density Map: https://web.itu.edu.tr/citak20/heatmap/