<a href="https://colab.research.google.com/github/garner1/Agile_Data_Code/blob/master/demo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Dataset description

Telia, in association with [Unacast](https://www.unacast.com/), leverages its telecom network to produce estimates of general population movement patterns ([Crowd Insights](https://iot.teliacompany.com/crowdinsights.html)).
To help stimulate innovation during the current coronavirus crisis, Telia decided to make this dataset available for this Hackathon.

## Dwells activity counts

We provide counts of how many people dwelled within each area (Administrative level 4) across the whole municipality of Stockholm, where a dwell is defined as staying in a location for more than 15 minutes. Counts are calculated for every 1-hour window between 2020-03-11 and 2020-03-25.
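
To make the dwell definition concrete, here is a minimal sketch on made-up data. It is not Telia's actual pipeline, which is not described here. The `stays` table, the `count_dwells` helper, and the reading that a stay counts in an hourly window when it overlaps that window by more than 15 minutes are all assumptions for illustration.

```python
import pandas as pd

MIN_DWELL = pd.Timedelta(minutes=15)

# Hypothetical stays: one row per anonymous visit to an area (invented values).
stays = pd.DataFrame({
    'area_code': ['A', 'A', 'A', 'B'],
    'start': pd.to_datetime(['2020-03-11 09:50', '2020-03-11 10:05',
                             '2020-03-11 10:30', '2020-03-11 10:00']),
    'end': pd.to_datetime(['2020-03-11 10:40', '2020-03-11 10:15',
                           '2020-03-11 11:10', '2020-03-11 10:45']),
})


def count_dwells(stays, window_start):
    """Count stays per area that spend more than MIN_DWELL inside a 1h window."""
    window_end = window_start + pd.Timedelta(hours=1)
    # Time each stay spends inside the window (negative when there is no overlap).
    overlap = stays['end'].clip(upper=window_end) - stays['start'].clip(lower=window_start)
    return stays[overlap > MIN_DWELL].groupby('area_code').size()


counts = count_dwells(stays, pd.Timestamp('2020-03-11 10:00'))
print(counts)
```

In this toy example the 10-minute stay in area `A` is dropped, while the stays that overlap the 10:00 window by 40, 30 and 45 minutes are counted.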

Note: All signals are anonymised and aggregated, so that no personal information can be leaked.

# How to use this dataset

In [0]:
import pandas as pd

## Download the dataset to your own Google Drive account

https://drive.google.com/open?id=18YGQB3VW8TM08D7-f3IcmpLZtAN_GHVe

In [0]:
drive_path_to_dataset = '/My Drive/activity_dwell_hourly_al4_dw15_march_stockholm/'

## Mount your Google drive

In [0]:
drive_root = '/content/drive'
from google.colab import drive
drive.mount(drive_root)

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [0]:
!ls /content/drive/My\ Drive/activity_dwell_hourly_al4_dw15_march_stockholm/

activity_dwell_hourly_al4_dw15_march_stockholm.csv
demo.ipynb
map_administrative_level_4_stockholm.csv


## Load datasets

In [0]:
df = pd.read_csv(drive_root + drive_path_to_dataset + 'activity_dwell_hourly_al4_dw15_march_stockholm.csv')
df_map = pd.read_csv(drive_root + drive_path_to_dataset + 'map_administrative_level_4_stockholm.csv')

In [0]:
df

Unnamed: 0,local_hour,area_code,people,people_arrived,people_left
0,2020-03-22 01:00:00,0180C4270,791,,
1,2020-03-22 01:00:00,0180C6220,493,,
2,2020-03-22 01:00:00,0180C2280,1666,,10.0
3,2020-03-22 01:00:00,0180C2460,3273,,18.0
4,2020-03-22 01:00:00,0180C1560,682,,
...,...,...,...,...,...
195835,2020-03-14 22:00:00,0180C4040,11207,1801.0,3037.0
195836,2020-03-14 21:00:00,0180C4040,11245,1711.0,1588.0
195837,2020-03-14 12:00:00,0180C4040,24270,10943.0,6079.0
195838,2020-03-14 13:00:00,0180C3060,4007,2030.0,1286.0
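
As a possible first step with this table, the sketch below parses `local_hour` as a timestamp and sums `people` over all areas for each hour. It runs on a handful of rows copied from the preview above, so the totals only describe that sample. It assumes the unnamed first column is a saved row index (hence `index_col=0`) and that `area_code` is best kept as a string.

```python
import io

import pandas as pd

# A few rows copied from the preview above; the real file is much larger.
sample_csv = io.StringIO("""\
,local_hour,area_code,people,people_arrived,people_left
0,2020-03-22 01:00:00,0180C4270,791,,
1,2020-03-22 01:00:00,0180C6220,493,,
2,2020-03-22 01:00:00,0180C2280,1666,,10.0
3,2020-03-22 01:00:00,0180C2460,3273,,18.0
4,2020-03-22 01:00:00,0180C1560,682,,
195835,2020-03-14 22:00:00,0180C4040,11207,1801.0,3037.0
195836,2020-03-14 21:00:00,0180C4040,11245,1711.0,1588.0
195837,2020-03-14 12:00:00,0180C4040,24270,10943.0,6079.0
195838,2020-03-14 13:00:00,0180C3060,4007,2030.0,1286.0
""")

sample = pd.read_csv(
    sample_csv,
    index_col=0,                 # assumption: first column is a saved index
    parse_dates=['local_hour'],  # timestamps instead of plain strings
    dtype={'area_code': str},    # keep the leading zero in the codes
)

# Total number of dwelling people per hourly window, in chronological order.
hourly_total = sample.groupby('local_hour')['people'].sum().sort_index()
print(hourly_total)
```

With `local_hour` parsed as a timestamp, the same pattern extends to resampling or to comparing days of the week on the full dataset.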


In [0]:
df_map

Unnamed: 0,geo,admin_level_4,admin_level_4_code,admin_level_3,admin_level_3_code
0,"POLYGON((17.9252927025474 59.2937139555395, 17...",0180C2220,0180C2220,Skärholmen,212092
1,"POLYGON((17.9128405786466 59.2751212092832, 17...",0180C1490,0180C1490,Skärholmen,212092
2,"POLYGON((17.9224343446569 59.2846953318061, 17...",0180C2010,0180C2010,Skärholmen,212092
3,"POLYGON((17.9423774037654 59.2979374163369, 17...",0180C2550,0180C2550,Skärholmen,212092
4,"POLYGON((17.921924366685 59.2828891387208, 17....",0180C1910,0180C1910,Skärholmen,212092
...,...,...,...,...,...
539,"POLYGON((18.0768061498053 59.3441104389591, 18...",0180C5010,0180C5010,Stockholms Engelbrekt,215028
540,"POLYGON((18.1054211724672 59.3636189189145, 18...",0180C5770,0180C5770,Stockholms Engelbrekt,215028
541,"POLYGON((18.0668648089403 59.3468807376705, 18...",0180C5120,0180C5120,Stockholms Engelbrekt,215028
542,"POLYGON((18.065738205921 59.3707328893546, 18....",0180C5900,0180C5900,Stockholms Engelbrekt,215028
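
The map table also links each level-4 area to its level-3 district, so counts can be rolled up to district level. The sketch below shows one way to do that join. The codes and district names are copied from the preview above, but the `people` values are invented placeholders, and the names `toy_map` and `toy_counts` are hypothetical.

```python
import pandas as pd

# Lookup table: codes and names copied from the df_map preview (geometry omitted).
toy_map = pd.DataFrame({
    'admin_level_4_code': ['0180C2220', '0180C1490', '0180C5010', '0180C5770'],
    'admin_level_3': ['Skärholmen', 'Skärholmen',
                      'Stockholms Engelbrekt', 'Stockholms Engelbrekt'],
})

# Placeholder counts for a single hour: these numbers are invented, not real data.
toy_counts = pd.DataFrame({
    'area_code': ['0180C2220', '0180C1490', '0180C5010', '0180C5770'],
    'people': [100, 200, 300, 400],
})

# Attach the district to every count row; validate= raises if a code
# appears more than once in the lookup table.
merged = toy_counts.merge(
    toy_map,
    left_on='area_code',
    right_on='admin_level_4_code',
    how='left',
    validate='many_to_one',
)

people_by_district = merged.groupby('admin_level_3')['people'].sum()
print(people_by_district)
```

A left join keeps every count row even if its area code is missing from the map, which makes unmatched codes easy to spot as missing values in `admin_level_3`.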


In [0]:
# .copy() gives an independent DataFrame, which avoids pandas' SettingWithCopyWarning
df_ = df[df.local_hour == '2020-03-22 06:00:00'].copy()
# Rename the key column to match df_map (the new column is appended at the end)
df_['admin_level_4_code'] = df_['area_code']
del df_['area_code']


In [0]:
df_

Unnamed: 0,local_hour,people,people_arrived,people_left,admin_level_4_code
79497,2020-03-12 13:00:00,330,19.0,60.0,0180C6160
80019,2020-03-12 13:00:00,515,32.0,77.0,0180C6250
80020,2020-03-12 13:00:00,523,32.0,77.0,0180C6220
80624,2020-03-12 13:00:00,749,31.0,74.0,0180C6150
80625,2020-03-12 13:00:00,256,32.0,56.0,0180C6370
...,...,...,...,...,...
91327,2020-03-12 13:00:00,10374,1418.0,1159.0,0180C5260
91344,2020-03-12 13:00:00,9624,1549.0,1099.0,0180C4620
91348,2020-03-12 13:00:00,6991,1425.0,941.0,0180C3880
91389,2020-03-12 13:00:00,11894,2006.0,1363.0,0180C5430


# Plot the results

In [0]:
!pip install geopandas
import geopandas as gpd
from shapely import wkt


In [0]:
# Parse the WKT strings in the 'geo' column into shapely geometries
df_map['geometry'] = df_map['geo'].apply(wkt.loads)
# Pass the CRS as an authority string; the {'init': ...} dict form is deprecated and emits a warning
map_gpd = gpd.GeoDataFrame(df_map, geometry='geometry', crs='EPSG:4326')  # aka. WGS 84


In [0]:
import folium

m = folium.Map()
# Shade each area by its 'people' count, joining df_ to the polygons on the area code
folium.Choropleth(
    geo_data=map_gpd,
    name='choropleth',
    data=df_,
    columns=['admin_level_4_code', 'people'],
    key_on='feature.properties.admin_level_4_code',
    fill_color='YlGn',
    fill_opacity=0.7,
).add_to(m)
# Zoom to the data: total_bounds is (minx, miny, maxx, maxy); folium expects (lat, lon) pairs
minx, miny, maxx, maxy = map_gpd.total_bounds.tolist()
m.fit_bounds([(miny, minx), (maxy, maxx)])
m