## Loading land use attributes from external sources

Provide the path to the building layer of the city or area of interest. This layer should cover a larger area than the case-study area itself (e.g. the entire city of Boston rather than only Boston's city center, when the latter is the case-study area).
This layer is the "base" for extracting landmarks in the *02-Landmarks_Local_Files* notebook. Combined with a set of other files, it yields a raw land-use categorisation, which is later recategorised, at a higher granularity, in the *02-Landmarks_Local_Files* notebook.

In [1]:
import pandas as pd, numpy as np, geopandas as gpd, osmnx as ox
from shapely.geometry import LineString
%matplotlib inline

import warnings
warnings.simplefilter(action="ignore")

pd.set_option('display.precision', 5)
# display floats with three decimals
pd.set_option('display.float_format', lambda x: '%.3f' % x)
pd.options.mode.chained_assignment = None

import cityImage as ci

In [2]:
city_name = 'Boston'
epsg = 26986 
crs = {'init': 'epsg:26986', 'no_defs': True}
input_path = 'Input/'+city_name+'/'
option = 1

In [13]:
# from local
buildings = gpd.read_file(input_path+city_name+'_buildings.shp').to_crs(epsg=epsg)
convex_hull_wgs = ci.convex_hull_wgs(buildings)
osm_buildings = ci.get_buildings_fromOSM(convex_hull_wgs, 'OSMpolygon', epsg = epsg, distance = None)

Here, different land-use datasets are loaded to assign a land-use category to each building in the *buildings* GDF. At the moment this part is city-dependent. From polygon GDFs: the land use of building *x* in the external GDF is assigned to building *y* in the *buildings* GDF only when their intersection covers at least 60% of the area of *y*. From point GDFs: a simple intersection is used.
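The 60% coverage rule for polygon sources can be illustrated with plain shapely geometries. This is only a minimal sketch, not the implementation of `ci.land_use_from_polygons`: the helpers `covered_share` and `assign_land_use` are hypothetical, and choosing the candidate with the largest share when several reach the threshold is an assumption.

```python
from shapely.geometry import box

def covered_share(target, source):
    """Share of `target`'s area that lies inside `source` (0.0 to 1.0)."""
    if target.area == 0:
        return 0.0
    return target.intersection(source).area / target.area

def assign_land_use(target, candidates, threshold=0.6):
    """Return the label of the candidate covering the largest share of `target`,
    or None when no candidate reaches `threshold` (assumed tie-breaking rule)."""
    best_label, best_share = None, 0.0
    for geometry, label in candidates:
        share = covered_share(target, geometry)
        if share >= threshold and share > best_share:
            best_label, best_share = label, share
    return best_label

# toy data: a square building split across two parcels
building = box(0, 0, 10, 10)
parcels = [
    (box(0, 0, 7, 10), "commercial"),    # covers 70% of the building
    (box(7, 0, 20, 10), "residential"),  # covers 30% of the building
]
label = assign_land_use(building, parcels)
```

Only the first parcel reaches the threshold, so the building takes its land use. A parcel covering half of the building would leave the value empty.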


*For Boston, a slightly different approach is used.*

## Loading other sources - Boston

In [None]:
# Loading polygon data: parcels from the Boston Open Data Portal, buildings from OpenStreetMap
parcels = gpd.read_file(input_path+'otherSources/'+city_name+'_parcels.shp').to_crs(epsg=epsg)

# provide 3 lists: the GDFs loaded; the fields where land-use information is contained; the names of the new columns
gdfs = [parcels, osm_buildings]
columns_lu = ['LU', 'land_use_raw']
new_columns = ['land_use_1', 'land_use_2']
for n, gdf in enumerate(gdfs): 
    buildings = ci.land_use_from_polygons(buildings, gdf, new_columns[n], columns_lu[n])

# where the parcel-based value is missing, fall back to the OpenStreetMap value
buildings['land_use_1'] = buildings['land_use_1'].fillna(buildings['land_use_2'])

In [19]:
# libraries, universities, schools and police stations

schools = gpd.read_file(input_path+'otherSources/'+city_name+'_schools.shp').to_crs(epsg=epsg)
universities = gpd.read_file(input_path+'otherSources/'+city_name+'_universities.shp').to_crs(epsg=epsg)
primary_schools = gpd.read_file(input_path+'otherSources/'+city_name+'_primary_schools.shp').to_crs(epsg=epsg)
libraries = gpd.read_file(input_path+'otherSources/'+city_name+'_libraries.shp').to_crs(epsg=epsg)
pools = gpd.read_file(input_path+'otherSources/'+city_name+'_pools.shp').to_crs(epsg=epsg)
police =  gpd.read_file(input_path+'otherSources/'+city_name+'_police.shp').to_crs(epsg=epsg)

Land-use-specific datasets are also used.
When the *land_use_1* field in the *buildings* GDF is still empty or only filled with *residential* or *commercial* values,
the land-use-specific GDFs are used to fill in the field (when geometries intersect).
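The overwrite condition described above can be written as a small predicate. The sketch below is illustrative only: `needs_fill` and `GENERIC_VALUES` are hypothetical names, and treating `NaN` like `None` is an assumption about how missing values appear in the *land_use_1* field.

```python
import math

# categories assumed to be too generic to keep when a specific dataset matches
GENERIC_VALUES = ("residential", "commercial")

def needs_fill(value, generic=GENERIC_VALUES):
    """Return True when `value` is missing (None or NaN) or only a generic category."""
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    return value in generic

current_values = ["residential", None, float("nan"), "hospital"]
flags = [needs_fill(v) for v in current_values]
```

Here only the last value, `"hospital"`, would be preserved; the other three would be replaced when a land-use-specific geometry intersects the building.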

In [24]:
gdfs = [schools, primary_schools, libraries, universities, pools, police]
classification = ['education', 'education', 'library', 'university', 'sport', 'emergency_service']
list_ignore = ['residential', 'commercial', None]

index_geometry = buildings.columns.get_loc("geometry")+1 
index_land_use = buildings.columns.get_loc("land_use_1")+1

# iterate through the specific GDFs and replace land-use information
for n, gdf in enumerate(gdfs):
    sindex = gdf.sindex # spatial index

    for row in buildings.itertuples():
        g = row[index_geometry] # geometry
        possible_matches_index = list(sindex.intersection(g.bounds))
        possible_matches = gdf.iloc[possible_matches_index]
        precise_matches = possible_matches[possible_matches.intersects(g)]
                
        if len(precise_matches) == 0:
            continue  # no feature of this dataset intersects the building
        current = row[index_land_use]
        # keep an existing, more specific land-use value (NaN counts as empty)
        if not pd.isna(current) and current not in list_ignore:
            continue
        buildings.at[row.Index, 'land_use_1'] = classification[n]

In [30]:
buildings

Unnamed: 0,height,base,area,geometry,land_use_1,land_use_2
0,8.280,0.700,1926.803,"POLYGON ((237513.461 904788.149, 237504.225 90...",residential,residential
1,8.900,0.820,1126.634,"POLYGON ((237735.701 904786.375, 237720.063 90...",residential,residential
2,15.370,4.710,4202.795,"POLYGON ((238074.296 904592.566, 238050.666 90...",,
3,11.700,1.440,4287.751,"POLYGON ((237509.030 904583.715, 237496.549 90...",residential,residential
4,26.640,9.310,1368.853,"POLYGON ((237360.223 904640.894, 237352.406 90...",residential,residential
...,...,...,...,...,...,...
6595,11.130,3.070,474.735,"POLYGON ((232562.762 901634.388, 232586.835 90...",residential,residential
6596,58.620,2.930,868.970,"POLYGON ((232579.801 901593.751, 232586.716 90...",apartments,apartments
6597,16.930,1.410,2392.467,"POLYGON ((232887.683 899608.367, 232906.511 89...",Commercial Land,residential
6598,27.700,1.170,3773.174,"POLYGON ((232814.013 899628.112, 232865.439 89...",Commercial,residential


In [31]:
buildings['landUse'] = buildings['land_use_1']
buildings.drop(['land_use_1', 'land_use_2'], inplace = True, axis = 1)

## Loading other sources - London

Shapefiles of important buildings, functional sites and public transport stations from Ordnance Survey are loaded.
An OpenStreetMap building layer is used too, followed by a point file with Points of Interest in London (Ordnance Survey).

In [11]:
# polygons
imp = gpd.read_file(input_path+'otherSources/'+city_name+'_important_buildings.shp').to_crs(epsg = epsg)
fs = gpd.read_file(input_path+'otherSources/'+city_name+'_functional_sites.shp').to_crs(epsg = epsg)

# points
stations = gpd.read_file(input_path+'otherSources/'+city_name+'_railway_stations.shp').to_crs(epsg = epsg)
POI = gpd.read_file(input_path+'otherSources/'+city_name+'_POI.shp').to_crs(epsg = epsg)

In [None]:
# provide 3 lists:  names of the GDFs loaded; the field where land-use information is contained; name of the new_columns

gdfs = [imp, fs, osm_buildings] 
columns_lu = ['BUILDGTHEM', 'SITETHEME', 'type']
new_columns = ['land_use_1', 'land_use_2', 'land_use_3']

# extracting land-use information from all the GDFs
for n, gdf in enumerate(gdfs): 
    buildings = ci.land_use_from_polygons(buildings, gdf, new_columns[n], columns_lu[n])

In [None]:
# same procedure for all the point files loaded
gdfs = [stations, POI]
columns_lu = ['CLASSIFICA','main']
new_columns = ['land_use_4','land_use_5']

for n, gdf in enumerate(gdfs): 
    buildings = ci.land_use_from_points(buildings, gdf, new_columns[n], columns_lu[n])

Establish a hierarchy among the land-use columns. In this case, for example, *land_use_3* is used only when all the other land-use columns
are empty, while *land_use_1*, when filled, has priority over the others.
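A hierarchy of this kind amounts to taking, for each building, the first non-empty value in an ordered list of columns. The sketch below shows this on toy data; `coalesce_columns` is a hypothetical helper, not part of cityImage, and the values are invented for illustration.

```python
import pandas as pd

def coalesce_columns(df, ordered_columns):
    """Return, per row, the first non-null value found in `ordered_columns`."""
    result = df[ordered_columns[0]].copy()
    for column in ordered_columns[1:]:
        result = result.fillna(df[column])
    return result

# toy table: three sources listed from highest to lowest priority
toy = pd.DataFrame({
    "land_use_1": [None, "Education", None],
    "land_use_2": ["Transport", "Health", None],
    "land_use_3": ["retail", "school", "yes"],
})
toy["landUse"] = coalesce_columns(toy, ["land_use_1", "land_use_2", "land_use_3"])
```

The lowest-priority column contributes only to the last row, where both other columns are empty. Changing the order of the list is enough to change the hierarchy.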

In [14]:
# fill gaps in land_use_1 following the hierarchy; land_use_3 is used last
for column in ['land_use_2', 'land_use_4', 'land_use_5', 'land_use_3']:
    buildings['land_use_1'] = buildings['land_use_1'].fillna(buildings[column])
buildings.head()

Unnamed: 0,area,base,buildingID,height,geometry,land_use_1,land_use_2,land_use_3,land_use_4,land_use_5
0,208.221,0,30162,3.18,"POLYGON ((529891.79 183628.2200000007, 529898....",Transport,,,,Transport
1,942.013,0,30163,12.21,"POLYGON ((529908.4400000004 183173.1600000001,...",,,,,
2,331.74,0,30164,12.27,"POLYGON ((529866.5 183238.3499999996, 529847.9...",Commercial services,,,,Commercial services
3,1637.525,0,30165,27.82,"POLYGON ((527543.29 182478.7400000002, 527460....",Attractions,,,,Attractions
4,1441.718,0,30166,11.59,"POLYGON ((527005.8700000001 182333.7400000002,...",Education,Education,,,Education and health


In [15]:
buildings['landUse'] = buildings['land_use_1']
buildings.drop(['land_use_1', 'land_use_2', 'land_use_3', 'land_use_4', 'land_use_5'], axis = 1, inplace = True)

## Saving

In [32]:
buildings.to_file(input_path+city_name+'_obstructions.shp', driver='ESRI Shapefile')