### Building classification inference
This notebook uses the attributes computed in the previous notebook and the trained model to classify individual buildings as residential, non-residential, or industrial.

**Computational times**: The notebook was run on a local machine with the following technical characteristics: 13th Gen Intel(R) Core(TM) i7-13700H processor (2.40 GHz), 16.0 GB of installed RAM (15.7 GB usable), and a 64-bit operating system on an x64-based processor. It was tested on Patan, India, which covers a total of 182,313 buildings. **The total processing time of this notebook was under 1 minute.**

In [1]:
import time
starting_time = time.time()

In [2]:
import pandas as pd
import numpy as np
import keras
import geopandas as gpd
from shapely import wkb

#### Importing building footprints for classification

In [3]:
# Parquet file with the attributes computed in the previous notebook
parquet_file = r"C:\Users\renec\OneDrive\Documents\SEForALL\GitHub\Buildings_with_attributes_for_classification.parquet"
df_real_data = pd.read_parquet(parquet_file)

In [4]:
df_real_data.head(2)

| | perimeter_in_meters | building_faces | bf_source | confidence | geometry | longitude | latitude | id | area_in_meters | height_mean | ... | nearest_road_type_2 | distance_to_2 | nearest_road_type_3 | distance_to_3 | nearest_road_type_4 | distance_to_4 | road_density_for_4_fixed | road_density_for_5_fixed | SQN | faces |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 6.792135 | 4 | google | 0.7125 | `b"\x01\x03\x00\x00\x00\x01\x00\x00\x00\x05\x00...` | 72.118268 | 23.844193 | 72.1182684438068:23.844192643954727 | 2.568806 | 4.375 | ... | secondary | 1222.496917 | tertiary | 738.819548 | residential | 60.665396 | 4694.459283 | 6516.519457 | 0.943885 | 4 |
| 1 | 7.049124 | 4 | google | 0.7023 | `b'\x01\x03\x00\x00\x00\x01\x00\x00\x00\x05\x00...` | 72.159891 | 23.781054 | 72.15989074332529:23.781054282336356 | 2.57961 | 1.0 | ... | secondary | 300.749744 | tertiary | 232.839774 | residential | 25.90124 | 13624.998773 | 11490.107126 | 0.911385 | 4 |


#### Normalising features with the values used in training

In [5]:
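# Normalisation constants. Only normalize_area and normalize_density_100
# are referenced in the next cell; the rest are kept for reference.
normalize_area = 20_000
normalize_int_t = 3300
normalize_int_distance = 180
normalize_road_count = 30
normalize_road_density = 5000
normalize_perim_to_area = 7
normalize_road_distance = 10100
normalize_radius = 100
normalize_density_100 = 200
normalize_smod = 6
normalize_perimeter = 500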

In [6]:
df_real_data['area_in_meters'] = df_real_data['area_in_meters'] / normalize_area
df_real_data['distance_to_1'] = df_real_data['distance_to_1'] / 5000
df_real_data['distance_to_2'] = df_real_data['distance_to_2'] / 4000
df_real_data['distance_to_3'] = df_real_data['distance_to_3'] / 3000
df_real_data['distance_to_4'] = df_real_data['distance_to_4'] / 2000
df_real_data['road_density_for_4_fixed'] = df_real_data['road_density_for_4_fixed'] / 60_000
df_real_data['road_density_for_5_fixed'] = df_real_data['road_density_for_5_fixed'] / 75_000
df_real_data['building_density_100'] = df_real_data['building_density_100'] / normalize_density_100
# 'SQN' and 'faces' are passed to the model without rescaling
print("Real data shape:", df_real_data.shape)


# Drop every column that is not a model input; update this list if the attribute columns change
df_real_dropped = df_real_data.drop(columns=['id', 'ghsl_smod', 'geometry', 'building_density_50', 'building_density_250', 'building_density_500', 
                                             'perimeter_to_area_ratio', 'centroid', 'num_vertices', 'centroid_x', 'centroid_y', 'nearest_road_type_1', 
                                             'nearest_road_type_2', 'nearest_road_type_3', 'nearest_road_type_4', 'latitude', 'longitude', "radius_m", 
                                             'normalized_perimeter_to_area_ratio', 'building_perimeter_in_meters_new', 'bf_source', 'urban_split', 
                                             'perimeter_in_meters', 'building_faces', 'confidence', 'height_mean', 'height_median', 'height_max', 'height', 
                                             'floors', 'gfa_in_meters', 'elevation'])
print("New data shape:", df_real_dropped.shape)


Real data shape: (182313, 42)
New data shape: (182313, 10)
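The scaling and column selection above could also be expressed as a single helper, which fails early when an expected feature is missing. This is a minimal sketch, not the notebook's actual method: the `NORMALISERS` dict, the `PASSTHROUGH` list, and the `normalise_features` function are hypothetical names, and the divisors are copied from the cell above. The helper keeps the input frame's column order, on the assumption that this is the order the model was trained with.

```python
import pandas as pd

# Divisors copied from the normalisation cell; the dict itself is a
# hypothetical convenience and is not part of the original notebook.
NORMALISERS = {
    "area_in_meters": 20_000,
    "distance_to_1": 5000,
    "distance_to_2": 4000,
    "distance_to_3": 3000,
    "distance_to_4": 2000,
    "road_density_for_4_fixed": 60_000,
    "road_density_for_5_fixed": 75_000,
    "building_density_100": 200,
}
# Columns that reach the model unscaled, as in the cell above.
PASSTHROUGH = ["SQN", "faces"]


def normalise_features(df: pd.DataFrame) -> pd.DataFrame:
    """Return a new frame with only the model inputs, scaled."""
    wanted = set(NORMALISERS) | set(PASSTHROUGH)
    missing = sorted(wanted - set(df.columns))
    if missing:
        raise KeyError(f"Missing feature columns: {missing}")
    # Preserve the input column order (assumed to match training).
    out = df[[c for c in df.columns if c in wanted]].copy()
    for column, divisor in NORMALISERS.items():
        out[column] = out[column] / divisor
    return out
```

Because the helper works on a copy, the original frame keeps its raw values and its `geometry` column for the export step.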


#### Importing & running classification model

In [7]:
# Replace the path below with the location of your trained model
model = keras.models.load_model(r"C:\Users\renec\OneDrive\Documents\SEForALL\GitHub\new-classification-model-local\Model\building_classification.keras")
predictions = model.predict(df_real_dropped)
predicted_classes = np.argmax(predictions, axis=1)

# Map the predicted class indices to their labels
class_labels = ['Non-Residential', 'Residential', 'Industrial']
predicted_labels = [class_labels[i] for i in predicted_classes]

# Write the predictions to the final file
output_file = "Patan_building_footprint_classification.parquet"
df_real_data["prediction"] = predicted_labels
df_real_data.to_parquet(output_file)


[1m   1/5698[0m [37m━━━━━━━━━━━━━━━━━━━━[0m [1m11:21[0m 120ms/step

  saveable.load_own_variables(weights_store.get(inner_path))


[1m5698/5698[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m6s[0m 1ms/step
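The index-to-label step can be illustrated with NumPy alone. The sketch below assumes the model returns one score per class for each building, which is what the `argmax` over `axis=1` above implies. The `decode_predictions` function and the demo scores are made up for illustration, and returning the top score alongside the label is an optional addition rather than something the notebook does.

```python
import numpy as np

# Label order copied from the cell above; it has to match the class
# encoding used when the model was trained.
CLASS_LABELS = np.array(["Non-Residential", "Residential", "Industrial"])


def decode_predictions(scores):
    """Map an (n_samples, n_classes) score array to labels and top scores."""
    scores = np.asarray(scores)
    if scores.ndim != 2 or scores.shape[1] != len(CLASS_LABELS):
        raise ValueError(f"expected shape (n_samples, {len(CLASS_LABELS)})")
    class_ids = scores.argmax(axis=1)   # index of the highest score per row
    labels = CLASS_LABELS[class_ids]    # vectorised lookup, no Python loop
    top_score = scores.max(axis=1)      # score of the winning class
    return labels, top_score


# Made-up scores for three buildings, for illustration only.
demo = np.array([[0.1, 0.8, 0.1], [0.7, 0.2, 0.1], [0.2, 0.3, 0.5]])
labels, top_score = decode_predictions(demo)
print(labels, top_score)
```

Keeping the top score makes it possible to filter out low-confidence predictions later, if that turns out to be useful.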


#### Assigning the CRS and saving as GeoParquet

In [8]:
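# Read the file written above; its geometry column still holds raw WKB bytes
df = pd.read_parquet(output_file)
df["geometry"] = df["geometry"].apply(wkb.loads)

# Convert to a GeoDataFrame. Note that crs= only declares the CRS (it does not
# reproject), so the geometries are assumed to already be in EPSG:3857
gdf = gpd.GeoDataFrame(df, geometry="geometry", crs="EPSG:3857")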

# Save with full GeoParquet metadata
gdf.to_parquet(output_file, engine="pyarrow")
gdf.shape

(182313, 43)
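The cell above declares EPSG:3857 on the GeoDataFrame; it does not transform any coordinates. If longitude/latitude output is needed, `GeoDataFrame.to_crs` performs the actual reprojection. The sketch below shows the difference on a single made-up point, assuming (as the cell above does) that the stored geometries are in Web Mercator metres.

```python
import geopandas as gpd
from shapely import wkb
from shapely.geometry import Point

# A made-up point in Web Mercator metres, serialised to WKB bytes in the
# same way as the `geometry` column of the parquet file.
raw = Point(8_028_000.0, 2_730_000.0).wkb

# Decode the bytes and declare (not transform) the CRS.
geom = wkb.loads(raw)
gdf_demo = gpd.GeoDataFrame(
    {"prediction": ["Residential"]}, geometry=[geom], crs="EPSG:3857"
)

# to_crs is the call that actually changes the coordinates.
gdf_wgs84 = gdf_demo.to_crs("EPSG:4326")
print(gdf_demo.crs.to_epsg(), gdf_wgs84.crs.to_epsg())
print(gdf_wgs84.geometry.iloc[0])
```

Comparing the reprojected coordinates with the `longitude` and `latitude` columns is a quick way to confirm that the declared CRS is the right one.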

In [9]:
ending_time = time.time()
total_time = ending_time - starting_time
print(f"Total execution time: {round(total_time/60, 2)} minutes")

Total execution time: 0.29 minutes
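To see where the time goes rather than only the notebook total, each stage could be wrapped in a small timer. This is a sketch using only the standard library; the `timed` context manager and the stage label are hypothetical and do not appear in the original notebook.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Store the elapsed wall-clock seconds of the wrapped block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


timings = {}
with timed("example_stage", timings):
    time.sleep(0.01)  # stand-in for a real step such as model.predict(...)

for label, seconds in timings.items():
    print(f"{label}: {seconds:.2f} s")
```

`time.perf_counter` is used here because it is a monotonic, high-resolution clock, which makes it better suited to measuring short intervals than `time.time`.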
