# Extract pole-like street furniture

In this notebook we demonstrate how to extract pole-like objects from labeled point clouds.

The code assumes that the point clouds have been labeled following the process in our [Urban PointCloud Processing](https://github.com/Amsterdam-AI-Team/Urban_PointCloud_Processing/tree/main/datasets) project. For more information on the specifics of the datasets used, see [the description there](https://github.com/Amsterdam-AI-Team/Urban_PointCloud_Processing/blob/main/datasets/README.md).

In [None]:
import numpy as np
import pandas as pd
import os
import pathlib
import laspy
import uuid
from tqdm import tqdm

import set_path  # add project src to path
from upcp.utils import las_utils
from upcp.utils import ahn_utils
import upcp.utils.bgt_utils as bgt_utils
import upc_analysis.pole_extractor as extract

import config as cf  # use config or config_azure

In [None]:
# Retrieve paths to point cloud demo data
files = list(pathlib.Path(cf.dataset_folder).glob(f'{cf.prefix}*.laz'))

# Run extraction only on files larger than 10 kB (10,000 bytes)
files = [f for f in files if os.path.getsize(f) > 10000]

# Create AHN data reader for elevation data
# This is optional: the data is only used when the ground elevation
# cannot be determined from the labeled point cloud itself.
ahn_reader = ahn_utils.NPZReader(cf.ahn_data_folder, caching=False)

# Create BGT data reader for building shapes
# This is optional: the data is used to check whether an extracted
# object is located within a building footprint. This might
# indicate a false positive.
if os.path.exists(cf.bgt_building_file):
    bld_reader = bgt_utils.BGTPolyReader(cf.bgt_building_file)
else:
    bld_reader = None
    print('no building file')
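
The building check described in the comments above boils down to a point-in-polygon test. How `BGTPolyReader` and the extractor implement it is not shown in this notebook; the following is only a minimal sketch of the general idea using ray casting, with a hypothetical helper `point_in_polygon` and a made-up rectangular footprint.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: True if (x, y) lies inside the polygon.

    `polygon` is a list of (x, y) vertices; this helper is hypothetical
    and not part of upcp or upc_analysis.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Made-up building footprint (a 10 m x 6 m rectangle)
footprint = [(0, 0), (10, 0), (10, 6), (0, 6)]
print(point_in_polygon(4, 3, footprint))   # pole base inside the footprint
print(point_in_polygon(12, 3, footprint))  # pole base outside the footprint
```

An object whose base falls inside a footprint would get `in_bld` set, which can then be used to flag it as a possible false positive.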

---
## Extracting pole-like objects

This method works by clustering points of a given target class, and then using statistics and principal component analysis (PCA) on each cluster to determine the exact location and orientation of the pole. The result is a dataset with the following features for each extracted object:
```txt
rd_x, rd_y, z = X, Y, Z coordinates of the base of the pole
tx, ty, tz    = X, Y, Z coordinates of the top of the pole
height        = the height of the pole, in m
angle         = the angle of the pole, in degrees w.r.t. vertical
prob          = the average probability of the classification, if this data is available in the point cloud 
m_r, m_g, m_b = mean RGB colors of the pole
radius        = radius of the pole, in m
n_points      = the number of points of the object
in_bld        = flag that indicates whether the object is located inside a building footprint
debug         = debug code, see below
tilecode      = tilecode in which the object was found
```
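
To give an intuition for the PCA step mentioned above, here is a minimal sketch of how the angle w.r.t. vertical could be estimated from a single cluster. This is an illustration under our own assumptions, not the code used by `PoleExtractor`; the function `principal_axis_angle` and the synthetic pole are hypothetical.

```python
import numpy as np


def principal_axis_angle(points):
    """Return (unit axis, angle in degrees w.r.t. vertical) of an (N, 3) cluster."""
    centered = points - points.mean(axis=0)
    # Rows of vt are the principal directions, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    axis = vt[0]
    if axis[2] < 0:
        axis = -axis  # orient the axis upwards
    angle = np.degrees(np.arccos(np.clip(axis[2], -1.0, 1.0)))
    return axis, angle


# Synthetic pole: 5 m long, tilted 10 degrees from vertical in the x-direction.
rng = np.random.default_rng(0)
t = np.linspace(0, 5, 200)
tilt = np.radians(10)
pole = np.column_stack((t * np.sin(tilt), np.zeros_like(t), t * np.cos(tilt)))
pole += rng.normal(scale=0.01, size=pole.shape)  # small sensor noise

axis, angle = principal_axis_angle(pole)
print(f'estimated angle: {angle:.1f} degrees')
```

The first principal direction of an elongated cluster follows its long axis, so the angle between that direction and the z-axis approximates the tilt of the pole.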
The debug code `A_B` indicates potential issues with either the ground elevation (A) or the pole extraction (B). A can be either 0 (no problems), 1 (no ground elevation found in the point cloud), or 2 (no ground elevation found in AHN). B can be either 0 (no problems), 3 (not enough data to determine the angle), or 4 (not enough data to determine the exact location).
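
As a small illustration, the debug code can be translated into readable messages. The sketch below assumes the code is stored as a string of the form `'A_B'`; the helper `describe_debug` is hypothetical and not part of the project.

```python
# Meanings of the two parts of the debug code, as described above
GROUND_CODES = {
    0: 'no problems',
    1: 'no ground elevation found in the point cloud',
    2: 'no ground elevation found in AHN',
}
POLE_CODES = {
    0: 'no problems',
    3: 'not enough data to determine the angle',
    4: 'not enough data to determine the exact location',
}


def describe_debug(code):
    """Split a debug code 'A_B' into (ground message, pole message)."""
    ground, pole = (int(part) for part in str(code).split('_'))
    return GROUND_CODES[ground], POLE_CODES[pole]


print(describe_debug('1_3'))
```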

In [None]:
# Create PoleExtractor object
pole_extractor = extract.PoleExtractor(cf.target_label, cf.ground_labels,
                                       ahn_reader=ahn_reader, building_reader=bld_reader,
                                       eps_noise=cf.EPS_N, min_samples_noise=cf.MIN_SAMPLES_N,
                                       eps=cf.EPS, min_samples=cf.MIN_SAMPLES)

In [None]:
# Loop over point cloud files and extract objects
locations = []
for file in tqdm(files):
    try:
        with extract.TimeOut(240):
            tilecode = las_utils.get_tilecode_from_filename(file.as_posix())
            pc = laspy.read(file)
            npz_file = np.load(cf.pred_folder + cf.prefix_pred + tilecode + '.npz')
            labels = npz_file['label']
            if 'probability' in pc.point_format.extra_dimension_names:
                probabilities = pc.probability
            else:
                probabilities = np.zeros_like(labels)
            if np.count_nonzero(labels == cf.target_label) > 0:
                points = np.vstack((pc.x, pc.y, pc.z)).T
                # Stack colors in R, G, B order to match the m_r, m_g, m_b output columns
                colors = np.vstack((pc.red, pc.green, pc.blue)).T
                tile_locations = pole_extractor.get_pole_locations(points,
                                                                   colors,
                                                                   labels,
                                                                   probabilities,
                                                                   tilecode)
                locations.extend([(*x, tilecode) for x in tile_locations])
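    except Exception as e:
        # Skip tiles that fail or time out, but report them instead of failing silently
        tqdm.write(f'Skipping {file.name}: {e!r}')
        continue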

HEADERS = ['rd_x', 'rd_y', 'z', 'tx', 'ty', 'tz', 'height', 'angle', 'm_r', 'm_g', 'm_b',
           'radius','prob', 'n_points', 'in_bld', 'debug', 'tilecode']
poles_df = pd.DataFrame(locations, columns=HEADERS)

print('# extracted poles: ', len(poles_df))
poles_df.head(2)

In [None]:
# Add unique identifier to each row
uuids = [uuid.uuid4() for _ in range(len(poles_df.index))]
poles_df.insert(0, 'identifier', uuids)

# Store all extracted poles
poles_df.to_csv(cf.output_file, index=False)

### Filter out likely false positives

In [None]:
# Remove 'poles' that are too short or too tall
poles_df = poles_df[poles_df['height'] > cf.MIN_HEIGHT]
poles_df = poles_df[poles_df['height'] < cf.MAX_HEIGHT]
print('# poles after height filter: ', len(poles_df))

In [None]:
# Remove 'poles' that are actually trees
poles_df = extract.remove_tree_poles(cf.trees_file, poles_df, cf.MAX_TREE_DIST, cf.tree_area)
print('# poles after tree filter: ', len(poles_df))
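
The internals of `remove_tree_poles` are not shown in this notebook. As a rough illustration of what a distance-based tree filter could look like, the sketch below assumes that a pole is discarded when it lies within a maximum horizontal distance of a known tree location. The function `far_from_trees` and the coordinates are hypothetical.

```python
import numpy as np


def far_from_trees(pole_xy, tree_xy, max_dist):
    """Boolean mask: True for poles farther than max_dist from every tree."""
    pole_xy = np.asarray(pole_xy, dtype=float)
    tree_xy = np.asarray(tree_xy, dtype=float)
    if len(tree_xy) == 0:
        return np.ones(len(pole_xy), dtype=bool)
    # Pairwise horizontal distances, shape (n_poles, n_trees)
    diff = pole_xy[:, None, :] - tree_xy[None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    return dist.min(axis=1) > max_dist


# Made-up pole and tree locations (x, y in metres)
poles = [(0.0, 0.0), (10.0, 0.0), (20.0, 5.0)]
trees = [(0.3, 0.2), (19.0, 5.0)]
keep = far_from_trees(poles, trees, max_dist=0.5)
print(keep)
```

The brute-force pairwise distance matrix is fine for small examples; for city-scale data a spatial index would be the more practical choice.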

In [None]:
# Store remaining poles
poles_df = poles_df.reset_index()
poles_df.to_csv(cf.output_file_filter, index=False)