# Automatic labelling of ground and buildings using data fusion

We build a `FusionPipeline` from one or more `DataFusers`, each of which labels a particular type of object. The result is a labelled point cloud, with the labels stored in the LAS extra dimension `label`.

The following data fusers are available:
- `AHNFuser(..., method='npz')`: uses pre-processed AHN data to label ground and building points.
- `AHNFuser(..., method='geotiff')`: uses GeoTIFF data to label ground points only.

The `FusionPipeline` supports processing a single file, or batch-processing a folder.
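
To make the fuser/pipeline idea concrete, here is a minimal, self-contained sketch of the pattern. `ToyFuser` and `ToyPipeline` are hypothetical stand-ins written for illustration; they are not the classes in `src.fusion`. The rule that a later fuser only labels points that are still unlabelled is an assumption, not confirmed behaviour of `FusionPipeline`.

```python
# Hypothetical illustration of the fuser/pipeline pattern (not src.fusion).
UNLABELLED = 0
GROUND, BUILDING = 1, 2


class ToyFuser:
    """Assigns one label to every point accepted by a predicate."""

    def __init__(self, label, predicate):
        self.label = label
        self.predicate = predicate

    def get_label_mask(self, points, todo):
        # Only points that are still unlabelled (todo is True) are candidates.
        return [t and self.predicate(p) for p, t in zip(points, todo)]


class ToyPipeline:
    """Applies the fusers in order and collects one label per point."""

    def __init__(self, fusers):
        self.fusers = fusers

    def process(self, points):
        labels = [UNLABELLED] * len(points)
        for fuser in self.fusers:
            todo = [lab == UNLABELLED for lab in labels]
            hits = fuser.get_label_mask(points, todo)
            labels = [fuser.label if hit else lab
                      for lab, hit in zip(labels, hits)]
        return labels


# Three (x, y, z) points with made-up elevations.
points = [(0.0, 0.0, 0.1), (1.0, 0.0, 6.0), (2.0, 0.0, 15.0)]
toy_pipeline = ToyPipeline((
    ToyFuser(GROUND, lambda p: p[2] < 0.5),
    ToyFuser(BUILDING, lambda p: p[2] < 10.0),
))
toy_labels = toy_pipeline.process(points)
print(toy_labels)
```

In this sketch the order of the fusers matters: the first point also satisfies the building predicate, but it keeps the ground label because it was labelled first.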

In [None]:
# Add project src to path.
import set_path

# Import modules.
import time

import src.fusion as fusion
import src.utils.ahn_utils as ahn_utils
from src.utils.labels import Labels

### Using pre-processed AHN data

Prepare the data by following notebook [1. AHN preprocessing](1.%20AHN%20preprocessing.ipynb).

In [None]:
# Data folder for the fusers.
ahn_data_folder = '../datasets/ahn/'

# Ground fuser using pre-processed AHN data.
npz_ground_fuser = fusion.AHNFuser(Labels.GROUND, ahn_data_folder,
                                   method='npz', target='ground', epsilon=0.2)
# Building fuser using pre-processed AHN data.
npz_building_fuser = fusion.AHNFuser(Labels.BUILDING, ahn_data_folder,
                                     method='npz', target='building', epsilon=0.2)

# Set up the pipeline.
fusers = (npz_ground_fuser, npz_building_fuser)
# Alternative: label ground points only.
# fusers = (npz_ground_fuser,)
pipeline = fusion.FusionPipeline(fusers)
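
The `epsilon` argument is not explained in this notebook. One plausible reading is that it is a vertical tolerance around the AHN reference elevation, so that a point counts as ground when its height lies within `epsilon` of the surface. The sketch below illustrates that reading only; `ground_mask` is a hypothetical helper, and both the comparison and the elevation values are assumptions rather than the actual `AHNFuser` logic.

```python
def ground_mask(point_z, surface_z, epsilon=0.2):
    """Flag points whose elevation is within epsilon of the reference surface.

    Hypothetical helper: assumes epsilon is a symmetric vertical tolerance.
    """
    return [abs(z - s) < epsilon for z, s in zip(point_z, surface_z)]


# Made-up point elevations and a flat reference surface at z = 0.
point_z = [0.05, 0.35, 2.10, -0.10]
surface_z = [0.00, 0.00, 0.00, 0.00]
toy_mask = ground_mask(point_z, surface_z)
print(toy_mask)
```

Under this assumption, a larger `epsilon` tolerates more noise in the reference surface but also risks labelling low objects, such as kerbs, as ground.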

### Using GeoTIFF data

First, download the required GeoTIFF tile(s). For the demo point cloud, run the following commands from the directory that contains `datasets/`:
```sh
mkdir -p datasets/ahn
cd datasets/ahn/
wget https://download.pdok.nl/rws/ahn3/v1_0/05m_dtm/M_25DN2.ZIP
unzip M_25DN2.ZIP
rm M_25DN2.ZIP
```
Alternatively, run the following cell to do this from within the notebook.

In [None]:
!mkdir -p ../datasets/ahn
!wget https://download.pdok.nl/rws/ahn3/v1_0/05m_dtm/M_25DN2.ZIP -P ../datasets/ahn/
!unzip ../datasets/ahn/M_25DN2.ZIP -d ../datasets/ahn/
!rm ../datasets/ahn/M_25DN2.ZIP

In [None]:
# Data folder for the fusers.
ahn_data_folder = '../datasets/ahn/'

# Ground fuser using AHN GeoTIFF data.
geotiff_ground_fuser = fusion.AHNFuser(Labels.GROUND, ahn_data_folder,
                                       method='geotiff', target='ground', epsilon=0.2,
                                       fill_gaps=True, max_gap_size=100,
                                       smoothen=True, smooth_thickness=2)

# Set up the pipeline (this replaces any pipeline defined earlier).
fusers = (geotiff_ground_fuser,)
pipeline = fusion.FusionPipeline(fusers)

## Process a single file

In [None]:
# Select the file to process. Set outfile to None to overwrite the input file.
filename = '../datasets/pointcloud/filtered_2386_9702.laz'
outfile = '../datasets/pointcloud/labelled_2386_9702.laz'

# Process the file.
start = time.perf_counter()  # Monotonic clock, suited to measuring elapsed time.
pipeline.process_file(filename, outfile=outfile)
end = time.perf_counter()
print(f'Tile labelled in {end-start:.2f} seconds.')

## Process a folder

In [None]:
# Select the folder to process.
in_folder = '../datasets/pointcloud/'
# Output folder. None uses the input folder.
out_folder = None
# Suffix to add to the filename of processed files. An empty string keeps the
# original filename; combined with out_folder=None, this overwrites the input files.
suffix = '_labelled'

# Process the folder.
pipeline.process_folder(in_folder, out_folder=out_folder, suffix=suffix)
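
The interaction between `out_folder` and `suffix` described in the comments above can be sketched as follows. `toy_output_path` is a hypothetical helper, not part of `src.fusion`, and placing the suffix directly before the file extension is an assumption about how `process_folder` names its output.

```python
from pathlib import Path


def toy_output_path(in_file, out_folder=None, suffix=''):
    """Derive an output path from an input path (hypothetical helper).

    Assumes the suffix is inserted just before the file extension.
    """
    in_path = Path(in_file)
    # None means: write next to the input file.
    folder = in_path.parent if out_folder is None else Path(out_folder)
    return folder / f'{in_path.stem}{suffix}{in_path.suffix}'


in_file = '../datasets/pointcloud/filtered_2386_9702.laz'
# With a suffix, the result is a new file in the input folder.
print(toy_output_path(in_file, suffix='_labelled'))
# With no suffix and no output folder, the result equals the input path,
# which is the overwriting case.
print(toy_output_path(in_file) == Path(in_file))
```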