This repository contains the code that was used in the paper: Reconstructing façade details using MLS point clouds and Bag-of-Words approach

Project Photogrammetry - Reconstructing façade details using MLS point clouds and Bag-of-Words approach

Introduction

The idea of this project is to make use of the Bag-of-Words approach [Csurka et al., 2004] in order to reconstruct facade details using MLS point clouds. The general idea is described in the following schematic graphic, based on [Memon et al., 2019].

Introductional_Graphic
Figure 1: General overview, based on [Memon et al., 2019]


The timeliness and relevance of this idea result from the increased availability of MLS point clouds and large CAD databases [Trimble Inc, 2021] in recent years.

Methodology

Overview

The following diagram gives an overview of the implementation in this project. The general structure can be divided into two parts:

  • The training part. The codebook is constructed here. This process consists of several individual steps, which involve (among others) pose normalization, feature extraction, and a clustering in feature space.
  • The inference part. Here, the windows from the MLS point cloud are described, and the histograms of the codeword representations of model and object are compared.


Introductional_Graphic
Figure 2: Schematic diagram of the used methodology

The Bag of Words Approach

The general bag of words approach was introduced in 1986 by Salton and McGill. It was originally developed to classify texts by creating histograms of relative word occurrences [Salton & McGill, 1986]. The approach was later extended to images by Csurka et al. in 2004. The concept there is to extract local features from an image, build a dictionary (codebook), and then create a histogram of occurrences of the "visual codewords" for each image. The construction of the codebook involves several steps, for each of which specific considerations have to be made [Csurka et al., 2004] (a minimal sketch follows the list below):

  • Feature extraction: Which features should be used?
  • Vector quantization: How can it be performed?
  • Clustering: Which clustering algorithm should be used? What number of clusters would be beneficial?
  • Histogram comparison: Which histogram distance should be used?
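As a minimal illustration of these steps, the following sketch builds a codebook from ORB descriptors of a set of binary window images and represents a single image as a normalized codeword histogram. The use of scikit-learn's KMeans and the 25-cluster setting mirror choices reported later in this README, but the function names and structure are illustrative, not the project's actual implementation:

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

def build_codebook(images, n_clusters=25):
    """Pool ORB descriptors from all training images and cluster them."""
    orb = cv2.ORB_create()
    descriptors = []
    for img in images:
        _, des = orb.detectAndCompute(img, None)
        if des is not None:
            descriptors.append(des)
    # ORB descriptors are binary (uint8); they are cast to float for k-means
    feature_space = np.vstack(descriptors).astype(np.float32)
    return KMeans(n_clusters=n_clusters).fit(feature_space)

def codeword_histogram(image, codebook):
    """Represent one image as a normalized histogram of codeword occurrences."""
    orb = cv2.ORB_create()
    _, des = orb.detectAndCompute(image, None)
    words = codebook.predict(des.astype(np.float32))
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return hist / hist.sum()
```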

Augmentation of the bag of words approach by incorporating global features

A problem in using the bag of words approach for shapes is that shapes generally have a smaller number of distinct features, so geometric objects will be represented by many similar visual words [Bronstein et al., 2011]. One possibility to circumvent this problem is to incorporate global features into the representation of the object; this way, global information about the object can be used during the process. Different studies have made use of this concept:

  • [Zhu et al., 2016]
  • [Memon et al., 2019]

The realization of this general concept of incorporating global features is shown schematically in the following diagram:

Introductional_Graphic
Figure 3: Incorporation of global features

CAD Model Selection

Large CAD libraries are one of the foundations of the concept on which this project is based. The CAD models are selected from the SketchUp 3D Warehouse [Trimble Inc, 2021]. For experimental purposes, only a small number of windows is selected. The individual windows are chosen so that their visual appearance matches windows that appear in the TUM-Façade dataset [Wysocki et al., 2022]. A model that does not correspond to any window in the TUM-Façade dataset is selected additionally.

The selected models are manually edited in order to add or remove window bars. Furthermore, curtains and similar structures are removed from all of the chosen CAD models.

To make the analysis of the window bars possible, the window glass is also removed from all of the models. Through this step, the models correspond more closely to the actual measurement conditions, since the pulses of the laser scanner usually penetrate window glass. The following figure shows the edited windows that are used in this project:

Introductional_Graphic
Figure 4: Edited CAD window models

CAD Model Sampling

One of the key problems in this project is that the features used to describe the model and the features that describe the measured window have to correspond to each other as closely as possible. To ensure this correspondence, the CAD models are transferred from the CAD domain to the point cloud domain. This transfer is realized by performing an equidistant sampling of the CAD models in FME with the PointCloudCombiner transformer. During this process, the sampling distance is a critical parameter: it has to be chosen in such a way that the resulting point cloud represents the CAD model accurately. Too large a sampling distance leads to incomplete representations, while too small a sampling distance results in an unnecessarily large number of points.

Introductional_Graphic
Figure 5: Sampled CAD Model
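In this project the sampling itself is done in FME; purely as an illustration of the idea, a comparable uniform sampling of a CAD mesh can be sketched with Open3D. The file names and the point count below are placeholders, and Poisson-disk sampling is used here as a stand-in for FME's equidistant sampling:

```python
import open3d as o3d

# Hypothetical input: a window model converted from SketchUp to a mesh format
mesh = o3d.io.read_triangle_mesh("window_model.obj")

# Poisson-disk sampling gives an approximately equidistant point distribution;
# the number of points indirectly controls the sampling distance
pcd = mesh.sample_points_poisson_disk(number_of_points=20000)
o3d.io.write_point_cloud("window_model.ply", pcd)
```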

Pose normalization

The first step is to perform a pose normalization. In order to apply the techniques of the following steps, certain invariances must be guaranteed. The following invariances can be guaranteed by a sophisticated pose normalization:

  • Translation invariance
  • Rotation invariance
  • Scale invariance

Translation invariance can generally be achieved by various techniques. In this project, it is realized by translating the mean coordinate point to the origin of the coordinate system. The translation parameters are simply the mean x, y, and z values.

Scale invariance is achieved by scaling the point clouds in such a way that the maximal distance of a point from the origin of the coordinate system is equal to 10. The scaling factor is calculated from the largest coordinates in terms of absolute value.

The most common method to achieve rotational invariance is to calculate the principal axes of the object and align them with the coordinate system. Often, the principal axes are calculated using principal component analysis (PCA) [Martens & Blankenbach, 2020]. This concept was implemented experimentally in this project, but it has proven to be unsuitable for use here: PCA is very sensitive to noise and to point sampling density variations [Martens & Blankenbach, 2020], and both disturbing factors are present in the point clouds used in this project. Because of that, the point clouds are aligned to the coordinate axes manually. There would be other options for the alignment, for example a rectilinearity-based approach by [Lian et al., 2009], which are not investigated further in this project.
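The translation and scale normalization described above amount to a few lines of NumPy; the following is a minimal sketch (the rotational alignment to the coordinate axes is done manually in this project and is therefore not part of it):

```python
import numpy as np
import open3d as o3d

def normalize_pose(pcd: o3d.geometry.PointCloud) -> o3d.geometry.PointCloud:
    points = np.asarray(pcd.points)
    points = points - points.mean(axis=0)           # translation invariance
    max_dist = np.linalg.norm(points, axis=1).max()
    points = points * (10.0 / max_dist)             # scale invariance: max distance = 10
    pcd.points = o3d.utility.Vector3dVector(points)
    return pcd
```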

Downsampling and Outlier Removal

In order to increase the speed of the computations, a downsampling of the point clouds is performed. For this purpose, the respective Open3D functionality (`voxel_down_sample()`) is used; in this project, the voxel size is set to 0.5. For the removal of outliers and the reduction of noise, the Open3D functionality `remove_radius_outlier(nb_points=16, radius=0.5)` is used. The parameter settings for this function were determined by a visual trial-and-error approach: the settings were adapted until the binary image generated from the resulting projected point cloud showed a satisfactory appearance.
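A sketch of this step with the Open3D calls named above; the input file name is a placeholder:

```python
import open3d as o3d

pcd = o3d.io.read_point_cloud("normalized_window.ply")  # hypothetical input

# Voxel-based downsampling with the voxel size used in this project
down = pcd.voxel_down_sample(voxel_size=0.5)

# Radius-based outlier removal; returns the filtered cloud and the kept indices
filtered, kept_indices = down.remove_radius_outlier(nb_points=16, radius=0.5)
```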

2D Image Generation

A facade detail (or, more specifically, a window) that is extracted from an MLS point cloud is generally available as a three-dimensional collection of points. When thinking about how we as humans perceive facade details (or, again, windows), we notice that we mostly rely on the front view of the window: most windows don't have prominent three-dimensional structures that can be used for identification. Because of this, only the front view of the windows is considered in this project. For practicability and simplified feature extraction, a 2D binary image is created from the point cloud. The process of creating such images consists of the following two steps:

  1. Projection of the 3D point cloud into a 2D point cloud. Because of the manual pose normalization performed in the previous step, the front view is already parallel to one of the coordinate planes, which simplifies the projection of the point cloud.
  2. Creation of the image from the projected point cloud. A zero matrix that represents the 2D structure of the binary image is set to 1 in every cell that corresponds to a projected point.

Since there are three coordinate planes that the front view could be parallel to, the correct one has to be found. In this project, a simple manual selection is applied for this purpose: the user is presented, one after another, with the images generated from the three different projection directions and can either discard or accept each presented view. Another, more elegant possibility, which is implemented experimentally but not used in the project, is to find the correct projection direction as the view with the maximum covariance. A sketch of the rasterization step follows below.
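A minimal sketch of the projection and rasterization, assuming a NumPy array of normalized points; the image size and the function name are illustrative:

```python
import numpy as np

def binary_image(points: np.ndarray, drop_axis: int = 1, size: int = 512) -> np.ndarray:
    """points: (N, 3) array; drop_axis: the coordinate removed by the projection."""
    proj = np.delete(points, drop_axis, axis=1)   # 3D -> 2D projection
    proj = proj - proj.min(axis=0)                # shift into the positive range
    proj = proj / proj.max() * (size - 1)         # scale to the image extent
    img = np.zeros((size, size), dtype=np.uint8)
    rows = proj[:, 1].astype(int)
    cols = proj[:, 0].astype(int)
    img[rows, cols] = 1                           # set cells hit by projected points to 1
    return img
```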

The following graphic shows an example of the binary 2D images generated from a model:

Introductional_Graphic
Figure 6: Projected images from all 3 different views

2D Image Processing

In general, the point clouds that are acquired by MLS are corrupted by a certain level of noise and show point density variations. In order to be able to extract meaningful information from the binary images created in the previous step, several processing steps are necessary. The following processing steps are applied (a combined code sketch follows after the list):

  • Morphological Operators: Dilation

    This step is necessary because the initial image consists of a collection of individually separated points. In order to perform an edge detection later on, the represented object must be a single homogeneous area in the image. This is achieved by applying a dilation with a sufficiently large kernel size; in this project, a kernel size of 20 pixels is used. The following graphic gives an example of the dilated image:

    Introductional_Graphic
    Figure 7: Dilated Image
  • Edge Detection: Laplace Filter

    The geometry (and also the topology) of the window is mainly described by its edges and corners. Therefore, it is reasonable to apply an edge detection filter to the previously dilated image before extracting features from it. The following graphic gives an example of the image after the application of the Laplace filter:

    Introductional_Graphic
    Figure 8: Laplacian Image
  • Line Simplification: Douglas-Peucker Algorithm

    Noise often leads to many irregularities in the images. In order to increase robustness to such noise, the Douglas-Peucker algorithm [Douglas & Peucker, 1973] is applied to the contours that are found in the images. The following graphic shows an example of such a simplification: on the left, the image without the simplification; on the right, the same image after the application of the Douglas-Peucker algorithm. For the implementation of the algorithm, the OpenCV library is used.


    Figure 9: Application of the Douglas-Peucker Algorithm


The following figure demonstrates how the extraction of meaningful interest points is enhanced by applying this method. In the left image, many of the detected interest points are not related to the general structure of the window but are detected solely because of the presence of noise. In the right image, where the Douglas-Peucker algorithm was applied, more meaningful interest points are found.




    Figure 10: Extraction of meaningful interest points
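The three processing steps above can be sketched with standard OpenCV calls; the input file name is a placeholder, and the kernel size of 20 pixels and the Douglas-Peucker tolerance of 15 follow the values reported in this README:

```python
import cv2
import numpy as np

# Hypothetical input: a binary image from the rasterization step, scaled to 0/255
img = cv2.imread("binary_window.png", cv2.IMREAD_GRAYSCALE)

# 1. Dilation: merge the isolated points into a single homogeneous area
dilated = cv2.dilate(img, np.ones((20, 20), np.uint8))

# 2. Edge detection with the Laplace filter
edges = cv2.Laplacian(dilated, cv2.CV_8U)

# 3. Contour extraction and Douglas-Peucker simplification (tolerance 15)
contours, _ = cv2.findContours(edges, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
simplified = [cv2.approxPolyDP(c, 15, True) for c in contours]
```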

Feature Extraction

In order to construct the codebook and describe objects by means of this codebook, features have to be extracted (cf. Figure 2). In this project, the extraction of the features takes place in the point cloud domain. Various features were implemented, but not all of them are used, as some have proven to be unfit for use within the framework of this project. The following table summarizes all the implemented features:

| Feature | Feature Type | Acquisition Source | Explanation | Additional Information |
| --- | --- | --- | --- | --- |
| AV-ratio of the bounding box | global | 3D point cloud | Ratio between the area and the volume of the bounding box | Bounding box calculated with the Open3D functionality `get_oriented_bounding_box()` |
| Cubeness of the bounding box | global | 3D point cloud | Compares the surface area of the bounding box to the surface area of a cube with the same volume | Bounding box via `get_oriented_bounding_box()`; inspired by [Labetski et al., 2022] |
| AV-ratio of the convex hull | global | 3D point cloud | Ratio between the area and the volume of the convex hull | Convex hull calculated with the Open3D functionality `compute_convex_hull()` |
| Cubeness of the convex hull | global | 3D point cloud | Compares the surface area of the convex hull to the surface area of a cube with the same volume | Convex hull via `compute_convex_hull()`; inspired by [Labetski et al., 2022] |
| Circularity of the convex hull | global | 3D point cloud | Measures the volume deviation between the convex hull and its equal-area hemisphere | Convex hull via `compute_convex_hull()`; inspired by [Labetski et al., 2022] |
| Cohesion of the convex hull | global | 3D point cloud | A measure of overall accessibility from all points to others within a polyhedron; "the Cohesion Index is the ratio of the average distance-squared among all points in an equal-area circle and the average distance-squared among all points in the shape" [Angel et al., 2010] | - |
| Exchange of the convex hull | global | 3D point cloud | "[The exchange] measures how much of the area inside a circle is exchanged with the area outside it to create a given shape" [Angel et al., 2010] | The calculation of this feature was attempted but proved much too complex to complete within the scope of this project; a proper geometry engine would be necessary |
| Convex hull to bounding box area ratio | global | 3D point cloud | Ratio of the area of the convex hull and the area of the bounding box | Bounding box via `get_oriented_bounding_box()`; convex hull via `compute_convex_hull()` |
| Convex hull to bounding box volume ratio | global | 3D point cloud | Ratio of the volume of the convex hull and the volume of the bounding box | Bounding box via `get_oriented_bounding_box()`; convex hull via `compute_convex_hull()` |
| Fractal dimension of the point cloud | global | 3D point cloud | - | The fractal dimension turned out to depend on the ratio between the box size used in the box-counting algorithm and the density of the point cloud, so its use was not pursued further. It might be possible to adapt the box size to the point density, but this is challenging: the density of a window point cloud depends not only on the acquisition conditions but also on the shape and structure of the window, so knowledge about the window structure would be necessary to find the true density of points that actually represent parts of the building |
| Squareness of the 2D convex hull | global | 2D point cloud (projected 3D point cloud); 2D image | Measures the perimeter deviation between the 2D convex hull and a square with the same area | Inspired by [Labetski et al., 2022] |
| Squareness of the 2D bounding box | global | 2D point cloud (projected 3D point cloud) | Measures the perimeter deviation between the 2D bounding box and a square with the same area | Inspired by [Labetski et al., 2022] |
| Circularity of the 2D projected point cloud | global | 2D point cloud (projected 3D point cloud) | - | - |
| 2D convex hull to 2D bounding box area ratio | global | 2D image | Measures the area deviation of the 2D convex hull and the 2D bounding box | - |
| 2D convex hull to 2D bounding box perimeter ratio | global | 2D point cloud (projected 3D point cloud); 2D image | Measures the perimeter deviation of the 2D convex hull and the 2D bounding box | - |
| Cohesion of the 2D convex hull | global | 2D image | A measure of overall accessibility from all points to others within the shape; "the Cohesion Index is the ratio of the average distance-squared among all points in an equal-area circle and the average distance-squared among all points in the shape" [Angel et al., 2010] | - |
| Number of straight lines in the image | global | 2D image | Number of lines detected in the image by applying the Hough transform. Five parameters can be varied: the distance resolution (in pixels) of the Hough grid, the angular resolution (in radians) of the Hough grid, the minimum number of votes (intersections in a Hough grid cell), the minimum number of pixels making up a line, and the maximum gap in pixels between connectable line segments. The parameters were set by a trial-and-error approach | Calculated with the OpenCV library |
| Histogram of Oriented Gradients (HOG) | semi-global | 2D image | - | Calculated with the OpenCV library; based on [Dalal & Triggs, 2005] |
| Hu moments | global | 2D image | "[The Hu moments are a set of] two-dimensional moment invariants for planar geometric figures" [Hu, 1962]; invariant to translation, rotation, and scaling | Calculated with the OpenCV library; based on [Hu, 1962] |
| ORB descriptor | local | 2D image | A "very fast binary descriptor based on BRIEF [...] which is rotation invariant and resistant to noise" [Rublee et al., 2011] | Only the descriptor is used in this project; the position of the keypoint itself is not used. "Two orders of magnitude faster than SIFT" [Rublee et al., 2011]; calculated with the OpenCV library |

        Codebook Construction
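The codebook is constructed by clustering the pooled local feature space with k-means; 25 clusters are used in all experiments of this project, and the cluster centres serve as the visual codewords. The following is a minimal sketch of this step, mirroring the .txt-file interface described under Implementation; the file names are placeholders:

```python
import numpy as np
from sklearn.cluster import KMeans

def construct_codebook(feature_file="LocalFeatureSpace.txt",
                       codebook_file="CodeBook.txt",
                       n_clusters=25):
    """Cluster the local feature space and store the cluster centres as codewords."""
    features = np.loadtxt(feature_file)          # one descriptor per row
    kmeans = KMeans(n_clusters=n_clusters).fit(features)
    np.savetxt(codebook_file, kmeans.cluster_centers_)
    return kmeans.cluster_centers_
```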

        Histogram Comparison

The final step in the process described in Figure 2 is the comparison of the histograms that represent the occurrence of the visual words. There are a number of ways in which such a comparison can be realized. For this project, several different histogram distances are implemented. The following table summarizes them:

| Name | Formula | Further Information |
| --- | --- | --- |
| Minkowski distance | $d_p(H,K) = \left(\sum_i \lvert h_i - k_i \rvert^p\right)^{1/p}$ | [Cha, 2008]. In this project, the parameter p is fixed to 2. |
| Chi-square distance | $\chi^2(H,K) = \sum_i \frac{(h_i - k_i)^2}{h_i + k_i}$ | [Cha, 2008]. For the calculation of the chi-square distance, the respective function of the OpenCV library is used. |
| Kullback-Leibler divergence | $D_{KL}(H \parallel K) = \sum_i h_i \log \frac{h_i}{k_i}$ | [Joyce, 2011] |
| Jensen-Shannon divergence | $JSD(H,K) = \tfrac{1}{2} D(H \parallel M) + \tfrac{1}{2} D(K \parallel M)$ with $M = \tfrac{1}{2}(H + K)$ | [Fuglede & Topsøe, 2004]. D is the Kullback-Leibler divergence. |
| Combined histogram distance | A weighted average of all the previously listed histogram distances | - |
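The listed distances, in their standard forms for normalized histograms h and k, can be sketched as follows; a small epsilon guards the logarithms, and the exact variants used in the project (e.g. OpenCV's chi-square implementation) may differ slightly:

```python
import numpy as np

EPS = 1e-12

def minkowski(h, k, p=2):                    # p is fixed to 2 in this project
    return np.sum(np.abs(h - k) ** p) ** (1.0 / p)

def chi_square(h, k):
    return np.sum((h - k) ** 2 / (h + k + EPS))

def kullback_leibler(h, k):
    return np.sum(h * np.log((h + EPS) / (k + EPS)))

def jensen_shannon(h, k):
    m = 0.5 * (h + k)
    return 0.5 * kullback_leibler(h, m) + 0.5 * kullback_leibler(k, m)
```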

        Implementation

The implementation is done in Python 3.10. Various openly accessible Python libraries are used in this project; the most important ones are Open3D, OpenCV, and NumPy.

There are a number of files that contain specific functionalities. The following table lists all the implemented files and briefly explains their functionality:

| Name | Functionality | Further Information |
| --- | --- | --- |
| main.py | The main file that is used to run the code. The input, output, and inference directories are defined here. | - |
| Normalization.py | Contains the functionalities used to achieve translation and scale invariance of a point cloud. The scaling is set in such a way that the largest extent in a coordinate direction becomes 10; the translation is performed in such a way that the mean point of the point cloud becomes the new origin of the local coordinate system. | Initially, a functionality for rotation invariance based on PCA was also implemented here; it was later omitted because of the sensitivity of PCA towards asymmetry and point density variations. |
| PCA.py | Contains a functionality to perform a principal component analysis (PCA) on a given point cloud. Input are the points as a NumPy array; output are the eigenvectors and their respective eigenvalues. | During the experiments for this project, PCA was not used. |
| twoDImage.py | Contains the functionalities necessary to create binary two-dimensional images from the projected point clouds: the image generation (inputs: the coordinate axes to project to, the image size, and the input Open3D point cloud); a function to calculate the covariance of a two-dimensional point cloud (discarded in this project); a method that calculates the projection direction with the maximum covariance (discarded for this project; it is not clear whether it actually works); a functionality for the manual selection of a projection direction; and the functionality for the Douglas-Peucker algorithm. | The code in the other files has been adapted to work with the manual selection of the best projection direction; it would be a laborious task to make it work with the automatic projection direction detection. |
| Global_Features.py | Contains functionalities to calculate global features for a given point cloud. | Contains a lot of unused and unfinished code. |
| twoDFeatures.py | Contains the functionalities used to extract features from the binary images created by the functions in twoDImage.py: ORB; SIFT (not used in this project); MSER (not used in this project); HOG; a dilation function; a Laplace function; line detection and counting; template matching (unused and unfinished); a function for the extraction of the Hu moments; and the functions for the extraction of the global features from the 2D bounding box and convex hull. | - |
| FractalDimension.py | Contains the functionality to calculate the fractal dimension of a given point cloud by applying the box-counting algorithm. | Not used in this project because of several issues. |
| CodeBook.py | Contains all the functionalities used to create the codebook from a set of given model point clouds: a function to create the local feature space and write it to a .txt file; a function to create the global feature space and write it to a .txt file; a functionality to perform a clustering in the local feature space; and a function to generate the codebook and write it to a .txt file. | .txt files are chosen as the interface between the different functions because of several advantages: single files can be re-used; individual files can easily be manipulated for debugging and testing purposes; and results can easily be archived. |
| Inference.py | Contains all the functionalities necessary to represent objects by means of the codebook: a function to represent a point cloud by means of the codebook; a function for vector quantization; a function for the histogram comparison that calculates the distance of a given histogram to all the histograms of the model representations and writes them into the "Result.txt" file; auxiliary functions for the histogram distance calculation; and a function that analyzes the result, finds the best matches, and writes them into the "Matchings.txt" file. | - |
| ArtificialNoise.py | Contains a function that applies random noise of a certain defined strength to an input point cloud (a minimal sketch follows below). | Used in the small-scale experiments in this project. |
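A minimal sketch of what ArtificialNoise.py is described to do; the uniform distribution and the parameter name are assumptions:

```python
import numpy as np

def add_artificial_noise(points: np.ndarray, strength: float = 0.2) -> np.ndarray:
    """points: (N, 3) array; strength: maximum absolute offset per coordinate (assumed)."""
    noise = np.random.uniform(-strength, strength, size=points.shape)
    return points + noise
```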

        Discarded Ideas

• Template matching:

  One idea was to count the number of occurrences of certain patterns in the 2D binary images. For experimental purposes, a template matching algorithm was implemented that was intended to match specially designed cross patterns with the binary images and count the number of found matches. There are several reasons why this approach was discarded:

  • Invariance to rotation and scaling takes a high effort to construct.
  • Sensitivity to noise.
  • For different models, different new templates might have to be crafted.

  The following graphic contains a selection of cross patterns that were crafted in order to test this functionality:



          Introductional_Graphic
          Figure 11: Selection of templates
• The number of found contours

  Another feature that was implemented but not used is the number of contours found in the images. The idea is that a distinguishing feature between windows with window bars and windows without them is the overall number of contours found in the image: for windows with window bars, the number of contours will theoretically be higher. This was meant to improve the discrimination based on the interior structure of the windows. The implementation of the contour detection and counting is based on the OpenCV library (a short sketch follows after this list). The results are heavily affected by noise and sparsity, which makes this feature unfit for use in this project.

• 3D interest point detectors

  There is the idea to use 3D interest point detectors and their respective descriptors in order to extract local features directly from the 3D point cloud. Examples of such detectors are Harris 3D [Sipiran & Bustos, 2011], 3D SIFT [Scovanner et al., 2007], and the USIP detector, which makes use of deep learning [Li & Lee, 2019]. This idea was discarded because the often very irregular shape of the windows from the MLS point clouds does not correspond well to the very regular 3D shape of the windows sampled from the CAD models. This means that the extracted keypoints would, in most cases, probably not correspond to the keypoints found in the model. The following graphic illustrates this problem:

          Introductional_Graphic
          Figure 12: Window from MLS point cloud and window sampled from CAD
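For reference, the discarded contour-counting feature mentioned above amounts to a few OpenCV calls; the input file name is a placeholder:

```python
import cv2

img = cv2.imread("binary_window.png", cv2.IMREAD_GRAYSCALE)  # hypothetical input
contours, _ = cv2.findContours(img, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
n_contours = len(contours)  # in theory higher for windows with window bars
```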

        Experiments

        Small Scale Experiments

• The goal of the first experiment is to identify any grave implementation mistakes. To assess this, the inference is done on exactly the same point clouds that are used to create the codebook. In theory, the same features should be extracted from all of these point clouds, the resulting matching distances should always be zero, and the matching should be 100% correct. Several experiments with different combinations of features and different histogram distances have shown that this is exactly the case.
• The second set of experiments is conducted in order to assess the performance of different features in the presence of noise. Different combinations of features are used and their performance is measured. For the inference, the point clouds that are used to generate the codebook are overlaid with random noise of various strengths. The following graphic schematically describes the experiments:

          Introductional_Graphic
          Figure 13: Schema of the second experiment


As test windows, the same windows as in Figure 3 are used. In the test dataset, consisting of the windows with added noise, each of these windows appears 4 times, leading to a total number of 32 test windows. The results of these tests are summarized in the following table:
| Feature Combination | Noise Level | Histogram Distance | Performance |
| --- | --- | --- | --- |
| Only local features: ORB | 0.2 | Chi-square distance | Overall accuracy: 0.469; kappa coefficient: 0.356 (~ "weak") |
| Only (semi-)global features: HOG | 0.2 | Chi-square distance | Overall accuracy: 0.750; kappa coefficient: 0.714 (~ "good") |
| ORB, HOG | 0.2 | Chi-square distance | Overall accuracy: 0.688; kappa coefficient: 0.646 (~ "good") |
| ORB, HOG | 0.4 | Chi-square distance | Overall accuracy: 0.500; kappa coefficient: 0.428 (~ "moderate") |
| ORB, HOG, Hu moments | 0.2 | Chi-square distance | Overall accuracy: 0.781; kappa coefficient: 0.750 (~ "good") |
| ORB, HOG, Hu moments | 0.4 | Chi-square distance | Overall accuracy: 0.375; kappa coefficient: 0.278 (~ "weak") |

        Discussion of the Small Scale Experiments

• The first observation is that the quality of the matching decreases rapidly with the noise level: the experiments conducted at the higher noise level generally show a lower overall accuracy.


• Apparently, the matching is generally more stable and better for some windows; for others, there are large differences in the matching itself as well as in its quality. This observation is strengthened by examining the user's and producer's accuracies of the individual window types. The following two tables list the user's and producer's accuracies for the individual window types. Each row of a table represents a single experiment; the order of the experiments is the same as in the table above.

          Introductional_Graphic




          Introductional_Graphic


This table shows that the arched window with no bars and the two octagon-shaped windows are matched in the most stable manner compared to the other window types. In general, the rectangular and quadratic windows show poorer user's and producer's accuracies in these experiments. A look into the variances and standard deviations of the user's and producer's accuracies, listed in the table below, strengthens this observation.

          Introductional_Graphic


The reason for this might be that the windows that are detected in a more stable way (the arched window with no cross and the octagon-shaped windows) have properties that can be described more precisely by the used features than the other windows. The windows for which the performance is weaker probably have properties that the used features describe less precisely; the descriptions of these windows will therefore be very similar, which can lead to wrong classifications in the presence of noise.


• The performance of the local features (ORB) on their own is very poor. If the local features are augmented by incorporating the (semi-)global HOG features into the process, the overall performance increases.

        Experiments on the TUM Facade Dataset

The next set of experiments is performed on a set of 42 windows that are extracted from the TUM-Façade dataset [Wysocki et al., 2022]. In this dataset, only large rectangular and arched windows occur.

| Feature Combination | Histogram Distance | Dataset Type | Performance |
| --- | --- | --- | --- |
| HOG | Jensen-Shannon divergence | no molding | Overall accuracy: 0.36 |
| ORB, HOG | Minkowski distance | no molding | Overall accuracy: 0.405 |
| ORB, HOG (Douglas-Peucker tolerance set from 10 to 7) | Jensen-Shannon divergence | no molding | Overall accuracy: 0.52 |
| ORB, HOG | Jensen-Shannon divergence | no molding | Overall accuracy: 0.57 |
| ORB, HOG | Jensen-Shannon divergence | with molding | Overall accuracy: 0.57 |
| ORB, HOG, squareness of the 2D projected point cloud, circularity of the 2D projected point cloud, perimeter ratio of 2D convex hull and 2D bounding box | Jensen-Shannon divergence | with molding | Overall accuracy: 0.524 |



It has to be mentioned that an experiment in which only local features (ORB) were used is not included in this list, because the performance was very poor: the use of different histogram distances led to strong biases towards certain window types.

        Discussion of the Experiments on the TUM Facade Dataset

• The quality is strongly dependent on:
  • The choice of features. The best results are achieved by using a combination of local features and (semi-)global HOG features.
  • The choice of histogram distance. The best results are achieved by using the Jensen-Shannon divergence.
  • The filter radius and number of neighbours. In this project, the filter radius used during the outlier removal is set to 0.5; the number of neighbouring points is set to 16.
  • The Douglas-Peucker tolerance. The best results were achieved with a tolerance of 15.
  • The number of k-means clusters. In this project, 25 clusters are used in all of the experiments.
  • The dilation kernel size. The best results were achieved with a kernel size of 20 pixels.
  • The vector quantization method. In this project, only the Minkowski distance is used for vector quantization; the parameter p is set to 2.


• An investigation of the positions of the windows in the facade reveals that, in the experiment with the best performance, the falsely identified windows are not distributed randomly. The following graphic shows that most of the falsely classified windows are located on the second floor of the building:

          Introductional_Graphic
          Figure 14: Correctly and falsely matched windows in the facade


The reason for this might be that the point density of the MLS point cloud is lower in the upper parts of the facade due to the larger distance from the sensor. This, in combination with presumably unfavorable parameter settings during the outlier removal, might lead to the loss of structures that are important for the correct identification of the window. The following graphic supports this assumption; it shows the binary image created from the 4th window from the right on the second floor of the building:



          Figure 15: Sparse window from the second floor




        Outlook

There are many ways in which this project can be extended, improved, and further developed. Future work could focus on various aspects.

There is the possibility of performing experiments on larger sets of data. This way, deeper insight into the behaviour of the approach under a larger variation of circumstances could be gained. This project focuses on the concept itself, and experiments are only conducted on a relatively small scale; therefore, the reliability of the insight gained by analyzing the different experiments is limited. Reliable statistical data would be one of the foundations of a further development of the approach introduced in this project, and future work could help to set this foundation.

Another objective could be to minimize the influence of noise and sparsity of the MLS point clouds. The quality of the resulting matching is, as the experiments conducted in this project show, largely dependent on the chosen parameters. Future work could also aim at optimizing the choice of parameters in order to maximize the matching quality with respect to these two disturbing factors. A way to achieve this could be to focus on the "scene gist", in other words on more high-level information, rather than on very detailed low-level information.

        A further possibility to increase the performance of the approach could be to use a dense grid of keypoints in order to extract local features instead of using interest point detectors like ORB or SIFT.

Right now, the whole approach is implemented in a very experimental way. Future work could also aim at making the concept more suitable for everyday use. This might involve an implementation in a lower-level programming language to increase the runtime performance of the whole process. The construction of a sophisticated user interface could also contribute to the further development of this approach.

There is also the possibility of altering this approach, or of using a related approach instead. For example, the approach could be modified as described in the following diagram:


Figure 16: Schematic description of an altered version of the approach


This approach could be used to reduce computation time for larger datasets and databases. The approach used in this project has the disadvantage that a similarity comparison has to be made between each of the N windows in the dataset and each of the M windows in the CAD library (N × M histogram distance calculations), so the computational effort becomes very large for large datasets and databases. The number of histogram distance calculations in the approach introduced in the previous graphic is just L × M, with L being the number of window types in the dataset. This number will generally be smaller, because L < N holds for any sensible clustering. In the worst case, where there are no identical windows in the dataset, at most L × M = N × M histogram distance calculations are necessary; in the best case, when there is just one type of window in the dataset, only M histogram distance calculations (one per model) are necessary.

        List of Figures

        • Figure 1: General overview, based on [Memon et al., 2019]
        • Figure 2: Schematic diagram of the used methodology
        • Figure 3: Incorporation of global features
        • Figure 4: Edited CAD window models
        • Figure 5: Sampled CAD Model
• Figure 6: Projected images from all 3 different views
        • Figure 7: Dilated Image
        • Figure 8: Laplacian Image
        • Figure 9: Application of the Douglas-Peucker Algorithm
        • Figure 10: Extraction of meaningful interest points
        • Figure 11: Selection of templates
        • Figure 12: Window from MLS point cloud and window sampled from CAD
        • Figure 13: Schema of the second experiment
• Figure 14: Correctly and falsely matched windows in the facade
        • Figure 15: Sparse window from the second floor
• Figure 16: Schematic description of an altered version of the approach

        References

        • Bronstein AM, Bronstein MM, Guibas LJ, Ovsjanikov M (2011) Shape google: geometric words and expressions for invariant shape retrieval. ACM Transactions on Graphics 30(1): 1-20
        • Cha SH (2008) Taxonomy of nominal type histogram distance measures. In: Long C, Sohrab SH (eds). AMERICAN CONFERENCE ON APPLIED MATHEMATICS (MATH '08), Harvard, Massachusetts: 325-330
        • Csurka G, Dance CR, Fan L, Willamowski J, Bray C (2004) Visual categorization with bags of keypoints. Workshop on statistical learning in computer vision, ECCV. 1(1-22): 1-2
• Douglas DH, Peucker TK (1973) Algorithms for the reduction of the number of points required to represent a digitized line or its caricature. Cartographica: the international journal for geographic information and geovisualization 10(2): 112-122
• Fuglede B, Topsøe F (2004) Jensen-Shannon divergence and Hilbert space embedding. In: International Symposium on Information Theory, 2004, Chicago, IL, USA. IEEE: 1-6
        • Hu MK (1962) Visual pattern recognition by moment invariants. IRE Transactions on Information Theory 8(2): 179-187
        • Joyce JM (2011) Kullback-Leibler Divergence, International Encyclopedia of Statistical Science. Springer, Berlin, Heidelberg: 720-722
        • Kang Z, Yang J (2018) A probabilistic graphical model for the classification of mobile LiDAR point clouds. ISPRS Journal of Photogrammetry and Remote Sensing 143 (2018): 108-123
• Labetski A, Vitalis S, Biljecki F, Ohori KA, Stoter J (2022) 3D building metrics for urban morphology. International Journal of Geographical Information Science 37(1): 36-67
        • Martens J, Blankenbach J (2020) An evaluation of pose-normalization algorithms for point clouds introducing a novel histogram-based approach. Advanced Engineering Informatics 46(2020): 101132
• Memon SA, Mahmood T, Akhtar F, Azeem M, Shaukat Z (2019) 3D shape retrieval using bag of words approaches. In: 2019 2nd International Conference on Computing, Mathematics and Engineering Technologies (iCoMET). Sukkur, Pakistan: IEEE: 1-7
        • Li J, Lee GH (2019) USIP: unsupervised stable interest point detection from 3d point clouds. In: Lee KM, Forsyth T, Pollefeys M, Tang X (eds). Seoul, Korea: IEEE: 361-370
• Lian Z, Godil A, Sun X (2010) Visual similarity based 3D shape retrieval using bag-of-features. In: Pernot JS, Spagnuolo M, Falcidieno B, Veron P (eds) 2010 Shape Modeling International Conference. Los Alamitos: IEEE: 25-36
        • Salton G, McGill MJ (1986) Introduction to Modern Information Retrieval. New York, NY, USA: McGraw-Hill, Inc.
        • Scovanner P, Saad A, Saha M (2007) A 3-dimensional sift descriptor and its application to action recognition. In Lienhart R, Prasad AR (eds) 15th ACM international conference on Multimedia. Augsburg, Germany. Association for Computing Machinery, New York, NY, United States: 631-640
        • Sipiran I, Bustos B (2011) Harris 3D: a robust extension of the Harris operator for interest point detection on 3D meshes. The Visual Computer 27(2011): 963–976
        • Trimble Inc (2021) SketchUp 3D Warehouse. https://3dwarehouse.sketchup.com/ (4 February 2021)
        • Wysocki O, Hoegner L, Stilla U (2022) TUM-Façade: Reviewing and Enriching Point Cloud Benchmarks for Façade Segmentation. Int. Arch. Photogramm. Remote Sens. Spatial Inform. Sci. XLVI-2/W1-2022, 529–536
        • Zhu Q, Zhong Y, Zhao B, Xia GS, Zhang L (2016) Bag-of-visual-words scene classifier with local features for high spatial resolution remote sensing imagery. IEEE Geoscience and Remote Sensing Letters 13(3): 747-751
