
This is the CANUPO project (CAractérisation de NUages de POints)

You'll find here a software suite for processing 3D point clouds, such as
those captured by LiDAR systems. The goal is to automatically recognise
various elements in the scene, like rocks, sand and vegetation. This is
performed using a multi-scale dimensionality criterion which characterises
the geometric properties of these elements in the scene. Each class can then
be separated using a graphically defined classifier that can be edited (if
necessary) very easily by non-specialists of machine learning.

To make this clearer, consider a scene comprising rocks, sand, and vegetation
patches. At a small scale the sand looks like a 2D surface, the rocks look 3D,
and the vegetation is a mixture of small elements like stems and leaves (mostly
1D and 2D). At a larger scale the sand still looks 2D, the rocks now look more
2D than 3D, and the vegetation has become more like a 3D bush. By combining
information from different scales we can thus build a signature of the scene at
each point. This signature can then be used, for example, to discriminate
vegetation from soil.
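
As an illustration of the idea (a sketch only, not CANUPO's actual code; the
exact criterion is defined in the article below), the dimensionality of a
point's neighbourhood at one scale can be derived from the eigenvalues of the
local covariance matrix:

```python
# Sketch: 1D/2D/3D proportions of a neighbourhood at a given scale, from the
# sorted eigenvalues of its covariance. Proportions near (1,0,0) mean a line,
# (0,1,0) a plane, (0,0,1) a fully 3D volume. Illustrative only; CANUPO's
# exact mapping is described in the Brodu & Lague article.
import numpy as np

def dimensionality(points, center, scale):
    """Return (p1d, p2d, p3d) for the neighbourhood of `center` at `scale`."""
    d = np.linalg.norm(points - center, axis=1)
    neigh = points[d <= scale]                  # ball neighbourhood
    # eigenvalues of the 3x3 covariance, sorted in decreasing order
    l1, l2, l3 = sorted(np.linalg.eigvalsh(np.cov(neigh.T)), reverse=True)
    s = l1 + l2 + l3
    # barycentric-style proportions; they sum to 1 by construction
    return (l1 - l2) / s, 2 * (l2 - l3) / s, 3 * l3 / s
```

Computed at several scales, these triplets stacked together form the kind of
multi-scale signature discussed above.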

The full technique is described in our article "3D Terrestrial LiDAR data
classification of complex natural scenes using a multi-scale dimensionality
criterion: applications in geomorphology", by Nicolas Brodu and Dimitri Lague.
That article is available on the first author's web page as well as on arXiv:
    http://arxiv.org/abs/1107.0550

The file you are reading comes with the source code and with the binary
distribution. It describes how to use the software suite. Both source and
binaries are available at the project home page:
    http://nicolas.brodu.numerimoire.net/en/recherche/canupo/


==== Usage ====

- When you don't know what a program does, just run it in a terminal. It will
  print to standard output what it does and what arguments it expects.

- Using 3D point cloud editing software (tip: CloudCompare is free and quite
  efficient), prepare at least one sample of each class you wish to recognise
  in the scene. Example: select a vegetation bush and some portion of soil.
  Save these samples in separate files.

- Start by running "canupo" on the full data set and the samples. Give it
  a set of scales to look at, which you think discriminate your samples (see
  the introduction above). It will generate the multi-scale files.

- Run "suggest_classifier_lda" to separate samples from two distinct
  classes. You may optionally add in the full scene for semi-supervised
  learning (but start with just the class samples).
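
For reference, the classic two-class Fisher LDA direction underlying this
kind of tool can be sketched as follows (illustrative only;
"suggest_classifier_lda" works on the multi-scale files directly and writes
an SVG):

```python
# Sketch of the textbook two-class Fisher LDA direction: w = Sw^-1 (m1 - m2),
# where Sw is the within-class scatter. Projecting samples onto w maximises
# the between-class separation relative to the within-class spread.
# Illustrative only, not the suggest_classifier_lda implementation.
import numpy as np

def lda_direction(a, b):
    """a, b: (n_samples, n_features) arrays of the two classes' signatures.
    Returns the unit projection direction separating them best."""
    sw = np.cov(a.T) + np.cov(b.T)                    # within-class scatter
    w = np.linalg.solve(sw, a.mean(0) - b.mean(0))    # Sw^-1 (m1 - m2)
    return w / np.linalg.norm(w)
```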

- Review the generated SVG file with a graphical editor like Inkscape. You
  may optionally edit this classifier definition file: in that case,
  move/add/remove nodes in the class separation path. You may use as many
  nodes as you wish, so long as there is only one path comprising only
  straight lines. But you may also simply skip this step and use the default
  classifier.
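
How such a path acts as a classifier can be sketched as follows (assumed
semantics, for illustration: a point in the 2D projection plane is assigned
to one class or the other depending on which side of the polyline it falls):

```python
# Sketch: decide which side of a polyline a 2D point falls on, using the
# segment closest to the point. Assumed semantics for illustration only.
import math

def side_of_polyline(poly, p):
    """poly: list of (x, y) vertices; p: an (x, y) point. Returns +1 or -1."""
    best = None
    for (x1, y1), (x2, y2) in zip(poly, poly[1:]):
        dx, dy = x2 - x1, y2 - y1
        # parameter of the orthogonal projection of p, clamped to the segment
        t = max(0.0, min(1.0, ((p[0]-x1)*dx + (p[1]-y1)*dy) / (dx*dx + dy*dy)))
        qx, qy = x1 + t*dx, y1 + t*dy
        dist = math.hypot(p[0] - qx, p[1] - qy)
        # sign of the cross product = side of this segment's supporting line
        cross = dx*(p[1]-y1) - dy*(p[0]-x1)
        if best is None or dist < best[0]:
            best = (dist, 1 if cross >= 0 else -1)
    return best[1]
```

Moving the nodes in the SVG thus directly reshapes the decision boundary.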

- Run "validate_classifier" on the SVG file. It will produce a binary
  parameter file containing the classifier in condensed form. Optionally,
  feed the validate_classifier program your sample multi-scale files (see
  step 3). It will then report the performance of the classifier at
  separating the samples. Loop back to the previous step if you think you can
  improve it...

- Finally, run "classify" on the whole scene to automatically label each
  point with the class corresponding to your samples. You get an extra column
  in the xyz point cloud indicating which class each point belongs to. Load
  this file, for example, in CloudCompare and use the extra column as a
  scalar field so that each class appears with a distinct colour.
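
If you prefer to post-process the result yourself, here is a minimal sketch
that reads such a classified cloud, assuming one whitespace-separated
"x y z class" record per line (the extra class column described above):

```python
# Sketch: split a classified xyz cloud into one point list per class label.
# Assumes whitespace-separated "x y z class" records; adapt if your export
# carries additional columns.
from collections import defaultdict

def split_by_class(lines):
    classes = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) < 4:
            continue                       # skip blank or malformed lines
        x, y, z = map(float, parts[:3])
        classes[int(float(parts[3]))].append((x, y, z))
    return classes
```

Each resulting list can then be written back to its own xyz file.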


==== Advanced Usage ====

- The "canupo" program can read the list of scales from a previously
  generated .prm classifier parameter file. This is especially handy for
  processing a new scene with a classifier that is known to work well on
  similar scenes.

- Use the "density" program to investigate what the samples and the scene
  look like in the dimensionality space at various scales. This may help you
  select scales offering discriminative power, and ignore scales where the
  classes are too similar. Note that the goal of the multi-scale feature is
  to _combine_ the discriminative information at each scale, so usually the
  more scales the better. Up to some limit where you add more noise than
  information, of course: a few well-selected scales work better than a large
  range of scales carrying similar information.
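
A crude way to gauge the discriminative power of a single scale, once you
extract one feature value per sample, is the one-dimensional Fisher ratio
(illustrative only; the "density" tool shows the full dimensionality
distributions instead):

```python
# Sketch: 1-D Fisher ratio between two classes' feature values at one scale:
# squared distance between class means over the sum of class variances.
# Higher means more discriminative. Illustrative only.
def fisher_ratio(a, b):
    """a, b: lists of one feature value per sample, for the two classes."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((x - ma) ** 2 for x in a) / len(a)
    vb = sum((x - mb) ** 2 for x in b) / len(b)
    return (ma - mb) ** 2 / (va + vb)
```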

- Use "msc_tool" to project a scene onto the plane of maximal separability
  defined by a classifier. This will generate another SVG that you can edit
  (and revalidate into another classifier!). You get a density map with all
  points, which is sometimes more informative than the two-class SVG file
  generated by "suggest_classifier_xxx".

- See "validate_classifier" and "combine_classifiers" for scenarios with
  more than two classes. Note: if you can do as we do in the article, i.e.
  extract classes one by one, each against the others, then do so. Educated
  guesses like that tend to work better than the majority-vote technique
  performed when using "combine_classifiers". Well, just try and see...
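
The majority-vote idea mentioned above can be sketched as follows
(hypothetical helper, not the actual "combine_classifiers" code):

```python
# Sketch of majority voting: each binary classifier votes for one of its two
# classes; the class with the most votes wins (ties broken arbitrarily here).
# Hypothetical helper for illustration, not combine_classifiers itself.
from collections import Counter

def majority_vote(votes):
    """votes: list of class labels, one per binary classifier."""
    return Counter(votes).most_common(1)[0][0]
```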

- Play with the SVM classifier. The "=N" parameter increases the chances of
  getting a better separation, up to some value of N beyond which it no
  longer matters. This is _slower_ than LDA, so be prepared to wait some time
  (longer for larger N). It is sometimes, but seldom, better than LDA
  (usually not worth the wait).

- You may find the other uses of "msc_tool" handy when dealing with a large
  number of scenes and multiple multi-scale parameters. It can identify which
  scales are present in a multi-scale file, as well as convert these to plain
  xyz files (warning: large files!). Similarly, you may find the "filter"
  utility occasionally useful for splitting a scene into classified elements.


Nicolas Brodu <nicolas.brodu@numerimoire.net>
May 2012