Miscellaneous

This repository contains various scripts and files used in the eEcoLiDAR project. Most of them are used only once, so maintaining them is not a high priority.

Virtual Machines

The VMs can be managed via https://ui.hpccloud.surfsara.nl/.

Server Scripts

The AHN3 data is downloaded and re-gridded so that each tile is small enough to work with. The re-gridded tiles are then normalised and their features are extracted.

General Scripts

Once all VMs are up and running, the data storage can be mounted. All data live on a WebDAV server. Small amounts of data can temporarily be stored locally, but should eventually be copied onto the WebDAV server. To mount the WebDAV storage under /data/, run the mount_all.sh script. Before the VMs are shut down, the unmount_all.sh script unmounts this storage.
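As an illustration of the idea behind mount_all.sh, a single mount might look roughly like the sketch below, assuming the share is mounted with davfs2; the URL and mount point are placeholders, not the project's actual values.

```bash
#!/bin/bash
# Sketch of mounting the WebDAV storage on one VM with davfs2.
# The URL is a placeholder; credentials would normally come from
# /etc/davfs2/secrets so that no password prompt appears.
WEBDAV_URL="https://webdav.example.surfsara.nl/eecolidar"   # hypothetical
MOUNT_POINT="/data"

sudo mkdir -p "${MOUNT_POINT}"
sudo mount -t davfs "${WEBDAV_URL}" "${MOUNT_POINT}"

# Before shutting the VM down (cf. unmount_all.sh):
# sudo umount "${MOUNT_POINT}"
```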

During the normalisation and feature extraction, Server0 functions as the master, while the other VMs function as slaves. To allow Server0 to communicate with the other VMs, it must have SSH access to them. The copy_temp_key.sh script copies your local private key to the /tmp/ directory of Server0. This must be repeated every time the VM is shut down and rebooted.
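The essence of copy_temp_key.sh is a single scp; the host name, user and key path below are hypothetical.

```bash
#!/bin/bash
# Sketch of copying a local private key to Server0's /tmp/ directory so
# that Server0 can SSH into the worker VMs. Names are illustrative.
KEY="${HOME}/.ssh/id_rsa"
SERVER0="ubuntu@server0.example"   # placeholder address

scp "${KEY}" "${SERVER0}:/tmp/"
# On Server0 the key can then be used explicitly, e.g.
# ssh -i /tmp/id_rsa ubuntu@server1.example
```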

Normalisation Scripts

Normalising a point cloud means subtracting the height of the ground level from the height of each point, so that the z-coordinate of each point represents the height above ground. This can be done using the PDAL library; its documentation includes more information.
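For a single tile, a PDAL invocation along the following lines performs the normalisation; the exact stage names depend on the PDAL version (filters.hag was later replaced by filters.hag_nn) and the file names are placeholders.

```bash
#!/bin/bash
# Sketch of normalising one tile with PDAL: compute HeightAboveGround with
# the hag filter and copy it into the Z dimension with the ferry filter.
IN="tile_123.laz"                  # placeholder input tile
OUT="tile_123_normalised.laz"

pdal translate "${IN}" "${OUT}" hag ferry \
    --filters.ferry.dimensions="HeightAboveGround=Z"
# Newer PDAL releases use hag_nn and the "HeightAboveGround=>Z" syntax.
```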

The normalise_copy_files.sh script first copies the other scripts onto the relevant VM. Note that the VMs must already contain the relevant directories, otherwise the copy will not succeed. The normalise_run_all.sh script can be executed from within Server0 and starts the normalisation procedure on all VMs. Normally, when you run a Unix job in the background and log out of the session, the process is killed. Our script avoids this by using nohup, so it is safe to log out of Server0 while the normalisation is running.
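The pattern used to keep a job alive after logout is the usual nohup idiom; the script instance and log file name below are illustrative.

```bash
#!/bin/bash
# Start a per-VM normalisation job in the background and detach it from the
# login session, so logging out does not kill it.
nohup ./normalise_run_tiles_0.sh > normalise_0.log 2>&1 &
echo "normalisation started with PID $!"
# Progress can be checked later with: tail -f normalise_0.log
```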

On each VM, the normalise_run_tiles_x.sh script starts the normalisation job. It uses the pdal hag command to compute the height above ground. As the list of all tiles is large, xargs is used: it reads the list of tiles and executes the same job for each entry. xargs also lets the user specify how many jobs each server runs simultaneously; given the memory of each VM and the size of the input files, the normalisation script runs only two jobs at a time. The list of tiles is located in the normalise_tiles_x.sh file.
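The gist of that xargs pattern, with illustrative paths and tile-list name rather than the exact contents of normalise_run_tiles_x.sh, is:

```bash
#!/bin/bash
# Sketch of the xargs pattern: read tile names from a list and run at most
# two normalisation jobs in parallel (-P 2). Paths are illustrative.
cat tiles_for_this_vm.txt | xargs -P 2 -I {} \
    pdal translate "/data/ahn3/{}.laz" "/local/normalised/{}.laz" hag ferry \
        --filters.ferry.dimensions="HeightAboveGround=Z"
```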

The normalise_get_lists.sh script can be run locally and downloads a list of all processed tiles from each server. This can be used in the List Analysis Jupyter Notebook, discussed below, to see what each VM has done so far.

Following the List Analysis, it may turn out that some tiles were not processed. These should be investigated manually and, if necessary, re-done. The redo scripts make this easy.

Once all tiles have been processed, the output can be moved from the VMs to the WebDAV server. The normalise_move_all.sh script can be run from within Server0 and executes this move.

Feature Scripts

The feature extraction scripts follow a similar logic to the normalisation scripts. In particular, the feature_copy_files.sh script copies the other scripts to the relevant VM. The feature_run_all.sh script can be used from Server0 to start the feature extraction. The feature_run_tiles_x.sh scripts include the command which executes the feature extraction; this uses the computefea_wtargets_cell.py Python script to call the LaserChicken module, which calculates the features. The feature_tiles_x.sh files contain the list of tiles per VM. The feature_move_all.sh script moves the output .ply files to the WebDAV server. There are also redo scripts, similar to the normalisation procedure. The feature_get_lists.sh script downloads the list of tiles which have already been processed. Finally, the feature_get_files.sh script downloads the output files to the local environment, so that they can be further processed by the Data Conversion Jupyter Notebook.
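A simplified view of the per-VM feature loop is sketched below; the command-line arguments of computefea_wtargets_cell.py are not documented here, so the ones shown are hypothetical.

```bash
#!/bin/bash
# Sketch of launching the feature extraction: hand each normalised tile to
# the Python script that drives LaserChicken, two tiles at a time.
# Paths, tile-list name and script arguments are hypothetical.
cat feature_tiles_for_this_vm.txt | xargs -P 2 -I {} \
    python computefea_wtargets_cell.py \
        "/local/normalised/{}.laz" "/local/features/{}.ply"
```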

Jupyter Notebooks

There are three Jupyter Notebooks which help analyse and convert the data.

List Analysis

The get_lists.sh scripts copy the lists of processed tiles to the data directory discussed below. From there, ListAnalysis.ipynb reads the lists, together with the tiles_list.txt file, which contains the list of all tiles. It then compares the actual output with the expected output and shows how many tiles have already been processed and which ones still need to be processed.
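The comparison essentially amounts to a set difference. A minimal sketch of the idea, with hypothetical file names for the downloaded lists, is:

```python
# Sketch of the comparison made in ListAnalysis.ipynb: which tiles from the
# full list do not yet appear in any per-VM "processed" list?
# The paths and glob pattern for the downloaded lists are hypothetical.
from pathlib import Path

all_tiles = set(Path("data/Lists/tiles_list.txt").read_text().split())

processed = set()
for lst in Path("data/Lists").glob("normalise_processed_*.txt"):
    processed |= set(lst.read_text().split())

remaining = all_tiles - processed
print(f"{len(processed)} tiles processed, {len(remaining)} still to do")
```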

Data Conversion

The output of LaserChicken consists of .ply files, one for each input tile. Each file contains a number of points, depending on the chosen resolution, with the computed features stored as attributes. The DataConversion.ipynb reads these .ply files using the plyfile module, joins them into one dataset, and converts this to a compressed NumPy array and a GeoTIFF.
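The reading and stacking step could look roughly like the sketch below; the directory, the PLY element name and the archive key are assumptions, and the GeoTIFF step is only hinted at.

```python
# Sketch of the first steps in DataConversion.ipynb: read the per-tile .ply
# files with plyfile, stack them into one table, and save a compressed
# NumPy archive. Paths, element and key names are assumptions.
from pathlib import Path

import numpy as np
from plyfile import PlyData

rows = []
for ply_path in sorted(Path("data/features_10m").glob("*.ply")):   # hypothetical directory
    vertex = PlyData.read(str(ply_path))["vertex"]                  # points assumed stored as 'vertex'
    rows.append(np.stack([vertex[name] for name in vertex.data.dtype.names], axis=1))

table = np.concatenate(rows, axis=0)          # one row per point, one column per attribute
np.savez_compressed("data/terrain_features.npz", features=table)

# Writing the GeoTIFF would follow by gridding the x/y columns onto a
# regular raster, e.g. with rasterio or GDAL (not shown here).
```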

Data Verification

To verify that the features are correctly extracted, the DataVerification.ipynb reads the .npz version of the data and prints various details for each feature. These are then compared with the expected values for each feature.
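A minimal sketch of such a check, assuming the features were stored under a single key in the .npz archive, is:

```python
# Sketch of the checks in DataVerification.ipynb: load the compressed array
# and print a few summary statistics per feature column.
# The file name and archive key are assumptions.
import numpy as np

with np.load("data/terrain_features.npz") as archive:
    table = archive["features"]                 # assumed shape: (n_points, n_features)

for column in range(table.shape[1]):
    values = table[:, column]
    print(f"feature {column}: min={np.nanmin(values):.3f}  "
          f"max={np.nanmax(values):.3f}  mean={np.nanmean(values):.3f}")
```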

Data

The data directory of this repository contains various output files related to the normalisation and feature extraction procedures.

Lists

This directory contains the tiles_list.txt file, which is a list of all tiles. It also contains all lists downloaded by the get_lists.sh scripts of the normalisation and feature extraction procedures.

AHN3 Feature Data

The feature_get_files.sh script downloads all .ply files from the various VMs and places them in the data directory, in a separate directory per resolution.

Terrain Data

The Data Conversion Jupyter Notebook places the compressed NumPy array and the GeoTIFF with the terrain features in the data directory.