# Using the Point Data Abstraction Library (PDAL) 

Least importantly, [PDAL](https://pdal.io) is pronounced 'Poo-dal' by its developers at [Hobu Inc.](https://hobu.co/), in homage to the creator of [GDAL](http://www.gdal.org/) (who pronounces it 'Goo-dal').

Most importantly, PDAL is a utilitarian software library for handling point cloud data, and it is fast, scalable, and free!

## Step 1: Get PDAL

We are going to use Docker because we are working on potentially dozens of remote machines with various software stacks. Docker levels the playing field and allows us to install software without worrying about it breaking. If we are working on an HPC system, we can use Singularity to convert the Docker containers.
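
As a rough sketch of the Singularity route (it is not run anywhere in this notebook), the same image can be pulled straight from Docker Hub with `singularity pull`. The guard is only there so the snippet is safe on machines without Singularity, and the name and format of the image file it writes depend on the Singularity version installed.

```shell
#!/usr/bin/env bash
# Sketch: convert the PDAL Docker image to a Singularity image on an HPC node.
# Assumption: Singularity is on the PATH and the node can reach Docker Hub.
image="docker://pdal/pdal:latest"

if command -v singularity >/dev/null 2>&1; then
    # Downloads the Docker layers and writes a local Singularity image file.
    singularity pull "$image"
else
    echo "singularity not found; would run: singularity pull $image"
fi
```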

### Docker

If you haven't yet, install Docker. 

Use Docker to pull the [PDAL Docker](https://hub.docker.com/r/pdal/pdal/) container from [Docker Hub](https://hub.docker.com/):

In [1]:
docker pull pdal/pdal:latest

latest: Pulling from pdal/pdal

Digest: sha256:5ad5c2bb66d0d7f8e1225fa659982af05d3d1880de3930360712af03a650c477
Status: Downloaded newer image for pdal/pdal:latest


## Step 2: Get Data

PDAL is now installed, but before we can run any analyses we need to pull some data onto this virtual instance.

From a console, initiate the iRODS connection using the `iinit` command and answer the prompts as follows:

|field|response|
|-----|--------|
|host name (DNS)|data.cyverse.org|
|port number|1247|
|user name|$USER|
|irods zone|iplant|
|irods password|type password|

Now you're ready to pull data from the CyVerse data store onto the virtual instance.

Create a new working directory for the lidar data: 

Pull the 2017 SRER classified lidar data from Tyson's data store using the `iget` command:

In [16]:
iget -KPQbrvf /iplant/home/shared/srer/DiscreteLidar/ /home/tswetnam/

D- /home/tswetnam/QUBES_NEON/lessons/srer_laz/classified :
0/339 -  0.00% of files done   0.000/2881.635 MB -  0.00% of file sizes done
Processing NEON_D14_SRER_DP1_501000_3518000_classified_point_cloud.laz - 1.540 MB   2018-04-09.15:20:55
   NEON_D14_SRER_DP1_501000_       1.540 MB | 1.565 sec | 0 thr |  0.984 MB/s
1/339 -  0.29% of files done   1.540/2881.635 MB -  0.05% of file sizes done
Processing NEON_D14_SRER_DP1_501000_3519000_classified_point_cloud.laz - 3.009 MB   2018-04-09.15:20:56
   NEON_D14_SRER_DP1_501000_       3.009 MB | 1.621 sec | 0 thr |  1.857 MB/s
2/339 -  0.59% of files done   4.549/2881.635 MB -  0.16% of file sizes done
Processing NEON_D14_SRER_DP1_501000_3520000_classified_point_cloud.laz - 3.045 MB   2018-04-09.15:20:58
   NEON_D14_SRER_DP1_501000_       3.045 MB | 1.552 sec | 0 thr |  1.962 MB/s
3/339 -  0.88% of files done   7.594/2881.635 MB -  0.26% of file sizes done
Processing NEON_D14_SRER_DP1_501000_3521000_classified_point_cloud.laz - 3.250 MB   201

Use `pdal info` with the `-p 1` option to print the attributes of a single point from one tile:

In [1]:
docker run -v /home/tswetnam/DiscreteLidar/ClassifiedPointCloud:/data pdal/pdal pdal info data/NEON_D14_SRER_DP1_510000_3514000_classified_point_cloud.laz -p 1 

{
  "filename": "data\/NEON_D14_SRER_DP1_510000_3514000_classified_point_cloud.laz",
  "pdal_version": "1.7.1 (git-version: d6902f)",
  "points":
  {
    "point":
    {
      "Classification": 5,
      "EdgeOfFlightLine": 0,
      "GpsTime": 236624.4349,
      "Intensity": 3,
      "NumberOfReturns": 2,
      "PointId": 1,
      "PointSourceId": 62,
      "ReturnNumber": 1,
      "ScanAngleRank": -19,
      "ScanDirectionFlag": 0,
      "UserData": 28,
      "X": 510999.99,
      "Y": 3514073.84,
      "Z": 1215.74
    }
  }
}


Find the boundary of the tile with the `--boundary` option:

In [2]:
docker run -v /home/tswetnam/DiscreteLidar/ClassifiedPointCloud:/data pdal/pdal pdal info data/NEON_D14_SRER_DP1_510000_3514000_classified_point_cloud.laz --boundary

{
  "boundary":
  {
    "area": 1041955.095,
    "avg_pt_per_sq_unit": 1.035100902,
    "avg_pt_spacing": 0.5861495222,
    "boundary": "MULTIPOLYGON (((510063.05217097 3513978.30078804, 510083.09340795 3513995.65700840, 510113.15526343 3513978.30078804, 510213.36144835 3513978.30078804, 510233.40268534 3513995.65700840, 510263.46454081 3513978.30078804, 510293.52639629 3513995.65700840, 510393.73258121 3513995.65700840, 510413.77381820 3513978.30078804, 510443.83567367 3513995.65700840, 510473.89752915 3513978.30078804, 510503.95938463 3513995.65700840, 510534.02124011 3513978.30078804, 510564.08309558 3513995.65700840, 510594.14495106 3513978.30078804, 510664.28928051 3513978.30078804, 510684.33051749 3513995.65700840, 510714.39237297 3513978.30078804, 510744.45422845 3513995.65700840, 510814.59855789 3513995.65700840, 510834.63979488 3513978.30078804, 510864.70165035 3513995.65700840, 510894.76350583 3513978.30078804, 510924.82536131 3513995.65700840, 510964.90783528 3513995.6570084

Now I'm going to create a tile index

In [None]:
docker run -v /home/tswetnam/DiscreteLidar/ClassifiedPointCloud:/data pdal/pdal pdal info data/NEON_D14_SRER_DP1_510000_3514000_classified_point_cloud.laz

# Entwine and Greyhound 

[Entwine](https://entwine.io/) and [Greyhound](https://greyhound.io/) are also developed by Hobu Inc.

Entwine builds an indexed directory of files for viewing lidar data in your browser. Greyhound serves these tiles.

Here I'm going to launch the Greyhound server on my localhost, which will allow me to view the Entwine results in my browser:

In [None]:
docker run -it -v ~/entwine:/entwine -p 8080:8080 connormanning/greyhound

Next, I'm going to process a single tile from the SRER and put it into a directory that is being read by Greyhound (`~/entwine`):

In [15]:
docker run -it -v $HOME:$HOME \
    -v $HOME/QUBES_NEON/pdal:/pdal_files \
    -v /home/tswetnam/DiscreteLidar/FilteredClassifiedPointCloud:/data \
    connormanning/entwine build \
    pdal_files/web-mercator.json \
    -i data/NEON_D14_SRER_DP1_510000_3514000_classified_point_cloud.laz \
    -o ~/entwine/srer-test

Scanning for new files...

Continuing previous index...

Version: 1.1.0
Input:
	Files: 408
	Point count hint: 3,032,719 points
	Density estimate (per square unit): 3.03278
	Threads: 16
Output:
	Output path: /home/tswetnam/entwine/srer-test/
	Data storage: laszip
	Scale: 0.01
	Offset: (510500, 3514500, 1410)
	XYZ width: 4
Metadata:
	Native bounds: [(510000, 3514000, 204.09), (510999.99, 3515000, 2608.95)]
	Cubic bounds: [(509270, 3513270, 180), (511730, 3515730, 2640)]
	Reprojection: (none)
	Storing dimensions: [
		X, Y, Z, Intensity, ReturnNumber
		NumberOfReturns, ScanDirectionFlag, EdgeOfFlightLine, Classification, ScanAngleRank
		UserData, PointSourceId, GpsTime, OriginId
	]

	Pushes complete - joining...
Saving hierarchy...
Saving registry...
Saving metadata...

Index completed in 1 seconds.
Save complete.  Indexing stats:
	Points inserted:
		Previously: 3,062,477
		Currently:  0
		Total:      3,062,477
	Points discarded:
		Outside specified bounds: 1,214,868,975
		Overflow past ma

View the outputs here: http://potree.entwine.io/data/custom.html?s=localhost:8080&r=srer-test
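
The viewer URL can be rebuilt for any other index. The sketch below assumes, from the example URL alone, that `s` is the Greyhound host and port (matching `-p 8080:8080`) and `r` is the name of the Entwine output directory under `~/entwine`. `potree_url` is a hypothetical helper, not part of Entwine or Greyhound.

```shell
#!/usr/bin/env bash
# Sketch: build the Potree viewer URL for a Greyhound server and resource.
# Assumption: `s` = Greyhound host:port and `r` = Entwine output name,
# as inferred from the example URL in the text.
potree_url() {
    local server="$1" resource="$2"
    printf 'http://potree.entwine.io/data/custom.html?s=%s&r=%s\n' \
        "$server" "$resource"
}

# The single-tile test index built above.
potree_url localhost:8080 srer-test
```

Calling `potree_url localhost:8080 srer` would, under the same assumption, give the URL for an index written to `~/entwine/srer`.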

We can see that there are a lot of outliers in these data. We want to filter these out using PDAL's outlier tools. First, create an output directory for the filtered tiles:

In [None]:
mkdir /home/tswetnam/DiscreteLidar/FilteredClassifiedPointCloud

Now I'm going to run one job (one Docker container) per tile, in parallel across the cores on the machine, using the `-P` flag for `xargs`.

I use PDAL's `pipeline` feature with a .json file called `outlier.json` to filter the outliers from the individual tiles:
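
The contents of `outlier.json` are not shown in this notebook, so the following is only a sketch of what such a pipeline might look like. It assumes a PDAL release in which `filters.outlier` labels outliers as noise (Classification 7) rather than deleting them, which is why a `filters.range` stage follows to drop that class. The `mean_k` and `multiplier` values are illustrative, not tuned for these data, and both filenames are placeholders.

```shell
#!/usr/bin/env bash
# Sketch: write a hypothetical outlier.json to the current directory.
# Assumption: the stage list and option values are NOT the author's file.
set -euo pipefail

cat > outlier.json <<'EOF'
{
  "pipeline": [
    {
      "type": "readers.las",
      "filename": "placeholder.laz"
    },
    {
      "type": "filters.outlier",
      "method": "statistical",
      "mean_k": 8,
      "multiplier": 3.0
    },
    {
      "type": "filters.range",
      "limits": "Classification![7:7]"
    },
    {
      "type": "writers.las",
      "filename": "placeholder_filtered.laz",
      "compression": "laszip"
    }
  ]
}
EOF
```

The file would need to sit in the directory mounted at `/pdal_files` (`$HOME/QUBES_NEON/pdal` in the command below). The placeholder filenames are replaced per tile by the `--readers.las.filename` and `--writers.las.filename` overrides on the command line, which is why the reader and writer stages must be `readers.las` and `writers.las`.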

In [10]:
cd /home/tswetnam/DiscreteLidar/ClassifiedPointCloud
ls *.laz | cut -d. -f1 | xargs -P16 -I{} \
docker run \
    -v $HOME/QUBES_NEON/pdal:/pdal_files \
    -v /home/tswetnam/DiscreteLidar/ClassifiedPointCloud:/data \
    -v /home/tswetnam/DiscreteLidar/FilteredClassifiedPointCloud:/filtered \
    pdal/pdal pdal \
    pipeline pdal_files/outlier.json \
    --readers.las.filename=/data/{}.laz \
    --writers.las.filename=/filtered/{}.laz
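
Before launching hundreds of containers, it can help to check what `xargs` will actually run. The dry run below is a sketch that swaps `docker run` for `echo`, so it only prints one command line per tile. The tile names and the job count of 2 are placeholders, and the `-v` volume mounts are left out for brevity.

```shell
#!/usr/bin/env bash
# Sketch: dry run of the cut + xargs pattern used above.
# Nothing is executed; each would-be docker command is only printed.
set -euo pipefail

jobs=2  # placeholder; the command above uses -P16

dry_run() {
    # cut strips the .laz extension; xargs substitutes each base name for {}
    # and runs up to $jobs commands at once.
    cut -d. -f1 | xargs -P"$jobs" -I{} \
        echo docker run pdal/pdal pdal pipeline pdal_files/outlier.json \
            --readers.las.filename=/data/{}.laz \
            --writers.las.filename=/filtered/{}.laz
}

# Placeholder tile names stand in for the output of `ls *.laz`.
printf '%s\n' tile_a.laz tile_b.laz tile_c.laz | dry_run
```

Because the jobs run in parallel, the printed lines may appear in any order. Note that `cut -d. -f1` keeps only the text before the first dot, so this pattern relies on tile names that contain no dots other than the extension.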



After the collection has been filtered for outliers, we will run Entwine on the entire directory:

In [13]:
docker run -it -v $HOME:$HOME \
    -v $HOME/QUBES_NEON/pdal:/pdal_files \
    -v /home/tswetnam/DiscreteLidar/FilteredClassifiedPointCloud:/data \
    connormanning/entwine build \
    pdal_files/web-mercator.json \
    -i data/ \
    -o ~/entwine/srer

Resolving [file]: data/* ...
	Resolved to 408 paths.
Scanning for new files...

Continuing previous index...

Version: 1.1.0
Input:
	Files: 408
	Point count hint: 1,217,931,148 points
	Density estimate (per square unit): 2.3679
	Threads: 16
Output:
	Output path: /home/tswetnam/entwine/srer/
	Data storage: laszip
	Scale: 0.01
	Offset: (-12340630, 3739540, 1760)
	XYZ width: 4
Metadata:
	Native bounds: [(-12354736, 3725784.4, 854.27), (-12326544, 3753291.3, 2661.96)]
	Cubic bounds: [(-12354750, 3725420, -12360), (-12326510, 3753660, 15880)]
	Reprojection: (from file headers) -> EPSG:3857
	Storing dimensions: [
		X, Y, Z, Intensity, ReturnNumber
		NumberOfReturns, ScanDirectionFlag, EdgeOfFlightLine, Classification, ScanAngleRank
		UserData, PointSourceId, GpsTime, Red, Green
		Blue, OriginId
	]

	Pushes complete - joining...
Saving hierarchy...
Saving registry...
Saving metadata...

Index completed in 0 seconds.
Save complete.  Indexing stats:
	Points inserted:
		Previously: 1,217,931,148

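Finally, I'm going to run PotreeConverter on the filtered tiles to build a Potree page called `SRER`:

In [None]: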
docker run \
-v /vol_c/SRER/L1/DiscreteLidar/FilteredClassifiedPointCloud/:/input \
-v /home/tswetnam/potree/pointclouds/:/output \
potreeconverter PotreeConverter /input \
--overwrite \
-p SRER \
-o /output/SRER \
--title "Santa Rita Experimental Range, Arizona" \
--description "2017 NEON Airborne Observation Platform Discrete Lidar" \
--scale 0.001 \
--output-format LAZ \
--output-attributes INTENSITY \
--projection "+proj=longlat +datum=WGS84 +no_defs"

== params ==
source[0]:         	/input
outdir:            	/output/SRER
spacing:           	0
diagonal-fraction: 	200
levels:            	-1
format:            	
scale:             	0.01
pageName:          	SRER
projection:        	+proj=longlat +datum=WGS84 +no_defs

AABB: 
min: [-1.79769e+308, -1.79769e+308, -1.79769e+308]
max: [1.79769e+308, 1.79769e+308, 1.79769e+308]
size: [inf, inf, inf]

cubic AABB: 
min: [-1.79769e+308, -1.79769e+308, -1.79769e+308]
max: [inf, inf, inf]
size: [inf, inf, inf]

spacing calculated from diagonal: inf
READING:  /input/NEON_D14_SRER_DP1_512000_3528000_classified_point_cloud.laz
INDEXING: 1000000 points processed; 20000 points written; 21.972 seconds passed
INDEXING: 2000000 points processed; 20000 points written; 22.266 seconds passed
INDEXING: 3000000 points processed; 20000 points written; 22.552 seconds passed
READING:  /input/NEON_D14_SRER_DP1_516000_3511000_classified_point_cloud.laz
INDEXING: 4000000 points processed; 20000 points written; 22.