
# Entwine processing

To view the lidar in a browser, first build an Entwine Point Tile (EPT) dataset from the `*.laz` tiles:

```{bash}
entwine build -i /workspace/jemez/lidar/laz/NM_SouthCentral_B9_2018/laz/ -o /workspace/jemez/lidar/ept/NM_SouthCentral_B9_2018/ -t 26
```
 
Once the build finishes, the `ept.json` file sits at the top level of the output directory.
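
Before serving the output, it can help to confirm that the build actually completed. The following is a minimal sketch, assuming the usual EPT layout of an `ept.json` file next to `ept-data/` and `ept-hierarchy/` directories. `check_ept_dir` is a hypothetical helper name, not an Entwine command.

```bash
# Hypothetical helper: succeed only if DIR looks like a finished EPT build.
check_ept_dir() {
  local dir="$1"
  # ept.json holds the dataset metadata; it must exist and be non-empty
  [ -s "$dir/ept.json" ] || { echo "missing ept.json in $dir" >&2; return 1; }
  # ept-data/ holds the point tiles, ept-hierarchy/ the octree index (assumed layout)
  [ -d "$dir/ept-data" ] || { echo "missing ept-data/ in $dir" >&2; return 1; }
  [ -d "$dir/ept-hierarchy" ] || { echo "missing ept-hierarchy/ in $dir" >&2; return 1; }
  echo "EPT layout looks complete: $dir"
}
```

For example, `check_ept_dir /workspace/jemez/lidar/ept/NM_SouthCentral_B9_2018` should print a confirmation line once the build above has finished.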
 
You can serve the EPT from your own computer with `http-server` and view it in the browser:
 
```{bash}
cd /workspace/jemez/lidar/ept/NM_SouthCentral_B9_2018

# install http-server (requires Node.js and npm)
npm install http-server -g

# serve the current directory on port 9000 with CORS enabled
http-server -p 9000 --cors
```

Now, in the browser, navigate to:

[https://potree.entwine.io/data/view.html?r=%22http://localhost:9000%22](https://potree.entwine.io/data/view.html?r=%22http://localhost:9000%22)

To view the data hosted on the [CyVerse Data Store](https://usgs.entwine.io/data/view.html?r=%22https://data.cyverse.org/dav-anon/iplant/home/tswetnam/jemez/lidar/ept/NM_SouthCentral_B9_2018/%22), use:

```
https://usgs.entwine.io/data/view.html?r=%22https://data.cyverse.org/dav-anon/iplant/home/tswetnam/jemez/lidar/ept/NM_SouthCentral_B9_2018/%22
```
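
Both viewer links follow the same pattern: the viewer page, then an `r` query parameter holding the EPT root URL wrapped in `%22` (a URL-encoded double quote). A small sketch of that pattern follows; `viewer_url` is a hypothetical helper introduced here for illustration only.

```bash
# Hypothetical helper: wrap an EPT root URL in the viewer's r= query parameter.
# %22 is a URL-encoded double quote, matching the links used in this document.
viewer_url() {
  local viewer="$1" ept_root="$2"
  printf '%s?r=%%22%s%%22\n' "$viewer" "$ept_root"
}

# Rebuild the local link shown earlier
viewer_url "https://potree.entwine.io/data/view.html" "http://localhost:9000"
```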

%%html
<iframe src="https://usgs.entwine.io/data/view.html?r=%22https://data.cyverse.org/dav-anon/iplant/home/tswetnam/jemez/lidar/ept/NM_SouthCentral_B9_2018/%22" width="800" height="600"></iframe>

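Note that the original data contain vertical outliers.

To remove them, chain PDAL `filters` stages in a JSON `pipeline`: `filters.outlier` flags the outliers as noise (`Classification` 7), and `filters.range` then drops every point in that class.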

`nm_sc_b9_outlier_removal.json`:

```{json}
{
  "pipeline": [
    "/workspace/jemez/lidar/laz/NM_SouthCentral_B9_2018/{}.laz",
    {
      "type": "filters.outlier",
      "method": "statistical",
      "mean_k": 12,
      "multiplier": 2.2
    },
    {
      "type": "filters.range",
      "limits": "Classification![7:7]"
    },
    {
      "type": "writers.las",
      "extra_dims": "all",
      "minor_version": "4",
      "filename": "/workspace/jemez/lidar/laz/NM_SouthCentral_B9_2018/laz/{}.laz"
    }
  ]
}
```
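
To get a feel for what `multiplier` does, here is a rough illustration of the statistical cutoff. As described in the PDAL documentation, each point gets a mean distance to its `mean_k` nearest neighbors, and points whose mean distance exceeds the global mean plus `multiplier` standard deviations are flagged. The sketch below only reproduces that arithmetic on a toy list of distances; `outlier_threshold` is a hypothetical name, and the use of the population standard deviation is an assumption.

```bash
# Hypothetical illustration: stdin holds one mean neighbor distance per line.
# Prints the cutoff mean + multiplier * stddev.
outlier_threshold() {
  local multiplier="$1"
  awk -v m="$multiplier" '
    { sum += $1; sumsq += $1 * $1; n++ }
    END {
      mean = sum / n
      sd = sqrt(sumsq / n - mean * mean)   # population standard deviation (assumption)
      printf "%.3f\n", mean + m * sd
    }'
}

# Toy input: mean 2, standard deviation 2, so the cutoff is 2 + 2.2 * 2
printf '%s\n' 1 1 1 1 6 | outlier_threshold 2.2   # prints 6.400
```

In this toy case the point with a mean distance of 6 stays just under the cutoff, so a smaller `multiplier` would be needed to flag it.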

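Because our collections are large, we process them in batches, following the PDAL workshop's batch processing exercise:

https://pdal.io/workshop/exercises/batch_processing/batch-processing.html

Run PDAL from the CLI over every `*.laz` tile in the collection directory. The `--readers.las.filename` and `--writers.las.filename` options override the `{}` placeholder paths in the pipeline JSON.

`xargs` launches one `pdal` process per tile, up to the limit set by `-P`.

Our new server has 255 cores, so we can run 250 jobs in parallel: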

```{bash}
cd /workspace/jemez/lidar/laz/NM_SouthCentral_B9_2018/
mkdir -p laz
ls *.laz | cut -d. -f1 | xargs -P250 -I{} \
pdal pipeline /workspace/jemez/lidar/pdal_pipeline_jsons/nm_sc_b9_outlier_removal.json \
--readers.las.filename=/workspace/jemez/lidar/laz/NM_SouthCentral_B9_2018/{}.laz \
--writers.las.filename=/workspace/jemez/lidar/laz/NM_SouthCentral_B9_2018/laz/{}.laz
```

```{bash}
cd /workspace/jemez/lidar/laz/USGS_LPC_NM_NorthCentral_B1_2016/
mkdir -p laz
ls *.laz | cut -d. -f1 | xargs -P250 -I{} \
pdal pipeline /workspace/jemez/lidar/pdal_pipeline_jsons/nm_nc_b1_outlier_removal.json \
--readers.las.filename=/workspace/jemez/lidar/laz/USGS_LPC_NM_NorthCentral_B1_2016/{}.laz \
--writers.las.filename=/workspace/jemez/lidar/laz/USGS_LPC_NM_NorthCentral_B1_2016/laz/{}.laz
```

```{bash}
cd /workspace/jemez/lidar/laz/USGS_LPC_NM_NorthCentral_B2_2016/
mkdir -p laz
ls *.laz | cut -d. -f1 | xargs -P250 -I{} \
pdal pipeline /workspace/jemez/lidar/pdal_pipeline_jsons/nm_nc_b2_outlier_removal.json \
--readers.las.filename=/workspace/jemez/lidar/laz/USGS_LPC_NM_NorthCentral_B2_2016/{}.laz \
--writers.las.filename=/workspace/jemez/lidar/laz/USGS_LPC_NM_NorthCentral_B2_2016/laz/{}.laz
```
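
The three commands above differ only in the collection directory and the pipeline file, so they could be folded into one function. This is a sketch under stated assumptions: `filter_collection` and the `PDAL_BIN` variable are hypothetical names introduced here, and the default of 4 parallel jobs is arbitrary. Setting `PDAL_BIN=echo` prints the commands instead of running them, which is a cheap way to check the paths first.

```bash
# Hypothetical wrapper around the batch commands above.
# PDAL_BIN defaults to pdal; set PDAL_BIN=echo for a dry run.
filter_collection() {
  local dir="$1" pipeline_json="$2" jobs="${3:-4}" f
  mkdir -p "$dir/laz"
  for f in "$dir"/*.laz; do
    [ -e "$f" ] || continue          # skip when the glob matches nothing
    basename "$f" .laz               # tile name without the extension
  done | xargs -P "$jobs" -I{} "${PDAL_BIN:-pdal}" pipeline "$pipeline_json" \
      --readers.las.filename="$dir/{}.laz" \
      --writers.las.filename="$dir/laz/{}.laz"
}
```

A dry run for the first collection would look like `PDAL_BIN=echo; filter_collection /workspace/jemez/lidar/laz/NM_SouthCentral_B9_2018 /workspace/jemez/lidar/pdal_pipeline_jsons/nm_sc_b9_outlier_removal.json 250`.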


This creates a second copy of the dataset in a new `laz/` sub-directory. If space is a concern, delete the unfiltered data only after you have checked that the filtering succeeded.
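
One simple check before deleting anything is to confirm that every input tile has a non-empty filtered counterpart. The sketch below does only that; `missing_tiles` is a hypothetical helper, and it does not inspect the point data itself, so a visual spot check of a few tiles is still worthwhile.

```bash
# Hypothetical check: list tiles in SRC that have no non-empty counterpart in DST.
missing_tiles() {
  local src="$1" dst="$2" f name
  for f in "$src"/*.laz; do
    [ -e "$f" ] || continue            # skip when the glob matches nothing
    name=$(basename "$f")
    [ -s "$dst/$name" ] || echo "$name"  # absent or zero-byte output counts as missing
  done
}
```

For example, `missing_tiles /workspace/jemez/lidar/laz/NM_SouthCentral_B9_2018 /workspace/jemez/lidar/laz/NM_SouthCentral_B9_2018/laz` prints nothing when every tile was written.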