# Reading data from EPT

## Introduction

This tutorial describes how to use [Conda], [Entwine], [PDAL], and [GDAL] to
read data from the [USGS 3DEP AWS Public Dataset]. We will use PDAL's
[readers.ept] to fetch data, filter it for noise using [filters.outlier],
classify the data as ground or not-ground using [filters.smrf], and
write out a digital terrain model with {ref}`writers.gdal`. Once our elevation model
is constructed, we will use GDAL [gdaldem] operations to create hillshade, slope,
and color relief.

## Write the Pipeline

PDAL uses the concept of [pipelines] to describe the reading, filtering, and writing
of point cloud data. We will construct a pipeline that will do a number of things
in succession.

```{figure} images/pipeline-example-overview.png
:scale: 50%

Pipeline diagram. The data are read from the [Entwine Point Tile] resource at
<https://usgs.entwine.io> for Iowa using {ref}`readers.ept` and filtered through a
number of steps until processing is complete. The data are then written to
the `iowa.laz` and `iowa.tif` files.
```

In [None]:
import os
import sys

# Point PROJ at the coordinate system database shipped with the active Conda environment.
conda_env_path = os.environ.get('CONDA_PREFIX', sys.prefix)
proj_data = os.path.join(conda_env_path, 'share', 'proj')
os.environ["PROJ_DATA"] = proj_data

With the environment configured, our next step is to import the `pdal` library.

In [None]:
import pdal

## Stages

### readers.ept

{ref}`readers.ept` reads the point cloud data from the EPT resource on AWS. We give
it the URL of the resource's root file in the `filename` option, and we also
give it a `bounds` object that defines the window from which to select data.

```{note}
The full URL to the EPT root file (`ept.json`) must be given
to the `filename` option for PDAL 2.2+. This was a change in
behavior of the {ref}`readers.ept` driver.
```

The `bounds` object is in the form `([minx, maxx], [miny, maxy])`.

```{warning}
If you do not define a `bounds` option, PDAL will try to read the
data for the entire state of Iowa, which is about 160 billion points.
Maybe you have enough memory for this...
```

```{figure} images/pipeline-example-readers.ept.png
:scale: 50%

The EPT reader reads data from an EPT resource with PDAL. Options available
in PDAL 1.9+ allow users to select data at or above specified resolutions.
```
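
As a minimal sketch of how the reader options fit together, the helper below formats a square window as a `bounds` string. The function name `ept_bounds`, the `reader_options` dictionary, and the `resolution` value of 5 are illustrative assumptions, not part of the tutorial; the center coordinates are simply the midpoint of the window used in the next cell.

```python
def ept_bounds(center_x, center_y, half_width):
    """Format a square window as a `([minx, maxx], [miny, maxy])` string."""
    minx, maxx = center_x - half_width, center_x + half_width
    miny, maxy = center_y - half_width, center_y + half_width
    return f"([{minx:.3f}, {maxx:.3f}], [{miny:.3f}, {maxy:.3f}])"


# Midpoint of the tutorial's window, in Web Mercator meters, with a 1 km half-width.
bounds = ept_bounds(-10424171.940, 5165494.710, 1000.0)

reader_options = {
    "filename": "https://s3-us-west-2.amazonaws.com/usgs-lidar-public/IA_FullState/ept.json",
    "bounds": bounds,
    # Assumed, illustrative value: ask only for data down to a 5 m cell size.
    "resolution": 5.0,
}
print(reader_options["bounds"])
```

If these options suit your area of interest, they could be passed along as `pdal.Reader.ept(**reader_options)`. A coarser `resolution` returns fewer points, which can be a convenient way to preview a large window.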

In [None]:
pipeline = pdal.Reader.ept(
    "https://s3-us-west-2.amazonaws.com/usgs-lidar-public/IA_FullState/ept.json",
    bounds="([-10425171.940, -10423171.940], [5164494.710, 5166494.710])"
)

### filters.expression

The data we are selecting may already have noise points classified, and we can use
{ref}`filters.expression` to keep only the points whose `Classification` {ref}`dimensions`
value is not `7`.

```{figure} images/pipeline-example-filters.range1.png
:scale: 50%

The {ref}`filters.expression` filter allows users to
select data for processing or removal.
```

In [None]:
pipeline |= pdal.Filter.expression(expression="Classification != 7")

```{note}
Formerly, this step may have appeared as `pdal.Filter.range(limits="Classification![7:7]")`. While this syntax is still supported, many users will find the more natural expressions supported in the expression filter easier to write and interpret.
```
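
For readers who work with JSON pipelines rather than the Python bindings, the two spellings of this step can be sketched as stage dictionaries. The variable names are illustrative only; the option values are the ones quoted above.

```python
import json

# Older spelling with filters.range: "!" negates the [7:7] interval.
range_stage = {"type": "filters.range", "limits": "Classification![7:7]"}

# Spelling used in this tutorial.
expression_stage = {
    "type": "filters.expression",
    "expression": "Classification != 7",
}

# Either stage (not both) would be placed after the reader in a JSON pipeline.
print(json.dumps(range_stage))
print(json.dumps(expression_stage))
```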

### filters.assign

After removing points that have noise classifications, we need to reset all
of the classification values in the point data. {ref}`filters.assign` takes the
expression `Classification[:]=0` and assigns the `Classification` for
each point to `0`.

```{figure} images/pipeline-example-filters.assign.png
:scale: 50%

{ref}`filters.assign` can also take in an option to apply assignments
based on a conditional. If you want to assign values based on a
bounding geometry, use {ref}`filters.overlay`.
```

In [None]:
pipeline |= pdal.Filter.assign(assignment="Classification[:]=0")

### filters.reprojection

The data in the AWS 3DEP Public Dataset are stored in the [Web Mercator]
coordinate system, which is not suitable for many operations. We need to
reproject them into an appropriate UTM coordinate system ([EPSG:26915](https://epsg.io/26915)).

```{figure} images/pipeline-example-filters.reprojection.png
:scale: 50%

{ref}`filters.reprojection` can also override the incoming coordinate
system using the `in_srs` option.
```
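
To see why UTM zone 15 is a reasonable choice, the sketch below inverts the spherical Web Mercator projection for the midpoint of our reader window and derives the UTM zone from the resulting longitude. The helper names are illustrative, and the zone formula is the standard 6-degree rule, which ignores the special-case zones around Norway and Svalbard.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by Web Mercator (EPSG:3857)


def web_mercator_to_lonlat(x, y):
    """Invert the spherical Mercator projection; returns degrees."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(2.0 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2.0)
    return lon, lat


def utm_zone(lon):
    """Standard 6-degree UTM zone number for a longitude in degrees."""
    return int((lon + 180.0) // 6) + 1


# Midpoint of the bounds window given to the reader.
lon, lat = web_mercator_to_lonlat(-10424171.940, 5165494.710)
zone = utm_zone(lon)
print(f"lon={lon:.3f}, lat={lat:.3f}, UTM zone {zone}N")
```

The window sits near 93.6°W, 42.0°N, which falls in zone 15N. EPSG:26915 is the NAD83 realization of that zone.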

In [None]:
pipeline |= pdal.Filter.reprojection(out_srs="EPSG:26915")

### filters.smrf

The Simple Morphological Filter ({ref}`filters.smrf`) classifies points as ground
or not-ground.

```{figure} images/pipeline-example-filters.smrf.png
:scale: 50%

{ref}`filters.smrf` provides a number of tuning options, but the
defaults tend to work quite well for mixed urban environments on
flat ground (i.e., Iowa).
```
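
If the defaults do not suit your terrain, the tuning options can be set explicitly. The sketch below lists the main ones as a JSON-style stage. The values are assumptions believed to mirror the documented defaults, so check the {ref}`filters.smrf` reference before relying on them.

```python
import json

smrf_stage = {
    "type": "filters.smrf",
    "cell": 1.0,       # grid cell size
    "slope": 0.15,     # slope threshold (rise over run)
    "window": 18.0,    # maximum window size
    "threshold": 0.5,  # elevation threshold
    "scalar": 1.25,    # elevation scalar
}
print(json.dumps(smrf_stage, indent=2))
```

With the Python bindings, the same options would be passed as keyword arguments, for example `pdal.Filter.smrf(slope=0.15, window=18.0)`. Steeper terrain generally calls for a larger `slope` value.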

In [None]:
pipeline |= pdal.Filter.smrf()

### filters.range

After we have executed the SMRF filter, we only want to keep points that
are actually classified as ground in our point stream. Selecting for
points with `Classification == 2` using {ref}`filters.expression` does that for us.

```{figure} images/pipeline-example-filters.range2.png
:scale: 50%

Remove any point that is not classified as ground before
DTM generation.
```

In [None]:
pipeline |= pdal.Filter.expression(expression="Classification == 2")

### writers.gdal

Having filtered our point data, we're now ready to write a raster digital
terrain model with {ref}`writers.gdal`. The options of interest here set
the `nodata` value, restrict the output to the inverse distance weighted
(`idw`) raster, and assign a resolution of `1` (m). See {ref}`writers.gdal`
for more options.

```{figure} images/pipeline-example-writers.gdal.png
:scale: 50%

Output a DTM at 1m resolution.
```
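
A minimal sketch of the writer options described above follows. The `nodata` value of `-9999` and the explicit `gdaldriver` are assumptions chosen for illustration; the output name `iowa.tif` matches the pipeline overview.

```python
import json

gdal_writer_options = {
    "filename": "iowa.tif",
    "resolution": 1.0,     # output cell size in meters (UTM units)
    "output_type": "idw",  # write only the inverse distance weighted band
    "nodata": -9999,       # assumed sentinel for empty cells
    "gdaldriver": "GTiff",
}

# The same options expressed as a JSON pipeline stage.
gdal_stage = {"type": "writers.gdal", **gdal_writer_options}
print(json.dumps(gdal_stage, indent=2))
```

With the Python bindings, this stage would be appended before the pipeline is executed, along the lines of `pipeline |= pdal.Writer.gdal(**gdal_writer_options)`.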

### writers.las

We can also write a LAZ file containing the same points that were used to
make the elevation model in the section above. See {ref}`writers.las` for more options.

```{figure} images/pipeline-example-writers.las.png
:scale: 50%

Also output the LAZ file as part of our processing pipeline.
```
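
The LAZ writer needs little more than a file name. In the sketch below the explicit `compression` flag is an assumption added for clarity, since the `.laz` extension may already imply it.

```python
import json

las_writer_options = {
    "filename": "iowa.laz",
    "compression": True,  # assumed explicit flag; may be implied by ".laz"
}

# The same options expressed as a JSON pipeline stage.
las_stage = {"type": "writers.las", **las_writer_options}
print(json.dumps(las_stage, indent=2))
```

With the Python bindings, this would be appended as `pipeline |= pdal.Writer.las(**las_writer_options)`. Because it comes after the ground-only filter, the LAZ file holds the same points that feed the DTM.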

Following this step, we will execute the pipeline and report the number of points in the resulting point cloud.

In [None]:
pipeline.execute()
print(f"Processed point cloud contains {len(pipeline.arrays[0])} points")