# **Cloud-Optimized Geospatial Formats Guide**

**Authors**: Harshini Girish (UAH), Rajat Shinde (UAH), Alex Mandel (DevSeed), Jamison French (DevSeed), Brian Freitag (NASA MSFC), Sheyenne Kirkland (UAH)

**Date**: October 25, 2024

**Description**: The LAS (LASer) file format is designed to store three-dimensional (x, y, z) point cloud data, typically collected with LiDAR. A LAZ file is a compressed LAS file, and a Cloud-Optimized Point Cloud (COPC) file is a valid LAZ file. COPC files are analogous to Cloud-Optimized GeoTIFFs (COGs): both are valid instances of the original file format, with additional requirements that support cloud-optimized data access. For COGs, the additional requirements concern tiling and overviews. For COPC, the data must be organized into a clustered octree, with a variable-length record (VLR) describing the octree structure.

**Setup**
This tutorial will explore how to:

1. Read a LiDAR LAS file using PDAL in Python
2. Convert the LiDAR LAS file to Cloud-Optimized Point Cloud (COPC) format
3. Validate the generated COPC file

## **Run This Notebook**

To access and run this tutorial within MAAP’s Algorithm Development Environment (ADE), please refer to the [Getting started with the MAAP](#) section of our documentation.

**Disclaimer**: It is highly recommended to run this tutorial within MAAP’s ADE, which already includes packages specific to MAAP, such as maap-py. Running the tutorial outside of the MAAP ADE may lead to errors.


## **Importing Packages**
In this example, we demonstrate how to read a LiDAR LAS file using PDAL in Python and convert the LiDAR LAS file to Cloud-Optimized Point Cloud (COPC) format on the MAAP ADE.

Within your Jupyter Notebook, start by importing the maap package. Then invoke the MAAP constructor, setting the maap_host argument to 'api.ops.maap-project.org'.

In [None]:
# import the os module
import os
# import PDAL, used later to read and validate the point cloud
import pdal
# import the maap package to handle queries
from maap.maap import MAAP
# invoke the MAAP constructor using the maap_host argument
maap = MAAP(maap_host='api.ops.maap-project.org')


## **Downloading The Data**
We use the searchGranule method of the maap client to search for granules in the selected collection. The temporal argument defines the temporal range for the search.



In [None]:
results = maap.searchGranule(
    short_name="GLLIDARPC",
    version="001",
    temporal = ("2020"), 
    count=3 
)

<p style="font-family: 'Courier New', Courier, monospace; font-size: 15px; color: black;">Granules found: 72</p>

<div style="overflow-x: auto; white-space: pre-wrap; font-family: 'Courier New', monospace; padding: 5px; font-size:0.9em; line-height: 1.2;">
[Collection: {'EntryTitle': 'G-LiHT Lidar Point Cloud V001'}
 Spatial coverage: {'HorizontalSpatialDomain': {'Geometry': {'GPolygons': [{'Boundary': {'Points': [{'Longitude': -81.03452828650298, 'Latitude': 25.50220025425373}, {'Longitude': -81.01391715300757, 'Latitude': 25.50220365895999}, {'Longitude': -81.01391819492625, 'Latitude': 25.5112430715201}, {'Longitude': -81.03453087148995, 'Latitude': 25.511239665437053}, {'Longitude': -81.03452828650298, 'Latitude': 25.50220025425373}]}}]}}}
 Temporal coverage: {'RangeDateTime': {'BeginningDateTime': '2020-03-11T04:00:00.000Z', 'EndingDateTime': '2020-03-12T03:59:59.000Z'}}
 Size(MB): 238.623
 Data: ['https://e4ftl01.cr.usgs.gov//GWELD1/COMMUNITY/GLLIDARPC.001/2020.03.11/GLLIDARPC_FL_20200311_FIA8_l0s47.las'],
 
Collection: {'EntryTitle': 'G-LiHT Lidar Point Cloud V001'}
 Spatial coverage: {'HorizontalSpatialDomain': {'Geometry': {'GPolygons': [{'Boundary': {'Points': [{'Longitude': -81.02242648723991, 'Latitude': 25.493163090615468}, {'Longitude': -80.99410838333016, 'Latitude': 25.49316468678571}, {'Longitude': -80.99410794242846, 'Latitude': 25.502204110708817}, {'Longitude': -81.02242816553566, 'Latitude': 25.50220251389295}, {'Longitude': -81.02242648723991, 'Latitude': 25.493163090615468}]}}]}}}
 Temporal coverage: {'RangeDateTime': {'BeginningDateTime': '2020-03-11T04:00:00.000Z', 'EndingDateTime': '2020-03-12T03:59:59.000Z'}}
 Size(MB): 248.383
 Data: ['https://e4ftl01.cr.usgs.gov//GWELD1/COMMUNITY/GLLIDARPC.001/2020.03.11/GLLIDARPC_FL_20200311_FIA8_l0s46.las'],
 
Collection: {'EntryTitle': 'G-LiHT Lidar Point Cloud V001'}
 Spatial coverage: {'HorizontalSpatialDomain': {'Geometry': {'GPolygons': [{'Boundary': {'Points': [{'Longitude': -80.94099075054905, 'Latitude': 25.276201329530473}, {'Longitude': -80.9355627247816, 'Latitude': 25.276199059361314}, {'Longitude': -80.9355579494582, 'Latitude': 25.285238744206318}, {'Longitude': -80.94098637748567, 'Latitude': 25.285241015299494}, {'Longitude': -80.94099075054905, 'Latitude': 25.276201329530473}]}}]}}}
 Temporal coverage: {'RangeDateTime': {'BeginningDateTime': '2020-03-11T04:00:00.000Z', 'EndingDateTime': '2020-03-12T03:59:59.000Z'}}
 Size(MB): 91.0422
 Data: ['https://e4ftl01.cr.usgs.gov//GWELD1/COMMUNITY/GLLIDARPC.001/2020.03.11/GLLIDARPC_FL_20200311_FIA8_l0s22.las']]
</div>

Let’s use the smallest file (91.04 MB) and convert it to COPC format.

In [None]:
# Destination directory for the download (assumed here; adjust as needed)
data_dir = "data"
# Download Data - Selecting the 3rd file from the search results
gliht_las_file = maap.downloadGranule(results[2], data_dir)
las_filename = gliht_las_file[0]
print(las_filename)

<div style="overflow-x: auto; white-space: pre-wrap; font-family: 'Courier New', monospace; padding: 5px; font-size: 0.9em; line-height: 1.2;">Getting 1 granules, approx download size: 0.09 GB
File GLLIDARPC_FL_20200311_FIA8_l0s22.las already downloaded
data/GLLIDARPC_FL_20200311_FIA8_l0s22.las
QUEUEING TASKS | : 100%|██████████| 1/1 [00:00<00:00, 1869.12it/s]
PROCESSING TASKS | : 100%|██████████| 1/1 [00:00<00:00, 16131.94it/s]
COLLECTING RESULTS | : 100%|██████████| 1/1 [00:00<00:00, 33554.43it/s]
</div>
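
The conversion itself (step 2 in the list at the top of this tutorial) can be expressed as a two-stage PDAL pipeline: a readers.las stage feeding a writers.copc stage. The snippet below is a minimal sketch, not necessarily the exact cell used to produce the outputs in this notebook. The helper names build_copc_pipeline and convert_to_copc are hypothetical, the output name is assumed to be the input name with .las replaced by .copc.laz, and writers.copc is assumed to be available in the installed PDAL build.

```python
import json
import os


def build_copc_pipeline(las_path, copc_path=None):
    """Return (pipeline_dict, copc_path) describing a LAS -> COPC conversion."""
    if copc_path is None:
        # Assumption: name the output after the input, e.g. a.las -> a.copc.laz
        root, _ = os.path.splitext(las_path)
        copc_path = root + ".copc.laz"
    pipeline = {
        "pipeline": [
            {"type": "readers.las", "filename": las_path},
            {"type": "writers.copc", "filename": copc_path},
        ]
    }
    return pipeline, copc_path


def convert_to_copc(las_path, copc_path=None):
    """Execute the conversion with PDAL and return the output path."""
    import pdal  # imported lazily so the pipeline can be inspected without PDAL

    pipeline, copc_path = build_copc_pipeline(las_path, copc_path)
    pdal.Pipeline(json.dumps(pipeline)).execute()
    return copc_path


# Inspect the pipeline definition without executing it
spec, copc_filename = build_copc_pipeline(
    "data/GLLIDARPC_FL_20200311_FIA8_l0s22.las"
)
print(json.dumps(spec, indent=2))
print(copc_filename)
```

In the notebook, calling copc_filename = convert_to_copc(las_filename) would run the pipeline and write the .copc.laz file next to the downloaded .las file.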


## **Validating The Product**
As the output of the cell below shows, the .copc.laz file has been created in the destination directory.

In [None]:
# -g and -o omit the owner and group columns; -h prints file sizes in human-readable units
!ls -goh ./data


<p style="font-family: 'Courier New', Courier, monospace; font-size: 13px; color: black; line-height: 1.2;">total 239888</p>
<p style="font-family: 'Courier New', Courier, monospace; font-size: 13px; color: black; line-height: 1.2;">-rw-r--r--  1     26M Mar 20 11:55 GLLIDARPC_FL_20200311_FIA8_l0s22.copc.laz</p>
<p style="font-family: 'Courier New', Courier, monospace; font-size: 13px; color: black; line-height: 1.2;">-rw-r--r--  1     91M Feb 29 11:27 GLLIDARPC_FL_20200311_FIA8_l0s22.las</p>

Let’s read the generated COPC file back and check the value of the copc flag in its metadata. If the generated file is a valid COPC file, this flag is set to True.


In [None]:
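# Path of the generated COPC file (assumed: the .las name with a .copc.laz extension, as listed above)
copc_filename = las_filename.replace(".las", ".copc.laz")

# Creating a pipeline to validate the COPC file and check its metadata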
valid_pipe = pdal.Reader.copc(filename=copc_filename) | pdal.Filter.stats()
valid_pipe.execute()

# Getting value for the "copc" key under the metadata
# Output is True for a valid COPC
value = valid_pipe.metadata["metadata"]["readers.copc"].get("copc")
print(value)



<p style="font-family: 'Courier New', Courier, monospace; font-size: 15px; color: black;">True</p>

## **Accessing The Data**
The data values can be accessed from the executed pipeline using valid_pipe.arrays. The values in the arrays represent LiDAR point cloud attributes such as X, Y, Z, and Intensity.

In [None]:
# Extract array values from the pipeline
arr_values = valid_pipe.arrays

# Print the array values
print(arr_values)

<div style="overflow-x: auto; white-space: pre-wrap; font-family: 'Courier New', Courier, monospace; font-size: 0.9em; color: black; line-height: 1;">
[array([(506245.56, 2796471.44, 0.24, 40740, 1, 1, 1, 0, 2, 0, 0, 0, 0,  16.998, 1,   0, 310483.75227621, 0),
       (506247.16, 2796471.58, 0.27, 35541, 2, 2, 1, 0, 2, 0, 0, 0, 0,  16.998, 1,   0, 310483.75229014, 0),
       (506247.95, 2796471.65, 0.24, 17716, 2, 2, 1, 0, 2, 0, 0, 0, 0,  16.998, 1,   0, 310483.75229699, 0),
       ...,
       (506066.58, 2796032.75, 2.34, 31587, 1, 1, 0, 0, 1, 0, 0, 0, 0, -24.   , 2, 203, 310477.36925451, 0),
       (506067.37, 2796033.29, 2.52, 32876, 1, 1, 0, 0, 1, 0, 0, 0, 0, -22.998, 2, 216, 310477.37590641, 0),
       (506062.6 , 2796033.27, 1.4 , 27393, 1, 1, 0, 0, 1, 0, 0, 0, 0, -24.   , 2, 108, 310477.38259945, 0)],
</div>

As the output of the cell above shows, the point values can be read back from the generated COPC file, which confirms that the file is readable.
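
Each element of valid_pipe.arrays is a NumPy structured array, so individual attributes can be selected by dimension name instead of by position. The sketch below illustrates this on a small synthetic array. The variable points is a hypothetical stand-in for valid_pipe.arrays[0], holding three rows copied from the output above and restricted to four dimensions for brevity. The dtypes are assumptions chosen for the example.

```python
import numpy as np

# Synthetic stand-in for valid_pipe.arrays[0] (assumed dtypes, four dimensions only)
points = np.array(
    [
        (506245.56, 2796471.44, 0.24, 40740),
        (506247.16, 2796471.58, 0.27, 35541),
        (506247.95, 2796471.65, 0.24, 17716),
    ],
    dtype=[("X", "f8"), ("Y", "f8"), ("Z", "f8"), ("Intensity", "u2")],
)

# List the available dimensions
print(points.dtype.names)

# Select a single dimension by name and summarize it
z = points["Z"]
print(z.min(), z.max())

# Filter points with a boolean mask on another dimension
bright = points[points["Intensity"] > 30000]
print(len(bright))
```

In the notebook, replacing points with valid_pipe.arrays[0] applies the same selections to the full point cloud.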

