In [1]:
import json
import shutil

import numpy as np
import pandas as pd
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt

import tiledb
import pdal
from pybabylonjs import Show as show

Download the data used in this example by uncommenting and running the next cell:

In [2]:
#!wget -nc "https://github.com/PDAL/data/blob/master/workshop/autzen.laz?raw=true" -O "autzen.laz"

Delete arrays from previous runs:

In [3]:
for name in ("autzen", "autzen2"):
    # ignore_errors avoids a failure when the array does not exist yet
    shutil.rmtree(name, ignore_errors=True)

Create and run a PDAL pipeline that ingests the data from the LAZ file into a TileDB sparse array:

In [4]:
pipeline1 = (
  pdal.Reader("autzen.laz") |
  pdal.Filter.stats() |
  pdal.Writer.tiledb(array_name="autzen",chunk_size=10000000)
)

count = pipeline1.execute()  
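
The same stages can also be written in PDAL's JSON pipeline format. The following is a minimal sketch that only builds and prints the JSON string. The helper name `tiledb_ingest_pipeline` is hypothetical, and the option names mirror those used in the cell above; depending on the PDAL version, the writer option for the array location may be named differently.

```python
import json


def tiledb_ingest_pipeline(laz_path, array_name, chunk_size):
    """Build a PDAL JSON pipeline string (hypothetical helper for illustration)."""
    stages = [
        # Reader for LAS/LAZ files; pdal.Reader() infers this from the extension
        {"type": "readers.las", "filename": laz_path},
        {"type": "filters.stats"},
        # Option names copied from the Python pipeline above
        {
            "type": "writers.tiledb",
            "array_name": array_name,
            "chunk_size": chunk_size,
        },
    ]
    return json.dumps({"pipeline": stages}, indent=2)


pipeline_json = tiledb_ingest_pipeline("autzen.laz", "autzen", 10_000_000)
print(pipeline_json)
```

The resulting string can be passed to `pdal.Pipeline(pipeline_json)` or saved to a file and run with the `pdal pipeline` command.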

## 3D Point Cloud

Load a slice from the LiDAR point cloud data and create a dictionary with the data to be visualized:

In [5]:
with tiledb.open("autzen") as arr:
    df = pd.DataFrame(arr[636800:637800, 851000:853000, 406.14:615.26])

data = {
    'X': df['X'],
    'Y': df['Y'],
    'Z': df['Z'],
    'Red': df['Red'] / 255.0,
    'Green': df['Green'] / 255.0,
    'Blue': df['Blue'] / 255.0
}

Visualize the 3D point cloud with `pybabylonjs.Show.from_dict()` by specifying the `style` to use, the `width` and `height` of the frame and the scaling factor `z_scale` of the z-axis:

In [6]:
show.from_dict(data=data,
               style='pointcloud',
               width=800,
               height=600,
               z_scale=0.3)

BabylonPC(value={'style': 'pointcloud', 'width': 800, 'height': 600, 'z_scale': 0.3, 'wheel_precision': 50, 'e…

## Minimum Bounding Rectangle

The minimum bounding rectangle (MBR) of a data tile is the smallest rectangle in the logical view of the array that encloses all non-empty cells whose values are stored in that data tile. All tiles in a sparse array whose MBRs intersect a read query's subarray are read from disk. Choosing the space tiles so that the shape of the resulting MBRs matches the read access patterns can improve read performance, since fewer MBRs may then overlap with a typical read query.

The MBRs visualization is a helpful tool for inspecting the MBRs of a sparse array.
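
The overlap rule described above can be illustrated with a small NumPy sketch. It is not TileDB's implementation: the points are synthetic, sorting along x is only a crude stand-in for a spatial cell order, and the names `tile_mbrs`, `intersects` and `capacity` are hypothetical. It groups consecutive points into tiles, computes one MBR per tile, and counts how many cells belong to tiles whose MBR intersects a query box.

```python
import numpy as np


def tile_mbrs(points, capacity):
    """Group consecutive points into tiles and return (min, max) corners per tile."""
    mbrs = []
    for start in range(0, len(points), capacity):
        tile = points[start:start + capacity]
        mbrs.append((tile.min(axis=0), tile.max(axis=0)))
    return mbrs


def intersects(mbr, query_min, query_max):
    """Two axis-aligned boxes overlap if their ranges overlap in every dimension."""
    lo, hi = mbr
    return bool(np.all(lo <= query_max) and np.all(hi >= query_min))


# Synthetic 3D points, sorted along x as a simple proxy for spatial ordering
rng = np.random.default_rng(0)
points = rng.uniform(0.0, 100.0, size=(10_000, 3))
points = points[np.argsort(points[:, 0])]

# Query box: a narrow column in x and y, full range in z
query_min = np.array([40.0, 40.0, 0.0])
query_max = np.array([45.0, 45.0, 100.0])

cells_read = {}
for capacity in (5_000, 500):
    mbrs = tile_mbrs(points, capacity)
    hits = sum(intersects(m, query_min, query_max) for m in mbrs)
    # Every tile whose MBR intersects the query is read in full
    cells_read[capacity] = hits * capacity
    print(f"capacity={capacity}: {hits} of {len(mbrs)} MBRs hit, "
          f"{cells_read[capacity]} cells read")
```

With smaller tiles the MBRs are tighter, so fewer cells outside the query box are read along with the ones that are needed.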

### Default sorting

Visualize the array created above with default parameters. The visualization is created by specifying the `array`, setting the `style` to `mbrs`, and optionally passing the `width`, `height` and `z_scale` parameters:

In [7]:
show.from_array(array='autzen',
                style='mbrs',
                width=800,
                height=600,
                z_scale=0.5)

BabylonMBRS(value={'style': 'mbrs', 'width': 800, 'height': 600, 'z_scale': 0.5, 'wheel_precision': 50, 'exten…

Create and visualize a second array where the chunk size is decreased to 100000:

In [9]:
pipeline2 = (
  pdal.Reader("autzen.laz") |
  pdal.Filter.stats() |
  pdal.Writer.tiledb(array_name="autzen2",chunk_size=100000)
)

count = pipeline2.execute()  

In [10]:
show.from_array(array='autzen2',
                style='mbrs',
                width=800,
                height=600,
                z_scale=0.2)

BabylonMBRS(value={'style': 'mbrs', 'width': 800, 'height': 600, 'z_scale': 0.2, 'wheel_precision': 50, 'exten…

## Data loading speed

Read the same slice from both arrays to explore the effect of the MBR distribution on read time:

In [11]:
%%time
with tiledb.open("autzen") as arr:
    df = pd.DataFrame(arr[636800:637800, 851000:853000, 406.14:615.26])

CPU times: user 4.18 s, sys: 1.63 s, total: 5.8 s
Wall time: 1.02 s


In [12]:
%%time
with tiledb.open("autzen2") as arr:
    df = pd.DataFrame(arr[636800:637800, 851000:853000, 406.14:615.26])

CPU times: user 1.28 s, sys: 1.02 s, total: 2.3 s
Wall time: 442 ms
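
A single `%%time` run can be affected by caching and other background activity, so repeating the measurement gives a steadier comparison. The following is a minimal sketch of a timing helper for use outside a notebook; `time_call` and `workload` are hypothetical names, and the workload is a stand-in rather than an actual TileDB read.

```python
import time


def time_call(func, repeats=3):
    """Call func `repeats` times; return (best wall time in seconds, last result)."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = func()
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best, result


def workload():
    # Stand-in for the array read; replace with the real query when measuring
    return sum(i * i for i in range(100_000))


seconds, value = time_call(workload)
print(f"best of 3: {seconds * 1000:.2f} ms")
```

In the notebook, `workload` could be replaced by a function that opens one of the arrays and reads the slice used above.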


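## 4D Point Cloud

Load the same slice and add `GpsTime` to the dictionary so that time can be used as a fourth dimension:
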
In [13]:
%%time
with tiledb.open("autzen2") as arr:
    df = pd.DataFrame(arr[636800:637800, 851000:853000, 406.14:615.26])

data = {
    'X': df['X'],
    'Y': df['Y'],
    'Z': df['Z'],
    'Red': df['Red'] / 255.0,
    'Green': df['Green'] / 255.0,
    'Blue': df['Blue'] / 255.0,
    'GpsTime': df['GpsTime']
}

CPU times: user 1.26 s, sys: 997 ms, total: 2.25 s
Wall time: 511 ms


In [14]:
show.from_dict(data=data,
               style='pointcloud',
               width=800,
               height=600,
               z_scale=0.2,
               time=True)

BabylonPC(value={'style': 'pointcloud', 'width': 800, 'height': 600, 'z_scale': 0.2, 'wheel_precision': 50, 'e…