# Exploring some numpy tricks for use in pointcloud analysis

This notebook takes a look at how we might use [dask](https://dask.org) in our pointcloud analyses. 


In [1]:
filename = 'uhuru_s_b3_total_gcps_group1_densified_point_cloud.xyz'

In [3]:
import numpy as np
import pandas as pd
import xarray as xr
import dask.dataframe as dd
import dask.array as da
import pptk

  data = yaml.load(f.read()) or {}


In [26]:
df = dd.read_csv(filename)
pcloud_np = df.to_dask_array(lengths=True)
#n_points = 10000000
#pcloud_np = da.random.uniform(0.0, 100.0, size=(n_points,3), chunks=(1000,3))

### Pre-processing Steps

1. Build an r-tree for spatial mapping
1. Use tree to thin points based on nearest neighbor distance 

### Generate a new, transposed array

This array will contain only a list of all the `X` values and a list of all the `Y` values.

Uses the [np.T](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.T.html) command

### Discritize the array into the desired resolution

In [27]:
resolution = .2 # Target resolution in meters.
xy = pcloud_np.T[:2]
xy = ((xy + resolution / 2) // resolution).astype(int)

### Find the min and max values

In [28]:
mn, mx = xy.min(axis=1), xy.max(axis=1)
sz = mx + 1 - mn

In [29]:
# Map the xy locations into a single index for faster access
flatidx = np.ravel_multi_index(xy-mn[:, None], sz.compute())
# Sort the index values, returning sorted index locations, not values
z_order = np.argsort(flatidx)

z_reordered = pcloud_np[z_order,2]
sorted_idx = flatidx[z_order]
bin_boundaries = np.where(sorted_idx[:-1] != sorted_idx[1:])[0]



In [30]:
max_height = np.maximum.reduceat(z_reordered.compute(), bin_boundaries)
min_height = np.minimum.reduceat(z_reordered.compute(), bin_boundaries)
print("Min Heights: average:{avg:5.2f}, max:{maximum:5.2f}, min:{minimum:5.2f}".format(
    avg=min_height.mean(),
    maximum=min_height.max(),
    minimum=min_height.min())
     )
print("Max Heights: average:{avg:5.2f}, max:{maximum:5.2f}, min:{minimum:5.2f}".format(
    avg=max_height.mean(),
    maximum=max_height.max(),
    minimum=max_height.min())
     )

Min Heights: average:1713.43, max:1721.43, min:1706.94
Max Heights: average:1713.92, max:1721.96, min:1706.96


In [23]:
pcloud_np.visualize()

RuntimeError: Drawing dask graphs requires the `graphviz` python library and the `graphviz` system library to be installed.

In [231]:
v = pptk.viewer(pcloud_np)
v.set(point_size=0.001)