# Example usage of package

First, import the package and pathlib, which is needed to handle file paths.


In [None]:
import lidar
print(f"package version: {lidar.__version__}")
from pathlib import Path

import plotly.express as px

%load_ext autoreload
%autoreload 2
%config IPCompleter.greedy=True

Ignore the warning, which comes from the rospy package.

If you are using VS Code with the devcontainer, everything is set up for you (recommended). Otherwise, you can use the Dockerfile in the .devcontainer folder or create your own virtual environment.

If you get a ModuleNotFoundError for lidar, make sure that you actually installed the package. This can be done with

$ pip install -e .   

in the folder where setup.py is.
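A quick way to check whether a package is visible to the current interpreter, before importing it, is sketched below. It only uses importlib from the standard library; the helper name is_installed is made up for this example.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found by the current interpreter."""
    return importlib.util.find_spec(module_name) is not None


# "json" ships with Python, so it can always be found.
print(is_installed("json"))
# A name that should not exist; replace it with "lidar" to check this package.
print(is_installed("surely_not_an_installed_module_xyz"))
```

If the check fails for lidar, the kernel is most likely running in a different environment than the one the package was installed into.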

## Reading a ROS .bag file into the lidar.Dataset

In [None]:
testbag = Path.cwd().parent.joinpath("tests/testdata/test.bag")

In [None]:
testset = lidar.Dataset.from_file(testbag, topic="/os1_cloud_node/points", keep_zeros=False)

This reads the bag file into the Dataset.
The Dataset only reads frames from the bag file when they are needed, which saves memory and makes it possible to work with huge bag files.
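The idea of reading frames only on demand can be sketched with a small container that calls a loader when an item is indexed. This is a toy illustration, not the package's implementation; LazyFrames and fake_loader are hypothetical names.

```python
from typing import Callable, List


class LazyFrames:
    """Toy container that loads a frame only when it is indexed."""

    def __init__(self, n_frames: int, loader: Callable[[int], List[float]]):
        self._n_frames = n_frames
        self._loader = loader
        self.loaded: List[int] = []  # records which frames were actually read

    def __len__(self) -> int:
        return self._n_frames

    def __getitem__(self, index: int) -> List[float]:
        if not 0 <= index < self._n_frames:
            raise IndexError(index)
        self.loaded.append(index)
        return self._loader(index)


def fake_loader(index: int) -> List[float]:
    # Stand-in for reading one message from a bag file.
    return [float(index)] * 3


frames = LazyFrames(1000, fake_loader)
print(len(frames))    # the length is known without reading any frame
print(frames[42])     # reads only frame 42
print(frames.loaded)  # list of the frames read so far
```

Only the frames that are actually indexed ever reach memory, which is why the size of the file matters much less than with eager loading.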

In [None]:
print(testset)

In [None]:
len(testset)

To see what is available, press Tab to list the available properties and methods. Alternatively, use help(), dir(), and the documentation.
Shift+Tab is also handy inside JupyterLab.


Let's look up the start and end time of the dataset:

In [None]:
testset.start_time

In [None]:
testset.end_time

# Working with the whole dataset
You can work with the whole dataset, even if it is huge, since the package uses parallel processing with dask in the background.
So make sure that your Docker container or computer has access to as many CPU cores as possible.

In [None]:
testset.min()

The Dataset class supports basic functions like min, max, mean and std. They all work on three different levels: dataset, frame and point. Let's investigate the differences. The default is over the whole dataset.

In [None]:
min_frame = testset.min("frame")
min_frame

So now we have a pandas DataFrame which gives us the min values of each column for each frame. This can also be used for plotting.

In [None]:
px.line(min_frame,x="timestamp", y="x min")

Now let's investigate the point level.

In [None]:
min_point = testset.min("point")
min_point

In [None]:
testset.std("point")

So we get a DataFrame with the min value for each point over the whole Dataset. Note that the points are identified by their original_id. For some lidars this does not make sense, since the point locations change over time, so please consider beforehand whether it is useful for your lidar. For Ouster lidars, however, it works and is very useful.

Also note the "N" column which gives the count of the point over the dataset.
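The difference between the three levels can be sketched in plain Python on a toy dataset. This only illustrates the grouping idea and does not use the package; the data layout (a list of frames, each a list of dicts) is an assumption made for the example.

```python
from collections import defaultdict

# Toy dataset: two frames, each a list of points with an id and an x value.
# Point 2 is missing in the second frame, so its count N is lower.
frames = [
    [{"original_id": 0, "x": 1.0}, {"original_id": 1, "x": 4.0}, {"original_id": 2, "x": 2.5}],
    [{"original_id": 0, "x": 0.5}, {"original_id": 1, "x": 5.0}],
]

# "dataset" level: one value over all points of all frames.
min_dataset = min(p["x"] for frame in frames for p in frame)

# "frame" level: one value per frame.
min_frame = [min(p["x"] for p in frame) for frame in frames]

# "point" level: one value per original_id, plus the count N.
by_point = defaultdict(list)
for frame in frames:
    for p in frame:
        by_point[p["original_id"]].append(p["x"])
min_point = {pid: {"x min": min(xs), "N": len(xs)} for pid, xs in by_point.items()}

print(min_dataset)
print(min_frame)
print(min_point)
```

The point level only makes sense when the same id refers to the same beam in every frame, which is the caveat mentioned above.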

All these methods are based on the aggregate method, which is similar to the one from pandas. It also works on the "dataset", "frame" and "point" levels.

In [None]:
testset.agg("min","dataset")

In [None]:
testset.agg(["min","max","std","mean"],"point")

In [None]:
testset.agg({"x":["max","min"]},"point")

In [None]:
testset.agg({"x":"max"},"point")

# Working with Frames

Frames are based on pandas DataFrames and pyntcloud.
This was necessary because no point cloud library currently supports storing automotive lidar data, which consists of more than just x, y, z and maybe R, G, B.

First grab the first frame in the dataset:

## Getting a Frame from a dataset

In [None]:
testframe = testset[0]

In [None]:
print(testframe)

Note that the number of points can vary from frame to frame, since all zero elements are deleted on import (see the keep_zeros option of the Dataset).
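Why the point count varies can be sketched as follows. The example assumes that a "zero element" is a point whose coordinates are all zero; the exact rule applied by the keep_zeros option may differ, and drop_zero_points is a hypothetical helper.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def drop_zero_points(points: List[Point]) -> List[Point]:
    """Keep only points where at least one coordinate is non-zero (assumed rule)."""
    return [p for p in points if any(c != 0.0 for c in p)]


# Two raw frames of equal size, with a different number of empty returns.
frame_a = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.5), (0.0, 0.0, 0.0)]
frame_b = [(3.0, 0.0, 0.0), (1.0, 1.0, 1.0), (0.0, 0.0, 0.0)]

print(len(drop_zero_points(frame_a)))
print(len(drop_zero_points(frame_b)))
```

Both raw frames contain three points, but after filtering they differ in length.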

In [None]:
len(testframe)

## Reading from a Frame file

In [None]:
lasfile = Path("../tests/testdata/diamond.las")

In [None]:
testframe2 = lidar.Frame.from_file(lasfile)

In [None]:
print(testframe2)

## Plotting
Tip: move the mouse over the points to get detailed information.

In [None]:
testframe.plot(color="intensity", point_size=0.5)

This plot uses plotly as the backend, which can be rather time-consuming.
WARNING: delete the output cells with plotly plots before saving, as they make the file very big.

## Working with pointclouds
The frame consists mainly of the properties "data" and "points".

In [None]:
testframe.data

So data contains everything as a pandas DataFrame, with all its power.

In [None]:
testframe.describe()

In [None]:
testframe.data.hist();

Now a closer look at the points.

In [None]:
testframe.points

So it is a PyntCloud object (https://pyntcloud.readthedocs.io/en/latest/PyntCloud.html), which in turn is also based on DataFrames and offers many methods for pointclouds.
To access the underlying DataFrame, use the points attribute of the PyntCloud object, i.e. testframe.points.points.

## Pointcloud processing with built-in methods
Although you can do a lot with just data and points, the Frame object also has built-in processing methods, each of which returns a new Frame object. They use the power of DataFrames, pyntcloud and open3d.


In [None]:
newframe = testframe.limit("x",-5,5).limit("intensity",400,1000)

In [None]:
newframe.describe()

So this is now a smaller Frame with x ranging from -5 to 5 and with intensities between 400 and 1000. Processing steps can be chained together since they return a new Frame object.
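The chaining pattern itself is easy to sketch: each method returns a new object of the same type instead of modifying the current one. MiniFrame below is a hypothetical stand-in, and the inclusive bounds are an assumption; the real limit method may treat the boundaries differently.

```python
from typing import Dict, List


class MiniFrame:
    """Hypothetical stand-in for a Frame: a list of points stored as dicts."""

    def __init__(self, rows: List[Dict[str, float]]):
        self.rows = rows

    def __len__(self) -> int:
        return len(self.rows)

    def limit(self, column: str, minvalue: float, maxvalue: float) -> "MiniFrame":
        # Return a new object instead of modifying self, so calls can be chained.
        kept = [r for r in self.rows if minvalue <= r[column] <= maxvalue]
        return MiniFrame(kept)


frame = MiniFrame([
    {"x": -7.0, "intensity": 500.0},
    {"x": 1.0, "intensity": 450.0},
    {"x": 2.0, "intensity": 100.0},
    {"x": 4.0, "intensity": 900.0},
])
small = frame.limit("x", -5, 5).limit("intensity", 400, 1000)
print(len(frame), len(small))
```

Because the original object is left untouched, intermediate results can be kept and compared with later processing steps.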

You can also plot newframe and investigate it further with tooltips on each point.

In [None]:
newframe.plot("intensity",hover_data=["range"])

# Plane segmentation, clustering and overlaying several plots
Please note that not all processing methods are demonstrated here. For more information, please refer to the HTML documentation of the Frame class.
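The keyword names used for plane segmentation in the next cell (distance_threshold, ransac_n, num_iterations) suggest a RANSAC plane fit. Assuming that is what happens internally, the principle can be sketched in plain Python: repeatedly fit a plane through three random points and keep the plane with the most points closer than distance_threshold. The functions fit_plane and ransac_plane are written for this example only, and the sample size is fixed at three.

```python
import random
from typing import List, Tuple

Point = Tuple[float, float, float]


def fit_plane(p1: Point, p2: Point, p3: Point):
    """Return (a, b, c, d) with unit normal, or None if the points are collinear."""
    ux, uy, uz = (p2[i] - p1[i] for i in range(3))
    vx, vy, vz = (p3[i] - p1[i] for i in range(3))
    # The normal vector is the cross product of the two edge vectors.
    a, b, c = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    norm = (a * a + b * b + c * c) ** 0.5
    if norm < 1e-12:
        return None
    a, b, c = a / norm, b / norm, c / norm
    d = -(a * p1[0] + b * p1[1] + c * p1[2])
    return a, b, c, d


def ransac_plane(points: List[Point], distance_threshold=0.01,
                 num_iterations=50, seed=0) -> List[int]:
    """Return the indices of the largest inlier set found."""
    rng = random.Random(seed)
    best: List[int] = []
    for _ in range(num_iterations):
        model = fit_plane(*rng.sample(points, 3))
        if model is None:
            continue  # degenerate sample, try again
        a, b, c, d = model
        inliers = [
            i for i, (x, y, z) in enumerate(points)
            if abs(a * x + b * y + c * z + d) <= distance_threshold
        ]
        if len(inliers) > len(best):
            best = inliers
    return best


# 20 points on the plane z = 0 plus 5 points well above it.
ground = [(float(x), float(y), 0.0) for x in range(5) for y in range(4)]
clutter = [(0.5, 0.5, 1.0 + i) for i in range(5)]
inliers = ransac_plane(ground + clutter)
print(len(inliers))
```

A smaller distance_threshold makes the fit stricter, while more iterations raise the chance of sampling three points that really lie on the dominant plane.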

In [None]:
plane = newframe.plane_segmentation(distance_threshold=0.01, ransac_n=3, num_iterations=50, return_plane_model=True)
print(len(plane))

In [None]:
plane

In [None]:
clusters = newframe.get_cluster(eps=0.5, min_points=10)
cluster1 = newframe.take_cluster(1, clusters)
cluster2 = newframe.take_cluster(2, clusters)
print(len(cluster1))
print(len(cluster2))

In [None]:
type(cluster1)

In [None]:
newframe.plot(color=None, overlay={"Cluster 1": cluster1,"Cluster 2": cluster2})

# Applying functions to the whole dataset
Now we can develop a pipeline and put everything together. The .agg method is powerful but sometimes not flexible enough. So with .apply you can apply a function to the whole dataset.

In [None]:
def isolate_target(frame: lidar.Frame) -> lidar.Frame:
    return frame.limit("x",0,1).limit("y",0,1)

Note the type hints. They are important because they are used to determine whether the result can be a new Dataset or not. If the function returns a Frame, then the result is another Dataset. This is very useful for chaining operations together.
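How a return annotation can be inspected at runtime is sketched below with typing.get_type_hints from the standard library. This shows the general mechanism only; how the package evaluates the hints internally is not shown in this notebook, and the Frame class and the helper returns_frame are placeholders.

```python
from typing import Callable, get_type_hints


class Frame:
    """Hypothetical placeholder for lidar.Frame."""


def keep_all(frame: Frame) -> Frame:
    return frame


def count_points(frame: Frame) -> int:
    return 0


def returns_frame(func: Callable) -> bool:
    """True if the function is annotated to return a Frame."""
    return get_type_hints(func).get("return") is Frame


print(returns_frame(keep_all))
print(returns_frame(count_points))
```

A function without a return annotation yields no "return" entry at all, so a check like this cannot tell what it returns.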

In [None]:
testset.apply(isolate_target)

So the result is another Dataset. Now we can chain things together.

In [None]:
def diff_to_frame(frame: lidar.Frame, to_compare: lidar.Frame) -> lidar.Frame:
    return frame.diff("frame", to_compare)

In [None]:
result = testset.apply(isolate_target).apply(diff_to_frame, to_compare=testset[0])

Note that this uses lazy evaluation from dask, and therefore the result is only calculated when needed. So you can develop a complex chain and then investigate the results.
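The effect of lazy evaluation can be sketched without dask: the steps are only recorded when apply is called and run when a single item is requested. LazyPipeline is a plain-Python stand-in written for this example; it is neither dask nor the package's Dataset.

```python
from typing import Callable, List, Tuple


class LazyPipeline:
    """Toy pipeline that records steps and runs them only on item access."""

    def __init__(self, items: List[float], steps: Tuple[Callable, ...] = ()):
        self._items = items
        self._steps = tuple(steps)

    def apply(self, func: Callable[[float], float]) -> "LazyPipeline":
        # Nothing is computed here; the step is only remembered.
        return LazyPipeline(self._items, self._steps + (func,))

    def __getitem__(self, index: int) -> float:
        value = self._items[index]
        for step in self._steps:
            value = step(value)
        return value


calls: List[float] = []


def double(value: float) -> float:
    calls.append(value)  # record every real computation
    return value * 2


pipeline = LazyPipeline([1.0, 2.0, 3.0]).apply(double).apply(double)
print(len(calls))   # building the chain triggered no computation
print(pipeline[1])  # runs both steps for one item only
print(len(calls))
```

Accessing one item costs only the work for that item, which is the reason a long chain over a huge dataset stays cheap until results are requested.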

In [None]:
result[1]

Now we can examine the result even further by using .agg from before:

In [None]:
result.agg({"x difference":"max"},"frame")