# Clustering

This notebook contains the clustering algorithms that were implemented.

---

The implemented algorithms are:

1. Region growing clustering
    - Random walker clustering 
    - Watershed clustering
    
2. Pointcloud segmentation
    - Standard plane segmentation
    - Cylinder model segmentation
    
3. ...

---

Code and an explanation for each algorithm follow below.


### Utils
The code below contains the functionality needed to run the algorithms.
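
As a small illustration of why the loader drops the first row: `np.genfromtxt` parses a textual header row as NaN values. The sketch below uses an in-memory CSV whose column names and values are hypothetical; only the column positions used later in this notebook (column 0 for DBZ, columns 3 to 5 for X, Y, Z) are taken from the code.

```python
import io
import numpy as np

# Hypothetical CSV content; the column names and values are illustrative only.
csv_text = (
    "dbz,a,b,x,y,z\n"
    "-1.5,0,0,1.0,2.0,3.0\n"
    "8.0,0,0,4.0,5.0,6.0\n"
)

# A non-numeric header row is parsed as a row of NaN values.
raw = np.genfromtxt(io.StringIO(csv_text), delimiter=',')
header_is_nan = bool(np.isnan(raw[0]).all())

# Dropping the first row removes the NaN header, as the loader does.
data = raw[1:]

# Alternative: let genfromtxt skip the header line itself.
data_skipped = np.genfromtxt(io.StringIO(csv_text), delimiter=',', skip_header=1)
```

Both variants yield the same array here; `skip_header=1` simply avoids creating the NaN row in the first place.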

In [None]:
import numpy as np

INPUT_FILENAME = "csvData/160930/160930-20-00.csv"

def importDataFromCSV(fileName):
    # Read the comma-separated file as a float array
    original = np.genfromtxt(fileName, delimiter=',')
    # Drop the first row (assumed to be the header)
    data = original[1:]
    return data

# Import data from input file
DATA = importDataFromCSV(INPUT_FILENAME)

print("Imported data from file: " + INPUT_FILENAME)

## Region growing clustering

How region growing works is described in our final report. Two region growing algorithms were tried: random walker and watershed.
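
As a rough illustration of the general idea, a minimal seeded region growing sketch on a 2D grid is given here. It is not taken from the final report or from scikit-image: the function name `grow_region`, the tolerance rule and the sample grid are all assumptions made for this example.

```python
from collections import deque
import numpy as np

def grow_region(values, seed, tolerance):
    """Return a boolean mask of cells connected to `seed` (4-connectivity)
    whose value differs from the seed value by at most `tolerance`."""
    values = np.asarray(values, dtype=float)
    mask = np.zeros(values.shape, dtype=bool)
    seed_value = values[seed]
    mask[seed] = True
    queue = deque([seed])
    while queue:
        row, col = queue.popleft()
        for d_row, d_col in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            n_row, n_col = row + d_row, col + d_col
            inside = 0 <= n_row < values.shape[0] and 0 <= n_col < values.shape[1]
            if not inside or mask[n_row, n_col]:
                continue
            # Accept the neighbour only if it is similar enough to the seed.
            if abs(values[n_row, n_col] - seed_value) <= tolerance:
                mask[n_row, n_col] = True
                queue.append((n_row, n_col))
    return mask

# Hypothetical grid of DBZ-like values with one block of high values.
grid = np.array([
    [0, 0, 0, 0],
    [0, 9, 8, 0],
    [0, 9, 9, 0],
    [0, 0, 0, 9],
])
region = grow_region(grid, seed=(1, 1), tolerance=1.5)
```

The isolated high value in the bottom-right corner is not part of the region, because it is not connected to the seed through similar neighbours.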

---

### Random walker algorithm
The random walker algorithm is implemented below. It requires the scikit-image package, which provides *skimage.segmentation*.
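
Random walker segmentation starts from seed markers and assigns the remaining points to one of the seed labels. The sketch below shows how such markers could be derived from DBZ thresholds. The thresholds 0 and 7 are the ones used in this notebook; the sample values are hypothetical.

```python
import numpy as np

# Hypothetical DBZ values for six points.
dbz = np.array([-3.0, 0.5, 7.5, 2.0, 12.0, -0.1])

# 0 marks points without a seed label; 1 and 2 are the two seed classes.
markers = np.zeros(dbz.shape, dtype=np.uint8)
markers[dbz < 0] = 1   # seed label 1: DBZ below 0
markers[dbz > 7] = 2   # seed label 2: DBZ above 7

# Points still at 0 are the ones the segmentation has to decide on.
unlabelled = int((markers == 0).sum())
```

Points with a DBZ value between 0 and 7 keep marker 0 and are left for the algorithm to label.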

In [None]:
from skimage.segmentation import random_walker

# Random walker segmentation
def createRWSegmentation(data):
    # Format data for input as [X,Y,Z,DBZ]
    inputData = data[:,[3,4,5,0]]

    # Creating markers for dbz below 0 and higher than 7 (0 means unlabelled)
    markers = np.zeros(inputData.shape, dtype=np.uint)
    markers[data[:,0] < 0] = 1
    markers[data[:,0] > 7] = 2

    # Generating cluster-labels with random walker segmentation
    labels = np.array(random_walker(inputData, markers, beta=10, mode='bf'))

    # Appending labels to the input data resulting in [X,Y,Z,DBZ,Labels]
    result = np.append(inputData, np.reshape(labels[:,0],(-1,1)),axis = 1)
    
    # Export results to output file
    np.savetxt("output/randomWalkerSegmentation.csv",result, delimiter=',')
    
    print("Exporting results to: output/randomWalkerSegmentation.csv")

    
print("****************Running random walker segmentation*****************")
createRWSegmentation(DATA)
print("****************Ending random walker segmentation*****************")

### Watershed algorithm
The watershed algorithm is implemented below. It requires the scikit-image package; `watershed` is provided by *skimage.segmentation* in recent releases and by *skimage.morphology* in older ones.
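
To give an impression of what a marker-based watershed does, here is a simplified sketch that floods a grid from the seed markers, always expanding from the lowest value first. It is an illustration under stated assumptions (4-connectivity, ties broken by insertion order), not the scikit-image implementation; the name `marker_watershed` and the sample profile are hypothetical.

```python
import heapq
import numpy as np

def marker_watershed(image, markers):
    """Flood `image` from the non-zero `markers`, lowest values first."""
    image = np.asarray(image, dtype=float)
    labels = np.array(markers, dtype=int)
    heap = []
    counter = 0  # tie-breaker: equal values are processed in insertion order
    for row, col in zip(*np.nonzero(labels)):
        heapq.heappush(heap, (image[row, col], counter, int(row), int(col)))
        counter += 1
    while heap:
        _, _, row, col = heapq.heappop(heap)
        for d_row, d_col in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            n_row, n_col = row + d_row, col + d_col
            inside = 0 <= n_row < image.shape[0] and 0 <= n_col < image.shape[1]
            if inside and labels[n_row, n_col] == 0:
                # An unlabelled neighbour inherits the label of the flooding cell.
                labels[n_row, n_col] = labels[row, col]
                heapq.heappush(heap, (image[n_row, n_col], counter, n_row, n_col))
                counter += 1
    return labels

# Hypothetical one-row "height profile" with a ridge between two basins.
profile = np.array([[1, 2, 9, 3, 2, 1]])
seeds = np.array([[1, 0, 0, 0, 0, 2]])
flooded = marker_watershed(profile, seeds)
```

Each basin is filled from its own seed, and the two labels meet at the ridge.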

In [None]:
# watershed is provided by skimage.segmentation in recent scikit-image releases;
# fall back to the old location for older installations
try:
    from skimage.segmentation import watershed
except ImportError:
    from skimage.morphology import watershed

# Watershed segmentation
def createWatershedSegmentation(data):
    # Format data for input as [X,Y,Z,DBZ]
    inputData = data[:,[3,4,5,0]]

    # Creating markers for dbz below 0 and higher than 7 (0 means unlabelled)
    markers = np.zeros(inputData.shape, dtype=np.uint)
    markers[data[:,0] < 0] = 1
    markers[data[:,0] > 7] = 2

    # Generating cluster-labels with watershed segmentation
    labels = np.array(watershed(inputData, markers))

    # Appending labels to the input data resulting in [X,Y,Z,DBZ,Labels]
    result = np.append(inputData, np.reshape(labels[:,0],(-1,1)),axis = 1)
    
    # Export results to output file
    np.savetxt("output/watershedSegmentation.csv",result, delimiter=',')

    print("Exporting results to: output/watershedSegmentation.csv")

    
print("****************Running watershed segmentation*****************")
createWatershedSegmentation(DATA)
print("****************Ending watershed segmentation*****************")

## Pointcloud segmentation

This pointcloud segmentation uses the Point Cloud Library (http://docs.pointclouds.org/trunk/), which contains several methods for clustering and segmenting pointclouds. For our purposes we needed a Python implementation of this library (http://strawlab.github.io/python-pcl/). However, this Python wrapper turned out not to contain all the functionality of the original library yet. It is therefore fairly limited and did not lead to interesting clusterings. The biggest problem is that only x, y and z coordinates can be passed in, not additional features such as a DBZ value. The code below is included to document the attempt and to make further implementation possible. Two algorithms were tried: a fairly simple segmentation algorithm and a cylinder model algorithm.
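
The segmenters used below fit a geometric model with RANSAC and a distance threshold. To show what that means without depending on the wrapper, here is a minimal NumPy sketch of RANSAC plane fitting. It is an illustration only, not the PCL implementation; the function name `ransac_plane`, its parameters and the synthetic cloud are assumptions made for this example.

```python
import numpy as np

def ransac_plane(points, distance_threshold, iterations=200, seed=0):
    """Return the indices of the points within `distance_threshold`
    of the best plane found by random sampling."""
    rng = np.random.default_rng(seed)
    best_inliers = np.array([], dtype=int)
    for _ in range(iterations):
        # Three random points define a candidate plane.
        sample = points[rng.choice(len(points), size=3, replace=False)]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(normal)
        if norm < 1e-12:
            continue  # the three points are (nearly) collinear
        # Perpendicular distance of every point to the candidate plane.
        distances = np.abs((points - sample[0]) @ (normal / norm))
        inliers = np.flatnonzero(distances <= distance_threshold)
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    return best_inliers

# Hypothetical cloud: 50 points on the plane z = 0 plus 5 points well above it.
rng = np.random.default_rng(1)
plane = np.column_stack([rng.uniform(0, 10, 50), rng.uniform(0, 10, 50), np.zeros(50)])
outliers = np.column_stack([rng.uniform(0, 10, 5), rng.uniform(0, 10, 5), rng.uniform(5, 10, 5)])
cloud = np.vstack([plane, outliers])

inlier_idx = ransac_plane(cloud, distance_threshold=0.1)

# Complement of the inliers: the points that do not belong to the plane.
outlier_mask = np.ones(len(cloud), dtype=bool)
outlier_mask[inlier_idx] = False
```

The split into inliers and remaining points corresponds to the two clouds that the cells below extract with `negative=False` and `negative=True`.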

---

The Python wrapper is imported below. Installation instructions can be found at https://github.com/strawlab/python-pcl.

In [None]:
import pcl

### Standard segmentation algorithm

In [None]:
# Function to create a standard pointcloud segmentation
def createPointcloudSegmentation(data):
    # Format data for input as [X,Y,Z]
    inputData = data[:,[3,4,5]]

    # Create pointcloud with data imported as float32
    pointCloud = pcl.PointCloud(inputData.astype('float32'))

    # Create segmentation algorithm: fit a plane model with RANSAC
    segmenter = pointCloud.make_segmenter()
    segmenter.set_model_type(pcl.SACMODEL_PLANE)
    segmenter.set_method_type(pcl.SAC_RANSAC)
    segmenter.set_distance_threshold(50)
    
    # Run segmentation algorithm
    indices, model = segmenter.segment()
    
    if len(indices) != 0:
        # Split the cloud in two: "True" holds the points outside the fitted model (negative=True),
        # "False" holds the points that belong to it (negative=False)
        cloud_segmented_true = pointCloud.extract(indices, negative=True)
        cloud_segmented_false = pointCloud.extract(indices, negative=False)

        # Save pointclouds
        pcl.save(cloud_segmented_true, 'output/standardPCLsegmentation-True.pcd')
        pcl.save(cloud_segmented_false, 'output/standardPCLsegmentation-False.pcd')
        
        print("Exporting results to: output/standardPCLsegmentation-True.pcd")
        print("Exporting results to: output/standardPCLsegmentation-False.pcd")
    else:
        print("Indices are empty!")
    
    
print("****************Starting standard pointcloud segmentation*****************")
createPointcloudSegmentation(DATA)
print("****************Ending standard pointcloud segmentation*****************")

### Cylinder segmentation model

In [None]:
# Function to create a cylinder model pointcloud segmentation
def createCylinderModelSegmentation(data):
    # Format data for input as [X,Y,Z]
    inputData = data[:,[3,4,5]]

    # Create pointcloud with data imported as float32
    pointCloud = pcl.PointCloud(inputData.astype('float32'))

    # Create segmentation algorithm that also uses surface normals
    # NOTE: the model type below fits a plane using normals (SACMODEL_NORMAL_PLANE);
    # an actual cylinder fit would need pcl.SACMODEL_CYLINDER instead
    segmenter = pointCloud.make_segmenter_normals(ksearch=50)
    segmenter.set_optimize_coefficients(True)
    segmenter.set_model_type(pcl.SACMODEL_NORMAL_PLANE)
    segmenter.set_normal_distance_weight(0.1)
    segmenter.set_method_type(pcl.SAC_RANSAC)
    segmenter.set_max_iterations(500)
    segmenter.set_distance_threshold(25)
    
    # Run segmentation algorithm
    indices, model = segmenter.segment()

    if len(indices) != 0:
        # Split the cloud in two: "True" holds the points outside the fitted model (negative=True),
        # "False" holds the points that belong to it (negative=False)
        cloud_segmented_true = pointCloud.extract(indices, negative=True)
        cloud_segmented_false = pointCloud.extract(indices, negative=False)

        # Save pointclouds
        pcl.save(cloud_segmented_true, 'output/cylinderModelPCLsegmentation-True.pcd')
        pcl.save(cloud_segmented_false, 'output/cylinderModelPCLsegmentation-False.pcd')
        
        print("Exporting results to: output/cylinderModelPCLsegmentation-True.pcd")
        print("Exporting results to: output/cylinderModelPCLsegmentation-False.pcd")
    else:
        print("Indices are empty!")

print("****************Starting cylinder model pointcloud segmentation*****************")
createCylinderModelSegmentation(DATA)
print("****************Ending cylinder model pointcloud segmentation*****************")