# Sample Notebook 3 for Picasso
This notebook shows how to perform DBSCAN clustering with Picasso.


## Load the localizations

In [1]:
from picasso import io, clusterer
path = 'data/testdata_locs.hdf5'
locs, info = io.load_locs(path)

print('Loaded {} locs.'.format(len(locs)))

Loaded 564 locs.


## DBSCAN implementation in Picasso

In [2]:
help(clusterer.dbscan)

Help on function dbscan in module picasso.clusterer:

dbscan(locs: 'pd.DataFrame', radius: 'float', min_samples: 'int', min_locs: 'int' = 10, pixelsize: 'int | None' = None) -> 'pd.DataFrame'
    Perform DBSCAN on localizations.
    
    See Ester, et al. Inkdd, 1996. (Vol. 96, No. 34, pp. 226-231).
    
    Parameters
    ---------
    locs : pd.DataFrame
        Localizations to be clustered.
    radius : float
        DBSCAN search radius, often referred to as "epsilon". Same units
        as locs.
    min_samples : int
        Number of localizations within radius to consider a given point
        a core sample.
    min_locs : int, optional
        Minimum number of localizations in a cluster. Clusters with
        fewer localizations will be removed. Default is 0.
    pixelsize : int, optional
        Camera pixel size in nm. Only needed for 3D.
    
    Returns
    -------
    locs : pd.DataFrame
        Clusterered localizations, with column 'group' added, which
        specifie
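
To make the roles of `radius`, `min_samples` and `min_locs` concrete, here is a minimal sketch on synthetic data. It uses scikit-learn's `DBSCAN` as a stand-in and is not claimed to be Picasso's own code path; the names `demo_locs`, `blob_a`, `blob_b` and `stray`, the synthetic coordinates, and the size filter are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import DBSCAN

# Synthetic localizations in camera pixels (hypothetical data, not the test file)
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=(10.0, 10.0), scale=0.005, size=(30, 2))
blob_b = rng.normal(loc=(12.0, 10.0), scale=0.005, size=(30, 2))
stray = np.array([[50.0, 50.0], [80.0, 20.0]])  # isolated points, expected to be noise
xy = np.vstack([blob_a, blob_b, stray])
demo_locs = pd.DataFrame(xy, columns=["x", "y"])

radius = 3 / 130   # search radius ("epsilon") in camera pixels
min_samples = 3    # points within radius needed for a core sample
min_locs = 10      # illustrative post-filter on cluster size

# scikit-learn assigns the label -1 to noise points
demo_locs["group"] = DBSCAN(eps=radius, min_samples=min_samples).fit_predict(xy)

# Drop noise first, then drop clusters with fewer than min_locs members
clustered = demo_locs[demo_locs["group"] != -1]
sizes = clustered.groupby("group")["group"].transform("count")
clustered = clustered[sizes >= min_locs]

print(clustered["group"].value_counts().sort_index())
```

The two dense blobs survive as separate groups, while the isolated points are labeled as noise and dropped before the size filter is applied.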

In [3]:
radius = 3 / 130  # 3 nm expressed in camera pixels, assuming 130 nm per pixel
min_samples = 3
clustered_locs = clusterer.dbscan(locs, radius, min_samples)
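
The expression `3 / 130` converts a radius given in nanometers into camera pixels, the unit in which the localization coordinates are stored. A small helper makes that conversion explicit; `nm_to_pixels` is a hypothetical function, not part of Picasso, and the 130 nm pixel size is an assumption read off the expression above.

```python
def nm_to_pixels(length_nm: float, pixelsize_nm: float) -> float:
    """Convert a length in nanometers to camera pixels."""
    if pixelsize_nm <= 0:
        raise ValueError("pixelsize_nm must be positive")
    return length_nm / pixelsize_nm


pixelsize_nm = 130  # assumed camera pixel size in nm
radius_px = nm_to_pixels(3, pixelsize_nm)
print(f"{radius_px:.4f} px")  # prints "0.0231 px"
```

Keeping the pixel size in a named variable avoids silently mixing units if the same notebook is reused with data from a different camera.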

In [4]:
# compute the cluster centers so they can be saved as well
centers = clusterer.find_cluster_centers(clustered_locs)
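
As a rough illustration of what a cluster center is, the sketch below averages the coordinates of a toy table per `group` label with pandas. This is only an assumption about the basic idea; `clusterer.find_cluster_centers` may compute additional or differently weighted quantities, and the names `demo` and `demo_centers` are hypothetical.

```python
import pandas as pd

# Toy clustered localizations with a 'group' column (hypothetical values)
demo = pd.DataFrame({
    "x": [1.0, 1.2, 0.8, 5.0, 5.2],
    "y": [2.0, 2.1, 1.9, 7.0, 7.2],
    "group": [0, 0, 0, 1, 1],
})

# Unweighted mean position per cluster
demo_centers = demo.groupby("group")[["x", "y"]].mean()
# Number of localizations contributing to each center
demo_centers["n"] = demo.groupby("group").size()

print(demo_centers)
```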

## Save

In [5]:
import os

# Split the input path so the output names can be derived from it
base, ext = os.path.splitext(path)
dbscan_info = {
    "Generated by": "Picasso DBSCAN",
    "Min samples": min_samples,
    "Radius (cam. pixels)": radius,
}
info.append(dbscan_info)  # record the clustering parameters in the metadata
io.save_locs(base + "_dbscan" + ext, clustered_locs, info)
io.save_locs(base + "_dbscan_centers" + ext, centers, info)
print('Complete')

Complete
