# Finding Connected Components

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import imageio.v3 as iio
from skimage import color, filters, measure
import os


In [None]:
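def connected_components(
    filename,
    sigma=1.0,
    thresh=0.5,
    neighborhood=2,
):
    """Label the connected objects in the image stored at `filename`.

    Returns the labeled image and the number of objects found.
    """
    # Load the image and convert it to grayscale (values in [0, 1])
    image = iio.imread(filename)
    gray_image = color.rgb2gray(image)
    # Blur to suppress noise, then keep the pixels darker than `thresh`
    blurred_image = filters.gaussian(gray_image, sigma=sigma)
    binary_mask = blurred_image < thresh
    # Label each connected group of foreground pixels
    labeled_image, n_objects = measure.label(
        binary_mask, connectivity=neighborhood, return_num=True
    )

    return labeled_image, n_objects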

In [None]:
# Set data_dir to the location of your local data directory, relative to this notebook
data_dir = '../../data/'
file = os.path.join(data_dir, 'shapes-01.jpg')

image = iio.imread(file)
plt.imshow(image)

In [None]:

# Label the objects using a stronger blur and a higher threshold than the defaults
labeled_im, count = connected_components(file, sigma=2, thresh=0.9)
colored_label_im = color.label2rgb(labeled_im, bg_label=0)
print(count)
plt.imshow(colored_label_im)
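
The `neighborhood` argument of `connected_components` is passed on to the `connectivity` parameter of `measure.label`. For a 2D image, a value of 1 joins only pixels that share an edge, while a value of 2 also joins pixels that touch at a corner. The following minimal sketch shows the difference on a tiny synthetic array; the names `diag`, `n_edge` and `n_corner` are made up for this illustration and are not part of the lesson data.

```python
import numpy as np
from skimage import measure

# Two foreground pixels that touch only at a corner (illustrative data)
diag = np.array([[1, 0],
                 [0, 1]], dtype=bool)

# connectivity=1: only edge-sharing pixels belong to the same object
_, n_edge = measure.label(diag, connectivity=1, return_num=True)
# connectivity=2: corner-touching pixels are joined as well
_, n_corner = measure.label(diag, connectivity=2, return_num=True)

print(n_edge, n_corner)
```

With edge-only connectivity the two pixels are counted as separate objects, and with corner connectivity they merge into one.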

## Exercise:

  1. Using the `connected_components` function, find two ways of outputting the number of objects found.
  2. Does this number correspond with your expectation? Why/why not?
  3. Play around with the `sigma` and `thresh` parameters.  
    a. How do these parameters influence the number of objects found?  
    b. OPTIONAL: Can you find a set of parameters that will give you the expected number of objects?


# Morphometrics

Morphometrics is the quantitative analysis of objects in terms of properties such as size and shape. In the shapes image, our intuition tells us that each object should have at least a certain size, or area, so we can use a minimum area as the criterion for counting a region as an object. To apply such a criterion, we need a way to calculate the area of the objects found by connected components. Recall that in the Thresholding episode we determined the root mass by counting the pixels in a binary mask. Here, however, we want the area of each of several objects in the labeled image.

The skimage library provides the function `skimage.measure.regionprops` to measure the properties of labeled regions. It returns a list of `RegionProperties` objects, one for each connected region in the image, and each property is accessed as an attribute of that object. Here we will use the properties `area` and `label`. You can explore the skimage documentation to learn about the other properties available.
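
To make the attribute access concrete, here is a minimal sketch on a small synthetic mask rather than the shapes image. The array `mask` and the names `toy_labels`, `toy_regions` and `toy_summary` are made up for this illustration; only `measure.label` and `measure.regionprops` come from skimage.

```python
import numpy as np
from skimage import measure

# Synthetic binary mask (illustrative data): a 2x2 square and a 1x3 bar
mask = np.zeros((6, 6), dtype=bool)
mask[0:2, 0:2] = True   # square covering 4 pixels
mask[4, 1:4] = True     # bar covering 3 pixels

# Label the two objects, then measure each labeled region
toy_labels = measure.label(mask, connectivity=2)
toy_regions = measure.regionprops(toy_labels)

# Collect (label, area in pixels) for every region
toy_summary = [(region.label, int(region.area)) for region in toy_regions]
print(toy_summary)
```

Each entry pairs a region's `label` with its `area`, so the square and the bar can be told apart by size alone.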

We can get a list of areas of the labeled objects as follows:

In [None]:

roi_list = measure.regionprops(labeled_im)

def get_object_areas(roi_list):
    return [roi.area for roi in roi_list]

areas = get_object_areas(roi_list)
areas

In [None]:
fig, ax = plt.subplots()
ax.hist(areas)
ax.ticklabel_format(axis='x', style='sci', scilimits=(4, 4))
ax.set_xlabel('Area (pixels)')
ax.set_ylabel('Number of objects')



In [None]:
# Keep only the regions whose area (in pixels) exceeds min_area
min_area = 5e4
large_rois = [roi for roi in roi_list if roi.area > min_area]
get_object_areas(large_rois)


## Exercise:

Adjust the `connected_components` function so that it accepts `min_area` as an input argument and returns only the regions whose area exceeds this minimum.

HINT: Check out the [skimage.morphology](https://scikit-image.org/docs/stable/api/skimage.morphology.html) module.