# Workshop 3: Features

In Lecture 6 we talked about feature extraction. In this workshop we will put into practice some of the methods we learned in the lecture.

## Learning objectives

At the end of this workshop you will be able to:

- Extract blob-like features from images and compare different algorithms
- Use the Hough transform to detect lines and circles in images

## Blob detection

Let's start by looking at blob detection. We will use the [`skimage.feature.blob_log`](http://scikit-image.org/docs/dev/api/skimage.feature.html#skimage.feature.blob_log) function to detect blobs in an image. As we saw in the lecture, this function uses the Laplacian of Gaussian (LoG) to detect blobs.

Let's start by opening and displaying the `retina.jpg` image [Credits: [Librepath - Own work, CC BY-SA 3.0](https://commons.wikimedia.org/w/index.php?curid=45308378)]. This is a haematoxylin and eosin (H&E) stained image of a retina. Haematoxylin stains nuclei in a dark purple colour; eosin stains the extracellular matrix and the cytoplasm in pink.

In [1]:
# Read the image and plot it
# Your code here

We are now interested in counting how many nuclei are in this image. Since nuclei look like dark, uniform blobs on a light background, we can use the LoG to detect blobs.

First of all, however, we need to do some preprocessing.
The `blob_log` function works on both 2D and 3D images. This is an RGB image, so we need to convert it to grayscale; otherwise `blob_log` will treat the colour channels as a third spatial dimension and give us some strange results.

Furthermore, the function expects bright spots on a dark background, so we need to invert the image.

So let's do that!

1. Convert the image to grayscale
2. Invert the image
3. Display the images to check everything is working as expected!
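The two preprocessing steps can be tried out on a small synthetic image first. This is only a sketch: the array below is a made-up stand-in for the H&E image (a pink background with one dark purple square), and inverting with `1.0 - gray` assumes the grayscale image is a float in the range [0, 1], which is what `rgb2gray` returns.

```python
import numpy as np
from skimage.color import rgb2gray

# Hypothetical stand-in for the H&E image: a pink background with one
# dark purple square playing the role of a nucleus.
rgb = np.ones((50, 50, 3)) * np.array([0.95, 0.75, 0.85])
rgb[20:30, 20:30] = [0.30, 0.10, 0.45]

gray = rgb2gray(rgb)      # 2D array of floats in [0, 1]
inverted = 1.0 - gray     # dark nuclei become bright spots

print("RGB shape:", rgb.shape, "-> grayscale shape:", gray.shape)
print("Nucleus value before/after inversion:", gray[25, 25], inverted[25, 25])
```

After inversion the "nucleus" is brighter than the background, which is what `blob_log` expects.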

In [2]:
# You can either use rgb2gray or pick a color channel. 
# It might be useful to look at individual channels and see if
# nuclei are more evident in a specific one
# Your code here

Now we can finally apply the `blob_log` function to the inverted image.

Remember that the `blob_log` function takes the following arguments:
    
- `image`: The image to detect blobs in
- `min_sigma`: The minimum standard deviation for the LoG filter
- `max_sigma`: The maximum standard deviation for the LoG filter
- `num_sigma`: The number of standard deviations to consider
- `threshold`: The absolute lower bound for maxima to be considered as blobs
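To see these arguments in action before moving to the real image, here is a minimal sketch on a synthetic image. The image, the spot positions and the parameter values are all made up for illustration; they are not tuned for `retina.jpg`.

```python
import numpy as np
from skimage.feature import blob_log

# Synthetic test image: two bright Gaussian spots (sigma = 5 px) on a
# dark background. The positions are arbitrary, chosen for illustration.
true_centres = [(30, 30), (70, 60)]   # (row, column)
yy, xx = np.mgrid[0:100, 0:100]
image = np.zeros((100, 100))
for cy, cx in true_centres:
    image += np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2 * 5.0 ** 2))

# Search over sigma = 3, 4, ..., 8 and ignore weak responses
blobs = blob_log(image, min_sigma=3, max_sigma=8, num_sigma=6, threshold=0.1)

# Each row of the result is (row, column, sigma)
for row, col, sigma in blobs:
    print(f"centre=({row:.0f}, {col:.0f})  sigma={sigma:.1f}")
```

Because the true spot positions are known here, this kind of synthetic image is also a convenient way to check how each parameter affects the detections.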

How do we choose these parameters? You can start with some reasonable values and experiment with them. For example, try to manually check the radius of a nucleus, and use values of $\sigma$ of that order of magnitude. Remember that, for a 2D image, the radius of the blob is approximately $\sqrt{2}\sigma$, so if you use $\sigma=10$ you will detect blobs of radius ~14.

_Note_: you might want to keep the `num_sigma` parameter low (say 5 or 10), at least while tuning the rest, as it will take a long time to run otherwise.
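As a quick worked example of the radius-to-sigma conversion, the sketch below turns a range of radii into `min_sigma` and `max_sigma` values. The two radii are hypothetical numbers, not measurements from `retina.jpg`; replace them with your own estimates.

```python
from math import sqrt

# Hypothetical measurements (in pixels) read off a zoomed-in crop
smallest_radius = 4.0
largest_radius = 12.0

# For a 2D image, radius = sqrt(2) * sigma, so sigma = radius / sqrt(2)
min_sigma = smallest_radius / sqrt(2)
max_sigma = largest_radius / sqrt(2)

print(f"min_sigma ~ {min_sigma:.2f}, max_sigma ~ {max_sigma:.2f}")
```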

We need to check the output of the `blob_log` function. We could print it out, but that would be hard to interpret. Instead, we will plot circles over the image to show the detected blobs.

Let's start by estimating the size of a nucleus.

In [3]:
from math import sqrt
# Estimate the size of a nucleus
# Let's show a crop of the image to zoom in on the nuclei and visually estimate their size

# plt.imshow(___)

# Determine the average nucleus radius (by eye)
# Since radius = sqrt(2) * sigma, sigma = radius / sqrt(2)
# print(___ / sqrt(2))

Now let's apply the `blob_log` function to the image.

In [None]:
from skimage.feature import blob_log

# Remember to use the inverted image!
# blobs = blob_log(____)

# Display the image in colour on the left and black and white on the right
# your code here

# Overlay the blobs on the grayscale image. The centres of the blobs are returned in
# (y, x) order. The third column of the blobs array is sigma, so we convert
# it to a radius using radius = sqrt(2) * sigma (sqrt was imported above)

for b in blobs:
    y, x, sigma = b
    r = sqrt(2) * sigma
    c = plt.Circle((x, y), r, color="red", linewidth=1, fill=False)
    ax[1].add_patch(c)

for a in ax:
    a.axis("off")

plt.tight_layout()


That is not too bad! 
**What are the parts where this approach does not work well? Why do you think this is the case?**

You can try optimising the results by manipulating the input image, e.g. by changing the way you convert it to grayscale, or by manipulating its histogram.
Changing the parameters of the `blob_log` function can, of course, also help improve the results!

Finally, try comparing the `blob_log`, `blob_dog` and `blob_doh` functions. **Which works best?**
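The three detectors share a very similar interface, so they can be run side by side. The sketch below does this on a synthetic image with a single Gaussian spot; the parameter values are illustrative guesses, not recommendations for the retina image. One thing to keep in mind when plotting: for `blob_log` and `blob_dog` the radius is approximately $\sqrt{2}\sigma$, whereas for `blob_doh` the returned sigma is already approximately the radius.

```python
import numpy as np
from skimage.feature import blob_dog, blob_doh, blob_log

# Synthetic image: one bright Gaussian spot (sigma = 5 px) on a dark background
yy, xx = np.mgrid[0:80, 0:80]
image = np.exp(-((yy - 40) ** 2 + (xx - 40) ** 2) / (2 * 5.0 ** 2))

# Illustrative parameter values, not tuned for any real image
results = {
    "LoG": blob_log(image, min_sigma=3, max_sigma=8, num_sigma=6, threshold=0.1),
    "DoG": blob_dog(image, min_sigma=3, max_sigma=8, threshold=0.1),
    "DoH": blob_doh(image, min_sigma=3, max_sigma=8, num_sigma=6, threshold=0.01),
}

# Every detector returns an array with one (row, column, sigma) row per blob
for name, blobs in results.items():
    print(f"{name}: {len(blobs)} blob(s) detected")
```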

**How did it go for you? Post your results on Slack!**

**Can you think of a way to quantitatively evaluate how good different solutions are?**

### Saving results to file

Now is probably a good time to save our results to file, so that we can use the data further down the line.

Although we could use Python base I/O functions to read/write files (see the [Python documentation](https://docs.python.org/3/tutorial/inputoutput.html#reading-and-writing-files) on how to do that), when dealing with tabular data, the `pandas` library is super helpful!

If you have not already, you should install `pandas` (e.g. through `pip install pandas`). There are some super-useful tutorials on how to use `pandas` [on their website](https://pandas.pydata.org/docs/getting_started/intro_tutorials/index.html), but essentially for this workshop we will be:

- Importing `pandas` (commonly imported as `pd`)
- Converting our results into a `DataFrame`
- Using the `to_csv` method to save to a CSV file.

In [None]:
import pandas as pd

# We convert the results from blob_log to a pandas DataFrame
# We can specify column names to make things prettier :)
blobs_df = pd.DataFrame(data=blobs, columns=["Center_Y", "Center_X", "Sigma"])

# Let's see what we have
print(blobs_df)

# Now, save to file. We can use index=False to avoid writing the index (the row number)
blobs_df.to_csv("blobs.csv", index=False)

Et voilà, you now have a CSV output of your analysis! You could read this file back for further analysis: for example, you could calculate the density of nuclei in each layer of the retina, look at the average distance between nuclei, or check whether there is a relationship between radius and position.
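As a rough sketch of what such a follow-up analysis could look like, the example below reads a CSV with the same columns back in, adds a radius column, and computes the mean distance from each nucleus to its nearest neighbour. The three detections are made-up numbers written to a temporary file, so that the example does not depend on your own `blobs.csv`.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Made-up detections standing in for the contents of blobs.csv
example = pd.DataFrame(
    {"Center_Y": [10.0, 10.0, 40.0],
     "Center_X": [10.0, 20.0, 20.0],
     "Sigma": [3.0, 3.5, 4.0]}
)

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "blobs.csv")
    example.to_csv(csv_path, index=False)
    blobs_df = pd.read_csv(csv_path)

# Blob radius from sigma (2D LoG: radius = sqrt(2) * sigma)
blobs_df["Radius"] = np.sqrt(2) * blobs_df["Sigma"]

# Distance from each nucleus to its nearest neighbour
centres = blobs_df[["Center_Y", "Center_X"]].to_numpy()
diff = centres[:, None, :] - centres[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=-1))
np.fill_diagonal(dist, np.inf)        # ignore the distance of a point to itself
nearest = dist.min(axis=1)

print(blobs_df)
print("Mean nearest-neighbour distance:", nearest.mean())
```

The pairwise distance matrix is fine for a few thousand nuclei; for much larger data sets a spatial index would be a better choice.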

## Counting cells

In this exercise we are going to create a few functions to estimate cell density given a photo of a haemocytometer, just like we saw in the lecture. If you want to know more about how a haemocytometer works, please see the [Wikipedia page](https://en.wikipedia.org/wiki/Hemocytometer).
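The grid of a haemocytometer is made of straight lines, which is where the linear Hough transform comes in. As a warm-up, here is a minimal sketch of `hough_line` and `hough_line_peaks` on a synthetic edge image. The image, with one vertical and one horizontal line, is made up for illustration and is much cleaner than a real Canny output.

```python
import numpy as np
from skimage.transform import hough_line, hough_line_peaks

# Synthetic "edge" image: one vertical and one horizontal line
edges = np.zeros((100, 100), dtype=bool)
edges[:, 30] = True    # vertical line at x = 30
edges[50, :] = True    # horizontal line at y = 50

# Accumulator, tested angles (radians) and distances (pixels)
hspace, angles, dists = hough_line(edges)

# Keep the two strongest peaks; each peak corresponds to one detected line
accum, theta, rho = hough_line_peaks(hspace, angles, dists, num_peaks=2)

for angle, dist in zip(theta, rho):
    print(f"angle = {np.degrees(angle):.0f} deg, distance = {dist:.0f} px")
```

Each line is described by the angle of its normal and its distance from the origin, which is the same (`theta`, `rho`) pair used by the plotting code later in this exercise.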

In [None]:
# Import required libraries
# Your code here

# Read the yeast_count.jpg file
img = ____
# Convert to grayscale
img = ____
# Apply an unsharp mask to sharpen the image (this makes edges easier to detect)
img = ___
# Detect edges using Canny
img_canny = ___

# Use the linear Hough transform to detect the lines in the image
# See the lecture for how to do this using hough_line and hough_line_peaks!

# Your code here

# Use the circular Hough transform to detect cells in the image
# Cells are pretty small, we can try radii of 3-5 pixels

# range() excludes its end point, so range(3, 6) gives radii 3, 4 and 5
radii = range(3, 6)
# Look at the help of hough_circle and hough_circle_peaks functions to see what parameters are needed!
cells = hough_circle(_____) 
circle_accum, cx, cy, radii = hough_circle_peaks(_____)

# Now show the image with the lines and circles overlaid

# Show the image
# Your code here

# Draw lines

for angle, dist in zip(theta, rho):
    (x0, y0) = dist * np.array([np.cos(angle), np.sin(angle)])
    plt.axline((x0, y0), slope=np.tan(angle + np.pi/2), linewidth=3, color='orange')

# Draw circles
for c_x, c_y, r in zip(cx, cy, radii):
    c = plt.Circle((c_x, c_y), r, color='purple', linewidth=1, fill=False)
    plt.gca().add_patch(c)

plt.show()
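If the circular Hough step is giving you trouble, it can help to try it on an image where the answer is known. The sketch below draws two circle outlines on a synthetic edge image and recovers them with `hough_circle` and `hough_circle_peaks`. The circle positions and radii are made up for illustration, and `total_num_peaks=2` is used only because we know there are two circles; on the real image you will need a different stopping criterion (e.g. a threshold).

```python
import numpy as np
from skimage.draw import circle_perimeter
from skimage.transform import hough_circle, hough_circle_peaks

# Synthetic "edge" image with two circle outlines
edges = np.zeros((100, 100), dtype=bool)
true_circles = [(40, 60, 4), (70, 20, 3)]   # (row, column, radius)
for r0, c0, rad in true_circles:
    rr, cc = circle_perimeter(r0, c0, rad)
    edges[rr, cc] = True

# One accumulator per candidate radius; np.arange(3, 6) is 3, 4, 5
candidate_radii = np.arange(3, 6)
hspaces = hough_circle(edges, candidate_radii)

# Keep the two strongest circles over all radii
accum, cx, cy, found_radii = hough_circle_peaks(
    hspaces, candidate_radii, total_num_peaks=2
)

for x, y, r in zip(cx, cy, found_radii):
    print(f"centre = (x={x}, y={y}), radius = {r}")
```

Note that `hough_circle_peaks` returns the centres as separate x (column) and y (row) arrays, which is the order `plt.Circle` expects.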

**How many cells were detected?**

In [None]:
# Count cells. We just look at how many circles we've got!
# Your code here

**Is there any issue with counting cells this way?**

**Bonus exercises**: 
1. Can you come up with a way of quantifying the number of cells in the central "macro-square" (the one with 16 smaller squares in it that is defined by the larger lines)?
2. Can you tidy up the code and convert it into a class?

Well done! You got to the end of this workshop!

Feature extraction is a very important part of image analysis. I hope you enjoyed playing with it!