<img src="images/kiksmeisedwengougent.png" alt="Banner" width="1100"/>

<div>
    <font color=#690027 markdown="1">
<h1>DETECTION OF STOMATA</h1>    </font>
</div>

<div class="alert alert-box alert-success">
In this notebook, you will explore a deep neural network to recognize stomata on a microphotograph of a leaf impression.</div>

<div class="alert alert-block alert-warning"> 
The notebook 'Fundamentals of a deep neural network for image recognition' explains how a deep neural network is constructed and trained.</div>

Execute the following code cells to be able to use the functions in this notebook.

In [None]:
import sys
!{sys.executable} -m pip install pymongo

In [None]:
import importlib.util
spec = importlib.util.spec_from_file_location(
    name = "deep_neural_network",                 # no connection between name and file, just module name
    location = ".scripts/deep_neural_network.py"  # path to script
)
deep_neural_network = importlib.util.module_from_spec(spec)
spec.loader.exec_module(deep_neural_network)

<div>
    <font color=#690027 markdown="1">
<h2>1. The Reference Network</h2>    </font>
</div>

In this notebook you use a network consisting of 3 convolutional layers with 32, 64 and 128 filters respectively, followed by 1 feedforward layer with 64 neurons. For training, *stochastic gradient descent* was used as the *optimizer*, with a *learning rate* of 0.01. The same regularization techniques as described in the notebook 'Overfitting' were used.
The following image shows the architecture of this network (in the same way as the networks in the notebook 'Fundamentals of a deep neural network for image recognition').
<img src="images/referentienetwerk.png" width="700"/>
<center> Figure 1: Reference network.</center>
To load this (trained) reference network, execute the following code cell:

In [None]:
deep_neural_network.load_reference_model()

<div>
    <font color=#690027 markdown="1">
<h2>2. Stomata Detection on a Microphotograph of a Leaf</h2>    </font>
</div>

The reference network can classify an input image of 120 x 120 pixels into one of 2 classes: 'Stoma' or 'No stoma'. However, the network should also be able to find all stomata in a large image, for example a photo of 1600 x 1200 pixels, assuming that the stomata are about **120 x 120** pixels in size in that photo.
To achieve this, a ***sliding window*** is used. This is a window of 120 x 120 pixels that cuts a small square out of the large image. The square serves as input for the trained network, which then makes a prediction for that square. Next, the window moves 10 pixels to the right and the process repeats. When the window has covered the entire width of the image, it jumps back to the left side, but 10 pixels lower than before. This repeats until the entire image has been processed in steps of 10 pixels.
The following image provides an illustration. To the right of the image, you can see how confident the network is about the **prediction**. A prediction of '1' means that the network is 100% certain that the square contains a stoma; a prediction of '0' means that the network decides that the square definitely does not contain a stoma. As soon as the value exceeds 0.5 (the threshold value, discussed later), the square turns yellow.
<img src="images/slidingwindow.gif" width ="600"/><br>
<center> Figure 2: Sliding window in action.</center>
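
The sliding window procedure can be sketched as follows. This is a minimal illustration, not the code behind `deep_neural_network`: the names `window_positions`, `detect` and `dummy_predict` are hypothetical, and `dummy_predict` merely stands in for the trained network.

```python
import numpy as np

WINDOW = 120  # window size in pixels, as in the text
STEP = 10     # shift in pixels, as in the text


def window_positions(height, width, window=WINDOW, step=STEP):
    """Yield the (top, left) corner of each window, left to right, top to bottom."""
    for top in range(0, height - window + 1, step):
        for left in range(0, width - window + 1, step):
            yield top, left


def detect(image, predict, thr=0.5, window=WINDOW, step=STEP):
    """Return the centre (x, y) of every window whose prediction exceeds thr."""
    height, width = image.shape[:2]
    hits = []
    for top, left in window_positions(height, width, window, step):
        crop = image[top:top + window, left:left + window]
        if predict(crop) > thr:
            hits.append((left + window // 2, top + window // 2))
    return hits


def dummy_predict(crop):
    """Hypothetical stand-in for the trained network: mean brightness in [0, 1]."""
    return float(crop.mean())


# A 1600 x 1200 photo gives 149 window positions per row and 109 rows.
print(len(list(window_positions(1200, 1600))))  # 16241
```

With a step of 10 pixels, a single photo already requires more than 16,000 predictions, which is why neighbouring windows often report the same stoma.
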
In this way, a prediction is made for each square that is cut out of the large image.<br> Often multiple squares around a single stoma will give a positive output, 'Stoma', because the window shifts by only 10 pixels and neighbouring squares therefore differ very little. To merge these many positive predictions, clustering is used. **Clustering** treats points that lie close to each other as one cluster; the center of each cluster is then taken as the detected stoma. <br>
The following image provides an example of clustering, where the blue crosses are the found points and the red crosses are the centers of the clusters.
<img src="images/clustering.jpg" width="400"/>
<center> Figure 3: Clustering.</center>
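
The text does not say which clustering algorithm the notebook uses. The sketch below is therefore only one simple possibility: points are grouped when they can be reached from each other in hops of at most `radius` pixels, and each group is replaced by its mean. The function name `cluster_centres` and the value of `radius` are assumptions chosen for illustration.

```python
import numpy as np


def cluster_centres(points, radius=30.0):
    """Group points linked by hops of at most `radius`; return each group's mean."""
    points = np.asarray(points, dtype=float)
    n = len(points)
    labels = [-1] * n          # -1 means "not yet assigned to a cluster"
    n_clusters = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        # Start a new cluster and grow it with all reachable neighbours.
        labels[start] = n_clusters
        stack = [start]
        while stack:
            i = stack.pop()
            dists = np.linalg.norm(points - points[i], axis=1)
            for j in np.flatnonzero(dists <= radius):
                if labels[j] == -1:
                    labels[j] = n_clusters
                    stack.append(j)
        n_clusters += 1
    labels = np.array(labels)
    return [tuple(points[labels == k].mean(axis=0)) for k in range(n_clusters)]


# Five positive windows around two hypothetical stomata.
found = [(100, 100), (110, 100), (105, 110), (400, 300), (410, 300)]
centres = cluster_centres(found)
print(len(centres))  # 2
```
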
In detection, the **threshold value** plays an important role. This threshold value determines for which values of the output the network regards the input as a stoma. For example, if you take 0.5 as a threshold value, then all output with a certainty greater than 0.5 will be classified as 'Stoma' and all output less than 0.5 as 'No stoma'.
An important consideration when choosing the threshold value is the trade-off between ***precision*** and ***recall***.
- Precision: the percentage of the found stomata that are actually stomata.
- Recall: the percentage of the stomata in an image that were actually found.

A low threshold results in low precision (more points are considered stomata, so more of them are wrong: there are more false positives) but high recall (more of the actual stomata are found). Conversely, a high threshold results in high precision but low recall (there are more false negatives).
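
The effect of the threshold on precision and recall can be checked on a small example. The scores and labels below are invented purely for illustration, and `precision_recall` is a hypothetical helper, not a function of `deep_neural_network`.

```python
def precision_recall(scores, labels, thr):
    """Compute (precision, recall) when every score above thr counts as 'Stoma'."""
    predicted = [s > thr for s in scores]
    tp = sum(p and l == 1 for p, l in zip(predicted, labels))        # true positives
    fp = sum(p and l == 0 for p, l in zip(predicted, labels))        # false positives
    fn = sum((not p) and l == 1 for p, l in zip(predicted, labels))  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Invented network outputs for six squares; label 1 means the square really shows a stoma.
scores = [0.95, 0.80, 0.65, 0.40, 0.30, 0.10]
labels = [1, 1, 0, 1, 0, 0]

for thr in (0.2, 0.5, 0.7):
    precision, recall = precision_recall(scores, labels, thr)
    print(f"thr={thr}: precision={precision:.2f}, recall={recall:.2f}")
```

In this example, raising the threshold from 0.2 to 0.7 increases precision from 0.60 to 1.00, while recall drops from 1.00 to 0.67.
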

The following code cell has the reference network detect the stomata in three images. You can adjust the threshold value (thr) and interpret the result.

In [None]:
deep_neural_network.find_stomata()

<div>
    <font color=#690027 markdown="1">
<h2>3. Adversarial learning</h2>     </font>
</div>

The field of *adversarial learning* searches for input that can **deceive** a machine, so that it produces an output different from the expected one.
You can apply this yourself to the reference network. It has been trained to recognize stomata on a microphotograph of a leaf impression, so when you give it an image showing something completely different, you would expect it not to find any stomata. However, this turns out not to be the case.
Run the following code cell to see how the network responds to an image of a cat.

In [None]:
deep_neural_network.mislead_network()

You notice that the network has not really learned what a stoma is; it has simply learned to recognize patterns. The cat's eyes have roughly the same shape as a stoma and are therefore detected as stomata.
There are numerous examples of these kinds of 'adversarial images'. Within one area of adversarial learning, people search for images that give one (or multiple) networks a hard time. A well-known example is the set of images of Chihuahuas and muffins, where even a human sometimes has to look twice to tell them apart.
<img src="images/chihuahuamuffin.jpg" width="600"/>
<center> Figure 4: Chihuahua and muffin.</center>
Another category of adversarial learning consists of visible manipulations. In this category, the manipulation is clearly visible to humans, yet a machine can still be deceived. Consider, for example, a self-driving car that no longer recognizes a person because that person is wearing certain makeup. Researchers have also succeeded in deceiving a face recognition network by placing specific patterns on a pair of glasses. In this way, they got the network to report that it saw a certain celebrity ([Accessorize to a Crime: Real and Stealthy Attacks on State-of-the-Art Face Recognition](https://www.cs.cmu.edu/~sbhagava/papers/face-rec-ccs16.pdf)). The following image shows the test subjects with the specially designed glasses at the top and, at the bottom, the individuals that the network claims to see.
<img src="images/adversarialglasses.jpg" width="600"/>
<center> Figure 5: Deceiving with glasses.</center>
A final category consists of invisible manipulations. It is possible to manipulate an image in such a way that it looks like the same image to a human, while a neural network reaches a completely different conclusion. In the paper [Explaining and Harnessing Adversarial Examples](https://arxiv.org/abs/1412.6572), researchers modified an image of a panda by adding small noise values. Before the modification, a deep neural network that classifies images was 57.7% certain that the picture showed a panda; after the modification, the network judged with 99.3% certainty that it showed a gibbon.
<img src="images/adversarialpanda.jpg" width="600"/>
<center> Figure 6: Panda with noise.</center>
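
The cited paper computes such noise with the *fast gradient sign method*: every input value is shifted by a small amount `epsilon` in the direction that increases the network's loss. The sketch below applies that idea to a toy logistic regression model instead of a deep network. The weights `w`, the input `x` and the value of `epsilon` are made up for illustration and have nothing to do with the panda experiment.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fgsm_perturb(x, y, w, b, epsilon):
    """Shift every input value by +/- epsilon in the direction that increases the loss."""
    p = sigmoid(w @ x + b)
    grad_x = (p - y) * w  # gradient of the cross-entropy loss with respect to x
    return x + epsilon * np.sign(grad_x)


# Toy model with 100 inputs (assumed values, not a trained network).
w = np.linspace(-1.0, 1.0, 100)
b = 0.0
x = 0.1 * w   # an input the model confidently assigns to class 1
y = 1.0       # true class of x

epsilon = 0.1
x_adv = fgsm_perturb(x, y, w, b, epsilon)

p_before = sigmoid(w @ x + b)
p_after = sigmoid(w @ x_adv + b)
print(f"certainty for class 1 before: {p_before:.3f}, after: {p_after:.3f}")
```

No input value changes by more than `epsilon`, yet the many small shifts add up and flip the model's decision.
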
This technique can also be abused. Think of a self-driving car that recognizes traffic signs with deep neural networks. Someone with malicious intentions could replace a normal stop sign with a modified one that looks almost the same to humans but is not detected by the car's neural networks. In that case, the car would ignore the stop sign, which is extremely dangerous for its occupants.

### Assignment
- Look for an image with which you will try to deceive the network in the notebook 'Stomatateller'.
- Test it out. Did you succeed?

<div>
<h2>With support from</h2></div>

<img src="images/kikssteun.png" alt="Banner" width="1100"/>

<img src="images/cclic.png" alt="Banner" align="left" width="100"/><br><br>
Notebook KIKS, see <a href="http://www.aiopschool.be">AI At School</a>, by F. wyffels, A. Meheus, T. Neutens & N. Gesquière, is licensed under a <a href="http://creativecommons.org/licenses/by-nc-sa/4.0/">Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License</a>.