# DoxaPy Notebook

https://github.com/brandonmpetty/Doxa

DoxaPy is an image binarization library focused on local adaptive algorithms and metrics.
This notebook will document the API while allowing you to interact with it.

## Setup
The first thing to do when getting started with this library is to install it.
```
pip install doxapy
```
For more details, see: https://pypi.org/project/doxapy

Alternatively, you can build the library from source as described in the README.MD.

From there, it is as simple as importing the library.  NumPy and Pillow are two other libraries we will use in this demonstration.

In [None]:
from PIL import Image
import numpy as np
import doxapy

## Reading an Image
The first step is to read the image you intend to process.  The *read_image* helper function uses Pillow to read a local image and convert it to grayscale.  We then use NumPy to turn that image into an array.

DoxaPy is built on pybind11 and the C++ based Doxa library, making it fast and efficient.  That said, not every aspect of the Doxa framework has been exposed to Python.  For example, the first step for many binarization algorithms is color to grayscale conversion.  A lot of effort has gone into understanding how different grayscale algorithms affect binarization performance.  This is one aspect that is not directly exposed in DoxaPy, but it is worth considering.  Here, we use Pillow's "L" mode to convert the image to grayscale, which applies the "ITU-R 601-2 luma transform."  This is the same as Doxa's BT601 function found in Grayscale.hpp.

In [None]:
def read_image(file):
    return np.array(Image.open(file).convert('L'))
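
To make the grayscale step concrete, here is a minimal NumPy sketch of the ITU-R 601-2 luma weights that Pillow documents for its "L" mode (L = R * 299/1000 + G * 587/1000 + B * 114/1000).  The function name `bt601_luma` and the rounding step are assumptions made for illustration; this is not Doxa's or Pillow's actual code, and results may differ from theirs by a rounding step.

```python
import numpy as np

def bt601_luma(rgb):
    """Approximate a BT.601 grayscale conversion for an H x W x 3 uint8 array."""
    rgb = np.asarray(rgb, dtype=np.float64)
    # ITU-R 601-2 luma weights for the R, G and B channels
    weights = np.array([299, 587, 114], dtype=np.float64) / 1000
    luma = rgb @ weights
    # Rounding to the nearest integer is an assumption of this sketch
    return np.clip(np.rint(luma), 0, 255).astype(np.uint8)

# A tiny 1x3 RGB image: pure red, pure green, pure blue
pixels = np.array([[[255, 0, 0], [0, 255, 0], [0, 0, 255]]], dtype=np.uint8)
gray = bt601_luma(pixels)
print(gray)
```

Green contributes the most to perceived brightness and blue the least, which is why a different weighting scheme can change which pixels end up on either side of a threshold.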

In [None]:
image = read_image("../../README/2JohnC1V3.png")
display(Image.fromarray(image))

## Converting the Image to Binary
Converting an image into black and white may seem easy, but it has been the focus of research spanning decades.  Doxa was designed to expose this research, traditionally mired in PhD-level technical jargon, in an easy to consume fashion.  A lot of work was put into ensuring these algorithms were implemented correctly and efficiently.  Many of these algorithms were first made public by this project, and many of them leverage state-of-the-art enhancements, found nowhere else, to reduce memory utilization and increase speed.  For more information on an individual algorithm, click one of the links below.

### Algorithms
The Doxa library implements a large number of popular and unique local adaptive binarization algorithms.  Each algorithm has a set of parameters that are required for it to operate.  These parameters can vary from algorithm to algorithm.  Doxa provides sensible defaults that are applied automatically unless you supply your own.  Below is a list of algorithms and their defaults:

* **OTSU**
* **BERNSEN** - {"window": 75, "threshold": 100, "contrast-limit": 25}
* **NIBLACK** - {"window": 75, "k": 0.2}
* **SAUVOLA** - {"window": 75, "k": 0.2}
* **WOLF** - {"window": 75, "k": 0.2}
* **NICK** - {"window": 75, "k": -0.2}
* **SU** - {"window": 9, "minN": 9}
* **TRSINGH** - {"window": 75, "k": 0.2}
* **BATAINEH**
* **ISAUVOLA** - {"window": 75, "k": 0.2}
* **WAN** - {"window": 75, "k": 0.2}
* **GATOS** - {"window": 75, "k": 0.2, "glyph": 60}
* **ADOTSU** - {"window": 75, "k": 0.1, "R": 0.1, "distance": window/2}
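
To give a feel for what the "window" and "k" parameters control, the following is a naive sketch of a Sauvola-style threshold, using the commonly published formula T = m * (1 + k * (s / R - 1)), where m and s are the mean and standard deviation inside the window.  The function name `sauvola_naive`, the value R = 128, the border handling, and the `<=` comparison are assumptions for illustration.  Doxa's own implementation is far more efficient and may differ in these details.

```python
import numpy as np

def sauvola_naive(gray, window=75, k=0.2, R=128.0):
    """Naive Sauvola-style binarization, O(pixels * window^2), for illustration only."""
    gray = np.asarray(gray, dtype=np.float64)
    half = window // 2
    height, width = gray.shape
    out = np.empty((height, width), dtype=np.uint8)
    for y in range(height):
        for x in range(width):
            # Clip the window at the image border (an assumption of this sketch)
            patch = gray[max(0, y - half):y + half + 1,
                         max(0, x - half):x + half + 1]
            mean, std = patch.mean(), patch.std()
            threshold = mean * (1.0 + k * (std / R - 1.0))
            # Pixels at or below the local threshold become black, the rest white
            out[y, x] = 0 if gray[y, x] <= threshold else 255
    return out

# Synthetic page: light background (200) with a dark vertical stroke (40)
page = np.full((9, 9), 200, dtype=np.uint8)
page[:, 4] = 40
result = sauvola_naive(page, window=5, k=0.2)
print(result)
```

A larger window averages over more of the page, while a larger k pushes the threshold further below the local mean in flat, low-contrast regions.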

In [None]:
# Allocate an (uninitialized) output array with the same shape and dtype as our grayscale image
binary_image = np.empty(image.shape, image.dtype)

# Create an instance of our algorithm and initialize it based on the characteristics of the incoming image
sauvola = doxapy.Binarization(doxapy.Binarization.Algorithms.SAUVOLA)
sauvola.initialize(image)

# Convert grayscale to binary
sauvola.to_binary(binary_image, {"window": 75, "k": 0.2})
display(Image.fromarray(binary_image))

One of the quickest and most efficient ways of turning your grayscale image into a binary image is to use the *update_to_binary* function.  Instead of allocating more memory to write the image to, it will update the existing image in-place.  It also only takes one line to write!

In [None]:
doxapy.Binarization.update_to_binary(doxapy.Binarization.Algorithms.SAUVOLA, image, {"window": 27, "k": 0.12})
display(Image.fromarray(image))
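
Because an in-place update overwrites the grayscale data, it can help to keep a copy if you still need the original.  The sketch below illustrates the idea with plain NumPy.  `threshold_in_place` is a hypothetical stand-in using a fixed global cutoff; it is not part of DoxaPy and only mimics the overwrite behavior described above.

```python
import numpy as np

def threshold_in_place(img, cutoff=128):
    """Hypothetical stand-in for an in-place binarizer: overwrites img with 0 or 255."""
    img[...] = np.where(img > cutoff, 255, 0)

gray = np.array([[30, 200], [140, 90]], dtype=np.uint8)
backup = gray.copy()  # independent buffer, unaffected by the in-place update
alias = gray          # same buffer, so it changes along with gray

threshold_in_place(gray)

print(gray)    # now holds only 0 and 255
print(backup)  # still holds the original grayscale values
```

Only the explicit `copy()` preserves the grayscale values; a plain assignment such as `alias = gray` refers to the same memory.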

## Performance Metrics
In order to analyze the performance of an algorithm, Doxa provides a set of common metrics that can all be calculated with one function.  To start that process you need an exemplar binary image, or "ground truth."  By comparing the ground truth to the image produced by the binarization algorithm, you can start to compare the effects of different algorithms and algorithm parameters.

In [None]:
groundtruth_image = read_image("../../README/2JohnC1V3-GroundTruth.png")
display(Image.fromarray(groundtruth_image))

In [None]:
# Help us 'pretty print' our JSON
import json

# Both of these were done with the Sauvola algorithm, but with different parameters
performance1 = doxapy.calculate_performance(groundtruth_image, binary_image)
performance2 = doxapy.calculate_performance_ex(groundtruth_image, image, drdm=True, accuracy=True, mcc=True)

print("Sauvola - Window = 75, K = 0.2") # Default
print(json.dumps(performance1, indent=2))
print()
print("Sauvola - Window = 27, K = 0.12") # Adjusted
print(json.dumps(performance2, indent=2))
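
For intuition about what such metrics measure, here is a small NumPy sketch that derives accuracy, F-measure, MCC and PSNR from pixel counts, using their standard textbook definitions.  The function name `binarization_metrics`, the dictionary keys, the 0 to 1 scale, and the choice of black (0) as the foreground are assumptions for illustration; Doxa's own output keys, scaling and exact definitions may differ.

```python
import math
import numpy as np

def binarization_metrics(ground_truth, binary):
    """Pixel-wise metrics, treating black (0) pixels as foreground text."""
    gt = np.asarray(ground_truth) == 0
    pred = np.asarray(binary) == 0
    tp = int(np.count_nonzero(gt & pred))    # text correctly found
    tn = int(np.count_nonzero(~gt & ~pred))  # background correctly found
    fp = int(np.count_nonzero(~gt & pred))   # background marked as text
    fn = int(np.count_nonzero(gt & ~pred))   # text that was missed
    total = tp + tn + fp + fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    # PSNR on images normalized to [0, 1], so the MSE is the fraction of wrong pixels
    mse = (fp + fn) / total
    psnr = 10 * math.log10(1 / mse) if mse else math.inf
    return {"accuracy": (tp + tn) / total, "f_measure": f_measure,
            "mcc": mcc, "psnr": psnr}

truth = np.array([[0, 0, 255, 255], [0, 255, 255, 255]], dtype=np.uint8)
guess = np.array([[0, 255, 255, 255], [0, 255, 0, 255]], dtype=np.uint8)
metrics = binarization_metrics(truth, guess)
print(metrics)
```

Accuracy alone can look high on documents that are mostly background, which is one reason measures such as F-measure and MCC are usually reported alongside it.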