# Introduction to grayscale images
> A short introduction to grayscale images and how they are encoded on the computer. 

- toc: true 
- badges: true
- comments: true
- categories: [images]



In [1]:
#hide
from fastai.vision.all import *

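# Resize all images in the folder to image_size x image_size pixels
image_size = 224

path = Path("2020-09-02")
fns = get_image_files(path)
for filename in fns:
    with Image.open(filename) as img:
        # PIL images are resized with resize(), which takes a (width, height) tuple
        resized_img = img.resize((image_size, image_size))
    resized_img.save(filename)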

# About

Before we can dive into multi-spectral satellite images, I think a quick refresher on how images
are encoded and represented in memory is a good starting point.

## Binary encoding

Let's briefly recap how classical computer vision images are encoded in memory.
Internally, a computer (ignoring quantum computing) only works with binary numbers. A binary digit is either a 0 or a 1, on or off.
Such a binary digit is called a *bit*.
The smallest addressable unit of data is usually a *byte*, which consists of 8 bits.
There are different ways to use these 8 bits (1 byte) to encode our data.
The kind of data we want to store or load defines how we interpret the bits.
If we only want to work with non-negative integers, we use an unsigned integer type.
An unsigned integer with 8 bits can encode all numbers from 0 to 255.
If all bits are 1, also called *set*, the value is 255.
If all bits are 0, the corresponding value is 0.{% fn 1 %}
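
As a small sketch of this encoding, each bit contributes a power of two when it is set. The helper name `bits_to_uint` is made up for illustration and is not part of any library:

```python
def bits_to_uint(bits: str) -> int:
    """Interpret a string of 0s and 1s as an unsigned integer (hypothetical helper)."""
    value = 0
    for bit in bits:
        # Shift the running value one position to the left and add the new bit
        value = value * 2 + int(bit)
    return value


all_set = bits_to_uint("11111111")    # every bit is 1
all_clear = bits_to_uint("00000000")  # every bit is 0
print(all_set, all_clear)  # 255 0

# Python's built-in int() with base 2 performs the same conversion
assert all_set == int("11111111", 2)
```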

## Grayscale images
Images, like everything else in a computer, are encoded purely as binary values.
The simplest images are grayscale images. Each pixel of a grayscale image can only take a shade between black and white, with all the gray shades in between.
*Pixels* are the basic elements of a picture. The word itself, [pixel](https://en.wikipedia.org/wiki/Pixel#Etymology), is a combination of the words picture and element/cell. So an image consists of pixels similar to how a brick wall consists of bricks.

<figure>
    <div style="display: flex; flex-wrap: wrap; justify-content: center">
        <div>
            <figure>
<img src="2020-09-02/brick.jpg">
            <figcaption><center>Pixel</center></figcaption>
            </figure>
        </div>
        <div>
            <figure>
<img src="2020-09-02/brick-wall.jpg">
            <figcaption><center>Complete Image</center></figcaption>
            </figure>
        </div>
    </div>
    <figcaption><center>My weird analogy</center></figcaption>
</figure>

With our simple encoding scheme from the previous section in mind, we can understand how simple 8-bit grayscale images are encoded.
The "8-bit" refers to the [*color depth*](https://en.wikipedia.org/wiki/Color_depth), which indicates how many bits are used per channel.
A grayscale image only has a single channel, ranging from black to white. (We will take a closer look at different channels in the next post.)
For now, we note that our grayscale channel is encoded with 8 bits. Put differently, we use 8 bits for every pixel to show different shades of gray. With 8 bits, we can color each pixel in 256 (2⁸) different ways.
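
To make the relation between color depth and the number of shades concrete, here is a minimal sketch. The function name `gray_levels` is an assumption chosen for this example; `np.iinfo` is the NumPy way to look up the value range of an integer type:

```python
import numpy as np


def gray_levels(bits_per_channel: int) -> int:
    """Number of distinct values a channel with the given bit depth can hold."""
    return 2 ** bits_per_channel


for bits in (1, 8, 16):
    print(f"{bits}-bit channel: {gray_levels(bits)} levels")

# The 8-bit unsigned integer type covers exactly the range 0 to 255
info = np.iinfo(np.uint8)
print(info.min, info.max)  # 0 255
```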

With the [numpy](https://numpy.org/) and [PIL](https://pillow.readthedocs.io/) libraries,
we can easily create our own 8-bit grayscale image by simply changing the value of a byte.

In [5]:
#collapse
import numpy as np
from PIL import Image, ImageOps

def to_grayscale_image(x):
    grayscale_8_bit_mode = "L"
    return Image.fromarray(x, mode=grayscale_8_bit_mode)

def upscale_image(x, img_width=224, img_height=224):
    # When upscaling, the BOX filter repeats pixel values instead of blending them
    return x.resize((img_width, img_height), resample=Image.BOX)

# Image.fromarray expects a numpy array as input
# Datatype is uint8, our unsigned int consisting of 8 bits
# zero is our single byte with value 0
# -> Array has a width and height of 1
zero = np.zeros((1, 1), dtype=np.uint8)

img_values = {
    "pixel_0": zero, 
    "pixel_64": zero + 64, 
    "pixel_192": zero + 192, 
    "pixel_255": zero + 255
}

for name, value in img_values.items():
    img = to_grayscale_image(value)
    img = upscale_image(img)  # already yields a 224 x 224 image
    bordered_img = ImageOps.expand(img, border=1, fill="black")
    # display(bordered_img) # To display in jupyter
    bordered_img.save(f"2020-09-02/{name}.png")


<figure>
    <div style="display: flex; flex-wrap: wrap; justify-content: center">
        <div>
            <figure>
<img src="2020-09-02/pixel_0.png">
            <figcaption><center>0</center></figcaption>
            </figure>
        </div>
        <div>
            <figure>
<img src="2020-09-02/pixel_64.png">
            <figcaption><center>64</center></figcaption>
            </figure>
        </div>
        <div>
            <figure>
<img src="2020-09-02/pixel_192.png">
            <figcaption><center>192</center></figcaption>
            </figure>
        </div>
        <div>
            <figure>
<img src="2020-09-02/pixel_255.png">
            <figcaption><center>255</center></figcaption>
            </figure>
        </div>
    </div>
    <figcaption><center>Visualization of different 8-bit grayscale pixel values</center></figcaption>
</figure>


Until now, we did not care about the [resolution](https://en.wikipedia.org/wiki/Image_resolution#Pixel_resolution) of our images. 
The resolution defines how many pixels we use to visualize the object. A resolution of 1 x 1 corresponds to a single pixel.
But a single-pixel picture cannot retain much information. As shown above, it can only show a single shade of gray.
So let's increase the resolution of the following images to a width of 224 pixels and a height of 224 pixels.
With more pixels, we can show more detail.
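
As a rough sketch of what this means in memory, we can compare the single-pixel array with a 224 x 224 array. This assumes one `uint8` value per pixel, as in our grayscale examples:

```python
import numpy as np

single_pixel = np.zeros((1, 1), dtype=np.uint8)
image = np.zeros((224, 224), dtype=np.uint8)

# size is the number of pixels, nbytes the memory used by the pixel values
print(single_pixel.size, single_pixel.nbytes)  # 1 1
print(image.size, image.nbytes)  # 50176 50176
```

With one byte per pixel, the number of pixels and the number of bytes are identical.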

Now we can extend our previous code to draw gradients!

In [72]:
#collapse
zeros = np.zeros((224, 224), dtype=np.uint8)
x_gradient = np.arange(0, 224, dtype=np.uint8).reshape(1, 224)
y_gradient = np.arange(0, 224, dtype=np.uint8).reshape(224, 1)

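# Using numpy's broadcasting
x_grad_2d = zeros + x_gradient
y_grad_2d = zeros + y_gradient
# uint8 arithmetic wraps around modulo 256, so sums above 255 and
# negative differences wrap around instead of saturating
sum_grad_2d = x_gradient + y_gradient
diff_grad_2d = x_gradient - y_gradient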

# Convert to grayscale image as before
# and save or show files

In [27]:
#hide
to_grayscale_image(x_grad_2d).save("2020-09-02/x_grad_2d.png")
to_grayscale_image(y_grad_2d).save("2020-09-02/y_grad_2d.png")
to_grayscale_image(sum_grad_2d).save("2020-09-02/sum_grad_2d.png")
to_grayscale_image(diff_grad_2d).save("2020-09-02/diff_grad_2d.png")

<figure>
    <div style="display: flex; flex-wrap: wrap; justify-content: center">
        <div>
            <figure>
<img src="2020-09-02/x_grad_2d.png">
            </figure>
        </div>
        <div>
            <figure>
<img src="2020-09-02/y_grad_2d.png">
            </figure>
        </div>
        <div>
            <figure>
<img src="2020-09-02/sum_grad_2d.png">
            </figure>
        </div>
        <div>
            <figure>
<img src="2020-09-02/diff_grad_2d.png">
            </figure>
        </div>
    </div>
    <figcaption><center>Visualization of different 8-bit grayscale images</center></figcaption>
</figure>
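
The sharp edges in the sum and difference gradients come from the 8-bit encoding: `uint8` arithmetic wraps around modulo 256 instead of stopping at 0 or 255. The sketch below illustrates this. The rescaling at the end is only one possible way to avoid the wrap-around and is not what the images above use:

```python
import numpy as np

x_gradient = np.arange(0, 224, dtype=np.uint8).reshape(1, 224)
y_gradient = np.arange(0, 224, dtype=np.uint8).reshape(224, 1)

# Broadcasting a (1, 224) row against a (224, 1) column gives a (224, 224) array
wrapped_sum = x_gradient + y_gradient
wrapped_diff = x_gradient - y_gradient
print(wrapped_sum.shape, wrapped_sum.dtype)  # (224, 224) uint8

print(wrapped_sum[223, 223])  # 190, because (223 + 223) % 256 == 190
print(wrapped_diff[1, 0])     # 255, because (0 - 1) % 256 == 255

# Possible alternative: add in a wider integer type, then rescale to 0..255
wide_sum = x_gradient.astype(np.int32) + y_gradient.astype(np.int32)
scaled_sum = (wide_sum * 255 // wide_sum.max()).astype(np.uint8)
print(scaled_sum.min(), scaled_sum.max())  # 0 255
```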


In [62]:
#hide
peppers_fn = "2020-09-02/peppers.png"
download_url("http://sipi.usc.edu/database/download.php?vol=misc&img=4.2.07", peppers_fn)
with Image.open(peppers_fn) as peppers:
    # Shrink to 224 x 224 pixels, then convert to 8-bit grayscale ("L" mode)
    peppers_grayscale = peppers.resize((224, 224)).convert("L")
peppers_grayscale.save(peppers_fn)

If we don't limit ourselves to simple mathematical operations, we can show images with a high level of detail.

<figure>
        <div>
            <figure>
<img src="2020-09-02/peppers.png">
            </figure>
        </div>
    <figcaption><center>Typical test image</center></figcaption>
</figure>


> Important: Even if these images reveal a lot of information to us humans, in the end, they are only stored as 0s and 1s on the computer.

In [66]:
#hide
with Image.open(peppers_fn) as peppers:
    test_image_encoding = np.array(peppers)


The peppers image consists of the following
values. Each value occupies a single byte
in memory.

In [71]:
#hide_input
test_image_encoding

array([[ 71,  96,  92, ..., 136, 132, 127],
       [ 87, 119, 113, ..., 181, 176, 170],
       [ 83, 114, 111, ..., 178, 174, 166],
       ...,
       [ 92, 120, 108, ..., 191, 198, 198],
       [ 88, 124, 104, ..., 202, 198, 194],
       [ 78, 126, 106, ..., 200, 193, 188]], dtype=uint8)
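
To see the 0s and 1s behind these numbers, we can unpack a few values into their bits. This sketch uses the first three values of the array shown above, typed in by hand so that it runs without the image file:

```python
import numpy as np

# First three pixel values of the first row shown above
first_values = np.array([71, 96, 92], dtype=np.uint8)

# One byte per value
raw_bytes = first_values.tobytes()
print(len(raw_bytes))  # 3

# unpackbits splits every byte into its 8 bits, most significant bit first
bits = np.unpackbits(first_values).reshape(-1, 8)
for value, row in zip(first_values, bits):
    print(int(value), "".join(str(bit) for bit in row))
# 71 01000111
# 96 01100000
# 92 01011100
```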

Increasing the number of pixels (resolution) allows us to encode more details. The color depth tells us how many bits
we use per channel to encode a color. Our 8-bit grayscale pixel can, therefore, encode 256 different shades of gray.
But what if we want to enrich our image with colors?

How to add colors to our image and how remote sensing images are different will be the topic of the next blog post!

Until then, have a productive time! :+1:

{{ 'A quick refresher on how to translate binary numbers to unsigned integers can be found on [ryanstutorials](https://ryanstutorials.net/binary-tutorial/)' | fndetail: 1 }}