# Glasses Detector Demo

## Setup

This cell installs the minimum required **Python** version<sup>1</sup> (`3.12`) and the latest **PyTorch** version (_Nightly_, for compatibility). The cell may take several minutes to run because installing **PyTorch** generally takes a while.

> **Tip**: If you want, you can change the environment type before running the notebook to support GPU acceleration: `Runtime` $\to$ `Change runtime type`.

<sub>[1] Please note that Python script cells cannot be executed directly because the _Colab_ kernel version cannot be set to **3.12** at runtime. For this reason, Python scripts are wrapped in strings, which are then passed as arguments when calling Python `3.12` outside the kernel. When _Colab_ supports **Python 3.12** or newer, all these complications can be removed.</sub>
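
The wrapping workaround from the footnote can be sketched with the standard library alone. This is only a minimal illustration: `SCRIPT` is a hypothetical placeholder for the script strings used later in this notebook, and `sys.executable` stands in for the separately installed Python `3.12` binary (an assumption made so that the snippet runs anywhere).

```python
import subprocess
import sys

# Hypothetical script body; the notebook stores its real scripts in strings like this
SCRIPT = r"""
import sys
print(f'{sys.version_info.major}.{sys.version_info.minor}')
"""

# Assumption: sys.executable stands in for the separately installed Python 3.12
interpreter = sys.executable

# Run the string as a program in a fresh interpreter process, outside the kernel
result = subprocess.run(
    [interpreter, "-c", SCRIPT],
    capture_output=True,
    text=True,
    check=True,
)
child_version = result.stdout.strip()
print("Child interpreter version:", child_version)
```

The notebook cells achieve the same effect with `!python -c "$VARIABLE"`, where the string variable is expanded before the shell command runs.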

In [None]:
!sudo apt-get update -qq -y && sudo apt-get install python3.12 &> /dev/null
!sudo update-alternatives --quiet --install /usr/bin/python3 python3 /usr/bin/python3.12 1
!curl -sS https://bootstrap.pypa.io/get-pip.py | python3.12 &> /dev/null
!pip install --pre torch torchvision --index-url https://download.pytorch.org/whl/nightly/cu121 &> /dev/null
!pip install ipython pyyaml glasses-detector &> /dev/null

Let's download the demo images from the original [GitHub repository](https://github.com/mantasu/glasses-detector).

In [None]:
!mkdir -p data/demo
!wget -q https://raw.githubusercontent.com/mantasu/glasses-detector/main/data/demo/0.jpg -O data/demo/0.jpg
!wget -q https://raw.githubusercontent.com/mantasu/glasses-detector/main/data/demo/1.jpg -O data/demo/1.jpg
!wget -q https://raw.githubusercontent.com/mantasu/glasses-detector/main/data/demo/2.jpg -O data/demo/2.jpg
!wget -q https://raw.githubusercontent.com/mantasu/glasses-detector/main/data/demo/3.jpg -O data/demo/3.jpg
!wget -q https://raw.githubusercontent.com/mantasu/glasses-detector/main/data/demo/4.jpg -O data/demo/4.jpg
!wget -q https://raw.githubusercontent.com/mantasu/glasses-detector/main/data/demo/5.jpg -O data/demo/5.jpg

This just imports the necessary packages.

In [None]:
import os
import json
import yaml
import pickle
import numpy as np
from PIL import Image

# TODO: uncomment when Colab supports Python 3.12
# from glasses_detector import GlassesClassifier, GlassesDetector, GlassesSegmenter

# TODO: remove when Colab supports Python 3.12
from IPython.display import display

This is a utility function that displays all images from a directory side by side in a single row.

In [None]:
def display_images_in_row(directory, padding=10, sort=False):
    # Load all images in the directory
    files = sorted(os.listdir(directory)) if sort else os.listdir(directory)
    images = [Image.open(os.path.join(directory, img)) for img in files]

    # Create a new image with enough width to hold all images and padding
    total_width = sum(img.width for img in images) + padding * (len(images) - 1)
    max_height = max(img.height for img in images)
    new_img = Image.new("RGB", (total_width, max_height))

    # Paste images into new image with padding
    x = 0
    for img in images:
        new_img.paste(img, (x, 0))
        x += img.width + padding

    # Display collage
    display(new_img)
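
The layout arithmetic used by `display_images_in_row` can be checked without loading any images. The sketch below uses a hypothetical helper, `row_layout`, and made-up image sizes. It only mirrors the width, height and offset computations of the function above, and it additionally rejects an empty input (which the original function does not handle).

```python
def row_layout(sizes, padding=10):
    """Return the canvas size and the x-offset of each image in a single row.

    `sizes` is a list of (width, height) pairs. This helper is hypothetical
    and only mirrors the arithmetic of `display_images_in_row`.
    """
    if not sizes:
        raise ValueError("at least one image size is required")

    # The canvas is as wide as all images plus the gaps between neighbours
    total_width = sum(w for w, _ in sizes) + padding * (len(sizes) - 1)
    max_height = max(h for _, h in sizes)

    # Each image starts where the previous one ended, plus one gap
    offsets, x = [], 0
    for w, _ in sizes:
        offsets.append(x)
        x += w + padding

    return (total_width, max_height), offsets


# Made-up sizes, purely for illustration
canvas, offsets = row_layout([(100, 80), (50, 120), (70, 60)], padding=10)
print(canvas, offsets)
```

Note that images shorter than the tallest one are pasted at the top, so the area below them keeps the default black background of the new canvas.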

## Classification

### Command Line

Let's first run the simplest prediction to see if the person in `demo/0.jpg` is wearing glasses. If we don't specify the task `kind`, it defaults to `anyglasses` for _classification_. The default `--format` for classification is `str`, meaning _"present"_ is output for positive predictions and _"absent"_ for negative ones.

In [None]:
!glasses-detector --input data/demo/0.jpg --task classification

Let's now predict whether _sunglasses_ are present in every image under `demo/`. In this example, let's choose to save the predictions to a single `csv` file and encode them as integers.

In [None]:
!glasses-detector -i data/demo -o is_sunglasses.csv --format int --task classification:sunglasses
!cat is_sunglasses.csv

### Python Script

Here we perform the same _anyglasses_ prediction on `demo/1.jpg`. This cell simply shows how to use `process_file` with some _format_ examples. Note that it is also possible to specify `output_path` or to pass a list of paths as `input_path`. If you only need the predictions returned without showing or saving the results, the `predict` method can be used instead.

In [None]:
CLASSIFICATION_SINGLE = r"""
from glasses_detector import GlassesClassifier

# Load the model for anyglasses classification
cls_anyglasses = GlassesClassifier(kind='anyglasses')

def my_format(img, pred):
    # Define a custom format function (custom label and draw on image)
    label = 'Wears glasses' if (pred > 0).item() else 'Does not wear'
    return GlassesClassifier.draw_label(img, label)

# Process the image with various formats (disable the return values with _)
_ = cls_anyglasses.process_file('data/demo/1.jpg', show=True, format='logit')
_ = cls_anyglasses.process_file('data/demo/1.jpg', show=True, format={True: 'yes', False: -1})
_ = cls_anyglasses.process_file('data/demo/1.jpg', show=True, format=my_format)
"""

!python -c "$CLASSIFICATION_SINGLE"
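
The cell above passes `format` as a string, as a dictionary and as a callable. How such a specification could be resolved for a single prediction is sketched below. This is not the library's implementation: `apply_format` is a hypothetical helper, the `proba` value is assumed to be a sigmoid of the logit, `int` is assumed to mean `1`/`0`, and the callable receives only the logit (whereas `my_format` above receives the image as well).

```python
import math


def apply_format(logit, fmt):
    """Hypothetical helper: format one raw classification logit."""
    is_positive = logit > 0  # same threshold as `pred > 0` in `my_format`

    if callable(fmt):
        # Simplification: the callable gets only the logit, not the image
        return fmt(logit)
    if isinstance(fmt, dict):
        # A dict maps the boolean decision to arbitrary user-chosen values
        return fmt[is_positive]
    if fmt == "logit":
        return logit
    if fmt == "proba":
        # Assumed to be a sigmoid; the tanh form avoids overflow for large |logit|
        return 0.5 * (1 + math.tanh(logit / 2))
    if fmt == "int":
        return int(is_positive)
    if fmt == "str":
        return "present" if is_positive else "absent"
    raise ValueError(f"Unsupported format: {fmt!r}")


print(apply_format(2.3, "str"))
print(apply_format(-0.7, {True: "yes", False: -1}))
print(apply_format(0.0, "proba"))
print(apply_format(2.3, lambda x: "Wears glasses" if x > 0 else "Does not wear"))
```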

This cell shows how to use `process_dir` to save the predictions for each image into a single file and into a directory of multiple files (one for each prediction). Note how `cls_sunglasses` can also be called directly to confirm the presence of _sunglasses_ in `[0.jpg, 3.jpg]`.

In [None]:
CLASSIFICATION_MULTI = r"""
import json
from glasses_detector import GlassesClassifier

# Load the model for sunglasses classification
cls_sunglasses = GlassesClassifier(kind='sunglasses')

# Process a directory of images and save the results to a json file as well as to a directory as images
cls_sunglasses.process_dir('data/demo', 'is_sunglasses.json', format='proba', batch_size=3, pbar=False)
cls_sunglasses.process_dir('data/demo', 'is_sunglasses', format='img', batch_size=6, pbar=False)

# Confirm '0.jpg' and '3.jpg' are indeed sunglasses, print predictions on the whole dir
print('[0.jpg, 3.jpg]:', cls_sunglasses(image=['data/demo/0.jpg', 'data/demo/3.jpg']))
print(json.load(open('is_sunglasses.json')))
"""

!python -c "$CLASSIFICATION_MULTI"
display_images_in_row("is_sunglasses")
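
The `batch_size` argument above controls how many images are processed at once. How the library groups the files internally is not shown in this notebook, so the following is only a sketch of the usual chunking idea, with a hypothetical helper named `make_batches`.

```python
def make_batches(paths, batch_size):
    """Split paths into consecutive chunks of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    return [paths[i:i + batch_size] for i in range(0, len(paths), batch_size)]


# The six demo images downloaded earlier in this notebook
demo_paths = [f"data/demo/{i}.jpg" for i in range(6)]

# batch_size=3 yields two full batches; batch_size=4 leaves a smaller last one
batches = make_batches(demo_paths, batch_size=3)
uneven = make_batches(demo_paths, batch_size=4)
print([len(b) for b in batches], [len(b) for b in uneven])
```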

## Detection

### Command Line

This is just an example of a very simple eye area detection on an image with no worn glasses (`demo/2.jpg`) - the bounding box is printed to the terminal.

In [None]:
!glasses-detector --input data/demo/2.jpg --format str --task detection:eyes

Here the whole `demo/` directory is processed and the glasses bounding box predictions are saved as separate `.txt` files (one for each image).

In [None]:
!glasses-detector --input data/demo --output glasses_bboxes -f int -e .txt --task detection:worn
!for file in glasses_bboxes/*.txt; do head -n 1 "$file"; done
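
This notebook uses several bounding box formats (`int`, `float`, `bool`) without defining them. The sketch below shows one plausible reading and is an assumption rather than the library's documented behavior: `int` is taken to mean pixel coordinates `(x_min, y_min, x_max, y_max)`, `float` the same coordinates normalized by the image size, and `bool` a mask that is `True` inside the box. Both helpers and the example box are hypothetical.

```python
def bbox_to_float(bbox, width, height):
    """Normalize (x_min, y_min, x_max, y_max) pixel coordinates to [0, 1]."""
    x_min, y_min, x_max, y_max = bbox
    return (x_min / width, y_min / height, x_max / width, y_max / height)


def bbox_to_mask(bbox, width, height):
    """Build a height x width boolean grid that is True inside the box.

    The box is treated as half-open (max edges excluded), an assumption.
    """
    x_min, y_min, x_max, y_max = bbox
    return [
        [x_min <= x < x_max and y_min <= y < y_max for x in range(width)]
        for y in range(height)
    ]


# Made-up box on a made-up 8x4 image, purely for illustration
bbox = (2, 1, 6, 3)
normalized = bbox_to_float(bbox, width=8, height=4)
mask = bbox_to_mask(bbox, width=8, height=4)

print(normalized)
for row in mask:
    print("".join("#" if cell else "." for cell in row))
```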

### Python Script

This cell simply repeats the same eye-area prediction as before, except that it shows the actual image with the drawn bbox (because an _IPython_ environment is now available instead of a pure terminal). Notice how the `GlassesDetector` instance can simply be called directly to perform the prediction.

In [None]:
DETECTION_SINGLE = r"""
from glasses_detector import GlassesDetector

# Load the model for eyes detection
det_eyes = GlassesDetector(kind='eyes')
# det_eyes('data/demo/2.jpg') # default format is 'img'
det_eyes('data/demo/2.jpg').save('eyes_bbox.jpg')
"""

!python -c "$DETECTION_SINGLE"
display(Image.open("eyes_bbox.jpg"))

Let's experiment more with the different ways multiple files can be processed, different output file formats and different prediction type formats. For bounding boxes, it is generally preferable to save the results to a single file, but saving each prediction to a separate file is also possible, as shown before.

In [None]:
DETECTION_MULTI = r"""
import yaml
import pickle
import numpy as np
from glasses_detector import GlassesDetector

# Initialize glasses detection model
det_worn = GlassesDetector(kind='worn')

# Process the images in various ways with various outputs and formats
det_worn.process_file('data/demo/2.jpg', 'eyes_det.txt', show=False, format='str')
det_worn.process_file(['data/demo/0.jpg', 'data/demo/3.jpg'], 'sung_det.yaml', format='float')
det_worn.process_dir('data/demo', 'demo_det.pkl', format='bool', batch_size=2, pbar='Predicting bboxes')

# Show contents of the saved files
print(open('eyes_det.txt').read())
print(yaml.safe_load(open('sung_det.yaml')))
print(np.stack(list(pickle.load(open('demo_det.pkl', 'rb')).values())).shape)
"""

!python -c "$DETECTION_MULTI"

## Segmentation

### Command Line

Mask prediction provides information about every pixel, thus it is best to save the mask as a grayscale image or a compressed object, such as `.pkl`, `.dat` or `.npz` as shown here:

In [None]:
!glasses-detector -i data/demo/0.jpg -o frames_mask.npz -t segmentation:frames -s small
!python -c "import numpy as np; print(np.load('frames_mask.npz')['arr_0'].shape)"

Similar to previous examples, this one just shows how to process a directory of images: here, their _full_ glasses masks are saved as `.png` images.

In [None]:
!glasses-detector -i data/demo -o full_masks_pure -f mask -e .png -t segmentation:full -s medium -b 6 # TODO: change medium to large
!ls full_masks_pure

### Python Script

Let's perform a simple frames segmentation again and show both the regular mask and the inverted one. Notice again how a single image prediction can be performed by simply calling the segmenter instance (or its `predict` method), since there is no need to save any output. Of course, it is also possible to call `process_file` without specifying `output_path` and still get the result.

In [None]:
SEGMENTATION_SINGLE = r"""
from PIL import Image
import numpy as np
from glasses_detector import GlassesSegmenter

# Initialize frames segmentation model of size small
seg_frames = GlassesSegmenter(kind='frames', size='small')

# Process '1.jpg' to generate glasses frames mask (inverted and original)
inverted_mask = seg_frames('data/demo/1.jpg', format={True: 0, False: 255})
inverted_mask = Image.fromarray(inverted_mask.numpy(force=True).astype(np.uint8))
original_mask = seg_frames.process_file('data/demo/1.jpg', format='mask', show=False)

# Show both masks (TODO: change back to .show())
# inverted_mask.show()
# original_mask.show()
inverted_mask.save('inverted_mask.png')
original_mask.save('original_mask.png')
"""

!python -c "$SEGMENTATION_SINGLE"
display(Image.open("inverted_mask.png"))
display(Image.open("original_mask.png"))
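
The inverted mask above is produced by the dictionary format `{True: 0, False: 255}`. The idea of mapping every boolean mask value to a pixel value can be sketched in plain Python. The helper `apply_mask_format` and the tiny mask are hypothetical, and the regular white-on-black encoding `{True: 255, False: 0}` is an assumption used here only for contrast.

```python
def apply_mask_format(mask, mapping):
    """Replace every boolean in a 2D mask with its mapped pixel value."""
    return [[mapping[bool(cell)] for cell in row] for row in mask]


# Made-up 2x3 mask: True marks pixels that belong to the frames
mask = [[True, False, False], [False, True, True]]

regular = apply_mask_format(mask, {True: 255, False: 0})
inverted = apply_mask_format(mask, {True: 0, False: 255})
print(regular)
print(inverted)
```

Every pixel of the inverted mask is the complement of the corresponding regular pixel, so the two values always add up to `255`.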

Here is a further example of the different output possibilities when processing a directory of image files. In this case, if we specify that all the predictions should be saved to a single file, some file formats, such as `.csv` or `.txt`, will automatically flatten the `2D` mask to fit into a single row.

In [None]:
SEGMENTATION_MULTI = r"""
import numpy as np
from glasses_detector import GlassesSegmenter

# Initialize full segmentation model (currently of size medium, see TODO)
seg_full = GlassesSegmenter(kind='full', size='medium') # TODO: change medium to large

# Process the directory of images and save the results in various ways
seg_full.process_dir('data/demo', 'full_masks_over', format='img', batch_size=6, pbar=False)
seg_full.process_dir('data/demo', 'full_masks.npy', format='mask', batch_size=6, pbar=False)
seg_full.process_dir('data/demo', 'full_masks.csv', format='proba', batch_size=6, pbar=False)

# Display the generated masks (and the one from CLI)
# display_images_in_row('full_masks_pure', sort=True)
# display_images_in_row('full_masks_over', sort=True)
print(np.load('full_masks.npy').shape)
print(*[','.join(row) + ',...' for row in np.loadtxt('full_masks.csv', delimiter=',', dtype=str)[:, :5]], sep='\n')
"""

!python -c "$SEGMENTATION_MULTI"
display_images_in_row("full_masks_pure", sort=True)
display_images_in_row("full_masks_over", sort=True)
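
The flattening of a `2D` mask into a single `.csv` row, as mentioned above, can be sketched with the standard library. Row-major order is an assumption here, the helpers `mask_to_row` and `row_to_mask` are hypothetical, and the probability values are made up.

```python
import csv
import io


def mask_to_row(mask):
    """Flatten a 2D mask row by row (row-major order is an assumption)."""
    return [value for row in mask for value in row]


def row_to_mask(row, width):
    """Rebuild the 2D mask from a flat row, given the image width."""
    return [row[i:i + width] for i in range(0, len(row), width)]


# Two made-up 2x3 probability masks, one per image
masks = [
    [[0.1, 0.9, 0.8], [0.0, 0.2, 0.7]],
    [[0.5, 0.4, 0.3], [0.6, 0.1, 0.0]],
]

# Write one CSV line per image
buffer = io.StringIO()
csv.writer(buffer).writerows(mask_to_row(m) for m in masks)

# Read the lines back and restore the 2D layout
buffer.seek(0)
restored = [
    row_to_mask([float(v) for v in row], width=3)
    for row in csv.reader(buffer)
]
print(restored == masks)
```

A flattened row does not store the mask shape, so the image width has to be known from elsewhere to restore the `2D` layout.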