# Zero-Shot Image Matching: Playing Memory

Remember the classic memory game? The one where you flip over cards to find matching pairs? Well, it's back with a twist! This time, we’re letting AI do the hard work for us.

My daughter and I used to play this game together when she was younger. This summer, we decided to bring back the fun and let AI join in on the action. So, we rolled up our sleeves and built an artificial vision algorithm to play the game for us. Spoiler alert: it was awesome!

Now, let me walk you through how we programmed a computer to outmatch us at finding Frozen cards—faster than you can say, "Let it go!"

## Loading the Test Data
Alright, let’s get down to business! To put our AI to the test, we need a set of images featuring our beloved Frozen characters. We’ve prepared a small collection of these images specifically for this experiment.

Here’s the plan: we’ll load these images into our notebook and get them ready for our AI to work its magic. Get ready, Elsa and Anna are about to meet their match!
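
One caveat before loading: `rglob('*')` returns every file under the input directory, and `cv2.imread` returns `None` for anything it cannot decode. Here is a minimal sketch of a filter, assuming the photos use common image extensions. The `IMAGE_SUFFIXES` set and the `list_image_files` helper are illustrative names, not part of the original notebook.

```python
from pathlib import Path

# Assumption: the card photos use one of these common extensions.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}

def list_image_files(root):
    """Return the image files under `root`, sorted, matched by extension (case-insensitive)."""
    root = Path(root)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
```

With the notebook's variables, `files = list_image_files(input_dir)` could stand in for the plain `rglob` listing if the dataset ever contains non-image files.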

In [None]:
import numpy as np
import pandas as pd
from pathlib import Path
import cv2
from matplotlib import pyplot as plt
from fastprogress.fastprogress import master_bar, progress_bar
import math
import heapq

# Define the input directory
input_dir = Path('/kaggle/input')

# List of all files in the directory
files = sorted([file_path for file_path in input_dir.rglob('*') if file_path.is_file()])

for i, fn in enumerate(files):
    print(f"{i:>4}: {fn}")



In [None]:
# Read all the images and store them in the images list
# (str() keeps cv2.imread working on OpenCV builds that do not accept Path objects)
images = [cv2.imread(str(image_path)) for image_path in progress_bar(files)]
print(f"{len(images)} images loaded.")

In [None]:
# Utility functions to display images

def show_image(image, ax=None, title=None):
    """
    Display a single image using matplotlib.

    Parameters:
    - image: The image to display, expected in BGR format.
    - ax: Optional; matplotlib axes object to use for plotting.
    - title: Optional; title for the image plot.
    """
    # Convert BGR image to RGB for correct color display
    image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    if ax is None:
        # Create a new plot if no axes object is provided
        fig, ax = plt.subplots(1, 1, figsize=(16, 16))  # Size of the plot window
    ax.imshow(image_rgb)
    if title:
        ax.set_title(title)  # Set the title if provided
    ax.axis('off')  # Hide axes for a cleaner look

def show_images(images, columns=4, titles=None, size_cols=15, size_per_row=3):
    """
    Display multiple images in a grid using matplotlib.

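    Parameters:
    - images: List of images to display, each expected in BGR format.
    - columns: Number of columns in the grid.
    - titles: Optional; list of titles for each image.
    - size_cols: Optional; total figure width in inches (default is 15).
    - size_per_row: Optional; figure height per row in inches (default is 3).
    """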
    if titles is None:
        titles = [None] * len(images)  # Default titles to None if not provided
    rows = (len(images) + columns - 1) // columns  # Calculate the number of rows needed
    # squeeze=False always returns a 2-D array of axes, even for a 1x1 grid
    fig, axes = plt.subplots(rows, columns, figsize=(size_cols, rows * size_per_row), squeeze=False)

    # Flatten the axes array for easy iteration
    axes = axes.flatten()

    # Display each image
    for ax, image, title in zip(axes, images, titles):
        show_image(image, ax=ax, title=title)

    # Hide any remaining empty subplots
    for ax in axes[len(images):]:
        ax.axis('off')

    # Adjust layout and display the plot
    plt.tight_layout()
    plt.show()

# Example usage:
# image_path = files[3]
# image = cv2.imread(str(image_path))
# show_image(image)

# Display a grid of images with titles
show_images(images, titles=[fn.name for fn in files])


## Selected Image
Here’s the image we selected for our experiment. It’s got all the pieces neatly arranged, no pesky lighting issues, and no overlapping cards. A perfect zenithal (top-down) view to make our AI's job a breeze!

Let’s take a look at our beautifully organized Frozen card collection:

In [None]:
fn_image_to_test = files[1]  # TODO: review this index
image_to_test = cv2.imread(str(fn_image_to_test))
show_image(image_to_test)

## Preprocessing
First things first, let’s get our image ready for some serious AI analysis. We’ll start by performing some preprocessing steps. The goal here is to make it easier for our algorithm to identify and segment the different zones of the image that contain the card pieces.

Here’s the plan:

* Convert to Grayscale: Simplify the image by removing color information.
* Apply Blur: Smooth out the image to reduce noise and irrelevant details.
* Binary Thresholding: Create a binary image to clearly distinguish between the card pieces and the background.


With these steps, our image will be prepped and primed for segmentation. Let’s dive in!
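
To see why no hand-picked cutoff is needed, here is a small NumPy sketch of the idea behind Otsu's method: try every threshold and keep the one that best separates dark pixels from bright ones (maximum between-class variance). It is a teaching sketch on a synthetic scene, not OpenCV's actual implementation, and `otsu_threshold` is a name made up for this example.

```python
import numpy as np

def otsu_threshold(gray):
    """Pick the threshold that maximizes the between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)
    weight_bg = np.cumsum(prob)            # fraction of pixels with value <= t
    mean_cum = np.cumsum(prob * levels)    # cumulative intensity mean up to t
    mean_total = mean_cum[-1]
    denom = weight_bg * (1.0 - weight_bg)
    between = np.zeros(256)
    valid = denom > 0                      # skip thresholds that leave one class empty
    between[valid] = (mean_total * weight_bg[valid] - mean_cum[valid]) ** 2 / denom[valid]
    return int(np.argmax(between))

# Synthetic scene: dark table (values 30-50) with one bright card (values 190-210).
rng = np.random.default_rng(0)
scene = rng.integers(30, 51, size=(100, 100)).astype(np.uint8)
scene[30:70, 30:70] = rng.integers(190, 211, size=(40, 40)).astype(np.uint8)

t = otsu_threshold(scene)
binary = np.where(scene > t, 255, 0).astype(np.uint8)  # card pixels become white
```

On a real photo the two groups overlap more, which is one reason to blur before thresholding.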

In [None]:
def preprocess_image(image):
    """
    Preprocess the image by converting it to grayscale, applying median blur, and thresholding.

    Parameters:
    - image: The input image in BGR format.

    Returns:
    - gray: Grayscale version of the input image.
    - blur: Blurred version of the grayscale image.
    - thresh: Binary thresholded version of the blurred image.
    """
    # Convert the image to grayscale
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    
    # Apply median blur to the grayscale image to reduce noise
    blur = cv2.medianBlur(gray, 25)
    
    # Apply Otsu's thresholding to the blurred image to get a binary image
    flag, thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    
    return gray, blur, thresh

gray, blur, thresh = preprocess_image(image_to_test)

show_images((image_to_test, gray, blur, thresh), titles=["Original", "Grayscale", "Blurred", "Thresholded"])


## Finding Contours
Now that we have our segmented image, it’s time to find the contours. This is where the magic of old-school computer vision algorithms comes into play.

We’ll use ```cv2.findContours``` to follow the borders of the objects detected in our binary image. In this simplified black-and-white world, the border is just where black turns to white. By connecting neighboring pixels, we can construct polygons around our card pieces.

Here’s how we refine our approach:

* Filter by Area: Discard polygons that are too small, assuming all card pieces should be of similar size.
* Simplify Polygons: Use cv2.approxPolyDP to approximate each contour with a simpler polygon, ideally with four sides (rectangles or trapezoids).

Cross your fingers and hope for good lighting and minimal overlaps! If all goes well, we’ll have a nice set of polygons representing our card pieces, ready for further analysis.



In [None]:
def apply_countours(image, contours, color=(0, 255, 0), size=10):
    """
    Draw contours on the image.

    Parameters:
    - image: The input image on which contours will be drawn.
    - contours: List of contours to be drawn.
    - color: Optional; color of the contours in BGR order (default is green).
    - size: Optional; thickness of the contour lines (default is 10).

    Returns:
    - image: Image with contours drawn on it.
    """
    # Make a copy of the image to avoid altering the original
    image = image.copy()
    
    # Draw contours on the image
    cv2.drawContours(image, contours, -1, color, size)
    
    return image


def find_cards(thresh):
    """
    Find and filter contours in the thresholded image to identify potential cards.

    Parameters:
    - thresh: Binary thresholded image.

    Returns:
    - cards: List of approximated polygons representing the detected cards.
    """
    # Find contours in the thresholded image
    contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    
    # Calculate the area of each contour
    areas = [cv2.contourArea(contour) for contour in contours]
    
    # Define the area bounds from the median area of significant contours (ignoring tiny polygons)
    median_area = np.median([a for a in areas if a > 100])
    lower_bound = median_area * 0.50  # Minimal area filter
    upper_bound = median_area * 1.50  # Upper bound filter

    # Keep only the contours whose area falls within the bounds
    interesting_contours = [contour for contour, area in zip(contours, areas) if lower_bound <= area <= upper_bound]
    
    # Approximate polygons for the filtered contours
    cards = [cv2.approxPolyDP(contour, 0.02 * cv2.arcLength(contour, True), True) for contour in interesting_contours]
    
    return cards

# Detect the cards and draw their outlines on the test image
cards = find_cards(thresh)
show_image(apply_countours(image_to_test, cards))

## Labeling Cards
Now that we’ve detected all the cards, it’s time to label them. We’ll draw a label on each card to keep track of our findings. This step is crucial for verifying our AI’s accuracy and ensuring everything is correctly identified.

Here’s what we’ll do:

* Draw Contours: Highlight each detected card by drawing its contour.
* Add Labels: Assign a unique label to each card, making it easy to reference and verify.

Let’s visualize our labeled cards and see how our AI did in identifying the pieces!
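
One detail worth spelling out when drawing: matplotlib colormaps return colors as RGB floats between 0 and 1, while OpenCV drawing functions expect 0-255 integers in BGR order. A minimal conversion sketch follows; `rgb_float_to_bgr` is a helper name invented for this example.

```python
def rgb_float_to_bgr(color):
    """Convert an (r, g, b) tuple of floats in [0, 1] to a (b, g, r) tuple of ints in [0, 255]."""
    r, g, b = color[:3]
    return tuple(int(round(c * 255)) for c in (b, g, r))
```

For example, an orange-red such as `(1.0, 0.2, 0.0)` becomes a tuple with the red value in the last position, which is what `cv2.drawContours` and `cv2.putText` expect.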

In [None]:
def image_labeled_cards(image, cards, labeler=None, fontScale=5, thickness=30, contour_thickness=40):
    """
    Draw contours and labels on detected card regions in an image.

    Parameters:
    - image: The input image on which labels and contours will be drawn.
    - cards: List of card contours to be labeled.
    - labeler: Optional; function to generate labels for each card (default is None).
    - fontScale: Optional; scale of the font used for labels (default is 5).
    - thickness: Optional; thickness of the text (default is 30).
    - contour_thickness: Optional; thickness of the contour lines (default is 40).

    Returns:
    - image_labeled: Image with labeled contours.
    """
    # Make a copy of the image to avoid altering the original
    image_labeled = image.copy()
    
    # Use a colormap for contour colors
    colormap = plt.get_cmap("tab20")

    for i, card in enumerate(cards):
        # Calculate the bounding rectangle of the contour
        x, y, w, h = cv2.boundingRect(card)
        cX = x + w // 2
        cY = y + h // 2

        # Determine the color and label for the current card
        if labeler:
            color_index, label = labeler(i)
        else:
            color_index, label = i, None
        
        # Convert the colormap color (RGB floats in [0, 1]) to a BGR tuple of 0-255 integers
        rgb = colormap.colors[color_index % len(colormap.colors)]
        color = tuple(int(c * 255) for c in reversed(rgb[:3]))
        
        # Draw the contour on the image
        cv2.drawContours(image_labeled, [card], 0, color, contour_thickness)

        if label is not None:
            # Get the size of the text to be drawn
            text_size = cv2.getTextSize(label, cv2.FONT_HERSHEY_SIMPLEX, fontScale, thickness)[0]

            # Calculate the text position to be centered
            textX = cX - text_size[0] // 2
            textY = cY + text_size[1] // 2

            # Put the text in the center of the bounding rectangle
            cv2.putText(image_labeled, label, (textX, textY), cv2.FONT_HERSHEY_SIMPLEX, fontScale, color, thickness, cv2.LINE_AA)

    return image_labeled

# Apply the labeled contours to the test image and display the result
show_image(image_labeled_cards(image_to_test, cards, labeler=lambda x: (x, f"{x}")))

## Warping
Now for the fun part—warping! We need to zoom in on each card piece and convert it to a uniform size. This helps standardize the pieces for further analysis. To achieve this, we use ```cv2.getPerspectiveTransform``` to correct any perspective distortions.

Here’s the game plan:

* Zoom In: Focus on each detected card piece.
* Standardize Size: Convert each piece to the same size for consistency.
* Correct Perspective: Use cv2.getPerspectiveTransform to adjust for any perspective effects, ensuring each card is viewed head-on.

This process will help us create a clean, uniform dataset of card images, ready for AI to work its matching magic!

In [None]:
WARP_SIZE = 200

def obtain_card_warped(image, cards):
    """
    Warp the detected card regions in the image to a fixed size.

    Parameters:
    - image: The input image containing the detected card regions.
    - cards: List of contours representing the card regions.

    Returns:
    - warped_cards: List of images of the warped card regions.
    """
    numcards = len(cards)
    warped_cards = [None] * numcards

    for i in range(numcards):
        # Flatten the contour points to an (N, 2) float32 array.
        # Note: getPerspectiveTransform requires exactly four points.
        approx = cards[i].reshape(-1, 2).astype(np.float32)
        
        # Define the destination points for the perspective transform
        h = np.array([[0, 0], [0, WARP_SIZE], [WARP_SIZE, WARP_SIZE], [WARP_SIZE, 0]], np.float32)
        
        # Get the perspective transform matrix
        transform = cv2.getPerspectiveTransform(approx, h)
        
        # Apply the perspective transform to get the warped image
        warped_cards[i] = cv2.warpPerspective(image, transform, (WARP_SIZE, WARP_SIZE))

    return warped_cards

# Warp the card regions in the test image
warped_cards = obtain_card_warped(image_to_test, cards)

# Display the warped card images
show_images(warped_cards, columns=16, titles=[f"{i}" for i in range(len(warped_cards))], size_per_row=2)


## Calculating Similarity
With all our card pieces now standardized in size, it’s time to detect which images are similar to each other. We’ll use a simple yet effective approach to compare the images.

Here’s our plan:

* Image Subtraction: Subtract one image from another to highlight differences.
* Mean Squared Error: Calculate the mean squared error (MSE) of these differences across the entire image. Lower values mean more similar images, and 0 means identical.


We can’t assume the images are correctly **oriented**. But here’s the good news: each card piece has four sides, so we only need to check four possible orientations (rotations in 90-degree steps).

By following this method, we’ll determine which cards are pairs based on their similarity scores. Let’s dive into the calculations!
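
To make the rotation idea concrete, here is a tiny NumPy sketch on random stand-in "cards": a rotated copy of a card scores 0 once all four orientations are tried, while an unrelated card stays far away. The function names are chosen for this example only.

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two equally sized images."""
    # Cast to float first: subtracting uint8 arrays directly would wrap around.
    diff = a.astype(np.float32) - b.astype(np.float32)
    return float(np.mean(diff ** 2))

def min_mse_over_rotations(a, b):
    """Smallest MSE between `a` and the four 90-degree rotations of `b`."""
    return min(mse(a, np.rot90(b, k)) for k in range(4))

rng = np.random.default_rng(0)
card = rng.integers(0, 256, size=(8, 8, 3), dtype=np.uint8)        # stand-in for a warped card
same_card_rotated = np.rot90(card, 3)                               # same picture, turned
other_card = rng.integers(0, 256, size=(8, 8, 3), dtype=np.uint8)   # unrelated picture
```

Real photos of matching cards will not reach exactly 0 because of lighting and printing differences, so the comparison relies on matching pairs scoring clearly lower than non-matching ones.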

In [None]:
def image_distance(img1, img2):
    """
    Calculate the mean squared error (MSE) between two images.

    Parameters:
    - img1: First image in numpy array format.
    - img2: Second image in numpy array format.

    Returns:
    - mse: Mean squared error between the two images.
    """
    img1 = img1.astype(np.float32)
    img2 = img2.astype(np.float32)
    mse = np.mean((img1 - img2) ** 2)
    return mse

def best_image_distance(img1, img2):
    """
    Find the best rotation of the second image that minimizes the distance to the first image.

    Parameters:
    - img1: First image in numpy array format.
    - img2: Second image in numpy array format.

    Returns:
    - min_distance: Minimum MSE between the first image and the best rotation of the second image.
    - best_rotation: The rotation of the second image that gives the minimum MSE.
    """
    # Generate all 90-degree rotations of the second image
    rotations = [img2]
    for _ in range(3):
        rotations.append(np.rot90(rotations[-1]))

    # Calculate the MSE for each rotation
    distances = [image_distance(img1, rotation) for rotation in rotations]

    # Find the rotation with the minimum MSE
    min_distance = min(distances)
    best_rotation = rotations[distances.index(min_distance)]

    return min_distance, best_rotation


def test_pair(a,b):
    distance, best_rotation = best_image_distance(warped_cards[a], warped_cards[b])
    diff = np.abs(warped_cards[a].astype(np.float32)-best_rotation.astype(np.float32)).astype(np.uint8)

    show_images([warped_cards[a], warped_cards[b], best_rotation, diff],
               titles=[ "Target", "Candidate", f"Best rotation : {distance:.2f}", "Diff"] )

test_pair(0, 1)  # compares two images and shows the best rotation found

## Calculating the Similarity for All Cards
Now, it's time to put our similarity method to the test and compare all the cards against each other. By doing this, we can identify the best matches and locate the pairs.

Here’s how we’ll do it:

* Pairwise Comparison: Compare each card with every other card using our image subtraction and mean squared error (MSE) method.
* Find Best Matches: Determine the best match for each card based on the lowest distance score.
* Identify Pairs: Use these best matches to locate the pairs of cards.

By systematically comparing all the cards, we’ll be able to uncover the pairs efficiently. Let’s get matching!
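
The matching logic is easier to follow on a toy example. The sketch below uses a made-up distance matrix for four cards and keeps only mutual matches, where card i picks card j and card j picks card i back. The numbers are invented for illustration.

```python
import numpy as np

# Toy symmetric distance matrix for four cards: 0 pairs with 2, 1 pairs with 3.
distances = np.array([
    [0.0, 9.0, 1.0, 8.0],
    [9.0, 0.0, 7.0, 2.0],
    [1.0, 7.0, 0.0, 6.0],
    [8.0, 2.0, 6.0, 0.0],
])

masked = distances.copy()
np.fill_diagonal(masked, np.inf)       # a card must not match itself
best = masked.argmin(axis=1).tolist()  # best[i] is the closest other card to card i

# Keep a pair only when the choice is mutual: i picks j and j picks i.
mutual_pairs = sorted({(min(i, j), max(i, j)) for i, j in enumerate(best) if best[j] == i})
```

A best match that is not mutual is a useful warning sign: it usually means one of the two cards was warped badly or its true partner is missing from the photo.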

In [None]:
numcards = len(warped_cards)



# Define the number of elements to plot
ELEMENTS_TO_PLOT = 5
ELEMENTS_TO_PLOT = min(ELEMENTS_TO_PLOT, numcards)

# Define the number of best elements to study
BEST_ELEMENTS_TO_STUDY = 5
BEST_ELEMENTS_TO_STUDY = min(BEST_ELEMENTS_TO_STUDY, numcards - 1)
pairs = {}
# squeeze=False keeps axs two-dimensional even when only one row is plotted
fig, axs = plt.subplots(ELEMENTS_TO_PLOT, BEST_ELEMENTS_TO_STUDY + 1, figsize=(16, 3 * ELEMENTS_TO_PLOT), squeeze=False)
mb = master_bar(warped_cards)

for i, card_selected in enumerate(mb):
    best_options = []
    for j in progress_bar(range(numcards), total=numcards, parent=mb):
        if i != j:
            distance, best_rotation = best_image_distance(card_selected, warped_cards[j])
            best_options.append((distance, j, best_rotation))
            best_options.sort(key=lambda x: x[0])
            best_options = best_options[:BEST_ELEMENTS_TO_STUDY]

    best_pair = best_options[0][1]
    pairs[i] = best_pair
    if i < ELEMENTS_TO_PLOT:
        ax = axs[i]
        show_image(card_selected, ax=ax[0], title=f"#{i} => #{best_pair}")
        for k, (ratio, j, best_rotation) in enumerate(best_options):
            show_image(best_rotation, ax=ax[k + 1], title=f"#{j} {ratio:.2f}")

plt.show()

In [None]:
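# Assign a shared id to each pair of matched cards
pair_id = {}
next_id = 0  # avoids shadowing the built-in id()
for k, v in pairs.items():
    if k not in pair_id:
        pair_id[k] = next_id
        # This holds only when every match is mutual (k's best match is v and v's best match is k)
        assert v not in pair_id, f"{k, v} => {pair_id[v]}"
        pair_id[v] = next_id
        next_id += 1
print(pair_id)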

## Results
Drumroll, please! It’s time to reveal the results of our AI-powered memory game experiment. After processing and comparing all the cards, here’s what we found:



In [None]:
show_image(image_labeled_cards(image_to_test, cards, thickness=15, labeler=lambda x: (pair_id[x], f"{chr(pair_id[x]+ord('A'))}")))

## Conclusions

These results highlight the potential of using AI to tackle classic games like memory. Who knew that Elsa, Anna, and a bit of computer vision could make such a great team?

This project was a delightful summer endeavor shared between a father and daughter, combining play and learning. We had a blast experimenting with various tools and algorithms, discovering how they can solve puzzles and make games more fun. 

Our experiment not only demonstrated the power of AI in gaming but also created wonderful memories. Who knows what we’ll tackle next? The possibilities are endless when you mix family fun with a dash of AI magic!

Maybe not many people enjoy making the computer solve puzzles. But I'm lucky 🍀, I'm not alone.