Rough outline on how to proceed with the poster, mostly my thinking out loud and collating relevant info on each step

# Importing the data 

Link to the Pascal challenge: \
http://host.robots.ox.ac.uk/pascal/VOC/voc2007/index.html


As per Sayantan's pdf, we are only going to be doing this part of the challenge:\
Classification: For each of the twenty classes, predicting presence/absence of an example of that class in the test image.

In other words, no bounding boxes for individual objects. We could optionally do that later, but for now we can dispense with it: our task is to identify object classes and classify each image as containing (or not containing) each of the 20 classes.
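
Since an image can contain several classes at once, one way to phrase the target is a 20-element presence/absence vector per image. A minimal sketch, assuming the class names are spelled the way I believe the VOC annotation files spell them (`diningtable`, `pottedplant`, `tvmonitor`); `to_presence_vector` is a hypothetical helper name, not something from the task document:

```python
# Assumed spelling of the 20 class names as they appear in the annotation XML.
VOC_CLASSES = [
    "aeroplane", "bicycle", "bird", "boat", "bottle",
    "bus", "car", "cat", "chair", "cow",
    "diningtable", "dog", "horse", "motorbike", "person",
    "pottedplant", "sheep", "sofa", "train", "tvmonitor",
]

def to_presence_vector(object_labels, classes=VOC_CLASSES):
    """Turn a list of object labels for one image into a 0/1 vector, one entry per class."""
    present = set(object_labels)  # duplicates do not matter for presence/absence
    return [1 if c in present else 0 for c in classes]

# An image with two chairs and a person has exactly two classes present.
example = to_presence_vector(["chair", "chair", "person"])
print(example)
```

Each of the 20 positions can then be treated as its own yes/no prediction.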

# Vectorising the data

Reference features provided in the task document are as follows:

1. MPEG-7 Color Layout Descriptor \
 https://en.wikipedia.org/wiki/Color_layout_descriptor
2. Visual Bag-of-Words \
https://en.wikipedia.org/wiki/Bag-of-words_model_in_computer_vision
3. Speeded up robust features (SURF) \
 https://en.wikipedia.org/wiki/Speeded_up_robust_features

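Again, we could experiment with different ones. For CV, we have to make sure that any feature extraction step that learns from the data (e.g. the visual vocabulary for bag-of-words) is fitted on the training portion only, rather than on the whole dataset; as Sayantan said, anything else would be cheating.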
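
To make the "no cheating" point concrete, here is a rough pure-Python sketch of a bag-of-words vocabulary that is learned from training descriptors only and then reused, unchanged, for a held-out image. The tiny 2-D "descriptors" are made up for illustration, the function names are hypothetical, and the k-means here is deliberately naive (first-k initialisation, fixed number of iterations); in practice we would use real local descriptors and a library clusterer.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centres):
    """Index of the centre closest to `point`."""
    return min(range(len(centres)), key=lambda j: sq_dist(point, centres[j]))

def build_vocabulary(descriptors, k, iterations=10):
    """Naive k-means over local descriptors; the k centres act as the visual words."""
    centres = [list(d) for d in descriptors[:k]]  # naive init: first k descriptors
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for d in descriptors:
            groups[nearest(d, centres)].append(d)
        for j, members in enumerate(groups):
            if members:  # keep the old centre if a cluster ends up empty
                centres[j] = [sum(col) / len(members) for col in zip(*members)]
    return centres

def bow_histogram(descriptors, vocabulary):
    """Normalised count of how often each visual word is the nearest one."""
    hist = [0] * len(vocabulary)
    for d in descriptors:
        hist[nearest(d, vocabulary)] += 1
    total = sum(hist) or 1
    return [h / total for h in hist]

# Made-up descriptors: two training images and one held-out image.
train_images = [[(0.0, 0.0), (0.1, 0.0)], [(1.0, 1.0), (0.9, 1.0)]]
held_out_image = [(0.0, 0.1), (1.0, 0.9), (1.0, 1.1)]

# The vocabulary only ever sees descriptors from the training images.
train_descriptors = [d for image in train_images for d in image]
vocabulary = build_vocabulary(train_descriptors, k=2)

# The held-out image is encoded with that fixed vocabulary.
held_out_features = bow_histogram(held_out_image, vocabulary)
print(held_out_features)
```

Inside cross-validation the same pattern applies per fold: rebuild the vocabulary from that fold's training images and encode the validation images with it.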


# Feature selection

This is super important, one of the tasks being: "Extract meaningful hand-crafted features \[...] using appropriate libraries or implement them from scratch. Reference to some hand-crafted features are provided. Identify and extract at least one extra feature of your choice other than the three features mentioned in the task. Why did you select this these feature(s) out of other options?"

In [1]:
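# You can install OpenCV locally. Note: as far as I know, SURF lives in the contrib
# modules (cv2.xfeatures2d) and is flagged as non-free, so plain opencv-python may not be enough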
# !pip3 install opencv-python 

In [1]:
import cv2 as cv
img = cv.imread("./Data_train/JPEGImages/000005.jpg")

# cv.imshow("new window", mat=img)
# k = cv.waitKey(0) # Wait for a keystroke in the window

# Algorithm selection

I'm guessing we could get away with just picking one. I suggest constrained clustering, COP-k-Means. We could even experiment with running two separate algorithms and comparing them, e.g. go with PCK-Means and compare its performance to COP-k-Means; they're quite similar.

Here is the link to the lecture slides about these algorithms.\
https://elearning.ovgu.de/pluginfile.php/937020/mod_resource/content/2/english_ATiML05_ConstrainedClustering.pdf

Constrained clustering incorporates domain knowledge into the clustering process. In our case, domain knowledge would be super wide, essentially which features should be present for which object class. Domain knowledge that we can introduce includes:
1. Number of clusters (it's 20, Pascal tells us) 
2. Minimum cluster variance (not sure)
3. Min / Max cluster size
4. Must-link constraints
5. Cannot-link constraints

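COP-k-Means is a modified k-means clustering algorithm that allows for constraints. It leaves the initialisation of the cluster centres alone and modifies the assignment step instead: each point goes to the nearest cluster that does not violate a constraint. A cluster is disallowed if a must-link partner of the point has already been assigned to a different cluster, or if a cannot-link partner has already been assigned to this one. If no cluster is allowed for some point, the algorithm fails.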
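
The constraint check described above can be sketched as follows. This is only my reading of a single assignment pass, not the paper's reference implementation: the centre-update step and the outer loop are left out, no transitive closure of the constraints is computed, and all function names are hypothetical. Constraints are given as pairs of point indices.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def violates_constraints(i, cluster, assignment, must_link, cannot_link):
    """True if putting point i into `cluster` clashes with an already-assigned point."""
    for a, b in must_link:
        if i in (a, b):
            other = b if i == a else a
            # Must-link partner already sits in a different cluster.
            if other in assignment and assignment[other] != cluster:
                return True
    for a, b in cannot_link:
        if i in (a, b):
            other = b if i == a else a
            # Cannot-link partner already sits in this cluster.
            if assignment.get(other) == cluster:
                return True
    return False

def constrained_assign(points, centres, must_link, cannot_link):
    """One assignment pass: nearest cluster that violates no constraint, else fail."""
    assignment = {}
    for i, p in enumerate(points):
        order = sorted(range(len(centres)), key=lambda j: sq_dist(p, centres[j]))
        for j in order:
            if not violates_constraints(i, j, assignment, must_link, cannot_link):
                assignment[i] = j
                break
        else:
            raise ValueError(f"no valid cluster for point {i}")
    return assignment

# Toy data: points 0 and 1 are close together, point 2 is far away.
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0)]
centres = [(0.0, 0.0), (1.0, 1.0)]

# A cannot-link between 0 and 1 pushes point 1 out of its nearest cluster.
with_cannot_link = constrained_assign(points, centres, must_link=[], cannot_link=[(0, 1)])
print(with_cannot_link)
```

A full run would alternate this pass with the usual centre update until the assignments stop changing.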

The paper that describes the algorithm is here:
https://web.cse.msu.edu/~cse802/notes/ConstrainedKmeans.pdf \
It's also where the slides got their low quality screenshot of algorithm steps from

# Model selection

Cross-validation (CV) and the like; we've got to plot the error as a function of lambda / M for Sayantan for sure.
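
A rough sketch of how the numbers for such a plot could be collected: k-fold cross-validation over a list of hyperparameter values, returning the mean validation error for each. Everything here is a placeholder. The "model" is a toy shrunk mean that only stands in for whatever we end up using, and the helper names are hypothetical. Plotting the resulting dictionary is left out.

```python
from statistics import mean

def k_fold_indices(n, k):
    """Yield (train_idx, val_idx) for k contiguous folds over range(n)."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, val
        start += size

def cv_error_curve(X, y, param_values, fit, error, k=5):
    """Return {param: mean validation error over the k folds}."""
    curve = {}
    for p in param_values:
        fold_errors = []
        for train, val in k_fold_indices(len(X), k):
            model = fit([X[i] for i in train], [y[i] for i in train], p)
            fold_errors.append(error(model, [X[i] for i in val], [y[i] for i in val]))
        curve[p] = mean(fold_errors)
    return curve

def fit_shrunk_mean(X, y, lam):
    # Toy stand-in model: the mean of y, shrunk towards 0 as lam grows.
    return sum(y) / (len(y) + lam)

def squared_error(model, X, y):
    return mean([(target - model) ** 2 for target in y])

X = list(range(10))
y = [1.0] * 10
curve = cv_error_curve(X, y, [0.0, 1.0, 10.0], fit_shrunk_mean, squared_error, k=5)
for lam, err in curve.items():
    print(f"lambda={lam}: mean validation error={err:.4f}")
```

Swapping in the real `fit` and `error` functions would give the curve to plot against lambda (or M).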

# Generalisation error / testing

50 % of the Pascal dataset is reserved for testing.

# Load and divide the data into train and test set

In [9]:

from bs4 import BeautifulSoup
import numpy as np
from skimage import data
from skimage import io
import os
import sys
from sklearn.model_selection import train_test_split

"""Classes: person, bird, cat, cow, dog, horse, sheep, aeroplane, bicycle, boat, bus, car, motorbike, train, bottle, chair, dining table, potted plant, sofa, tv/monitor"""

jpegPath = './Data_train/JPEGImages'
annotationPath = "./Data_train/Annotations"
images = []
labels = []
# Filled in by getTestTrainingData()
x_train = x_test = y_train = y_test = None

def getImageLabels(path: str) -> list[str]:
    """Read one annotation XML file, store its object labels and return them."""
    with open(path, 'r') as f:
        bs_data = BeautifulSoup(f.read(), "xml")

    # Each <object> element is one annotated object; its first <name> is the class label
    tempLabels = [obj.find('name').get_text() for obj in bs_data.find_all('object')]
    labels.append(tempLabels)
    return tempLabels

def getImageData(path: str) -> None:
    """Load the image referenced by the <filename> element of one annotation XML file."""
    with open(path, 'r') as f:
        bs_data = BeautifulSoup(f.read(), "xml")

    # <filename> holds the name of the JPEG that belongs to this annotation
    fileName = bs_data.find('filename').get_text()
    imagePath = os.path.join(jpegPath, fileName)
    images.append(io.imread(imagePath)) # Load image from specified path (ndarray of colour values)

def getTestTrainingData():
    global x_train, x_test, y_train, y_test
    # Sorted so the images are loaded in the same order on every run
    for annoXML in sorted(os.listdir(annotationPath)):
        path = os.path.join(annotationPath, annoXML)
        getImageLabels(path)
        getImageData(path)

    # Fixed seed so the 50/50 split is reproducible
    x_train, x_test, y_train, y_test = train_test_split(images, labels, test_size=0.5, random_state=0)

getTestTrainingData()
print(f"Train test split succeeded\nTrain size: {len(x_train)}\nTest Size: {len(x_test)}")

Train test split succeeded
Train size: 2505
Test Size: 2506
