# Processing Video Pipeline

Assume you have a working ML model that can process individual images and identify carrots. How would you adapt that model so that you could feed it live video inside a grocery store and have it create a record of any carrots it sees?

To adapt the existing ML model to live video, I would wrap it in a video processing pipeline built with Python. First, I would use OpenCV to capture frames from the camera stream in real time and preprocess them to match the input format the model expects. Each frame would then be passed to the model, which would identify any carrots present. Every detection would be written to a record along with metadata such as the timestamp and the carrot's location within the frame.
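
The logging step described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the `Detection` record, the `log_detections` function, the `min_confidence` threshold, and the `stub_detector` stand-in for the carrot model are all hypothetical names introduced here, and the model is assumed to return a list of `(bounding_box, confidence)` pairs per frame.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Iterable, List, Tuple

# Assumed box layout: (x, y, width, height) in pixels.
BoundingBox = Tuple[int, int, int, int]


@dataclass
class Detection:
    """One logged carrot sighting (hypothetical record layout)."""
    timestamp: str
    frame_index: int
    box: BoundingBox
    confidence: float


def log_detections(
    frames: Iterable[object],
    detect: Callable[[object], List[Tuple[BoundingBox, float]]],
    min_confidence: float = 0.5,
) -> List[Detection]:
    """Run `detect` on every frame and record detections above the threshold."""
    records: List[Detection] = []
    for frame_index, frame in enumerate(frames):
        for box, confidence in detect(frame):
            if confidence >= min_confidence:
                records.append(
                    Detection(
                        timestamp=datetime.now(timezone.utc).isoformat(),
                        frame_index=frame_index,
                        box=box,
                        confidence=confidence,
                    )
                )
    return records


def stub_detector(frame: object) -> List[Tuple[BoundingBox, float]]:
    """Stand-in for the real model: strings play the role of frames."""
    if frame == "carrot":
        return [((10, 20, 30, 40), 0.9)]
    if frame == "shadow":
        return [((0, 0, 5, 5), 0.2)]  # low confidence, filtered out
    return []


records = log_detections(["carrot", "shadow", "empty", "carrot"], stub_detector)
for record in records:
    print(record)
```

With a real camera, `frames` could be a generator that repeatedly calls `read()` on an OpenCV `cv2.VideoCapture` object and yields each frame until the stream ends; the logging logic would stay the same.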

To keep processing efficient, the pipeline would place a bounded buffer between frame capture and inference so that frames keep flowing without lag building up; if inference falls behind, older frames can be dropped rather than queued indefinitely. Optimizing the model for speed while preserving accuracy would also be crucial, potentially using GPU acceleration to handle the volume of data. This setup would enable continuous monitoring and recording of carrot detections within the grocery store.
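
One way to picture the buffer is a fixed-capacity queue that discards the oldest frame when a new one arrives and it is already full. The sketch below is an assumption about how such a buffer could look, not part of the original answer; `LatestFrameBuffer` and its `dropped` counter are hypothetical names, and integers stand in for frames.

```python
import threading
from collections import deque
from typing import Optional


class LatestFrameBuffer:
    """Bounded, thread-safe buffer that drops the oldest frame when full."""

    def __init__(self, capacity: int) -> None:
        self._frames = deque(maxlen=capacity)
        self._condition = threading.Condition()
        self.dropped = 0  # number of frames discarded because the buffer was full

    def put(self, frame: object) -> None:
        """Add a frame; never blocks, so the capture loop is not stalled."""
        with self._condition:
            if len(self._frames) == self._frames.maxlen:
                self.dropped += 1  # deque(maxlen=...) evicts the oldest item
            self._frames.append(frame)
            self._condition.notify()

    def get(self, timeout: Optional[float] = None) -> Optional[object]:
        """Return the oldest buffered frame, or None if none arrives in time."""
        with self._condition:
            has_frame = self._condition.wait_for(
                lambda: len(self._frames) > 0, timeout=timeout
            )
            if not has_frame:
                return None
            return self._frames.popleft()


# Simulate a capture loop that outpaces inference: 5 frames into 3 slots.
buffer = LatestFrameBuffer(capacity=3)
for frame_id in range(5):
    buffer.put(frame_id)

received = [buffer.get(timeout=0.1) for _ in range(3)]
print(received, "dropped:", buffer.dropped)
```

In a running pipeline the capture thread would call `put` and the inference thread would call `get`. Dropping old frames trades completeness for latency, which seems reasonable here because a carrot on a shelf is likely to appear in many consecutive frames.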

# Demo

Write a toy implementation of whatever machine learning concept you would like in order to demonstrate your skills. This doesn't need to be in the notebook if you want to use something other than Python.

The problems we work on are wholly related to classification, so your toy implementation should show knowledge of the fundamentals of classification problems.

In [3]:
# Import necessary libraries
import numpy as np
import tensorflow as tf
from sklearn.metrics import accuracy_score

# Load the MNIST dataset
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# Flatten the images to vectors of size 784
x_train = x_train.reshape(x_train.shape[0], -1)
x_test = x_test.reshape(x_test.shape[0], -1)

# Normalize the pixel values to range [0, 1]
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0

# Euclidean distance from every row of a (2-D array of points) to the single point b
def euclidean_distance(a, b):
    return np.sqrt(np.sum((a - b) ** 2, axis=1))

# k-NN classifier
def knn(x_train, y_train, x_test, k=3):
    predictions = []
    for test_point in x_test:
        distances = euclidean_distance(x_train, test_point)
        # Get the indices of the k nearest neighbors
        nearest_neighbors_indices = np.argsort(distances)[:k]
        # Get the labels of the k nearest neighbors
        nearest_labels = y_train[nearest_neighbors_indices]
        # Get the most common label (majority vote)
        unique, counts = np.unique(nearest_labels, return_counts=True)
        majority_vote = unique[np.argmax(counts)]
        predictions.append(majority_vote)
    return np.array(predictions)

# Use a subset of the test set for evaluation (e.g., 1000 samples)
subset_size = 1000
x_test_subset = x_test[:subset_size]
y_test_subset = y_test[:subset_size]

# Predict using k-NN with k=3
k = 3
y_pred = knn(x_train, y_train, x_test_subset, k=k)

# Evaluate the accuracy
accuracy = accuracy_score(y_test_subset, y_pred)
print(f'Accuracy: {accuracy * 100:.2f}%')


Accuracy: 96.20%


The code above implements a k-Nearest Neighbors (k-NN) classifier for the MNIST dataset using NumPy. The dataset is loaded through TensorFlow, each 28x28 image is flattened into a 784-dimensional vector, and the pixel values are normalized to the range [0, 1]. The helper function euclidean_distance computes the Euclidean distance from a single test point to every training point in one vectorized call. The knn function classifies each test point by computing those distances, selecting the k nearest neighbors, and taking the most frequent label among them (majority vote); if two labels tie, the smaller label wins, because np.unique returns sorted labels and np.argmax picks the first maximum. Finally, the classifier is evaluated on the first 1,000 test images, and the accuracy is computed with the accuracy_score function from scikit-learn and printed.
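
The per-point loop above recomputes distances one test image at a time and fully sorts them. As a hedged variation (not the method used in the cell above), the distances for a whole batch can be computed at once from the identity ‖a − b‖² = ‖a‖² − 2a·b + ‖b‖², and `np.argpartition` can pick the k smallest without a full sort. The function name `knn_predict_batch` and the toy data are assumptions made for illustration.

```python
import numpy as np


def knn_predict_batch(x_train, y_train, x_test, k=3):
    """k-NN majority vote using a single batched distance computation."""
    # Squared Euclidean distances, shape (n_test, n_train).
    # Square roots are skipped because they do not change the ordering.
    train_sq = np.sum(x_train ** 2, axis=1)          # (n_train,)
    test_sq = np.sum(x_test ** 2, axis=1)[:, None]   # (n_test, 1)
    sq_dists = test_sq - 2.0 * (x_test @ x_train.T) + train_sq

    # Indices of the k smallest distances per row (unordered within the k).
    nearest = np.argpartition(sq_dists, k - 1, axis=1)[:, :k]
    nearest_labels = y_train[nearest]                # (n_test, k)

    predictions = []
    for row in nearest_labels:
        unique, counts = np.unique(row, return_counts=True)
        predictions.append(unique[np.argmax(counts)])
    return np.array(predictions)


# Toy data: two well-separated clusters in 2-D.
toy_x_train = np.array(
    [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]], dtype=np.float32
)
toy_y_train = np.array([0, 0, 0, 1, 1, 1])
toy_x_test = np.array([[0.2, 0.1], [5.5, 5.2]], dtype=np.float32)

toy_pred = knn_predict_batch(toy_x_train, toy_y_train, toy_x_test, k=3)
print(toy_pred)
```

On the full MNIST training set the distance matrix for 1,000 test images has 1,000 x 60,000 entries, so in practice the test set would probably be processed in smaller batches to bound memory use. When several training points are exactly equidistant at the k-th position, `np.argpartition` may select a different neighbor than `np.argsort`, so predictions can differ slightly in such ties.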