**Video Processing Pipeline**

To adapt a working ML model that identifies carrots in individual images so that it runs on live video inside a grocery store, I would first make sure the hardware and software requirements are met. I would need a camera with sufficient resolution and frame rate to capture clear footage, and a computer with a GPU capable of processing frames quickly enough for real-time analysis. I would use OpenCV to read the video stream and extract frames, and a database to log the detected carrots.

First, I would open the camera with OpenCV and load the pretrained model. Then, for every frame of the video, I would preprocess the frame into the format the model expects and pass it to the model to get a prediction. Assuming the model is a binary classifier, it returns a confidence score; if that score exceeds a chosen threshold, I would log the detection to the database. To improve accuracy, I would use non-maximum suppression to reduce duplicate detections (this applies when the model outputs bounding boxes rather than a single score per frame), along with post-processing steps to filter out false positives.
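
Non-maximum suppression needs bounding boxes, which a frame-level classifier does not produce. As a minimal sketch, assuming the carrot model were replaced by a detector that returns boxes as `[x1, y1, x2, y2]` with one score each, greedy NMS could look like the following. The function names, the box format, and the 0.5 IoU threshold are illustrative assumptions, not part of the original model.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, all as [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    # Clip at zero so boxes that do not overlap get zero intersection
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_box = (box[2] - box[0]) * (box[3] - box[1])
    area_boxes = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_box + area_boxes - inter)

def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes to keep, highest score first."""
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores)[::-1]  # indices sorted by descending score
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        if rest.size == 0:
            break
        # Drop every remaining box that overlaps the kept box too much
        overlaps = iou(boxes[best], boxes[rest])
        order = rest[overlaps <= iou_threshold]
    return keep

# Two near-duplicate detections of one carrot plus a separate detection
boxes = [[10, 10, 110, 110], [12, 12, 112, 112], [200, 200, 260, 260]]
scores = [0.90, 0.80, 0.75]
kept = non_max_suppression(boxes, scores)
print(kept)
```

Here the second box overlaps the first almost completely, so only the higher-scoring one survives, while the distant third box is kept.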


In [None]:
#Sample code to show how I would use a carrot detection model on a live video feed
import cv2
import numpy as np
import datetime
import tensorflow as tf

#load the carrot model
carrot_model = tf.keras.models.load_model('carrot_detection_model.h5')

#function to log detected carrots
def log_detection(frame):
    # Include microseconds so frames logged within the same second do not overwrite each other
    timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S_%f")
    filename = f"detected_carrot_{timestamp}.jpg"
    cv2.imwrite(filename, frame)

# Function to preprocess frames
def preprocess_frame(frame):
  width = 224
  height = 224
  resized_frame = cv2.resize(frame, (width, height))
  normalized_frame = resized_frame / 255.0
  # Add a batch dimension; OpenCV frames have 3 color channels
  return normalized_frame.reshape((1, height, width, 3))

In [None]:
#Capture with default camera
cap = cv2.VideoCapture(0)

if not cap.isOpened():
  print("Error: Could not open video stream.")
  exit()

while True:
  ret, frame = cap.read()
  if not ret:
    break

  #Preprocess the frame
  preprocessed_frame = preprocess_frame(frame)

  #predict with the carrot model
  pred = carrot_model.predict(preprocessed_frame, verbose=0)  # verbose=0 avoids a progress bar on every frame
  score = pred[0][0]

  #if the score exceeds the threshold then log the detection
  if score > 0.7:
    print("Carrot Detected")
    log_detection(frame)

  cv2.imshow('Carrot Stream', frame)

  if cv2.waitKey(1) & 0xFF == ord('q'):
    break

cap.release()
cv2.destroyAllWindows()

**DEMO**
Write a toy implementation of whatever machine learning concept you would like in order to demonstrate your skills:

I will be using the MNIST dataset to train a CNN. The MNIST dataset consists of 70,000 images of handwritten digits (0-9), where each image is 28x28 pixels. I will load the dataset, preprocess it, train a CNN model, and then evaluate its performance.

The CNN model consists of:
- Input Layer: takes the shape of each image, including the number of channels
- Convolutional Layer 1: applies a convolution operation to the input and extracts features such as edges, textures, and shapes. Each filter learns to recognize a different pattern in the images
  * 32 Kernels
  * Kernel Size (3,3)
  * Activation Function ('relu')
  * Input Shape (28, 28, 1)
- Convolutional Layer 2
  * 64 Kernels
  * Kernel Size (3,3)
  * Activation Function ('relu')
- Pooling Layer: reduces the spatial dimensions of the feature maps, keeping the most important features while reducing the computational load
  * Pool Size (2,2)
- Dropout Layer: randomly sets a fraction of the input units to 0 during training to prevent overfitting
  * Dropout Rate (0.25)
- Flatten Layer: converts the 2D feature maps into a 1D vector
- Hidden Dense Layer: combines the extracted features before classification
  * 128 neurons
  * Activation Function ('relu')
  * Followed by a second Dropout Layer with Dropout Rate (0.5)
- Output Dense Layer: performs classification based on the features extracted by the convolutional and pooling layers, using softmax activation to output class probabilities
  * 10 neurons, one for each digit class
  * Activation Function ('softmax')
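
To make the layer sizes concrete, the following sketch walks through the output shapes and trainable parameter counts by hand. It assumes the Keras defaults used in the model code in this notebook ('valid' padding, stride 1, a bias term per filter or neuron) and the 128-unit hidden dense layer from that code; the helper names are hypothetical.

```python
def conv_out(size, kernel, stride=1):
    """Output side length of a 'valid' (unpadded) convolution."""
    return (size - kernel) // stride + 1

def conv_params(kernel, in_channels, out_channels):
    """Weights plus one bias per output channel."""
    return kernel * kernel * in_channels * out_channels + out_channels

def dense_params(n_in, n_out):
    """Weights plus one bias per output neuron."""
    return n_in * n_out + n_out

side = 28
side = conv_out(side, 3)   # after Conv2D(32): 26 x 26 x 32
side = conv_out(side, 3)   # after Conv2D(64): 24 x 24 x 64
side = side // 2           # after MaxPooling2D(2, 2): 12 x 12 x 64
flat = side * side * 64    # after Flatten: 9216 values

params = {
    "conv1": conv_params(3, 1, 32),
    "conv2": conv_params(3, 32, 64),
    "dense_128": dense_params(flat, 128),
    "dense_10": dense_params(128, 10),
}
total = sum(params.values())
print(flat, params, total)
```

Pooling, dropout, and flatten layers have no trainable parameters, so nearly all of the weights sit in the first dense layer. The total can be checked against `model.summary()` once the model is built.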

With a CNN like this, we can define and train a model that reaches about 99% accuracy on the MNIST test set. We could improve the model further by adding more specific, fine-tuned layers, or by combining multiple models. We could also train on additional datasets so that the model sees a wider variety of data.
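
One simple way to combine multiple models is to average their softmax outputs and take the most probable class. The sketch below assumes each model has already produced a probability array of shape (n_samples, n_classes), for example from `model.predict`; the function name `ensemble_predict` and the toy probabilities are made up for illustration.

```python
import numpy as np

def ensemble_predict(prob_list):
    """Average class probabilities from several models and pick the top class."""
    stacked = np.stack(prob_list, axis=0)  # (n_models, n_samples, n_classes)
    mean_probs = stacked.mean(axis=0)      # (n_samples, n_classes)
    return mean_probs.argmax(axis=1), mean_probs

# Toy outputs from two hypothetical models on two samples with three classes
probs_a = np.array([[0.6, 0.4, 0.0], [0.2, 0.5, 0.3]])
probs_b = np.array([[0.3, 0.7, 0.0], [0.1, 0.2, 0.7]])

labels, mean_probs = ensemble_predict([probs_a, probs_b])
print(labels)
print(mean_probs)
```

Averaging tends to help most when the individual models make different mistakes, for example when they are trained from different random initializations.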





In [None]:
#import libraries
import numpy as np
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import EarlyStopping

In [None]:
#Load the dataset
(X_train, Y_train), (X_test, Y_test) = mnist.load_data()

# Reshape the data to include a single channel
X_train = X_train.reshape((X_train.shape[0], 28, 28, 1))
X_test = X_test.reshape((X_test.shape[0], 28, 28, 1))

# Normalize the pixel values to the range [0, 1]
X_train = X_train.astype('float32') / 255.0
X_test = X_test.astype('float32') / 255.0

# Convert the labels to categorical format
y_train = to_categorical(Y_train, 10)
y_test = to_categorical(Y_test, 10)

In [None]:
# Build the CNN model
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))

In [None]:
# Compile the model
model.compile(loss='categorical_crossentropy', optimizer=Adam(), metrics=['accuracy'])

# Define early stopping to prevent overfitting
early_stopping = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)

In [None]:
# Train the model
history = model.fit(X_train, y_train, validation_split=0.2, epochs=20, batch_size=128, callbacks=[early_stopping])

# Evaluate the model on the test set
test_loss, test_accuracy = model.evaluate(X_test, y_test)
print(f'Test accuracy: {test_accuracy:.4f}')