# Background Extraction from videos using Gaussian Mixture Models

**Objective**  
Given a video with sparse and dynamically changing foreground, extract the static background.

**Input:**  
![](./resources/traffic.gif)

**Expected Output:**  
![](./resources/background.png)

In [None]:
import os
import cv2
import numpy as np
import matplotlib.pyplot as plt
from sklearn.mixture import GaussianMixture

In [None]:
vid = cv2.VideoCapture('./data/traffic.avi')

**Algorithm:**  
The step-wise approach is as follows:
1. Extract frames from the video.
2. Stack the frames in an array whose final dimensions will be *(num_frames, image_height, image_width, num_channels)*.
3. Initialize a dummy background image of the same size as that of the individual frames.
4. For each point, identified by its row, its column and its channel, model the intensity values across all the frames as a mixture of two Gaussians.
5. Once the mixture is fitted, set the intensity value at the corresponding location in the dummy background image to the mean of the cluster with the largest weight. That cluster corresponds to the background: because the foreground is sparse and changes over time, it covers any given pixel in only a minority of the frames, so its cluster receives the smaller weight.
6. Finally, the background image will contain the intensity values corresponding to the static background.
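
The steps above can be sketched end to end on a tiny synthetic "video" before running them on real footage. This is a minimal illustration under stated assumptions, not the notebook's own cell: the helper name `estimate_background`, the `random_state` argument and the toy data are all hypothetical and introduced only for this example.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def estimate_background(frames, random_state=0):
    """Per-pixel background estimate from a (num_frames, H, W, C) array.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    num_frames, height, width, channels = frames.shape
    background = np.zeros((height, width, channels), dtype=np.uint8)
    gmm = GaussianMixture(n_components=2, random_state=random_state)
    for i in range(height):
        for j in range(width):
            for k in range(channels):
                # Intensities of one pixel/channel across all frames, as a column.
                X = frames[:, i, j, k].reshape(-1, 1).astype(np.float64)
                gmm.fit(X)
                # The component with the largest weight is taken as background.
                idx = np.argmax(gmm.weights_)
                background[i, j, k] = int(gmm.means_[idx, 0])
    return background


# Assumed toy data: a flat gray scene (about 100) where a bright object
# (about 220) covers every pixel in 10 of the 60 frames.
rng = np.random.default_rng(0)
toy_frames = 100 + rng.normal(0, 2, size=(60, 2, 2, 1))
toy_frames[:10] = 220 + rng.normal(0, 2, size=(10, 2, 2, 1))
toy_frames = np.clip(toy_frames, 0, 255).astype(np.uint8)

toy_background = estimate_background(toy_frames)
print(toy_background[..., 0])
```

Because the bright object is present in only a sixth of the frames, the heavier Gaussian at each pixel should sit near the gray level of the scene, which is what the recovered background holds.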

In [None]:
frames = []

while True:
    ret, frame = vid.read()
    # ret is False once no more frames can be read
    if not ret:
        break
    frames.append(frame)

# free the video file handle once all frames are in memory
vid.release()

frame_count = len(frames)
frames = np.array(frames)

In [None]:
print("Number of frames extracted is {}".format(frame_count))

In [None]:
print("array dimensions will be (num_frames, image_height, image_width, num_channels)")
print("Shape of frames is {}".format(frames.shape))

**Data Modelling:**  
We model each pixel location, separately for each of the three image channels (**B**, **G** and **R**, in the order OpenCV stores them), as a mixture of two Gaussians, where one Gaussian in the mixture accounts for the background and the other for the foreground.
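
In symbols, the model for a single pixel location and channel can be written as follows. The notation is introduced here for illustration and does not appear in the original notebook: $x_t$ is the intensity in frame $t$, and $\pi_m$, $\mu_m$, $\sigma_m^2$ are the weight, mean and variance of component $m$ (what scikit-learn exposes as `weights_`, `means_` and `covariances_`).

$$
p(x_t) = \pi_1 \, \mathcal{N}(x_t \mid \mu_1, \sigma_1^2) + \pi_2 \, \mathcal{N}(x_t \mid \mu_2, \sigma_2^2), \qquad \pi_1 + \pi_2 = 1
$$

$$
\hat{b} = \mu_{m^*}, \qquad m^* = \arg\max_{m \in \{1, 2\}} \pi_m
$$

Here $\hat{b}$ is the background intensity assigned to that pixel and channel: the mean of whichever component carries more of the probability mass.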

In [None]:
gmm = GaussianMixture(n_components=2)

In [None]:
# initialize a dummy background image with all zeros (8-bit, like the frames)
background = np.zeros(shape=frames.shape[1:], dtype=np.uint8)

In [None]:
print("Shape of dummy background image is {}".format(background.shape))

In [None]:
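for i in range(frames.shape[1]):          # rows
    for j in range(frames.shape[2]):      # columns
        for k in range(frames.shape[3]):  # channels
            # intensities of this pixel/channel across all frames, as a column vector
            X = frames[:, i, j, k].reshape(-1, 1)
            gmm.fit(X)
            # the component with the largest weight is taken as the background
            idx = np.argmax(gmm.weights_)
            # means_ has shape (2, 1); index the scalar explicitly before casting
            background[i, j, k] = int(gmm.means_[idx, 0])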

In [None]:
# Store the result on disk as an 8-bit image
cv2.imwrite('background.png', background.astype(np.uint8))