## TEMPORAL MEDIAN FILTER with MASKED ARRAY

During the few seconds of the video in which the person repeatedly moves across the same spot,
both methods "absorb" the person into the background model.

To improve on this, we tried implementing a **selective update** for the temporal median using NumPy's ``numpy.ma`` module (the ``MaskedArray`` class), so that the buffer stores a ``masked frame``, i.e. a frame together with its
associated boolean validity mask.

If a pixel has been detected as foreground by the algorithm, the validity mask marks that pixel as not valid, so it is excluded from the median computation.
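
As a minimal sketch of what a "masked frame" looks like, the snippet below builds one from a tiny hand-made frame. The pixel values and the `foreground` array are made up for illustration; in the actual pipeline the foreground mask would come from the detection step.

```python
import numpy as np
import numpy.ma as ma

# Hypothetical 2x3 grayscale frame and foreground mask (True = foreground).
frame = np.array([[10, 20, 30],
                  [40, 50, 60]], dtype=np.uint8)
foreground = np.array([[False, True, False],
                       [False, False, True]])

# In numpy.ma a True mask entry marks the value as INVALID,
# so the foreground mask can be used directly as the array mask.
masked_frame = ma.masked_array(frame, mask=foreground)

print(masked_frame)
print(masked_frame.count())  # number of valid (background) pixels
```

Only the unmasked pixels of such a frame take part in a later `ma.median` call.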

With this approach, the TemporalMedian foreground mask was qualitatively better:
the foreground pixels were indeed "absorbed" less.

Let's explore the implementation.

In [None]:
import numpy as np
import numpy.ma as ma
import cv2
import cv2.version

print(f"cv2.version.opencv_version: {cv2.version.opencv_version}")
print(f"np.version.full_version: {np.version.full_version}")

### numpy.ma.median function

Given a vector ``V`` with ``N`` non-masked values, the median of ``V``
is the middle value of a sorted copy of ``V`` (``Vs``), i.e.
``Vs[(N-1)/2]`` when ``N`` is odd, or ``(Vs[N/2 - 1] + Vs[N/2]) / 2``
when ``N`` is even.

When a masked array is printed, masked (missing) entries are shown as a placeholder string, ``'--'`` by default. This is why the median of a fully masked array prints as ``--``.

In [25]:
not_valid = ma.array(   data = [    1, 2, 2, 5, 10, 15, 25, 40], 
                        mask = [    1, 1, 1, 1,  1,  1,  1,  1])
print(f"median: {ma.median(not_valid)}")
print(f"median: {ma.median(not_valid).astype(np.uint8)}")

good = ma.array(data = [    1, 2, 2, 5, 10, 15, 25, 40], 
                mask = [    1, 0, 1, 1,  0,  1,  1,  0])
print(f"median: {ma.median(good)}")
print(f"median: {ma.median(good).astype(np.uint8)}")

good_2d = ma.array(data = [     [1, 2, 2],
                                [5, 10, 15], 
                                [25, 40, 50]], 
                mask = [[1, 1, 0], 
                        [1, 0, 0],
                        [1, 1, 0]])
print(f"median: {ma.median(good_2d)}")
print(f"median: {ma.median(good_2d).astype(np.uint8)}")

median: --
median: 0
median: 10.0
median: 10
median: 12.5
median: 12
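
The temporal median uses the same function along the time axis (`axis=0`). The sketch below uses a hypothetical buffer of three 1x3 frames to show the three cases that matter: a pixel with all samples valid, a pixel with a single valid sample, and a pixel with no valid sample at all.

```python
import numpy as np
import numpy.ma as ma

# Hypothetical buffer of 3 frames, each 1x3 pixels: shape (time, height, width).
buffer = ma.array(
    data=[[[10, 200, 7]],
          [[12, 210, 8]],
          [[14,  50, 9]]],
    mask=[[[0, 1, 1]],
          [[0, 1, 1]],
          [[0, 0, 1]]],
)

# Per-pixel median over time, ignoring masked samples.
background = ma.median(buffer, axis=0)
print(background)

# Pixels with no valid sample stay masked in the result.
print(ma.getmaskarray(background))
```

The first pixel takes the median of its three valid samples, the second falls back to its only valid sample, and the third remains masked because the buffer holds no valid value for it.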


### Temporal Median with Selective Update
This code was developed only to understand how to use and implement the selective update for the temporal median background model. <br>
**Note:** It does *not* follow the code structure presented in the report.
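
Before the full OpenCV loop, the core update step can be sketched in isolation. The function name `selective_median_update` and all of its parameters are hypothetical and do not appear in the notebook; the sketch assumes a circular buffer created with `ma.masked_all` and a previous background to fall back on.

```python
import numpy as np
import numpy.ma as ma

def selective_median_update(buffer, slot, frame, foreground, previous_bg):
    """Store one masked frame in the circular buffer and recompute the background.

    All names are illustrative; `foreground` is True where the pixel
    must be excluded from the median.
    """
    buffer[slot] = ma.masked_array(frame, mask=foreground)
    median = ma.median(buffer, axis=0)

    # Pixels with no valid sample in the whole buffer keep the previous value.
    missing = ma.getmaskarray(median)
    background = np.where(missing, previous_bg, median.filled(0)).astype(np.uint8)

    next_slot = (slot + 1) % buffer.shape[0]
    return background, next_slot

# Usage with a tiny 1x2 "video": the second pixel is foreground.
buffer = ma.masked_all((3, 1, 2))
previous_bg = np.array([[100, 100]], dtype=np.uint8)
frame = np.array([[90, 250]], dtype=np.uint8)
foreground = np.array([[False, True]])

background, slot = selective_median_update(buffer, 0, frame, foreground, previous_bg)
print(background, slot)
```

The foreground pixel never reaches the median, so its background value is carried over from `previous_bg`, which is the same fallback the cells below implement with `oldTemporalMedianBackground`.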

In [None]:
def open_video(path):
    cap = cv2.VideoCapture(path)

    assert cap.isOpened(), "Not opened!"

    fps = int(cap.get(cv2.CAP_PROP_FPS))
    total_frame_count = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    length = total_frame_count / fps

    width  = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

    print(f"[I] Video FPS: {fps}")
    print(f"[I] Video Total frame count: {total_frame_count}")
    print(f"[I] Video Length: {length}")
    print(f"[I] Video Frame Width: {width}")
    print(f"[I] Video Frame height: {height}")
    return cap, fps, total_frame_count, (width, height)

In [None]:
cap, fps, total_frame_count, (width, height) = open_video("../video/rilevamento-intrusioni-video.wm")

FRAME_BUFFERED_PER_SECOND = 2                  # 2 images added each second to frame buffer
MAX_HISTORY = fps                              # 12 frames stored in the "circular buffer"
SKIP_FRAMES = fps // FRAME_BUFFERED_PER_SECOND

frame_count = 0
skip_count = 0

font = cv2.FONT_HERSHEY_SIMPLEX

stacked = np.ma.masked_all((MAX_HISTORY, height, width))
n = 0

In [None]:
# Select the first 5 seconds of frames by using 2 frames per second
frameIds = range(0, int(cap.get(cv2.CAP_PROP_FPS) * 5), SKIP_FRAMES)
 
# Store selected frames in an array
frames_5sec = []
for fid in frameIds:
    cap.set(cv2.CAP_PROP_POS_FRAMES, fid)
    ret, frame = cap.read()
    # frame = cv2.GaussianBlur(frame, (3, 3), 3)
    frames_5sec.append(frame)

# Calculate the median along the time axis
medianFrame = np.median(frames_5sec, axis=0).astype(dtype=np.uint8)
medianFrame = cv2.cvtColor(medianFrame, cv2.COLOR_BGR2GRAY)

cap.set(cv2.CAP_PROP_POS_FRAMES, 0)

In [None]:
masked_frame = None
temporalMedianBackground = None
oldTemporalMedianBackground = None

while(cap.isOpened()):
    ret, frame_original = cap.read()
    if not ret or frame_original is None:
        cap.release()
        print("Released Video Resource")
        break

    frame_count += 1
    skip_count += 1
    
    frame = cv2.cvtColor(frame_original, cv2.COLOR_BGR2GRAY)
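    # Run detection whenever a background estimate exists; checking n > 0 would
    # skip detection each time the circular index wraps back to 0.
    if temporalMedianBackground is not None: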
        diff = cv2.absdiff(frame, temporalMedianBackground)
        # [ DETECT FOREGROUND ]
        _, binary = cv2.threshold(diff, 35, 255, cv2.THRESH_BINARY)
        binary = cv2.erode(binary, np.ones((3,3), np.uint8))
        binary = cv2.dilate(binary, np.ones((3,3), np.uint8), iterations=4)
        binary = cv2.erode(binary, np.ones((3,3), np.uint8))
        binary = cv2.dilate(binary, np.ones((3,3), np.uint8), iterations=1)

        # [ DO NOT USE FOREGROUND IN THE TEMPORAL MEDIAN ]
        masked_frame = ma.masked_where(binary == 255, frame)
        
        cv2.imshow("mask", binary)


    if len(stacked) == 0 or skip_count == SKIP_FRAMES:
        skip_count = 0
        stacked[n] = masked_frame if masked_frame is not None else ma.masked_array(medianFrame)
        n+=1
        if n == MAX_HISTORY:
            n = 0

        temporalMedianBackground = ma.median(stacked, axis=0)
        if oldTemporalMedianBackground is not None:
            missing_mask = ma.getmaskarray(temporalMedianBackground)

            # The mask of a masked array is accessible through its mask attribute.
            # We must keep in mind that a True entry in the mask indicates an invalid data.
            missing_mask_copy = missing_mask.copy()
            missing_mask_copy = missing_mask_copy.astype(dtype='uint8')
            missing_mask_copy *= 255
            cv2.imshow("missing_mask", missing_mask_copy)
            
            # If all the values in the buffer are invalid for that position, keep the old one.
            # For this reason, this implementation estimates a median frame up front to have a reasonable initial value.
            temporalMedianBackground[missing_mask] = oldTemporalMedianBackground[missing_mask]
            
            # the values are now valid, empty the mask
            temporalMedianBackground.mask = np.zeros_like(temporalMedianBackground.mask)
        
        temporalMedianBackground = temporalMedianBackground.astype(dtype=np.uint8)
        oldTemporalMedianBackground = temporalMedianBackground
        temporalMedianBackground_copy = temporalMedianBackground.copy()
        cv2.putText(temporalMedianBackground_copy, f"FRAME: {frame_count}/{total_frame_count}", (5, 25), font, 0.5, (0, 0, 0), 1) 
        cv2.imshow("temporalMedianBackground", temporalMedianBackground_copy)
    
    cmd = cv2.waitKey(0)
    if cmd == ord("q"):
        break
    if cmd == ord("n"):
        continue


cap.release()
cv2.destroyAllWindows()

The main problems were:
1. higher computational time
2. due to combination with `MOG2`, the problem was only half solved.

The increased computation time slowed the algorithm down. Even after reducing the buffer size (a reasonable adjustment since, in theory, the buffer should now contain only valid values), the issue persisted.
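
One way to check this kind of overhead is to time `np.median` against `ma.median` on buffers of the same shape. The sketch below uses random data, an arbitrary buffer shape, and a made-up 10% foreground ratio, so the timings it prints depend on the machine and say nothing about the actual video.

```python
import time
import numpy as np
import numpy.ma as ma

rng = np.random.default_rng(0)
# Hypothetical buffer: 12 frames of 120x160 pixels.
frames = rng.integers(0, 256, size=(12, 120, 160)).astype(np.uint8)
# Hypothetical validity mask: roughly 10% of the samples flagged as foreground.
masked_frames = ma.masked_array(frames, mask=rng.random(frames.shape) < 0.1)

def time_call(fn, repeats=3):
    """Return the best wall-clock time (seconds) over a few repeats."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best

plain_time = time_call(lambda: np.median(frames, axis=0))
masked_time = time_call(lambda: ma.median(masked_frames, axis=0))
print(f"np.median: {plain_time:.4f} s, ma.median: {masked_time:.4f} s")
```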

Another issue arises from the combination of masks performed in `Step [4]` and the fact that in the `MOG2` method, pixels are quickly "absorbed".

Given a quasi-static object, over time, the `MOG2` method produces a "fading" foreground mask; the longer an object remains still, the more the foreground mask diminishes. On the other hand, the `temporalMedian` could, in theory, still perceive the difference in the "static" phase of the object.

However, because `MOG2`, our short-time-horizon algorithm, dictates the ROI in which the foreground masks are combined, a quasi-static object wouldn't appear in the `combined foreground mask`, even if the `TemporalMedian` is correctly detecting it.
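
The effect can be illustrated with two toy masks. The combination rule used here (keep temporal-median detections only where `MOG2` also reports foreground) is an assumption made for the sketch, since `Step [4]` is not reproduced in this section, and the mask values are invented.

```python
import numpy as np

# Hypothetical 1x6 masks (True = foreground); the person stands still at columns 2-3.
median_mask = np.array([[0, 0, 1, 1, 0, 0]], dtype=bool)  # still detected
mog2_mask = np.array([[0, 0, 0, 0, 0, 0]], dtype=bool)    # already faded out

# Assumed combination rule: keep median detections only inside the MOG2 region.
combined = median_mask & mog2_mask

print(median_mask.any())  # the temporal median still sees the person
print(combined.any())     # the combined mask does not
```

Under this assumption, improving the temporal median alone cannot recover the object once the `MOG2` region is empty.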

Finally, when the person starts moving again and changes position rapidly, as seen in a few frames of the video, the `MOG2` method will create a "ghost" effect, duplicating the figure.

For these reasons, <ins>**we decided against incorporating this part**.</ins>