## Testing the gesture buffer idea for movement recognition

We'll use a deque, which appends and pops efficiently at both ends, for this one.

In [50]:
from collections import deque
import numpy as np
import time
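
Since the cells below use plain lists, here is a minimal sketch of how a `deque` with `maxlen` could act as the rolling buffer. The name `frame_buffer` and the size of 4 are illustrative assumptions, not part of the experiment itself.

```python
from collections import deque

# Hypothetical rolling buffer that keeps only the 4 most recent gestures.
frame_buffer = deque(maxlen=4)

for gesture in ['two', 'one', 'closed_fist', 'one', 'closed_fist']:
    # Appending to a full deque silently drops the oldest element.
    frame_buffer.append(gesture)

print(list(frame_buffer))  # ['one', 'closed_fist', 'one', 'closed_fist']
```

With `maxlen` set, old frames fall off the left end automatically, so the buffer never needs to be trimmed by hand.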

We'll create a buffer class later; for now we just want to be able to take a sequence of gestures and catch the gesture with movement (mov_gesture).

Below are the sequences of static gestures that generate the gestures for eleven and twelve.

In [30]:
eleven = ['one', 'closed_fist', 'one', 'closed_fist']
twelve = ['one', 'closed_fist', 'two', 'closed_fist']

Assume our buffer is the best possible case, i.e., it contains eleven and nothing else.

In [8]:
buffer = eleven.copy()

One way to check whether it's eleven is to walk through the buffer and see whether the checkpoint reaches len(eleven).

In [16]:
def seq_in_buff(buffer, gesture_movie):
    cp = 0 # Checkpoint starts at zero
    len_gesture_movie = len(gesture_movie)
    for el in buffer:
        if el == gesture_movie[cp]:
            cp += 1
        if cp == len_gesture_movie:
            return True
    return False

print(f'Eleven is in buffer: {seq_in_buff(buffer, eleven)}')

Eleven is in buffer: True


This also works with a buffer that contains other, unrelated gestures: as long as the buffer contains the elements of gesture_movie in the correct order, the check succeeds.

In [48]:
buffer_long = [
    "two",
    "three",
    "one", # First
    "three",
    "two",
    "one",
    "closed_fist", # Second
    "closed_fist",
    "closed_fist",
    "closed_fist",
    "two",
    "three",
    "one", # Third
    "three",
    "two",
    "one",
    "two",
    "three",
    "one", # Last
    "three",
    "two",
    "one",
    "closed_fist",
    "closed_fist",
    "closed_fist",
    "closed_fist",
]
print(f'Eleven is in buffer: {seq_in_buff(buffer_long, eleven)}')
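print('If we exclude the first closed_fist block from buffer we get:')
# Uncommenting the lines below mutates buffer_long, which later cells reuse.
# buffer_long[6:10] = ['one'] * 4  # a bare 'one' would be spliced in as 'o', 'n', 'e'
# print(f'Eleven is in buffer: {seq_in_buff(buffer_long, eleven)}')

Eleven is in buffer: True
If we exclude the first closed_fist block from buffer we get: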
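
The commented-out check can be tried on a copy, so that `buffer_long` itself stays intact for the later cells. This is only a sketch: `buffer_no_first_fist` and `noise` are hypothetical names, and `seq_in_buff` and `buffer_long` are repeated here just to keep the block self-contained.

```python
def seq_in_buff(buffer, gesture_movie):
    cp = 0  # Checkpoint starts at zero
    for el in buffer:
        if el == gesture_movie[cp]:
            cp += 1
        if cp == len(gesture_movie):
            return True
    return False

eleven = ['one', 'closed_fist', 'one', 'closed_fist']

# Same contents as buffer_long above, rebuilt from its repeating blocks.
noise = ['two', 'three', 'one', 'three', 'two', 'one']
buffer_long = noise + ['closed_fist'] * 4 + noise + noise + ['closed_fist'] * 4

# Work on a copy and overwrite the first closed_fist block with 'one'.
buffer_no_first_fist = buffer_long.copy()
buffer_no_first_fist[6:10] = ['one'] * 4

print(seq_in_buff(buffer_long, eleven))           # True
print(seq_in_buff(buffer_no_first_fist, eleven))  # False
```

With the first block removed, the only remaining closed_fist frames come after the last 'one', so the second 'one' of eleven is never matched.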


The problem with this approach is that it might be slow. Suppose we manage to optimize everything so that the recognizer runs at 60 fps. For gestures that last at most five seconds, the buffer then holds 300 elements, and we would have to go through the entire buffer once for every gesture with movement. This can be problematic.

One way to deal with this would be to go through the buffer just once, but keep one checkpoint for each gesture with movement.
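
To make the cost concrete, here is a rough back-of-the-envelope sketch. The frame rate and gesture duration come from the text above; the number of gestures with movement (`n_movements`) is a hypothetical value chosen only for illustration.

```python
fps = 60             # recognizer frame rate assumed in the text
max_seconds = 5      # longest gesture with movement assumed in the text
n_movements = 1000   # hypothetical number of gestures with movement

buffer_len = fps * max_seconds                     # frames held in the buffer
comparisons_per_rescan = buffer_len * n_movements  # upper bound per full rescan
frame_budget_ms = 1000 / fps                       # time available per frame

print(buffer_len)                 # 300
print(comparisons_per_rescan)     # 300000
print(round(frame_budget_ms, 1))  # 16.7
```

Rescanning the whole buffer for every gesture on every new frame would therefore need up to 300000 element comparisons inside a budget of roughly 16.7 ms.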

In [129]:
def seqs_in_buff(buffer, gesture_movies_array):
    cp = [0]*len(gesture_movies_array) # Checkpoints start at zero
    len_gesture_movies = [len(gesture_movies) for gesture_movies in gesture_movies_array]
    for el in buffer:
        for i, gesture_movies in enumerate(gesture_movies_array):
            if cp[i] < len_gesture_movies[i] and el == gesture_movies[cp[i]]:
                cp[i] += 1
    identified = [0]*len(gesture_movies_array)
    for i in range(len(identified)):
        if cp[i] == len_gesture_movies[i]:
            identified[i] = 1
    return identified

In [130]:
buffer = ['one', 'two', 'closed_fist', 'one', 'two', 'closed_fist']
print(f'Eleven and Twelve are in buffer: {seqs_in_buff(buffer, [eleven, twelve])}')
print('If we exclude the second two we get:')
buffer[4] = 'one'
print(f'Eleven and Twelve are in buffer: {seqs_in_buff(buffer, [eleven, twelve])}')

Eleven and Twelve are in buffer: [1, 1]
If we exclude the second two we get:
Eleven and Twelve are in buffer: [1, 0]


In [131]:
print(f'Eleven and Twelve are in buffer long: {seqs_in_buff(buffer_long, [eleven, twelve])}')

Eleven and Twelve are in buffer long: [1, 1]


Is this more efficient? Let's pad the front of the buffer with garbage up to 300 elements and time both functions over a thousand movements (eleven and twelve, repeated 500 times each).

In [182]:
n_movements = 1000
movs = [eleven, twelve] * (n_movements // 2)  # 1000 movements to check
elements = 300  # 60 fps * 5 s
long_garbage = ['garbage'] * (elements - len(buffer_long))
long_buffer = long_garbage + buffer_long  # 300-element buffer

In [183]:
start_time = time.time()
# for mov in movs:
    # seq_in_buff(long_buffer, mov)
for i in range(10):
    [seq_in_buff(long_buffer, mov) for mov in movs]
end_time = time.time()
print(f'Time elapsed: {(end_time - start_time)/10}')

Time elapsed: 0.02841064929962158


In [184]:
start_time1 = time.time()
for i in range(10):
    seqs_in_buff(long_buffer, movs)
end_time1 = time.time()
print(f'Time elapsed: {(end_time1 - start_time1)/10}')

Time elapsed: 0.048932552337646484


Apparently it's faster to just run seq_in_buff for each mov (about 28 ms per pass) than to try to do them all together (about 49 ms per pass).

Another way would be to keep the checkpoints the same way they are implemented in the class right now and just update them as frames arrive. Would this be faster?
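
Single `time.time()` measurements can be noisy, so the comparison may be worth repeating with `timeit`. The block below is a sketch under the same setup as above (300-element buffer, 1000 movements); the helper names `run_separately` and `run_together` are hypothetical, and the absolute numbers will depend on the machine.

```python
import timeit

def seq_in_buff(buffer, gesture_movie):
    cp = 0
    for el in buffer:
        if el == gesture_movie[cp]:
            cp += 1
        if cp == len(gesture_movie):
            return True
    return False

def seqs_in_buff(buffer, gesture_movies_array):
    cp = [0] * len(gesture_movies_array)
    lens = [len(g) for g in gesture_movies_array]
    for el in buffer:
        for i, g in enumerate(gesture_movies_array):
            if cp[i] < lens[i] and el == g[cp[i]]:
                cp[i] += 1
    return [int(c == n) for c, n in zip(cp, lens)]

eleven = ['one', 'closed_fist', 'one', 'closed_fist']
twelve = ['one', 'closed_fist', 'two', 'closed_fist']
movs = [eleven, twelve] * 500

noise = ['two', 'three', 'one', 'three', 'two', 'one']
buffer_long = noise + ['closed_fist'] * 4 + noise + noise + ['closed_fist'] * 4
long_buffer = ['garbage'] * (300 - len(buffer_long)) + buffer_long

def run_separately():
    return [seq_in_buff(long_buffer, mov) for mov in movs]

def run_together():
    return seqs_in_buff(long_buffer, movs)

# Both strategies must agree before their timings are worth comparing.
assert [int(found) for found in run_separately()] == run_together()

# Best of 3 repeats of 5 passes each, reported per pass.
t_separate = min(timeit.repeat(run_separately, number=5, repeat=3)) / 5
t_together = min(timeit.repeat(run_together, number=5, repeat=3)) / 5
print(f'separate: {t_separate * 1e3:.2f} ms, together: {t_together * 1e3:.2f} ms')
```

Taking the minimum over several repeats reduces the influence of other processes on the measurement.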

In [185]:
def update_cps(incoming_frame, cps, gesture_movies_array):
    for i, gesture_movies in enumerate(gesture_movies_array):
        if cps[i] < len(gesture_movies) and incoming_frame == gesture_movies[cps[i]]:
            cps[i] += 1
    return cps

In [186]:
cps = [0]*len(movs)
start_time2 = time.time()
for i in range(10):
    [update_cps(el, cps, movs) for el in long_buffer]
end_time2 = time.time()
print(f'Time elapsed: {(end_time2 - start_time2)/10}')

Time elapsed: 0.04528102874755859


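Which is better than seqs_in_buff (about 45 ms versus 49 ms) but still slower than seq_in_buff (about 28 ms). Note that cps is not reset between the ten repetitions, so only the first pass actually advances the checkpoints, and the average understates the cost of a fresh pass. The difference here is that if we want to add another frame, the check is much faster.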

In [187]:
long_buffer_p1 = long_buffer + ['one']
start_time3 = time.time()
for i in range(1000):
    update_cps('one', cps, movs)  # pass the gesture string itself, not a list
end_time3 = time.time()
print(f'Time elapsed: {(end_time3 - start_time3)/1000}')

Time elapsed: 0.00014226889610290529


Which is very good: about $142 \mu s$ per frame. There are as many checks as there are movements, and no more, regardless of how long the buffer is!

The only change needed in the class would be to make the checkpoint a list!

Note also that this doesn't lock any gesture with movement recognition!

One thing that could be done is to keep, along with each cp, the measurement error of each static gesture. If more than one cp is complete at the end of the 5 seconds, we can then choose the one with the smallest error.
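
As a rough illustration of that change, here is a minimal sketch of a class that holds one checkpoint per movement. The class in the project is not shown here, so `MovementTracker` and its method names are hypothetical.

```python
eleven = ['one', 'closed_fist', 'one', 'closed_fist']
twelve = ['one', 'closed_fist', 'two', 'closed_fist']

class MovementTracker:
    """Hypothetical tracker: one checkpoint per gesture with movement."""

    def __init__(self, movements):
        self.movements = movements
        self.cps = [0] * len(movements)  # the checkpoint is now a list

    def update(self, frame):
        """Advance every checkpoint whose next expected gesture is `frame`."""
        for i, mov in enumerate(self.movements):
            if self.cps[i] < len(mov) and frame == mov[self.cps[i]]:
                self.cps[i] += 1

    def completed(self):
        """Indices of the movements whose full sequence has been seen."""
        return [i for i, mov in enumerate(self.movements)
                if self.cps[i] == len(mov)]

    def reset(self):
        """Start over, e.g. once the 5-second window has elapsed."""
        self.cps = [0] * len(self.movements)

tracker = MovementTracker([eleven, twelve])
for frame in ['one', 'closed_fist', 'two', 'closed_fist']:
    tracker.update(frame)

print(tracker.cps)          # [2, 4]
print(tracker.completed())  # [1]
```

Each incoming frame costs one check per movement, and no buffer of past frames has to be stored at all.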
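
A minimal sketch of that idea follows. It assumes the static-gesture classifier supplies an error value with every frame (for instance one minus its confidence); the class `ScoredTracker`, its methods, and the error values in the example are all hypothetical.

```python
eleven = ['one', 'closed_fist', 'one', 'closed_fist']
twelve = ['one', 'closed_fist', 'two', 'closed_fist']

class ScoredTracker:
    """Hypothetical tracker that accumulates an error next to each checkpoint."""

    def __init__(self, movements):
        self.movements = movements
        self.cps = [0] * len(movements)
        self.errors = [0.0] * len(movements)

    def update(self, frame, error):
        """Advance matching checkpoints and add the frame's error to them."""
        for i, mov in enumerate(self.movements):
            if self.cps[i] < len(mov) and frame == mov[self.cps[i]]:
                self.cps[i] += 1
                self.errors[i] += error

    def best(self):
        """Index of the completed movement with the smallest error, or None."""
        done = [i for i, mov in enumerate(self.movements)
                if self.cps[i] == len(mov)]
        if not done:
            return None
        return min(done, key=lambda i: self.errors[i])

tracker = ScoredTracker([eleven, twelve])
# (gesture, error) pairs; the 'two' frame is deliberately given a large error.
frames = [('one', 0.1), ('closed_fist', 0.2), ('two', 0.6),
          ('one', 0.1), ('closed_fist', 0.1)]
for gesture, error in frames:
    tracker.update(gesture, error)

print(tracker.cps)     # [4, 4]
print(tracker.best())  # 0
```

Both eleven and twelve complete here, and eleven wins because its matched frames carry less total error. Summing errors only compares fairly between movements of equal length; movements of different lengths would probably need the mean error instead.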