## I've been hand-labeling the videos and then extracting clips from my own labels. This is not feasible to do in a timely manner and requires a lot of manual labor. Let's find a way to 

### To develop this functionality, we want to:
    1) Use the 2D keypoints of a full video
    2) Find the most relevant frames that correspond to a swing
    3) Crop and extract a corresponding video for each swing

### With this in place, all the user/system operator needs to provide is the actual scores

In [1]:
#| include: false
from fastai.vision.all import *
from utils import *

In [2]:
#| include: false
base_path = '../../../data/full_videos'
swing_days = ['jun8', 'aug9', 'sep14']
files = get_files(f'{base_path}/{swing_days[-1]}', extensions='.MOV')
df_sep14 = pd.read_csv(f'{base_path}/{swing_days[2]}/cleaned.csv').reset_index(drop=True)
len(files), files[0], df_sep14.head(1)

(8,
 Path('../../../data/full_videos/sep14/IMG_1090.MOV'),
      input_file  swing_index  score start   end  output_file
 0  IMG_1086.MOV            0      2  0:30  0:33          NaN)

In [3]:
#| echo: false
start_times = df_sep14[df_sep14.input_file == files[0].name].reset_index(drop=True).start.to_list()
print(f'The start times of every swing in {files[0].name} are:\n {start_times}')

The start times of every swing in IMG_1090.MOV are:
 ['0:28', '0:53', '1:27', '1:54', '2:25', '3:01']
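To compare these hand-labeled start times against frame indices later on, it helps to express them in frames. The sketch below is a hypothetical helper (`timestamp_to_frame` is not part of `utils`) and assumes a constant 60 fps, which is what the "600 frames = 10 seconds" comment in the next cell implies.

```python
def timestamp_to_frame(ts: str, fps: int = 60) -> int:
    """Convert an 'm:ss' label into a frame index, assuming a constant frame rate."""
    minutes, seconds = ts.split(':')
    return (int(minutes) * 60 + int(seconds)) * fps

# start times copied from the output above
labeled_starts = ['0:28', '0:53', '1:27', '1:54', '2:25', '3:01']
start_frames = [timestamp_to_frame(ts) for ts in labeled_starts]
print(start_frames)
```

Under that assumption the first labeled swing starts around frame 1680 of the full video.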


#### Let's pull a snippet of a video to see how much of the actual video is spent on the swing itself

In [4]:
#| code-fold: true
print(f'Grabbing a clip from {files[0].name}')
frames, fps = get_frames(files[0], 
                         per_second=False, # False: keep every frame (True would only grab every fps-th frame)
                         start_idx=600, # start 10 seconds in (at 60 fps)
                         #start_idx=1200, # start 20 seconds in
                         num_frames=1500, # only pull down 25 seconds of video
                         #num_frames=250, # only pull down 4ish seconds of video
                         resize_dim=(256,256),
                         show_progress=True
                        )
save_frames(frames=frames, fps=fps, fname='useless_frames.mp4')

Grabbing a clip from IMG_1090.MOV


100%|████████████████████████████████████████████████████████████████████████████| 1500/1500 [02:34<00:00,  9.68it/s]


### The start of a swing video:

{{< video useless_frames.mp4 width="400" height="300" >}}

 - First 10 seconds are useless and not included
 - Only about 3 seconds of the first 35 seconds is relevant
 - The full video has 6 swings and is over 3 minutes long!

### Given this, how can we extract only the useful frames of the video?

 - I tried a few different approaches that did not work consistently, or did not work well at all
 - After a few days of that, let's see if a simple approach will get the job done
 - We only need a close approximation of a frame somewhere in the middle of the swing. Once we have it, we can just pull 1-2 seconds before and after this middle frame
   - Our existing physics code can then normalize the swings around the peak of the backswing, which we've already fleshed out and shown to work

### The approach
 - The simple extraction method is to filter on the 2D keypoints and say yes to any frame where both hands are above the shoulders
     - If both hands are above the shoulders, we know we're inside a swing
     - When giving the score, one hand is above the shoulder, so we need BOTH above
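As a toy illustration of the rule above, here is a minimal sketch on synthetic keypoints. It assumes COCO-17 keypoint ordering (shoulders at indices 5 and 6, wrists at 9 and 10) and image coordinates where y grows downward. The function name and the `demo_kps` array are made up for this example.

```python
import numpy as np

# COCO-17 indices (an assumption about the pose model's output ordering)
L_SHOULDER, R_SHOULDER, L_WRIST, R_WRIST = 5, 6, 9, 10

def both_wrists_above_shoulders(kps: np.ndarray) -> np.ndarray:
    """kps has shape (n_frames, 17, 2) holding (x, y); returns one bool per frame."""
    y = kps[..., 1]
    # image y grows downward, so "above" means a smaller y value
    return (y[:, L_WRIST] < y[:, L_SHOULDER]) & (y[:, R_WRIST] < y[:, R_SHOULDER])

demo_kps = np.zeros((3, 17, 2))
demo_kps[:, [L_SHOULDER, R_SHOULDER], 1] = 100  # shoulders at y=100 in every frame
demo_kps[0, [L_WRIST, R_WRIST], 1] = 150        # frame 0: both hands down
demo_kps[1, L_WRIST, 1] = 60                    # frame 1: one hand raised (giving the score)
demo_kps[1, R_WRIST, 1] = 150
demo_kps[2, [L_WRIST, R_WRIST], 1] = 60         # frame 2: both hands raised (mid-swing)

mask = both_wrists_above_shoulders(demo_kps)
print(mask)
```

Only the last frame passes, which is the point of requiring BOTH hands: the one-hand score gesture in frame 1 is rejected.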

In [7]:
#| code-fold: true
# logger = MMLogger.get_instance('mmpose')
# logger.setLevel('ERROR')  # or 'ERROR' for even less output
# labeler = get_labeler('vit');
# generate_labels(labeler, 'useless_frames.mp4', out_dir='keypoints');

### The same video labeled:

{{< video keypoints/useless_frames.mp4 width="400" height="300" >}}

 - First
 - Only 
 - The 

In [16]:
#| code-fold: true
from swing_data import *
kps = KpExtractor('keypoints/useless_frames.pkl').keypoint_data.kps
kps.shape, kps[0,9]

In [54]:
#| code-fold: true
l_shoulder = kps[:, 5, 1] # Left Shoulder KP (y coordinate)
r_shoulder = kps[:, 6, 1] # Right Shoulder KP
l_elbow = kps[:, 7, 1] # Left Elbow KP
r_elbow = kps[:, 8, 1] # Right Elbow KP
l_wrist = kps[:, 9, 1] # Left Wrist KP
r_wrist = kps[:, 10, 1] # Right Wrist KP
# image y grows downward, so less than means above
left_wrist_above_elbow = l_wrist < l_elbow
right_wrist_above_elbow = r_wrist < r_elbow
left_wrist_above_sh = l_wrist < l_shoulder
right_wrist_above_sh = r_wrist < r_shoulder

In [61]:
#| code-fold: true
combined_true = left_wrist_above_elbow & right_wrist_above_elbow & left_wrist_above_sh & right_wrist_above_sh
higher_idxs = np.where(combined_true)[0]
print(f'There are {len(higher_idxs)} frames with the wrists above the elbow and shoulders')

There are 77 frames with the wrists above the elbow and shoulders
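Since a full video holds several swings, the matching frame indices would need to be split into one group per swing. A possible sketch is below; `group_consecutive` and `max_gap` are hypothetical names, and the demo indices are synthetic rather than taken from the video.

```python
import numpy as np

def group_consecutive(idxs, max_gap=1):
    """Split sorted frame indices into (first, last) runs wherever the gap exceeds max_gap."""
    idxs = np.asarray(idxs)
    if idxs.size == 0:
        return []
    # positions where a new run starts
    breaks = np.where(np.diff(idxs) > max_gap)[0] + 1
    return [(int(run[0]), int(run[-1])) for run in np.split(idxs, breaks)]

demo_idxs = np.array([3, 4, 5, 40, 41, 43, 90])
runs = group_consecutive(demo_idxs, max_gap=2)
print(runs)
```

A `max_gap` larger than 1 tolerates a few frames where the keypoint detector briefly misses a wrist. Very short runs, like the single frame at 90 here, could be dropped as noise before extracting clips.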


In [57]:
#| code-fold: true
higher_frames = np.stack([frames[idx] for idx in higher_idxs])
save_frames(fname='higher_frames.mp4', frames=higher_frames)
higher_frames.shape

(77, 256, 256, 3)

### Just the frames when wrist is above shoulder and elbow:

{{< video higher_frames.mp4 width="400" height="300" >}}

 - ...  

In [63]:
#| code-fold: true
first_high_idx = higher_idxs[0]
first_high_idx

1150

In [75]:
#| code-fold: true
# get 1.5 seconds before and after first high index
init_idx = first_high_idx - 90
final_idx = first_high_idx + 90
init_idx, final_idx

(1060, 1240)
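The fixed offset of 90 frames works here because the first high index sits well inside the clip. A slightly more defensive version might derive the offset from the frame rate and clamp to the clip bounds. This is a sketch with a hypothetical helper name, assuming 60 fps.

```python
def window_around(center_idx, n_frames, fps=60, seconds=1.5):
    """Return (start, stop) slice bounds covering `seconds` on each side of center_idx."""
    half = int(round(fps * seconds))
    start = max(0, center_idx - half)        # do not run off the front of the clip
    stop = min(n_frames, center_idx + half)  # do not run off the end of the clip
    return start, stop

print(window_around(1150, n_frames=1500))  # same window as the cell above
print(window_around(30, n_frames=1500))    # clamped at the start of the clip
```

Without the clamp, a negative start index would silently wrap around when used in a slice like `frames[init_idx:final_idx]`.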

In [78]:
#| include: false
final_frames = frames[init_idx:final_idx]
save_frames(fname='final_frames.mp4', frames=final_frames)

### 1.5 seconds before and after the first high frame:

{{< video final_frames.mp4 width="400" height="300" >}}

 - ...  

First 90 seconds of video

{{< video all_frames.mp4 width="400" height="300" >}}

...

First 90 seconds of video clipped!

{{< video ninety/final_frames.mp4 width="400" height="300" >}}

...