<h1 style="font-size:30px;">Application: Golf Swing Analysis and Training</h1> 

Sports training and instruction rely heavily on video analysis to show students proper form for activities such as throwing, kicking, and swinging. Instructors often use videos of professionals as a reference, but visual comparison can be tedious, often requiring manual annotation of video frames with a mouse at various stages to highlight key differences.

<br>
<center>
<img src = "https://opencv.org/wp-content/uploads/2021/10/c0-m17-02-Feature-Image.png" alt="Golf-swing-analysis">
</center>
<br>

In this application notebook we will demonstrate how to use landmark points from MediaPipe Pose to develop a visual application that allows you to compare a student's golf swing to a reference golf swing from a professional. Video frames are automatically annotated with important geometric relationships that help inform the student in a more efficient manner. 

Designing a proper sports training aid requires domain expertise in the given sport. This notebook provides the basis for such an application applied to the golf swing. The body of knowledge on proper technique in any sport is deep and in some cases subjective, but professionals collectively demonstrate certain fundamentals that can be visually quantified and compared to a student's execution. This application lets you compare golf swings at key moments during the swing to determine where the student's form can be improved relative to a reference video from a professional golfer.

Reference golf swing from: [GolfswingHD](https://www.youtube.com/watch?v=wIiLM8ufWVI)

In [None]:
if 'google.colab' in str(get_ipython()):
    print("Downloading Code to Colab Environment")
    !wget https://www.dropbox.com/sh/gz8x999ela84qr2/AABHRkHY_QALmo6JHSpOruA8a?dl=1 -O module-code.zip -q --show-progress
    !unzip -qq module-code.zip
    !pip install --upgrade opencv-contrib-python
    !pip install mediapipe
    %cd Applications/
else:
    pass

In [None]:
import cv2
import time
import math
import mediapipe as mp
import numpy as np
from moviepy.editor import VideoFileClip

# 1. Preview Videos

## 1.1 Student Golf Swing

In [None]:
clip = VideoFileClip('./Student_Swing_DTL_wide.mp4')
clip.ipython_display(width=1000)

## 1.2 Reference Golf Swing

In [None]:
# Reference video from GolfswingHD: https://www.youtube.com/watch?v=wIiLM8ufWVI
clip = VideoFileClip('./Reference_Swing_DTL.mp4')
clip.ipython_display(width=600)

# 2. Initializations

### <font style="color:rgb(50,120,230)">Create video capture object</font>

In [None]:
file_name  = 'Reference_Swing_DTL.mp4'
video_cap = cv2.VideoCapture(file_name)
if not video_cap.isOpened():
    print('Unable to open: ' + file_name)

### <font style="color:rgb(50,120,230)">Create video writer object</font>

In [None]:
fps = int(video_cap.get(cv2.CAP_PROP_FPS))
frame_w = int(video_cap.get(cv2.CAP_PROP_FRAME_WIDTH))
frame_h = int(video_cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
frame_size = (frame_w, frame_h)

video_out_file = file_name[:-4] + '_out_landmarks.mp4'
video_output = cv2.VideoWriter(video_out_file, cv2.VideoWriter_fourcc(*'mp4v'), fps, frame_size)

# 3. Extract Landmark Coordinates

### <font style="color:rgb(50,120,230)">Process image frames and extract landmark coordinates</font>

Documentation [**MediaPipe Pose**](https://google.github.io/mediapipe/solutions/pose.html) 

In [None]:
# BGR Colors
color_light  = (255, 255, 0)
color_marks  = (0, 255, 255)
color_join   = (0, 20, 200)

# In a more advanced application the initial position of 
# the golf ball would be automatically detected.
ball_pos_x = 650
ball_pos_y = 927

mp_pose = mp.solutions.pose

with mp_pose.Pose(min_detection_confidence=0.5, min_tracking_confidence=0.5) as pose:
   
    while True:
        
        has_frame, frame = video_cap.read()
        if not has_frame:
            break
            
        # Convert the BGR image to RGB.
        frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

        # Process the image frame.
        keypoints = pose.process(frame)
        landmarks = keypoints.pose_landmarks
        enum_pose  = mp_pose.PoseLandmark
        
        # Convert the image back to BGR.
        frame = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)
        
        if landmarks is not None:
        
            # Right ear.
            r_ear_x = int(landmarks.landmark[enum_pose.RIGHT_EAR].x * frame_w)
            r_ear_y = int(landmarks.landmark[enum_pose.RIGHT_EAR].y * frame_h)
            r_ear_p = np.array([r_ear_x, r_ear_y])

            # Right wrist.
            r_wrist_x = int(landmarks.landmark[enum_pose.RIGHT_WRIST].x * frame_w)
            r_wrist_y = int(landmarks.landmark[enum_pose.RIGHT_WRIST].y * frame_h)
            r_wrist_p = np.array([r_wrist_x, r_wrist_y])
            
            # Right hip.
            r_hip_x = int(landmarks.landmark[enum_pose.RIGHT_HIP].x * frame_w)
            r_hip_y = int(landmarks.landmark[enum_pose.RIGHT_HIP].y * frame_h)
            r_hip_p = np.array([r_hip_x, r_hip_y])

            # Join landmarks.
            cv2.line(frame, (r_hip_p[0], r_hip_p[1] ), (r_ear_p[0], r_ear_p[1]), color_join, 2, cv2.LINE_AA)
            cv2.line(frame, (r_hip_p[0], r_hip_p[1] ), (r_wrist_p[0], r_wrist_p[1]), color_join, 2, cv2.LINE_AA)
            cv2.line(frame, (r_ear_p[0], r_ear_p[1]), (r_wrist_p[0], r_wrist_p[1]), color_join, 2, cv2.LINE_AA)
            cv2.line(frame, (ball_pos_x, ball_pos_y), (r_wrist_p[0], r_wrist_p[1]), color_light, 2, cv2.LINE_AA)

            # Draw landmarks.
            cv2.circle(frame, (r_ear_p[0], r_ear_p[1]), 3, color_marks, -1)
            cv2.circle(frame, (r_wrist_p[0], r_wrist_p[1]), 3, color_marks, -1)
            cv2.circle(frame, (r_hip_p[0], r_hip_p[1] ), 3, color_marks, -1)
            cv2.circle(frame, (ball_pos_x, ball_pos_y), 3, color_marks, -1)

            video_output.write(frame)
            
video_cap.release()
video_output.release()

In [None]:
# Load output video.
clip = VideoFileClip(video_out_file)
clip.ipython_display(width=600)

# 4.  Defining Vectors Between Landmark Points
<br>
<center>
<img src = "https://opencv.org/wp-content/uploads/2021/10/c0-m17-02-Vectors.png" alt="vectors">
</center>
<br>

## 4.1 Function to Compute the Angle Between Two Vectors

The dot product of two vectors is defined as follows:

\begin{align}
a \cdot b = |a||b| \cos(\theta)
\end{align}

Rearranging this equation gives the angle between the two vectors:

\begin{align}
\theta = \arccos\left(\frac{a \cdot b}{|a||b|}\right)
\end{align}

In [None]:
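def compute_angle(v1, v2):
    """Return the angle in degrees between two non-zero vectors."""

    # Normalize both vectors to unit length.
    v1u = v1 / np.linalg.norm(v1)
    v2u = v2 / np.linalg.norm(v2)

    # Clip the dot product to [-1, 1]: round-off can push it slightly outside
    # the domain of arccos, which would produce NaN.
    cos_theta = np.clip(np.dot(v1u, v2u), -1.0, 1.0)

    # Compute the angle between the two unit vectors.
    angle_deg = np.arccos(cos_theta) * 180 / math.pi

    return angle_deg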
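
As a quick sanity check of the formula, the sketch below applies it to vectors whose angles are known in advance (perpendicular, 45 degrees apart, and identical). The helper name `angle_between_deg` is illustrative and not part of the notebook. It clips the cosine to [-1, 1] as a safeguard against floating-point round-off, and it assumes both vectors are non-zero.

```python
import numpy as np

def angle_between_deg(v1, v2):
    """Angle in degrees between two non-zero vectors (illustrative helper)."""
    v1 = np.asarray(v1, dtype=float)
    v2 = np.asarray(v2, dtype=float)
    cos_theta = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    # Round-off can push cos_theta slightly outside [-1, 1]; clip before arccos.
    return float(np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0))))

print(angle_between_deg([1, 0], [0, 1]))   # perpendicular vectors
print(angle_between_deg([1, 0], [1, 1]))   # horizontal vs. diagonal
print(angle_between_deg([3, 4], [3, 4]))   # identical vectors
```

The three results should come out close to 90, 45, and 0 degrees respectively, up to floating-point precision.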

# 5. Initializations

### <font style="color:rgb(50,120,230)">Create Video Capture Object</font>

In [None]:
run_modes = ['reference', 'student', 'student2']

mode = run_modes[0]

if mode == 'reference':
    file_name = 'Reference_Swing_DTL.mp4'
elif mode == 'student':
    file_name = 'Student_Swing_DTL.mp4'
elif mode == 'student2':    
    file_name = 'Student_Swing_DTL_wide_impact_2months_later.mp4'

video_cap = cv2.VideoCapture(file_name)

if not video_cap.isOpened():
    print('Unable to open: ' + file_name)

### <font style="color:rgb(50,120,230)">Create Video Writer Object</font>

In [None]:
fps = int(video_cap.get(cv2.CAP_PROP_FPS))
frame_w = int(video_cap.get(cv2.CAP_PROP_FRAME_WIDTH))
frame_h = int(video_cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
frame_size = (frame_w, frame_h)

video_out_file = file_name[:-4] + '_out_analysis.mp4'

video_output = cv2.VideoWriter(video_out_file, cv2.VideoWriter_fourcc(*'mp4v'), fps, frame_size)

# 6. Convenience Function to Extract Landmark Coordinates

In [None]:
def get_landmark_point(landmarks, landmark_point, w, h):
    x = int(landmarks.landmark[landmark_point].x * w)
    y = int(landmarks.landmark[landmark_point].y * h)
    point = np.array([x, y])
    return point
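
MediaPipe Pose reports each landmark's `x` and `y` normalized by the image width and height, along with a `visibility` score between 0 and 1. The sketch below illustrates the conversion to pixel coordinates on stand-in objects so that it runs without a video. The helper name `to_pixel_point`, the 0.5 visibility threshold, and the clamping to the frame bounds are assumptions made for illustration; they are not part of the notebook.

```python
from types import SimpleNamespace

def to_pixel_point(landmark, w, h, min_visibility=0.5):
    """Convert a normalized landmark to integer pixel coordinates.

    Returns None when the landmark's visibility is below min_visibility
    (the threshold is an illustrative choice, not a MediaPipe default).
    """
    if landmark.visibility < min_visibility:
        return None
    # Clamp so that drawing calls always receive coordinates inside the frame.
    x = min(max(int(landmark.x * w), 0), w - 1)
    y = min(max(int(landmark.y * h), 0), h - 1)
    return (x, y)

# Stand-ins for entries of `landmarks.landmark` (values are made up).
fake_hip = SimpleNamespace(x=0.5, y=0.25, visibility=0.9)
fake_hidden = SimpleNamespace(x=0.5, y=0.25, visibility=0.1)

print(to_pixel_point(fake_hip, 1920, 1080))     # (960, 270)
print(to_pixel_point(fake_hidden, 1920, 1080))  # None
```

Skipping low-visibility landmarks is one way to avoid drawing lines to points that the model is only guessing at, for example when a wrist is hidden behind the body.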

# 7. Golf Swing Analysis Implementation

In [None]:
color_light  = (255, 255, 0)
color_marks  = (0, 255, 255)
color_yellow = (0, 255, 255)
color_gray   = (127, 127, 127)
color_join   = (0, 20, 200)

first_frame = True

# In a more advanced application the initial position of 
# the golf ball would be automatically detected.
if mode == 'reference':
    # for: Reference_Swing_DTL.mp4
    ball_pos_x = 650
    ball_pos_y = 927
elif mode == 'student':
    # for: Student_Swing_DTL.mp4
    ball_pos_x = 640
    ball_pos_y = 944
elif mode == 'student2':
    # for: Student_Swing_DTL_wide_impact_2months_later.mp4
    ball_pos_x = 783
    ball_pos_y = 613
    
mp_pose = mp.solutions.pose
    
with mp_pose.Pose(min_detection_confidence=0.5, min_tracking_confidence=0.5) as pose:
        
    while True:

        has_frame, frame = video_cap.read()
        if not has_frame:
            break

        # Convert the BGR image to RGB.
        frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

        # Process the image.
        keypoints = pose.process(frame)

        # Convert the image back to BGR.
        frame = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)

        # Use the following assignments to abbreviate the notation further below.
        landmarks = keypoints.pose_landmarks
        enum_pose = mp_pose.PoseLandmark

        if landmarks is not None:
            # Acquire the landmark coordinates.
            r_ear_p = get_landmark_point(landmarks,   enum_pose.RIGHT_EAR, frame_w, frame_h)
            r_wrist_p = get_landmark_point(landmarks, enum_pose.RIGHT_WRIST, frame_w, frame_h)
            r_hip_p = get_landmark_point(landmarks,   enum_pose.RIGHT_HIP, frame_w, frame_h)

            # Compute angle between ear-hip-wrist.
            v_hip_ear = np.subtract(r_ear_p, r_hip_p)
            v_hip_wrist = np.subtract(r_wrist_p, r_hip_p)
            angle_ear_hip_wrist = compute_angle(v_hip_ear, v_hip_wrist)
            text_loc = (r_hip_p[0] - 25, r_hip_p[1] + 20)
            cv2.putText(frame, str(int(angle_ear_hip_wrist)), text_loc, cv2.FONT_HERSHEY_SIMPLEX, .5, 
                        color_light, 1, cv2.LINE_AA)

            # Compute angle between ear-hip-vertical.
            v_hip_ear = np.subtract(r_ear_p, r_hip_p)
            vert_p = np.array([r_hip_p[0], r_hip_p[1] - 1])
            v_hip_vert = np.subtract(vert_p, r_hip_p)
            angle_ear_hip_vert = compute_angle(v_hip_ear, v_hip_vert)
            text_loc = (r_hip_p[0] - 25, r_hip_p[1]  - 10)
            cv2.putText(frame, str(int(angle_ear_hip_vert)), text_loc, cv2.FONT_HERSHEY_SIMPLEX, .5, 
                        color_yellow, 1, cv2.LINE_AA)

            # Join landmarks.
            cv2.line(frame, (r_hip_p[0], r_hip_p[1] ), (r_ear_p[0], r_ear_p[1]), color_join, 2, cv2.LINE_AA)
            cv2.line(frame, (r_hip_p[0], r_hip_p[1] ), (r_wrist_p[0], r_wrist_p[1]), color_join, 2, cv2.LINE_AA)
            cv2.line(frame, (r_ear_p[0], r_ear_p[1]), (r_wrist_p[0], r_wrist_p[1]), color_join, 2, cv2.LINE_AA)
            cv2.line(frame, (ball_pos_x, ball_pos_y), (r_wrist_p[0], r_wrist_p[1]), color_light, 2, cv2.LINE_AA)

            # Draw landmarks.
            cv2.circle(frame, (r_ear_p[0], r_ear_p[1]), 3, color_marks, -1)
            cv2.circle(frame, (r_wrist_p[0], r_wrist_p[1]), 3, color_marks, -1)
            cv2.circle(frame, (r_hip_p[0], r_hip_p[1] ), 3, color_marks, -1)
            cv2.circle(frame, (ball_pos_x, ball_pos_y), 3, color_marks, -1)

            if first_frame:
                r_hip_x_0 = r_hip_p[0]
                r_ear_y_0 = r_ear_p[1]
                first_frame = False
                
            # Draw initial vertical and horizontal lines through right hip and right ear.
            cv2.line(frame, (r_hip_x_0, frame_h), (r_hip_x_0, 0), color_gray, 1, cv2.LINE_AA)
            cv2.line(frame, (0, r_ear_y_0), (frame_w, r_ear_y_0), color_gray, 1, cv2.LINE_AA)

            # Draw dynamic vertical and horizontal lines through right hip and right ear.
            cv2.line(frame, (r_hip_p[0], frame_h), (r_hip_p[0], 0), color_yellow, 1, cv2.LINE_AA)
            cv2.line(frame, (0, r_ear_p[1]), (frame_w, r_ear_p[1]), color_yellow, 1, cv2.LINE_AA)

            video_output.write(frame)

video_cap.release()
video_output.release()

In [None]:
# Load output video.
clip = VideoFileClip(video_out_file)
clip.ipython_display(width=500)
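
The second angle annotated in the loop above is measured between the hip-to-ear line and a vertical line through the hip. Because the reference direction is vertical, the same value can also be obtained directly with `atan2`, as the sketch below shows. The function name `spine_tilt_deg` and the sample coordinates are hypothetical; the only assumption carried over from the notebook is that points are pixel coordinates with y increasing downward.

```python
import math

def spine_tilt_deg(hip, ear):
    """Tilt of the hip-to-ear line away from vertical, in degrees.

    Points are (x, y) pixel coordinates with y increasing downward,
    as in OpenCV images.
    """
    dx = ear[0] - hip[0]
    dy = ear[1] - hip[1]
    # -dy is the upward component; abs(dx) makes the result independent
    # of which side the golfer leans toward.
    return math.degrees(math.atan2(abs(dx), -dy))

# Made-up pixel coordinates for illustration.
print(spine_tilt_deg((100, 200), (100, 100)))  # ear directly above hip
print(spine_tilt_deg((100, 200), (150, 150)))  # ear up and to the right
```

An ear directly above the hip gives a tilt of 0 degrees, and equal horizontal and vertical offsets give 45 degrees.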

# 8. Analysis of Results

The code above was executed twice: once for the reference swing video and once for the student swing video. In the sections below we compare the two at key points during the golf swing, starting at the address position.
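
The notebook does not show how the side-by-side comparison images were assembled. One possible approach, sketched below, is to pad the shorter of two annotated frames with black rows and stack them horizontally. The helper name `side_by_side` is hypothetical, and the frames here are dummy arrays standing in for one frame from each output video.

```python
import numpy as np

def side_by_side(left, right):
    """Place two BGR frames next to each other, padding the shorter one with black."""
    height = max(left.shape[0], right.shape[0])

    def pad_to_height(img):
        pad_rows = height - img.shape[0]
        # Pad only along the row axis, at the bottom of the image.
        return np.pad(img, ((0, pad_rows), (0, 0), (0, 0)), mode='constant')

    return np.hstack([pad_to_height(left), pad_to_height(right)])

# Dummy frames standing in for one annotated frame from each output video.
reference_frame = np.zeros((480, 270, 3), dtype=np.uint8)
student_frame = np.full((360, 640, 3), 255, dtype=np.uint8)

combined = side_by_side(reference_frame, student_frame)
print(combined.shape)  # (480, 910, 3)
```

Padding keeps the pixel scale of each frame unchanged. If the two videos were recorded at very different resolutions, resizing one frame before stacking may give a fairer visual comparison.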

## 8.1 Comparison at Address

The position of the student at "address" is very good. The angle of the spine and the relative positions of the hands, head, and hips are all correct. Also notice that there is only a slight bend at the knees.

<center>
<img src = "https://opencv.org/wp-content/uploads/2021/10/c0-m17-Golf-Swing-1-Address.png" alt="Golf-swing-1">
</center>

## 8.2 Comparison at the Top of the Backswing

There are several things to note about the student's form at the top of the backswing. The angle of the spine should be roughly perpendicular to the swing plane (identified in light blue). In this case, the student's spine is too vertical at the top of the backswing, which causes him to be out of position. Also, notice that the student's head has shifted vertically up during the backswing (the horizontal yellow line is now substantially higher than it was at address).

<center>
<img src = "https://opencv.org/wp-content/uploads/2021/10/c0-m17-Golf-Swing-2-Top.png" alt="Golf-swing2-top">
</center>
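
The head movement described above can also be expressed as a number. The sketch below is a minimal illustration, assuming the ear's pixel row at address (stored as `r_ear_y_0` in the analysis loop) and its row in the current frame (`r_ear_p[1]`) are available. The function name `head_shift_px` and the sample values are hypothetical, not measurements taken from the videos.

```python
def head_shift_px(ear_y_address, ear_y_now):
    """Vertical head movement in pixels relative to address.

    Image rows grow downward, so a positive result means the head has
    moved up in the frame and a negative result means it has dropped.
    """
    return ear_y_address - ear_y_now

# Illustrative pixel rows only, not measurements from the videos.
print(head_shift_px(300, 280))   # 20  (head higher than at address)
print(head_shift_px(300, 315))   # -15 (head lower than at address)
```

Dividing the result by the frame height would make the value comparable between videos recorded at different resolutions.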

## 8.3 Comparison During the Downswing

During the downswing, the club shaft should be parallel to the swing plane (light blue line). Notice that the student's club shaft is at a shallower angle than the swing plane. Notice also that the head of the student is still too high compared to the reference swing. Both of these differences could likely be corrected by maintaining a steeper angle of the spine throughout the downswing. 

<center>
<img src = "https://opencv.org/wp-content/uploads/2021/10/c0-m17-Golf-Swing-3-Down.png" alt="Golf-Swing-3-Down">
</center>

## 8.4 Comparison at Impact
To his credit, the student's position at impact is remarkably similar to the reference swing in spite of the differences earlier in the swing. The head is lower at impact, as it should be. The angle of the spine is now much closer to that of the reference swing, and the right hip has rotated toward the target line as it should. However, reaching this impact position requires significant timing and is hard to repeat consistently. Improving the form earlier in the swing will lead to more consistent form throughout the swing and, in turn, more consistent contact with the ball.

<center>
<img src = "https://opencv.org/wp-content/uploads/2021/10/c0-m17-Golf-Swing-4-Impact.png" alt="Golf-swing-4-impact">
</center>

# 9. Training Improvement (Two Months Later)

The following video shows the same student two months later. It's difficult to spot any differences until you see the annotated results.

In [None]:
clip = VideoFileClip('Student_Swing_DTL_wide_impact_2months_later.mp4')
clip.ipython_display(width=1000)

In [None]:
clip = VideoFileClip('./Student_Swing_DTL_wide_impact_2months_later_out_analysis.mp4')
clip.ipython_display(width=1000)

## 9.1 Comparison During the Downswing

The comparison below is for the same student two months later. First, looking at the video above, notice that the student's spine is still too vertical (26 deg) at the top of the backswing (at 0:09), but there has also been an improvement: his head has remained in the same position that it was at address (yellow vs. gray horizontal lines). Three other improvements are apparent during the downswing with the aid of the automated annotations. First, the tilt of the spine (now 30 deg vs. 23 deg) is in a much better position. Second, the head is lower than it was at address. Third, the club shaft is slightly steeper (and therefore closer to the ideal swing plane) than it was in the original swing, mainly due to the steeper angle of the spine at this point in the downswing.

<br>
<center>
<img src = "https://opencv.org/wp-content/uploads/2021/10/c0-m17-Golf-Swing-Improvement.png" alt="Golf-swing-improvement">
</center>
<br>

# Conclusion 

As you can see, the visual cues from dynamically connecting key landmark points throughout the swing give immediate, informative feedback to instructors and students and make it easy to assess progress. Video is commonly used in golf instruction, but the geometric relationships are typically drawn manually to identify key differences between a student's swing and a reference swing. Using computer vision techniques, we can automate the annotation of video frames in a way that makes instruction more efficient and effective.