In [1]:
# import the necessary packages
import time
import cv2
import os
import numpy as np
import matplotlib.pyplot as plt

## Position of ball

We're given 15 frames of a batter hitting a baseball, and we have to programmatically find the position of the baseball in each frame. This can be done in a few steps:

1. Background Subtraction : Subtract the previous frame from the current frame to get the moving pixels in the current frame. In our case, the objects in motion are mainly the body of the hitter, the bat and the baseball.
2. Hough Circle : Once we have the areas of interest, we perform `HoughCircles` on these regions to look for circular objects (the baseball in our case).
3. Filter : In some cases, `HoughCircles` outputs false positives for the baseball at the bottom of the bat, which is also circular. These can be filtered out in a number of ways, for example by using greyscale intensities or positions. In our case, we use positions: we introduce a variable `leftmost`, which holds the x-coordinate of the leftmost circle accepted so far. Assuming the ball always travels to the left, any detection to the right of this value is ignored. This gives the correct location for the baseball.

Step 1 : Perform Background Subtraction

In [2]:
seg_imgs = []
firstFrame = None
for i in range(1,16):
    
    img_file = 'images/IMG' + str(i) + '.bmp'
    # read the image
    img = cv2.imread(img_file)
    # convert to greyscale
    gray_ = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Remove the noise
    gray = cv2.GaussianBlur(gray_, (15, 15), 0)

    # if the first frame is None, initialize it
    if firstFrame is None:
        firstFrame = gray
        continue
        
    # compute the absolute difference between the current frame and the previous frame
    frameDelta = cv2.absdiff(firstFrame, gray)
    
    # Create a binary representation
    thresh = cv2.threshold(frameDelta, 25, 255, cv2.THRESH_BINARY)[1]
    
    # dilate the thresholded image to fill in holes
    thresh = cv2.dilate(thresh, None, iterations=2)

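    # Mask the unblurred grayscale frame with the motion mask
    # (uint8 arithmetic: 255*g wraps modulo 256, so masked pixels are not the raw gray levels)
    seg = thresh*gray_

    # The current frame becomes the previous frame for the next iteration
    firstFrame = gray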
    
    seg_imgs.append(seg)
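One detail of the masking step is worth checking on a toy array. `thresh` holds 0 or 255 and both arrays are `uint8`, so `thresh*gray_` wraps modulo 256 instead of preserving the gray level. The sketch below uses made-up pixel values and compares the product with `np.where`, which is one possible way to keep the original intensities. It is not what this notebook uses, and switching to it would likely require retuning the `HoughCircles` parameters.

```python
import numpy as np

# Toy 1x3 "frame" and motion mask (values are made up for illustration)
gray_ = np.array([[100, 200, 50]], dtype=np.uint8)
thresh = np.array([[0, 255, 255]], dtype=np.uint8)

# uint8 product wraps modulo 256: 255*g becomes (256 - g) for g > 0
seg_product = thresh * gray_

# Hypothetical alternative: keep the original gray level where the mask is set
seg_where = np.where(thresh > 0, gray_, 0).astype(np.uint8)

print(seg_product.tolist())  # [[0, 56, 206]]
print(seg_where.tolist())    # [[0, 200, 50]]
```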


Steps 2 & 3 : Apply the Hough transform (`HoughCircles`) to the subtracted images to find circular objects (the baseball in our case), then filter the detections by position

In [3]:
center = []
leftmost = 1280
for j, seg1 in enumerate(seg_imgs):
    bgr2gray = seg1.copy()
    
    # define maxradius depending on the frame as ball gets bigger in radius in later frames
    if j >6:
        maxRadius = 23
    else:
        maxRadius = 19
        
    #apply houghcircles
    circle = cv2.HoughCircles(bgr2gray, cv2.HOUGH_GRADIENT, 1, 30, param1 = 100, param2 = 14, minRadius = 13, maxRadius = maxRadius)
    circles = np.uint(circle)
    
    # Loop through the circles
    for i in circles[0,:]:
        
        # only consider points to the left of the center of baseball in previous frame
        if i[0]<leftmost-10:
            # draw the outer circle
            cv2.circle(bgr2gray,(i[0],i[1]),i[2],(0,255,0),1)
            # draw the center of the circle
            cv2.circle(bgr2gray,(i[0],i[1]),2,(0,0,255),3)
            center.append([i[0],i[1]])
            print('position of ball in {} image : {}'.format(j+2, [i[0],i[1]]))

    # update the leftmost point after every frame
    leftmost = np.sort(np.array(center), axis=0)[0][0]


position of ball in 2 image : [551, 802]
position of ball in 3 image : [538, 803]
position of ball in 4 image : [510, 799]
position of ball in 5 image : [481, 792]
position of ball in 6 image : [450, 784]
position of ball in 7 image : [417, 778]
position of ball in 8 image : [379, 769]
position of ball in 9 image : [341, 761]
position of ball in 10 image : [296, 747]
position of ball in 11 image : [253, 742]
position of ball in 12 image : [202, 732]
position of ball in 13 image : [149, 723]
position of ball in 14 image : [93, 707]
position of ball in 15 image : [30, 695]
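The position filter from step 3 can be isolated as a small pure function, which makes the rule easier to see and to test. This is a minimal sketch: the name `filter_leftward`, the `margin` parameter and the sample detections are illustrative assumptions, not part of the notebook. Only the rule `x < leftmost - 10` is taken from the cell above.

```python
def filter_leftward(circles, leftmost, margin=10):
    """Keep circles whose x-center lies more than `margin` px left of `leftmost`."""
    return [(x, y, r) for (x, y, r) in circles if x < leftmost - margin]

# Hypothetical detections for one frame: the ball plus a false positive
# to the right of the previous ball position (e.g. the bottom of the bat).
detections = [(538, 803, 15), (760, 640, 14)]
previous_x = 551  # x of the ball in the previous frame

kept = filter_leftward(detections, previous_x)
print(kept)  # [(538, 803, 15)]
```

Because the ball is assumed to keep moving left, the smallest x accepted so far is also the x of the most recent detection, which is why a single `leftmost` value is enough.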


Write the results to the `results/` folder. Each image shows the detected center of the baseball.

In [4]:
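# Write the results
# make sure the output folder exists (cv2.imwrite does not create it)
os.makedirs('results', exist_ok=True)
for j in range(2,16):
    # read from the same folder as in the background subtraction step
    img_file = 'images/IMG' + str(j) + '.bmp'
    img = cv2.imread(img_file)
    # center[0] corresponds to frame 2; cast to int for cv2.circle
    x, y = int(center[j-2][0]), int(center[j-2][1])
    cv2.circle(img,(x,y),2,(0,0,255),3)
    cv2.imwrite('results/IMG' + str(j) + 'result.jpg', img)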

## Velocity of ball

Calculate the velocity by dividing the Euclidean distance between the ball positions in consecutive frames by the time elapsed between frames.

The Euclidean distance is calculated using

<center>$d(p,q) = \sqrt{(p_1-q_1)^2 + (p_2-q_2)^2 + \dots + (p_n-q_n)^2}$</center>
<center>where p and q are the n-dimensional positions of the same object in consecutive frames</center>

<br>
For the time elapsed, we take the reciprocal of `fps`, which is 240 in our case

<center>$t = (1/fps)$</center>

<br>
Finally, velocity is calculated

<center>$vel = d(p,q)/t$</center>

In [5]:
# Get the velocity 
velocity = []
# convert fps to seconds per frame
delta_t = (1/240)
for i in range(1,len(center)):

    # Calculate Euclidean distance between current and previous frame
    # (cast to Python int so the unsigned coordinates cannot wrap on subtraction)
    dist = np.sqrt((int(center[i][0])- int(center[i-1][0]))**2 + 
                   (int(center[i][1])- int(center[i-1][1]))**2)

    # Convert pixels to mm (0.0048 mm per pixel)
    pix2mm = dist*0.0048    
    
    # Calculate velocity
    vel = np.round((pix2mm/delta_t),3)
    
    velocity.append(vel)
    print('velocity of ball in {} image : {} mm/s'.format(i+2, vel))
    
    

velocity of ball in 3 image : 15.02 mm/s
velocity of ball in 4 image : 32.583 mm/s
velocity of ball in 5 image : 34.367 mm/s
velocity of ball in 6 image : 36.882 mm/s
velocity of ball in 7 image : 38.639 mm/s
velocity of ball in 8 image : 44.987 mm/s
velocity of ball in 9 image : 44.736 mm/s
velocity of ball in 10 image : 54.291 mm/s
velocity of ball in 11 image : 49.87 mm/s
velocity of ball in 12 image : 59.871 mm/s
velocity of ball in 13 image : 61.93 mm/s
velocity of ball in 14 image : 67.093 mm/s
velocity of ball in 15 image : 73.881 mm/s
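As a sanity check on the numbers above, the formula can be applied directly to the first two detections. This is a minimal sketch: the function name `speed_mm_per_s` and the constant names are illustrative, while the values 240 fps and 0.0048 mm per pixel are the ones used in the velocity cell.

```python
import math

FPS = 240              # frame rate stated in the notebook
MM_PER_PIXEL = 0.0048  # scale factor used in the notebook's velocity cell

def speed_mm_per_s(p, q, fps=FPS, mm_per_pixel=MM_PER_PIXEL):
    """Speed between two (x, y) pixel positions that are one frame apart."""
    dist_px = math.hypot(q[0] - p[0], q[1] - p[1])
    # dividing by t = 1/fps is the same as multiplying by fps
    return dist_px * mm_per_pixel * fps

# Ball positions reported for frames 2 and 3
v = speed_mm_per_s((551, 802), (538, 803))
print(round(v, 3))  # 15.02
```

The displacement is (-13, 1) pixels, so d = sqrt(170) ≈ 13.04 px, which matches the value printed for image 3.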


## Alternatives

### Lucas-Kanade Method
 
 I tried to use sparse optical flow to track the movement of the baseball. I performed the following steps:
 
 1. Used `HoughCircles` on the first frame to get a few circular objects (the baseball being one of them).
 2. For the next frame, used `cv2.calcOpticalFlowPyrLK` to get the optical flow of the selected circular objects and kept only the ones with new points (which gives the baseball location).
 3. Repeated step 2 for all subsequent frames.
 
Problem : After frame 9, it loses track of the baseball and gives wrong predictions.<br>
Reason : After frame 9, the apparent motion gets larger from the camera's perspective, i.e., the distance the baseball covers between consecutive frames increases.<br>
Solutions tried : Using image pyramids with Lucas-Kanade and a larger window size.

I was still unable to improve the predictions with these changes, so I gave up on this approach.

In [6]:
# Print the Final values

for i, c in enumerate(center):
    
    print('{} frame'.format(i+2))
    print('position of ball: {}'.format(c))
    if i>0:
        print('velocity of ball : {} mm/s'.format(velocity[i-1]))
    print('\n')

2 frame
position of ball: [551, 802]


3 frame
position of ball: [538, 803]
velocity of ball : 15.02 mm/s


4 frame
position of ball: [510, 799]
velocity of ball : 32.583 mm/s


5 frame
position of ball: [481, 792]
velocity of ball : 34.367 mm/s


6 frame
position of ball: [450, 784]
velocity of ball : 36.882 mm/s


7 frame
position of ball: [417, 778]
velocity of ball : 38.639 mm/s


8 frame
position of ball: [379, 769]
velocity of ball : 44.987 mm/s


9 frame
position of ball: [341, 761]
velocity of ball : 44.736 mm/s


10 frame
position of ball: [296, 747]
velocity of ball : 54.291 mm/s


11 frame
position of ball: [253, 742]
velocity of ball : 49.87 mm/s


12 frame
position of ball: [202, 732]
velocity of ball : 59.871 mm/s


13 frame
position of ball: [149, 723]
velocity of ball : 61.93 mm/s


14 frame
position of ball: [93, 707]
velocity of ball : 67.093 mm/s


15 frame
position of ball: [30, 695]
velocity of ball : 73.881 mm/s


