
Classifying and tracking players of a soccer exhibition based on recorded videos with different points of view


TroddenSpade/Soccer-Players-Tracking


Soccer Players Tracking

In this project, we aim to reconstruct the details of a soccer game, from the positions of the players and referees to their movements, using three recorded videos that cover different parts of the field. The positions and movements of the individuals are then displayed on a top-view 2D soccer pitch. For this project, we used the "Soccer video and player position dataset" [1] from this website.

Input data

The "Soccer video and player position dataset" provides three synchronized videos, each showing the same soccer match from a different perspective.

[Figure: sample frames from the left, center, and right videos]

Extracting the background of the inputs

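We extract the background of each video by taking the per-pixel mean over all of its frames. Because players and referees occupy any given pixel for only a small fraction of the frames, they largely average out and the static scene remains.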

[Figure: extracted backgrounds of the left, center, and right videos]
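The per-pixel mean described above can be sketched as follows. This is a minimal illustration rather than the project's actual code: `mean_background` and `synthetic_frames` are hypothetical names, and the synthetic grayscale frames stand in for real video frames.

```python
import numpy as np


def mean_background(frames):
    """Average an iterable of equally sized frames pixel by pixel."""
    total = None
    count = 0
    for frame in frames:
        frame = np.asarray(frame, dtype=np.float64)
        if total is None:
            total = np.zeros_like(frame)
        total += frame  # running sum avoids holding every frame in memory
        count += 1
    if count == 0:
        raise ValueError("no frames given")
    return np.round(total / count).astype(np.uint8)


def synthetic_frames(n=50, h=20, w=60):
    """Hypothetical stand-in for a video: a grey pitch with a bright
    2-pixel-wide 'player' that moves one column per frame."""
    for i in range(n):
        frame = np.full((h, w), 100, dtype=np.uint8)
        frame[8:12, i:i + 2] = 250
        yield frame


background = mean_background(synthetic_frames())
# The moving object only nudges the mean slightly above the pitch value.
print(background.min(), background.max())
```

With a real recording, the frames could be read one by one (for example with `cv2.VideoCapture`) and passed to the same function. The running sum keeps memory use constant regardless of video length.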

Subtracting the background

In this step, the extracted backgrounds are used with a KNN background subtraction algorithm to detect moving individuals.

Objects to Patches

The detected objects are cropped into patches of variable size and saved in a folder named "img". Each patch is then assigned a label with the help of pigeon.annotate.

Patches

Classifying Individuals

After extracting the patches and assigning their labels, we classify the patches into three classes using two convolutional layers (each followed by max pooling), a flatten layer, and two fully connected layers.

[Figure: sample patches of the three classes: Blue Team, White Team, Referees]

| Layer (type) | Output Shape |
| --- | --- |
| conv2d (Conv2D) | (None, 26, 8, 64) |
| max_pooling2d (MaxPooling2D) | (None, 13, 4, 64) |
| conv2d_1 (Conv2D) | (None, 11, 2, 128) |
| max_pooling2d_1 (MaxPooling2D) | (None, 5, 1, 128) |
| flatten (Flatten) | (None, 640) |
| dense (Dense) | (None, 128) |
| dense_1 (Dense) | (None, 3) |
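The output shapes in the model summary are consistent with a 28x10 input patch, 3x3 convolutions without padding, and 2x2 max pooling. None of these hyperparameters is stated in the text, so they are assumptions inferred from the shapes; the short check below only reproduces the arithmetic.

```python
def conv_valid(shape, filters, kernel=3):
    """Output shape of an unpadded ('valid') convolution with stride 1."""
    h, w, _ = shape
    return (h - kernel + 1, w - kernel + 1, filters)


def max_pool(shape, pool=2):
    """Output shape of non-overlapping max pooling (floor division)."""
    h, w, c = shape
    return (h // pool, w // pool, c)


shape = (28, 10, 3)  # assumed input: height 28, width 10, 3 color channels
trace = []
for step in (lambda s: conv_valid(s, 64), max_pool,
             lambda s: conv_valid(s, 128), max_pool):
    shape = step(shape)
    trace.append(shape)

flat = shape[0] * shape[1] * shape[2]  # size after the flatten layer
print(trace, flat)
```

The number of input channels does not affect any of the listed output shapes, so a grayscale input would give the same table.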

Applying Masks

We apply two types of masks for different purposes on the input images.

Region of Interest (ROI)

This mask defines the region of interest in each image. It lets us exclude irrelevant regions such as the big monitor in the left video, the pitch-side hoardings, and the audience.

[Figure: ROI masks for the left, center, and right videos]

Top-view Coverage Area

We use this mask to define a unique top-view area for each input video. Applying it eliminates the areas that the input videos have in common, so that each part of the pitch is covered by exactly one video.

[Figure: top-view coverage masks for the left, center, and right videos]
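One way to use such coverage masks is to keep, for each video, only the detections that fall inside that video's own area, so that a player seen by two cameras is reported once. The sketch below assumes the detections and the masks share the same top-view coordinate frame; the 105x68 raster and the three vertical strips are hypothetical.

```python
import numpy as np


def keep_points_in_coverage(points_xy, coverage_mask):
    """Keep only the (x, y) points that land on a nonzero mask pixel."""
    points = np.asarray(points_xy, dtype=float).reshape(-1, 2)
    h, w = coverage_mask.shape
    cols = np.round(points[:, 0]).astype(int)
    rows = np.round(points[:, 1]).astype(int)
    inside = (cols >= 0) & (cols < w) & (rows >= 0) & (rows < h)
    keep = np.zeros(len(points), dtype=bool)
    keep[inside] = coverage_mask[rows[inside], cols[inside]] > 0
    return points[keep]


# Hypothetical top-view raster split into three disjoint vertical strips.
h, w = 68, 105
left_mask = np.zeros((h, w), dtype=np.uint8)
center_mask = np.zeros((h, w), dtype=np.uint8)
right_mask = np.zeros((h, w), dtype=np.uint8)
left_mask[:, 0:35] = 255
center_mask[:, 35:70] = 255
right_mask[:, 70:105] = 255

detections = [(10, 30), (40, 30), (34.4, 10)]
left = keep_points_in_coverage(detections, left_mask)
center = keep_points_in_coverage(detections, center_mask)
right = keep_points_in_coverage(detections, right_mask)
print(len(left), len(center), len(right))
```

Since the three masks do not overlap, every detection is kept by at most one view.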

Transformation

We transform the masked input videos and their detected objects into the top view, which yields three complementary top-view representations.

Final results

References

  1. S. A. Pettersen, D. Johansen, H. Johansen, V. Berg-Johansen, V. R. Gaddam, A. Mortensen, R. Langseth, C. Griwodz, H. K. Stensland, and P. Halvorsen, "Soccer video and player position dataset," Proceedings of ACM MMSys 2014, March 19.
