Models trained at TPD to classify sequences of lightweight tracker observations as either a Jump or Not a Jump

TotalPerformanceData/horse-jumps

horse-jumps

TPD Logo

http://www.totalperformancedata.com/

Models trained at TPD to classify sequences of lightweight tracker observations as either a Jump or Not a Jump. Typically, Speed and Stride Length near obstacles form the following time series, with horses slowing down and shortening their strides:

Speed near Obstacles

Stride Length near Obstacles

A Change in Bearing is also used as an input: it is rare for a jump to appear on a bend, but the stride and speed patterns approaching bends are often similar enough to those of jumps to cause confusion. Two models are provided: a Live model using the previous 8.5 seconds of data points, and a Retrospective model which also uses an additional 4s of data after the examined timestamp. Live model applications include live trading bet management and powering race animations.

Models are trained using Keras with a TensorFlow backend.

The validation score of the Live model was 98.07%. There were 17 (1.44%) false positives in the validation data, where a positive result is defined as a prediction greater than 0.75. Predictions are made per horse, so by combining nearby runners the probability of 2 or more runners supplying a false positive in the same region is remote, unless course topology or pack movement dynamics are to blame for the confusion.
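The idea of combining nearby runners could be sketched as below. This is only an illustration, not code from the repository: the function names and the `min_runners` parameter are hypothetical, and how runners are grouped into a "region" is left to the caller. Only the 0.75 threshold is taken from the text above.

```python
import numpy as np

THRESHOLD = 0.75  # positive cut-off quoted in the text above


def runners_agreeing(scores, threshold=THRESHOLD):
    """Count how many runners' predictions exceed the threshold."""
    scores = np.asarray(scores, dtype=float)
    return int(np.sum(scores > threshold))


def region_is_jump(scores, min_runners=2, threshold=THRESHOLD):
    """Flag a region as a jump only if at least min_runners agree.

    min_runners is a hypothetical parameter chosen for illustration.
    """
    return runners_agreeing(scores, threshold) >= min_runners
```

With scores of 0.9, 0.2 and 0.8 for three nearby runners, two exceed the threshold, so the region would be flagged. A single runner scoring 0.9 on its own would not be.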

The data set was generated from fences at known locations, professionally surveyed at a handful of racecourses, and GPS data points from the TPD Points Feed at 2Hz. Testing on Live recordings of the data (at 1Hz, interpolated to 2Hz to match the model input requirement) showed comparable accuracy.

Predictions on my machine take around 7ms + N*0.07ms for N predictions passed to the function. Entry to the predict() function seems to be the main restriction on speed. I expect it would be faster if converted to another language and/or compiled, though many of the methods appear to be platform specific and require a bunch of tinkering. I welcome feedback on that front.
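As a rough illustration of what that timing implies, the sketch below plugs the quoted figures into a simple cost model. The function name is hypothetical and the numbers are only those measured on the author's machine, so treat the result as indicative.

```python
def predict_time_ms(n, overhead_ms=7.0, per_item_ms=0.07):
    """Estimated time for one predict() call on n observations.

    Uses the 7ms + N*0.07ms figure quoted above.
    """
    return overhead_ms + n * per_item_ms


# 100 observations passed one at a time versus in a single batch
one_at_a_time = 100 * predict_time_ms(1)
single_batch = predict_time_ms(100)
```

Under this model, 100 separate calls cost roughly 707ms while one batched call costs roughly 14ms, so batching observations from all runners into a single call is worthwhile where the application allows it.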

Input data for the model is supplied in the file 'fenceModelData.pickle', as improvements on the model would be welcome. The dataset (13607 observations) is pickled and compressed with bz2. A simple function, 'importTrainData', to import the data to a workspace is included in JumpsModel.py. It returns a dictionary with keys:

- 'labels' : binary choice, 1 if there is a jump at timestep row index 16, 0 otherwise
- 'Unaltered' : time series data without scaling
- 'LMDcus24' : customScaler-processed time series data for the Retrospective model, which includes 4s of future points from row 16
- 'LMDcus16' : as above, for the Live model with no future points
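For readers who want to open the file without going through JumpsModel.py, a minimal sketch is shown below. It is not the repository's 'importTrainData' function, and it assumes the file is a single pickle stream compressed with bz2; `load_bz2_pickle` is a hypothetical name. As with any pickle, only load files from a source you trust.

```python
import bz2
import pickle


def load_bz2_pickle(path):
    """Read a bz2-compressed pickle and return the unpickled object.

    Assumes the file holds one pickle stream written through bz2.
    """
    with bz2.open(path, "rb") as fh:
        return pickle.load(fh)
```

If that assumption about the file layout holds, `load_bz2_pickle('fenceModelData.pickle')` would return the dictionary described above, and `data['labels']` would hold the binary jump labels.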

Heatmaps can be produced by averaging the scores of nearby Lat-Lon points. The output for a couple of races over fences is shown below. The heatmaps are more red where the average scores are more confident of close proximity to an obstacle, and more blue otherwise.
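One simple way to average nearby points is to snap them to a regular grid and take the mean score per cell, as sketched below. This is an assumption about what "nearby" means and may differ from the method used in JumpsModel.py; `grid_average` and `cell_deg` are hypothetical names. A cell of 1e-4 degrees is roughly 11m in latitude.

```python
import numpy as np


def grid_average(lat, lon, score, cell_deg=1e-4):
    """Average the scores of points that fall in the same lat-lon grid cell.

    Returns cell-centre latitudes, cell-centre longitudes and mean scores.
    """
    lat = np.asarray(lat, dtype=float)
    lon = np.asarray(lon, dtype=float)
    score = np.asarray(score, dtype=float)

    # Integer grid index of each point
    cells = np.stack(
        [np.floor(lat / cell_deg), np.floor(lon / cell_deg)], axis=1
    ).astype(np.int64)

    # Group identical cells; ravel() guards against shape differences
    # in the inverse array between numpy versions
    uniq, inverse = np.unique(cells, axis=0, return_inverse=True)
    inverse = inverse.ravel()

    sums = np.bincount(inverse, weights=score, minlength=len(uniq))
    counts = np.bincount(inverse, minlength=len(uniq))
    means = sums / counts

    centres = (uniq + 0.5) * cell_deg
    return centres[:, 0], centres[:, 1], means
```

The returned cell centres and mean scores can then be passed to a scatter plot with a blue-to-red colour map to produce a picture similar in spirit to the heatmaps shown here.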

Retrospective model applied to a race at Worcester

Live model applied to a race at Worcester

Retrospective model applied to a race at Uttoxeter

Live model applied to a race at Uttoxeter

The performance of the model on Fences is very good.

Fences are a little bigger than Hurdles, so the characteristics of the observation attributes when approaching an obstacle are generally clearer and easier to predict. Also, the training dataset only contains observations from Fence races: surveyed data for Hurdles is currently very difficult to source accurately because they are moved around the course from meeting to meeting. It is therefore likely that there will be occasions on which the horses are sufficiently fluent over the obstacle that the model calls it Flat. That being said, the performance of the model on Hurdle races is still usable for some applications, as indicated by the heatmap below.

Retrospective model applied to a Hurdle race at Southwell

Live model applied to a Hurdle race at Southwell

As a side note, I've noticed that the better class (3+) horses often produce more confusion over Hurdles than lower class horses, which could be something to consider.

The folder 'model' contains everything required to run the two trained models against 4 races from the TPD Points Feed sampled at 2Hz. As long as tensorflow/keras is installed the heatmaps can be generated by running JumpsModel.py as main.

Also included is a simple function to interpolate timestamps to a different interval (default 2Hz), which uses the pandas interpolate function.
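A minimal sketch of that kind of interpolation is shown below. It is not the function shipped in the repository: `interpolate_to_interval` is a hypothetical name, and the sketch assumes the observations are held in a pandas DataFrame of numeric columns indexed by timestamp.

```python
import pandas as pd


def interpolate_to_interval(df, interval="500ms"):
    """Resample a timestamp-indexed frame to a fixed interval.

    The default of 500ms corresponds to 2Hz. Values between the
    original samples are filled by time-weighted linear interpolation.
    """
    target = pd.date_range(df.index.min(), df.index.max(), freq=interval)
    # Keep the original timestamps while interpolating so that no
    # measured sample is discarded before the gaps are filled
    combined = df.reindex(df.index.union(target))
    filled = combined.interpolate(method="time")
    return filled.loc[target]
```

For example, a 1Hz recording with a hypothetical `speed` column of 10, 12 and 16 at consecutive seconds would come back as five rows at half-second spacing, with the midpoints filled in linearly.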
