---
title: "Introduction"
bibliography: reference.bib
---

# Analyzing What Creates Horse Racing Winners at the Hong Kong Jockey Club (HKJC)
In this project, I will be exploring the factors that influence the performance of horses at the Hong Kong Jockey Club's Happy Valley and Sha Tin racing tracks.

## A Brief History
Horse racing has long been a tradition in Hong Kong. Hong Kong, one of China's two special administrative regions along with Macao, was a British colony from 1841 until its handover to China on July 1, 1997. Macao, Hong Kong's Portuguese-founded sister region (transferred to China in 1999 INSERT NATIONS ONLINE REF), is often called the "gambling capital of the world" (grossing nearly $5.3 billion USD in gaming revenue in 2022 INSERT STATISTA REF), but many people overlook Hong Kong's own gaming history and culture. Great Britain introduced horse racing to Hong Kong in the 1840s, and it quickly became a staple of entertainment, sport, and high-class social status within the colony.

In 1884, the Hong Kong Jockey Club (HKJC) was founded as an amateur body to promote horse racing within Hong Kong. After almost a century, HKJC evolved into a professional institution in 1971. Even after the handover of the region to China in 1997, HKJC remains the only government-granted monopoly for horse racing in Hong Kong. While the handover did cause a brief decline in racing and gaming activity due to poor economic conditions, horse racing at the HKJC has soared in the 21st century. In the 2022-2023 racing season alone, the club achieved a record turnover of HK$305.6 billion ($39.1 billion USD INSERT HKJC FINANCE REF) and donated a record HK$7.3 billion ($930 million USD) to charity, cementing itself as a top-ten charitable foundation in the world. HKJC's commitment to its community and to charitable action makes it one of the most uniquely spirited gaming institutions in the world.

## The Pari-Mutuel Betting Market
HKJC's horse racing betting market is largely foreign to the legalized sports betting sphere that has expanded across the United States since 2018. With US sportsbooks such as DraftKings, FanDuel, and BetMGM, users place bets on sporting events through the sportsbook's platform (website or mobile app) according to odds (also known as "lines") set by the sportsbook. These lines are fixed by the sportsbook, and if the user's bet wins, the user is paid according to those fixed odds. Conversely, if the user loses, the entire wager goes to the sportsbook. While this is one of the most common ways sportsbooks and casinos operate around the world, many horse racing promoters and gaming companies, including HKJC, use what is called **Pari-Mutuel Betting**.

Pari-mutuel betting, unlike fixed-odds wagering, is a system in which bettors wager against one another rather than against the bookmaker. As discussed in the widely cited 1988 paper "Anomalies: Parimutuel Betting Markets: Racetracks and Lotteries" by Richard Thaler and William T. Ziemba, leading experts in behavioral economics and finance [@thaler1988anomalies], a pari-mutuel system places all bets into a pool and then awards dividends based on how the total amount wagered is distributed; for a horse racing win bet, that means the amount wagered on each horse to win. The bookmaker, HKJC in our case, takes a portion of the pool (the "takeout") and then pays winning bettors in proportion to the amount each wagered on the winning outcome relative to the total wagered on that outcome. For this reason, pari-mutuel betting is often called "pool betting".
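
As a rough illustration of the pooling described above, the following Python sketch settles a single win pool. The function name `settle_win_pool`, the bet tuples, and the 20% takeout are assumptions chosen for illustration, not HKJC's actual settlement rules, and any rounding of payouts is ignored.

```python
from collections import defaultdict


def settle_win_pool(bets, winner, takeout=0.20):
    """Split a win pool among bettors who backed the winning horse.

    bets    -- list of (bettor, horse, stake) tuples (hypothetical format)
    winner  -- name of the winning horse
    takeout -- fraction of the pool kept by the operator (assumed value)
    Returns a dict mapping each winning bettor to a payout (stake included).
    """
    pool = sum(stake for _, _, stake in bets)
    net_pool = pool * (1 - takeout)  # what is left after the takeout
    on_winner = sum(stake for _, horse, stake in bets if horse == winner)

    payouts = defaultdict(float)
    if on_winner == 0:
        return dict(payouts)  # nobody backed the winner; real rules vary

    for bettor, horse, stake in bets:
        if horse == winner:
            # Each winner receives a share proportional to their stake.
            payouts[bettor] += net_pool * stake / on_winner
    return dict(payouts)


# Hypothetical three-bettor pool: A and B back H1, C backs H2.
example_bets = [("A", "H1", 60), ("B", "H1", 20), ("C", "H2", 120)]
example_payouts = settle_win_pool(example_bets, winner="H1")
print(example_payouts)
```

In this toy pool, bettors A and B share the net pool in a 3:1 ratio because that is the ratio of their stakes on the winner; bettor C's stake funds both the payouts and the takeout.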

Pari-mutuel payoffs are determined by three factors (INSERT PARIMUTEL REF):

1. The amount of money placed on the winning horse or combination of horses
2. The track takeout
3. Breakage

Horse racing bookmakers typically take between 16% and 20% of the pool as takeout (a rate set internally by the bookmaker). Breakage describes the practice of rounding payoffs down to the nearest dime, which is common in most jurisdictions where horse racing is legal. As a theoretical example of how odds, takeout, and breakage interact: if 40% of the win pool is wagered on one horse, the wagering public is effectively saying that the horse is a 3:2 proposition. Factor in the track takeout, and the potential payoff drops to 4:3 (which would display as 1:1, or even money, on your screen). Breakage could then take away up to nine cents for every $2 wagered.
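
The arithmetic behind this kind of example can be sketched in a few lines of Python. The function `win_payoff_cents`, the 17.5% default takeout, and the choice to apply breakage to the payoff on a $2 stake are assumptions for illustration; actual rates and rounding rules vary by jurisdiction, so the figures will not necessarily match those quoted above.

```python
def win_payoff_cents(pool_cents, winner_cents, takeout_bps=1750,
                     stake_cents=200, break_to_cents=10):
    """Payoff in cents (stake included) on a winning bet of `stake_cents`.

    pool_cents     -- total win pool, in cents
    winner_cents   -- amount wagered on the winning horse, in cents
    takeout_bps    -- takeout in basis points (1750 = 17.5%, an assumed rate)
    break_to_cents -- breakage step; payoffs are rounded DOWN to a multiple
    Integer arithmetic is used so that rounding is exact.
    """
    if winner_cents <= 0:
        raise ValueError("winner_cents must be positive")
    net_pool_scaled = pool_cents * (10_000 - takeout_bps)  # scaled by 10_000
    raw = net_pool_scaled * stake_cents // (10_000 * winner_cents)
    return raw - raw % break_to_cents  # breakage: round down to the step


pool = 10_000_000      # a hypothetical $100,000 win pool
on_horse = 4_000_000   # 40% of the pool is on the winner

print(win_payoff_cents(pool, on_horse, takeout_bps=0))     # 500 -> $5.00, i.e. 3:2
print(win_payoff_cents(pool, on_horse, takeout_bps=2000))  # 400 -> $4.00, even money
print(win_payoff_cents(pool, on_horse))                    # 410 -> $4.10 after breakage
```

With no takeout, the $2 stake returns $5.00, which is the 3:2 proposition implied by a 40% pool share. At the assumed 17.5% rate the unrounded payoff is $4.125, and breakage trims it to $4.10.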

Pari-mutuel pricing and odds can get complex quickly, especially when betting on multiple horse placements, whether in a specific finishing order or in any order. While this topic is certainly interesting, especially with respect to game theory, it is not the focus of this project. The outcomes of this project may lead to exploration of advanced betting systems in a pari-mutuel horse racing market, but this section is intended only to lay the groundwork for the project's rationale and to explain why the project concerns itself specifically with horse racing in Hong Kong.

## Project Rationale
As you can see, the pari-mutuel system in which HKJC operates is unique. Because HKJC takes a share of the betting pool for every race, it is guaranteed a profit on each race it hosts, no matter the outcome. This differs from fixed-odds bookmakers, who risk losing money on any given matchup (these bookmakers do build a margin, known as the "vigorish", into their lines to make a profit more likely, but that is outside the scope of this project). Because the odds are set by the bettors rather than by the house, Hong Kong horse racing can be a highly profitable sport to bet on. This is true for two main reasons:

1. Increased market inefficiencies due to the biases of the "uninformed" betting public
2. The robust existence of publicly-available past performance data


The first point reflects the fact that, because the odds for each race are essentially set by the betting public, the payouts can diverge from the true probability of a given race outcome. In a 2008 paper (originally published in 1994) titled "Computer Based Horse Race Handicapping and Wagering Systems: A Report" [@benter2008computer], Bill Benter (who is widely considered the most successful sports handicapper of all time) notes that while the public's probability estimates (the distribution of wagers in the betting pool) tend to come close to the true probabilities of horse racing outcomes, they carry a slight margin of error that creates slight market inefficiencies. These inefficiencies can be found and exploited with a computerized handicapping model that predicts horse racing outcomes. This is the core reason this project is being pursued. By establishing foundational analytical insights into horse racing, this project brings an accurate, advanced handicapping model one step closer to being realized.
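
One way to make the idea of a market inefficiency concrete is to compare a model's win probability with the probability implied by the pool. The sketch below is only an illustration under stated assumptions: `expected_return`, `find_overlays`, the 17.5% takeout, and the example probabilities are all hypothetical, and this is not Benter's model.

```python
def expected_return(model_prob, pool_share, takeout=0.175):
    """Expected payout per $1 staked on a win bet (stake included).

    model_prob -- the model's estimated win probability for the horse
    pool_share -- fraction of the win pool wagered on the horse, i.e. the
                  public's implied probability estimate
    takeout    -- fraction of the pool kept by the operator (assumed value)
    """
    if not 0.0 < pool_share <= 1.0:
        raise ValueError("pool_share must be in (0, 1]")
    dividend = (1.0 - takeout) / pool_share  # payout per $1 if the horse wins
    return model_prob * dividend


def find_overlays(model_probs, pool_shares, takeout=0.175):
    """Return horses whose expected return per $1 exceeds 1.0."""
    overlays = {}
    for horse, prob in model_probs.items():
        er = expected_return(prob, pool_shares[horse], takeout)
        if er > 1.0:
            overlays[horse] = er
    return overlays


# Hypothetical two-horse comparison: the model agrees with the public on A
# but rates B higher than the pool does.
model_probs = {"A": 0.40, "B": 0.13}
pool_shares = {"A": 0.40, "B": 0.10}
print(find_overlays(model_probs, pool_shares))
```

When the model and the public agree, the expected return is simply one minus the takeout, so a bet only has positive expectation when the model's edge over the pool is large enough to overcome the takeout. The sketch also ignores the fact that placing a bet changes the pool share itself.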

The second point highlights the fact that HKJC provides data on every race it has hosted at both of its tracks, Happy Valley and Sha Tin, since it began computerized tracking of race statistics. HKJC provides this data because it encourages more people to bet, which in turn increases the takeout from each pool. American sportsbooks directly profit from their users knowing as little as possible about the events they bet on, while the opposite is true for HKJC and its horse racing events. As a result, HKJC sits on a uniquely rich mountain of data that anyone can use. Public access to a robust data set with hundreds of features makes it far easier and more accessible not only to build an advanced betting model, but also to conduct general analytics on horse racing. This is another reason why I am choosing to look at Hong Kong horse racing over horse racing anywhere else in the world.

On top of these points, there is little publicly available literature on horse racing in Hong Kong. The literature that does exist is either outdated or consists of largely unprofessional posts on blog sites or GitHub. This gap suggests that research on horse racing in pari-mutuel systems in general is lacking, which lends further validity to this project.

Again, while this project will not directly develop an advanced betting system, it is intended to serve as part of the early foundation for one. The findings from this project may motivate exploration of more advanced machine learning techniques and predictive mathematics, and will help open a window of opportunity for a more advanced and focused exploration of horse racing and game theory.


## Project Outline and Direction
With all the above background information, this project intends to explore the following 10 questions:

1. What factors are most important in determining if a horse wins a race?
2. Does a difference between a horse's declared weight and its actual weight affect whether the horse wins?
3. Does draw (which gate the horse starts from) impact a horse's finishing position? If so, to what extent?
4. Do some horses perform better on turf or on dirt (all-weather) tracks?
5. Do certain breeds of horses perform better than others?
6. Do past race placements give any insight into future placements?
7. Do horses in their "rookie" seasons perform better or worse than the rest of the field?
8. How much do the closing odds determine where a horse finishes?
9. How much does horse "rating" come into play when trying to determine horse performance?
10. To what extent do past length-behind-winner measurements determine horse performance?

### References

::: {#refs}
:::