# Game Win Probability

This notebook explores the probability of winning a tennis game as a function of the probability of winning individual points on serve.

**Key Questions:**
- How does point-winning probability translate to game-winning probability?
- How does the "No-Ad" scoring rule affect game outcomes?
- How do these probabilities change at different score states?

#### Reference
This notebook was used to generate the results presented in the blog post: https://medium.com/p/fd6516ff9c20/edit

## Setup and Imports

In [None]:
from matplotlib        import pyplot as plt
from scipy.interpolate import interp1d
import numpy as np
import os, sys

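# Add the path to the 'tennis_lab' package if it is not already in PYTHONPATH.
# Notebooks do not define __file__, so the project root is taken to be the
# parent of the current working directory.
PROJECT_ROOT = os.path.abspath(os.path.join(os.getcwd(), '..'))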
SRC_DIR = os.path.join(PROJECT_ROOT, 'src')
if SRC_DIR not in sys.path:
    sys.path.append(SRC_DIR)

from tennis_lab.core.game_score        import GameScore
from tennis_lab.core.match_format      import MatchFormat
from tennis_lab.paths.game_path        import GamePath
from tennis_lab.paths.game_probability import probabilityServerWinsGame, loadCachedFunction

## All Possible Score Paths in a Game

A tennis game can unfold in many different ways. The `GamePath` class generates all possible score progressions from a given starting score.

With standard scoring (advantage rule), there are **50 distinct paths** from 0-0 to a game conclusion. Paths that reach deuce (3-3) are "cut off" at that point since deuce can theoretically repeat indefinitely.

Each path is a sequence of `(server_points, receiver_points)` tuples.
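
To make the count of 50 concrete, here is a minimal standalone sketch of the enumeration. It does not use `GamePath`; the function name `generate_paths` is hypothetical and only illustrates the idea of cutting paths off at deuce.

```python
def generate_paths(score=(0, 0)):
    """Yield every score path from `score` until the game ends or reaches deuce (3-3).

    Hypothetical helper for illustration; not part of the tennis_lab package.
    """
    server, receiver = score
    # Before deuce, a player who reaches 4 points has necessarily won the game.
    if server == 4 or receiver == 4 or (server, receiver) == (3, 3):
        yield [score]
        return
    # Otherwise the next point goes either to the server or to the receiver.
    for next_score in ((server + 1, receiver), (server, receiver + 1)):
        for tail in generate_paths(next_score):
            yield [score] + tail


paths = list(generate_paths())
print(len(paths))  # 50
```

The total splits into 15 paths won by the server, 15 won by the receiver, and 20 that end at deuce.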

In [None]:
# Generate all possible score paths for a game.
# Start at 0-0 and use standard scoring rules.
MF = MatchFormat(noAdRule=False)
SCORE_INIT = GameScore(0, 0, MF)

paths = GamePath.generateAllPaths(SCORE_INIT)

# display all possible score scenarios
for n, path in enumerate(paths):
    print(f"{n+1:>2}: ", path)

## Game Win Probability vs. Point Win Probability

The relationship between **point-winning probability** and **game-winning probability** in tennis is inherently non-linear. Even a modest advantage at the point level is **amplified over the course of a game** by the structure of the scoring system.

For example, a server who wins **60% of points on serve** will win approximately **74% of their service games**. This amplification effect is a fundamental feature of tennis scoring and explains why small differences in point-level performance can translate into large differences in match outcomes.

The `probabilityServerWinsGame` function computes the probability that the server wins a game for any given starting score, assuming a fixed probability of winning an individual point on serve.

### How the Function Works

Internally, the calculation is based on **enumerating possible score progressions** from the current game score to the end of the game:

1. **Generate all possible score paths** starting from the given initial score and ending with the server winning the game.
2. **Compute the probability of each path**, based on the probability $p$ that the server wins a single point.
3. **Sum the probabilities** of all winning paths to obtain the overall game-winning probability.

In practice, we do **not explicitly generate all possible paths**. Under standard advantage scoring, the number of possible paths is infinite because the game can return to *deuce* arbitrarily many times.

To handle this, path generation stops once the score reaches **deuce**. From deuce onward, the probability that the server eventually wins the game is computed using a **closed-form expression** derived from the geometric series of repeated deuce outcomes:

$$
\frac{p^2}{1 - 2p(1 - p)}
$$

where $p$ is the probability that the server wins a point.
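
For completeness, a short derivation of this expression, assuming points are independent and $p$ is constant. Let $D$ be the probability that the server wins from deuce. Over the next two points, the server either wins both (probability $p^2$), loses both, or splits them (probability $2p(1-p)$) and returns to deuce:

$$
D = p^2 + 2p(1 - p)\,D
\quad\Longrightarrow\quad
D = \frac{p^2}{1 - 2p(1 - p)}
$$

Equivalently, summing over the number $k$ of returns to deuce gives the geometric series

$$
D = p^2 \sum_{k=0}^{\infty} \bigl[2p(1 - p)\bigr]^k = \frac{p^2}{1 - 2p(1 - p)}
$$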

See the blog post for more details: https://medium.com/p/fd6516ff9c20/edit
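
As a rough standalone check of the approach described above, the sketch below computes the same quantity without the `tennis_lab` package. It is an illustration, not the package's implementation: instead of listing paths explicitly it recurses over score states, which sums the same path probabilities, and it switches to the closed-form expression at deuce. The function names are hypothetical.

```python
def prob_from_deuce(p):
    """Closed-form probability that the server wins from deuce."""
    return p * p / (1 - 2 * p * (1 - p))


def prob_server_wins_game(p, server=0, receiver=0):
    """Probability that the server wins the game from the score (server, receiver).

    Hypothetical helper; assumes independent points with a fixed
    point-winning probability p and standard advantage scoring.
    """
    if (server, receiver) == (3, 3):
        return prob_from_deuce(p)
    if server == 4:
        return 1.0
    if receiver == 4:
        return 0.0
    # The next point goes to the server with probability p, else to the receiver.
    return (p * prob_server_wins_game(p, server + 1, receiver)
            + (1 - p) * prob_server_wins_game(p, server, receiver + 1))


print(round(prob_server_wins_game(0.6), 4))  # 0.7357
```

This reproduces the figure quoted earlier: a server who wins 60% of points wins roughly 74% of service games.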

In [None]:
# Calculate the probability that the player serving wins the game, 
# as a function of the probability of winning the point when serving.
PLAYER_SERVING = 1

# This probability can be calculated for any current score. 
# Change the score below to see how this probability changes.
INIT_SCORE = GameScore(pointsP1=0, pointsP2=0)

Ps = np.linspace(0, 1, 100)
Ys = [probabilityServerWinsGame(INIT_SCORE, PLAYER_SERVING, p) for p in Ps]

plt.title ("Probability of Winning the Game when Serving")
plt.xlabel("probability of winning the point when serving")
plt.ylabel("probability of winning the game when serving")
plt.xticks(np.arange(0, 1.01, 0.1))
plt.yticks(np.arange(0, 1.01, 0.1))
plt.grid(linewidth=0.1, color='blue')
plt.plot(Ps, Ys)
plt.show()

## Standard vs. No-Ad Scoring

The **No-Ad rule** (used in doubles and in junior tournaments) eliminates the advantage phase at deuce. Instead, at deuce (40–40), the **next point decides the game**, with the receiver choosing which side to receive.

This rule changes game-winning probabilities in systematic ways:

- **Weaker servers benefit**: for a server who wins fewer than half of their service points, No-Ad scoring increases the chance of winning a service game.
- **Stronger servers are disadvantaged**: a server who wins more than half of their service points loses the opportunity to leverage that advantage over multiple deuce points.

The plot on the right shows the difference between No-Ad and standard scoring (No-Ad minus standard). **Positive values indicate that No-Ad scoring favors the server**, while negative values indicate it favors the returner.
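
The sign of this difference can be checked with a small closed-form sketch, independent of `tennis_lab`. It assumes independent points with a fixed probability `p`; the function name `game_win_prob` is hypothetical. The two formats differ only in what happens once the score reaches deuce.

```python
def game_win_prob(p, no_ad=False):
    """Probability that the server wins a game from 0-0 (illustrative sketch)."""
    q = 1 - p
    # Server wins to 0, 15 or 30: 1, 4 and 10 paths respectively.
    win_before_deuce = p**4 * (1 + 4 * q + 10 * q**2)
    # 20 paths lead to deuce (3-3).
    reach_deuce = 20 * p**3 * q**3
    # No-Ad: a single deciding point. Standard: closed form from deuce.
    win_from_deuce = p if no_ad else p**2 / (p**2 + q**2)
    return win_before_deuce + reach_deuce * win_from_deuce


for p in (0.4, 0.5, 0.6):
    diff = game_win_prob(p, no_ad=True) - game_win_prob(p)
    print(f"p={p:.1f}  No-Ad minus standard = {diff:+.4f}")
```

The difference is positive at `p = 0.4`, zero at `p = 0.5`, and negative at `p = 0.6`, with a magnitude of about 2.6 percentage points at both 0.4 and 0.6.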

In [None]:
# Compare standard versus "NO-AD" scoring

PLAYER_SERVING = 1

# Change the score below to see how probabilities
# change, depending on where we are in the game.
P1_POINTS = 0
P2_POINTS = 0

initScoreRglr = GameScore(P1_POINTS, P2_POINTS, matchFormat=MatchFormat(noAdRule=False))
initScoreNoAd = GameScore(P1_POINTS, P2_POINTS, matchFormat=MatchFormat(noAdRule=True))

Ps   = np.linspace(0, 1, 100)
YADs = [probabilityServerWinsGame(initScoreRglr, PLAYER_SERVING, p) for p in Ps]   # regular scoring
YNOs = [probabilityServerWinsGame(initScoreNoAd, PLAYER_SERVING, p) for p in Ps]   # no-ad rule
Ds   = [yNo - yAd for yNo, yAd in zip(YNOs, YADs)]  # change in prob (no-ad minus regular)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))

ax1.set_xlabel("probability of winning the point when serving")
ax1.set_ylabel("probability of winning the game when serving")
ax1.set_title ("Probability of Winning the Game when Serving")
ax1.grid(linewidth=0.2)
ax1.plot(Ps, YADs, label="regular score")
ax1.plot(Ps, YNOs, label="no-ad score")
ax1.legend()

ax2.set_xlabel("probability of winning the point when serving")
ax2.set_ylabel("change in probability of winning the game when serving")
ax2.set_title ("Change in Probability of Winning the Game when Serving")
ax2.grid(linewidth=0.2)
ax2.plot(Ps, Ds, color="darkgreen", label="No-Ad Advantage")
ax2.legend()

plt.show()

## Validating Cached Probabilities

For performance reasons, game-winning probabilities can be **precomputed and cached** using `scripts/cache-prob-win-game.py`. The `loadCachedFunction` utility loads these cached values and exposes them as **interpolated functions**.

This cell validates that the cached probabilities match the directly computed values.  
- **Black lines** show the true, directly computed probabilities.  
- **Gray circles** show the cached (interpolated) values.

The two should **overlap visually**, indicating that caching and interpolation introduce no error visible at the scale of the plot.
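
A visual overlap can be complemented by a numeric check. The sketch below is only an illustration of the idea: it assumes, hypothetically, that the cache stores values on a uniform grid of 101 points and interpolates linearly between them, which may differ from what `cache-prob-win-game.py` and `loadCachedFunction` actually do. All names are made up for the example.

```python
from bisect import bisect_right


def game_win_prob(p):
    """Closed-form game-win probability from 0-0, standard scoring (sketch)."""
    q = 1 - p
    return p**4 * (1 + 4 * q + 10 * q**2) + 20 * p**3 * q**3 * p**2 / (p**2 + q**2)


def linear_interp(x, xs, ys):
    """Piecewise-linear interpolation of (xs, ys) at x, with xs sorted ascending."""
    i = min(max(bisect_right(xs, x) - 1, 0), len(xs) - 2)
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])


# Hypothetical cache: 101 evenly spaced values of p and the matching probabilities.
grid = [i / 100 for i in range(101)]
cached = [game_win_prob(p) for p in grid]

# Evaluate at points that mostly fall between grid nodes.
test_ps = [i / 29 for i in range(30)]
max_abs_error = max(abs(linear_interp(p, grid, cached) - game_win_prob(p)) for p in test_ps)
print(f"max abs error: {max_abs_error:.2e}")
```

Under these assumptions the maximum absolute error stays well below the resolution of the plot; a coarser grid would make it larger.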

In [None]:
# Compare cached probabilities vs true values.
# This is a check that probabilities cached using 'cache-prob-win-game.py' are correct.

# Check for either player: 1 or 2
PLAYER_SERVING = 2

# Set noAdRule to True or False to check the 'no-ad' or the standard format.
MF = MatchFormat(noAdRule=True)

# Check for various scores
INIT_SCORES = [GameScore(0, 0, MF), GameScore(1, 1, MF)]

Ps = np.linspace(0, 1, 30)

# load cached data
Zs = {}
for initScore in INIT_SCORES:
    probWinGameCachedFction = loadCachedFunction(initScore, PLAYER_SERVING)
    Zs[initScore] = [probWinGameCachedFction(p) for p in Ps]

# calculate probabilities 'from scratch'
Ys = {}
for initScore in INIT_SCORES:
    Ys[initScore] = [probabilityServerWinsGame(initScore, PLAYER_SERVING, p) for p in Ps]  

# plot both sets of values
plt.title ("Probability of Winning the Game when Serving - Cached vs True")
plt.xlabel("probability of winning the point when serving")
plt.ylabel("probability of winning the game when serving")
plt.xticks(np.arange(0, 1.01, 0.1))
plt.yticks(np.arange(0, 1.01, 0.1))
plt.grid(linewidth=0.1, color='grey')
for i, initScore in enumerate(INIT_SCORES):
    label_true  = 'true values' if i == 0 else None
    label_cache = 'cached data' if i == 0 else None
    plt.plot   (Ps, Ys[initScore], color="black", linewidth=0.4, label=label_true)                            # true   values
    plt.scatter(Ps, Zs[initScore], marker='o', s=40, color="lightgrey", edgecolor="grey", label=label_cache)  # cached values
plt.legend()
plt.show()