# Tiebreak Win Probability

This notebook explores the probability of winning a tennis tiebreak as a function of the probability of winning points on serve for both players.

**Key Questions:**
- How do serve probabilities for both players affect tiebreak outcomes?
- How does a 10-point super-tiebreak compare to a standard 7-point tiebreak?
- Does it matter which player serves first in a tiebreak?

#### Reference
This notebook accompanies the blog post: https://medium.com/p/13ae3ce1c078

## Setup and Imports

In [None]:
from matplotlib        import pyplot as plt
from scipy.interpolate import RectBivariateSpline
import numpy as np
import os, sys

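# add path to the 'tennis_lab' package if not in PYTHONPATH already
# (the quoted '__file__' has an empty dirname, so PROJECT_ROOT resolves to the parent of the current working directory)
PROJECT_ROOT = os.path.abspath(os.path.join(os.path.dirname('__file__'), '..'))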
SRC_DIR = os.path.join(PROJECT_ROOT, 'src')
if SRC_DIR not in sys.path:
    sys.path.append(SRC_DIR)

from tennis_lab.core.tiebreak_score        import TiebreakScore
from tennis_lab.core.match_format          import MatchFormat
from tennis_lab.paths.tiebreak_path        import TiebreakPath
from tennis_lab.paths.tiebreak_probability import probabilityP1WinsTiebreak, loadCachedFunction

## All Possible Score Paths in a Tiebreak

A tiebreak has many more possible score progressions than a regular game because more points are played (first to 7 points, or first to 10 in a super-tiebreak).

The `TiebreakPath` class generates all possible paths from a given starting score. Each path entry is a tuple of `(player1_points, player2_points, next_server)`.

**Server rotation in tiebreaks:**
- Player 1 serves the first point
- Players alternate every 2 points thereafter
- This rotation is tracked in the path entries
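
The rotation described above can be written as a small function of the point number. This is an illustrative sketch, not part of `tennis_lab`; the name `server_of_point` is hypothetical.

```python
def server_of_point(n, first_server=1):
    """Return the server (1 or 2) of the n-th point of a tiebreak, n = 1, 2, 3, ...

    Hypothetical helper: point 1 is served by `first_server`, points 2-3 by the
    other player, points 4-5 by `first_server` again, and so on.
    """
    block = n // 2  # 0 for point 1, 1 for points 2-3, 2 for points 4-5, ...
    return first_server if block % 2 == 0 else 3 - first_server


# Servers of the first eight points when Player 1 serves first
print([server_of_point(n) for n in range(1, 9)])
```

Note that every pair of points (1, 2), (3, 4), (5, 6), ... contains exactly one serve by each player, which matters when comparing who serves first.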

A standard 7-point tiebreak has **2,508 distinct paths** from 0-0 (paths ending at 6-6 are cut off since deuce can repeat indefinitely).
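
The path count can be checked with a short combinatorial argument, sketched below independently of `TiebreakPath` (the function name `count_tiebreak_paths` is hypothetical). A path that ends 7-k must finish with a point won by the winner, so the preceding 6+k points can be ordered in C(6+k, k) ways; paths that reach 6-6 are counted once each and cut off there.

```python
from math import comb


def count_tiebreak_paths(target=7):
    """Count score paths from 0-0, cutting off paths at (target-1)-(target-1)."""
    # Paths won target-k by one player, for k = 0 .. target-2
    decided = sum(comb(target - 1 + k, k) for k in range(target - 1))
    # Paths that reach the tie at (target-1)-(target-1)
    tied = comb(2 * (target - 1), target - 1)
    return 2 * decided + tied


print(count_tiebreak_paths(7))   # → 2508
print(count_tiebreak_paths(10))  # super-tiebreak, same counting convention
```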

In [None]:
# Generate all possible score paths for a (super-)tiebreak.
# WARNING: This is a long calculation (approx. 1 minute) for a super-tiebreak.

IS_SUPER       = False    # see WARNING above
INIT_SCORE     = TiebreakScore(0, 0, isSuper=IS_SUPER)
PLAYER_SERVING = 1

paths   = TiebreakPath.generateAllPaths(INIT_SCORE, PLAYER_SERVING)
n_paths = len(paths)

# display score scenarios
if n_paths <= 10:
    for p in paths:
        print(p)
else:
    for p in paths[:5]:
        print(p)
    print(f"..... {n_paths - 10} more paths")
    for p in paths[-5:]:
        print(p)

## Tiebreak Win Probability vs. Serve Probabilities

Unlike a regular service game—where only the server’s point-winning probability matters—a **tiebreak** depends on **both players’ serve probabilities**, because the serve alternates throughout the tiebreak according to a fixed pattern.

The `probabilityP1WinsTiebreak` function computes the probability that **Player 1** wins a tiebreak, given:
- **p1**: the probability Player 1 wins a point when serving
- **p2**: the probability Player 2 wins a point when serving

### How the Function Works

Internally, the calculation follows the same general principle as for games:

1. **Enumerate all possible tiebreak score paths** consistent with the alternating service order.
2. **Compute the probability of each path**, using **p1** for points served by Player 1 and **p2** for points served by Player 2.
3. **Sum the probabilities** of all paths in which Player 1 wins the tiebreak (first to 7 points, leading by at least 2).

Because a tiebreak can continue indefinitely once it reaches 6–6, there are infinitely many possible ways it can unfold. To handle this, we explicitly compute all score paths up to the first 6–6 tie, and then use a mathematical shortcut to account for everything that can happen after that.
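
The idea can be illustrated with a minimal, self-contained sketch. It is not the `tennis_lab` implementation: instead of enumerating paths it uses an equivalent recursion over scores, and the function name `tiebreak_win_prob` is hypothetical. It assumes points are independent. From 6–6 onward, points come in pairs with one serve by each player, so the tiebreak ends only when one player wins both points of a pair; this gives the closed form `a / (a + b)`, where `a` and `b` are the probabilities that Player 1 or Player 2 wins both points.

```python
from functools import lru_cache


def tiebreak_win_prob(p1, p2, target=7, first_server=1):
    """Sketch: probability that Player 1 wins a tiebreak starting at 0-0."""

    def server_of_point(n):
        # Point 1: first server; points 2-3: other player; points 4-5: first server; ...
        return first_server if (n // 2) % 2 == 0 else 3 - first_server

    @lru_cache(maxsize=None)
    def win(a, b):
        if a == b == target - 1:
            # Shortcut for everything after the tie (e.g. 6-6)
            both_p1 = p1 * (1 - p2)        # Player 1 wins both points of a pair
            both_p2 = (1 - p1) * p2        # Player 2 wins both points of a pair
            decided = both_p1 + both_p2
            # If no pair can ever be decided, return 0.5 by convention (an assumption)
            return 0.5 if decided == 0 else both_p1 / decided
        if a == target:
            return 1.0
        if b == target:
            return 0.0
        server = server_of_point(a + b + 1)
        q = p1 if server == 1 else 1 - p2  # probability that Player 1 wins the next point
        return q * win(a + 1, b) + (1 - q) * win(a, b + 1)

    return win(0, 0)


print(tiebreak_win_prob(0.65, 0.60))
```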

### Interpreting the Plot

The plot shows a **family of curves**, each corresponding to a fixed value of **p2**, while **p1** varies along the horizontal axis.

- When **p1 = p2**, the tiebreak is symmetric and Player 1’s probability of winning is **50%** (interpolated cached values may deviate from this very slightly).
- As the serve probabilities diverge, the curves show how even small differences in serve effectiveness can produce meaningful advantages over the course of a tiebreak.

In [None]:
# Calculate the probability that Player1 wins the tiebreak, as 
# a function of the probability that he wins a point when serving.
# There are multiple such curves, for various values of 'p2', the 
# probability that Player2 wins a point when serving.

# WARNING: 
# This is a slow computation if performed 'from scratch' by calling 'probabilityP1WinsTiebreak()'. 
# The better alternative is to use cached data - this is a set of pre-calculated data points.
# Cached data can be generated by running the 'cache-prob-win-tiebreak.py' script.
# To use cached data set the USE_CACHED option:
#  + USE_CACHED=True to use pre-computed cached data (fast, ~seconds)
#  + USE_CACHED=False to compute from scratch (slow, ~10 minutes)
# If cached data is not available the calculation falls back to 'from scratch' mode.
# Without cached data this cell can take 10 minutes to complete for a regular tiebreak, and at
# least 10 times longer for a super-tiebreak.

USE_CACHED     = True      # Toggle between cached (fast) and computed (slow)
IS_SUPER       = False     # WARNING: very slow without cached data
INIT_SCORE     = TiebreakScore(0, 0, isSuper=IS_SUPER)
PLAYER_SERVING = 1

# P1s is the probability that Player1 wins the point when serving, 
# it is the horizontal axis of the graph.
# P2s is the probability that Player2 wins the point when serving,
# it describes a family of curves (one for each entry in P2s)
P1s = np.linspace(0, 1, 50)
P2s = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]

# Load cached function if available and USE_CACHED is True
cachedFn = None
if USE_CACHED:
    cachedFn = loadCachedFunction(INIT_SCORE, PLAYER_SERVING)
    if cachedFn is None:
        print("Warning: Cached data not available, falling back to computation from scratch")

# plot a family of curves
for idx, p2 in enumerate(P2s):   
    print(f"\rCalculating for p2={p2}... ({idx+1}/{len(P2s)})", end="", flush=True)
    if cachedFn is not None:
        Ys = [cachedFn(p1, p2) for p1 in P1s]
    else:
        Ys = [probabilityP1WinsTiebreak(INIT_SCORE, PLAYER_SERVING, p1, p2) for p1 in P1s]
    plt.plot(P1s, Ys, linewidth=0.8, label=f"p2={p2}")
print("\rDone!                              ")

plt.title ("Probability of Player1 Winning the Tiebreak")
plt.xlabel("probability of winning the point when serving for Player1")
plt.ylabel("probability of winning the tiebreak")
plt.xticks(np.arange(0, 1.01, 0.1))
plt.yticks(np.arange(0, 1.01, 0.1))
plt.grid(linewidth=0.3)
plt.legend(fontsize=7)
plt.show()

## Regular vs. Super Tiebreak (7-point vs. 10-point)

A **super tiebreak** (10-point tiebreak) is used in some formats—most commonly as a **match tiebreak** in place of a deciding set.

### Key Differences

- **Standard tiebreak**: First to 7 points (win by 2)
- **Super tiebreak**: First to 10 points (win by 2)

In this section, we compare the **probability that a player wins the tiebreak** under these two formats, using the same point-level serve probabilities.

Because the super tiebreak requires more points to win, it **reduces randomness** and **amplifies skill differences**: the better player is more likely to win a super tiebreak than a regular 7-point tiebreak, even when the underlying serve probabilities are identical.

As a result, the winning probability curve for a super tiebreak is typically **steeper**, reflecting the stronger influence of point-level advantages over a longer sequence of points.
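
A quick numerical check of this claim is sketched below, independently of `tennis_lab` (the function names are hypothetical). It assumes independent points and uses the fact that a tiebreak to T points has the same winner as playing all 2(T-1) points, of which each player serves T-1, followed by pairs of points if the score is tied.

```python
from math import comb


def binom_pmf(n, p):
    """Probability of k successes in n trials, for k = 0 .. n."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def tiebreak_win_prob_binomial(p1, p2, target=7):
    """Sketch: Player 1's win probability from 0-0 via binomial counts."""
    n = target - 1
    on_serve = binom_pmf(n, p1)        # points Player 1 wins on own serve
    on_return = binom_pmf(n, 1 - p2)   # points Player 1 wins on Player 2's serve
    # Distribution of Player 1's total over the 2n points
    total = [0.0] * (2 * n + 1)
    for i, a in enumerate(on_serve):
        for j, b in enumerate(on_return):
            total[i + j] += a * b
    p_ahead = sum(total[n + 1:])       # Player 1 has more than half of the points
    p_tied = total[n]                  # score is tied, e.g. 6-6
    both_p1 = p1 * (1 - p2)
    both_p2 = (1 - p1) * p2
    decided = both_p1 + both_p2
    p_after_tie = 0.5 if decided == 0 else both_p1 / decided
    return p_ahead + p_tied * p_after_tie


regular = tiebreak_win_prob_binomial(0.65, 0.60, target=7)
super_tb = tiebreak_win_prob_binomial(0.65, 0.60, target=10)
print(f"7-point: {regular:.4f}   10-point: {super_tb:.4f}")
```

With Player 1 slightly stronger on serve (0.65 vs. 0.60), the 10-point value comes out higher than the 7-point value, in line with the statement above.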

**Note:** This calculation takes longer to run, since a super tiebreak has **significantly more possible score paths** than a standard tiebreak.

In [None]:
# Compare 7-point vs 10-point tiebreaks

# WARNING: 
# This is a slow computation if performed 'from scratch' by calling 'probabilityP1WinsTiebreak()'. 
# The better alternative is to use cached data - this is a set of pre-calculated data points.
# Cached data can be generated by running the 'cache-prob-win-tiebreak.py' script.
# To use cached data set the USE_CACHED option:
#  + USE_CACHED=True to use pre-computed cached data (fast, ~seconds)
#  + USE_CACHED=False to compute from scratch (slow, ~10 minutes)
# If cached data is not available the calculation falls back to 'from scratch' mode.

USE_CACHED   = False
INIT_SCORE7  = TiebreakScore(0, 0, isSuper=False)
INIT_SCORE10 = TiebreakScore(0, 0, isSuper=True )

# P1s is the probability that Player1 wins the point when serving, 
# it is the horizontal axis of the graph.
# P2s is the probability that Player2 wins the point when serving,
# it describes a family of curves (one for each entry in P2s)
P1s = np.linspace(0, 1, 20)
P2s = [0.3]

# Load cached functions if available
cachedFn7  = loadCachedFunction(INIT_SCORE7,  1) if USE_CACHED else None
cachedFn10 = loadCachedFunction(INIT_SCORE10, 1) if USE_CACHED else None

if USE_CACHED and cachedFn7 is None:
    print("Warning: Cached data for 7-point tiebreak not available, falling back to computation")
if USE_CACHED and cachedFn10 is None:
    print("Warning: Cached data for 10-point tiebreak not available, falling back to computation")

colors = ["blue", "green"]
for idx, p2 in enumerate(P2s):
    if cachedFn7 is not None:
        Y7s = [cachedFn7(p1, p2) for p1 in P1s]
    else:
        Y7s = [probabilityP1WinsTiebreak(INIT_SCORE7, 1, p1, p2) for p1 in P1s]
    
    if cachedFn10 is not None:
        Y10s = [cachedFn10(p1, p2) for p1 in P1s]
    else:
        Y10s = [probabilityP1WinsTiebreak(INIT_SCORE10, 1, p1, p2) for p1 in P1s]

    plt.plot(P1s, Y7s , color=colors[idx], linewidth=0.7, label=f"p2={p2},   7-point")
    plt.plot(P1s, Y10s, color=colors[idx], linewidth=0.7, linestyle="--", label=f"p2={p2}, 10-point")
    plt.vlines(p2, 0, 1, linewidth=0.5, color=colors[idx])

plt.title ("Player1's Probability of Winning Regular vs Super Tiebreak")
plt.xlabel("probability of winning the point when serving for Player1")
plt.ylabel("probability of winning the tiebreak")
plt.grid(linewidth=0.2)
plt.legend(fontsize=8)
plt.show()

## Does It Matter Who Serves First?

A natural question in tiebreak analysis is whether **serving first** provides an advantage.

This section compares the probability that **Player 1** wins the tiebreak under two scenarios:
- **Player 1 serves next** (marked with `x`)
- **Player 2 serves next** (marked with `o`)

### Key Insight

At an **even total score** (e.g., 0–0, 2–4), the probability that Player 1 wins the tiebreak is **identical**, regardless of which player serves next.  
At an **odd total score**, the probabilities differ slightly, reflecting the asymmetric serve order at those points.

### Conclusion

At the start of a tiebreak (**0–0**), **it does not matter who serves first**.
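
One way to see why is to count how many of the remaining points each player serves. The sketch below assumes that the serve changes hands after every odd-numbered point (1, 3, 5, ...) and that points are independent; `serves_remaining` is a hypothetical helper, not part of `tennis_lab`. At an even total the two players serve the same number of remaining points whoever serves next, so only the order of serves changes; at an odd total the player serving next gets one extra serve.

```python
def serves_remaining(points_played, next_server, target=7):
    """Count the serves of each player over the points up to 2*(target-1)."""
    counts = {1: 0, 2: 0}
    server = next_server
    for n in range(points_played + 1, 2 * (target - 1) + 1):
        counts[server] += 1
        if n % 2 == 1:          # serve changes hands after odd-numbered points
            server = 3 - server
    return counts


for played in (0, 1, 6):
    print(played, serves_remaining(played, 1), serves_remaining(played, 2))
```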

In [None]:
# Does it matter who serves first?
# For each starting score and set of probability parameters, we calculate the probability
# that Player1 wins the tiebreak in two scenarios: when P1 serves next and when P2 serves next.
# We note that these two probabilities are the same when the number of points played
# so far is even, but differ when it is odd.
# CONCLUSION: it does not matter who serves first at the beginning of the tiebreak.

# WARNING: 
# This is a slow computation if performed 'from scratch' by calling 'probabilityP1WinsTiebreak()'. 
# The better alternative is to use cached data - this is a set of pre-calculated data points.
# Cached data can be generated by running the 'cache-prob-win-tiebreak.py' script.
# To use cached data set the USE_CACHED option:
#  + USE_CACHED=True to use pre-computed cached data (fast, ~seconds)
#  + USE_CACHED=False to compute from scratch (slow, ~10 minutes)
# If cached data is not available the calculation falls back to 'from scratch' mode.

USE_CACHED = True
IS_SUPER   = False
INIT_SCORE = TiebreakScore(0, 0, isSuper=IS_SUPER)

# P1s is the probability that Player1 wins the point when serving, 
# it is the horizontal axis of the graph.
# P2s is the probability that Player2 wins the point when serving,
# it describes a family of curves (one for each entry in P2s)
P1s = np.linspace(0, 1, 25)
P2s = [0.1, 0.5, 0.9]

# Load cached functions for both server scenarios
cachedFn1 = loadCachedFunction(INIT_SCORE, 1) if USE_CACHED else None
cachedFn2 = loadCachedFunction(INIT_SCORE, 2) if USE_CACHED else None

if USE_CACHED and (cachedFn1 is None or cachedFn2 is None):
    print("Warning: Cached data not fully available, falling back to computation")
    cachedFn1 = cachedFn2 = None

# plot a family of curves
for p2 in P2s:   
    if cachedFn1 is not None:
        Y1s = [cachedFn1(p1, p2) for p1 in P1s]
        Y2s = [cachedFn2(p1, p2) for p1 in P1s]
    else:
        Y1s = [probabilityP1WinsTiebreak(INIT_SCORE, 1, p1, p2) for p1 in P1s]
        Y2s = [probabilityP1WinsTiebreak(INIT_SCORE, 2, p1, p2) for p1 in P1s]
    
    plt.scatter(P1s, Y2s, marker='o', s=80, alpha=0.3, label=f"p2={p2} P2 serves next")
    plt.scatter(P1s, Y1s, marker='x', s=15, alpha=0.5, color="black", label=f"p2={p2} P1 serves next")

plt.title (f"Probability of Player1 Winning the Tiebreak from {INIT_SCORE}")
plt.xlabel("probability of winning the point when serving for Player1")
plt.ylabel("probability of winning the tiebreak")
plt.xticks(np.arange(0, 1.01, 0.1))
plt.yticks(np.arange(0, 1.01, 0.1))
plt.grid(linewidth=0.3)
plt.legend(fontsize=6)
plt.show()

## Validating Cached Probabilities

For performance, tiebreak-winning probabilities can be pre-computed and cached using `scripts/cache-prob-win-tiebreak.py`. The `loadCachedFunction` function loads these cached values as interpolated functions.

This cell validates that cached values match the true computed probabilities. The black lines show true values; grey circles show cached values. They should overlap perfectly.
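
The visual check can be complemented by a numeric one. The helper below is a hypothetical sketch (not part of `tennis_lab`) that reports the largest absolute difference between two equally long lists of probabilities; the sample values are made up purely for illustration.

```python
def max_abs_error(cached_values, true_values):
    """Largest absolute difference between two equally long sequences."""
    if len(cached_values) != len(true_values):
        raise ValueError("sequences must have the same length")
    return max(abs(c - t) for c, t in zip(cached_values, true_values))


# Illustrative numbers only, not results from the cache
example_cached = [0.10, 0.50, 0.90]
example_true = [0.10, 0.52, 0.90]
print(max_abs_error(example_cached, example_true))
```

In this notebook the helper could be applied to `Zs[initScore]` and `Ys[initScore]` for each starting score once the cell below has run.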

In [None]:
# Compare cached probabilities vs true values.
# This is a check that interpolated probabilities cached using 'cache-prob-win-tiebreak.py' are correct.

PLAYER_SERVING = 1
IS_SUPER       = False
INIT_SCORES    = [TiebreakScore(5, 0, isSuper=IS_SUPER), 
                  TiebreakScore(4, 2, isSuper=IS_SUPER),
                  TiebreakScore(6, 4, isSuper=IS_SUPER)]

# P1s is the probability that Player1 wins the point when serving, 
# it is the horizontal axis of the graph.
# P2 is the probability that Player2 wins the point when serving.
P1s = np.linspace(0, 1, 30)
P2  = 0.5

# load cached data
Zs = {}
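for initScore in INIT_SCORES:
    probWinTBreakCachedFction = loadCachedFunction(initScore, PLAYER_SERVING)
    # loadCachedFunction returns None when no cached data is available
    if probWinTBreakCachedFction is None:
        raise RuntimeError(f"No cached data for {initScore}; run 'cache-prob-win-tiebreak.py' first")
    Zs[initScore] = [probWinTBreakCachedFction(p1, P2) for p1 in P1s]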

# calculate probabilities 'from scratch'
Ys = {}
for initScore in INIT_SCORES:
    Ys[initScore] = [probabilityP1WinsTiebreak(initScore, PLAYER_SERVING, p1, P2) for p1 in P1s]

# plot both sets of values
plt.title ("Probability of Winning the Tiebreak - Cached vs True")
plt.xlabel("probability of winning the point when serving for Player1")
plt.ylabel("probability of winning the tiebreak")
plt.xticks(np.arange(0, 1.01, 0.1))
plt.yticks(np.arange(0, 1.01, 0.1))
plt.grid(linewidth=0.1, color='grey')
for initScore in INIT_SCORES:
    plt.plot   (P1s, Ys[initScore], color="black", linewidth=0.4, label=f"{initScore}")                           # true   values
    plt.scatter(P1s, Zs[initScore], marker='o', s=40, color="lightgrey", edgecolor="grey")  # cached values
plt.legend(fontsize=9)
plt.show()