# Tennis Simulations

I'm a tennis fan (and player). Let's explore a few oddities of tennis scoring.

1. In a famous graduation speech at Dartmouth, recently retired tennis legend Roger Federer pointed out that, over the course of his career, he won only 53% of the points he played. This seems surprising, since he reached the final, meaning he won six matches in a row, in 31 of the 81 Grand Slams he competed in (38%). Did he just play a lot better in the Grand Slams, or is it plausible that he could have a high probability of winning matches even if he had only a slight edge on each point?

2. Some players are really dominant as servers, but not as good at returning. They seem to go to tie-breaks a lot, with players alternating serve and the server always winning until they reach 6-6 and play a tie-break. How big a discrepancy between serve and return point-winning percentages is needed to make this happen?

3. Commentators often talk about certain players being "clutch" or "choking" in high-pressure situations. How much of a difference in match results would it make if a player had a boost in performance on high-pressure points?

Along the way, we'll learn a little about simulation.

# The Tennis Scoring System


## Game Scoring

In standard scoring for a single game, the first player to get four points wins, but you have to win by two. For obscure historical reasons, the actual scoring looks more complicated:
- Instead of counting 0-1-2-3, we count love-15-30-40.
- At 40-40, we stop using numbers and just say "deuce", meaning the score is tied.
- From deuce, whoever wins the next point has the "advantage".
- If the player with the advantage wins the next point, they win the game; otherwise the score returns to deuce.
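
The scoring rules above can be sketched as a small helper that turns raw point counts into the spoken score. This is only an illustration: `call_score` is a hypothetical name that does not appear in the notebook code, and it reads tied scores below deuce literally (for example "15-15") rather than using the "all" convention.

```python
def call_score(server_points: int, returner_points: int) -> str:
    """Spoken score for a standard game, server's score first.

    Hypothetical helper for illustration; not part of the simulation code.
    """
    names = ["love", "15", "30", "40"]

    # Once both players have at least three points, only the margin matters.
    if server_points >= 3 and returner_points >= 3:
        diff = server_points - returner_points
        if diff == 0:
            return "deuce"
        if abs(diff) == 1:
            return "advantage server" if diff > 0 else "advantage returner"
        return "game server" if diff > 0 else "game returner"

    # Otherwise the game ends as soon as someone reaches four points.
    if server_points >= 4:
        return "game server"
    if returner_points >= 4:
        return "game returner"

    return f"{names[server_points]}-{names[returner_points]}"


print(call_score(2, 1))  # 30-15
print(call_score(3, 3))  # deuce
print(call_score(4, 3))  # advantage server
```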


## Winning a Set


A set consists of a sequence of games. The first person to win six games wins the set, but you have to win by two. If the score is tied at 6-6, the players play a "tie-break", with the winner taking the set.

The players alternate who serves on a game-by-game basis, not within each game. When a player is serving, they serve all the points in that game.

At the professional level, the server wins more games than the returner. A bit of terminology: if the server wins the game, it's called a "hold"; if the returner wins the game, it's called a "break".


## Winning a Match

At the professional level, a match is best-of-three-sets, meaning that the first player to win two sets wins. In the four most important tournaments of the year, the so-called Grand Slams, the men play best-of-five-sets, meaning that the first player to win three sets wins.


## Tie-break Scoring

In a tie-break, one player serves one point, the other player serves two points, and thereafter they switch servers every two points.

In a regular tie-break, the first player to reach 7 points wins, but you have to win by two. 

In a deciding-set tie-break (tied at 1 set each in a regular tournament; tied at 2 sets each in a men's Grand Slam), the victor is the first player to reach 10 points; again, you have to win by two. Thus, it's possible for a tie-break to end with a very high score, like 16-14, if it keeps being tied at 7-7, 8-8, 9-9, etc.
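
The serving order and the win-by-two rule described in this section can be sketched as two small functions. The names `tiebreak_server` and `tiebreak_over` are assumptions made for this illustration, as is the convention that points are indexed from 0 and that player 1 serves the first point.

```python
def tiebreak_server(point_index: int) -> int:
    """Return 1 or 2: who serves the point with this 0-based index.

    Assumes player 1 serves the first point. After that the serve
    changes hands every two points, giving the pattern 1, 2, 2, 1, 1, 2, 2, ...
    """
    return 1 if point_index % 4 in (0, 3) else 2


def tiebreak_over(p1_points: int, p2_points: int, points_to_win: int = 7) -> bool:
    """True once someone has reached points_to_win with a lead of two or more."""
    leader = max(p1_points, p2_points)
    lead = abs(p1_points - p2_points)
    return leader >= points_to_win and lead >= 2


print([tiebreak_server(i) for i in range(9)])   # [1, 2, 2, 1, 1, 2, 2, 1, 1]
print(tiebreak_over(7, 6))                      # False: the lead is only one point
print(tiebreak_over(16, 14, points_to_win=10))  # True
```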


## Winning a Tournament

On the professional tour, tournaments are conducted as single-elimination. In the Grand Slam tournaments, there are 128 players, so it requires seven rounds to determine a single champion. (In each round, half the players are eliminated; so after one round 64 players are left, after two 32 players are left, and so on. After seven rounds, only one player is left.)
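
The halving argument can be checked with a few lines of code. The sketch below uses hypothetical helper names (`field_sizes`, `streak_probability`) and assumes the draw size is a power of two. The streak calculation rests on a strong simplifying assumption that is not part of the tournament rules: every match is won independently with the same probability.

```python
def field_sizes(num_players: int) -> list[int]:
    """Number of players left before round 1, after round 1, after round 2, ...

    Assumes num_players is a power of two (hypothetical helper).
    """
    sizes = [num_players]
    while sizes[-1] > 1:
        sizes.append(sizes[-1] // 2)  # half the field is eliminated each round
    return sizes


def streak_probability(match_win_prob: float, num_matches: int) -> float:
    """Chance of winning num_matches in a row.

    Assumption: matches are independent and each is won with the same
    probability, which ignores stronger opponents in later rounds.
    """
    return match_win_prob ** num_matches


sizes = field_sizes(128)
print(sizes)           # [128, 64, 32, 16, 8, 4, 2, 1]
print(len(sizes) - 1)  # 7 rounds
print(streak_probability(0.5, 6))  # 0.015625: six coin-flip matches in a row
```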

# Simulating Tennis Points, Games, Sets, Matches, and Tournaments

Recall that a simulation involves two things:
- a model that captures some essential features of the real-world process, while omitting other details
- an execution of the model, often many times, to see what happens


### Modeling a Player

We will model a player as an instance of a `Player` class, with a few parameters:
- their service skill (`serve_skill` in the code), with a range of -10 to +10
- their return skill (`return_skill`), with a range of -10 to +10
- their clutch factor (`pressure_skill`), with a range of -10 to +10

### Modeling a Point
We will model the outcome of a point with a function `simulate_point` that takes two players as input and returns the winner of the point. The function will use a random draw for a "Bernoulli" random variable. It's like a coin flip, but the probability of "heads" (the server winning the point) is not necessarily 50%.
- The probability of the server winning the point starts with a base of 50%, to which we add the server's service skill and subtract the returner's return skill. Skills are measured in percentage points, so a skill of +5 shifts the probability by 0.05.
  - If the point can decide the game (a game point for either player), we also add the server's clutch factor and subtract the returner's clutch factor.
  - If it is a clutch game, meaning a game or tie-break that could decide the set, we add the server's clutch factor and subtract the returner's clutch factor again.
  - The clutch factors are therefore applied twice on a game point in a clutch game.
  - The result is clipped to the range 0 to 1.

Note that each point is independent of all other points; if a player wins one point, it doesn't affect the probability of winning the next point, except if it gives someone a game point.

In [1]:
import numpy as np
import matplotlib.pyplot as plt

# For reproducibility
np.random.seed(42)

In [2]:
class Player:
    def __init__(self, serve_skill=0, return_skill=0, pressure_skill=0, name=None):
        self.serve_skill = serve_skill  # -10 to +10
        self.return_skill = return_skill  # -10 to +10
        self.pressure_skill = pressure_skill
        self.name = name or "Player"
    def __repr__(self):
        return f"{self.name}(serve={self.serve_skill}, return={self.return_skill}, pressure={self.pressure_skill})"
    
class Point:
    def __init__(self, p: float, server_won: bool):
        self.p = p
        self.server_won = server_won
    


In [10]:
def is_game_point(server_points, receiver_points, points_to_win=4):
    """
    Return True if the next point could decide the game (or tie-break).
    That is the case when the score is not tied and the leading player has
    at least points_to_win - 1 points: winning the point would put the
    leader at points_to_win or more with a margin of two.
    """
    return (server_points >= points_to_win-1 or receiver_points >= points_to_win-1) and server_points != receiver_points

def point_win_probability(
    server: Player,
    returner: Player,
    game_point: bool = False,
    clutch_game: bool = False,
    ):
    """
    Calculate the probability that the server wins a point against the returner.
    """
    prob = 50 + server.serve_skill - returner.return_skill

    # Apply pressure adjustments
    if game_point:
        prob += (server.pressure_skill - returner.pressure_skill)
    if clutch_game:
        prob += (server.pressure_skill - returner.pressure_skill)

    return np.clip(prob/100, 0.0, 1.0)

def simulate_point(server: Player, returner: Player, game_point=False, clutch_game=False, verbose=False):
    """
    Simulate one point as a Bernoulli draw.
    Returns True if the server wins the point, False otherwise.
    """
    # Compute the probability once and reuse it for logging and for the draw
    p_server = point_win_probability(server, returner, game_point, clutch_game)
    if verbose:
        print(f"\t\t\tPoint: {server.name} (S) vs {returner.name} (R) | Game Point: {game_point} | Clutch Game: {clutch_game}")
        print(f"\t\t\tServer win probability: {p_server:.2f}")
    return np.random.rand() < p_server

def simulate_game(server: Player, returner: Player, clutch_game=False, verbose=False):
    """
    Simulate a game of tennis between two players.
    Returns True if server wins the game, False otherwise.
    """
    points_to_win = 4  # standard game
    server_points = 0
    returner_points = 0

    while (server_points < points_to_win and returner_points < points_to_win) or abs(server_points - returner_points) < 2:
        if verbose:
            print(f"\t\tScore: {server.name} {server_points} - {returner.name} {returner_points}")
        if simulate_point(server, 
                          returner, 
                          is_game_point(server_points, returner_points, points_to_win), 
                          clutch_game,
                          verbose=verbose
                          ):
            server_points += 1
        else:
            returner_points += 1

    return server_points > returner_points

def simulate_tiebreak(player1: Player, player2: Player, points_to_win=7, verbose=False):
    """
    Simulate a tiebreak between two players.
    - player1 serves first
    Returns True if player1 wins the tiebreak, False otherwise.
    """
    p1_points = 0
    p2_points = 0
    total_points = 0
    while (p1_points < points_to_win and p2_points < points_to_win) or abs(p1_points - p2_points) < 2:
        # Determine server based on point number
        first_player_serves = total_points % 4 in [0, 3]  # 0, 3, 4, 7, 8, ... (1st, 4th, 5th, 8th, 9th, ...)

        if verbose:
            print(f"\t\tTiebreak Score: {player1.name} {p1_points} - {player2.name} {p2_points} | {'1st' if first_player_serves else '2nd'} player serving")

        if first_player_serves:
            server = player1
            returner = player2
        else:
            server = player2
            returner = player1

        server_wins = simulate_point(
            server,
            returner,
            game_point=is_game_point(p1_points, p2_points, points_to_win),
            clutch_game=True,
            verbose=verbose
        )

        if server_wins:
            if first_player_serves:
                p1_points += 1
            else:
                p2_points += 1
        else:
            if first_player_serves:
                p2_points += 1
            else:
                p1_points += 1

        total_points += 1

    return p1_points > p2_points

def is_clutch_game(p1_games, p2_games):
    """
    Determine if the current game is a clutch game.
    A clutch game is when one player winning will end the set:
    the leader has at least 5 games (5-0 through 5-4, or 6-5), or the score is 6-6.
    """
    leader = max(p1_games, p2_games)
    trailer = min(p1_games, p2_games)
    return (leader >= 5 and leader > trailer) or (p1_games == 6 and p2_games == 6)

def simulate_set(player1: Player, player2: Player, deciding_set=False, verbose=False):
    """
    Simulate a set between two players.
    - player1 serves first
    Returns (p1_games, p2_games) for the set.
    """
    p1_games = 0
    p2_games = 0
    first_player_serves = True  # player1 serves first; then alternate

    while (p1_games < 6 and p2_games < 6) or abs(p1_games - p2_games) < 2:
        if verbose:
            print(f"\tSet Score: {player1.name} {p1_games} - {player2.name} {p2_games} | {'1st' if first_player_serves else '2nd'} player serving")
        clutch_game = is_clutch_game(p1_games, p2_games)
        if p1_games == 6 and p2_games == 6:
            # Tiebreak
            if simulate_tiebreak(player1, player2, points_to_win=10 if deciding_set else 7, verbose=verbose): # at 6-6, player1 will serve first in the tiebreak
                p1_games += 1
            else:
                p2_games += 1
            break # set is over after tiebreak
        else:
            if first_player_serves:
                if simulate_game(player1, player2, clutch_game=clutch_game, verbose=verbose):
                    p1_games += 1
                else:
                    p2_games += 1
            else:
                if simulate_game(player2, player1, clutch_game=clutch_game, verbose=verbose):
                    p2_games += 1
                else:
                    p1_games += 1

            first_player_serves = not first_player_serves  # alternate serve

    return p1_games, p2_games

def simulate_match(player1: Player, player2: Player, best_of=3, verbose=False):
    """
    Simulate a match between two players.
    - player1 serves first in the first set; after that, service keeps alternating game by game across sets
    - best_of: 3 or 5 sets
    Returns a list of (p1_games, p2_games) tuples, one per set played.
    """
    assert best_of in [3, 5], "best_of must be 3 or 5"
    sets_to_win = best_of // 2 + 1
    p1_sets = 0
    p2_sets = 0
    results = []
    first_player_serves = True  # player1 serves first in the first set; then alternate

    if verbose:
        print(f"Simulating a best-of-{best_of} match between {player1.name} and {player2.name}")
        print("-----------------------------------------------------")


    while p1_sets < sets_to_win and p2_sets < sets_to_win:
        if verbose:
            print(f"Match Score: {player1.name} {p1_sets} - {player2.name} {p2_sets} | {'1st' if first_player_serves else '2nd'} player serving this set")
        deciding_set = (p1_sets == sets_to_win - 1 and p2_sets == sets_to_win - 1)

        if first_player_serves:
            p1_games, p2_games = simulate_set(player1, player2, deciding_set=deciding_set, verbose=verbose)
        else:
            p2_games, p1_games = simulate_set(player2, player1, deciding_set=deciding_set, verbose=verbose)
        results.append((p1_games, p2_games))

        if p1_games > p2_games:
            p1_sets += 1
        else:
            p2_sets += 1
        if (p1_games + p2_games) % 2 == 1:  # if total games is odd, the other player serves first next set
            first_player_serves = not first_player_serves

    return results

In [11]:
p1 = Player(serve_skill=5, return_skill=1, pressure_skill=0, name="Federer")
p2 = Player(serve_skill=2, return_skill=-2, pressure_skill=0, name="Joe Average")

In [12]:
simulate_match(p1, p2, best_of=3, verbose=True)

Simulating a best-of-3 match between Federer and Joe Average
-----------------------------------------------------
Match Score: Federer 0 - Joe Average 0 | 1st player serving this set
	Set Score: Federer 0 - Joe Average 0 | 1st player serving
		Score: Federer 0 - Joe Average 0
			Point: Federer (S) vs Joe Average (R) | Game Point: False | Clutch Game: False
			Server win probability: 0.57
		Score: Federer 1 - Joe Average 0
			Point: Federer (S) vs Joe Average (R) | Game Point: False | Clutch Game: False
			Server win probability: 0.57
		Score: Federer 2 - Joe Average 0
			Point: Federer (S) vs Joe Average (R) | Game Point: False | Clutch Game: False
			Server win probability: 0.57
		Score: Federer 3 - Joe Average 0
			Point: Federer (S) vs Joe Average (R) | Game Point: True | Clutch Game: False
			Server win probability: 0.57
		Score: Federer 3 - Joe Average 1
			Point: Federer (S) vs Joe Average (R) | Game Point: True | Clutch Game: False
			Server win probability: 0.57
	Set Score: Fe [output truncated]

[(6, 2), (7, 5)]

In [13]:
# as a sanity check, let's see what percentage of points Federer wins against Joe Average, when serving
num_simulations = 100000
serve_points_won = 0

for _ in range(num_simulations):
    if simulate_point(p1, p2):
        serve_points_won += 1

print(f"When Federer serves, he wins {serve_points_won/num_simulations*100:.1f}% of points")

# and when receiving
num_simulations = 100000
return_points_won = 0
for _ in range(num_simulations):
    if not simulate_point(p2, p1):
        return_points_won += 1

print(f"When Federer returns, he wins {return_points_won/num_simulations*100:.1f}% of points")

print(f"Overall, Federer wins {(serve_points_won + return_points_won)/(2*num_simulations)*100:.1f}% of points")

When Federer serves, he wins 57.0% of points
When Federer returns, he wins 48.9% of points
Overall, Federer wins 52.9% of points


In [15]:
# simulate lots of games to see what percentage of games Federer wins against Joe Average

num_simulations = 100000
serve_games_won = 0
for _ in range(num_simulations):
    if simulate_game(p1, p2):
        serve_games_won += 1

print(f"When serving, {p1.name} wins {serve_games_won/num_simulations*100:.1f}% of games.")
# and when receiving
num_simulations = 100000
return_games_won = 0
for _ in range(num_simulations):
    if not simulate_game(p2, p1):
        return_games_won += 1

print(f"When receiving, {p1.name} wins {return_games_won/num_simulations*100:.1f}% of games.")

print(f"Overall, {p1.name} wins {(serve_games_won + return_games_won)/(2*num_simulations)*100:.1f}% of games.")

When serving, Federer wins 67.5% of games.
When receiving, Federer wins 47.5% of games.
Overall, Federer wins 57.5% of games.
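
Simulated game win rates like these can be compared against a closed-form calculation. The sketch below is not part of the notebook's code: `game_win_probability` is a name chosen here, and it assumes every point is won independently with the same probability `p`, with no game-point or clutch adjustments. Under that assumption, the game is won either 4-0, 4-1, or 4-2, or by reaching deuce at 3-3 and then winning two points in a row before the opponent does.

```python
def game_win_probability(p: float) -> float:
    """Probability of winning a standard game (first to four, win by two).

    Assumes each point is won independently with probability p and that
    there are no pressure adjustments.
    """
    q = 1.0 - p

    # Win 4-0, 4-1, or 4-2: the last point is a win, and the 0, 1, or 2
    # lost points can fall anywhere among the earlier points
    # (1, 4, and 10 orderings respectively).
    win_before_deuce = p**4 * (1 + 4 * q + 10 * q**2)

    # Reach 3-3: 20 orderings of three wins and three losses.
    reach_deuce = 20 * p**3 * q**3

    # From deuce: win two in a row (p^2), lose two in a row (q^2),
    # or split the next two points (2pq) and return to deuce.
    win_from_deuce = p**2 / (1 - 2 * p * q)

    return win_before_deuce + reach_deuce * win_from_deuce


for p in (0.50, 0.53, 0.60):
    print(f"p = {p:.2f} -> game win probability {game_win_probability(p):.3f}")
```

A small per-point edge is amplified at the game level because the game is effectively a "best of many points" contest.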


In [14]:
# Simulate lots of matches against "Joe Average".
num_simulations = 1000
results = [simulate_match(p1, p2) for _ in range(num_simulations)]
def winner(match_result):
    p1_sets = sum(1 for p1_games, p2_games in match_result if p1_games > p2_games)
    p2_sets = sum(1 for p1_games, p2_games in match_result if p2_games > p1_games)
    return p1_sets > p2_sets
p1_wins = sum(1 for result in results if winner(result))
print(f"{p1.name} won {p1_wins} out of {num_simulations} matches ({(p1_wins/num_simulations)*100:.1f}%)")


Federer won 781 out of 1000 matches (78.1%)
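
With only 1,000 simulated matches, the estimate carries some sampling noise. A rough way to quantify it is the standard error of a proportion with a normal-approximation interval. The function name `proportion_with_interval` is an assumption for this sketch, and the normal approximation is a convenience, not the only way to build such an interval.

```python
import math


def proportion_with_interval(successes: int, trials: int, z: float = 1.96):
    """Estimated proportion, its standard error, and an approximate 95% interval.

    Uses the normal approximation to the binomial, which is reasonable
    when both successes and failures are numerous.
    """
    p_hat = successes / trials
    std_err = math.sqrt(p_hat * (1.0 - p_hat) / trials)
    return p_hat, std_err, (p_hat - z * std_err, p_hat + z * std_err)


# 781 wins out of 1000 simulated matches, as in the output above
p_hat, std_err, (low, high) = proportion_with_interval(781, 1000)
print(f"estimate {p_hat:.3f}, standard error {std_err:.4f}")
print(f"approximate 95% interval: {low:.3f} to {high:.3f}")
```

The standard error shrinks with the square root of the number of simulations, so halving the width of the interval takes roughly four times as many matches.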


# Some Questions To Answer With Simulations 

1. In a famous graduation speech at Dartmouth, recently retired tennis legend Roger Federer pointed out that, over the course of his career, he won only 53% of the points he played. This seems surprising, since he reached the final, meaning he won six matches in a row, in 31 of the 81 Grand Slams he competed in (38%). Did he just play a lot better in the Grand Slams, or is it plausible that he could have a high probability of winning matches even if he had only a slight edge on each point?
- If the probability of winning each point is 53%, what is the probability of winning a game, with standard scoring and with no-ad scoring? (In no-ad scoring, a single deciding point is played at deuce, so there is no win-by-two requirement.) Before running the simulation, see if you can guess whether the probability of winning a game will be higher or lower than 53%, and whether it will be higher with standard or with no-ad scoring.
- Suppose that Federer actually had a 60% chance of winning each point when he served, and a 46% chance of winning each point when his opponent served. What was his probability of winning a set? Of winning a match? Of winning six matches in a row? More generally, given that Federer was going to win 53% of points overall, would it have been better for him to have a balanced advantage both serving and returning, or a bigger advantage when serving and a smaller one when returning?

2. Let's imagine that there are two dominant players on the tour who are much better than the rest of the players, and both make it to the finals of tournaments pretty regularly. (This is true today on the men's tour: the players are Carlos Alcaraz and Jannik Sinner.) Note that tournaments use a seeding system so that the two best players do not have to play each other until the final. Now suppose that you are the coach for one of these top players, and you are obsessed with helping your player win more often against their chief rival. During the short off-season, you have three options for how you can work with your player, with three different expected outcomes. Which of these three power-ups would you choose to give your player?
- You can work on their serve, the shot that starts a point, and follow-up shots to the serve. This will increase the player's winning percentage on their service points by +3.
- You can work on their returns, the shot in response to a serve. This will increase the player's winning percentage when they are not the server by +3.
- You can hire a sports psychologist who will improve the player's ability to handle high pressure points, whether as server or returner:
  - if it's a game (or tie-break) that could decide the match, they improve their winning percentage by +4 on every point
  - if it's a game (or tie-break) that could decide a set, but not the match, they improve their winning percentage by +2 on every point
  - In addition, on any point that could decide the game, they improve their winning percentage by +3
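
The sports-psychologist option uses different boosts for different situations, which a single clutch factor does not capture directly. One way to encode the three rules is sketched below. The function `pressure_bonus` and its flag names are hypothetical, and wiring it into the point-probability calculation is left open.

```python
def pressure_bonus(game_point: bool, set_deciding: bool, match_deciding: bool) -> int:
    """Percentage-point boost for the coached player on the current point.

    Hypothetical encoding of the third option:
    - +4 on every point of a game or tie-break that could decide the match
    - +2 on every point of a game or tie-break that could decide a set but not the match
    - +3 more on any point that could decide the game
    """
    bonus = 0
    if match_deciding:
        bonus += 4
    elif set_deciding:
        bonus += 2
    if game_point:
        bonus += 3
    return bonus


print(pressure_bonus(game_point=False, set_deciding=False, match_deciding=False))  # 0
print(pressure_bonus(game_point=True, set_deciding=True, match_deciding=False))    # 5
print(pressure_bonus(game_point=True, set_deciding=True, match_deciding=True))     # 7
```

The bonus would be added to the server's probability when the coached player serves and subtracted when they return, mirroring how the clutch factor is applied in the model.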
