## Possible Outcomes of NBA Playoff Series

In this short notebook, we will develop some Python code to generate the possible outcomes of an NBA playoff series. This code will serve as a prototype for our later efforts to compute playoff series win probabilities.

We will use the `permutations()` function in the [`itertools`](https://docs.python.org/3/library/itertools.html#module-itertools) module from the Python standard library.

In [1]:
import itertools
from collections import Counter

### Series Home Court Advantage

In any NBA playoff series, one team has series home court advantage. In the [modern best-of-7 NBA playoff series](https://en.wikipedia.org/wiki/NBA_playoffs#Format), games 1, 2, 5 and 7 are played on the home court of the team with series home court advantage. The other 3 games (3, 4, and 6) are played on the other team's home court. You can learn more about how a team earns playoff series home court advantage [here](https://www.quora.com/How-is-the-home-court-advantage-determined-in-the-NBA-playoffs).

For a particular playoff series, we will refer to the team with series home court advantage as Team 1, and the other team as Team 2. Using this convention, we can specify where a playoff game was played using either a `'1'` or a `'2'` character. We can also specify which team won the playoff game using the same characters.
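The venue convention for a best-of-7 series can be written down directly. The following is a minimal sketch based on the game schedule described above; the names `HOME_COURT_BEST_OF_7` and `home_team` are hypothetical and are not used elsewhere in this notebook.

```python
# Venue of each game in a best-of-7 series, using the '1'/'2' convention:
# games 1, 2, 5 and 7 are hosted by Team 1; games 3, 4 and 6 by Team 2.
HOME_COURT_BEST_OF_7 = ('1', '1', '2', '2', '1', '2', '1')


def home_team(game_number):
    """Return '1' or '2' for the host of a 1-based game number (hypothetical helper)."""
    if not 1 <= game_number <= 7:
        raise ValueError('game number must be between 1 and 7')
    return HOME_COURT_BEST_OF_7[game_number - 1]


print(home_team(5))  # game 5 is played on Team 1's court
```

Storing venues and winners with the same two characters means a game was won by the home team exactly when the two characters at that position match.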

Imagine that we have a best-of-5 playoff series. (Prior to the 2002-03 NBA season, first-round playoff series were best-of-5.) Let's look at one possible sequence of outcomes for 5 games.

In [2]:
game_winners = ('2', '2', '1', '2', '1')

### Creating Valid Playoff Series Outcomes

This example is not actually a valid NBA best-of-5 playoff series. The series should never have gotten to 5 games. Team 2 should have won the series after the fourth game. We want to adjust the above outcome to reflect that the series should have ended after 4 games.

Here's how we can do that. First, we need a function to find where the last `'1'` or `'2'` character occurs. We can use some Python [slicing](https://www.dotnetperls.com/slice-python) with the usual Python [`index()`](https://docs.python.org/3.6/tutorial/datastructures.html) method to find the last occurrence of the item. This approach works on any Python sequence that supports slicing and implements the `index()` method (in particular, standard `list` or `tuple` objects).

In [3]:
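def last_index(outcome, c):
    """1-based position of the last occurrence of an item in a list or tuple"""
    # Reverse the sequence, find the first match, and convert back to a
    # position counted from the front. The result can be used as a slice end.
    return len(outcome) - outcome[::-1].index(c)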

Suppose we know that the team without series home court advantage won this playoff series. Here's how we use our function to find the decisive game.

In [4]:
last_index(game_winners, '2')

4
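Note that the result is 4 rather than the zero-based index 3: the function returns a game count that can be used directly as a slice end. The same arithmetic can be traced step by step. This is only an illustrative sketch, and the intermediate variable names are hypothetical.

```python
game_winners = ('2', '2', '1', '2', '1')

# Reverse the sequence so the last occurrence becomes the first one.
reversed_winners = game_winners[::-1]  # ('1', '2', '1', '2', '2')

# Position of the first '2' when counting from the end of the series.
offset_from_end = reversed_winners.index('2')  # 1

# Convert back to a count from the start: 5 - 1 = 4 games played.
games_played = len(game_winners) - offset_from_end

print(games_played)                    # number of the deciding game
print(game_winners[games_played - 1])  # winner of the deciding game
```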

From this we know that the series should have ended after 4 games. Let's use slicing again to chop off the game that would never actually have been played.

In [5]:
game_winners[:last_index(game_winners, '2')]

('2', '2', '1', '2')

We can turn this idea into a general function. The function will accept a sequence of game winners (represented by `'1'` and `'2'` characters), the series length (`best_of`), and a flag indicating whether the team with series home court advantage won the series (`shca_wins`). It will return the truncated outcome representing the games that would actually have been played.

In [6]:
def valid_series(game_winners, *, best_of, shca_wins):
    """Return valid NBA playoff series outcome"""
    assert best_of in (5, 7)
    c = '1' if shca_wins else '2'  # Character for counting wins
    wins = 4 if best_of == 7 else 3
    assert game_winners.count(c) == wins  # Sanity check the input
    return game_winners[:last_index(game_winners, c)]

In [7]:
valid_series(game_winners, best_of=5, shca_wins=False)

('2', '2', '1', '2')

Notice that if we happen to pass in a sequence that is already valid, we get back the same (valid) sequence.

In [8]:
valid_series(('2', '2', '1', '2'), best_of=5, shca_wins=False)

('2', '2', '1', '2')
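`valid_series()` needs to be told which team won. As an alternative sketch (not the approach used in the rest of this notebook), the clinching game can also be found by counting wins as the games are played. The name `truncate_series` is hypothetical.

```python
def truncate_series(game_winners, *, best_of):
    """Cut a sequence of game winners at the clinching game (hypothetical helper)."""
    wins_needed = best_of // 2 + 1  # 3 for best-of-5, 4 for best-of-7
    wins = {'1': 0, '2': 0}
    for games_played, winner in enumerate(game_winners, start=1):
        wins[winner] += 1
        if wins[winner] == wins_needed:
            # Stop at the first game in which either team clinches.
            return tuple(game_winners[:games_played])
    raise ValueError('no team reached the required number of wins')


print(truncate_series(('2', '2', '1', '2', '1'), best_of=5))
```

Unlike `valid_series()`, this version does not check that the series winner has exactly the required number of wins in the input, so it is more permissive about what it accepts.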

### Generating Possible Outcomes

Now it's time to generate all the possible playoff series outcomes.

Continuing with our best-of-5 series as an example, let's generate all possible sequences where Team 1 wins the series, by winning 3 games.

In [9]:
wins = '1'*3 + '2'*2
set(itertools.permutations(wins))

{('1', '1', '1', '2', '2'),
 ('1', '1', '2', '1', '2'),
 ('1', '1', '2', '2', '1'),
 ('1', '2', '1', '1', '2'),
 ('1', '2', '1', '2', '1'),
 ('1', '2', '2', '1', '1'),
 ('2', '1', '1', '1', '2'),
 ('2', '1', '1', '2', '1'),
 ('2', '1', '2', '1', '1'),
 ('2', '2', '1', '1', '1')}

The `permutations()` function returns an [iterator](https://dbader.org/blog/python-iterators) of all the possible permutations of the 5-character string. Because `permutations()` treats elements as distinct based on their position rather than their value, many of the items returned by this iterator are duplicates. That is why we use `set` above to get the unique outcomes.
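The amount of duplication can be checked directly. The sketch below compares the raw and unique permutation counts against the usual counting formulas; it assumes Python 3.8 or later for `math.comb()`.

```python
import itertools
import math

wins = '1' * 3 + '2' * 2
all_permutations = list(itertools.permutations(wins))
unique_permutations = set(all_permutations)

print(len(all_permutations))     # 120, i.e. 5!
print(len(unique_permutations))  # 10, i.e. C(5, 3)

# Each unique sequence appears 3! * 2! = 12 times among the raw permutations.
assert len(all_permutations) == math.factorial(5)
assert len(unique_permutations) == math.comb(5, 3)
```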

We can use our function above to prune each possible permutation and make sure it is a valid NBA playoff outcome. Here's a general function that does what we need.

In [10]:
def playoff_outcomes(*, best_of, shca_wins):
    """Possible playoff series outcomes"""
    if best_of not in (5, 7):
        raise ValueError('playoff series must be best of 5 or 7 games')
    if shca_wins:
        team_1 = 4 if best_of == 7 else 3
        team_2 = best_of - team_1
    else:
        team_2 = 4 if best_of == 7 else 3
        team_1 = best_of - team_2
    wins = '1'*team_1 + '2'*team_2
    game_winners = set(itertools.permutations(wins))
    outcomes = [
        valid_series(outcome, best_of=best_of, shca_wins=shca_wins)
        for outcome in game_winners
    ]
    return sorted(outcomes, key=len)

Here are the possible outcomes for a best-of-5 series, where the team without series home court advantage wins.

In [11]:
playoff_outcomes(best_of=5, shca_wins=False)

[('2', '2', '2'),
 ('2', '1', '2', '2'),
 ('2', '2', '1', '2'),
 ('1', '2', '2', '2'),
 ('1', '1', '2', '2', '2'),
 ('2', '2', '1', '1', '2'),
 ('1', '2', '2', '1', '2'),
 ('2', '1', '1', '2', '2'),
 ('2', '1', '2', '1', '2'),
 ('1', '2', '1', '2', '2')]

In [12]:
Counter(len(outcome) for outcome in playoff_outcomes(best_of=5, shca_wins=False))

Counter({3: 1, 4: 3, 5: 6})

Here are the possible outcomes of a best-of-7 series, where the team with series home court advantage wins.

In [13]:
playoff_outcomes(best_of=7, shca_wins=True)

[('1', '1', '1', '1'),
 ('1', '1', '1', '2', '1'),
 ('1', '2', '1', '1', '1'),
 ('1', '1', '2', '1', '1'),
 ('2', '1', '1', '1', '1'),
 ('1', '1', '2', '2', '1', '1'),
 ('2', '1', '1', '1', '2', '1'),
 ('2', '2', '1', '1', '1', '1'),
 ('2', '1', '2', '1', '1', '1'),
 ('2', '1', '1', '2', '1', '1'),
 ('1', '1', '1', '2', '2', '1'),
 ('1', '1', '2', '1', '2', '1'),
 ('1', '2', '1', '2', '1', '1'),
 ('1', '2', '1', '1', '2', '1'),
 ('1', '2', '2', '1', '1', '1'),
 ('1', '1', '2', '1', '2', '2', '1'),
 ('2', '1', '2', '1', '2', '1', '1'),
 ('2', '1', '2', '1', '1', '2', '1'),
 ('2', '1', '2', '2', '1', '1', '1'),
 ('1', '2', '2', '2', '1', '1', '1'),
 ('2', '2', '1', '1', '1', '2', '1'),
 ('1', '1', '1', '2', '2', '2', '1'),
 ('2', '2', '2', '1', '1', '1', '1'),
 ('2', '2', '1', '1', '2', '1', '1'),
 ('2', '2', '1', '2', '1', '1', '1'),
 ('2', '1', '1', '1', '2', '2', '1'),
 ('1', '2', '2', '1', '2', '1', '1'),
 ('1', '2', '1', '1', '2', '2', '1'),
 ('2', '1', '1', '2', '2', '1', '1'),
 ('

Let's look at the distribution of series length (in games).

In [14]:
Counter(len(outcome) for outcome in playoff_outcomes(best_of=7, shca_wins=True))

Counter({4: 1, 5: 4, 6: 10, 7: 20})

Now, let's generate the outcomes where the other team (without series home court advantage) wins the series.

In [15]:
playoff_outcomes(best_of=7, shca_wins=False)

[('2', '2', '2', '2'),
 ('2', '1', '2', '2', '2'),
 ('1', '2', '2', '2', '2'),
 ('2', '2', '2', '1', '2'),
 ('2', '2', '1', '2', '2'),
 ('1', '1', '2', '2', '2', '2'),
 ('2', '1', '1', '2', '2', '2'),
 ('2', '1', '2', '1', '2', '2'),
 ('2', '2', '2', '1', '1', '2'),
 ('2', '1', '2', '2', '1', '2'),
 ('2', '2', '1', '2', '1', '2'),
 ('1', '2', '1', '2', '2', '2'),
 ('2', '2', '1', '1', '2', '2'),
 ('1', '2', '2', '2', '1', '2'),
 ('1', '2', '2', '1', '2', '2'),
 ('2', '2', '1', '2', '1', '1', '2'),
 ('1', '2', '1', '1', '2', '2', '2'),
 ('1', '1', '2', '2', '2', '1', '2'),
 ('1', '2', '2', '1', '1', '2', '2'),
 ('1', '1', '2', '2', '1', '2', '2'),
 ('2', '1', '1', '1', '2', '2', '2'),
 ('2', '1', '2', '1', '2', '1', '2'),
 ('1', '2', '1', '2', '2', '1', '2'),
 ('2', '1', '2', '1', '1', '2', '2'),
 ('1', '1', '1', '2', '2', '2', '2'),
 ('2', '2', '2', '1', '1', '1', '2'),
 ('2', '1', '1', '2', '2', '1', '2'),
 ('2', '1', '1', '2', '1', '2', '2'),
 ('1', '2', '2', '2', '1', '1', '2'),
 ('

In [16]:
Counter(len(outcome) for outcome in playoff_outcomes(best_of=7, shca_wins=False))

Counter({4: 1, 5: 4, 6: 10, 7: 20})

In summary, there are 70 total possible best-of-7 playoff series outcomes. There are 35 outcomes for each team in which that team wins the series.
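These counts agree with a simple closed-form argument: for a team to win in exactly n games, it must win game n and also win all but one of its required games among the first n - 1 games. The sketch below checks this; the helper name `series_length_counts` is hypothetical, and `math.comb()` requires Python 3.8 or later.

```python
import math


def series_length_counts(best_of):
    """Ways one team can win a series in exactly n games (hypothetical helper)."""
    wins_needed = best_of // 2 + 1
    # The winner takes game n plus wins_needed - 1 of the first n - 1 games.
    return {
        n: math.comb(n - 1, wins_needed - 1)
        for n in range(wins_needed, best_of + 1)
    }


counts = series_length_counts(7)
print(counts)                    # {4: 1, 5: 4, 6: 10, 7: 20}
print(sum(counts.values()))      # 35 outcomes per winning team
print(2 * sum(counts.values()))  # 70 outcomes in total
```

Calling `series_length_counts(5)` gives the best-of-5 distribution in the same way.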