# Introduction

We now move to the final part of the work: formulating business recommendations. Our tasks are:
- Determining the global betting odds,
- Dividing the teams in the dataset into categories A, B, C, and D, where A is the strongest group and D is the weakest,
- Determining the risk of the odds for each category, based on the adopted parameters.

# Notebook Configuration

## Import necessary libraries

In [9]:
import pandas as pd

## Loading data into the workspace

In [10]:
df = pd.read_csv(r"..\data\processed\hockey_teams.csv", sep=";")

### Checking data loading accuracy

In [11]:
df.head()

Unnamed: 0,team,season,victories,defeats,overtime_defeats,victory_percentage,scored_goals,received_goals,goal_difference,% of defeats in overtime,goals_ratio
0,Boston Bruins,1990,44,24,0,55.0,299,264,35,0.0,1.13
1,Buffalo Sabres,1990,31,30,0,38.8,292,278,14,0.0,1.05
2,Calgary Flames,1990,46,26,0,57.5,344,263,81,0.0,1.31
3,Chicago Blackhawks,1990,49,23,0,61.3,284,211,73,0.0,1.35
4,Detroit Red Wings,1990,34,38,0,42.5,273,298,-25,0.0,0.92


# Determining Betting Odds

Let's review this page: [trustbet.pl/kursy-bukmacherskie](https://trustbet.pl/kursy-bukmacherskie/), which describes methods for determining betting odds. First, we will determine the global odds, which serve as the starting point for our analysis (the so-called _baseline scenario_). At this point, we ignore the bookmaker's margin and calculate decimal odds, i.e. the reciprocal of the win probability.

## Implementation of the `get_betting_odds` function

In [12]:
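def get_betting_odds(probability):
    """Return decimal odds (without margin) for a win probability in (0, 1]."""
    # Decimal odds are the reciprocal of the probability.
    return 1 / probability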

### Some tests to check the correctness of the implementation

In [13]:
def test_get_betting_odds():
    assert get_betting_odds(1) == 1, "Expected 1"
    assert get_betting_odds(0.5) == 2, "Expected 2"
    assert get_betting_odds(0.25) == 4, "Expected 4"
    assert get_betting_odds(0.1) == 10, "Expected 10"
    try:
        get_betting_odds(0)
    except ZeroDivisionError:
        pass
    else:
        assert False, "Expected ZeroDivisionError"

    print("All tests passed!")

test_get_betting_odds()

All tests passed!
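
The tests above cover valid probabilities and zero, but `get_betting_odds` would also accept values that are not probabilities at all (a negative input yields negative odds). A minimal sketch of a stricter variant follows; the name `get_betting_odds_checked` and the choice of `ValueError` are assumptions for illustration, not part of the original notebook.

```python
def get_betting_odds_checked(probability):
    """Hypothetical stricter variant: decimal odds for a probability in (0, 1]."""
    # Reject anything that is not a valid, non-zero probability.
    if not 0 < probability <= 1:
        raise ValueError(f"probability must be in (0, 1], got {probability}")
    return 1 / probability


print(get_betting_odds_checked(0.5))  # 2.0
```

Note that under this sketch a probability of 0 raises `ValueError` rather than `ZeroDivisionError`, so the existing test would need adjusting if this variant were adopted.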


### Determining the global odds

Here, we determine the probability that any given team wins, taken as the mean of `victory_percentage` across all rows.

In [14]:
df.describe(percentiles=[0, 0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95, 1]).round(2)

Unnamed: 0,season,victories,defeats,overtime_defeats,victory_percentage,scored_goals,received_goals,goal_difference,% of defeats in overtime,goals_ratio
count,582.0,582.0,582.0,582.0,582.0,582.0,582.0,582.0,582.0,582.0
mean,2000.91,36.94,32.35,4.59,45.85,234.06,234.06,0.0,11.75,1.02
std,6.33,8.93,8.41,4.61,10.22,40.55,42.51,45.28,11.38,0.19
min,1990.0,9.0,11.0,0.0,11.9,115.0,115.0,-196.0,0.0,0.51
0%,1990.0,9.0,11.0,0.0,11.9,115.0,115.0,-196.0,0.0,0.51
5%,1991.0,21.05,20.0,0.0,28.03,175.05,168.05,-72.0,0.0,0.73
10%,1992.0,24.1,23.0,0.0,32.1,190.1,187.0,-57.0,0.0,0.79
25%,1996.0,31.0,27.0,0.0,39.0,211.0,207.0,-27.0,0.0,0.88
50%,2001.0,38.0,31.0,4.0,46.3,231.0,232.5,4.0,10.79,1.02
75%,2007.0,43.0,37.0,8.0,53.4,254.0,258.75,31.0,21.39,1.14


We will set the global odds here using the `get_betting_odds` function, taking the mean `victory_percentage` (45.85%) as the win probability.

In [15]:
# Mean victory_percentage from df.describe() (45.85%), expressed as a probability
team_winning = 0.4585

get_betting_odds(team_winning)

2.1810250817884405
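
The value 45.85 was copied by hand from the `describe()` output. The sketch below derives the same quantity directly from the data, assuming `victory_percentage` is stored on a 0-100 scale as shown above. The small `sample` frame reuses the five rows printed by `df.head()`, and the helper name `global_odds` is hypothetical.

```python
import pandas as pd

# The five rows shown by df.head(); the full dataset would be used in practice.
sample = pd.DataFrame({
    "team": ["Boston Bruins", "Buffalo Sabres", "Calgary Flames",
             "Chicago Blackhawks", "Detroit Red Wings"],
    "victory_percentage": [55.0, 38.8, 57.5, 61.3, 42.5],
})


def global_odds(frame):
    """Hypothetical helper: decimal odds from the mean victory percentage."""
    # Convert the mean percentage (0-100) into a probability (0-1).
    probability = frame["victory_percentage"].mean() / 100
    return 1 / probability


print(round(global_odds(sample), 2))  # 1.96 for these five rows only
```

On the full dataset, `global_odds(df)` should agree with the value above up to the rounding of 45.85.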

# Team Categorization

Let's discuss how we can classify teams into _leagues_. We want to establish 4 leagues:
- A - league consisting of the best teams,
- B - league consisting of good teams,
- C - league consisting of average teams,
- D - league consisting of the weakest teams.

The above terms are quite subjective, so for the purpose of this task, we will adopt the following assumptions:
- A - the top 5% of teams,
- B - teams performing better than 70% of the group but worse than league A,
- C - teams performing better than 20% of the group but worse than league B,
- D - the remaining teams.

To accomplish this task, we will additionally implement the function `assign_team_to_league`.

## Determination of cutoff points for individual leagues

In [16]:
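def assign_team_to_league(x):
    """Return the league ("A", "B", "C" or "D") for the team named x."""
    # Mean victory percentage per team across all seasons
    victory_per = pd.pivot_table(
        df,
        values="victory_percentage",
        index="team",
        aggfunc="mean")

    # Rank teams from best to worst
    victory_per_sorted = victory_per.sort_values(
        by="victory_percentage",
        ascending=False)

    # Number of teams in the top 5%, 30% and 80%. The factor 0.35 appears to
    # stand for 35 teams / 100 (an assumption; the notebook does not state it).
    # round() rounds halves to even, so a product of 10.5 yields 10.
    top5 = int(round(0.35 * 5, 0))
    top30 = int(round(0.35 * 30, 0))
    top80 = int(round(0.35 * 80, 0))

    top5_teams = victory_per_sorted.index[:top5].tolist()
    top30_teams = victory_per_sorted.index[:top30].tolist()
    top80_teams = victory_per_sorted.index[:top80].tolist()

    # The lists are nested, so check the narrowest (best) band first
    if x in top5_teams:
        league = "A"
    elif x in top30_teams:
        league = "B"
    elif x in top80_teams:
        league = "C"
    else:
        league = "D"

    return league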


In [17]:
teams = df["team"].unique()

for team in teams:
    group = assign_team_to_league(team)
    print(f"{team} belongs to the {group} group.")

Boston Bruins belongs to the B group.
Buffalo Sabres belongs to the C group.
Calgary Flames belongs to the C group.
Chicago Blackhawks belongs to the C group.
Detroit Red Wings belongs to the A group.
Edmonton Oilers belongs to the C group.
Hartford Whalers belongs to the D group.
Los Angeles Kings belongs to the C group.
Minnesota North Stars belongs to the D group.
Montreal Canadiens belongs to the C group.
New Jersey Devils belongs to the A group.
New York Islanders belongs to the D group.
New York Rangers belongs to the C group.
Philadelphia Flyers belongs to the B group.
Pittsburgh Penguins belongs to the B group.
Quebec Nordiques belongs to the C group.
St. Louis Blues belongs to the B group.
Toronto Maple Leafs belongs to the C group.
Vancouver Canucks belongs to the B group.
Washington Capitals belongs to the C group.
Winnipeg Jets belongs to the D group.
Atlanta Thrashers belongs to the D group.
Carolina Hurricanes belongs to the C group.
Colorado Avalanche belongs to the B gr
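
`assign_team_to_league` rebuilds the pivot table on every call and hard-codes its cutoffs. As a sketch of an alternative, the function below assigns all leagues in one pass and derives the cutoffs from the number of teams. The name `assign_leagues`, its `cutoffs` parameter, and the 20 made-up teams are assumptions for illustration; because rounding at the boundaries may differ, it is not guaranteed to reproduce the grouping printed above.

```python
import pandas as pd


def assign_leagues(frame, cutoffs=(0.05, 0.30, 0.80)):
    """Hypothetical helper: map each team to league A, B, C or D."""
    # Rank teams by their mean victory percentage, best first.
    ranked = (frame.groupby("team")["victory_percentage"]
              .mean()
              .sort_values(ascending=False))
    n_teams = len(ranked)
    # Cumulative team counts for the top 5%, 30% and 80%.
    n_a, n_b, n_c = (round(n_teams * share) for share in cutoffs)

    leagues = pd.Series("D", index=ranked.index, name="league")
    # Assign from the widest band to the narrowest so better leagues overwrite.
    leagues.iloc[:n_c] = "C"
    leagues.iloc[:n_b] = "B"
    leagues.iloc[:n_a] = "A"
    return leagues


# Made-up data: 20 teams, two seasons each, with distinct averages.
demo = pd.DataFrame({
    "team": [f"Team {i:02d}" for i in range(20)] * 2,
    "victory_percentage": [30.0 + i for i in range(20)]
                          + [32.0 + i for i in range(20)],
})
leagues = assign_leagues(demo)
print(leagues.value_counts().sort_index())
```

With 20 teams the bands contain 1, 5, 10 and 4 teams for A, B, C and D respectively. The result could be attached to the main frame with `df["team"].map(leagues)`.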

## Determination of odds per league

Here we set the betting odds for each league, which will allow us to draw final conclusions and establish the basic odds for individual teams.

In [18]:
df["group"] = df["team"].apply(assign_team_to_league)

groups = pd.pivot_table(
    df,
    values="victory_percentage",
    index="group",
    aggfunc="mean"
)

groups

group,victory_percentage
A,56.016667
B,49.665517
C,44.678797
D,38.140506


Now, we can calculate the final base betting odds.

In [19]:
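def base_odds(group1, group2):
    """Return base decimal odds for a group1 team winning against a group2 team."""
    # Mean victory percentage per league
    groups = pd.pivot_table(
        df,
        values="victory_percentage",
        index="group",
        aggfunc="mean")

    p1 = groups.loc[group1, "victory_percentage"]
    p2 = groups.loc[group2, "victory_percentage"]

    # Normalize so that the two win probabilities sum to 1
    prob = p1 / (p1 + p2)

    return get_betting_odds(prob)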

In [23]:
base_odds("D","A")

2.468692265994048
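
To see the base odds for every pairing at once, the calculation in `base_odds` can be applied to all league combinations. The sketch below hard-codes the league means from the pivot table above rather than reading `df`; the function name `odds_matrix` is hypothetical.

```python
import pandas as pd

# Mean victory percentage per league, copied from the pivot table above.
group_means = pd.Series(
    {"A": 56.016667, "B": 49.665517, "C": 44.678797, "D": 38.140506}
)


def odds_matrix(means):
    """Hypothetical helper: decimal odds for the row league beating the column league."""
    rows = {}
    for g1, p1 in means.items():
        # 1 / (p1 / (p1 + p2)) simplifies to (p1 + p2) / p1.
        rows[g1] = {g2: (p1 + p2) / p1 for g2, p2 in means.items()}
    return pd.DataFrame.from_dict(rows, orient="index")


matrix = odds_matrix(group_means)
print(matrix.round(2))
```

Because no margin is included, the implied probabilities of the two sides of a pairing sum to 1, for example `1 / matrix.loc["A", "D"] + 1 / matrix.loc["D", "A"]`.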

# Discussion

We have obtained base odds for each league. But how does this translate into real business? The goal of the task was to determine starting values from which a bookmaker can begin operations. Setting these values correctly is critical: attractive odds draw customers to place bets with us, while poorly set odds may lead to financial losses in the first days of operation.

For the purposes of this assignment, the calculation of betting odds was intentionally simplified.