https://statsbomb.com/soccer-metrics/expected-goals-xg-explained/

The article linked above explains the "expected goals" (xG) metric, which is fairly new in the world of football. Here's the TL;DR version if you're not up for a lesson:

xG stands for expected goals and is a statistical measure that quantifies the quality of goal-scoring chances in a football match. It assigns a numerical value between 0 and 1 to each scoring opportunity, representing the likelihood of a goal being scored from that chance.

xG takes into account various factors such as the location of the shot, the angle, the distance from the goal, the body part used to shoot, and sometimes additional factors like the number of defenders in the way or the assist type. By analyzing large amounts of historical data, statisticians and analysts can assign probabilities to each type of chance.
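
As a minimal sketch of that last step, assume a hypothetical list of historical shots, each with a `zone` label and a `goal` flag (both names are illustrative and not taken from any real xG provider). The probability for each type of chance can then be estimated as the share of past shots of that type that were scored:

```python
from collections import defaultdict

def estimate_chance_probabilities(historical_shots):
    """Return {zone: share of shots from that zone that were scored}."""
    attempts = defaultdict(int)
    goals = defaultdict(int)
    for shot in historical_shots:
        attempts[shot["zone"]] += 1
        goals[shot["zone"]] += int(shot["goal"])
    return {zone: goals[zone] / attempts[zone] for zone in attempts}

# Tiny made-up history, for illustration only
history = [
    {"zone": "six-yard box", "goal": True},
    {"zone": "six-yard box", "goal": False},
    {"zone": "outside box", "goal": False},
    {"zone": "outside box", "goal": False},
    {"zone": "outside box", "goal": True},
    {"zone": "outside box", "goal": False},
]
probabilities = estimate_chance_probabilities(history)
print(probabilities)
```

Real models work with far more data and typically fit a statistical model to many features at once rather than counting raw frequencies per bucket.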

This metric is valuable because it helps to assess the performance of a team or player beyond just the number of goals scored. It provides a more accurate picture of the quality of scoring opportunities created or conceded, allowing for a better understanding of a team's attacking or defensive effectiveness. xG is widely used in modern football analysis and is often referenced in match reports, player evaluations, and tactical discussions.


Could we introduce an analogous metric, xB ("expected baskets"), for basketball? Which inputs should it use?

### 1. Defining a function to calculate xB

In [2]:
def calculate_xB(shots_data):
    total_xB = 0
    # Iterate through each shot in the dataset
    for shot in shots_data:
        xB = calculate_shot_xB(shot)
        total_xB += xB
    return total_xB

### 2. Defining a function to calculate xB value for a specific shot

In [3]:
def calculate_shot_xB(shot):
    # Perform calculations to determine xB for each shot
    # Factors to consider may include shot distance, shot type, defender proximity, etc.
    xB = 0.5  # Placeholder value, replace with actual calculation

    return xB
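
One way the placeholder might eventually be replaced is with a logistic function of the shot's features. The sketch below is only an illustration: the function name `sketch_shot_xB` and every coefficient are made-up assumptions, not values fitted to real data.

```python
import math

# Hypothetical coefficients, chosen only for illustration (not fitted to data)
INTERCEPT = 0.8
DISTANCE_COEF = -0.08   # longer shots are assumed to be harder
DEFENDER_COEF = 0.10    # more space from the defender is assumed to help
TYPE_OFFSETS = {"dunk": 1.5, "layup": 0.5, "jump shot": 0.0, "three-pointer": -0.3}

def sketch_shot_xB(shot):
    """Map shot features to a probability in (0, 1) with a logistic function."""
    score = (
        INTERCEPT
        + DISTANCE_COEF * shot["distance"]
        + DEFENDER_COEF * shot["defender_distance"]
        + TYPE_OFFSETS.get(shot["type"], 0.0)  # unknown types get no offset
    )
    return 1.0 / (1.0 + math.exp(-score))

close_layup = {"distance": 5, "type": "layup", "defender_distance": 2}
contested_jumper = {"distance": 20, "type": "jump shot", "defender_distance": 1}
print(round(sketch_shot_xB(close_layup), 3))
print(round(sketch_shot_xB(contested_jumper), 3))
```

The logistic form keeps every value strictly between 0 and 1, which matches the xG convention described earlier. With real data the coefficients would be estimated, for example by logistic regression, instead of being set by hand.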

### 3. Sample database

In [3]:
# Example usage
shots_data = [
    {'distance': 5, 'type': 'layup', 'defender_distance': 2},
    {'distance': 20, 'type': 'jump shot', 'defender_distance': 1},
    # Add more shot data as needed
]

### 4. Adding more sample data with a random generator

In [14]:
import random

def generate_random_shot():
    distance = random.randint(1, 30)
    types = ['layup', 'jump shot', 'dunk', 'three-pointer']  # Add more types as needed
    defender_distance = random.randint(1, 10)
    
    shot_result = random.choices(['scored', 'missed'], weights=[0.65, 0.35])[0]
    team = random.choices(['home', 'away'], weights=[0.55, 0.45])[0]
    
    return {'distance': distance, 'type': random.choice(types), 'defender_distance': defender_distance, 'result': shot_result, 'team': team}

# Generate 5000 random shots
shots_data = []
for _ in range(5000):
    shot = generate_random_shot()
    shots_data.append(shot)

# Print the generated shots data
for i, shot in enumerate(shots_data):
    print(f"Shot {i + 1}: {shot}")



Shot 1: {'distance': 2, 'type': 'three-pointer', 'defender_distance': 3, 'result': 'scored', 'team': 'away'}
Shot 2: {'distance': 9, 'type': 'layup', 'defender_distance': 6, 'result': 'scored', 'team': 'away'}
Shot 3: {'distance': 1, 'type': 'dunk', 'defender_distance': 5, 'result': 'missed', 'team': 'home'}
Shot 4: {'distance': 4, 'type': 'layup', 'defender_distance': 5, 'result': 'scored', 'team': 'home'}
Shot 5: {'distance': 26, 'type': 'layup', 'defender_distance': 6, 'result': 'scored', 'team': 'away'}
Shot 6: {'distance': 12, 'type': 'dunk', 'defender_distance': 5, 'result': 'missed', 'team': 'away'}
Shot 7: {'distance': 4, 'type': 'dunk', 'defender_distance': 8, 'result': 'scored', 'team': 'away'}
Shot 8: {'distance': 24, 'type': 'layup', 'defender_distance': 2, 'result': 'scored', 'team': 'home'}
Shot 9: {'distance': 18, 'type': 'three-pointer', 'defender_distance': 6, 'result': 'missed', 'team': 'home'}
Shot 10: {'distance': 15, 'type': 'three-pointer', 'defender_distance': 4,
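
Printing all 5000 shots makes the output hard to read. As a sketch, a seeded generator plus a short summary gives a reproducible overview. The seed value and the helper name `summarize_shots` are arbitrary choices for this example; the generator mirrors the fields and weights of the one above.

```python
import random
from collections import Counter

def generate_random_shot(rng):
    # Same fields and weights as the generator above, but drawing from a
    # dedicated random.Random instance so that runs are repeatable.
    types = ['layup', 'jump shot', 'dunk', 'three-pointer']
    return {
        'distance': rng.randint(1, 30),
        'type': rng.choice(types),
        'defender_distance': rng.randint(1, 10),
        'result': rng.choices(['scored', 'missed'], weights=[0.65, 0.35])[0],
        'team': rng.choices(['home', 'away'], weights=[0.55, 0.45])[0],
    }

def summarize_shots(shots):
    """Count shots by result and by team instead of printing every row."""
    return {
        'total': len(shots),
        'by_result': Counter(shot['result'] for shot in shots),
        'by_team': Counter(shot['team'] for shot in shots),
    }

rng = random.Random(42)  # arbitrary seed
sample_shots = [generate_random_shot(rng) for _ in range(5000)]
summary = summarize_shots(sample_shots)
print(summary)
```

Because the shot type is drawn independently of the distance, combinations such as a three-pointer from distance 2 can appear, as Shot 1 in the output above shows.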

### 5. Testing the model on random data

In [12]:
def calculate_xB(shots_data):
    total_xB = 0
    for shot in shots_data:
        xB = calculate_shot_xB(shot)
        total_xB += xB
    return total_xB

def calculate_shot_xB(shot):
    # Perform calculations to determine xB for each shot
    # Factors to consider may include shot distance, shot type, defender proximity, etc.
    xB = 0.5  # Placeholder value, replace with actual calculation
    if shot['result'] == 'scored':
        xB *= calculate_scored_xB(shot)
    else:
        xB *= calculate_missed_xB(shot)

    return xB

def calculate_scored_xB(shot):
    # Perform calculations for xB when the shot is scored
    # Modify the calculation based on the factors specific to scored shots
    xB_scored = 1  # Placeholder value, replace with actual calculation

    return xB_scored

def calculate_missed_xB(shot):
    # Perform calculations for xB when the shot is missed
    # Modify the calculation based on the factors specific to missed shots
    xB_missed = 0.45 # Placeholder value, replace with actual calculation

    return xB_missed

# Example usage: reuses the shots_data list generated in step 4

expected_baskets = calculate_xB(shots_data)
print(f"Expected Baskets: {expected_baskets}")


Expected Baskets: 405.95000000000186
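
A common sanity check for an expected-value metric is to compare its total with the number of baskets actually scored. The sketch below assumes the same dictionary layout as above and uses a hypothetical helper, `compare_expected_to_actual`; the per-shot values are the placeholders from this step, not fitted probabilities.

```python
def compare_expected_to_actual(shots, shot_xB):
    """Return (total expected baskets, actual baskets scored)."""
    expected = sum(shot_xB(shot) for shot in shots)
    actual = sum(1 for shot in shots if shot['result'] == 'scored')
    return expected, actual

def placeholder_shot_xB(shot):
    # Placeholder values from the cell above:
    # 0.5 * 1 for a scored shot, 0.5 * 0.45 for a missed one
    return 0.5 if shot['result'] == 'scored' else 0.5 * 0.45

# Made-up demo data: 13 scored shots and 7 missed shots
demo_shots = [{'result': 'scored'}] * 13 + [{'result': 'missed'}] * 7
expected, actual = compare_expected_to_actual(demo_shots, placeholder_shot_xB)
print(f"Expected: {expected:.3f}, actual: {actual}")
```

With these placeholders the expected total stays well below the actual count, which suggests the values still need tuning. xG-style metrics are normally computed from pre-shot features only, so a version that reads `shot['result']` can serve as scaffolding but probably not as the final calculation.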


### 6. Adding home and away as a metric (the sample data code in step 4 is updated)

### 7. Making calculations for expected baskets for home and away team

In [16]:
def calculate_xB(shots_data):
    total_home_xB = 0
    total_away_xB = 0
    
    for shot in shots_data:
        xB = calculate_shot_xB(shot)
        if shot['team'] == 'home':
            total_home_xB += xB
        elif shot['team'] == 'away':
            total_away_xB += xB
    
    return total_home_xB, total_away_xB

def calculate_shot_xB(shot):
    # Perform calculations to determine xB for each shot
    # Factors to consider may include shot distance, shot type, defender proximity, etc.
    xB = 0.5  # Placeholder value, replace with actual calculation
    if shot['result'] == 'scored':
        xB *= calculate_scored_xB(shot)
    else:
        xB *= calculate_missed_xB(shot)

    return xB

def calculate_scored_xB(shot):
    # Perform calculations for xB when the shot is scored
    # Modify the calculation based on the factors specific to scored shots
    home_factor = 1.2  # Placeholder value, replace with actual calculation
    away_factor = 0.8  # Placeholder value, replace with actual calculation
    xB_scored = 1.0  # Placeholder value, replace with actual calculation

    if shot['team'] == 'home':
        xB_scored *= home_factor
    elif shot['team'] == 'away':
        xB_scored *= away_factor

    return xB_scored

def calculate_missed_xB(shot):
    # Perform calculations for xB when the shot is missed
    # Modify the calculation based on the factors specific to missed shots
    home_factor = 0.9  # Placeholder value, replace with actual calculation
    away_factor = 1.1  # Placeholder value, replace with actual calculation
    xB_missed = 0.35  # Placeholder value, replace with actual calculation

    if shot['team'] == 'home':
        xB_missed *= home_factor
    elif shot['team'] == 'away':
        xB_missed *= away_factor

    return xB_missed

# Example usage: reuses the shots_data list generated in step 4

home_baskets, away_baskets = calculate_xB(shots_data)
print(f"Expected Baskets for Home: {home_baskets}")
print(f"Expected Baskets for Away: {away_baskets}")


Expected Baskets for Home: 1213.7400000000193
Expected Baskets for Away: 742.3399999999768
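
The two running totals can also be kept in a dictionary keyed by team, which extends to more than two groups without extra branches. This is an alternative sketch rather than the notebook's implementation; `total_xB_by_team` and the constant per-shot value used in the demo are hypothetical.

```python
from collections import defaultdict

def total_xB_by_team(shots, shot_xB):
    """Sum per-shot xB values into a {team: total} dictionary."""
    totals = defaultdict(float)
    for shot in shots:
        totals[shot['team']] += shot_xB(shot)
    return dict(totals)

# Made-up demo data: three home shots and one away shot
demo_shots = [
    {'team': 'home'},
    {'team': 'away'},
    {'team': 'home'},
    {'team': 'home'},
]
totals = total_xB_by_team(demo_shots, lambda shot: 0.5)  # constant placeholder xB
print(totals)
```

Passing the per-shot function as an argument keeps the aggregation separate from the scoring logic, so either part can be changed without touching the other.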


### 8. Updating calculations with standardizer for the placeholder values

In [20]:
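def standardize_value(value, min_value, max_value):
    # NOTE: this helper is called below but was not defined in the original cell.
    # Assumed definition: min-max scaling to the range [0, 1], which roughly
    # reproduces the totals printed under this cell.
    return (value - min_value) / (max_value - min_value)

def calculate_xB(shots_data):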
    total_home_xB = 0
    total_away_xB = 0
    
    for shot in shots_data:
        xB = calculate_shot_xB(shot)
        if shot['team'] == 'home':
            total_home_xB += xB
        elif shot['team'] == 'away':
            total_away_xB += xB
    
    return total_home_xB, total_away_xB

def calculate_shot_xB(shot):
    # Perform calculations to determine xB for each shot
    # Factors to consider may include shot distance, shot type, defender proximity, etc.
    xB = 0.5  # Placeholder value, replace with actual calculation
    if shot['result'] == 'scored':
        xB *= calculate_scored_xB(shot)
    else:
        xB *= calculate_missed_xB(shot)

    return xB

def calculate_scored_xB(shot):
    # Perform calculations for xB when the shot is scored
    # Modify the calculation based on the factors specific to scored shots
    home_factor_min = 0.8  # Placeholder value, replace with the actual minimum value
    home_factor_max = 1.2  # Placeholder value, replace with the actual maximum value
    away_factor_min = 0.8  # Placeholder value, replace with the actual minimum value
    away_factor_max = 1.1  # Placeholder value, replace with the actual maximum value
    
    home_factor = 1.0  # Placeholder value, replace with the actual home factor
    away_factor = 1.0  # Placeholder value, replace with the actual away factor
    
    home_factor_std = standardize_value(home_factor, home_factor_min, home_factor_max)
    away_factor_std = standardize_value(away_factor, away_factor_min, away_factor_max)

    xB_scored = 1.0  # Placeholder value, replace with actual calculation

    if shot['team'] == 'home':
        xB_scored *= home_factor_std
    elif shot['team'] == 'away':
        xB_scored *= away_factor_std

    return xB_scored

def calculate_missed_xB(shot):
    # Perform calculations for xB when the shot is missed
    # Modify the calculation based on the factors specific to missed shots
    home_factor_min = 0.5  # Placeholder value, replace with the actual minimum value
    home_factor_max = 1  # Placeholder value, replace with the actual maximum value
    away_factor_min = 0.5 # Placeholder value, replace with the actual minimum value
    away_factor_max = 1  # Placeholder value, replace with the actual maximum value
    
    home_factor = (home_factor_min + home_factor_max) / 2
    away_factor = (away_factor_min + away_factor_max) / 2
    
    home_factor_std = standardize_value(home_factor, home_factor_min, home_factor_max)
    away_factor_std = standardize_value(away_factor, away_factor_min, away_factor_max)

    xB_missed = 0.45  # Placeholder value, replace with actual calculation

    if shot['team'] == 'home':
        xB_missed *= home_factor_std
    elif shot['team'] == 'away':
        xB_missed *= away_factor_std

    return xB_missed

# Example usage: reuses the shots_data list generated in step 4

home_baskets, away_baskets = calculate_xB(shots_data)
print(f"Expected Baskets for Home: {home_baskets}")
print(f"Expected Baskets for Away: {away_baskets}")


Expected Baskets for Home: 550.3500000000039
Expected Baskets for Away: 579.8999999999976
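
Because the generator gives the home team about 55% of the shots, raw totals partly reflect shot volume. A per-shot average is one way to separate volume from shot value. The helper name `average_xB_per_shot` is hypothetical, and the totals and counts in the demo are made-up numbers, not results from the cells above.

```python
def average_xB_per_shot(total_xB, shot_count):
    """Return total xB divided by shot count, or 0.0 when there are no shots."""
    return total_xB / shot_count if shot_count else 0.0

# Made-up totals and shot counts, for illustration only
home_avg = average_xB_per_shot(55.0, 110)
away_avg = average_xB_per_shot(54.0, 90)
print(f"Home xB per shot: {home_avg:.2f}")
print(f"Away xB per shot: {away_avg:.2f}")
```

In this made-up case the home side has the higher total but the lower value per shot, which is the kind of difference a raw total would hide.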
