# Computation Intelligence for Optimization | Sports League Optimization

`Group AM`
- Eduardo Mendes, 20240850
- Helena Duarte, 20240530
- João Freire, 20240528
- Mariana Sousa, 20240516

<div class="alert alert-block alert-info">

# Table of Contents
    
[1. Import Libraries](#1)<br>

[2. Load data](#2)<br>

<a class="anchor" id="1">

# 1. Import Libraries
    
</a>

In [63]:
import os
import pandas as pd
import numpy as np


from copy import deepcopy
from random import random, sample, choice, randint
import copy

<a class="anchor" id="2">

# 2. Load data
    
</a>

In [64]:
#data_dir= os.path.join(os.getcwd(), 'players(in).csv')

# Read the CSV file into a DataFrame
df = pd.read_csv("players(in).csv", index_col=0)
df.head()

Unnamed: 0,Name,Position,Skill,Salary (€M)
0,Alex Carter,GK,85,90
1,Jordan Smith,GK,88,100
2,Ryan Mitchell,GK,83,85
3,Chris Thompson,GK,80,80
4,Blake Henderson,GK,87,95


<a class="anchor" id="3">

# 3. Problem Definiton
    
</a>

In a fantasy sports league, the objective is to assign players to teams in a way that ensures
a balanced distribution of talent while staying within salary caps.

1) Each player is defined by the following attributes:
* Skill rating: Represents the player's ability.
* Cost: The player's salary.
* Position (One of four roles) : Goalkeeper (GK), Defender (DEF), Midfielder (MID), or Forward (FWD).

A solution is a complete league configuration, specifying the team assignment for each player. These are the constraints that must be verified in every solution of the search space (no object is considered a solution if it doesn’t comply with these):
* Each team must consist of: 1 Goalkeeper, 2 Defenders, 2 Midfielders and 2
Forwards.
* Each player is assigned to exactly one team.

*Impossible Configurations*: Teams that do not follow this exact structure (e.g., a team with 2 goalkeepers, or a team where the same defender is assigned twice) are not part of the search space and are not considered solutions. It is forbidden to generate such an arrangement during evolution.

Besides that, each team should not exceed a 750€ million total budget. If it does, it is not a valid solution and the fitness value should reflect that.

The `objective` is to create a balanced league that complies with the constraints. 
A balanced league a is a league where the average skill rating of the players is roughly the same among the teams. 
This can be measured by the standard deviation of the average skill rating of the teams.

You can find a dataset of players with their names, position, skill rating and salary (in million €).
These players should be distributed across 5 teams of 7 players each.

# 4.Representação com classes player, team and League 

In [65]:
# change the column names to lowercase
df.rename(columns={
    "Name": "name",
    "Position": "position",
    "Skill": "skill",
    "Salary (€M)": "salary"
}, inplace=True)

In [66]:
df.head(10) # show the first 10 rows of the dataframe with the renamed columns

Unnamed: 0,name,position,skill,salary
0,Alex Carter,GK,85,90
1,Jordan Smith,GK,88,100
2,Ryan Mitchell,GK,83,85
3,Chris Thompson,GK,80,80
4,Blake Henderson,GK,87,95
5,Daniel Foster,DEF,90,110
6,Lucas Bennett,DEF,85,90
7,Owen Parker,DEF,88,100
8,Ethan Howard,DEF,80,70
9,Mason Reed,DEF,82,75


In [67]:
position_order = ["GK", "DEF", "MID", "FWD"] # the order of positions

# Create a mapping dictionary: {'GK': 0, 'DEF': 1, 'MID': 2, 'FWD': 3}
# x.map(...) converts each "position" value in the DataFrame to its corresponding order index
# Sorts the DataFrame according to these mapped indices
df_sorted = df.sort_values(by="position", key=lambda x: x.map({pos: i for i, pos in enumerate(position_order)})).reset_index(drop=True)
df_sorted.head(10)

Unnamed: 0,name,position,skill,salary
0,Alex Carter,GK,85,90
1,Jordan Smith,GK,88,100
2,Ryan Mitchell,GK,83,85
3,Chris Thompson,GK,80,80
4,Blake Henderson,GK,87,95
5,Daniel Foster,DEF,90,110
6,Lucas Bennett,DEF,85,90
7,Owen Parker,DEF,88,100
8,Ethan Howard,DEF,80,70
9,Mason Reed,DEF,82,75


In [68]:
# Create a column "id" in the sorted DataFrame
df_sorted["id"]= df_sorted.index

In [69]:
df_sorted.head(10)


Unnamed: 0,name,position,skill,salary,id
0,Alex Carter,GK,85,90,0
1,Jordan Smith,GK,88,100,1
2,Ryan Mitchell,GK,83,85,2
3,Chris Thompson,GK,80,80,3
4,Blake Henderson,GK,87,95,4
5,Daniel Foster,DEF,90,110,5
6,Lucas Bennett,DEF,85,90,6
7,Owen Parker,DEF,88,100,7
8,Ethan Howard,DEF,80,70,8
9,Mason Reed,DEF,82,75,9


In [70]:
# Create a list of players with their attributes, each player is a dictionary with their attributes, such as name, position, skill level and salary
dict_players = df_sorted.to_dict(orient="records")
dict_players



[{'name': 'Alex Carter', 'position': 'GK', 'skill': 85, 'salary': 90, 'id': 0},
 {'name': 'Jordan Smith',
  'position': 'GK',
  'skill': 88,
  'salary': 100,
  'id': 1},
 {'name': 'Ryan Mitchell',
  'position': 'GK',
  'skill': 83,
  'salary': 85,
  'id': 2},
 {'name': 'Chris Thompson',
  'position': 'GK',
  'skill': 80,
  'salary': 80,
  'id': 3},
 {'name': 'Blake Henderson',
  'position': 'GK',
  'skill': 87,
  'salary': 95,
  'id': 4},
 {'name': 'Daniel Foster',
  'position': 'DEF',
  'skill': 90,
  'salary': 110,
  'id': 5},
 {'name': 'Lucas Bennett',
  'position': 'DEF',
  'skill': 85,
  'salary': 90,
  'id': 6},
 {'name': 'Owen Parker',
  'position': 'DEF',
  'skill': 88,
  'salary': 100,
  'id': 7},
 {'name': 'Ethan Howard',
  'position': 'DEF',
  'skill': 80,
  'salary': 70,
  'id': 8},
 {'name': 'Mason Reed', 'position': 'DEF', 'skill': 82, 'salary': 75, 'id': 9},
 {'name': 'Logan Brooks',
  'position': 'DEF',
  'skill': 86,
  'salary': 95,
  'id': 10},
 {'name': 'Caleb Fisher

In [25]:
# The goal is to create a dict that maps each position ["GK", "DEF", "MID", "FWD"]
# to a list of player IDs who play that position — in the specific order: GK → DEF → MID → FWD

team_order= ["GK", "DEF", "MID", "FWD"]
team_grouped = df_sorted.groupby("position")["id"].apply(list) # in the end converts the group of ids to a list

# Create a dictionary that maps each position to its corresponding player IDs
position_id_map = {pos: team_grouped[pos] for pos in team_order if pos in team_grouped}

In [26]:
position_id_map

{'GK': [0, 1, 2, 3, 4],
 'DEF': [5, 6, 7, 8, 9, 10, 11, 12, 13, 14],
 'MID': [15, 16, 17, 18, 19, 20, 21, 22, 23, 24],
 'FWD': [25, 26, 27, 28, 29, 30, 31, 32, 33, 34]}

In [30]:
import random

* this is a possible function to generate one team, just to understand what we have to do

In [449]:
# create a function that generates a team from the dictionary of indices
def generate_team_indices(position_id_map):

    """
    Generates a valid sports team from a dictionary of player indices by position.

    This function randomly selects player indices to form a complete team that adheres
    to the required composition constraints:
    - 1 Goalkeeper (GK)
    - 2 Defenders (DEF)
    - 2 Midfielders (MID)
    - 2 Forwards (FWD)

    The selection is made from a dictionary (`position_id_map`) that maps each position
    to a list of available player indices. 
    
    Parameters:
        players (pd.DataFrame): A DataFrame containing player information (not used directly in this function).
        position_id_map (dict): A dictionary mapping positions ("GK", "DEF", "MID", "FWD")
                                to lists of available player indices.

    Returns:
        list: A list of player indices representing one valid team.

    """
    # generates a team from a dictionary of indices

    # Randomly select one valid team (just player indices)
    team = []
    team += random.sample(position_id_map["GK"], 1)  # Select 1 goalkeeper
    team += random.sample(position_id_map["DEF"], 2) # Select 2 defenders
    team += random.sample(position_id_map["MID"], 2) # Select 2 midfielders
    team += random.sample(position_id_map["FWD"], 2) # Select 2 forwards

    return team

In [450]:
position_id_map

{'GK': [0, 1, 2, 3, 4],
 'DEF': [5, 6, 7, 8, 9, 10, 11, 12, 13, 14],
 'MID': [15, 16, 17, 18, 19, 20, 21, 22, 23, 24],
 'FWD': [25, 26, 27, 28, 29, 30, 31, 32, 33, 34]}

In [451]:
team_indices = generate_team_indices(position_id_map)
team_indices

[1, 5, 11, 21, 18, 33, 27]

* this is a function to understand how to create a valid league

In [453]:
def generate_league(players):
    from copy import deepcopy
    import random
    num_teams=5
    
    # Step 1: Copy the player pool
    available_indices = list(range(len(players)))
    random.shuffle(available_indices)  # randomize pool to start
    
    # Step 2: Group indices by position
    def group_available(indices):
        from collections import defaultdict
        pos_map = defaultdict(list)
        for i in indices:
            pos_map[players[i]["position"]].append(i)
        return pos_map
    
    league = []

    for _ in range(num_teams):
        pos_to_indices = group_available(available_indices)

        # Check we still have enough players per role
        if (len(pos_to_indices["GK"]) < 1 or
            len(pos_to_indices["DEF"]) < 2 or
            len(pos_to_indices["MID"]) < 2 or
            len(pos_to_indices["FWD"]) < 2):
            raise ValueError("Not enough players left to form a complete team.")

        team = []
        team += random.sample(pos_to_indices["GK"], 1)
        team += random.sample(pos_to_indices["DEF"], 2)
        team += random.sample(pos_to_indices["MID"], 2)
        team += random.sample(pos_to_indices["FWD"], 2)

        # Remove these players from the available pool
        for idx in team:
            available_indices.remove(idx)

        league.append(team)

    return league


In [None]:
league = generate_league(dict_players) 
league

[[0, 12, 9, 18, 15, 30, 26],
 [3, 5, 10, 20, 23, 34, 28],
 [2, 8, 13, 19, 21, 25, 29],
 [1, 11, 14, 16, 17, 27, 32],
 [4, 7, 6, 22, 24, 31, 33]]

# SOLUTION REPRESENTATION

In [38]:
from abc import ABC, abstractmethod

In [71]:
class Solution(ABC):
    def __init__(self, repr=None):
        # To initialize a solution we need to know it's representation.
        # If no representation is given, a representation is randomly initialized.
        if repr == None:
            repr = self.random_initial_representation()
        # Attributes
        self.repr = repr

    # Method that is called when we run print(object of the class)
    def __repr__(self):
        return str(self.repr)

    # Other methods that must be implemented in subclasses
    @abstractmethod
    def fitness(self):
        pass

    @abstractmethod
    def random_initial_representation():
        pass


In [72]:
df_sorted.head(5)

Unnamed: 0,name,position,skill,salary,id
0,Alex Carter,GK,85,90,0
1,Jordan Smith,GK,88,100,1
2,Ryan Mitchell,GK,83,85,2
3,Chris Thompson,GK,80,80,3
4,Blake Henderson,GK,87,95,4


In [456]:
def generate_league(df):

    import random
    from collections import defaultdict
    
    num_teams=5
    available_ids = df.index.tolist()
    random.shuffle(available_ids)

    league = []

    for _ in range(num_teams):
        pos_map = defaultdict(list)

        # Build position map using current available players
        for i in available_ids:
            pos = df.loc[i, "position"]
            pos_map[pos].append(i)

        # Check we have enough players left per role
        if (len(pos_map["GK"]) < 1 or
            len(pos_map["DEF"]) < 2 or
            len(pos_map["MID"]) < 2 or
            len(pos_map["FWD"]) < 2):
            raise ValueError("Not enough players left to form a full team")

        # Select players for the team
        team = []
        team += random.sample(pos_map["GK"], 1)
        team += random.sample(pos_map["DEF"], 2)
        team += random.sample(pos_map["MID"], 2)
        team += random.sample(pos_map["FWD"], 2)

        # Remove them from pool
        for idx in team:
            available_ids.remove(idx)

        league.append(team)

    return league

In [525]:
class SportsLeagueSolution(Solution):
    def __init__(self, repr=None, players_df=df_sorted):
        self.players_df = players_df
        super().__init__(repr=repr)

    
    def random_initial_representation(self):
        self.repr = generate_league(self.players_df)

        # Reset the 'team' column
        self.players_df["team"] = -1

        # Assign team number to each player
        for team_idx, team in enumerate(self.repr):
            self.players_df.loc[team, "team"] = team_idx

        return self.repr

    def fitness(self):
        team_avg_skills = []

        # Reset team column (optional safety)
        self.players_df["team"] = -1

        for team_idx, team in enumerate(self.repr):
            self.players_df.loc[team, "team"] = team_idx
            team_df = self.players_df.loc[team]

            total_salary = team_df["salary"].sum()
            if total_salary > 750:
                return 1e9  # Penalize invalid solution

            avg_skill = team_df["skill"].mean()
            team_avg_skills.append(avg_skill)

        # Minimize std deviation of team avg skills → balanced league
        return float(np.std(team_avg_skills))


In [526]:
sol1 = SportsLeagueSolution()
sol1

[[0, 9, 11, 16, 17, 34, 32], [3, 6, 7, 21, 20, 26, 29], [1, 5, 12, 24, 18, 31, 25], [4, 13, 8, 23, 19, 28, 27], [2, 14, 10, 15, 22, 33, 30]]

In [527]:
sol1.fitness()

0.7526822784182577

In [528]:
sol2 = SportsLeagueSolution()
sol2

[[1, 9, 14, 15, 16, 29, 27], [3, 11, 10, 20, 17, 31, 34], [0, 13, 6, 21, 22, 32, 25], [2, 8, 5, 19, 18, 26, 33], [4, 7, 12, 24, 23, 28, 30]]

In [529]:
sol2.repr

[[1, 9, 14, 15, 16, 29, 27],
 [3, 11, 10, 20, 17, 31, 34],
 [0, 13, 6, 21, 22, 32, 25],
 [2, 8, 5, 19, 18, 26, 33],
 [4, 7, 12, 24, 23, 28, 30]]

In [530]:
sol2.fitness()

1.0513353995985193

## Mutation Operators

* create the 3 different mutation operators
* maybe run multiple times with the 3 and then compare results

Mutation types:

* PlayerSwap: Basic switch one player with another in the same position
* RoleShuffle: Choose a role, remove all players from that role, shuffle and re-atribute to teams
* PlayerRoleLeftShift: Select a player role (GK, DEF, MID, FWD) and shifts the players in that role to the left across teams, by a random number of positions

In [626]:
from abc import ABC, abstractmethod
from random import randint, shuffle, choice
from copy import deepcopy

In [627]:
class MutationOperator(ABC):
    @abstractmethod
    def mutate(self, solution):
        pass

In [630]:
class PlayerSwapMutation(MutationOperator):
    def mutate(self, solution, verbose=False):
        new_repr = deepcopy(solution.repr)

        # Choose the id within the team of the player that will be swapped
        player_to_swap = randint(0, 6)

        # Choose the teams where the players will be swapped. Make sure they are different
        team_to_swap_1 = randint(0, 4)
        team_to_swap_2 = randint(0, 4)
        while team_to_swap_1 == team_to_swap_2:
            team_to_swap_2 = randint(0, 4)

        if verbose:
            # Extract player IDs before swap for accurate logging
            pid1 = new_repr[team_to_swap_1][player_to_swap]
            pid2 = new_repr[team_to_swap_2][player_to_swap]

            print(f"Swapping player {pid1} from team {team_to_swap_1} "
                f"with player {pid2} from team {team_to_swap_2}")

        # Swap players at the chosen index
        new_repr[team_to_swap_1][player_to_swap], new_repr[team_to_swap_2][player_to_swap] = pid2, pid1

        mutated = deepcopy(solution)
        mutated.repr = new_repr
        return mutated

In [631]:
original_solution = SportsLeagueSolution()

In [632]:
mutation = PlayerSwapMutation()
original_repr = deepcopy(original_solution.repr) 
mutated_solution = mutation.mutate(original_solution, verbose=True)

print(f"Original Solution: {original_repr}")
print(f"Mutated Solution: {mutated_solution}")

Swapping player 27 from team 4 with player 33 from team 0
Original Solution: [[0, 5, 9, 23, 18, 31, 33], [1, 11, 10, 15, 17, 26, 30], [4, 12, 14, 21, 19, 32, 34], [2, 7, 13, 22, 24, 25, 28], [3, 8, 6, 16, 20, 29, 27]]
Mutated Solution: [[0, 5, 9, 23, 18, 31, 27], [1, 11, 10, 15, 17, 26, 30], [4, 12, 14, 21, 19, 32, 34], [2, 7, 13, 22, 24, 25, 28], [3, 8, 6, 16, 20, 29, 33]]


In [633]:
class RoleShuffleMutation(MutationOperator):
    def mutate(self, solution, verbose=False):
        new_repr = deepcopy(solution.repr)
        
        # Choose the role that will be affected
        # Remembering that the player IDs withing the team correspond to {"GK": 0, "DEF": [1, 2], "MID": [3, 4], "FWD": [5, 6]}
        i = randint(0, 6)
        if i in [1, 2]:
            i = [1, 2]
        elif i in [3, 4]:
            i = [3, 4]
        elif i in [5, 6]:
            i = [5, 6]
        else:
            i = [0]
        
        # If verbose, print the role and indexes being shuffled
        if verbose:
            role_map = {
                "GK": [0],
                "DEF": [1, 2],
                "MID": [3, 4],
                "FWD": [5, 6]
            }
            inv_map = {tuple(v): k for k, v in role_map.items()}
            role_name = inv_map[tuple(i)]
            print(f"Shuffling players in role {role_name}, corresponding to indexes {i}")


        # Remove all the players from the selected role and shuffle them
        bag_of_players = []
        for team in new_repr:
            bag_of_players += [team[i] for i in i]
            
        shuffle(bag_of_players)

        # Once shuffled, put them back in the teams
        index = 0
        for team in new_repr:
            for j in i:
                team[j] = bag_of_players[index]
                index += 1

        mutated = deepcopy(solution)
        mutated.repr = new_repr
        return mutated

In [636]:
original_solution = SportsLeagueSolution(players_df=df_sorted)
original_repr = deepcopy(original_solution.repr)  # Capture pre-mutation state

mutation = RoleShuffleMutation()
mutated_solution = mutation.mutate(original_solution, verbose=True)

print(f"Original Solution: {original_repr}")
print(f"Mutated Solution: {mutated_solution}")

Shuffling players in role FWD, corresponding to indexes [5, 6]
Original Solution: [[2, 14, 12, 15, 16, 28, 27], [3, 10, 5, 24, 20, 34, 25], [1, 7, 8, 18, 19, 26, 29], [4, 6, 13, 17, 22, 30, 31], [0, 11, 9, 21, 23, 32, 33]]
Mutated Solution: [[2, 14, 12, 15, 16, 33, 30], [3, 10, 5, 24, 20, 25, 34], [1, 7, 8, 18, 19, 26, 27], [4, 6, 13, 17, 22, 31, 29], [0, 11, 9, 21, 23, 32, 28]]


In [637]:
class PlayerRoleLeftShiftMutation(MutationOperator):
    def mutate(self, solution, verbose=False):
        new_repr = deepcopy(solution.repr)

        # Choose the role that will be affected
        # Remembering that the player IDs withing the team correspond to {"GK": 0, "DEF": [1, 2], "MID": [3, 4], "FWD": [5, 6]}
        i = randint(0, 6)
        if i in [1, 2]:
            i = [1, 2]
        elif i in [3, 4]:
            i = [3, 4]
        elif i in [5, 6]:
            i = [5, 6]
        else:
            i = [0]
        
        # Get all the players from the selected role
        role_players = []
        for team in new_repr:
            for idx in i:
                role_players.append(team[idx])

        # Shift left 
        shift_amount = randint(1, len(new_repr) - 1)
        role_players = role_players[shift_amount:] + role_players[:shift_amount]
        
        # If verbose, print the role and indexes being shifted
        if verbose:
            role_map = {
                "GK": [0],
                "DEF": [1, 2],
                "MID": [3, 4],
                "FWD": [5, 6]
            }
            inv_map = {tuple(v): k for k, v in role_map.items()}
            role_name = inv_map[tuple(i)]
            print(f"Shifting role group {role_name}, corresponding to indexes {i}, by {shift_amount} positions")

        # Reassign to teams
        index = 0
        for team in new_repr:
            for idx in i:
                team[idx] = role_players[index]
                index += 1


        mutated = deepcopy(solution)
        mutated.repr = new_repr
        return mutated

In [638]:
original_solution = SportsLeagueSolution(players_df=df_sorted)
original_repr = deepcopy(original_solution.repr)  # Capture pre-mutation state

mutation = PlayerRoleLeftShiftMutation()
mutated_solution = mutation.mutate(original_solution, verbose=True)

print(f"Original Solution: {original_repr}")
print(f"Mutated Solution: {mutated_solution}")

Shifting role group GK, corresponding to indexes [0], by 4 positions
Original Solution: [[1, 12, 13, 21, 16, 27, 33], [4, 6, 11, 17, 24, 30, 25], [0, 8, 7, 23, 15, 32, 29], [3, 5, 10, 18, 19, 34, 31], [2, 14, 9, 22, 20, 26, 28]]
Mutated Solution: [[2, 12, 13, 21, 16, 27, 33], [1, 6, 11, 17, 24, 30, 25], [4, 8, 7, 23, 15, 32, 29], [0, 5, 10, 18, 19, 34, 31], [3, 14, 9, 22, 20, 26, 28]]


## Crossover Operators

In [466]:
import random
from collections import defaultdict, Counter

def get_position_map(players_df):
    
    """
    Creates a mapping from player ID to their position.

    This function takes a DataFrame of players with at least two columns:
    'id' (unique player identifier) and 'position' (e.g., GK, DEF, MID, FWD),
    and returns a dictionary that maps each player ID to their respective position.

    Args:
        players_df (pd.DataFrame): A DataFrame containing player information,
                                   with 'id' and 'position' columns.

    Returns:
        dict: A dictionary mapping each player ID (int) to their position (str).
              Example: {0: 'GK', 1: 'DEF', 2: 'MID', ...}
    """
    return dict(zip(players_df['id'], players_df['position']))

def standard_crossover_with_position_repair(parent1, parent2, players_df, crossover_point=3):
    crossover_point=random.randint(1, len(parent1) - 1)
    position_map = get_position_map(players_df) # creates a mapping from player ID to their position
    all_players = set(players_df['id']) # creates a set of all players

    # Crossover on team level by the crossover point defined on the arguments
    offspring1 = parent1[:crossover_point] + parent2[crossover_point:]
    offspring2 = parent2[:crossover_point] + parent1[crossover_point:]

    # Position slots by index inside each team
    # we want to always keep the [GK, DEF, DEF, MID, MID, FWD, FWD] formatation
    position_slots = ["GK", "DEF", "DEF", "MID", "MID", "FWD", "FWD"] # players are assumed to always appear in this fixed order

    def repair_offspring(offspring):

        """
        Repairs an offspring by removing duplicate players and replacing 
        them with valid players that were missing from the solution.

        This function ensures:
            - Each player appears only once across all teams.
            - Players are only replaced by others who play the same position.
            - All players from the original player pool are used exactly once.

        The function relies on two external variables:
            - `all_players`: A set of all player IDs expected to be used in the solution.
            - `position_map`: A dictionary mapping player IDs to their positions (e.g., 'GK', 'DEF').
            - `position_slots`: A list defining the position expected at each slot in a team.

        Parameters:
            offspring (list of list of int): A candidate solution represented as a list of teams, 
                                            where each team is a list of player IDs.

        Returns:
            list of list of int: A repaired version of the offspring, where:
                                - All players are unique.
                                - Each player is placed in a slot matching their position.
                                - Missing players are added in valid positions.

        Notes:
            The function modifies teams independently, attempting to replace invalid entries
            (duplicates or misplaced players) with players not yet used but suitable for
            the expected position.
        """
        
        # receives an offspring, thelist of teams and it returns a "repaired" version where:
            # * All players are unique, so no duplicates across teams, that is a problem with the standard crossover
            # * Players are replaced only with others of the same position
            # * Any missing players are added back.


        # Flatten all players used
        flat = [p for team in offspring for p in team] # Flattens the list of teams into a single list of all player IDs
        counts = Counter(flat) # Count occurrences of each player ID ( how many times each player appears in the flattened offspring)

        # Find duplicates and missing players
        duplicates = {p for p, c in counts.items() if c > 1} # get the players that appear more than once
        used = set(flat) # get the players that are used in the offspring
        missing = list(all_players - used) # get the players that are missing in the offspring but should be there (all players come from above code)
        
        # Build available players by position for replacements

        available_by_pos = defaultdict(list)  # create a dict where the key is a position  and the value is a shuffled list of player IDs that are available for that position
        for pid in missing: # for each player ID in the missing list
            pos = position_map[pid] # get the position of the player
            available_by_pos[pos].append(pid) # append the player ID to the list of available players for that position
        for pos in available_by_pos: # for each position in the available players
            random.shuffle(available_by_pos[pos]) # shuffle the list of available players for that position

        # Track used players to avoid duplicates
        used_players = set() #  tracks player IDs that have already been added to teams to prevent further duplicates during the repair process
        repaired_offspring = [] 
        for team in offspring: # for each team in the offspring
            new_team = [] # create a new team
            for idx, pid in enumerate(team): # for each player ID in the team
                pos = position_slots[idx] # get the position of the player based on the index

                # If player is duplicated or already used, replace
                if pid in used_players or pid in duplicates: # if the player ID is already used or is a duplicate
                    if available_by_pos[pos]: # if there are available players for that position
                        replacement = available_by_pos[pos].pop() # get a replacement player ID from the available players for that position
                        new_team.append(replacement) # add the replacement player ID to the new team
                        used_players.add(replacement) # add the replacement player ID to the used players
                    else:
                        # If no available players left in that position (should not happen), keep original
                        new_team.append(pid) # add the original player ID to the new team
                        used_players.add(pid) # add the original player ID to the used players
                else: # if the player ID is not a duplicate and not already used
                    new_team.append(pid) # add the original player ID to the new team
                    used_players.add(pid) # add the original player ID to the used players

            repaired_offspring.append(new_team) # add the new team to the repaired offspring
        return repaired_offspring

    return repair_offspring(offspring1), repair_offspring(offspring2)


In [467]:
parent1 = [[2, 12, 5, 19, 15, 29, 27], [1, 13, 11, 22, 18, 30, 28], [3, 9, 14, 16, 20, 33, 31], [4, 8, 7, 21, 24, 25, 26], [0, 10, 6, 23, 17, 32, 34]]
parent2 = [[2, 7, 9, 20, 24, 32, 25], [3, 8, 10, 15, 19, 34, 27], [1, 12, 5, 23, 21, 29, 30], [0, 14, 11, 22, 16, 31, 33], [4, 6, 13, 17, 18, 28, 26]]


In [468]:
parent1

[[2, 12, 5, 19, 15, 29, 27],
 [1, 13, 11, 22, 18, 30, 28],
 [3, 9, 14, 16, 20, 33, 31],
 [4, 8, 7, 21, 24, 25, 26],
 [0, 10, 6, 23, 17, 32, 34]]

In [469]:
parent2

[[2, 7, 9, 20, 24, 32, 25],
 [3, 8, 10, 15, 19, 34, 27],
 [1, 12, 5, 23, 21, 29, 30],
 [0, 14, 11, 22, 16, 31, 33],
 [4, 6, 13, 17, 18, 28, 26]]

In [470]:
offspring1, offspring2 = standard_crossover_with_position_repair(parent1, parent2, df_sorted, crossover_point=2)


In [471]:
offspring1

[[2, 12, 5, 19, 15, 29, 27],
 [1, 7, 10, 23, 21, 30, 32],
 [3, 9, 8, 24, 20, 34, 25],
 [0, 14, 11, 22, 16, 31, 33],
 [4, 6, 13, 17, 18, 28, 26]]

In [473]:
offspring2

[[2, 13, 9, 20, 16, 28, 33],
 [3, 14, 11, 15, 19, 31, 27],
 [1, 12, 5, 18, 22, 29, 30],
 [4, 8, 7, 21, 24, 25, 26],
 [0, 10, 6, 23, 17, 32, 34]]

In [474]:
sol1 = SportsLeagueSolution()
sol1

[[0, 10, 6, 24, 19, 29, 25], [3, 13, 14, 16, 23, 32, 31], [1, 8, 5, 17, 20, 26, 33], [2, 7, 12, 15, 22, 34, 28], [4, 9, 11, 21, 18, 27, 30]]

In [475]:
sol2 = SportsLeagueSolution()
sol2

[[4, 10, 6, 19, 21, 30, 32], [0, 11, 12, 17, 20, 33, 25], [1, 9, 13, 24, 18, 31, 28], [2, 7, 8, 15, 23, 34, 27], [3, 14, 5, 22, 16, 26, 29]]

In [476]:
sol1.repr

[[0, 10, 6, 24, 19, 29, 25],
 [3, 13, 14, 16, 23, 32, 31],
 [1, 8, 5, 17, 20, 26, 33],
 [2, 7, 12, 15, 22, 34, 28],
 [4, 9, 11, 21, 18, 27, 30]]

In [477]:
sol2.repr

[[4, 10, 6, 19, 21, 30, 32],
 [0, 11, 12, 17, 20, 33, 25],
 [1, 9, 13, 24, 18, 31, 28],
 [2, 7, 8, 15, 23, 34, 27],
 [3, 14, 5, 22, 16, 26, 29]]

In [478]:
offspring1, offspring2 = standard_crossover_with_position_repair(sol1.repr, sol2.repr, df_sorted)

In [479]:
offspring1

[[4, 10, 6, 21, 19, 32, 30],
 [0, 11, 12, 17, 20, 33, 25],
 [1, 9, 13, 24, 18, 31, 28],
 [2, 7, 8, 15, 23, 34, 27],
 [3, 14, 5, 22, 16, 26, 29]]

In [480]:
def validate_league(league, players_df, position_slots = ["GK", "DEF", "DEF", "MID", "MID", "FWD", "FWD"]):
    """
    Validates a league composed of multiple teams.

    Requirements checked:
        - Each player appears exactly once across all teams (no duplicates or missing).
        - Players are placed in the correct position slot according to position_slots.
        - Each team has the correct number of players.

    Parameters:
        league (list of list of int): The league to validate, as a list of teams (each a list of player IDs).
        players_df (pd.DataFrame): DataFrame with at least 'id' and 'position' columns.
        position_slots (list of str): Expected position at each index of a team, e.g., 
                                      ["GK", "DEF", "DEF", "MID", "MID", "FWD", "FWD"]

    Returns:
        bool: True if the league is valid. Otherwise, prints detailed issues and returns False.
    """
    from collections import Counter

    all_player_ids = set(players_df['id'])
    expected_team_size = len(position_slots)

    # 1. Check total number of players
    flat_players = [p for team in league for p in team]
    player_counts = Counter(flat_players)

    # Check if every player is used exactly once
    if set(flat_players) != all_player_ids:
        missing = all_player_ids - set(flat_players)
        extra = set(flat_players) - all_player_ids
        if missing:
            print(f"Missing players: {missing}")
        if extra:
            print(f"Unknown players found: {extra}")
        return False

    duplicates = {p for p, c in player_counts.items() if c > 1}
    if duplicates:
        print(f"Duplicated players: {duplicates}")
        return False

    # 2. Check team structure and position integrity
    position_map = dict(zip(players_df['id'], players_df['position']))
    for i, team in enumerate(league):
        if len(team) != expected_team_size:
            print(f"Team {i} has invalid size {len(team)}. Expected {expected_team_size}.")
            return False
        for idx, pid in enumerate(team):
            expected_pos = position_slots[idx]
            actual_pos = position_map.get(pid)
            if actual_pos != expected_pos:
                print(f"Team {i}, player {pid} in slot {idx} expected {expected_pos}, got {actual_pos}.")
                return False

    print("League is valid!")
    return True


### A lot of examples

In [481]:
sol1 = SportsLeagueSolution()
sol2 = SportsLeagueSolution()
sol3 = SportsLeagueSolution()
sol4 = SportsLeagueSolution()
sol5 = SportsLeagueSolution()
sol6 = SportsLeagueSolution()
sol7 = SportsLeagueSolution()
sol8 = SportsLeagueSolution()
sol9 = SportsLeagueSolution()
sol10 = SportsLeagueSolution()

In [520]:
sol1.repr

[[1, 6, 12, 15, 23, 31, 26],
 [3, 13, 9, 19, 21, 25, 32],
 [2, 10, 8, 20, 18, 27, 28],
 [0, 7, 5, 17, 22, 29, 30],
 [4, 11, 14, 16, 24, 33, 34]]

In [524]:
sol1.fitness()

np.float64(1.6373198607879287)

In [482]:
offspring1, offspring2 = standard_crossover_with_position_repair(sol1.repr, sol2.repr, df_sorted)
offspring3, offspring4 = standard_crossover_with_position_repair(sol1.repr, sol3.repr, df_sorted)
offspring5, offspring6 = standard_crossover_with_position_repair(sol1.repr, sol4.repr, df_sorted)
offspring7, offspring8 = standard_crossover_with_position_repair(sol1.repr, sol5.repr, df_sorted)
offspring9, offspring10 = standard_crossover_with_position_repair(sol1.repr, sol6.repr, df_sorted)

offspring11, offspring12 = standard_crossover_with_position_repair(sol1.repr, sol7.repr, df_sorted)
offspring13, offspring14 = standard_crossover_with_position_repair(sol1.repr, sol8.repr, df_sorted)
offspring15, offspring16 = standard_crossover_with_position_repair(sol1.repr, sol9.repr, df_sorted)
offspring17, offspring18 = standard_crossover_with_position_repair(sol1.repr, sol10.repr, df_sorted)
offspring19, offspring20 = standard_crossover_with_position_repair(sol2.repr, sol3.repr, df_sorted)

offspring21, offspring22 = standard_crossover_with_position_repair(sol2.repr, sol4.repr, df_sorted)
offspring23, offspring24 = standard_crossover_with_position_repair(sol2.repr, sol5.repr, df_sorted)
offspring25, offspring26 = standard_crossover_with_position_repair(sol2.repr, sol6.repr, df_sorted)
offspring27, offspring28 = standard_crossover_with_position_repair(sol2.repr, sol7.repr, df_sorted)
offspring29, offspring30 = standard_crossover_with_position_repair(sol2.repr, sol8.repr, df_sorted)

offspring31, offspring32 = standard_crossover_with_position_repair(sol2.repr, sol9.repr, df_sorted)
offspring33, offspring34 = standard_crossover_with_position_repair(sol2.repr, sol10.repr, df_sorted)
offspring35, offspring36 = standard_crossover_with_position_repair(sol3.repr, sol4.repr, df_sorted)
offspring37, offspring38 = standard_crossover_with_position_repair(sol3.repr, sol5.repr, df_sorted)
offspring39, offspring40 = standard_crossover_with_position_repair(sol3.repr, sol6.repr, df_sorted)

offspring41, offspring42 = standard_crossover_with_position_repair(sol3.repr, sol7.repr, df_sorted)
offspring43, offspring44 = standard_crossover_with_position_repair(sol3.repr, sol8.repr, df_sorted)
offspring45, offspring46 = standard_crossover_with_position_repair(sol3.repr, sol9.repr, df_sorted)
offspring47, offspring48 = standard_crossover_with_position_repair(sol3.repr, sol10.repr, df_sorted)
offspring49, offspring50 = standard_crossover_with_position_repair(sol4.repr, sol5.repr, df_sorted)

offspring51, offspring52 = standard_crossover_with_position_repair(sol4.repr, sol6.repr, df_sorted)
offspring53, offspring54 = standard_crossover_with_position_repair(sol4.repr, sol7.repr, df_sorted)
offspring55, offspring56 = standard_crossover_with_position_repair(sol4.repr, sol8.repr, df_sorted)
offspring57, offspring58 = standard_crossover_with_position_repair(sol4.repr, sol9.repr, df_sorted)
offspring59, offspring60 = standard_crossover_with_position_repair(sol4.repr, sol10.repr, df_sorted)

offspring61, offspring62 = standard_crossover_with_position_repair(sol5.repr, sol6.repr, df_sorted)
offspring63, offspring64 = standard_crossover_with_position_repair(sol5.repr, sol7.repr, df_sorted)
offspring65, offspring66 = standard_crossover_with_position_repair(sol5.repr, sol8.repr, df_sorted)
offspring67, offspring68 = standard_crossover_with_position_repair(sol5.repr, sol9.repr, df_sorted)
offspring69, offspring70 = standard_crossover_with_position_repair(sol5.repr, sol10.repr, df_sorted)

offspring71, offspring72 = standard_crossover_with_position_repair(sol6.repr, sol7.repr, df_sorted)
offspring73, offspring74 = standard_crossover_with_position_repair(sol6.repr, sol8.repr, df_sorted)
offspring75, offspring76 = standard_crossover_with_position_repair(sol6.repr, sol9.repr, df_sorted)
offspring77, offspring78 = standard_crossover_with_position_repair(sol6.repr, sol10.repr, df_sorted)
offspring79, offspring80 = standard_crossover_with_position_repair(sol7.repr, sol8.repr, df_sorted)

offspring81, offspring82 = standard_crossover_with_position_repair(sol7.repr, sol9.repr, df_sorted)
offspring83, offspring84 = standard_crossover_with_position_repair(sol7.repr, sol10.repr, df_sorted)
offspring85, offspring86 = standard_crossover_with_position_repair(sol8.repr, sol9.repr, df_sorted)
offspring87, offspring88 = standard_crossover_with_position_repair(sol8.repr, sol10.repr, df_sorted)
offspring89, offspring90 = standard_crossover_with_position_repair(sol9.repr, sol10.repr, df_sorted)

In [483]:
print(validate_league(offspring1, df_sorted))
print(validate_league(offspring2, df_sorted))
print(validate_league(offspring3, df_sorted))
print(validate_league(offspring4, df_sorted))
print(validate_league(offspring5, df_sorted))
print(validate_league(offspring6, df_sorted))
print(validate_league(offspring7, df_sorted))
print(validate_league(offspring8, df_sorted))
print(validate_league(offspring9, df_sorted))
print(validate_league(offspring10, df_sorted))


League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True


In [484]:
print(validate_league(offspring11, df_sorted))
print(validate_league(offspring12, df_sorted))
print(validate_league(offspring13, df_sorted))
print(validate_league(offspring14, df_sorted))
print(validate_league(offspring15, df_sorted))
print(validate_league(offspring16, df_sorted))
print(validate_league(offspring17, df_sorted))
print(validate_league(offspring18, df_sorted))
print(validate_league(offspring19, df_sorted))
print(validate_league(offspring20, df_sorted))

League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True


In [485]:
print(validate_league(offspring21, df_sorted))
print(validate_league(offspring22, df_sorted))
print(validate_league(offspring23, df_sorted))
print(validate_league(offspring24, df_sorted))
print(validate_league(offspring25, df_sorted))
print(validate_league(offspring26, df_sorted))
print(validate_league(offspring27, df_sorted))
print(validate_league(offspring28, df_sorted))
print(validate_league(offspring29, df_sorted))
print(validate_league(offspring30, df_sorted))

League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True


In [486]:
print(validate_league(offspring31, df_sorted))
print(validate_league(offspring32, df_sorted))
print(validate_league(offspring33, df_sorted))
print(validate_league(offspring34, df_sorted))
print(validate_league(offspring35, df_sorted))
print(validate_league(offspring36, df_sorted))
print(validate_league(offspring37, df_sorted))
print(validate_league(offspring38, df_sorted))
print(validate_league(offspring39, df_sorted))
print(validate_league(offspring40, df_sorted))

League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True


In [487]:
print(validate_league(offspring41, df_sorted))
print(validate_league(offspring42, df_sorted))
print(validate_league(offspring43, df_sorted))
print(validate_league(offspring44, df_sorted))
print(validate_league(offspring45, df_sorted))
print(validate_league(offspring46, df_sorted))
print(validate_league(offspring47, df_sorted))
print(validate_league(offspring48, df_sorted))
print(validate_league(offspring49, df_sorted))
print(validate_league(offspring50, df_sorted))


League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True


In [488]:
print(validate_league(offspring51, df_sorted))
print(validate_league(offspring52, df_sorted))
print(validate_league(offspring53, df_sorted))
print(validate_league(offspring54, df_sorted))
print(validate_league(offspring55, df_sorted))
print(validate_league(offspring56, df_sorted))
print(validate_league(offspring57, df_sorted))
print(validate_league(offspring58, df_sorted))
print(validate_league(offspring59, df_sorted))
print(validate_league(offspring60, df_sorted))

League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True


In [489]:
print(validate_league(offspring61, df_sorted))
print(validate_league(offspring62, df_sorted))
print(validate_league(offspring63, df_sorted))
print(validate_league(offspring64, df_sorted))
print(validate_league(offspring65, df_sorted))
print(validate_league(offspring66, df_sorted))
print(validate_league(offspring67, df_sorted))
print(validate_league(offspring68, df_sorted))
print(validate_league(offspring69, df_sorted))
print(validate_league(offspring70, df_sorted))

League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True


In [490]:
print(validate_league(offspring71, df_sorted))
print(validate_league(offspring72, df_sorted))
print(validate_league(offspring73, df_sorted))
print(validate_league(offspring74, df_sorted))
print(validate_league(offspring75, df_sorted))
print(validate_league(offspring76, df_sorted))
print(validate_league(offspring77, df_sorted))
print(validate_league(offspring78, df_sorted))
print(validate_league(offspring79, df_sorted))
print(validate_league(offspring80, df_sorted))

League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True


In [491]:
print(validate_league(offspring81, df_sorted))
print(validate_league(offspring82, df_sorted))
print(validate_league(offspring83, df_sorted))
print(validate_league(offspring84, df_sorted))
print(validate_league(offspring85, df_sorted))
print(validate_league(offspring86, df_sorted))
print(validate_league(offspring87, df_sorted))
print(validate_league(offspring88, df_sorted))
print(validate_league(offspring89, df_sorted))
print(validate_league(offspring90, df_sorted))

League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True
League is valid!
True


# outro crossover

* acho que tinhamos ontem falado num crossover onde mantinhos todos os jogadores de 2 posições do pai 1 e dps adicionavamos todos os jogadores das restantes 2 posições do pai 2, portanto, era crossover de "posições"
E isto podia ser feito para vários numeros, do pai1 mantemos os jogadores que sao GK, e juntamos os DEF, MID, FWD do pai2, ou do pai31 
mantemos os jogadores que sao GK e MID e juntamos os DEF, FWD do pai2

![image.png](attachment:image.png)

In [492]:
from collections import defaultdict

def crossover_by_position(parent1, parent2, players_df, keep_positions=["GK", "MID"]):
    """
    Performs a role-aware crossover where selected positions are preserved from parent1,
    and the rest are filled from parent2. Each team is composed of 7 players in this order:
    [GK, DEF, DEF, MID, MID, FWD, FWD]

    Args:
        parent1 (list of list of int): First parent (5 teams × 7 players).
        parent2 (list of list of int): Second parent (5 teams × 7 players).
        players_df (pd.DataFrame): DataFrame with 'id' and 'position' columns.
        keep_positions (list): List of position names to keep from parent1.

    Returns:
        list: A new offspring created by crossover of parent1 and parent2 based on positions.
    """
    position_map = dict(zip(players_df['id'], players_df['position']))
    position_slots = ["GK", "DEF", "DEF", "MID", "MID", "FWD", "FWD"]

    used_players = set()
    offspring = []

    for team_idx in range(5):
        team1 = parent1[team_idx]
        team2 = parent2[team_idx]

        new_team = [None] * 7

        # Step 1: Keep players from parent1 in desired positions
        for i, slot in enumerate(position_slots):
            pid1 = team1[i]
            if position_map[pid1] == slot and slot in keep_positions:
                new_team[i] = pid1
                used_players.add(pid1)

        # Step 2: Fill the remaining positions from parent2 with correct role and no duplicates
        for i, slot in enumerate(position_slots):
            if new_team[i] is not None:
                continue  # already filled from parent1

            for pid2 in team2:
                if (
                    pid2 not in used_players
                    and position_map[pid2] == slot
                    and (slot not in keep_positions)
                ):
                    new_team[i] = pid2
                    used_players.add(pid2)
                    break

        # Step 3: Fallback (if not filled — shouldn't happen if parents are valid)
        for i, slot in enumerate(position_slots):
            if new_team[i] is None:
                # Search all players of that position not yet used
                for pid in players_df[players_df["position"] == slot]["id"]:
                    if pid not in used_players:
                        new_team[i] = pid
                        used_players.add(pid)
                        break

        offspring.append(new_team)

    return offspring


In [493]:
exe1 = SportsLeagueSolution()
exe1.repr

[[2, 10, 14, 15, 21, 30, 26],
 [3, 13, 6, 19, 16, 34, 32],
 [1, 12, 9, 22, 24, 27, 31],
 [0, 11, 7, 18, 20, 25, 29],
 [4, 5, 8, 23, 17, 28, 33]]

In [494]:
exe2 = SportsLeagueSolution()
exe2.repr

[[3, 13, 11, 21, 19, 26, 33],
 [4, 8, 7, 16, 17, 34, 31],
 [0, 10, 14, 18, 24, 32, 28],
 [1, 5, 6, 23, 20, 29, 27],
 [2, 9, 12, 15, 22, 25, 30]]

In [495]:
res= crossover_by_position(parent1=exe1.repr, parent2=exe2.repr, players_df=df_sorted)

In [496]:
res

[[2, 13, 11, 15, 21, 26, 33],
 [3, 8, 7, 19, 16, 34, 31],
 [1, 10, 14, 22, 24, 32, 28],
 [0, 5, 6, 18, 20, 29, 27],
 [4, 9, 12, 23, 17, 25, 30]]

In [497]:
validate_league(res, df_sorted)

League is valid!


True

In [498]:
def crossover_by_position_dual(parent1, parent2, players_df, keep_positions=["GK", "MID"]):
    """
    Dual crossover:
    - Offspring1 keeps selected positions from parent1, fills rest from parent2.
    - Offspring2 keeps selected positions from parent2, fills rest from parent1.

    Each team: [GK, DEF, DEF, MID, MID, FWD, FWD]

    Args:
        parent1 (list of list of int): First parent (5 teams × 7 players).
        parent2 (list of list of int): Second parent (5 teams × 7 players).
        players_df (pd.DataFrame): DataFrame with 'id' and 'position' columns.
        keep_positions (list): Positions to keep from each parent.

    Returns:
        tuple: (offspring1, offspring2)
    """
    position_map = dict(zip(players_df['id'], players_df['position'])) # maps each player_id to its position ( 33 → "FWD").
    position_slots = ["GK", "DEF", "DEF", "MID", "MID", "FWD", "FWD"] # expected order of players in a team

    def make_offspring(primary, secondary):

        # build one offspring using:
        # * primary parent — the parent to preserve roles 
        # * secondary — the parent to fill the rest from.

        used_players = set() # player IDs that have already been added to teams to prevent  duplicates during the repair process
        offspring = [] # list to store the new teams

        for team_idx in range(5):
            team1 = primary[team_idx] # team1 is the primary parent
            team2 = secondary[team_idx] # team2 is the secondary parent
            new_team = [None] * 7 # create a new team with 7 empty slots

            # KKeep selected positions from primary
            for i, slot in enumerate(position_slots): # for each position in the team
                pid1 = team1[i] # get the player ID from the primary parent
                if position_map[pid1] == slot and slot in keep_positions: # if the player ID is in the correct position and is one of the positions to keep
                    new_team[i] = pid1 # add the player ID to the new team
                    used_players.add(pid1) # add the player ID to the used players

            # Fill remaining from secondary
            for i, slot in enumerate(position_slots): # for each position in the team
                if new_team[i] is not None: # if the position is already filled from the primary parent
                    continue
                for pid2 in team2: # for each player ID in the secondary parent
                    if ( 
                        pid2 not in used_players # if the player ID is not already used
                        and position_map[pid2] == slot # if the player ID is in the correct position
                        and slot not in keep_positions # if the position is not one of the positions to keep
                    ):
                        new_team[i] = pid2 # add the player ID to the new team
                        used_players.add(pid2) # add the player ID to the used players
                        break

            # By the type of crossover, we should not never use this part, but just in case things dont work as expected
            for i, slot in enumerate(position_slots):
                if new_team[i] is None:
                    for pid in players_df[players_df["position"] == slot]["id"]:
                        if pid not in used_players:
                            new_team[i] = pid
                            used_players.add(pid)
                            break

            offspring.append(new_team)

        return offspring

    offspring1 = make_offspring(parent1, parent2)
    offspring2 = make_offspring(parent2, parent1)
    return offspring1, offspring2


In [499]:
exe1 = SportsLeagueSolution()
exe1.repr

[[2, 12, 11, 23, 19, 25, 32],
 [1, 9, 5, 15, 24, 28, 26],
 [0, 14, 8, 16, 21, 30, 29],
 [3, 7, 10, 17, 18, 31, 33],
 [4, 6, 13, 20, 22, 34, 27]]

In [500]:
exe2 = SportsLeagueSolution()
exe2.repr

[[1, 7, 9, 15, 20, 33, 28],
 [2, 13, 5, 17, 21, 32, 25],
 [3, 10, 8, 23, 16, 27, 31],
 [4, 11, 14, 19, 22, 30, 26],
 [0, 6, 12, 18, 24, 34, 29]]

In [501]:
offspring11, offspring12 = crossover_by_position_dual(exe1.repr, exe2.repr, df_sorted)

In [502]:
offspring11

[[2, 7, 9, 23, 19, 33, 28],
 [1, 13, 5, 15, 24, 32, 25],
 [0, 10, 8, 16, 21, 27, 31],
 [3, 11, 14, 17, 18, 30, 26],
 [4, 6, 12, 20, 22, 34, 29]]

In [417]:
offspring12

[[0, 9, 7, 23, 24, 32, 25],
 [4, 13, 8, 22, 18, 31, 34],
 [2, 14, 11, 19, 17, 30, 27],
 [1, 5, 10, 21, 20, 26, 28],
 [3, 6, 12, 16, 15, 29, 33]]

In [503]:
exe1 = SportsLeagueSolution()
exe2 = SportsLeagueSolution()
exe3 = SportsLeagueSolution()
exe4 = SportsLeagueSolution()
exe5 = SportsLeagueSolution()
exe6 = SportsLeagueSolution()
exe7 = SportsLeagueSolution()
exe8 = SportsLeagueSolution()
exe9 = SportsLeagueSolution()
exe10 = SportsLeagueSolution()

In [504]:
offspring1, offspring2 = crossover_by_position_dual(exe1.repr, exe2.repr, df_sorted)
offspring3, offspring4 = crossover_by_position_dual(exe1.repr, exe3.repr, df_sorted)
offspring5, offspring6 = crossover_by_position_dual(exe1.repr, exe4.repr, df_sorted)
offspring7, offspring8 = crossover_by_position_dual(exe1.repr, exe5.repr, df_sorted)
offspring9, offspring10 = crossover_by_position_dual(exe1.repr, exe6.repr, df_sorted)




In [505]:
exe1.repr

[[1, 10, 8, 21, 19, 34, 28],
 [0, 7, 12, 22, 18, 30, 25],
 [3, 6, 11, 17, 16, 29, 33],
 [4, 5, 9, 24, 20, 27, 26],
 [2, 14, 13, 15, 23, 32, 31]]

In [506]:
exe2.repr

[[3, 5, 6, 23, 18, 27, 28],
 [2, 8, 13, 19, 21, 26, 33],
 [1, 12, 9, 20, 24, 25, 34],
 [4, 7, 14, 17, 22, 32, 30],
 [0, 10, 11, 16, 15, 31, 29]]

In [507]:
offspring1

[[1, 5, 6, 21, 19, 27, 28],
 [0, 8, 13, 22, 18, 26, 33],
 [3, 12, 9, 17, 16, 25, 34],
 [4, 7, 14, 24, 20, 32, 30],
 [2, 10, 11, 15, 23, 31, 29]]

In [508]:
offspring2

[[3, 10, 8, 23, 18, 34, 28],
 [2, 7, 12, 19, 21, 30, 25],
 [1, 6, 11, 20, 24, 29, 33],
 [4, 5, 9, 17, 22, 27, 26],
 [0, 14, 13, 16, 15, 32, 31]]

In [532]:
import random

def crossover_by_position_dual(parent1, parent2, players_df, keep_positions=None, random_keep=False, num_keep_positions=2):
    """
    Dual crossover:
    - Offspring1 keeps selected positions from parent1, fills rest from parent2.
    - Offspring2 keeps selected positions from parent2, fills rest from parent1.

    Each team: [GK, DEF, DEF, MID, MID, FWD, FWD]

    Args:
        parent1 (list of list of int): First parent (5 teams × 7 players).
        parent2 (list of list of int): Second parent (5 teams × 7 players).
        players_df (pd.DataFrame): DataFrame with 'id' and 'position' columns.
        keep_positions (list): Positions to keep from each parent (overrides randomness).
        random_keep (bool): If True, randomly choose positions to keep.
        num_keep_positions (int): Number of positions to randomly select if random_keep=True.

    Returns:
        tuple: (offspring1, offspring2, selected_positions)
    """
    position_map = dict(zip(players_df['id'], players_df['position']))
    position_slots = ["GK", "DEF", "DEF", "MID", "MID", "FWD", "FWD"]
    all_positions = ["GK", "DEF", "MID", "FWD"]

    if random_keep:
        keep_positions = random.sample(all_positions, num_keep_positions)

    def crossover_team(team_primary, team_secondary, keep_positions, used_players):
        new_team = [None] * 7

        # Step 1: Keep selected positions from primary parent
        for i, slot in enumerate(position_slots):
            pid = team_primary[i]
            if position_map[pid] == slot and slot in keep_positions:
                new_team[i] = pid
                used_players.add(pid)

        # Step 2: Fill remaining from secondary parent
        for i, slot in enumerate(position_slots):
            if new_team[i] is not None:
                continue
            for pid in team_secondary:
                if pid not in used_players and position_map[pid] == slot and slot not in keep_positions:
                    new_team[i] = pid
                    used_players.add(pid)
                    break

        # Step 3: Fallback (repair any empty positions)
        for i, slot in enumerate(position_slots):
            if new_team[i] is None:
                for pid in players_df[players_df["position"] == slot]["id"]:
                    if pid not in used_players:
                        new_team[i] = pid
                        used_players.add(pid)
                        break

        return new_team

    # Build both offspring
    offspring1 = []
    offspring2 = []
    used1 = set()
    used2 = set()

    for t1, t2 in zip(parent1, parent2):
        offspring1.append(crossover_team(t1, t2, keep_positions, used1))
        offspring2.append(crossover_team(t2, t1, keep_positions, used2))

    return offspring1, offspring2, keep_positions


In [510]:
offspring1, offspring2, kept = crossover_by_position_dual(
    exe1.repr, 
    exe2.repr, 
    df_sorted,
    random_keep=True,
    num_keep_positions=2
)

In [511]:
exe1.repr

[[1, 10, 8, 21, 19, 34, 28],
 [0, 7, 12, 22, 18, 30, 25],
 [3, 6, 11, 17, 16, 29, 33],
 [4, 5, 9, 24, 20, 27, 26],
 [2, 14, 13, 15, 23, 32, 31]]

In [512]:
exe2.repr

[[3, 5, 6, 23, 18, 27, 28],
 [2, 8, 13, 19, 21, 26, 33],
 [1, 12, 9, 20, 24, 25, 34],
 [4, 7, 14, 17, 22, 32, 30],
 [0, 10, 11, 16, 15, 31, 29]]

In [513]:
offspring1

[[1, 5, 6, 21, 19, 27, 28],
 [0, 8, 13, 22, 18, 26, 33],
 [3, 12, 9, 17, 16, 25, 34],
 [4, 7, 14, 24, 20, 32, 30],
 [2, 10, 11, 15, 23, 31, 29]]

In [514]:
offspring1, offspring2, kept = crossover_by_position_dual(
    exe1.repr, 
    exe2.repr, 
    df_sorted,
    keep_positions=["MID", "DEF"],
    num_keep_positions=3
)

In [515]:
offspring1

[[3, 10, 8, 21, 19, 27, 28],
 [2, 7, 12, 22, 18, 26, 33],
 [1, 6, 11, 17, 16, 25, 34],
 [4, 5, 9, 24, 20, 32, 30],
 [0, 14, 13, 15, 23, 31, 29]]

# Genetic Algorithm


In [537]:
from typing import Callable

def get_best_ind(population: list[Solution], maximization: bool):
    fitness_list = [ind.fitness() for ind in population]
    if maximization:
        return population[fitness_list.index(max(fitness_list))]
    else:
        return population[fitness_list.index(min(fitness_list))]

def genetic_algorithm(
    initial_population: list[Solution],
    max_gen: int,
    selection_algorithm: Callable,
    maximization: bool = False,
    xo_prob: float = 0.9,
    mut_prob: float = 0.1,
    elitism: bool = True,
    verbose: bool = False,
):
    # 1. Initialize a population with N individuals
    population = initial_population

    # 2. Repeat until termination condition
    for gen in range(1, max_gen + 1):
        if verbose:
            print(f'-------------- Generation: {gen} --------------')

        # 2.1. Create an empty population P'
        new_population = []

        # 2.2. If using elitism, insert best individual from P into P'
        if elitism:
            new_population.append(deepcopy(get_best_ind(initial_population, maximization)))
        
        # 2.3. Repeat until P' contains N individuals
        while len(new_population) < len(population):
            # 2.3.1. Choose 2 individuals from P using a selection algorithm
            first_ind = selection_algorithm(population, maximization)
            second_ind = selection_algorithm(population, maximization)

            if verbose:
                print(f'Selected individuals:\n{first_ind}\n{second_ind}')

            # 2.3.2. Choose an operator between crossover and replication
            # 2.3.3. Apply the operator to generate the offspring
            if random.random() < xo_prob:
                offspring1, offspring2 = first_ind.crossover(second_ind)
                if verbose:
                    print(f'Applied crossover')
            else:
                offspring1, offspring2 = deepcopy(first_ind), deepcopy(second_ind)
                if verbose:
                    print(f'Applied replication')
            
            if verbose:
                print(f'Offspring:\n{offspring1}\n{offspring2}')
            
            # 2.3.4. Apply mutation to the offspring
            first_new_ind = offspring1.mutation(mut_prob)
            # 2.3.5. Insert the mutated individuals into P'
            new_population.append(first_new_ind)

            if verbose:
                print(f'First mutated individual: {first_new_ind}')
            
            if len(new_population) < len(population):
                second_new_ind = offspring2.mutation(mut_prob)
                new_population.append(second_new_ind)
                if verbose:
                    print(f'Second mutated individual: {first_new_ind}')
        
        # 2.4. Replace P with P'
        population = new_population

        if verbose:
            print(f'Final best individual in generation: {get_best_ind(population, maximization)}')

    # 3. Return the best individual in P
    return get_best_ind(population, maximization)


In [593]:
def mutate_team_structure(self):
    pass

In [602]:
class SportsLeagueGASolution(SportsLeagueSolution):
    # crossover
    def crossover(self, other_solution):
        offspring1_repr, offspring2_repr = standard_crossover_with_position_repair(
            self.repr, other_solution.repr, self.players_df
        )

        return (
            SportsLeagueGASolution(repr=offspring1_repr),
            SportsLeagueGASolution(repr=offspring2_repr)
        )

    # mutation
    def mutation(self, mut_prob=0):
        if random.random() < mut_prob:
            # Perform some actual mutation on self.repr
            mutated_repr = mutate_team_structure(self.repr)
            return SportsLeagueGASolution(repr=mutated_repr)
        else:
            return deepcopy(self)

In [617]:
POP_SIZE = 50
initial_population = [
    SportsLeagueGASolution(

    )
    for _ in range(POP_SIZE)
]
initial_population

[[[2, 11, 9, 24, 15, 34, 33], [0, 12, 14, 21, 19, 32, 30], [4, 7, 10, 18, 22, 29, 28], [1, 5, 6, 17, 23, 25, 31], [3, 13, 8, 20, 16, 26, 27]],
 [[4, 10, 11, 20, 23, 31, 34], [1, 5, 8, 22, 24, 25, 29], [3, 14, 9, 17, 15, 32, 27], [2, 12, 13, 18, 21, 26, 33], [0, 6, 7, 19, 16, 30, 28]],
 [[2, 9, 13, 16, 22, 25, 34], [1, 12, 14, 24, 20, 29, 28], [0, 5, 8, 23, 19, 33, 27], [4, 11, 10, 15, 21, 31, 26], [3, 7, 6, 18, 17, 30, 32]],
 [[0, 5, 10, 20, 21, 29, 32], [3, 9, 6, 18, 23, 30, 27], [2, 11, 12, 16, 24, 33, 26], [1, 14, 7, 17, 15, 31, 25], [4, 8, 13, 22, 19, 34, 28]],
 [[3, 14, 9, 20, 19, 33, 27], [0, 11, 7, 16, 22, 29, 32], [4, 12, 5, 21, 24, 28, 31], [2, 6, 13, 23, 15, 26, 34], [1, 8, 10, 17, 18, 25, 30]],
 [[4, 9, 6, 22, 18, 25, 31], [0, 12, 8, 23, 19, 30, 28], [3, 11, 10, 20, 17, 29, 32], [2, 14, 5, 15, 16, 27, 33], [1, 13, 7, 24, 21, 34, 26]],
 [[2, 12, 10, 24, 18, 34, 33], [3, 9, 14, 15, 23, 32, 30], [4, 5, 7, 21, 16, 28, 29], [1, 6, 13, 17, 20, 31, 25], [0, 8, 11, 22, 19, 26, 27]],

In [618]:
def ranking_selection(population, maximization=False):
    """
    Ranking selection based on linear rank probabilities.
    
    Args:
        population (list): A list of individuals in the population
    
    Returns:
        list or individual: The selected individual
    """
    sorted_population = sorted(population, key=lambda x: x.fitness(), reverse=True)
    n = len(population)
    ranks = list(range(1, n + 1))  # linear rank

    selected = random.choices(sorted_population, weights=ranks, k=1)
    return selected[0]

In [619]:
parent1 = ranking_selection(initial_population)
parent1

[[4, 5, 10, 22, 20, 34, 25], [1, 12, 13, 17, 24, 31, 32], [3, 6, 9, 16, 21, 27, 26], [2, 7, 11, 15, 18, 29, 30], [0, 14, 8, 23, 19, 33, 28]]

In [620]:
parent1.repr

[[4, 5, 10, 22, 20, 34, 25],
 [1, 12, 13, 17, 24, 31, 32],
 [3, 6, 9, 16, 21, 27, 26],
 [2, 7, 11, 15, 18, 29, 30],
 [0, 14, 8, 23, 19, 33, 28]]

In [621]:
parent1.fitness()

0.9578888350994424

In [623]:
best_solution = genetic_algorithm(
     initial_population=initial_population,
     selection_algorithm=ranking_selection,
     max_gen=10,
     maximization=False,
     verbose=True,
     elitism=True,) 

-------------- Generation: 1 --------------
Selected individuals:
[[4, 11, 8, 15, 16, 28, 32], [2, 12, 9, 21, 20, 26, 33], [3, 10, 6, 17, 19, 34, 31], [1, 7, 5, 23, 24, 29, 25], [0, 13, 14, 22, 18, 27, 30]]
[[3, 10, 9, 17, 24, 29, 27], [4, 5, 11, 23, 16, 26, 31], [0, 8, 7, 22, 15, 30, 32], [1, 13, 6, 18, 20, 28, 34], [2, 12, 14, 21, 19, 25, 33]]
Applied crossover
Offspring:
[[3, 9, 10, 24, 17, 29, 27], [4, 5, 11, 23, 16, 26, 31], [0, 8, 7, 22, 15, 30, 32], [1, 13, 6, 18, 20, 28, 34], [2, 12, 14, 21, 19, 25, 33]]
[[4, 11, 8, 16, 15, 28, 32], [2, 12, 9, 21, 20, 26, 33], [3, 10, 6, 17, 19, 34, 31], [1, 7, 5, 23, 24, 29, 25], [0, 13, 14, 22, 18, 27, 30]]
First mutated individual: [[3, 9, 10, 24, 17, 29, 27], [4, 5, 11, 23, 16, 26, 31], [0, 8, 7, 22, 15, 30, 32], [1, 13, 6, 18, 20, 28, 34], [2, 12, 14, 21, 19, 25, 33]]
Second mutated individual: [[3, 9, 10, 24, 17, 29, 27], [4, 5, 11, 23, 16, 26, 31], [0, 8, 7, 22, 15, 30, 32], [1, 13, 6, 18, 20, 28, 34], [2, 12, 14, 21, 19, 25, 33]]
Select

In [625]:
print("Best solution:", best_solution)
print("Fitness:", best_solution.fitness())

Best solution: [[4, 10, 7, 19, 23, 25, 30], [1, 14, 9, 21, 16, 31, 27], [3, 13, 11, 15, 22, 28, 29], [2, 6, 8, 18, 17, 26, 32], [0, 5, 12, 20, 24, 33, 34]]
Fitness: 0.2770102775666495
