<a href="https://colab.research.google.com/github/papagorgio23/Python101/blob/master/Forming_Teams.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Forming Teams to Create Synergy

## Background
As the saying goes, "two heads are better than one". [Research](https://scholar.google.ca/scholar?q=effectiveness+of+teamwork+in+the+workplace) has shown that individuals are more effective working in teams than working alone.

Given this insight, how can we form teams that maximize overall productivity?

While this idea can be applied to any domain (e.g., sales associates or research groups), we'll form teams of comic book superheroes for this demonstration.

![Superheroes](https://ap2hyc.com/wp-content/uploads/2016/11/12444766_why-marvel-had-to-pull-one-of-its-comic_5138a63e_m.jpg)

### Conditions
Before we start to assign heroes to teams, there are certain conditions that the teams must satisfy:


1. **Balance**: Each team must have at least one high performer.
2. **Diversity**: Each team must have at least one female and one non-human.
3. **Completeness**: All superheroes must be assigned to a team. (We don't want anyone to feel excluded.)
4. **Team Size**: Each team can have at most 8 superheroes.
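
The four conditions can also be written as a small validity check. The sketch below is only illustrative: the `Hero` tuple, the `is_valid_assignment` helper, and the toy roster are hypothetical names and data, not part of the solver code used later.

```python
from collections import namedtuple

# Hypothetical record type, for illustration only.
Hero = namedtuple("Hero", ["name", "is_high_performer", "is_female", "is_non_human"])

MAX_TEAM_SIZE = 8


def is_valid_assignment(heroes, teams):
    """Return True if `teams` (a list of lists of hero names) meets all four conditions."""
    by_name = {h.name: h for h in heroes}
    assigned = [name for team in teams for name in team]

    # Completeness: every hero appears on exactly one team.
    if sorted(assigned) != sorted(by_name):
        return False

    for team in teams:
        members = [by_name[name] for name in team]
        if len(members) > MAX_TEAM_SIZE:                    # Team size
            return False
        if not any(m.is_high_performer for m in members):   # Balance
            return False
        if not any(m.is_female for m in members):           # Diversity (female)
            return False
        if not any(m.is_non_human for m in members):        # Diversity (non-human)
            return False
    return True


# Toy roster with made-up attributes.
roster = [
    Hero("A", True, False, True),
    Hero("B", False, True, False),
    Hero("C", True, True, True),
    Hero("D", False, False, False),
]

print(is_valid_assignment(roster, [["A", "B"], ["C", "D"]]))  # every team meets all conditions
print(is_valid_assignment(roster, [["A", "D"], ["B", "C"]]))  # first team has no female
```

A checker like this only verifies a given assignment; finding one is the solver's job.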


## Solution
This is an example of constraint optimization (aka constraint programming), where the main goal is to find a solution that meets the given conditions. Although we are still interested in maximizing overall productivity, it is a secondary priority. More details about constraint optimization can be found [here](https://developers.google.com/optimization/cp).
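
To make "feasibility first, objective second" concrete, the following brute-force sketch enumerates every assignment of a tiny, made-up roster to two teams, discards the ones that break a constraint, and only then picks the best-scoring survivor. It uses plain Python rather than a real solver; the roster, the `score` function, and the single "at least one high performer per team" constraint are assumptions chosen for illustration.

```python
from itertools import product

# Made-up roster: name -> (is_high_performer, power)
roster = {"A": (True, 9), "B": (False, 2), "C": (True, 12), "D": (False, 5)}
names = list(roster)
NUM_TEAMS = 2
MAX_TEAM_SIZE = 3


def is_feasible(assignment):
    """Check the hard constraints for one assignment (a tuple of team ids, one per hero)."""
    for team in range(NUM_TEAMS):
        members = [n for n, t in zip(names, assignment) if t == team]
        if len(members) > MAX_TEAM_SIZE:
            return False
        if not any(roster[n][0] for n in members):
            return False
    return True


def score(assignment):
    """Toy objective: prefer teams whose total power is balanced."""
    totals = [sum(roster[n][1] for n, t in zip(names, assignment) if t == team)
              for team in range(NUM_TEAMS)]
    return -abs(totals[0] - totals[1])


# Step 1: keep only the assignments that satisfy every constraint.
feasible = [a for a in product(range(NUM_TEAMS), repeat=len(names)) if is_feasible(a)]
# Step 2: among those, pick the one with the best objective value.
best = max(feasible, key=score)

print(len(feasible), "feasible assignments out of", NUM_TEAMS ** len(names))
print("best:", dict(zip(names, best)), "score:", score(best))
```

Enumeration grows exponentially with the number of superheroes, which is why a dedicated solver is used for the real problem.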


For this problem, we will use the [CP-SAT solver](https://developers.google.com/optimization/reference/python/sat/python/cp_model) from Google's OR-Tools. The following code is adapted from this wedding guest [example](https://github.com/google/or-tools/blob/stable/examples/notebook/examples/wedding_optimal_chart_sat.ipynb).

In [None]:
# install Google's OR tools library
%pip install --upgrade --user ortools

### 1. Load and prepare the data

To keep things simple, we'll work with only the "good" superheroes from Marvel and DC Comics in this [superhero dataset](https://www.kaggle.com/claudiodavi/superhero-set). Any superhero with an above-average number of powers is considered a "high performer".
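
As a small illustration of the "above average" rule, the sketch below flags heroes whose power count exceeds the mean. The names and power counts are made up for the example; only the rule itself comes from the text.

```python
from statistics import mean

# Hypothetical power counts, for illustration only.
power_counts = {"Hero A": 2, "Hero B": 6, "Hero C": 11, "Hero D": 21}

average = mean(list(power_counts.values()))
# A hero is a high performer when their count is strictly above the average.
high_performers = {name: count > average for name, count in power_counts.items()}

print("average:", average)
print(high_performers)
```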

In [None]:
import pandas as pd

In [None]:
superheroes = pd.read_csv("heroes_information.csv")
powers = pd.read_csv("super_hero_powers.csv")

In [None]:
## keep only good superheroes and DC/Marvel characters for simplicity
superheroes = superheroes[superheroes.Publisher.isin(["Marvel Comics", "DC Comics"])]
## .copy() so that the indicator columns added below don't trigger a SettingWithCopyWarning
superheroes = superheroes[superheroes.Alignment == "good"].copy()

In [None]:
superheroes.head()

Unnamed: 0.1,Unnamed: 0,name,Gender,Eye color,Race,Hair color,Height,Publisher,Skin color,Alignment,Weight
0,0,A-Bomb,Male,yellow,Human,No Hair,203.0,Marvel Comics,-,good,441.0
2,2,Abin Sur,Male,blue,Ungaran,No Hair,185.0,DC Comics,red,good,90.0
7,7,Adam Strange,Male,blue,Human,Blond,185.0,DC Comics,-,good,88.0
8,8,Agent 13,Female,blue,-,Blond,173.0,Marvel Comics,-,good,61.0
9,9,Agent Bob,Male,brown,Human,Brown,178.0,Marvel Comics,-,good,81.0


In [None]:
## create female indicator column
superheroes["is_female"] = superheroes["Gender"] == "Female"

In [None]:
superheroes["Gender"].value_counts()

Male      245
Female    137
-          19
Name: Gender, dtype: int64

In [None]:
## create non-human indicator column
## note: any Race other than exactly "Human" counts as non-human, including unknown ("-")
superheroes["is_non_human"] = superheroes["Race"] != "Human"

In [None]:
powers.head()

Unnamed: 0,hero_names,Agility,Accelerated Healing,Lantern Power Ring,Dimensional Awareness,Cold Resistance,Durability,Stealth,Energy Absorption,Flight,Danger Sense,Underwater breathing,Marksmanship,Weapons Master,Power Augmentation,Animal Attributes,Longevity,Intelligence,Super Strength,Cryokinesis,Telepathy,Energy Armor,Energy Blasts,Duplication,Size Changing,Density Control,Stamina,Astral Travel,Audio Control,Dexterity,Omnitrix,Super Speed,Possession,Animal Oriented Powers,Weapon-based Powers,Electrokinesis,Darkforce Manipulation,Death Touch,Teleportation,Enhanced Senses,...,Mind Control Resistance,Plant Control,Sonar,Sonic Scream,Time Manipulation,Enhanced Touch,Magic Resistance,Invisibility,Sub-Mariner,Radiation Absorption,Intuitive aptitude,Vision - Microscopic,Melting,Wind Control,Super Breath,Wallcrawling,Vision - Night,Vision - Infrared,Grim Reaping,Matter Absorption,The Force,Resurrection,Terrakinesis,Vision - Heat,Vitakinesis,Radar Sense,Qwardian Power Ring,Weather Control,Vision - X-Ray,Vision - Thermal,Web Creation,Reality Warping,Odin Force,Symbiote Costume,Speed Force,Phoenix Force,Molecular Dissipation,Vision - Cryo,Omnipresent,Omniscient
0,3-D Man,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,True,False,False,False,False,True,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
1,A-Bomb,False,True,False,False,False,True,False,False,False,False,False,False,False,False,False,True,False,True,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
2,Abe Sapien,True,True,False,False,True,True,False,False,False,False,True,True,True,False,False,True,True,True,False,True,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
3,Abin Sur,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
4,Abomination,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,True,False,False,False,False,False,False,False,True,False,False,False,False,True,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False


In [None]:
## merge powers with the superheroes data
superheroes = pd.merge(superheroes, powers, how="left", left_on="name", right_on="hero_names")

In [None]:
# sum up each hero's powers:
superheroes.replace(to_replace={False: 0, True: 1}, inplace=True)
superheroes["powers"] = superheroes.loc[:,"Agility":"Omniscient"].sum(axis=1).map(int)

In [None]:
superheroes["powers"].describe()

count    401.000000
mean       7.962594
std        7.355005
min        0.000000
25%        2.000000
50%        6.000000
75%       11.000000
max       49.000000
Name: powers, dtype: float64

In [None]:
# for this demo, we'll assume a superhero with more than 8 powers (i.e., above the mean of ~7.96) is a high performer
superheroes["is_high_performer"] = superheroes["powers"] > 8

In [None]:
superheroes["is_high_performer"].value_counts()

False    257
True     144
Name: is_high_performer, dtype: int64

In [None]:
# to keep the solve time short, we'll randomly select 17 superheroes
# (no random_state is set, so each run draws a different sample)
superheroes = superheroes.sample(17)
superheroes = superheroes.reset_index()

In [None]:
superheroes["is_female"].value_counts()

0    14
1     3
Name: is_female, dtype: int64

In [None]:
import math
MAX_TEAM_SIZE = 8

num_superheroes = superheroes.shape[0]
# smallest number of teams that can hold everyone without exceeding MAX_TEAM_SIZE
num_teams = math.ceil(num_superheroes / MAX_TEAM_SIZE)
all_superheroes = range(num_superheroes)
all_teams = range(num_teams)

In [None]:
print("There are %d superheroes." % num_superheroes)
print("We need at least %d teams of at most %d superheroes each." % (num_teams, MAX_TEAM_SIZE))

There are 17 superheroes.
We need at least 3 teams of at most 8 superheroes each.
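
Because the sample is random, it is worth checking that the conditions can be met at all before building a model. The sketch below reproduces the team-count arithmetic and applies a necessary (not sufficient) pre-check: each "at least one per team" condition needs at least as many matching superheroes as there are teams. The count of 3 females comes from the `value_counts()` output above; the other two counts are assumptions tallied by hand from the solution printed at the end of the notebook.

```python
import math

MAX_TEAM_SIZE = 8
num_superheroes = 17

# Fewest teams that can hold everyone when no team exceeds MAX_TEAM_SIZE.
num_teams = math.ceil(num_superheroes / MAX_TEAM_SIZE)

# Superheroes available for each "at least one per team" condition.
# 3 females is taken from the notebook output; 7 and 12 are hand-tallied assumptions.
available = {"female": 3, "high performer": 7, "non-human": 12}

# Necessary condition: each category needs at least one superhero per team.
shortfalls = {k: v for k, v in available.items() if v < num_teams}

print("teams needed:", num_teams)
print("categories with too few superheroes:", shortfalls or "none")
```

With exactly 3 females for 3 teams there is no slack: a sample with fewer females would make the model infeasible.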


In [None]:
print(superheroes.loc[:,["name","is_high_performer","Gender","Publisher"]])

               name  is_high_performer  Gender      Publisher
0     Black Goliath              False    Male  Marvel Comics
1        Hal Jordan               True    Male      DC Comics
2     Silver Surfer               True    Male  Marvel Comics
3             Storm              False  Female  Marvel Comics
4           Aquaman               True    Male      DC Comics
5       Mockingbird              False  Female  Marvel Comics
6           Box III              False       -  Marvel Comics
7   Howard the Duck              False    Male  Marvel Comics
8           Robin V              False    Male      DC Comics
9    Jack of Hearts               True    Male  Marvel Comics
10      Mr Immortal              False    Male  Marvel Comics
11      Shatterstar              False    Male  Marvel Comics
12             Hulk               True    Male  Marvel Comics
13      Doctor Fate               True    Male      DC Comics
14         Agent 13              False  Female  Marvel Comics
15      

### 2. Instantiate the model

In [None]:
from ortools.sat.python import cp_model
model = cp_model.CpModel()



### 3. Declare decision variables

Decision variables allow the solver to assign superheroes to teams. The solver will set the variables to `1` if a superhero is on a certain team. 

1. Let $t_{ij} = 1$ if superhero $j$ is on team $i$ and $t_{ij} = 0$ otherwise.
2. Let $m_{jk} = 1$ if superhero $j$ is on the same team as superhero $k$ and $m_{jk} = 0$ otherwise.
3. Let $s_{ijk} = 1$ if superhero $j$ and $k$ are on team $i$, and  $s_{ijk} = 0$ otherwise.

We add these variables using the `NewBoolVar` [method](https://developers.google.com/optimization/reference/python/sat/python/cp_model#newboolvar).
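
To get a feel for the model size, the following sketch counts how many Boolean variables each family contributes for 17 superheroes and 3 teams. It assumes, as the notebook code does, that pairs are unordered (only $j < k$ is stored).

```python
from math import comb

num_superheroes = 17
num_teams = 3

num_t = num_teams * num_superheroes   # t_ij: one per (team, superhero)
num_m = comb(num_superheroes, 2)      # m_jk: one per unordered pair j < k
num_s = num_m * num_teams             # s_ijk: one per (pair, team)

print(num_t, num_m, num_s, num_t + num_m + num_s)
```

The pairwise families dominate, growing quadratically with the number of superheroes.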



In [None]:
# decision variables

# superhero a is on team t if teams[(t, a)] = 1
teams = {}
for a in all_superheroes:
    for t in all_teams:
        teams[(t, a)] = model.NewBoolVar('team:%i superhero:%i' % (t, a))

team_members = {}
for a1 in range(num_superheroes - 1):
    for a2 in range(a1 + 1, num_superheroes):
        team_members[(a1, a2)] = model.NewBoolVar('superhero %i is teamed with superhero %i' % (a1, a2))

same_team = {}
for a1 in range(num_superheroes - 1):
    for a2 in range(a1 + 1, num_superheroes):
        for t in all_teams:
            same_team[(a1, a2, t)] = model.NewBoolVar(
                'superhero %i is teamed with superhero %i on team %i' % (a1, a2, t))

### 4. Add constraints

Now that we have the decision variables, we can add the constraints using the solver's `Add` [method](https://developers.google.com/optimization/reference/python/sat/python/cp_model#add). The solver will decide which of the decision variables to set to `1`, i.e., which superhero to assign to which team, and satisfy the constraints.





In [None]:
# set constants
MIN_TEAM_SIZE = 1
MIN_HIGH_PERFORMER = 1
MIN_FEMALE = 1
MIN_NONHUMAN = 1

Let $S$ be the set of superheroes and $T$ the set of teams.

1. Every superhero must be on exactly one team.
$$\forall j \in S, \sum_{i\in T} t_{ij} = 1$$

In [None]:
# each superhero is assigned to exactly one team
for a in all_superheroes:
    model.Add(sum(teams[(t, a)] for t in all_teams) == 1)

2. Teams can have at most 8 superheroes.
$$\forall i \in T, \sum_{j \in S} t_{ij} \le 8 $$


In [None]:
# teams can have at most MAX_TEAM_SIZE superheroes
for t in all_teams:
    model.Add(sum(teams[(t, a)] for a in all_superheroes) <= MAX_TEAM_SIZE)

3. Teams have at least one high performer.
$$\forall i \in T, \sum_{j \in S} I(j=\text{high performer}) \cdot t_{ij} \ge 1 $$

4. Teams have at least one female.
$$\forall i \in T, \sum_{j \in S} I(j=\text{female}) \cdot t_{ij} \ge 1 $$

5. Teams have at least one non-human superhero.
$$\forall i \in T, \sum_{j \in S} I(j=\text{non-human}) \cdot t_{ij} \ge 1 $$

In [None]:
for t in all_teams:
    # int(...) casts the pandas/NumPy flags to plain Python ints for the solver
    # each team has at least one high performer, i.e., all high performers can't be in the same team
    model.Add(sum(int(superheroes.loc[a, 'is_high_performer']) * teams[(t, a)] for a in all_superheroes) >= MIN_HIGH_PERFORMER)

    # each team has at least one female and one non-human superhero
    model.Add(sum(int(superheroes.loc[a, 'is_female']) * teams[(t, a)] for a in all_superheroes) >= MIN_FEMALE)
    model.Add(sum(int(superheroes.loc[a, 'is_non_human']) * teams[(t, a)] for a in all_superheroes) >= MIN_NONHUMAN)



In [None]:
# add one final constraint to link it all together
# Link team members with teams
for a1 in range(num_superheroes - 1):
    for a2 in range(a1 + 1, num_superheroes):
        for t in all_teams:
            # Link same_team and teams, i.e., one of the following is true
            model.AddBoolOr([
                teams[(t, a1)].Not(), teams[(t, a2)].Not(), same_team[(a1, a2, t)]
            ])

            # a1 and a2 being on team t means a1 is on team t and a2 is on team t
            model.AddImplication(same_team[(a1, a2, t)], teams[(t, a1)])
            model.AddImplication(same_team[(a1, a2, t)], teams[(t, a2)])

        # Link team_members and same_team.
        model.Add(sum(same_team[(a1, a2, t)] for t in all_teams) == team_members[(a1, a2)])
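
To see why these clauses link the variables correctly, the exhaustive check below enumerates every 0/1 combination for a single pair and team, and confirms that the three constraints allow exactly the combinations where `same_team` equals `teams[(t, a1)]` AND `teams[(t, a2)]`. It is a plain-Python sketch; `on_a1`, `on_a2`, and `same` are stand-in names for the solver variables.

```python
from itertools import product


def satisfies_link(on_a1, on_a2, same):
    """Mirror the three linking constraints for one (a1, a2, t) triple."""
    bool_or = (not on_a1) or (not on_a2) or same   # the AddBoolOr clause
    impl_1 = (not same) or on_a1                   # same implies on_a1
    impl_2 = (not same) or on_a2                   # same implies on_a2
    return bool_or and impl_1 and impl_2


allowed = [combo for combo in product([False, True], repeat=3) if satisfies_link(*combo)]

for on_a1, on_a2, same in allowed:
    print(on_a1, on_a2, same)
```

Since each superhero is on exactly one team, at most one `same_team[(a1, a2, t)]` can be 1 across all teams, so their sum is exactly the 0/1 value of `team_members[(a1, a2)]`.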



**Adding an objective function** 

Optionally, we can also add an objective function to maximize overall team synergy. In practice, the synergies between individuals are estimated or known before applying constraint optimization.

For this scenario, let's assume that when on the same team, superheroes from the same universe complement each other's powers so that the resulting output is twice their combined powers. On the other hand, when superheroes from different universes are teamed, they have trouble working together, so no productivity is gained.

Let $p_j$ represent the power of superhero $j$ and $u_j$ represent the comic universe of superhero $j$.

$$\text{synergy}(j,k) = \begin{cases} 
      2(p_j+p_k) & u_j = u_k \\
      0 & u_j \ne u_k
   \end{cases}
$$

We want to maximize the overall synergy across all teams, counting each pair of superheroes once:
$$ \sum_{j \in S}\sum_{k \in S,\, k > j}\text{synergy}(j,k) \cdot m_{jk} $$
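
As a worked example of this objective, the sketch below evaluates the synergy of one small, hypothetical team by hand. The team members and the `pair_synergy` helper are made up for illustration; the formula is the one defined above.

```python
from itertools import combinations

SYNERGY_FACTOR = 2

# Hypothetical team: (name, universe, number of powers)
team = [("A", "Marvel Comics", 10), ("B", "Marvel Comics", 7), ("C", "DC Comics", 5)]


def pair_synergy(universe1, universe2, power1, power2):
    """2 * (p_j + p_k) for same-universe pairs, 0 otherwise."""
    if universe1 == universe2:
        return SYNERGY_FACTOR * (power1 + power2)
    return 0


total = 0
# combinations() visits each unordered pair exactly once (j < k).
for (name1, u1, p1), (name2, u2, p2) in combinations(team, 2):
    value = pair_synergy(u1, u2, p1, p2)
    print(name1, name2, value)
    total += value

print("team synergy:", total)
```

Only the A-B pair shares a universe, so it alone contributes to the total.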

In [None]:
def synergy(universe1, universe2, power1, power2):
    synergy_factor = 2

    # superheroes of the same universe create synergy
    if universe1 == universe2:
        # int(...) gives the solver a plain Python integer coefficient
        return int(power1 + power2) * synergy_factor

    # superheroes from different universes gain nothing from being teamed
    return 0
    

In [None]:
# Objective
model.Maximize(
    sum(synergy(superheroes.loc[a1,'Publisher'], superheroes.loc[a2,'Publisher'],superheroes.loc[a1,'powers'],superheroes.loc[a2,'powers']) * team_members[a1, a2]
        for a1 in range(num_superheroes - 1) for a2 in range(a1 + 1, num_superheroes)))



### 5. Run the solver

We can now run the solver to find our teams!

In [None]:
# call the solver
solver = cp_model.CpSolver()
status = solver.Solve(model)

In [None]:
if status in (cp_model.OPTIMAL, cp_model.FEASIBLE):
    for t in all_teams:
        print('\nTeam %i' % t)
        for a in all_superheroes:
            if solver.Value(teams[(t, a)]):
                name = superheroes.loc[a, 'name']
                gender = superheroes.loc[a, 'Gender']
                race = superheroes.loc[a, 'Race']
                num_powers = superheroes.loc[a, 'powers']
                universe = superheroes.loc[a, 'Publisher']
                print("%s - %s - %s - %s Powers: %d" % (name, gender, race, universe, num_powers))

    print('Statistics')
    print('  - conflicts       : %i' % solver.NumConflicts())
    print('  - branches        : %i' % solver.NumBranches())
    print('  - wall time       : %f s' % solver.WallTime())
else:
    # the random sample may not contain enough females, non-humans, or high performers
    print('No solution found.')


Team 0
Black Goliath - Male - - - Marvel Comics Powers: 0
Mockingbird - Female - Human - Marvel Comics Powers: 7
Jack of Hearts - Male - Human - Marvel Comics Powers: 10

Team 1
Silver Surfer - Male - Alien - Marvel Comics Powers: 21
Storm - Female - Mutant - Marvel Comics Powers: 8
Box III - - - - - Marvel Comics Powers: 0
Howard the Duck - Male - - - Marvel Comics Powers: 0
Mr Immortal - Male - Mutant - Marvel Comics Powers: 3
Shatterstar - Male - - - Marvel Comics Powers: 6
Hulk - Male - Human / Radiation - Marvel Comics Powers: 18
Groot - Male - Flora Colossus - Marvel Comics Powers: 13

Team 2
Hal Jordan - Male - Human - DC Comics Powers: 15
Aquaman - Male - Atlantean - DC Comics Powers: 24
Robin V - Male - Human - DC Comics Powers: 5
Doctor Fate - Male - Human - DC Comics Powers: 22
Agent 13 - Female - - - Marvel Comics Powers: 0
Metamorpho - Male - - - DC Comics Powers: 4
Statistics
  - conflicts       : 4209
  - branches        : 5751
  - wall time       : 0.743992 s
  - solut

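As we can see, the solver made the teams nearly homogeneous in terms of comic book universe to maximize productivity: Teams 0 and 1 are all Marvel, and Team 2 is all DC except for Agent 13. This makes sense, since we specified that only same-universe pairs generate synergy, while every team still needs at least one female and the sample contains just three.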
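
As a sanity check on this reading, the sketch below recomputes the synergy objective from the team listing printed above. The tuples are transcribed by hand from that output, and `team_synergy` is a hypothetical helper that mirrors the notebook's `synergy` function.

```python
from itertools import combinations

# (publisher, powers) per member, transcribed from the printed solution.
teams = {
    0: [("Marvel Comics", 0), ("Marvel Comics", 7), ("Marvel Comics", 10)],
    1: [("Marvel Comics", 21), ("Marvel Comics", 8), ("Marvel Comics", 0),
        ("Marvel Comics", 0), ("Marvel Comics", 3), ("Marvel Comics", 6),
        ("Marvel Comics", 18), ("Marvel Comics", 13)],
    2: [("DC Comics", 15), ("DC Comics", 24), ("DC Comics", 5),
        ("DC Comics", 22), ("Marvel Comics", 0), ("DC Comics", 4)],
}


def team_synergy(members):
    """Sum 2 * (p_j + p_k) over same-publisher pairs; mixed pairs add 0."""
    return sum(2 * (p1 + p2)
               for (u1, p1), (u2, p2) in combinations(members, 2) if u1 == u2)


per_team = {t: team_synergy(members) for t, members in teams.items()}
# Count pairs of teammates that come from different publishers.
mixed_pairs = sum(1 for members in teams.values()
                  for (u1, _), (u2, _) in combinations(members, 2) if u1 != u2)

print(per_team, "total:", sum(per_team.values()))
print("mixed-universe pairs:", mixed_pairs)
```

The only mixed-universe pairs are those involving Agent 13 on Team 2, and since she has 0 powers listed, placing her there costs little potential synergy.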

## Summary

We saw how to create teams to potentially maximize productivity (that is, if all superheroes put their egos aside and work cooperatively). Although this was a fun example, the same approach can be applied to sports, business, and even a [wedding](https://www.improbable.com/2012/02/12/finding-an-optimal-seating-chart-for-a-wedding/).