# Digimon Database - Team composition

Source: https://www.kaggle.com/rtatman/digidb

Problem:

Team composition is a complex problem that involves variables such as gameplay, champion statistics, and teammate interactions. Neural-network approaches to finding the best team composition can be found in the literature, for example for League of Legends, where abundant information is collected on teammates' gameplay and on victories and losses. However, those algorithms are complex to write and analyze, and they require a lot of data collection. It seems that the best team is, among other things, a well-balanced one. This will be the starting point of the reasoning.

Source:
- Lincoln Costa. Towards automated team composition in MOBA games based on players' personality: an intelligent approach. Conference paper.
- Ong, Hao Yi; Deolalikar, Sunil; Peng, Mark (2015). Player Behavior and Optimal Team Composition for Online Multiplayer Games.

The objective is to find the best team composition by selecting 3 Digimons from the DigiDB_digimonlist dataset.
For the sake of simplicity, we consider here that the best team is a balanced team composed of:
- A Digimon with high physical damage stats
- A Digimon with high magical damage stats
- A Digimon with high defense stats


This hypothesis is too simple for complex gameplay, but it is a good start for champion selection based on statistics. This can be treated as an <b>optimization problem</b> known as the <b>assignment</b> problem: assign N Digimons to M different tasks. Here, we have 249 Digimons for 3 different tasks.
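To make the assignment idea concrete, here is a minimal brute-force sketch on made-up numbers. The four candidates `A` to `D`, their stats, and the helper `best_team` are all hypothetical and do not come from the dataset. It maximizes the raw sum of stats, whereas the notebook below minimizes a sum of reciprocal costs; the point is only to show why the roles cannot always be filled independently.

```python
from itertools import permutations

# Hypothetical toy stats (not taken from the dataset): name -> (Atk, Int, Def)
stats = {
    "A": (90, 80, 40),
    "B": (85, 30, 35),
    "C": (40, 50, 70),
    "D": (30, 45, 60),
}
roles = ("Atk", "Int", "Def")


def best_team(stats, roles):
    """Try every ordered pick of len(roles) distinct candidates and keep
    the one with the highest total of role-relevant stats."""
    best_score, best_pick = float("-inf"), None
    for pick in permutations(stats, len(roles)):
        # pick[k] fills roles[k], so we read the k-th stat of that candidate
        score = sum(stats[name][k] for k, name in enumerate(pick))
        if score > best_score:
            best_score, best_pick = score, pick
    return dict(zip(roles, best_pick)), best_score


team, score = best_team(stats, roles)
print(team, score)
```

In this toy case `A` has both the highest Atk and the highest Int but can fill only one role, so the search places it on Int and gives Atk to `B`. Brute force is fine for a handful of candidates; a dedicated solver is preferable for larger pools.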

In the <b>first part</b> of the project, we design a simple model that assigns the best candidates according to our criteria. The suggested solution is verified against a simple manual analysis. From the previous questions, we know that the best team according to that simple metric is: Chaosmon (318 Atk), GroundLocomon (213 Def), and Lucemon SM or Barbamon (233 Int). The model results agree with the manual analysis.

The <b>second part</b> of the project performs a deeper analysis by adding more complex metrics and filtering the Digimons. First, we exclude every Digimon that has any statistic at or below the average over all Digimons. This selection leaves us with 19 Digimons. Second, new metrics are designed to take more combinations of variables into account. The best team is now composed of:
- A Digimon with high physical damage and high speed stats (Atk * Spd = TotDam)
- A Digimon with the best ratio of magical damage to SP (Int / SP = MtoSP)
- A Digimon with high defense and HP stats (HP + Def = Tank)

With those hypotheses, the best team is: Leopardmon LM (TotDam), Alphamon (MtoSP), Imperialdramon FM (Tank)

In <b>conclusion</b>, we succeeded in composing a team that respects our hypotheses. The model can easily be modified to take into account more features, more champions, or a team of a different size or with different characteristics.

The model is still quite simple and does not consider many gameplay variables. These specificities are hard to capture and to understand completely, and would require much more work, more data, and a more complex model. However, there is room for small improvements and further analysis of the results by exploiting the data already present in the dataset.

Some suggested improvements are:
- Updated filtering methods: the notion of cost becomes more important when there are few Digimons to select from, which can be achieved with stricter filtering.
- New metrics for candidate selection: Memory and Equip Slots could be included as a penalty in the cost.
- Include the Attribute and Type in the algorithm, to ensure team balance.
- Add the supportlist and movelist to the analysis by assigning each move a score that could be added to the appropriate category of Digimon. For example, a special power increasing the damage could be added as a +5 in the 'TotDam' stat.
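One possible way to fold Memory into the cost, sketched below, is to add a weighted penalty to the reciprocal of the normalized stat. The function `penalized_cost`, the `weight` parameter and its default of 0.5 are hypothetical choices for illustration, not part of the analysis. The example call uses the Atk and Memory values of Imperialdramon FM from the filtered table further down, the maximum Atk of 318 quoted above, and assumes a maximum Memory of 25.

```python
def penalized_cost(stat, stat_max, memory, memory_max, weight=0.5):
    """Reciprocal of the normalized stat plus a weighted Memory penalty.

    `weight` is a hypothetical tuning knob, not a value from the analysis.
    """
    if stat <= 0:
        raise ValueError("stat must be positive to take its reciprocal")
    base = stat_max / stat                   # same as 1 / (stat / stat_max)
    penalty = weight * memory / memory_max   # vanishes when weight is 0
    return base + penalty


# Imperialdramon FM: Lv50 Atk 198, Memory 20; memory_max=25 is an assumption
example = penalized_cost(198, 318, 20, 25)
no_penalty = penalized_cost(198, 318, 20, 25, weight=0.0)
print(round(example, 4), round(no_penalty, 4))
```

With `weight=0` the function falls back to the plain reciprocal cost used in the notebook, so the penalty can be introduced gradually.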



## Design a simple model

### Data pre-processing
Select the required data from the dataset.

In [2]:
import pandas as pd
import numpy as np

digimonList = pd.read_csv('DigiDB_digimonlist.csv')
selectedcolumns=digimonList[['Lv50 Atk','Lv50 Int','Lv50 Def']]
matrix_filtered=selectedcolumns.copy()
matrix_filtered.head()

Unnamed: 0,Lv50 Atk,Lv50 Int,Lv50 Def
0,79,68,69
1,76,69,76
2,97,50,87
3,77,76,95
4,54,95,59


Data normalization (value/max)

In [3]:
def normalize(df):
    result = df.copy()
    for feature_name in df.columns:
        max_value = df[feature_name].max()
        result[feature_name] = df[feature_name] / max_value
    return result

matrix_norm=normalize(matrix_filtered)

# Sanity check - Highest attack at index 243
matrix_norm.sort_values(by=['Lv50 Atk'], ascending=False).head(3)


Unnamed: 0,Lv50 Atk,Lv50 Int,Lv50 Def
243,1.0,0.381974,0.441315
235,0.77673,0.467811,0.788732
195,0.764151,0.339056,0.488263


Convert to a NumPy array and convert stats into costs (cost = 1 / normalized stat)

In [4]:
matrix_np=matrix_norm.copy().to_numpy()

# The algorithm minimizes a cost, so each stat is inverted: a high stat corresponds to a small cost.
def ConvertToCost(np_matrix):
    # Element-wise reciprocal; assumes every stat is strictly positive (no zeros).
    return 1.0 / np_matrix

matrix_np2=ConvertToCost(matrix_np)

# Sanity check - index 243 has the lowest Atk cost (1.0)
result = np.where(matrix_np2[:,0] == np.amin(matrix_np2[:,0]))
print('Index of minimum element :', result[0])

Index of minimum element : [243]
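As a worked example of the two steps above (value/max normalization, then reciprocal cost), the sketch below uses the first row printed by `matrix_filtered.head()` and the column maxima quoted in the introduction (318 Atk, 233 Int, 213 Def). The dictionary names are illustrative only.

```python
# Column maxima quoted in the introduction: 318 Atk, 233 Int, 213 Def.
maxima = {"Lv50 Atk": 318, "Lv50 Int": 233, "Lv50 Def": 213}

# First row of the table printed by matrix_filtered.head().
row = {"Lv50 Atk": 79, "Lv50 Int": 68, "Lv50 Def": 69}

# Step 1: value / max, so every normalized stat lies in (0, 1]
normalized = {k: v / maxima[k] for k, v in row.items()}

# Step 2: reciprocal, so the best possible stat (1.0) has the lowest cost (1.0)
cost = {k: 1.0 / v for k, v in normalized.items()}

for k in row:
    print(f"{k}: normalized={normalized[k]:.4f}, cost={cost[k]:.4f}")
```

A Digimon holding the column maximum gets a cost of exactly 1, and the cost grows without bound as a stat approaches zero, so weak stats are penalized much more strongly than a linear transformation such as `1 - value` would.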


### Assignment problem

In [5]:
from scipy.optimize import linear_sum_assignment

cost=matrix_np2.transpose()
row_ind, col_ind = linear_sum_assignment(cost)

composants=['Lv50 Atk','Lv50 Int','Lv50 Def']
for i in range(len(col_ind)):
    print(composants[i] + ': '  +  digimonList['Digimon'][col_ind[i]])

Lv50 Atk: Chaosmon
Lv50 Int: Barbamon
Lv50 Def: GroundLocomon


The best team is composed of the expected champions. The result is unsurprising: with 249 Digimons for only three roles and a single-stat metric per role, the top candidate for each role is a different Digimon, so no conflict has to be resolved.
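The sketch below illustrates this on a hypothetical 3 x 5 cost matrix (the numbers are made up). When the cheapest candidate of each role is distinct, `linear_sum_assignment` returns the same picks as a per-role `argmin`.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Hypothetical cost matrix: 3 roles (rows) x 5 candidates (columns).
cost = np.array([
    [1.0, 2.5, 3.0, 4.0, 2.0],   # role 0
    [3.0, 1.2, 2.0, 5.0, 2.2],   # role 1
    [4.0, 3.0, 2.8, 1.1, 3.5],   # role 2
])

# Optimal assignment; rectangular matrices are accepted
row_ind, col_ind = linear_sum_assignment(cost)

# Cheapest candidate for each role, ignoring possible conflicts
greedy = cost.argmin(axis=1)

print(col_ind, greedy)
```

The two results differ only when one candidate is the cheapest for several roles, which is the situation the solver is needed for.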

A deeper analysis can be performed by filtering the candidates and using richer metrics.

## Digging deeper

- Filtering of the champions

### Data pre-processing

In [6]:
# Filtering
digimonList2 = digimonList.copy()
print(digimonList2.shape)
carac=['Lv 50 HP','Lv50 SP','Lv50 Atk','Lv50 Def','Lv50 Int','Lv50 Spd']
for item in carac:
    index_names = digimonList2[(digimonList2[item] <= digimonList[item].mean()) ].index #Warning, mean of the unfiltered list
    digimonList2.drop(index_names, inplace = True)

print(digimonList2.shape)
digimonList2.reset_index(inplace=True) # Reassign index

digimonList2.head()

(249, 13)
(19, 13)


Unnamed: 0,index,Number,Digimon,Stage,Type,Attribute,Memory,Equip Slots,Lv 50 HP,Lv50 SP,Lv50 Atk,Lv50 Def,Lv50 Int,Lv50 Spd
0,166,167,Alphamon,Mega,Vaccine,Neutral,22,1,1390,128,158,183,158,130
1,169,170,Imperialdramon DM,Mega,Free,Fire,20,2,1730,143,139,139,139,148
2,170,171,Imperialdramon FM,Mega,Free,Neutral,20,2,1780,114,198,124,114,153
3,174,175,Examon,Mega,Data,Wind,22,1,1630,148,174,129,129,153
4,177,178,ChaosGallantmon,Mega,Virus,Dark,22,1,1340,139,178,139,163,144


The filtering leaves us with a pool of 19 champions. Next, we select the appropriate features.


In [123]:
# Select the appropriate features
temp=digimonList2[['Lv 50 HP','Lv50 Atk','Lv50 Int','Lv50 Def','Lv50 SP','Lv50 Spd']].copy()
temp['TotDam'] = (temp['Lv50 Atk'] * temp['Lv50 Spd'])
temp['MtoSP'] = round((temp['Lv50 Int'] / temp['Lv50 SP']),2)
temp['Tank'] = (temp['Lv50 Def'] + temp['Lv 50 HP'])

matrix2_filtered=temp[['TotDam','MtoSP','Tank']].copy()
matrix2_filtered.head()

Unnamed: 0,TotDam,MtoSP,Tank
0,20540,1.23,1573
1,20572,0.97,1869
2,30294,1.0,1904
3,26622,0.87,1759
4,25632,1.17,1479
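As a quick check of the three metrics, the first row of this output can be recomputed by hand from Alphamon's stats in the filtered table above. The variable names are illustrative only.

```python
# Stats of the first filtered row (Alphamon), copied from the filtered table.
alphamon = {"Lv 50 HP": 1390, "Lv50 SP": 128, "Lv50 Atk": 158,
            "Lv50 Def": 183, "Lv50 Int": 158, "Lv50 Spd": 130}

tot_dam = alphamon["Lv50 Atk"] * alphamon["Lv50 Spd"]            # Atk * Spd
m_to_sp = round(alphamon["Lv50 Int"] / alphamon["Lv50 SP"], 2)   # Int / SP
tank = alphamon["Lv50 Def"] + alphamon["Lv 50 HP"]               # Def + HP

print(tot_dam, m_to_sp, tank)  # 20540 1.23 1573
```

The three metrics live on very different scales (tens of thousands, about 1, and thousands), which is why the normalization step is applied again before converting to costs.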


In [102]:
# Pre-process the data
matrix2_norm=normalize(matrix2_filtered) # Normalization
matrix2_np=matrix2_norm.copy().to_numpy() # Convert to NumPy
matrix2_np2=ConvertToCost(matrix2_np) # Convert to cost


Optimization

In [117]:
cost2=matrix2_np2.transpose()
row_ind, c_i = linear_sum_assignment(cost2)
#print(c_i)
#print(cost2[row_ind, c_i].sum())  # total cost of the selected team


With those hypotheses, the best team is:


In [116]:
composants=['TotDam','MtoSP','Tank']
for i in range(len(c_i)):
    print(composants[i] + ': '  +  digimonList2['Digimon'][c_i[i]])

TotDam: Leopardmon LM
MtoSP: Alphamon
Tank: Imperialdramon FM


In [114]:
digimonList2.iloc[c_i]

Unnamed: 0,index,Number,Digimon,Stage,Type,Attribute,Memory,Equip Slots,Lv 50 HP,Lv50 SP,Lv50 Atk,Lv50 Def,Lv50 Int,Lv50 Spd
8,200,201,Leopardmon LM,Mega,Data,Earth,25,1,1290,153,159,129,139,218
0,166,167,Alphamon,Mega,Vaccine,Neutral,22,1,1390,128,158,183,158,130
2,170,171,Imperialdramon FM,Mega,Free,Neutral,20,2,1780,114,198,124,114,153
