# Team Ranker

So here I've created a little team ranker based on data for the top 30 ranked Sinister Cup mons on PvPoke.com, combined with the usage-frequency statistics from silph.gg. All of this is done in Julia, but hopefully these little notes clarify what the ranker is doing. I recommend running everything first and then reading through this. Some sections take a little while to run.

## Getting Started

### Installing Packages

Here I'm just grabbing some of the packages I need. I'll be reading from CSVs into DataFrames and making plots and using random distributions for uncertainty. Basically, I'm just grabbing some tools that make my life easier programming. Oh and all of the plots will be interactive, hence my use of Plotly.

In [3]:
using CSV, Plots, Distributions, DataFrames
plotly();

### Reading Data

I've got my CSV with PvPoke data that I'm reading from. You can find that here.

In [4]:
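# On recent CSV.jl releases, pass the sink explicitly instead:
#   CSV.read("SinisterTop30.csv", DataFrame; delim=',')
rankings = CSV.read("SinisterTop30.csv"; delim=',');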

### Colors

Just defining some colors so that things look pretty. I try to use the non-cup typing when coloring stats from different mons, but mons like Medicham, Gallade, and Bronzong are difficult, so I arbitrarily gave them the color of one of their types. Also, these colors are a little translucent so that I can overlap plots later.

In [5]:
opacity = 0.7
fire     = RGBA(254/255, 163/255,  84/255, opacity)
ground   = RGBA(212/255, 141/255,  91/255, opacity)
poison   = RGBA(193/255,  98/255, 212/255, opacity)
rock     = RGBA(208/255, 196/255, 142/255, opacity)
ghost    = RGBA( 89/255, 107/255, 181/255, opacity)
psychic  = RGBA(245/255, 126/255, 121/255, opacity)
ice      = RGBA(120/255, 212/255, 192/255, opacity)
water    = RGBA( 86/255, 158/255, 222/255, opacity)
fighting = RGBA(213/255,  63/255,  91/255, opacity)
steel    = RGBA( 82/255, 142/255, 160/255, opacity)
fairy    = RGBA(240/255, 152/255, 228/255, opacity)
flying   = RGBA(148/255, 171/255, 225/255, opacity)
bug      = RGBA(158/255, 195/255,  49/255, opacity)
electric = RGBA(246/255, 215/255,  75/255, opacity);

### Setting Up the Tables

I'm grabbing just the data I need and putting it in a constant (for SPEED) matrix, and also defining a teamBattles matrix that will be the data it outputs.

In [6]:
# clean up ranking data for simulation
# Defining as constant for SPEED
const ranks = [rankings[i,j] for i = 1:30, j = 2:4:120];
teamBattles = zeros(12180, 12180);

### Grabbing Data From Silph.gg

Here we're grabbing the data from the Sinister Cup API on Silph.gg. This is an amazing resource that we'll use to determine meta relevance. But do not run this section more than ten times in ten minutes. You definitely don't need to, as it's updated hourly, so please respect the API limits.

### Team Numbers

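Here I'm mapping each three-mon team to a number, so that every team has its own unique number from 1 to 12180 ($\frac{30 \cdot 29 \cdot 28}{2}$). The lead matters, but the order of the other two mons does not, which is where the division by 2 comes from.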

In [7]:
teamNumberVar = zeros(30, 30, 30)
i = 1
for mon1 = 1:30 , mon2 = 1:29 , mon3 = (mon2 + 1):30
    if mon1 != mon2 && mon1 != mon3
        teamNumberVar[mon1, mon2, mon3] = Int(i)
        i += 1
    end
end
const teamNumber = teamNumberVar;

In [8]:
teamNumber[28, 1, 6]

10967.0
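
As a sanity check on the numbering, here is a small self-contained sketch that rebuilds the same table with an integer array and counts the teams. The name `build_team_index` is mine and not part of the notebook; the loop order is copied from the cell above, so the numbers should line up with `teamNumber`.

```julia
# Enumerate (lead, second, third) teams in the same order as the loop above:
# any lead, plus an unordered pair of two other mons.
function build_team_index(n)
    index = zeros(Int, n, n, n)   # 0 marks "not a valid team"
    k = 0
    for mon1 in 1:n, mon2 in 1:(n - 1), mon3 in (mon2 + 1):n
        if mon1 != mon2 && mon1 != mon3
            k += 1
            index[mon1, mon2, mon3] = k
        end
    end
    return index, k
end

team_index, n_teams = build_team_index(30)
println(n_teams)                 # total number of teams
println(team_index[28, 1, 6])    # same team as the lookup above
```

The count works out to 30 leads times `binomial(29, 2)` back-line pairs, and the entry for `[28, 1, 6]` should match the 10967 shown above.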

## The Model

### Assumptions

So, we all know assumptions are bad. But in data science, sometimes our models need to be simplified so we can work with the information that we have in a reasonable time scale. Therefore, I have made some simplifying assumptions for the model of PvP battles, and will add some uncertainty to account for some of the differences between this model and the reality of PvP battling.

*   **No Switching:** Switching is a weird mechanic. Its timing is never consistent, when and whether you should use it in scenario X is hotly debated, and then your opponent can switch too, which adds a lot of variability. As will be a common motivator for these assumptions, having the home team switch perfectly would require knowledge of the opposing team that you don't have in that situation. All of this to say: in this model nobody switches, because it makes everything work better, makes my life easier, and doesn't run the risk of having the model make better decisions than any player could be expected to.

*   **Each Mon Gets One Shield:** I know, I know. Doesn't that add up to three shields? Well, yeah. But again, shielding choices add variability, and perfect use requires knowledge of the opposing team that you don't have in that situation. Plus, this way only one data set is needed: the 1-1 shield matchups from PvPoke, which I believe also implicitly have the shield used on the first charged move.

*   **Players Play Perfectly Otherwise:** Whoa. After all that avoidance of perfection, now I want my model to be perfect? Well, for one, this is just based on the assumptions in PvPoke. For two, I stand by that decision, as it's perfection that's achievable with the knowledge a player has in a particular situation. You may not know when to shield or switch, but you do probably know that you want an excellent charged move and to tap out fast moves (also I'm assuming everyone is on 1.57 or higher, because I do not want to deal with under- or overtapping). Also, this model assumes the ideal moveset for each mon. So if you don't have Shadow Ball for your Haunter, the Haunter results aren't for you.

*   **Mons Appear According to Silph Distribution:** This may be different for your community. Not many people near you have a Bastiodon? This might not be the right distribution for you. I do believe this is far better than my original assumption that the top 30 mons appear uniformly frequently. Also, the API gives results for individual mons, so I'm assuming that the probability of a team appearing is the probability of mon 1 times mon 2 times mon 3. Is this true? Not necessarily, as combinations of certain mons are more or less used. But it's an approximation.

*   **Score Above 1500 is a Win:** This is based on the PvPoke battle score, and since there are three battles, it's out of 3000 instead of 1000. Scoring is explained more below, but this is the assumption of what we do with that score.

*   **Only Top 30 Mons are Considered:** Sorry, Spoink fans. This is to keep the amount of data being processed reasonable.

None of these assumptions are set in stone. In fact, if you have a way to change them and think that that's more useful to you, 1) go ahead and 2) let me know how you did it. 
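
To make the Silph-distribution assumption concrete, here is a minimal sketch of how a team's weight could be built from individual usage numbers and then used to weight wins. Everything in it is illustrative: the usage fractions, the opposing teams, and the scores are made up, and `team_weight` is a hypothetical helper, not something defined in this notebook.

```julia
# Made-up usage fractions for four mons; NOT real Silph.gg numbers.
usage = [0.40, 0.25, 0.20, 0.15]

# Independence assumption: a team's weight is the product of its members' usage.
team_weight(team) = prod(usage[m] for m in team)

# Toy scores of one home team against three opposing teams (also made up).
opponents = [(1, 2, 3), (1, 2, 4), (2, 3, 4)]
scores    = [1620.0, 1480.0, 1710.0]

weights = [team_weight(t) for t in opponents]

# Share of the (weighted) meta that the home team beats, using the 1500 cutoff.
weighted_win_rate = sum(weights[scores .> 1500]) / sum(weights)
println(weighted_win_rate)
```

With a uniform distribution every weight would be equal and this would reduce to the plain fraction of wins.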

### Scoring

I know, this section is already too long and you just want to see the data. But here's what the data means. Each PvPoke battle is given a score from 0 to 1000, with 500 representing a tie, anything above that a win, and anything below a loss. The score for the three-mon battle is the sum of three such scores, but which three depends on the matchups you see. The lead Pokémon always face each other. If your lead wins, you get the more favorable pairing of your remaining two mons against theirs (because your opponent has to send in a Pokémon first and you can counter it); if your lead loses, you get the less favorable pairing; if the leads tie, the code below averages the two pairings. And, as stated above, over 1500 is a win.

In [9]:
function individual_battle_verbose(home1,home2,home3,away1,away2,away3,rankings)

    # Home score in each 1v1 is 1000 minus the table entry for that pairing
                 score  = 1000 - rankings[home1, away1]   # lead vs. lead
    secondBattleResult1 = 1000 - rankings[home2, away2]
    secondBattleResult2 = 1000 - rankings[home2, away3]
    thirdBattleResult1  = 1000 - rankings[home3, away2]
    thirdBattleResult2  = 1000 - rankings[home3, away3]

    if score > 500
        # Lead won: home gets the better of the two back-line pairings
        score += max(secondBattleResult2 + thirdBattleResult1, secondBattleResult1 + thirdBattleResult2)
    elseif score < 500
        # Lead lost: home gets the worse pairing
        score += min(secondBattleResult2 + thirdBattleResult1, secondBattleResult1 + thirdBattleResult2)
    else
        # Lead tied: average the two pairings
        score += (secondBattleResult2 + thirdBattleResult1 + secondBattleResult1 + thirdBattleResult2)/2.0
    end

    return score

end;

In [10]:
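function individual_battle(home1,home2,home3,away1,away2,away3,rankings)

    # Same scoring as individual_battle_verbose, but every 1v1 result gets
    # Gaussian noise with a standard deviation of 50 points
                 score  = 1000 - rankings[home1, away1] - 50 * randn()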
    secondBattleResult1 = 1000 - rankings[home2, away2] - 50 * randn()
    secondBattleResult2 = 1000 - rankings[home2, away3] - 50 * randn() 
    thirdBattleResult1  = 1000 - rankings[home3, away2] - 50 * randn()
    thirdBattleResult2  = 1000 - rankings[home3, away3] - 50 * randn()

    if score > 500 
        score += max(secondBattleResult2 + thirdBattleResult1, secondBattleResult1 + thirdBattleResult2) 
    elseif score < 500 
        score += min(secondBattleResult2 + thirdBattleResult1, secondBattleResult1 + thirdBattleResult2) 
    else
        score += (secondBattleResult2 + thirdBattleResult1 + secondBattleResult1 + thirdBattleResult2)/2.0 
    end 
    
    return score
    
end;
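
To check the scoring rule by hand, here is a compact restatement on a made-up 3×3 table. `toy_score` and the matrix `m` are hypothetical and only meant for illustration; the indexing convention (home score is 1000 minus the entry at `[home, away]`) is copied from the functions above.

```julia
# Compact restatement of the scoring rule for a hand-checkable example.
function toy_score(m, home, away)
    s(h, a) = 1000 - m[home[h], away[a]]
    lead     = s(1, 1)
    straight = s(2, 2) + s(3, 3)   # home2 vs away2, home3 vs away3
    crossed  = s(2, 3) + s(3, 2)   # home2 vs away3, home3 vs away2
    if lead > 500
        return lead + max(straight, crossed)
    elseif lead < 500
        return lead + min(straight, crossed)
    else
        return lead + (straight + crossed) / 2
    end
end

# Made-up table: rows are home mons, columns are away mons.
m = [400   0   0;
       0 300 600;
       0 700 450]

won_lead = toy_score(m, (1, 2, 3), (1, 2, 3))   # lead scores 600 -> 600 + 1250 = 1850

m_lost = copy(m); m_lost[1, 1] = 600
lost_lead = toy_score(m_lost, (1, 2, 3), (1, 2, 3))   # lead scores 400 -> 400 + 700 = 1100

m_tie = copy(m); m_tie[1, 1] = 500
tied_lead = toy_score(m_tie, (1, 2, 3), (1, 2, 3))    # 500 + (1250 + 700) / 2 = 1475.0
```

The back line is identical in all three cases; only the lead result decides whether the home team gets the 1250-point pairing, the 700-point pairing, or their average.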

### Testing

The timing stuff is mostly for me, to make sure things run in a reasonable time. But it's here in case you're curious.

In [11]:
# Run once to compile, run again to test speed (with different input so no cheating)
individual_battle(1,2,3,4,5,6,ranks)
@time score = individual_battle(1,2,7,4,5,6,ranks)
# Run once to compile, run again to test speed (with different input so no cheating)
individual_battle_verbose(1,2,3,4,5,6,ranks)
@time score = individual_battle_verbose(1,2,7,4,5,6,ranks)

  0.000004 seconds (6 allocations: 224 bytes)
  0.000002 seconds (5 allocations: 176 bytes)


1099

### Testing a Team

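This function runs one home team against every possible opposing team. There are two versions: the verbose one stores every individual score in `teamBattles`, while the other only accumulates the total score and the number of wins.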

In [12]:
function run_away_teams_verbose(home1,home2,home3,ranks)

    for away1 = 1:30 , away2 = 1:29 , away3 = (away2 + 1):30

        if away1 != away2 && away1 != away3

            # Do the battle!
            # Use the function we wrote above
            score = individual_battle_verbose(home1,home2,home3,away1,away2,away3,ranks)
            teamBattles[Int(teamNumber[home1,home2,home3]), Int(teamNumber[away1,away2,away3])] = score
        end
    end
end;

In [13]:
function run_away_teams(home1,home2,home3,ranks)
    avgScore = 0.0
    winRatio = 0.0
    
    for away1 = 1:30 , away2 = 1:29 , away3 = (away2 + 1):30

        if away1 != away2 && away1 != away3

            # Do the battle!
            # Use the noisy version so that repeated runs differ and can be
            # used to estimate the uncertainty
            score = individual_battle(home1,home2,home3,away1,away2,away3,ranks)
            
            avgScore += score
            

            if score > 1500 
                winRatio += 1 
            end 
        end
    end
    return avgScore, winRatio
end;

### Note on Uncertainty

Note that the non-verbose function runs significantly faster (see the timings below). We will use it to estimate the uncertainty on the mean (average) score and the number of wins, as we can run it hundreds of times in a reasonable time frame.

In [14]:
# Run once to compile, run again to test speed (with different input so no cheating)
run_away_teams(1,2,3,ranks)
@time run_away_teams(1,2,4,ranks)
# Run once to compile, run again to test speed (with different input so no cheating)
run_away_teams_verbose(1,2,5,ranks)
@time run_away_teams_verbose(1,2,6,ranks)

  0.000088 seconds (5 allocations: 192 bytes)
  0.000897 seconds (23.85 k allocations: 372.797 KiB)


In [15]:
function run_home_teams_verbose(ranks)
    
    for home1 = 1:30 , home2 = 1:29 , home3 = (home2 + 1):30 
        if home1 != home2 && home1 != home3
            run_away_teams_verbose(home1,home2,home3,ranks)
        end 
    end 
end;

In [16]:
function run_home_teams(ranks, n)
    
    # Store home team lineup as ordered string (small vector would probably be better)
    teamLineup = Array{String}(undef, 12180) 
    winRatio   = zeros(12180, n)
    avgScore   = zeros(12180, n)
    
    for i = 1:n
        count = 1
        for home1 = 1:30 , home2 = 1:29 , home3 = (home2 + 1):30 

            if home1 != home2 && home1 != home3 

                teamLineup[count] = string(home1,",",home2,",",home3)

                avgScore[count, i], winRatio[count, i] = run_away_teams(home1,home2,home3,ranks)

                count += 1

            end 
        end 
    end
    
    return teamLineup, winRatio, avgScore
    
end;

In [17]:
# Run once to compile, run again to test speed 
# run_home_teams(ranks)
@time run_home_teams_verbose(ranks);

 12.869704 seconds (432.55 M allocations: 6.447 GiB, 5.51% gc time)


In [18]:
# Run once to compile, run again to test speed 
run_home_teams(ranks, 1)
@time lineup, winRatio, avgScore = run_home_teams(ranks, 100);

119.059472 seconds (15.84 M allocations: 743.686 MiB, 0.10% gc time)
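
The matrices that come back have one row per team and one column per run. A minimal sketch of how they could be boiled down to a mean and an uncertainty per team, on a tiny made-up matrix instead of the real 12180×100 one (the names `runs`, `team_mean`, `team_std`, and `team_sem` are mine):

```julia
using Statistics

# Toy stand-in for a (teams × runs) result matrix: 2 teams, 3 runs.
runs = [10.0 12.0 14.0;
        20.0 20.0 20.0]

team_mean = vec(mean(runs, dims = 2))         # average over runs, per team
team_std  = vec(std(runs, dims = 2))          # run-to-run spread, per team
team_sem  = team_std ./ sqrt(size(runs, 2))   # standard error of the mean

println(team_mean)
println(team_std)
println(team_sem)
```

A team whose runs all agree (the second row here) gets a spread of zero, so a nonzero spread only shows up when the individual battles include random noise.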


In [19]:
teamBattles

12180×12180 Array{Float64,2}:
 1564.5  1304.5  1418.5  1478.5  1681.0  …  2196.0  2170.0  2046.0  1994.0
 1761.0  1500.5  1575.0  1669.0  1800.0     2118.0  2115.0  1992.0  2104.0
 1558.0  1426.5  1500.5  1541.0  1676.5     2041.0  2016.0  1983.0  2095.0
 1587.0  1332.5  1445.0  1540.0  1574.5     1676.0  2111.0  2055.0  2167.0
 1396.5  1282.5  1374.5  1439.0  1401.0     2023.0  2094.0  1970.0  1915.0
 1415.0  1217.0  1249.5  1423.0  1494.5  …  2103.0  1899.0  1803.0  1915.0
 1507.0  1606.0  1503.0  1747.0  1501.0     2123.0  2266.0  1993.0  2254.0
 1757.0  1609.0  1587.5  1946.5  1762.5     1899.0  1803.0  2007.0  2119.0
 1572.5  1363.5  1534.0  1501.5  1644.0     1864.0  1996.0  1791.0  1984.0
 1714.5  1691.0  1784.5  1692.0  1810.5     1777.0  2178.0  2156.0  2268.0
 1583.0  1479.5  1601.5  1885.0  1532.5  …  2094.0  2199.0  1678.0  2187.0
 1604.5  1474.0  1539.5  1796.0  1688.5     1695.0  2177.0  1666.0  2165.0
 1381.0  1280.5  1380.5  1586.5  1481.0     1541.0  2040.0  1916.0  18

### Summary Stats

Here I'm going to save the summary stats for every team: the number of wins and the mean, variance, skewness, and kurtosis of the scores. The last three columns of each row hold the team's mon indices (lead first).

In [20]:
summaryStats = zeros(12180, 8)
i = 1
p = x -> (x > 1500.0)
for mon1 = 1:30 , mon2 = 1:29 , mon3 = (mon2 + 1):30
    if mon1 != mon2 && mon1 != mon3
        team = teamBattles[Int(teamNumber[Int(mon1), Int(mon2), Int(mon3)]),:]
        numWins = count(p,team)
        summaryStats[i, :] = [numWins mean(team) var(team) skewness(team) kurtosis(team) mon1 mon2 mon3]
        i += 1
    end
end        
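
For anyone who hasn't met skewness and kurtosis before, here is a hand-rolled sketch of what those columns measure, on five made-up scores. The helpers `central_moment`, `skew`, and `excess_kurt` are hypothetical names; to my understanding the library functions used above follow the same population-moment definitions and report kurtosis as excess kurtosis (0 for a normal distribution), which fits the negative values in the tables.

```julia
using Statistics

# Population (uncorrected) central moment of order k
central_moment(x, k) = mean((x .- mean(x)) .^ k)

skew(x)        = central_moment(x, 3) / central_moment(x, 2)^1.5
excess_kurt(x) = central_moment(x, 4) / central_moment(x, 2)^2 - 3

# Made-up scores against five opponents, with one blowout win
scores = [1400.0, 1450.0, 1500.0, 1550.0, 1850.0]

num_wins = count(s -> s > 1500, scores)   # strictly above 1500 counts as a win
println((num_wins, mean(scores), var(scores), skew(scores), excess_kurt(scores)))
```

A positive skewness means a long tail of high scores (a few blowouts), while a negative one means a long tail of bad losses. Note that `var` uses the n - 1 denominator by default, unlike the moments above.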

## Number of Wins

Here we've got the information related to the number of wins. Below is a histogram of the number of teams vs their number of wins, a sorted list that shows the best and worst teams by number of wins, and a histogram comparing the scores of the best and worst teams by number of wins.

In [21]:
histogram(summaryStats[:, 1], label = "Number of Wins")

In [22]:
sortslices(summaryStats, by=x->x[1], dims = 1, rev = true)

12180×8 Array{Float64,2}:
 9849.0  1851.24       1.39512e5  -0.542953  -0.327253    1.0   9.0  12.0
 9781.0  1787.88       1.06081e5  -0.561242  -0.352822    1.0  11.0  12.0
 9780.0  1827.39       1.43118e5  -0.56514   -0.167767    1.0   4.0   9.0
 9772.0  1853.45       1.51946e5  -0.68338   -0.209535    1.0   4.0  12.0
 9760.0  1780.07   98243.9        -0.406001  -0.363677   11.0   1.0  12.0
 9749.0  1808.83       1.29406e5  -0.599305  -0.240058    1.0   3.0  12.0
 9721.0  1794.55       1.28032e5  -0.586679  -0.15211     1.0   2.0   9.0
 9712.0  1841.01       1.50626e5  -0.559722  -0.336102    1.0  12.0  13.0
 9709.0  1784.51       1.11905e5  -0.343834  -0.423393   11.0   9.0  12.0
 9698.0  1768.23  106207.0        -0.501313  -0.207263   11.0   1.0   4.0
 9688.0  1803.62  130239.0        -0.567663  -0.221184    1.0   3.0   9.0
 9634.0  1814.46       1.35153e5  -0.555374  -0.402198    1.0  12.0  22.0
 9632.0  1761.62   98207.8        -0.251612  -0.389239   11.0   1.0   9.0
    ⋮       

In [23]:
histogram(teamBattles[Int(teamNumber[1, 9, 12]),:], bins = 50, color = fire, label = "#1 Alolan Marowak, Claydol, Mawile")
histogram!(teamBattles[Int(teamNumber[26, 15, 25]),:], bins = 50, color = fighting, label = "#12180 Hitmontop, Registeel, Mismagius")

Here we look at how many wins, on average, the teams containing a particular Pokémon get, and rank the mons below, again showing the best and worst. The first column is the statistic and the second is the mon's PvPoke rank.

In [24]:
avgNumOfWins = zeros(30, 2)
i = 1
for mon1 = 1:30 , mon2 = 1:29 , mon3 = (mon2 + 1):30
    if mon1 != mon2 && mon1 != mon3
        avgNumOfWins[mon1, 1] += summaryStats[i, 1]
        avgNumOfWins[mon2, 1] += summaryStats[i, 1]
        avgNumOfWins[mon3, 1] += summaryStats[i, 1]
        i += 1
    end
end
for i = 1:30
    avgNumOfWins[i, 2] = i
end
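# Each mon belongs to 1218 teams; divide in place to turn the sums into averages
# (the table recorded below shows the undivided sums)
avgNumOfWins[:, 1] ./= 1218.0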
sortslices(avgNumOfWins, by=x->x[1], dims = 1, rev = true)

30×2 Array{Float64,2}:
 9.19987e6   1.0
 8.86561e6   9.0
 8.81442e6  12.0
 8.72191e6   4.0
 8.60383e6   3.0
 8.53836e6  13.0
 8.53389e6  11.0
 8.4913e6    2.0
 8.15647e6  10.0
 8.14821e6  22.0
 8.04268e6   6.0
 8.03159e6   5.0
 8.00519e6  23.0
 ⋮              
 7.50464e6  20.0
 7.46665e6  30.0
 7.3905e6    8.0
 7.23995e6  29.0
 7.19577e6  17.0
 7.19523e6  27.0
 6.94247e6   7.0
 6.60062e6  14.0
 6.55538e6  21.0
 6.34789e6  15.0
 6.1966e6   25.0
 5.25711e6  26.0
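
The 1218.0 in the cell above is the number of teams each mon belongs to. A quick self-contained sketch to confirm that count, using the same enumeration as everywhere else in this notebook (`teams_per_mon` is a hypothetical helper name):

```julia
# Count how many teams each mon appears in, in any of the three slots.
function teams_per_mon(n)
    counts = zeros(Int, n)
    for mon1 in 1:n, mon2 in 1:(n - 1), mon3 in (mon2 + 1):n
        if mon1 != mon2 && mon1 != mon3
            counts[mon1] += 1
            counts[mon2] += 1
            counts[mon3] += 1
        end
    end
    return counts
end

counts = teams_per_mon(30)
println(extrema(counts))   # smallest and largest membership count
```

Each mon leads `binomial(29, 2) = 406` teams and sits in the back line of another `29 * 28 = 812`, which adds up to 1218.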

Here I've plotted the PvPoke ranking against the average number of wins. I think it's not necessarily a surprise that the correlation isn't perfect (you generally don't just bring the top 6 ranked mons), but there is a pretty clear negative slope (which is new as of the Halloween announcement).

In [25]:
plot(1:30, avgNumOfWins[:, 1], label = "Average Number of Wins")

In [26]:
histogram(teamBattles[Int(teamNumber[1, 2, 3]),:], bins = 50, color = steel, label = "#1 PvPoke Alolan Marowak, Steelix, Drifblim")
histogram!(teamBattles[Int(teamNumber[1, 9, 12]),:], bins = 50, color = ground, label = "#1 Teams Alolan Marowak, Claydol, Mawile")


## Average Score

Here we've got the information related to the average score. Below is a histogram of the number of teams vs. their average score, a sorted list that shows the best and worst teams by average score, and a histogram comparing the scores of the best and worst teams by average score. Note that there are minor differences between the average score statistics and the number of wins. I generally consider number of wins to be the more useful statistic (as it doesn't necessarily matter to me how much I win by as long as I win), but I could see arguments for this one as well.

In [27]:
histogram(summaryStats[:, 2], label = "Mean Score")

In [28]:
sortslices(summaryStats, by=x->x[2], dims = 1, rev = true)

12180×8 Array{Float64,2}:
 9772.0  1853.45       1.51946e5  -0.68338   -0.209535     1.0   4.0  12.0
 9849.0  1851.24       1.39512e5  -0.542953  -0.327253     1.0   9.0  12.0
 9712.0  1841.01       1.50626e5  -0.559722  -0.336102     1.0  12.0  13.0
 9780.0  1827.39       1.43118e5  -0.56514   -0.167767     1.0   4.0   9.0
 9017.0  1818.36       1.84698e5  -0.434252  -0.721278     9.0   4.0  12.0
 9242.0  1816.98       1.51576e5  -0.365871  -0.698637     9.0   1.0  12.0
 9573.0  1815.64       1.56778e5  -0.524413  -0.288915     1.0   4.0  13.0
 9634.0  1814.46       1.35153e5  -0.555374  -0.402198     1.0  12.0  22.0
 9749.0  1808.83       1.29406e5  -0.599305  -0.240058     1.0   3.0  12.0
 9032.0  1806.91       1.80007e5  -0.343558  -0.739681     9.0  12.0  13.0
 9223.0  1804.64       1.72063e5  -0.441973  -0.520064     1.0  12.0  24.0
 9200.0  1804.06       1.6225e5   -0.449788  -0.596361     9.0   1.0   4.0
 9688.0  1803.62  130239.0        -0.567663  -0.221184     1.0   3.0   9.0

In [29]:
histogram(teamBattles[Int(teamNumber[1, 4, 12]),:], bins = 50, color = fire, label = "#1 Alolan Marowak, Haunter, Mawile")
histogram!(teamBattles[Int(teamNumber[26, 15, 25]),:], bins = 50, color = fighting, label = "#12180 Hitmontop, Registeel, Mismagius")

In [30]:
avgScore = zeros(30, 2)
i = 1
for mon1 = 1:30 , mon2 = 1:29 , mon3 = (mon2 + 1):30
    if mon1 != mon2 && mon1 != mon3
        avgScore[mon1, 1] += summaryStats[i, 2]
        avgScore[mon2, 1] += summaryStats[i, 2]
        avgScore[mon3, 1] += summaryStats[i, 2]
        i += 1
    end
end
for i = 1:30
    avgScore[i, 2] = i
end
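# Each mon belongs to 1218 teams; divide in place to turn the sums into averages
# (the table recorded below shows the undivided sums)
avgScore[:, 1] ./= 1218.0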
sortslices(avgScore, by=x->x[1], dims = 1, rev = true)

30×2 Array{Float64,2}:
 1.97089e6   1.0
 1.96281e6  12.0
 1.9594e6    9.0
 1.94405e6   4.0
 1.93325e6  13.0
 1.92017e6   3.0
 1.91438e6  11.0
 1.9135e6    2.0
 1.9e6      22.0
 1.8937e6   10.0
 1.87746e6   5.0
 1.87607e6  23.0
 1.8748e6   24.0
 ⋮              
 1.83964e6  20.0
 1.83406e6  30.0
 1.83108e6   8.0
 1.82544e6  27.0
 1.81018e6  17.0
 1.80387e6  29.0
 1.78483e6   7.0
 1.76833e6  14.0
 1.75816e6  21.0
 1.72974e6  15.0
 1.7277e6   25.0
 1.6519e6   26.0

Again, the average score isn't strongly correlated with the PvPoke ranking, but that is to be expected. Average score is, however, strongly correlated with the number of wins, which is also to be expected. So this statistic isn't identical to the number of wins and doesn't always lead to the same conclusion, but the two track each other closely.

In [31]:
plot(1:30, avgScore[:, 1], label = "Average Score")

In [33]:
plot(avgScore[:, 1], avgNumOfWins[:, 1], seriestype=:scatter, label = "Average Score vs. Num of Wins")
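
To put a number on "strongly correlated", a Pearson coefficient can be computed with `cor` from the standard library. The sketch below only uses the seven mons (1, 9, 12, 4, 15, 25, 26) whose values are visible in both tables printed above, so it is a rough illustration rather than the correlation over all 30 mons; the variable names are mine.

```julia
using Statistics

# Per-mon values copied from the tables above, for mons 1, 9, 12, 4, 15, 25, 26
wins_col  = [9.19987e6, 8.86561e6, 8.81442e6, 8.72191e6, 6.34789e6, 6.1966e6, 5.25711e6]
score_col = [1.97089e6, 1.9594e6, 1.96281e6, 1.94405e6, 1.72974e6, 1.7277e6, 1.6519e6]

r = cor(wins_col, score_col)   # Pearson correlation coefficient
println(r)
```

Since these are the top and bottom of the table, the coefficient will look cleaner than it would with the mid-ranked mons included. In the notebook itself, `cor(avgScore[:, 1], avgNumOfWins[:, 1])` would give the figure for all 30.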

## Variance

In [34]:
histogram(summaryStats[:, 3], label = "Variance")

In [35]:
sortslices(summaryStats, by=x->x[3], dims = 1, rev = true)

12180×8 Array{Float64,2}:
 5561.0  1468.03      2.77661e5   0.10408     -1.0225    17.0  24.0  28.0
 6545.0  1523.4       2.74812e5  -0.128959    -1.00761   28.0  17.0  24.0
 5769.0  1406.01      2.71216e5   0.00203468  -0.97876   28.0  24.0  26.0
 6800.0  1555.69      2.67666e5  -0.141753    -1.07196   28.0  16.0  24.0
 5319.0  1385.39      2.67637e5   0.096403    -0.926016  24.0  26.0  28.0
 6767.0  1599.33      2.66622e5  -0.102151    -1.07579   24.0   4.0  28.0
 6130.0  1501.36      2.66543e5  -0.0432197   -0.958687  24.0  17.0  28.0
 6799.0  1554.0       2.6361e5   -0.203034    -1.15388   28.0   6.0  24.0
 6392.0  1496.43      2.62919e5  -0.0851467   -1.05091   28.0  14.0  24.0
 6725.0  1595.72      2.62115e5  -0.0686812   -1.0373    24.0  13.0  28.0
 5515.0  1473.96      2.61674e5   0.128089    -1.1118    17.0   6.0  24.0
 7301.0  1635.72      2.60551e5  -0.227221    -0.973734  13.0  24.0  28.0
 6931.0  1578.47      2.59417e5  -0.141421    -1.06664   28.0  10.0  24.0
    ⋮       

In [36]:
histogram(teamBattles[Int(teamNumber[17, 24, 28]),:], bins = 50, color = fighting, label = "#1 Primeape, Gardevoir, Magneton")
histogram!(teamBattles[Int(teamNumber[11, 5, 18]),:], bins = 50, color = ghost, label = "#12180 Dusclops, Banette, Dusknoir")

In [37]:
avgVar = zeros(30, 2)
i = 1
for mon1 = 1:30 , mon2 = 1:29 , mon3 = (mon2 + 1):30
    if mon1 != mon2 && mon1 != mon3
        avgVar[mon1, 1] += summaryStats[i, 3]
        avgVar[mon2, 1] += summaryStats[i, 3]
        avgVar[mon3, 1] += summaryStats[i, 3]
        i += 1
    end
end
for i = 1:30
    avgVar[i, 2] = i
end
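# Each mon belongs to 1218 teams; divide in place to turn the sums into averages
# (the table recorded below shows the undivided sums)
avgVar[:, 1] ./= 1218.0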
sortslices(avgVar, by=x->x[1], dims = 1, rev = true)

30×2 Array{Float64,2}:
 2.30594e8  24.0
 2.20057e8  28.0
 2.16082e8  17.0
 2.06409e8  26.0
 2.04648e8   4.0
 2.02589e8  13.0
 1.99575e8  16.0
 1.96057e8  14.0
 1.95454e8   6.0
 1.93223e8  10.0
 1.91582e8  12.0
 1.89399e8  19.0
 1.89224e8   9.0
 ⋮              
 1.76072e8   2.0
 1.74331e8   3.0
 1.72292e8   7.0
 1.70593e8  15.0
 1.69093e8   1.0
 1.64747e8  21.0
 1.63565e8  23.0
 1.62644e8  25.0
 1.54698e8   5.0
 1.54398e8  18.0
 1.5275e8   20.0
 1.43355e8  11.0

In [38]:
plot(avgVar[:, 1], avgNumOfWins[:, 1], seriestype=:scatter, label = "Average Variance vs. Num of Wins")

## Skewness

In [39]:
histogram(summaryStats[:, 4], label = "Skewness")

In [40]:
sortslices(summaryStats, by=x->x[4], dims = 1, rev = true)

12180×8 Array{Float64,2}:
 2326.0  1162.46       1.59356e5   1.04543    0.266853   26.0  15.0  27.0
 2280.0  1115.57  164466.0         1.03833    0.279411   26.0  14.0  25.0
 2583.0  1275.63       1.74242e5   1.0228     0.191034   26.0   2.0  27.0
 2496.0  1192.14       1.74119e5   1.01838    0.334089   26.0  14.0  27.0
 2721.0  1269.38       1.42946e5   1.01201    0.200447   26.0  20.0  27.0
 2431.0  1196.05       1.31428e5   0.999304   0.107155   26.0  20.0  25.0
 2408.0  1148.42       1.71996e5   0.99786    0.186216   26.0   7.0  14.0
 2455.0  1226.69       1.28951e5   0.988954   0.176225   26.0  20.0  21.0
 2566.0  1207.68       1.59338e5   0.972041   0.194453   26.0  14.0  18.0
 2503.0  1191.86       1.75656e5   0.971606   0.260695   26.0   8.0  14.0
 2528.0  1230.74       1.37511e5   0.967925   0.0271794  26.0   7.0  20.0
 2589.0  1282.28       1.28592e5   0.963943   0.0246417  26.0  18.0  20.0
 2271.0  1161.56       1.54633e5   0.961371   0.325843   26.0  14.0  21.0
    ⋮       

In [41]:
histogram(teamBattles[Int(teamNumber[26, 15, 27]),:], bins = 50, color = fighting, label = "#1 Hitmontop, Registeel, Uxie")
histogram!(teamBattles[Int(teamNumber[1, 6, 11]),:], bins = 50, color = fire, label = "#12180 Alolan Marowak, Bastiodon, Dusclops")

In [42]:
avgSkew = zeros(30, 2)
i = 1
for mon1 = 1:30 , mon2 = 1:29 , mon3 = (mon2 + 1):30
    if mon1 != mon2 && mon1 != mon3
        avgSkew[mon1, 1] += summaryStats[i, 4]
        avgSkew[mon2, 1] += summaryStats[i, 4]
        avgSkew[mon3, 1] += summaryStats[i, 4]
        i += 1
    end
end
for i = 1:30
    avgSkew[i, 2] = i
end
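# Each mon belongs to 1218 teams; divide in place to turn the sums into averages
# (the table recorded below shows the undivided sums)
avgSkew[:, 1] ./= 1218.0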
sortslices(avgSkew, by=x->x[1], dims = 1, rev = true)

30×2 Array{Float64,2}:
  461.259     26.0
  222.635     14.0
  188.197     27.0
  165.739     25.0
  154.893     21.0
   86.195     15.0
   64.1886     8.0
   58.8451     7.0
   40.1238    20.0
   37.6651    17.0
   14.7199    24.0
   13.6423    16.0
   -0.631238  30.0
    ⋮             
  -45.8387    10.0
  -94.6014    11.0
 -115.177     13.0
 -115.455     12.0
 -118.218      9.0
 -140.013     28.0
 -143.005     19.0
 -150.451      2.0
 -175.304      3.0
 -180.573      4.0
 -200.551      6.0
 -241.052      1.0

In [43]:
plot(avgNumOfWins[:, 1], avgSkew[:, 1], seriestype=:scatter, label = "Average Skewness vs. Num of Wins")

In [44]:
histogram(teamBattles[Int(teamNumber[8, 1, 20]),:], bins = 50, color = ghost, label = "#1 Dusclops, Alolan Marowak, Mawile")
histogram!(teamBattles[Int(teamNumber[1, 4, 27]),:], bins = 50, color = fire, label = "#12180 Alolan Marowak, Bastiodon, Medicham")

## Kurtosis

In [45]:
histogram(summaryStats[:, 5], label = "Kurtosis")

In [46]:
sortslices(summaryStats, by=x->x[5], dims = 1, rev = true)

12180×8 Array{Float64,2}:
 2496.0  1192.14       1.74119e5   1.01838     0.334089  26.0  14.0  27.0
 2271.0  1161.56       1.54633e5   0.961371    0.325843  26.0  14.0  21.0
 2280.0  1115.57  164466.0         1.03833     0.279411  26.0  14.0  25.0
 2326.0  1162.46       1.59356e5   1.04543     0.266853  26.0  15.0  27.0
 2503.0  1191.86       1.75656e5   0.971606    0.260695  26.0   8.0  14.0
 2456.0  1206.71       1.40809e5   0.894162    0.225513  26.0  25.0  27.0
 2767.0  1260.35       1.65544e5   0.888958    0.211687  26.0  14.0  22.0
 2581.0  1228.48       1.53093e5   0.937044    0.208362  26.0   7.0  27.0
 3039.0  1327.17       1.3522e5    0.957383    0.201993  26.0  11.0  27.0
 2721.0  1269.38       1.42946e5   1.01201     0.200447  26.0  20.0  27.0
 2560.0  1225.72       1.44228e5   0.923184    0.19707   26.0  14.0  20.0
 2566.0  1207.68       1.59338e5   0.972041    0.194453  26.0  14.0  18.0
 2621.0  1229.34       1.59804e5   0.952291    0.191709  26.0  14.0  23.0
    ⋮       

In [47]:
avgKurt = zeros(30, 2)
i = 1
for mon1 = 1:30 , mon2 = 1:29 , mon3 = (mon2 + 1):30
    if mon1 != mon2 && mon1 != mon3
        avgKurt[mon1, 1] += summaryStats[i, 5]
        avgKurt[mon2, 1] += summaryStats[i, 5]
        avgKurt[mon3, 1] += summaryStats[i, 5]
        i += 1
    end
end
for i = 1:30
    avgKurt[i, 2] = i
end
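# Each mon belongs to 1218 teams; divide in place to turn the sums into averages
# (the table recorded below shows the undivided sums)
avgKurt[:, 1] ./= 1218.0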
sortslices(avgKurt, by=x->x[1], dims = 1, rev = true)

30×2 Array{Float64,2}:
  -542.534  26.0
  -715.271   1.0
  -771.581   3.0
  -774.666  11.0
  -789.85    2.0
  -791.037  14.0
  -792.641  25.0
  -796.58   21.0
  -801.026  18.0
  -827.988   8.0
  -843.695   9.0
  -854.294   5.0
  -859.485  19.0
     ⋮          
  -910.029   7.0
  -910.637  13.0
  -916.291  15.0
  -920.295  30.0
  -929.792  22.0
  -931.741  20.0
  -934.365  29.0
  -938.421   6.0
  -943.44   17.0
  -952.44   12.0
  -972.568  28.0
 -1028.49   24.0

In [48]:
plot(avgNumOfWins[:, 1], avgKurt[:, 1], seriestype=:scatter, label = "Average Kurtosis vs. Num of Wins")

In [49]:
histogram(teamBattles[Int(teamNumber[1, 2, 27]),:], bins = 50, color = fire, label = "#1 Alolan Marowak, Steelix, Medicham")
histogram!(teamBattles[Int(teamNumber[4, 6, 11]),:], bins = 50, color = rock, label = "#12180 Bastiodon, Cresselia, Empoleon")

## Teams of Six

In [50]:
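function maxScore(team)
    # Best score a roster of six can reach against each of the 12180 opposing
    # teams, taking the maximum over every lead plus unordered pair of others
    maxScores = zeros(12180)
    for lead = 1:6, second = 1:5, third = (second + 1):6
        if lead != second && lead != third
            # Row of teamBattles belonging to this three-mon lineup
            row = Int(teamNumber[Int(team[lead]), Int(team[second]), Int(team[third])])
            for i = 1:12180
                currentScore = teamBattles[row, i]
                if maxScores[i] < currentScore
                    maxScores[i] = currentScore
                end
            end
        end
    end
    return maxScores
end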

maxScore (generic function with 1 method)
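
What `maxScore` does boils down to a column-wise maximum over the rows that belong to one roster. Here is a toy version on a made-up 3×3 score matrix (three candidate lineups against three opposing teams); `scores`, `best`, and `coverage` are illustrative names and the numbers are invented.

```julia
# Rows: candidate lineups from one roster. Columns: opposing teams.
scores = [1600.0 1400.0 1550.0;
          1450.0 1700.0 1300.0;
          1500.0 1500.0 1650.0]

best = vec(maximum(scores, dims = 1))   # best achievable score per opponent

# Fraction of opponents that at least one lineup beats (score above 1500)
coverage = count(s -> s > 1500, best) / length(best)

# Number of lead-plus-pair lineups a roster of six can field
n_lineups = 6 * binomial(5, 2)

println(best)
println(coverage)
println(n_lineups)
```

Keep in mind that taking the maximum assumes you always bring the best lineup for the opponent you end up facing, which is more information than you'd have at the table.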

In [51]:
histogram(maxScore([1 3 8 17 20 27]), alpha = 0.5, bins = 64, color = :green, label = "Top 6 Team Ranked")
histogram!(maxScore([1 2 3 4 5 6]), alpha = 0.5, bins = 64, color = :teal, label = "Top 6 PvPoke Ranked")
histogram!(maxScore([25 26 27 28 29 30]), alpha = 0.5, bins = 64, color = :blue, label = "Bottom 6 PvPoke Ranked")
histogram!(maxScore([9 14 18 22 25 30]), alpha = 0.5, bins = 64, color = :purple, label = "Bottom 6 Team Ranked")
vline!([1500], color = :red, label = "W/L cutoff")

In [52]:
team = [1 2 10 17 26]
h = histogram()
summaryStats2 = zeros(30, 8)
i = 1
p = x -> (x > 1500.0)   
for lead = 1:5, second = 1:4, third = second + 1:5
    if(lead != second && lead != third)
        team1 = teamBattles[Int(teamNumber[Int(team[lead]), Int(team[second]), Int(team[third])]),:]
        numWins = count(p,team1)
        summaryStats2[i, :] = [numWins mean(team1) var(team1) skewness(team1) kurtosis(team1) team[lead] team[second] team[third]]
        i += 1
        histogram!(h, teamBattles[Int(teamNumber[Int(team[lead]), Int(team[second]), Int(team[third])]),:], bins = 50, alpha = 0.5, label = team[lead])
    end
end
     
h
vline!([1500], color = :red)

In [53]:
sortslices(summaryStats2, by=x->x[1], dims = 1, rev = true)

30×8 Array{Float64,2}:
 9200.0  1739.13       1.36146e5  -0.501262     -0.284084   1.0   2.0  10.0
 8934.0  1728.77       1.42788e5  -0.445457     -0.463262   2.0   1.0  10.0
 8609.0  1684.61       1.48646e5  -0.480212     -0.327705   1.0   2.0  17.0
 8363.0  1696.84       1.48708e5  -0.271383     -0.687502  10.0   1.0   2.0
 8359.0  1673.34       1.57682e5  -0.397829     -0.520606   2.0   1.0  17.0
 7870.0  1640.82       1.49363e5  -0.187912     -0.426569   1.0  10.0  17.0
 7699.0  1639.98       1.7905e5   -0.219297     -0.752512  10.0   1.0  17.0
 7300.0  1590.67       1.75559e5  -0.214284     -0.697144  10.0   2.0  17.0
 7239.0  1585.22  152278.0        -0.137154     -0.453111   2.0  10.0  17.0
 7226.0  1562.87       1.46173e5  -0.19731      -0.359655   1.0   2.0  26.0
 6870.0  1548.12       1.54751e5  -0.0754247    -0.54534    2.0   1.0  26.0
 6649.0  1587.52       1.77882e5   0.00944979   -0.958569  17.0   1.0   2.0
 6408.0  1527.4        1.48219e5   0.0230675    -0.277536   1.0  

In [54]:
histogram(teamBattles[Int(teamNumber[26, 2, 10]),:], bins = 50, color = ghost, label = "#30 Dusknoir, Steelix, Claydol")
histogram!(teamBattles[Int(teamNumber[1, 2, 17]),:], bins = 50, color = fire, label = "#1 Alolan Marowak, Steelix, Poliwrath")
vline!([1500], color = :red, label = "W/L cutoff")

In [55]:
sortslices(summaryStats2, by=x->x[2], dims = 1, rev = true)

30×8 Array{Float64,2}:
 9200.0  1739.13       1.36146e5  -0.501262     -0.284084   1.0   2.0  10.0
 8934.0  1728.77       1.42788e5  -0.445457     -0.463262   2.0   1.0  10.0
 8363.0  1696.84       1.48708e5  -0.271383     -0.687502  10.0   1.0   2.0
 8609.0  1684.61       1.48646e5  -0.480212     -0.327705   1.0   2.0  17.0
 8359.0  1673.34       1.57682e5  -0.397829     -0.520606   2.0   1.0  17.0
 7870.0  1640.82       1.49363e5  -0.187912     -0.426569   1.0  10.0  17.0
 7699.0  1639.98       1.7905e5   -0.219297     -0.752512  10.0   1.0  17.0
 7300.0  1590.67       1.75559e5  -0.214284     -0.697144  10.0   2.0  17.0
 6649.0  1587.52       1.77882e5   0.00944979   -0.958569  17.0   1.0   2.0
 7239.0  1585.22  152278.0        -0.137154     -0.453111   2.0  10.0  17.0
 6356.0  1573.39       1.96668e5   0.0750557    -1.01888   17.0   1.0  10.0
 7226.0  1562.87       1.46173e5  -0.19731      -0.359655   1.0   2.0  26.0
 6870.0  1548.12       1.54751e5  -0.0754247    -0.54534    2.0  

In [56]:
histogram(teamBattles[Int(teamNumber[10, 1, 2]),:], bins = 50, color = ground, label = "#30 Claydol, Alolan Marowak, Steelix")
histogram!(teamBattles[Int(teamNumber[1, 10, 17]),:], bins = 50, color = water,alpha = 0.5, label = "#1 Alolan Marowak, Claydol, Poliwrath")
vline!([1500], color = :red, label = "W/L cutoff")

In [57]:
# This is a naive way of ranking every 6-mon team, but it is far too slow:
# there are binomial(30, 6) = 593775 teams to score.
#
# summaryStats2 = zeros(binomial(30, 6), 11)
# i = 1
# p = x -> (x > 1500.0)
# for mon1 = 1:25, mon2 = (mon1 + 1):26, mon3 = (mon2 + 1):27, mon4 = (mon3 + 1):28, mon5 = (mon4 + 1):29, mon6 = (mon5 + 1):30
#     team = maxScore([mon1 mon2 mon3 mon4 mon5 mon6])
#     numWins = count(p, team)
#     summaryStats2[i, :] = [numWins mean(team) var(team) skewness(team) kurtosis(team) mon1 mon2 mon3 mon4 mon5 mon6]
#     i += 1
# end

## Counters

In [58]:
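# Enumerate every opposing lineup as (lead, back pair):
# 30 leads × binomial(29, 2) back pairs = 12180 lineups.
# This order is assumed to match the columns of teamBattles.
indices = zeros(12180, 3)
i = 1
for mon1 = 1:30, mon2 = 1:29, mon3 = (mon2 + 1):30
    if mon1 != mon2 && mon1 != mon3
        indices[i, 1] = mon1
        indices[i, 2] = mon2
        indices[i, 3] = mon3
        i += 1
    end
end

# Column 1: the home team's score against each opposing lineup.
# Columns 2-4: the mons in that opposing lineup.
homeTeam = hcat(teamBattles[Int(teamNumber[8, 1, 20]), :], indices)

# For each mon, sum how far below 1500 the home team scores against every
# lineup containing it. Large positive totals mark counters to the home team.
counters = zeros(30, 2)
for i = 1:12180
    for j = 1:30
        if homeTeam[i, 2] == j || homeTeam[i, 3] == j || homeTeam[i, 4] == j
            counters[j, 1] += 1500 - homeTeam[i, 1]
            counters[j, 2] = j
        end
    end
end

sortslices(counters, by=x->x[1], dims = 1, rev = true)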

30×2 Array{Float64,2}:
       1.89253e5   3.0
  122269.0         4.0
       1.01423e5   5.0
   50382.0         6.0
   46498.0        25.0
   34381.0        29.0
   23439.0         2.0
  -13221.5         7.0
  -16579.0         1.0
  -22882.5        19.0
  -27055.0        21.0
  -33306.0        12.0
  -34537.5        11.0
       ⋮              
  -91535.5        30.0
  -98488.5        15.0
      -1.05989e5  10.0
      -1.4354e5    8.0
      -1.85955e5  17.0
      -1.98196e5   9.0
      -1.98273e5  16.0
      -2.16663e5  13.0
 -266086.0        26.0
 -271274.0        24.0
      -3.05199e5  22.0
 -385738.0        14.0
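
To make the counter tally concrete, here is a minimal sketch on made-up numbers. The four-mon pool, the scores, and the helper name `counterScores` are all hypothetical and are not part of the notebook's data. Each opposing lineup contributes `1500 - score` to every mon it contains, so mons that keep appearing in lineups the home team loses to end up with the largest totals.

```julia
# Toy version of the counter tally. All numbers below are made up.
# Each row: the home team's score against one opposing lineup,
# followed by the three mons in that lineup.
toyBattles = [1200.0 1 2 3;
              1700.0 1 3 4;
              1400.0 2 1 4;
              1800.0 3 2 4]

# Hypothetical helper: sum (1500 - score) over every lineup containing each mon.
function counterScores(battles, numMons)
    scores = zeros(numMons)
    for r = 1:size(battles, 1), c = 2:size(battles, 2)
        scores[Int(battles[r, c])] += 1500 - battles[r, 1]
    end
    return scores
end

toyCounters = counterScores(toyBattles, 4)
ranking = sortperm(toyCounters, rev = true)   # strongest counter first
```

In this toy pool, mon 1 is the strongest counter because it appears in both lineups the home team loses to.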

## Roles

### Lead

In [59]:
leadMeanScore = zeros(30, 2)   # column 1: mean score as lead, column 2: mon index
leadNumWins = zeros(30, 2)     # column 1: total wins as lead, column 2: mon index
i = 0                          # number of lineups visited
p = x -> (x > 1500.0)          # a score above 1500 counts as a win
for mon1 = 1:30, mon2 = 1:29, mon3 = (mon2 + 1):30
    if mon1 != mon2 && mon1 != mon3
        team = teamBattles[Int(teamNumber[mon1, mon2, mon3]), :]
        leadMeanScore[mon1, 1] += mean(team)
        leadNumWins[mon1, 1] += count(p, team)
        i += 1
    end
end

for j = 1:30
    leadMeanScore[j, 2] = j
    leadNumWins[j, 2] = j
end
# Each mon leads i / 30 = 406 lineups, so this turns the sum into an average.
leadMeanScore[:, 1] = leadMeanScore[:, 1]./(i / 30.0)
sortslices(leadMeanScore, by=x->x[1], dims = 1, rev = true)

30×2 Array{Float64,2}:
 1664.17   1.0
 1637.11   9.0
 1629.96  12.0
 1617.52   4.0
 1608.93   3.0
 1606.79   2.0
 1604.44  11.0
 1601.98  13.0
 1576.36  10.0
 1561.14  22.0
 1555.38  23.0
 1555.15   6.0
 1550.03   5.0
    ⋮         
 1501.54  20.0
 1493.27  30.0
 1483.97   8.0
 1476.0   27.0
 1475.45  29.0
 1469.94  17.0
 1449.69   7.0
 1415.36  14.0
 1403.7   21.0
 1390.55  15.0
 1378.65  25.0
 1292.35  26.0

In [60]:
sortslices(leadNumWins, by=x->x[1], dims = 1, rev = true)

30×2 Array{Float64,2}:
 3.42596e6   1.0
 3.15799e6   9.0
 3.12978e6   2.0
 3.12492e6  11.0
 3.12245e6   3.0
 3.06904e6  12.0
 3.06465e6   4.0
 2.94043e6  13.0
 2.90838e6   6.0
 2.88955e6  10.0
 2.79336e6  23.0
 2.74826e6   5.0
 2.71179e6  22.0
 ⋮              
 2.44072e6  20.0
 2.39606e6  30.0
 2.383e6    29.0
 2.29223e6   8.0
 2.28052e6  17.0
 2.19248e6  27.0
 2.19218e6   7.0
 1.89563e6  14.0
 1.88671e6  15.0
 1.8503e6   21.0
 1.74402e6  25.0
 1.28089e6  26.0
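
The totals above are raw win counts summed over every lineup a mon leads, so they are hard to read on their own. Assuming, as the loop bounds imply, that each mon leads binomial(29, 2) = 406 lineups and that each lineup has 12180 scores in `teamBattles`, dividing by 406 × 12180 turns a total into a win rate. A small sketch using two values read off the table (the names `lineupsPerLead`, `battlesPerLineup`, and `winRate` are made up for illustration):

```julia
# Convert summed win counts into win rates.
# Assumes 406 lineups per lead and 12180 scores per lineup,
# as implied by the loop bounds in this notebook.
lineupsPerLead = binomial(29, 2)          # back pairs available to one lead
battlesPerLineup = 30 * binomial(29, 2)   # opposing lineups each one faces
winRate(totalWins) = totalWins / (lineupsPerLead * battlesPerLineup)

bestLead = winRate(3.42596e6)    # mon 1, top row of the table
worstLead = winRate(1.28089e6)   # mon 26, bottom row of the table
```

Under those assumptions, the best lead wins roughly 69% of its battles and the worst roughly 26%.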

### Secondary

In [61]:
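secondaryMeanScore = zeros(30, 2)   # column 1: summed mean scores, column 2: mon index
secondaryNumWins = zeros(30, 2)     # column 1: total wins, column 2: mon index
i = 0                               # number of lineups visited
p = x -> (x > 1500.0)               # a score above 1500 counts as a win
for mon1 = 1:30, mon2 = 1:29, mon3 = (mon2 + 1):30
    if mon1 != mon2 && mon1 != mon3
        team = teamBattles[Int(teamNumber[mon1, mon2, mon3]), :]
        # Credit both mons in the back pair
        secondaryMeanScore[mon2, 1] += mean(team)
        secondaryMeanScore[mon3, 1] += mean(team)
        secondaryNumWins[mon2, 1] += count(p, team)
        secondaryNumWins[mon3, 1] += count(p, team)
        i += 1
    end
end
for j = 1:30
    secondaryMeanScore[j, 2] = j
    secondaryNumWins[j, 2] = j
end
# Unlike leadMeanScore, these are left as sums: each mon sits in the
# back pair of 2i / 30 = 812 lineups.
sortslices(secondaryMeanScore, by=x->x[1], dims = 1, rev = true)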

30×2 Array{Float64,2}:
 1.30104e6  12.0
 1.29524e6   1.0
 1.29474e6   9.0
 1.28734e6   4.0
 1.28284e6  13.0
 1.26694e6   3.0
 1.26618e6  22.0
 1.26298e6  11.0
 1.26114e6   2.0
 1.2537e6   10.0
 1.25094e6  24.0
 1.24815e6   5.0
 1.24458e6  23.0
 ⋮              
 1.22859e6   8.0
 1.2278e6   30.0
 1.22619e6  27.0
 1.22534e6  19.0
 1.21339e6  17.0
 1.20483e6  29.0
 1.19626e6   7.0
 1.19369e6  14.0
 1.18826e6  21.0
 1.16796e6  25.0
 1.16518e6  15.0
 1.12721e6  26.0
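
The secondary totals above are sums of mean scores rather than averages, so they are not on the same scale as the lead table. Assuming each mon sits in the back pair of 29 × 28 = 812 lineups (29 possible leads times 28 possible partners), dividing by 812 gives a per-lineup mean that can be compared with the lead numbers. A sketch using two values read off the table (`lineupsPerSecondary` and `perLineupMean` are made-up names):

```julia
# Rescale the summed secondary scores to a per-lineup mean.
# Assumes 29 possible leads × 28 possible partners for each back-pair mon.
lineupsPerSecondary = 29 * 28
perLineupMean(total) = total / lineupsPerSecondary

topSecondary = perLineupMean(1.30104e6)      # mon 12, top row of the table
bottomSecondary = perLineupMean(1.12721e6)   # mon 26, bottom row of the table
```

That puts the best secondary at about 1602 and the worst at about 1388. In the notebook itself, the same scaling would presumably be `secondaryMeanScore[:, 1] ./ (2i / 30)`, since `i` counts the 12180 lineups visited by the loop.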

In [62]:
sortslices(secondaryNumWins, by=x->x[1], dims = 1, rev = true)

30×2 Array{Float64,2}:
 5.77391e6   1.0
 5.74538e6  12.0
 5.70762e6   9.0
 5.65726e6   4.0
 5.59793e6  13.0
 5.48138e6   3.0
 5.43642e6  22.0
 5.40897e6  11.0
 5.36152e6   2.0
 5.28333e6   5.0
 5.26692e6  10.0
 5.24106e6  24.0
 5.21183e6  23.0
 ⋮              
 5.07059e6  30.0
 5.06392e6  20.0
 5.05948e6  19.0
 5.00275e6  27.0
 4.91525e6  17.0
 4.85695e6  29.0
 4.75029e6   7.0
 4.70508e6  21.0
 4.70499e6  14.0
 4.46118e6  15.0
 4.45258e6  25.0
 3.97623e6  26.0

In [63]:
plot(leadNumWins[:, 1], secondaryNumWins[:, 1], seriestype = :scatter, xlabel = "Total wins as lead", ylabel = "Total wins as secondary", legend = false)