Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Win probability? #1

Closed
llimllib opened this issue Mar 27, 2012 · 32 comments
Closed

Win probability? #1

llimllib opened this issue Mar 27, 2012 · 32 comments

Comments

@llimllib
Copy link

I'm trying to figure out how to calculate the win probability from two TrueSkill ratings. You have draw probability, as match_quality, but not win probability.

Any idea how to calculate it? I've been reading lots of stuff on TrueSkil, but it's a bit over my head.

@sublee
Copy link
Owner

sublee commented Mar 27, 2012

You can calculate with β constant. β means the skill class width which is the difference between two ratings with 80% of win probability. You can find more information about β in 14p of The Math Behind TrueSkill.

@llimllib
Copy link
Author

That paper tells me that player A, of rating R, will beat player B of rating S 80% of the time if R-S=β, but it doesn't give me a function from (R,S,β) to win percentage.

@sublee
Copy link
Owner

sublee commented Mar 27, 2012

According to the paper, the ratio of A's winning is 4^((R-S)/β):1.

  1. 0 -> 1:1 -> 50%
  2. β -> 4:1 -> 80%
  3. 2β -> 4²:1 -> 94%

So we can make a naive win_probability function such as:

def win_probability(greater, lesser):
    exp = (greater.mu - lesser.mu) / BETA
    n = 4. ** exp
    return n / (n + 1)

But I think that this function isn't trustworthy because it ignores the sigma values.

@llimllib
Copy link
Author

I don't see that equation in the paper? Could you point me to it?

I feel like an idiot because I'm reading the paper over and over and not finding it.

@llimllib
Copy link
Author

Furthermore, what I'd really like to determine is Pwin(β,µ1,σ1,µ2,σ2), but the paper only gives Pdraw, and I deeply do not understand the derivation for it. I assume you don't know of any way to derive a function Pwin?

@sublee
Copy link
Owner

sublee commented Mar 28, 2012

Here are 3 players:

  1. A player (best) has Rating(μ+β, σ₁)
  2. B player (normal) has Rating(μ, σ₂)
  3. C player (worst) has Rating(μ-β, σ₃)

You know that A player beats B player 80% of time and B player beats C player 80% also. Then what of time A player beats C player? See the below proportional expressions. The win probability can be represented as ratio expression.

  1. A : B = 4 : 1 = 4² : 4
  2. B : C = 4 : 1
  3. A : B : C = 4² : 4 : 1

We can also generalize about nβ. The nβ of difference follows 4ⁿ : 1 of win probability.

You're right. I don't know how calculate win probability with sigma values. TrueSkill system doesn't offer Pwin function. So the result of my function is just expected value not correct value.

@llimllib
Copy link
Author

right... I should have just thought a bit more. Thanks for the explanation, and concurring with me that there's no Pwin.

@warner121
Copy link

This gives the probability of a win, taking the current sigma values into account (from Herbrich and Graepel's equation 1.1):

from trueskill import Rating
from trueskill.mathematics import cdf

def Pwin(rA=Rating(), rB=Rating()):
    deltaMu = rA.mu - rB.mu
    rsss = sqrt(rA.sigma**2 + rB.sigma**2)
    return cdf(deltaMu/rsss)

There's no accounting for the draw here, though. Perhaps you could subtract 'half a draw' from the result?

@sublee
Copy link
Owner

sublee commented Nov 19, 2012

@warner121 Thanks to tell the code! It seems to be pretty nice. I also want to see the original documentation of Herbrich and Graepel's equation 1.1. Can you give me that?

@warner121
Copy link

@sublee No problem, thank you for sharing your code. I found the paper at http://research.microsoft.com/pubs/74419/TR-2006-80.pdf.

@nbonfire
Copy link

This win probability is good for 1v1, but what would need to change to give the win probability of an n v n match?

@azurespace
Copy link

@nickdanger3d In the formula, we obtained a normal difference distribution of two ratings and calculate its CDF. I think we can sum players' ratings in a team into one normal distribution before that. (See: http://mathworld.wolfram.com/NormalSumDistribution.html )

However, what should I do if a player partial played? Do you people have any idea?

@nbonfire
Copy link

@azurespace Just to be clear, you're saying the win probability for a match of n v n assuming no one partially plays would be something like this:

def Pwin(rAlist=[Rating()],  rBlist=[Rating()]):
    deltaMu = sum( [x.mu for x in rAlist])  - sum( [x.mu for x in  rBlist])
    rsss = sqrt(sum( [x.sigma**2 for x in  rAlist]) + sum( [x.sigma**2 for x in rBlist]) )
    return cdf(deltaMu/rsss)

@azurespace
Copy link

@nickdanger3d Exactly. I think it makes sense when the skill in target game is additive. If it isn't, we must think the again with the paper ...

@azurespace
Copy link

@warner121 I've tested the function for several cases but its prediction didn't match real result well. I think it is because the formula does not take account of beta value, which matters in Trueskill system.

@azurespace
Copy link

Now I got it. I think we should combine @warner121 's way and @sublee 's one. The meaning of cumulative density function is the probability that player A performs better than player B. However, it doesn't count beta in any calculation.

We can calculate probability at a certain difference of skill(delta). We also know how advantageous player A is against player B at the delta. In order to get the right winning probability, we should calculate the weighted sum of two probability functions(P(delta) * P(win))!

@bundlking
Copy link

@azurespace - what are your thoughts on generalising the win probability function for the free for all scenario (4 - 8 player)?

@jsnell
Copy link

jsnell commented Oct 21, 2015

The following seems to produce good results. (It's Jeff Moser's suggestion in the comments of http://www.moserware.com/2010/03/computing-your-skill.html, translated to Python).

def win_probability(a, b):                                                      
    deltaMu = sum([x.mu for x in a]) - sum([x.mu for x in b])                   
    sumSigma = sum([x.sigma ** 2 for x in a]) + sum([x.sigma ** 2 for x in b])  
    playerCount = len(a) + len(b)                                               
    denominator = math.sqrt(playerCount * (BETA * BETA) + sumSigma)             
    return cdf(deltaMu / denominator)  

nbonfire pushed a commit to nbonfire/hitztourney that referenced this issue Oct 21, 2015
Update the win probability algorithm per sublee/trueskill#1 (comment)
@bichengcao
Copy link

I'm a bit confused about the winning probability regarding Beta. In Moser's article, it states that player with rating (mu+beta) can beat player with rating (mu) 80%.

But on trueskill.org, it states beta means 76% winning chance. From my testing of the code of this library, with the win_probability method jsnell@ mentioned above, it does give me 76%.

Can anyway please explain that is it suppose to be 80% or 76%?

@avdeevvadim
Copy link

@bichengcao In the Elo model the probability of a win is P = \Phi( \frac{ s_1 - s_2 }{ \sqrt{2} \beta } ) (where \Phi is the cumulative distribution function of the standard normal distribution), therefore the distance s_1 - s_2 = \beta implies P = \Phi( \frac{ 1 }{ \sqrt{2} } ) =76,02%.
In the TrueSkill model the probability of a win also depends on the variances of players: in a game with two players P = \Phi( \frac{ \mu_1 - \mu_2 - \epsilon }{ \sqrt{ \sigma_1^2 + \sigma_2^2 + 2\beta^2 } } ).
If we set \epsilon = 0, \sigma_1 = \sigma_2 = 0, \mu_1 - \mu_2 = \beta, then P also equals 76,02%, as in the Elo model.

@sublee
Copy link
Owner

sublee commented Sep 5, 2016

@bichengcao It was @avdeevvadim's suggestion. I called him to come. His reply would be perfect answer for your question.

@markbarrington
Copy link

Using the calculation for win probability formula given above I get the following....

a = ts.Rating(mu=24.0,sigma=1.0)
b = ts.Rating(mu=25.0,sigma=2.0)
print ts.quality_1vs1(a,b)
print Pwin(a,b)

0.923252184581
0.436966107024

How should I interpret this result? I would have expected that draw probability + win probability = 1 - loss probability?

@sublee
Copy link
Owner

sublee commented Jan 10, 2017

@markbarrington Shouldn't 1 - loss probability be same with just win probability rather than draw probability + win probability?

Best match is most fair match. All players have almost same performance, draw probability or quality should be highest. At that same match, one of players will have nearly 50% of chance to win.

@markbarrington
Copy link

OK. I see from more experimenting that win_probability(a,b) = 1 - win_probability(b,a).

I don't understand how the sum of the probabilities of the possible outcomes (w,d,l) can be more than 1. I'd appreciate any insight on what I'm missing here?

@sublee
Copy link
Owner

sublee commented Jan 10, 2017

@markbarrington Let's imagine an ideal match. Player A and B has same win probability of 50%. The draw probability of this match is 100%. The sum of win probabilities and draw probability cannot be 100%.

@romovpa
Copy link

romovpa commented Oct 1, 2017

How about add suggested Moser's function as a special case? #17

@sublee
Copy link
Owner

sublee commented Jan 13, 2018

@jsnell I've just introduced your snippet in the documentation. Thank you for the good code.

@hayj
Copy link

hayj commented Sep 25, 2020

The following seems to produce good results. (It's Jeff Moser's suggestion in the comments of http://www.moserware.com/2010/03/computing-your-skill.html, translated to Python).

def win_probability(a, b):                                                      
    deltaMu = sum([x.mu for x in a]) - sum([x.mu for x in b])                   
    sumSigma = sum([x.sigma ** 2 for x in a]) + sum([x.sigma ** 2 for x in b])  
    playerCount = len(a) + len(b)                                               
    denominator = math.sqrt(playerCount * (BETA * BETA) + sumSigma)             
    return cdf(deltaMu / denominator)  

What is the variable BETA here ?

NameError: name 'BETA' is not defined

@scttcper
Copy link

@hayj beta is the current global beta value, or self.beta when in the trueskill class.

https://github.com/sublee/trueskill/blob/master/trueskill/__init__.py#L47

@bernd-wechner
Copy link
Contributor

Let's imagine an ideal match. Player A and B has same win probability of 50%. The draw probability of this match is 100%. The sum of win probabilities and draw probability cannot be 100%.

This claim puzzles me as it must be drawing on some idiosyncratic definition of "win probability" not clear from the English name (the sort of thing that happens when math travels language boundaries sometimes). Because by the very definition of probability, win and draw is must be true in a two player game that:

P_winA + P_winB + P_draw = 1

as these are the three distinct possible outcomes. So if "Player A and B has same win probability of 50%." the the draw probability is by definition 0, not 100%.

I don't have a win probability coded but I did establish a predicted ranking and a confidence measure (in %) in that prediction in conversation with others and am content with that. An actual win probability I haven't looked at. Given I (and TrueSkill in general) deal with multiplayer games, win probability for each player may not be a trivial thing to calculate, not least give TrueSkill itself works on numeric approximations to PDFs. In short I'd expect that treated in papers ;-). And it may be, I have a few around and have written and summary of it all as well. Hit me up if interested.

@coldfix
Copy link
Contributor

coldfix commented Sep 2, 2021

@bernd-wechner

I agree. The above snippet assumes there is no distinct outcome "draw". If you want to compute the winning probability, you would need the formula (7) from Application and Further Development of TrueSkill™ Ranking in Sports, 2019, Ibstedt, et al:

p(p₁-p₂≥ε) = 1 - Φ((ε-µ₁+µ₂) / √(2β²+σ₁²+σ₂²)) = Φ((µ₁-µ₂-ε) / √(2β²+σ₁²+σ₂²))

For 1v1 matchup, I believe it should look like this (untested):

from math import sqrt
from trueskill import Rating, TrueSkill, global_env

def p_win_1v1(
    p1: Rating,
    p2: Rating,
    draw_margin: float,
    n: int = 2
    env: Trueskill = None,
) -> float:
    """Calculate the probability that p1 wins the game."""
    if env is None:
        env = global_env()
    return 1env.cdf(
        (p1.mu - p2.mu - draw_margin) /
        sqrt(n * env.beta**2 + p1.sigma**2 + p2.sigma**2)
    )

Then, team games are basically 1v1s with summed ratings:

def p_win_team(
    team1: list[Rating],
    team2: list[Rating],
    draw_margin: float,
    env: Trueskill = None,
) -> float:
    """Calculate the probability that team1 wins the game."""
    n = len(team1) + len(team2)
    p1 = team_rating(team1)
    p2 = team_rating(team2)
    return p_win_1v1(p1, p2, draw_margin, n=n, env=env)


def team_rating(team: list[Rating]) -> Rating:
    """Return sum of ratings as a Rating, i.e. the sum of Gaussians."""
    return Rating(
        sum([p.mu for p in team]),
        sqrt(sum([p.sigma**2 for p in team])),
    )

which reduces to the above snippet for draw_margin == 0.

@bernd-wechner
Copy link
Contributor

Thanks enormously for the new paper. I look forward to digesting it. I am busy as can be alas. So I've added it to my library for now. I have since posting here formulated the the probability of given rankings albeit incompletely and will need to refer to my conclusions and report back when I have time. If you have time and pre-reading of the paper linked to can you determine whether or not tit is, as appears likely, constrained tot he two team scenario? Or generalised to the n team scenario. As my needs are in the latter space.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests