Implement total SR formula that better correlates with pp #13986

Merged on Sep 15, 2021 (10 commits)

Fr0stium
Contributor

@Fr0stium Fr0stium commented Jul 22, 2021

Please see this document that explains the changes: https://docs.google.com/document/d/10DZGYYSsT_yjz2Mtp6yIJld0Rqx4E-vVHupCqiM4TNI/edit?usp=sharing

Please see this spreadsheet to see the results of this change: https://docs.google.com/spreadsheets/d/1rbg71zWUWp1TjvwBMZRW5GAfqn3t0acpoSdaKhMxuSA/edit?usp=sharing

In short, the change increases correlation between pp and star rating by changing the formula from aim + speed + abs(aim - speed) / 2 to the one described in the document.

This change does NOT change pp, only stars.

Also, apologies if I've made a mistake in the code or presentation; this is my first pull request.
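For reference, here is a minimal sketch of the formula being replaced, written in Python for illustration only (the function name is hypothetical and this is not the project's code). It shows that the current combination is equivalent to 1.5 times the larger rating plus 0.5 times the smaller one, so a balanced map and a lopsided map can land on the same total.

```python
def current_total_sr(aim: float, speed: float) -> float:
    """Total star rating as combined before this change (formula quoted above)."""
    return aim + speed + abs(aim - speed) / 2


# Balanced map: equal aim and speed ratings.
balanced = current_total_sr(3.5, 3.5)

# Lopsided map: aim-heavy with a low speed rating.
lopsided = current_total_sr(4.5, 0.5)

# Both totals come out as 7.0, even though the skill mix is very different.
print(balanced, lopsided)
```

Whether two such maps should share a star rating is exactly what the rest of the discussion is about.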

@bdach
Collaborator

bdach commented Jul 22, 2021

Hi, thanks for the contribution. I've skimmed the document and while the math transformations themselves seem correct, I have a few questions before I proceed to look at the code.

  • Can the plots have labeled axes? It's not clear to me what I am looking at with most of them, especially since the document mostly treats performance points as a function of two arguments, while some plots that seem to have the total pp on the Y axis are 2D instead of 3D, so I don't know what the X axis is supposed to be. The "fixed aim star rating" one is similarly confusing.

  • Are you aware of other efforts in changing star ratings, such as Expand attribute set for osu!standard SR calculation to assess length and combo #13583? How does this pull relate to them? Is it a quick substitute "for now", or an entire alternative on its own?

  • Did you gather any community feedback or present some test numbers to other players before submitting this change?

  • Most notably, I am curious if star rating and performance points really are supposed to be correlated. Because if they are, I'm not sure why that is done in this manner of adjusting the formulae, rather than just dropping one of the two or making one of them entirely based on the other (so basically do something like SR = PP / 100).

    To not get the pitchforks taken out at me, I am not saying that it should be the case. I am merely suggesting it as a thought experiment of sorts, and to see whether there's a valid reason for not doing that if SR-PP correlation is desired.

@Fr0stium
Contributor Author

Fr0stium commented Jul 22, 2021

I'll explain the graphs more clearly here.

For the first graph, the x-axis is just an arbitrary number x with no inherent meaning. x is transformed into two variables a and s, where a = 3.5 + x and s = 3.5 - 3x. These represent aim and speed star ratings that, when combined, give a total star rating of 7 in the current system (this holds for x ≥ 0). Finally, a and s are plugged into t(a,s), which returns the combined pp value. In other words, the graph is a plot of t(3.5 + x, 3.5 - 3x). I did this to make it easier to visualize, because a contour plot of a, s, and the total pp function would be messy.

The second graph is similar. Again x is an arbitrary number with no inherent meaning, and a = 3.5 + x and s = 3.5 - 3x, so that a + s + |a - s| / 2 = 7 (for x ≥ 0). The variables a and s are then plugged into three functions: the total star rating in black, the new star rating in red, and the combined pp value in blue, scaled by a factor of 0.02 to make it fit into the graph. These functions can be plotted in a plane rather than in 3 dimensions because the input is a single number that is split into two numbers, and the output is a single number.

I guess it would be helpful to focus on the shape of the curves in these two graphs rather than what the x-axis represents, because the x-axis doesn't have any meaning; what x is turned into (a and s) is what has meaning, but a and s are not plotted.
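The parametrization behind the first two graphs can be sketched roughly as follows. This is an illustration under stated assumptions: `base_pp` and `combined_pp` use a ppv2-style shape whose constants (0.0675, 5, 4, 100000, 1.1) are not given in this thread and are assumed here, and accuracy and other pp components are left out. All function names are hypothetical.

```python
def base_pp(sr: float) -> float:
    # Assumed ppv2-style base curve; the constants are not stated in this thread.
    return (5 * max(1.0, sr / 0.0675) - 4) ** 3 / 100000


def combined_pp(aim: float, speed: float) -> float:
    # Assumed ppv2-style combination of aim and speed pp (other terms omitted).
    return (base_pp(aim) ** 1.1 + base_pp(speed) ** 1.1) ** (1 / 1.1)


def current_total_sr(aim: float, speed: float) -> float:
    # Current combination of aim and speed stars.
    return aim + speed + abs(aim - speed) / 2


def point_on_line(x: float) -> tuple[float, float]:
    # a = 3.5 + x, s = 3.5 - 3x, as described for the first two graphs.
    return 3.5 + x, 3.5 - 3 * x


rows = []
for x in (0.0, 0.25, 0.5, 0.75, 1.0):
    a, s = point_on_line(x)
    rows.append((x, current_total_sr(a, s), combined_pp(a, s)))

for x, sr, pp in rows:
    print(f"x={x:.2f}  current SR={sr:.2f}  combined pp={pp:.1f}")
```

Along this line the current total stays at 7 while the combined pp changes, which is the mismatch the graphs are meant to show. The identity only holds for x ≥ 0; for example, x = -0.5 gives a total of 9 under the current formula.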

Unlike the other two graphs, the third graph has a meaningful x-axis. The x-axis is simply the speed star rating. The aim star rating is fixed at 3.5. The blue graph is the current star rating, and the red is the new star rating. When the speed star rating equals 3.5, both curves give a total SR of 7.

I hope I cleared it up. To answer the other points:

  1. This is an entire alternative to combining aim and speed star ratings. More generally, it proposes a way to combine them given any base pp function and total pp function, although the general idea isn't written out explicitly. In other words, if you had different base pp and total pp functions, you could apply the same process I did and arrive at a different combined SR function. Looking at the specific proposal you linked, I think this change should be used instead of what is implemented there, because the base pp function and the combined pp function are the same as the ones in ppv2. I recommend taking a look at the last section of the document, where a different base pp function and a different total pp function are examined and lead to a different formula for combining aim and speed.

  2. I did post the document in the pp-dev discord and got positive responses, but I will gather more opinions.

  3. I should've said this, but another main goal of the change is to provide a non-arbitrary (or less arbitrary) way of combining aim and speed stars. There are proposals out there that correlate stars and pp even more. For example, joz star rating incorporates length and difficulty spikes into star rating and removes the length bonus from pp entirely, so long maps generally have much higher star ratings than shorter maps. This is known as star rating representing FC difficulty instead of peak difficulty. Both kinds of star rating have been received well, but from what I've seen people are more hesitant to implement FC difficulty SR because it isn't what we are used to. In short, the reason I don't just make SR into PP is that people don't seem ready for that; they still want SR to be a measure of peak difficulty, which this change preserves. To be clear, this change is still compatible with FC difficulty SR: if aim and speed stars measure peak difficulty, as they do now, then combined SR will be peak difficulty, and vice versa.
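The general process described in point 1 can be sketched as follows: combine the aim and speed pp, then map the result back to stars through the inverse of the base pp curve, normalized so that equal aim and speed ratings give their sum. This is a sketch under assumptions, not the code in this pull request: the ppv2-style constants (0.0675, 5, 4, 100000, 1.1) are assumed rather than taken from this thread, and the function names are hypothetical.

```python
def base_pp(sr: float) -> float:
    # Assumed ppv2-style base curve; the constants are not stated in this thread.
    return (5 * max(1.0, sr / 0.0675) - 4) ** 3 / 100000


def inverse_base_pp(pp: float) -> float:
    # Inverse of base_pp on the range where sr >= 0.0675.
    return 0.0675 / 5 * ((100000 * pp) ** (1 / 3) + 4)


def combined_pp(aim: float, speed: float) -> float:
    # Assumed ppv2-style combination of aim and speed pp (other terms omitted).
    return (base_pp(aim) ** 1.1 + base_pp(speed) ** 1.1) ** (1 / 1.1)


def new_total_sr(aim: float, speed: float) -> float:
    # When aim == speed, combined_pp is 2 ** (1 / 1.1) times the single-skill pp.
    # Dividing that factor out and doubling the inverse makes equal ratings sum.
    single_skill_pp = combined_pp(aim, speed) / 2 ** (1 / 1.1)
    return 2 * inverse_base_pp(single_skill_pp)


print(new_total_sr(3.5, 3.5))  # approximately 7.0
print(new_total_sr(3.5, 2.5))
```

Because the result depends on aim and speed only through the combined pp, any two maps with the same combined pp get the same total SR under this sketch. Swapping in different base and total pp functions would change `base_pp`, its inverse, and the normalization factor, but not the overall recipe.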

@bdach
Collaborator

bdach commented Jul 22, 2021

Thanks for the explanation. I'm confused by one more thing. You stated both that

Looking at the specific proposal you linked, I think this change should be used instead of what is implemented there, because the base pp function and the combined pp function are the same as the ones in ppv2

and

There are proposals out there that correlate stars and pp even more. For example, joz star rating incorporates length and difficulty spikes into star rating and removes the length bonus from pp entirely

I was under the impression that #13583 was colloquially referred to as "joz star rating"?

@Fr0stium
Contributor Author

Yes, that proposal is joz star rating.

But check out lines 62 to 67, line 113, and line 167 of that proposal: https://github.com/ppy/osu/blob/a31bf159a4828c98a41933c7377b1539f6bcaf5d/osu.Game.Rulesets.Osu/Difficulty/OsuPerformanceCalculator.cs

Then check out lines 55 to 60, line 82, and line 136 of ppv2: https://github.com/ppy/osu/blob/master/osu.Game.Rulesets.Osu/Difficulty/OsuPerformanceCalculator.cs

You'll see that they are the same. Those functions are the total pp, base aim pp, and base speed pp. The same is true for total SR, it's still aim + speed + |aim - speed| / 2. Joz has a multiplier for total SR, but that doesn't change the fundamental nature.

To be clear, when people refer to joz star rating, they are referring to how joz does difficulty aggregation. In ppv2, star rating is aggregated using a geometric series, whilst in joz, the idea is that repeating a section multiplies SR by a constant, which corresponds to a power series. So the formulas used in joz and ppv2 to transform SR into pp and to combine aim and speed stars are the same, but the way in which the systems come up with SR in the first place is different.
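As a rough illustration of the geometric-series aggregation mentioned above, the sketch below weights sorted difficulty peaks by successive powers of a decay factor. The decay value of 0.9, the sample peak values, and the function name are assumptions for illustration; the joz aggregation is not reproduced here because its formula is not given in this thread.

```python
def geometric_aggregate(peaks: list[float], decay: float = 0.9) -> float:
    # Sort peaks from hardest to easiest and weight them by decay ** index,
    # so each additional peak contributes geometrically less to the total.
    total = 0.0
    weight = 1.0
    for peak in sorted(peaks, reverse=True):
        total += peak * weight
        weight *= decay
    return total


section = [4.0, 3.0, 2.0]  # made-up difficulty peaks for one section

once = geometric_aggregate(section)
twice = geometric_aggregate(section * 2)
many = geometric_aggregate(section * 50)

# Repeating the section keeps raising the total, but by less each time.
# With decay 0.9 the total can never exceed max(peaks) / (1 - 0.9).
print(once, twice, many)
```

Under this kind of weighting, repetition has a bounded effect, which contrasts with the joz idea of repeating a section multiplying SR by a constant.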

@smoogipoo
Contributor

smoogipoo commented Jul 23, 2021

Looks reasonable to me, I'll do a full sheet/calc later but until then @emu1337 @stanriders @Xexxar can I get some of your eyes on this? Leave an approval if you agree with the direction.

@stanriders
Member

I'd say it'd be good to inflate SR a bit to better match people's expectations of roughly which SR corresponds to which pp, considering it's a purely visual change. Right now it seems to only deflate values.

@smoogipoo
Contributor

Yeah I agree with that in the context of this changeset. In the longer term I believe we may want to scale SR down, but that's a discussion for another day.

@emu1337
Member

emu1337 commented Jul 24, 2021

I believe sr right now is used by players to gauge how well they could play the map, not how much pp they can get from it. Trying to heavily correlate sr with pp is a change from what users expect sr to be.

Having said that, I think the live formula inflates maps with only one type of difficulty too much. Having the sr be the 'base pp' (without length bonus etc.) is a good approach to eliminate this issue without compromising user experience.

I also agree that the sr values should be bumped a bit to make up for losses.

All in all this is a good change.

@stanriders
Member

sr right now is used by players to gauge how well they could play the map, not how much pp they can get from it

People usually do both, for example people usually know that if they want to get ~300pp they need to play maps that are around 6-6.5*. It's very loose ofc because of how much SR can differ from PP

@emu1337
Member

emu1337 commented Jul 24, 2021

As an extreme counter-example, this map is potentially worth ~450pp as a 6.13* map. The values you describe only apply on a certain map length and OD. The sr value makes sense though because Save Me isn't any harder to play than similar sr tv size maps.

@Fr0stium
Contributor Author

Fr0stium commented Jul 30, 2021

It might be necessary to multiply by a constant, but I'm not sure I want to do that just yet, because it would break the property that if aim and speed are the same, the total SR is just their sum. For example, 3.5 aim SR and 3.5 speed SR lead to an SR of 7 as the change stands, but with a constant multiplier the result would be higher than 7.

However, at least multiplying by a constant doesn't mess up the 1 to 1 mapping of SR to base pp, which is more important.

I think one of the ways it can be mathematically justified is that since the global multiplier for total pp is 1.12, and pp is SR cubed, then the multiplier for SR might be cbrt(1.12), which is around 1.038. This still seems a little arbitrary though so I'll have to think about it more.
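The arithmetic behind that multiplier can be checked with a short sketch. It assumes only what is stated above, namely that pp scales with SR cubed and that the global pp multiplier is 1.12; the names are hypothetical.

```python
PP_MULTIPLIER = 1.12

# If pp is proportional to SR ** 3, scaling SR by the cube root of the
# pp multiplier scales pp by exactly that multiplier.
sr_multiplier = PP_MULTIPLIER ** (1 / 3)


def pp_like(sr: float) -> float:
    # Stand-in for "pp is SR cubed"; any constant factor cancels in the ratio.
    return sr ** 3


ratio = pp_like(7.0 * sr_multiplier) / pp_like(7.0)

print(round(sr_multiplier, 4))       # about 1.0385
print(round(ratio, 4))               # 1.12
print(round(7.0 * sr_multiplier, 2)) # about 7.27
```

So with the multiplier applied, 3.5 aim SR and 3.5 speed SR would give roughly 7.27 instead of 7.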

Let's see what Smoogi's sheet calc shows, and if the values look too deflated then I'll change it.

Co-authored-by: Bartłomiej Dach <dach.bartlomiej@gmail.com>
roansong previously approved these changes Aug 1, 2021
@pull-request-size pull-request-size bot added size/S and removed size/XS labels Aug 1, 2021
@bdach bdach dismissed their stale review August 1, 2021 13:39

addressed, code looks to match expectations based on doc attached

@smoogipoo
Contributor

Interesting test failures - FP differences between Windows/macOS/Linux?

@bdach
Collaborator

bdach commented Aug 2, 2021

Or math function implementation, maybe? I was pretty sure I checked these were passing locally.

@Fr0stium
Contributor Author

Fr0stium commented Aug 3, 2021

People have said that the star ratings look better after being multiplied by a constant, which I chose to be the cube root of 1.12, 1.12 being the pp multiplier. The justification is above, and I think it's a good enough reason to do it.

@smoogipoo
Contributor

Wanna apply these changes to taiko/catch/mania also?

@Fr0stium
Contributor Author

Wanna apply these changes to taiko/catch/mania also?

I don't know how SR is combined and then transformed into pp for those gamemodes, so nope.

@stanriders stanriders requested a review from a team September 13, 2021 12:13
@stanriders
Member

stanriders commented Sep 13, 2021

@02naitsirk mind updating resulting values in the sheet to include changes from #13483?

@MBmasher
Member

Wanna apply these changes to taiko/catch/mania also?

Catch/Mania only have one skill, while Taiko uses the total SR for pp calc, so it should only make a difference in standard.

emu1337 previously approved these changes Sep 13, 2021