Add documentation about sorting param #1419
Hi @polarathene, You're right. The rank (if we can say so) is computed from the 50th percentile. The main idea was to get as close as we can to deciding which language/framework to use. As my background is … BTW, any idea / recommendation is ❤️

PS: I will edit the title to reflect the main idea -> the sorting param (50th percentile) SHOULD be reflected in the README
I was just referring to the table results listed, not in relation to performance/experiences elsewhere. As noted above, it's true that when the 50th percentile is the only metric, the top 50% of responses have better latency, but the 2nd half of the results tells a very different story. I think consistency/stability of the low latency throughout should have some weight in the scoring; the standard deviation shows how some of the current positions are getting away with having half of their responses (the slower half) ignored.

I have asked a statistics community for their input on a proper way to improve the scoring/ranking, and I will let you know if they make any suggestions. For now, assigning a small bit of weighting to the `Average` seems to help, where the ranking better represents performance by giving a small bit of weight to the last half of the latency results (see the sketch below). I did not include …
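If it helps to make the suggestion concrete, here is a minimal sketch of that kind of weighted score, assuming hypothetical field names (`p50_ms`, `avg_ms`) and made-up latency values; it is not the project's actual ranking code:

```python
# Illustrative only: re-rank a few entries by a score that blends the
# 50th-percentile latency with the average latency (lower is better).
# Names and numbers are made up, not actual benchmark results.
frameworks = [
    {"name": "framework-a", "p50_ms": 0.39, "avg_ms": 2.10},
    {"name": "framework-b", "p50_ms": 0.39, "avg_ms": 0.55},
    {"name": "framework-c", "p50_ms": 0.41, "avg_ms": 0.48},
]

def weighted_score(row, p50_weight=0.9, avg_weight=0.1):
    """Blend the 50th percentile with the average so the slower half of
    responses still influences the ranking."""
    return p50_weight * row["p50_ms"] + avg_weight * row["avg_ms"]

for row in sorted(frameworks, key=weighted_score):
    print(f"{row['name']}: score = {weighted_score(row):.3f} ms")
```

With these made-up numbers, the entry with the much worse average drops below the one it tied with on the 50th percentile, which is the kind of reordering described above.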
🎉 Thanks for this! However, be aware that the results are not very stable.
Yes I understand, there is a clear warning up top in the README pointing that out, but that would not change anything regarding how results are sorted. I do understand that the actual results themselves are not stable presently, as is evident from the test result history varying widely across past commits. I am just interested in more accurately representing how well a framework has performed based on the given data. The weighted score suggestion above seems to work well?

Off-topic to the issue:
I don't quite follow how Docker messes up results here. Is it because of the different base images? Docker, if anything, should be a useful tool for getting consistency. On bare metal you're dealing with the distro environment and its own package manager; not all distros are the same, and there are many other factors that can impact results. Users' systems likewise aren't likely to be at parity with where you run the tests. But the results provide some insight, and the user can then verify on their own systems whether the results are similar (easier to do using the same Docker images, followed by adapting to their own needs/environment after confirmation).

On bare metal, you can do some things to better ensure consistency, such as pinning CPU cores to the processes involved (Docker would again be useful here afaik), and you can also isolate those CPU cores so that nothing else on the system is permitted to use them for processing (see the rough sketch below).

Once you involve an external network, that's a different variable that you might not have much control over or any consistency with. It's useful information to include and can still be achieved with Docker; the quality of the network is going to vary for users though, just like other parts of the environment, so local tests are still useful imo. You can also configure a network that has the characteristics of what you'd get with an external network involved.
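As a rough sketch of the core-pinning idea (the image name below is a placeholder, and this is just one way to do it), Docker's `--cpuset-cpus` flag restricts a container to specific cores:

```python
import subprocess

# Rough sketch: run a benchmark container pinned to cores 2 and 3 so that
# the host scheduler doesn't move it around during the measurement.
# "example-framework-bench" is a placeholder image name, not a real one.
subprocess.run(
    [
        "docker", "run", "--rm",
        "--cpuset-cpus", "2,3",   # restrict the container to cores 2-3
        "example-framework-bench",
    ],
    check=True,
)
```

Isolating those same cores from the host scheduler (e.g. booting with the `isolcpus=2,3` kernel parameter) goes a step further, so nothing else gets scheduled on them during the run.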
Feel free to suggest any ideas about how to rank/sort ❤️ I have taken ideas from #670 and #223, but I have no preference 😛
I mean that the metrics are computed in a way that prevents any framework from being pushed to its performance limits:
My bad, this is NOT only about …
Latency Top 10:
The only column that seems to define the rank is the 50th percentile? Could the README mention that's how the ranking is being done?

`iron` definitely seems ahead of `rack-routing`. `flame` looks like it should be ahead of the 3 PHP results above it; they're all tying on a 0.39ms 50th percentile, but `flame`'s standard deviation score should give it the upper hand here. There are other cases like this further down the chart, like `agoo-c` vs `rocket`: here there is a slight lead on the 50th percentile for `rocket`, but `agoo-c` seems better overall?

Would it be ok to use a 2nd value with some weighting to get a better ranking? I'm not stats savvy, but looking over the table, if you weighted the `50th percentile` by 90% to 10% of the `Average`, you'd get an order that seems more representative of the performance, rather than one that rewards those that lose consistency and skew quite poorly past the halfway mark.
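To illustrate the tie situation above (entries sharing the same 0.39ms 50th percentile), a simple alternative is to fall back to the standard deviation as a tie-breaker; the names and numbers here are made up, and this is only a sketch of the idea, not the project's sorting code:

```python
# Illustrative only: sort by 50th percentile first, then break ties with the
# standard deviation (lower is better for both). All values are made up.
results = [
    {"name": "framework-x", "p50_ms": 0.39, "stdev_ms": 12.4},
    {"name": "framework-y", "p50_ms": 0.39, "stdev_ms": 1.7},
    {"name": "framework-z", "p50_ms": 0.41, "stdev_ms": 0.9},
]

ranked = sorted(results, key=lambda r: (r["p50_ms"], r["stdev_ms"]))
for position, row in enumerate(ranked, start=1):
    print(position, row["name"], row["p50_ms"], row["stdev_ms"])
```

Either approach (a weighted blend or a lexicographic tie-break) keeps the 50th percentile as the primary signal while stopping a framework with a wildly inconsistent slower half from outranking a steadier one.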