
Changes / Updates / Dates for Round 22 #7475

Open
5 tasks
NateBrady23 opened this issue Jul 18, 2022 · 47 comments

@NateBrady23
Member

Round 21 has concluded. The results will be posted shortly. We'd like to have a quicker turnaround for Round 22; we're hoping for sometime between October and November.

Checklist

@billywhizz
Contributor

I was just reading a thread on Reddit about the latest results, and a commenter mentioned that "Stripped" scores are included in the composite results. I didn't think this was allowed/possible, but it turns out this is in fact the case, at least for actix.

The /json and /plaintext results in the composite scores for actix are from the "server" configuration, which is marked as "Stripped". Is this correct, or a mistake in the collation of the results?

@Yurunsoft
Contributor

The SQL queries currently tested return too quickly.
In a production environment the tables may hold a lot of data, queries won't return that fast, and there may even be slow queries.
I suggest adding a slow-I/O test.

#6687


I also suggest adding a memory-usage indicator as one of the scoring criteria.

@NateBrady23
Member Author

@billywhizz You're correct; that shouldn't be the case. That implementation approach was changed and the composite scores were never updated. I'll see what I can do.

@billywhizz
Contributor

@nbrady-techempower Yes, I didn't think it was possible when I saw the comment, so I'm glad I checked. By my calculation this would give actix a composite score of 6939, moving it down to ninth place behind officefloor, aspnet.core, salvo, and axum. I haven't checked whether the same is happening with any others.

@joanhey
Contributor

joanhey commented Jul 21, 2022

The problem doesn't come only from the TechEmpower people.
We all need to help together, for the health of the benchmark.

A lot of people don't understand what a benchmark is for.
They don't use it to refactor their frameworks to be faster and to learn from others.
They use it to play tricks and look faster, in ways that aren't useful in a production app.
They use it only as bench-marketing.

@joanhey
Contributor

joanhey commented Jul 21, 2022

When I said that we need to clarify the rules, this is why:

We are changing the rules because of those people, but everyone needs to follow the rules.

It's like with Faf #7402: we can all learn from that, for good or bad.
But nobody says anything about the process priority; it's fine, but it should be allowed for all frameworks or for none. #6967 (comment)

The length of the server name has been discussed for some time, but still without a solution. Before that, there was a problem with the URLs.
It's only a few bytes, but at 7 million req/s it makes enough of a difference.

And so on. These things need to be in the rules so that later they can also be tested.
What shouldn't happen is that, because something is not in the rules, you can't report a framework for it. Like caching the response in Fortunes.

@joanhey
Contributor

joanhey commented Jul 21, 2022

I also want to see which frameworks are using pipelining in plaintext, the same way we can see which use an ORM versus raw queries.
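
For anyone unfamiliar with the term, the sketch below illustrates what HTTP/1.1 pipelining means: several requests are written to one connection before any response is read. This is only an illustration; the host, port, and pipeline depth are placeholders, and the official toolset drives the plaintext test with a load generator rather than a script like this.

```python
# Minimal sketch of HTTP/1.1 pipelining (illustrative only; host and port are placeholders).
import socket

REQUEST = b"GET /plaintext HTTP/1.1\r\nHost: tfb-server\r\n\r\n"
PIPELINE_DEPTH = 16

with socket.create_connection(("tfb-server", 8080)) as conn:
    # Send all requests back to back without waiting for any response...
    conn.sendall(REQUEST * PIPELINE_DEPTH)
    # ...then read the whole batch of responses from the same connection.
    data = b""
    while data.count(b"HTTP/1.1 200") < PIPELINE_DEPTH:
        chunk = conn.recv(65536)
        if not chunk:
            break
        data += chunk
```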

@joanhey
Contributor

joanhey commented Jul 21, 2022

Another big problem that a lot of us have is the servers.

We have enough information to inform the ops side.
From Round 18 to 19 we all saw a big drop in performance from the Spectre/Meltdown mitigations.
If we move to a newer kernel or newer CPUs, the results can help a lot of companies.

Whether with the kernel change #7321 or with new servers.
Or with the new kernel with MGLRU, or the backport patch for older kernels.

These make a very big impact, more than the framework we use.
And we need to inform people and help with that.

@joanhey
Contributor

joanhey commented Jul 21, 2022

Another question:
Is it acceptable to hardcode the Content-Length header?
To me, that makes it a stripped version.

@joanhey
Contributor

joanhey commented Jul 21, 2022

@billywhizz There will always be critics of any benchmark, but we need there to be as few as possible.

@joanhey
Contributor

joanhey commented Jul 21, 2022

More questions:

Is it realistic to have different variants and configs for every test?
Some use one for plaintext and JSON, and another for the db tests.
But some have one for plaintext, one for JSON, one for updates, ....
Would anybody do that for a real app?

If the framework is using a JIT this is very beneficial, but not realistic.
Some people are also talking about that in #7358.

@billywhizz
Contributor

@joanhey Yes, I tend to think there should be a single entry/configuration allowed per framework, and it should be the same codebase that covers all the tests. This would be much more "realistic" and would also massively decrease the amount of time a full test run takes; some frameworks have 10 or more different configurations that have to be tested!

@joanhey
Contributor

joanhey commented Jul 21, 2022

I understand variants for different databases or drivers.
But not per test (JIT), and also with a test-specific config.

In the same way, some frameworks use only one variant, but the config for the db is different for every test.
How would anybody do that in a real app?

@billywhizz
Contributor

@joanhey Good point re. different databases, but apart from that I think the number of configurations per framework should be minimised. I also think it would work better overall if a run was only triggered when a framework changed, rather than continually running every framework end to end. If we only ran the changed framework on every merge, maintainers would have to wait a lot less time to see the results of their changes.

My worry about introducing too many and too complex rules is that it will just discourage people from entering at all, so there is a balance to be found between too many rules and allowing for innovation in the approaches.

@franz1981
Contributor

franz1981 commented Jul 21, 2022

I shared my thoughts on #7475 (comment) here: #7358 (reply in thread), trying both to give my opinion and to convey that there are things that should change a bit, given what the purpose of the benchmark is... although I agree with #7475 (comment) about not making it so complex that it discourages folks from getting in.

@joanhey
Contributor

joanhey commented Jul 21, 2022

We can create an addendum for the 1-2% of devs who try to cheat with the more esoteric tricks.

About running only the changed frameworks:
The benchmark doesn't only help make code faster, it's also very useful for finding bugs.
Right now I'm chasing one with PHP 8.1 and JIT; it's an intermittent bug.
Without changing the code or the PHP version, there is a ~15% drop from run to run.

@cirospaciari
Contributor

cirospaciari commented Dec 7, 2022

We can create an addendum, for those 1-2% devs who try to lie. With the more esoteric tricks.

About the run only with the changed frameworks. The bench not only help to make faster code, it's also very useful to find bugs. Now I'm searching for one with PHP8.1 and JIT, It's an intermittent bug. Without change the code, not the php version, there are a ~15% drop from run to run.

@nbrady-techempower
Robyn uses "const" with basically caches the string in Rust and avoid calling python at all, i think its not allowed right?
https://sansyrox.github.io/robyn/#/architecture?id=const-requests

https://github.com/TechEmpower/FrameworkBenchmarks/blob/master/frameworks/Python/robyn/app.py
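
For context, a const route in Robyn looks roughly like the sketch below, based on the documentation linked above. The exact decorator and start() signatures are assumptions and may differ between Robyn versions; this is not the code from the benchmark's app.py.

```python
# Rough sketch of a Robyn "const" route (assumed API; see the linked docs).
from robyn import Robyn

app = Robyn(__file__)

# const=True asks Robyn to cache the response in the Rust layer,
# so subsequent requests never re-enter the Python handler.
@app.get("/plaintext", const=True)
async def plaintext():
    return "Hello, World!"

# The start() signature varies across Robyn versions.
app.start(port=8080)
```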

@romanstingler
Contributor

romanstingler commented Dec 9, 2022

Is it possible to get the tested framework version?

Also, is it just me, or should people only be allowed to state
approach = "Realistic"
when they use the framework's built-in functions/methods, i.e. they are not allowed to rewrite the router or to write a custom JSON function, since that is not a realistic approach for the framework?
Otherwise it is just a programming-language benchmark that pulls in a framework, makes minimal use of its functions, or even overrides them.

@volyrique
Contributor

The result visualization for every round has a link to the continuous benchmarking run that has been used (for example, round 21 is based on run edd8ab2e-018b-4041-92ce-03e5317d35ea). From the run you can get the commit ID, so that you can browse the repository at the respective revision. Then check the Dockerfile that corresponds to the test implementation you are interested in (and possibly any associated scripts in the implementation directory) to get the framework version that has been used. Unfortunately not all implementations keep their dependencies locked down properly - in that case your best bet is probably to check the build logs from the run. If that does not help, then I am afraid that other than making a guesstimate, you are out of luck.
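
To illustrate the Dockerfile step, a properly locked-down implementation pins the framework version directly in its Dockerfile, so the pinned version is what was tested. The path, framework name, and version below are hypothetical placeholders rather than anything taken from the repository.

```dockerfile
# Hypothetical excerpt from frameworks/<Language>/<framework>/<framework>.dockerfile
# at the commit referenced by the run.
FROM python:3.11-slim

WORKDIR /app
COPY . .

# A locked-down implementation pins an exact version here (placeholder below).
RUN pip install someframework==1.2.3

EXPOSE 8080
CMD ["python", "app.py"]
```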

@fafhrd91
Contributor

Hi, @nbrady-techempower have you guys decided on dates for Round 22?

@volyrique
Contributor

@nbrady-techempower I noticed an issue that seems to have appeared back in December, after continuous benchmarking started running properly again: the Dstat data is missing.

michaelhixson unpinned this issue Jan 31, 2023
michaelhixson pinned this issue Jan 31, 2023
@NateBrady23
Member Author

@fafhrd91 Nothing concrete yet. I've got to get in front of the servers and do some upgrades. I'd like to shoot for late March.

@volyrique Thanks, I'll take a look!

@volyrique
Contributor

@nbrady-techempower I had a closer look at the Dstat issue and it looks like a common problem. Unfortunately the tool appears to be unmaintained, and the closest thing to a drop-in replacement seems to be Dool.

@graemerocher
Contributor

graemerocher commented Jun 26, 2023

@nbrady-techempower If there is not going to be an update to the benchmarks soon, can you please remove micronaut from the dataset? A random non-micronaut maintainer submitted a PR that was merged without the consent of the maintainers and artificially crippled our results (it limited the connection pool size to 5), and we have had to live with people enquiring why micronaut is so low in the results.

People unfortunately use these benchmarks to make technology decisions, and when the data is wrong for long periods it impacts us directly.

@NateBrady23
Member Author

NateBrady23 commented Jun 30, 2023

@graemerocher I'm sorry to hear about this. I can get the Round 21 results for micronaut removed as soon as I'm back next week. I'm having a hard time finding the commit for that; would you mind linking it so I can see other activity from that user?

@graemerocher
Contributor

@nbrady-techempower the history of what happened is in this thread #7618

Thanks for helping.

@spericas
Contributor

spericas commented Jul 6, 2023

@nbrady-techempower Is there a tentative date for Round 22?

@NateBrady23
Member Author

NateBrady23 commented Jul 6, 2023

It's been almost a year since I said I'd like to start having more regular rounds... 😵‍💫

So, I think the biggest thing here was getting Citrine updated. Though I think all the things on the checklist are important, clarifying rules is a never-ending process, and if anyone thinks an implementation is in clear violation of any rules, please open a PR or an issue and ping me and the maintainers. Otherwise, I think we'll shoot for the first complete run in August.

@shaovie
Contributor

shaovie commented Jul 17, 2023

Good to hear; looking forward to it.

@graemerocher
Contributor

@nbrady-techempower Requesting this again, as we keep getting questions: please remove the invalid Round 21 results for micronaut.

@NateBrady23
Member Author

NateBrady23 commented Jul 31, 2023

@graemerocher This is done.

Please note that we have added a "maintainers" property in the benchmark_config.json. Feel free to add yourself and anyone else who should be notified when changes are made to your tests. It's impossible for us to track who the maintainers are for each project, so this will help us catch issues like this in the future.

https://github.com/TechEmpower/FrameworkBenchmarks/wiki/Codebase-Framework-Files
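
As a rough sketch, the property is a list of GitHub usernames in benchmark_config.json, alongside the existing framework and test settings. The framework name, usernames, and test keys below are placeholders; see the wiki page above for the authoritative format.

```json
{
  "framework": "someframework",
  "maintainers": ["your-github-username", "another-maintainer"],
  "tests": [
    {
      "default": {
        "approach": "Realistic",
        "json_url": "/json",
        "plaintext_url": "/plaintext"
      }
    }
  ]
}
```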

everyone:

We're waiting on a few missing pieces for the new rack in the new place. We're hoping to be up and running on Thursday.

@graemerocher
Contributor

Thanks

@alfiver

alfiver commented Aug 7, 2023

May I ask when the new round of benchmark results will be released?

@NateBrady23
Member Author

@alfiver As soon as we get Citrine back online we're going to do a few full runs and then the next round. We'll be investigating the issues this week and have an update for everyone around Thursday.

@NateBrady23
Member Author

Sorry, folks. No good news yet. One of the machines won't stay on. We're looking into it and will update when we can.

@NateBrady23
Member Author

Ok, we got lucky! It was just the system battery. The servers have been moved to their new home and it looks like we're back up and running. We're going to finish a full run to make sure everything looks good, and then we'll send out a notice for when we're locking PRs for this round.

@fakeshadow
Contributor

I was just reading a thread on Reddit about the latest results, and a commenter mentioned that "Stripped" scores are included in the composite results. I didn't think this was allowed/possible, but it turns out this is in fact the case, at least for actix.

The /json and /plaintext results in the composite scores for actix are from the "server" configuration, which is marked as "Stripped". Is this correct, or a mistake in the collation of the results?

I know this is a known issue and just want to point out that the same is happening to xitca-web too. An unrealistic benchmark is unfortunately counted towards the total composite score. It would be best if the misleading results could be fixed in an official run. If a quick fix is not possible, I suggest both xitca and actix mark their unrealistic benchmarks as "broken" temporarily until Round 22 is finished.

@shaovie
Contributor

shaovie commented Aug 30, 2023

Ok, we got lucky! It was just the system battery. The servers have been moved to their new home and it looks like we're back up and running. We're going to finish a full run to make sure everything looks good, and then we'll send out a notice for when we're locking PRs for this round.

Excuse me, do you have any good news?

@NateBrady23
Member Author

Unfortunately we had some other issues come up, but hopefully they're resolved now. The latest is here.

@NateBrady23
Member Author

With the last run looking back to normal, it's time to actually set some dates for Round 22!

The run in progress will complete around 9/26. The following complete run will be a preview run. And we'll look to start the round run on 10/3.

We normally lock PRs down during the preview run. I would caution maintainers against making adjustments to their frameworks during that time. As a reminder, we don't rerun individual frameworks for completed runs.

@CosminSontu

Please wait for the .NET 8 LTS release if it's not already accounted for; the release is planned for November 14th this year.
Worst case, please use a release candidate of .NET 8. Thanks!

@joanhey
Contributor

joanhey commented Sep 24, 2023

Wait for the next run, then, since you don't know the results yet ....

@NateBrady23
Member Author

We had an internet outage that looks like it stopped the preview round run and communication to tfb-status. I'll be in the office tomorrow to see if the preview round completed successfully and kick off the official round.

@NateBrady23
Member Author

Someone was able to restart the service. Since the preview round wasn't able to complete, we'll do one more preview round and move the Round 22 official run to start around Oct 11th.

@graemerocher
Contributor

Could someone review and merge #8478 before the run? Thanks.

@macel94

macel94 commented Oct 13, 2023

Someone was able to restart the service. Since the preview round wasn't able to complete, we'll do one more preview round and move the Round 22 official run to start around Oct 11th.

@nbrady-techempower any news?

:D

@p8
Contributor

p8 commented Oct 13, 2023

@macel94 It's running: https://tfb-status.techempower.com/

NateBrady23 unpinned this issue Oct 18, 2023