[Requirements] Add that Response cache is not permitted in Fortunes test #6529
Comments
I'll bring this up with everybody on Monday, but I agree with you. Unfortunately, I can't think of a way to test for this during verification. (Always makes me think how cool it would be if we had some static code analysis.) So, it will require persistence from the community and might be hard to find. |
It isn't easy to automate the verification, but the same is true of the rule against caching Fortune objects or rows. Perhaps it would be good to add a random number or the time to the "Additional fortune added at request time." message, so that the template output is more difficult to cache. But looking at the results, I can't understand some of the numbers. Compare a single-row query with 2 simple numbers converted to JSON vs a 12-row query, sort, escape, template, ... Examples from the last run:
vs
|
I like this idea. |
If we add a random character plus a random number at the beginning, it can also defeat any sort cache. |
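A minimal sketch of the randomisation idea above, in Python. The names `make_additional_fortune` and `render_rows` and the exact prefix scheme are assumptions made for illustration; they are not part of the benchmark requirements or of any framework. A random letter and a random number are prepended to the request-time fortune, so both the sort position and the rendered HTML can change from one request to the next.

```python
import html
import random
import string

BASE_MESSAGE = "Additional fortune added at request time."

def make_additional_fortune(rng=random):
    """Build the request-time fortune with a random prefix (hypothetical scheme)."""
    letter = rng.choice(string.ascii_lowercase)  # random leading letter moves the sort position
    number = rng.randint(0, 9999)                # random number changes the rendered text
    return (0, f"{letter}{number} {BASE_MESSAGE}")

def render_rows(db_rows, rng=random):
    """Append the randomised fortune, sort by message, escape, and render table rows."""
    rows = list(db_rows)
    rows.append(make_additional_fortune(rng))
    rows.sort(key=lambda row: row[1])
    return "".join(
        f"<tr><td>{fid}</td><td>{html.escape(msg)}</td></tr>" for fid, msg in rows
    )
```

Because the extra row differs per request, a cache keyed on the sorted list or on the composed HTML would miss almost every time, so the sort, escape, and template steps would have to run on each request.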
All the 20 round is bad for that. |
I didn't feel I was violating the
And I don't know if it should be forbidden. This usage is an integral part of optimising implementations. I'm afraid that by adding new rules, we'll forbid any kind of optimization, which would make it impossible to differentiate between frameworks using the same technologies. Adding random data to the Fortune, why not, but this would change the context of the test because of a failure to respect a rule that is not currently stated. Obviously, if this new rule was added, I would make sure that the PS: It would be good practice to ping the authors of the implementations when mentioning them in a new issue. |
I didn't say that any framework is violating the requirements. So we need to clarify the requirements: is template caching permitted or not? If it's permitted:
In the Fortunes test it's easy to cache: the template always receives the same data and produces the same HTML. |
@nbrady-techempower any news? |
@joanhey I've shared again with the team to see if I can get more thoughts on it, thanks. |
This is an interesting one. It reminds me in some ways of the conversations we had about allowing pipelined queries in database connectivity providers. After some back and forth, we ultimately decided to allow query pipelining because, as @jcheron points out above, some optimization strategies benefit real-world applications and not just results in this benchmark. In other words, some optimizations that appear at first to violate the spirit of our tests by optimizing "to the benchmark," are in fact something of the opposite: providing a new benefit to real-world applications. Precluding such optimizations would be curiously antithetical to the project's goals, which include motivating frameworks and platforms to improve performance for the benefit of application developers. That said, I am not personally convinced that caching the result of a template composition is providing unique and innovative value to real-world applications. I think "template caching" means two different things:
It is my understanding that the latter is what we're discussing in this thread. Perhaps this would more accurately be called "Template composition caching" or even "Response caching." It's the similarity to actual "response caching" that leads me to doubt the unique and innovative value provided here. I see caching the resulting composition as nearly equivalent to caching the bulk of the response. If your application is serving static content, as implied by a scenario where "template composition caching" provides value, it would likely have a reverse proxy cache or CDN between the user and your server anyway. In our Fortunes test, the intent is for the context object, though always the same, to be a stand-in for an actually dynamic context object (hence the dynamic addition of an item and sorting).

There are subtle differences, of course. The template composition caching approach, if I understand it correctly, would require no cache invalidation if the context object ever changed. The cache lookup would be a miss, and a new composition would be generated and cached seamlessly. By comparison, caching a response at a reverse proxy will typically have some timeout after which a request will be sent to the back-end server for refreshing the cache.

When we discussed query pipelining, we also discussed the possibility of adding a results filter on this attribute since the approach was sufficiently different that it seemed reasonable people might want to be able to slice the data accordingly (e.g., show me only results from implementations that use pipelining or vice-versa). Although that didn't get implemented, it might still be useful. If such finer-grained attributes were added, template composition caching could be added as another attribute.

While I am currently leaning toward asking that we do not use template composition caching, I'd like to hear more thoughts and opinions. |
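To make "template composition caching" concrete, here is a minimal Python sketch. `CompositionCache` and `render_fortunes` are hypothetical names invented for this illustration and do not reflect how any particular framework implements it. The cache is keyed by the whole context, so a changed context is simply a miss that gets rendered and stored, while an unchanged context never reaches the renderer again.

```python
import html

class CompositionCache:
    """Hypothetical cache of composed output, keyed by the full template context."""

    def __init__(self, render):
        self._render = render
        self._store = {}
        self.render_calls = 0  # counts how often the template actually ran

    def get(self, rows):
        key = tuple(rows)  # the whole context is the cache key
        if key not in self._store:
            # Miss: compose once and remember the result; no invalidation is needed.
            self.render_calls += 1
            self._store[key] = self._render(rows)
        return self._store[key]

def render_fortunes(rows):
    """Stand-in for a template engine: escape each message and build a table."""
    body = "".join(
        f"<tr><td>{fid}</td><td>{html.escape(msg)}</td></tr>" for fid, msg in rows
    )
    return f"<table>{body}</table>"

cache = CompositionCache(render_fortunes)
rows = [(0, "Additional fortune added at request time."), (1, "A <b>bold</b> fortune")]
for _ in range(1000):
    cache.get(rows)
print(cache.render_calls)  # prints 1
```

With a context that is identical on every request, as in the Fortunes test, the escaping and composition run once and the remaining 999 requests only measure a dictionary lookup.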
I fully agree with this point @bhauer:
Now, let's look at the value of this type of cache (template caching) in a real context:
This last case, which is possible in production, is quite close to the context described in the Fortunes specification: It could be argued that in this case, we are no longer testing the efficiency of the template engine rendering, but of its caching system.
|
It doesn't have to be; it can be done at the application level, and as you said, we are not testing the templating engine anymore, which to me is what is important in this test.
Well, the cache could still be implemented at the application level, where the rendering happens, and yes, in most if not all cases it will be faster than rendering the template, for obvious reasons. That doesn't mean templates are bad, but it might be an interesting thing to compare. Unlike the pipelined tests, which still test DB performance, this stops testing the templating engine or HTML rendering and tests a caching system instead, so IMO it should be forbidden. |
To fix this issue, it's only necessary to change the requirements, and the change is easy. Any framework can cache the response, but we are not testing the cache. We are testing templating, sorting, and string escaping. |
I am quite late to the party, but as the author of the So, I did a quick investigation and discovered the following: The last run that contained "reasonable" results was 92383925-3ba7-40fd-88cf-19f55751f01c (single query - 470930 requests per second, fortunes - 421247). The next run, 51274292-fa20-4316-bc29-4138d6ac607e, was problematic (the values being 403863 and 415689 respectively; note that both of them regressed, though the single query test was affected significantly worse by 14.24%, while the effect on the fortunes value was moderate at 1.32%). However, if you check the history of the While I am unable to pinpoint the cause, I can offer a theory (only for |
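The regression percentages quoted above can be reproduced from the four throughput figures with a short Python sketch. The helper name `regression_pct` is an assumption for illustration; the numbers are the ones given for the two runs.

```python
def regression_pct(before, after):
    """Relative drop from `before` to `after`, as a percentage of `before`."""
    return (before - after) / before * 100

# Requests per second from runs 92383925-... (before) and 51274292-... (after).
single_query = regression_pct(470930, 403863)
fortunes = regression_pct(421247, 415689)

print(f"single query: {single_query:.2f}%")  # prints single query: 14.24%
print(f"fortunes: {fortunes:.2f}%")          # prints fortunes: 1.32%
```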
Thanks @volyrique. After checking rounds 18 and 19 (with the vulnerability mitigations), that's when this strange behaviour started. So it is becoming more and more important to upgrade the kernel. #7321 Still, some frameworks use a response cache, so we need to clarify the rules. |
@joanhey @volyrique @nbrady-techempower @dantti @jcheron on the general point here, i totally agree template output should absolutely not be cached. the whole point of the fortunes test is to compare the performance of executing the template with html escaping against the database results for every request. @joanhey's suggestion of randomizing the additional row seems like a good solution to ensure this isn't allowed. |
@billywhizz Sorry, I don't see it, at least for |
hi @volyrique sorry, my explanation was poor and actually incorrect, but the issue is normal and can be explained. if you look here: https://ajdust.github.io/tfbvis/?testrun=Citrine_started2022-01-12_d9300976-46e5-4dcd-a72b-4f14c01ef5d8&testtype=db&round=false you can see that h2o scores 393 krps for single query and 401 krps for fortunes. but if you look at the cpu usage on the web server for the single query test it is 63% utilized - indicating that the bottleneck in this case is the database which is likely at 100% cpu load while the web server sits idle 37% of the time. for fortunes, cpu usage is 88% on the web server. so, the database is still the bottleneck and is also 100% loaded for this test but web server is idle only 12% of the time. so if you calculate the requests per second if cpu on web server could be fully utilized and db was not a bottleneck then you would get: db = 393 / 63 * 100 = 623 krps if web server cpu could be fully utilized so, the web server can handle more /db rps than /fortunes but isn't able to because of the load on the database. does that make sense now? it's actually a shortcoming in the techempower tests - it would make more sense, if we are comparing web server performance, to rank the frameworks on the number of requests per second per cpu rather than just the raw rps number. |
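The CPU-normalised estimate above can be written as a small Python sketch. The function name `projected_krps` is hypothetical, and the calculation assumes throughput would scale linearly with web server CPU if the database were not the bottleneck, which is a simplification.

```python
def projected_krps(measured_krps, web_cpu_pct):
    """Scale measured throughput to 100% web server CPU (assumes linear scaling)."""
    return measured_krps / web_cpu_pct * 100

# h2o figures quoted above: throughput in krps and web server CPU utilisation in percent.
db = projected_krps(393, 63)        # about 623.8 krps
fortunes = projected_krps(401, 88)  # about 455.7 krps

print(f"db: {db:.1f} krps, fortunes: {fortunes:.1f} krps")
```

Under that assumption the single query test projects to a clearly higher ceiling than fortunes, even though the raw numbers put fortunes slightly ahead.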
@billywhizz I agree with your analysis - it presents a quite useful framework-independent technique to demonstrate that the results are consistent with the expectation that the single query test has higher throughput than the fortunes one, i.e. there is a bottleneck that limits them both. However, it doesn't answer the really interesting question, which is why is the fortunes test affected just a tiny bit less than the other one - TBH I haven't bothered doing a detailed statistical analysis across many runs or anything like that, but it seems that the effect is consistent and reproducible (that is, in the |
Fortunes Requirements
The test is clearly meant to exercise server-side templates, XSS countermeasures, ..., which need to be executed on every request, and not to test the template cache.
And
xxi. Use of an in-memory cache of Fortune objects or rows by the application is not permitted.
But it does not say anything about the template cache.
Almost all template systems have a cache, and using it should not be allowed in the tests.
One example is Ubiquity framework in async mode: https://github.com/phpMv/ubiquity/blob/9090fe990f49b475d68f416d9e64f6d1ae6f2580/src/Ubiquity/controllers/SimpleViewAsyncController.php#L18-L27
But perhaps more frameworks are using it.