High dependence on CPU performance #44
Comments
The performance of the following tests really needs a look:
In the other tests, most of the problems are primarily in user-space libraries (Mesa, libpng/libz, libpthread) and the kernel (DRM).
Hi! Thank you for this very interesting analysis. After your initial report I started taking a look at the CPU usage (I mostly used valgrind/callgrind), and, like you, noticed that in many cases the primary CPU consumer was not glmark2 itself, but rather some other part of the graphics stack. The CPU usage from libpng/libz shouldn't be a concern, since the textures are decoded at setup time and shouldn't affect benchmarking results. The CPU usage from drivers, and its effect on the benchmarks, is actually something that we want reflected in the benchmark results. There are still cases, like the ones you mention in the second comment, where the CPU usage lies predominantly in glmark2 itself. I have started looking into these and I will report progress in this issue.
Use std::copy to copy data instead of copying elements manually. Performance results indicate that this change results in speed-up of over 2x for this function. See #44.
I have pushed a performance improvement in 5b0f603 that helps with the CPU usage in the buffer scene. Looking at the refract scene, the majority of the CPU usage is in the scene setup code, so it shouldn't affect the benchmark results (of course, it would be good to improve it anyway).
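For readers who want to see the kind of change the commit message describes, here is a minimal sketch. It is not the actual glmark2 code: the function names `copy_manual` and `copy_bulk` and the use of `std::vector<float>` are assumptions made purely for illustration. The general point is that `std::copy` over contiguous, trivially copyable elements can be lowered by the standard library to a single bulk memory move, whereas a hand-written loop copies one element at a time unless the compiler manages to optimize it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: element-by-element copy, i.e. the manual pattern
// that the commit message says was replaced.
void copy_manual(const std::vector<float>& src, std::vector<float>& dst)
{
    assert(dst.size() >= src.size());
    for (std::size_t i = 0; i < src.size(); ++i)
        dst[i] = src[i];
}

// Hypothetical helper: the same copy expressed with std::copy. For
// trivially copyable element types, standard library implementations
// can turn this into one bulk memory move.
void copy_bulk(const std::vector<float>& src, std::vector<float>& dst)
{
    assert(dst.size() >= src.size());
    std::copy(src.begin(), src.end(), dst.begin());
}
```

Both helpers produce identical output, so a swap like this should not change what a scene renders, only how much CPU time the copy takes. The actual gain depends on the compiler, optimization level, and data size.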
For some tests, the results depend strongly on processor speed. I measured across a frequency sweep from 800 MHz to 3600 MHz, set with the cpufrequtils utility, on an i7-4790 with an AMD Radeon R7 250E graphics card:
a little like the fastest x86 processor, so they are demanding on the video card
The result at 3.6 GHz shows that nothing else is the bottleneck in this test.
The rest of the tests behave very similarly: starting at some CPU frequency, speeding the processor up further has practically no influence on the result:
The remaining tests stop responding to processor speed from 2.4..2.6 GHz, the highest result being [conditionals-3] (it needs a fast processor, but puts almost no load on the graphics card).
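One way to locate the frequency at which a test "stops noticing" the processor is to scan the measured scores from the top frequency downward and find where the step-to-step gain first becomes significant. The sketch below assumes the measurements are available as (frequency in MHz, score) pairs sorted by ascending frequency, with positive scores. The function name `saturation_mhz` and the `tolerance` parameter are hypothetical and are not part of glmark2 or cpufrequtils.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: returns the lowest frequency (MHz) beyond which every
// further step raises the score by less than `tolerance` (relative gain).
// Returns -1 if there are fewer than two samples, or if the score was still
// climbing at the highest measured frequency (the test looks CPU-bound).
// Assumes samples are sorted by ascending frequency and scores are positive.
int saturation_mhz(const std::vector<std::pair<int, double>>& samples,
                   double tolerance)
{
    if (samples.size() < 2)
        return -1;

    std::size_t i = samples.size() - 1;
    while (i > 0) {
        const double prev = samples[i - 1].second;
        const double gain = (samples[i].second - prev) / prev;
        if (gain >= tolerance)
            break;  // the step into sample i still helped noticeably
        --i;
    }

    // If we never moved off the last sample, no plateau was reached.
    return (i == samples.size() - 1) ? -1 : samples[i].first;
}
```

A return value of -1 corresponds to the CPU-bound case, where the score keeps growing all the way to the maximum frequency. Any other value marks the start of the plateau, where the graphics card or another part of the stack has become the limit.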
It is clear that this is real life, but perhaps you should pay attention to optimizing some tests. Although, of course, this can create a problem when comparing results with old versions of the benchmark.