Add extra precision configure switch #449
Large tiles are now enabled in the default precision mode with assembly.
60181eb to 5f12e9c
Perhaps I’m misunderstanding something, but isn’t the latter option both more precise and faster across the board?
I trust you, but I have a few comments regarding scientific rigour. In general, with any benchmark, it would be great to have the data and methodology public so that other people could (if they wanted) replicate the results on their own hardware and platforms, better understand the impact, and potentially spot methodology flaws and suggest improvements. Admittedly, I doubt any of us actually has the time and desire to scrutinize this, good practice though it would be.
Also, you suggest the results with and without assembler code are different, but the table shows only one set.
That benchmark doesn't measure performance in the relevant case (slow animation). On the other hand, simple math considerations say that for sufficiently slow animations performance should be proportional to
This result is not of scientific quality; I grabbed some tests lying around and pasted them together. This is also why I don't want to make too drastic changes based on it. But I can show the test sources through IRC to anyone interested.
I don't have relevant hardware for such tests (ARM-based or some outdated 32-bit Atom), so that decision is mostly guesswork. I've done some tests with assembly disabled and a lowered architecture target, but that's even less relevant than the other tests.
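For anyone who wants to replicate timing measurements on their own hardware, here is a minimal sketch of how repeated runs and their run-to-run spread could be recorded. It is not the test code used in this PR; `hypothetical_workload` is a placeholder invented for illustration, and a real test would render a subtitle script in its place.

```python
import statistics
import time


def time_runs(func, repeats=10):
    """Call func() `repeats` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return (mean, sample standard deviation, stdev relative to the mean)."""
    mean = statistics.mean(durations)
    stdev = statistics.stdev(durations) if len(durations) > 1 else 0.0
    return mean, stdev, (stdev / mean if mean else 0.0)


def hypothetical_workload():
    # Placeholder only (assumption): stands in for rendering a test script.
    sum(i * i for i in range(100_000))


if __name__ == "__main__":
    mean, stdev, rel = summarize(time_runs(hypothetical_workload))
    print(f"mean={mean:.6f}s stdev={stdev:.6f}s relative spread={rel:.1%}")
```

Reporting the spread next to the mean makes it easier to judge whether a difference between two configurations is larger than the noise between runs.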
To verify the new defaults against some more scenarios, I ran the error tests against these regression tests. The reference images the errors are calculated against were created based on current master 5c30976, with
I only did precision tests, no runtime tests, as getting good results for those short tests with relevant run-by-run variance would require much more work.
First, here are the results for unmodified master:
With x86_64 ASM:
Without ASM:
Now the new defaults as per this PR's 5f12e9c:
New x86_64 ASM default:
New default with --disable-asm:
Extra precision with --disable-asm:
In most scenarios the new defaults decrease the deviation from the reference images, but the karaoke tests are a notable exception. This is not necessarily due to the karaoke effect itself, as there are also blurs and borders in the current version of the karaoke tests.
Here are the full compare results:
As a side note, the image difference between ASM and non-ASM configs might cause some additional work for our future test suite (#108), as we might end up needing multiple sets of reference images: one without ASM and one more per supported CPU architecture.
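To make "deviation from the reference images" concrete, here is a minimal sketch of one way such a deviation could be computed. It assumes both images are given as flat sequences of pixel samples of equal length, and it reports the maximum absolute error and the RMS error. This is an illustrative metric only, not necessarily the one the compare tests in this thread use.

```python
import math


def deviation(image, reference):
    """Return (max_abs_error, rms_error) between two equally sized pixel sequences."""
    if len(image) != len(reference):
        raise ValueError("images must have the same number of samples")
    if not image:
        return 0, 0.0
    # Per-sample absolute difference against the reference.
    diffs = [abs(a - b) for a, b in zip(image, reference)]
    rms = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return max(diffs), rms


if __name__ == "__main__":
    # Made-up sample values, purely for illustration.
    rendered = [0, 10, 255]
    reference = [0, 12, 250]
    print(deviation(rendered, reference))
```

With a metric of this kind, two builds that are not bit-identical (for example ASM and non-ASM) produce different scores against the same reference, which is why separate reference sets may be needed.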
Looks like
I think I checked that my assembly output is bit-identical to the corresponding C code output. I wonder where that difference arises.
Both
Now that the C-ASM differences have been dealt with, let's take a new look at this.
Full logs and the GNU Awk script used for generating these tables are attached: 449_logs_20210524.zip
Current Default
Current
Note that, for some reason, this has a lower deviation for
Proposed default for non-ASM:
Significantly bigger deviation for
Proposed default for ASM:
Sometimes more, sometimes less deviation than no-ASM.
Proposed
I'm not sure about removing the "lower quality, but faster"
You should use
Right, the current description makes it sound like a flat improvement though. I think there may be some use for a real “lower quality, but faster” setting for some use cases; independent of whether Your results in the original post suggest there is a potential time reduction of up to 13% with lowered precision settings. Given that you weren't looking to actually lower the quality in those tests, this percentage may increase further with different settings that are still acceptable for this use case. But I also don't know how representative of real-world use the test sample was.
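For reference, a small sketch of how a figure like "up to 13% time reduction" relates to raw timings. The numbers in the example are made up for illustration and are not measurements from this PR.

```python
def time_reduction(baseline_seconds, candidate_seconds):
    """Fraction of the baseline time saved by the candidate configuration."""
    if baseline_seconds <= 0:
        raise ValueError("baseline time must be positive")
    return (baseline_seconds - candidate_seconds) / baseline_seconds


def speedup(baseline_seconds, candidate_seconds):
    """How many times faster the candidate is than the baseline."""
    if candidate_seconds <= 0:
        raise ValueError("candidate time must be positive")
    return baseline_seconds / candidate_seconds


if __name__ == "__main__":
    # Hypothetical timings (assumption): 100 s at full precision, 87 s lowered.
    print(f"reduction: {time_reduction(100.0, 87.0):.1%}")
    print(f"speedup:   {speedup(100.0, 87.0):.3f}x")
```

Note that a 13% time reduction corresponds to a speedup of roughly 1.15x, so the two figures should not be mixed when comparing results.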
This made me remember that one of the karaoke tests was quite affected by the fix from #483. Perhaps it's better to wait for #483 to be merged before remeasuring this.
This is my version of PR #329.
I've updated the visual quality assessment tests (more tests, better reference target) and checked additional parameter values, so the table now looks as follows:
Legend: RASTERIZER_PRECISION; POSITION_PRECISION; compare tests;
Large tiles are now enabled by default if assembly is present, as without good vector assembly it's slower instead.