
optimizations: game load time about twice as fast #2416

Conversation

andrew-raphael-lukasik (Contributor)

Hi guys, I've been working on this for a few weeks now and wanted to get some feedback from you.

what

  • a lot of different optimizations that all target one single issue: game load time
  • adds profiler markers (load-time-related code only)
  • adds utility methods & jobs. One worth noting here: the any_managed_array.AsNativeArray() extension method (see the sketch after this list)
  • a few unit tests to make sure those parts aren't broken (we'll see, maybe I will write more of them)
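
As a rough idea of what such an extension method could look like, here is a minimal sketch assuming Unity.Collections and a pin-and-alias approach; the exact signature, the GCHandle bookkeeping, and the class name ManagedArrayExtensions are my assumptions, not necessarily what this PR implements:

```csharp
using System.Runtime.InteropServices;
using Unity.Collections;
using Unity.Collections.LowLevel.Unsafe;

public static class ManagedArrayExtensions
{
    // Sketch only: wraps a managed array as a NativeArray without copying,
    // by pinning the managed memory. The caller must call handle.Free()
    // once the NativeArray is no longer in use.
    public static unsafe NativeArray<T> AsNativeArray<T>(this T[] managed, out GCHandle handle)
        where T : unmanaged
    {
        handle = GCHandle.Alloc(managed, GCHandleType.Pinned);
        NativeArray<T> array = NativeArrayUnsafeUtility.ConvertExistingDataToNativeArray<T>(
            (void*)handle.AddrOfPinnedObject(), managed.Length, Allocator.None);
#if ENABLE_UNITY_COLLECTIONS_CHECKS
        // A real implementation would also need to release this safety handle.
        NativeArrayUnsafeUtility.SetAtomicSafetyHandle(ref array, AtomicSafetyHandle.Create());
#endif
        return array;
    }
}
```

Usage would then be roughly `var native = managedBytes.AsNativeArray(out var handle);` on the load path, with `handle.Free()` once the jobs reading it have completed.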

why

  • load time on my machine was too high (up to 5-8 seconds)

result

  • it reduced load time to 2-4 seconds for the initial load and 1-2 seconds for subsequent ones

  • performance markers give some idea of where further optimizations can be made

  • auxiliary cores do less work now, which seems bad, but this is the result of Burst compiling jobs that previously weren't compiled, AND a general deficit of IJob-ified code to throw at them

  • Old profiler view (left): basically no useful info. New profiler view (right): detailed insight into what the CPU is doing.

  • Unit tests. Low coverage but a good start.

@andrew-raphael-lukasik changed the title from "Game load time optimizations (about twice as fast)" to "optimizations: game load time about twice as fast" on Sep 4, 2022
@Interkarma (Owner)

Hey there. :) I'm happy to see your optimisation efforts, and I do value the large amount of work you've put into this. However, the way this PR is presented makes it difficult for me to review and stage your changes across multiple releases.

The most obvious is the significant number of style-only changes that don't serve the PR intent as described. For example:

  • Changing camelCase to PascalCase - (shm to Shm)
  • Renaming fields - (paletteData to paletteBuffer)
  • Using var instead of explicit types (NativeArray<byte> tileData to var tileData)
  • Rewriting legacy properties with the lambda operator (=>) (see the illustrative sketch after this list)
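
For context, an illustrative before/after of that last kind of change (recordCount is a made-up field name, not one from the PR):

```csharp
// Before: legacy property syntax, as written by the original author.
class BeforeStyle
{
    int recordCount;
    public int RecordCount
    {
        get { return recordCount; }
    }
}

// After: expression-bodied member ("lambda operator"); behaviour is identical.
class AfterStyle
{
    int recordCount;
    public int RecordCount => recordCount;
}
```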

None of this is strictly a problem in isolation, but it very much muddies the intent in a large PR with dozens of files impacted. My preference is for contributors to follow the writing style of the original author, be it myself or someone else. Otherwise, I have to spend considerable time just unpacking which changes are functional and which are the latest contributor's style changes. You're welcome to follow your own particular style in your new code files, provided it's still easy for others to read.

I also prefer functional changes to be split across several smaller changes over time, with each stage in public hands for a while before the next. I've found this gradual, incremental approach a good way to find problems and adapt rapidly.

As an example, the way @KABoissonneault staged changes to support custom enemies over several months, spread across many releases, worked effectively. Small problems were found at each stage of this process after release, but thanks to the incremental approach the impact was limited and easily fixed in each subsequent release. If those changes had been dropped all at once, the issues would have been compounded and harder to isolate. This ultimately helps end users by limiting issues, in addition to helping me review changes.

I'm not opposed to your optimisation efforts. But I need them to be unpacked, limited to functional changes, and better distributed so they can be rolled out incrementally. I can't say that I'll merge this on this side of 1.0. And depending on where I go after finishing with DFU, I may not be the steward of the project when this is merged.

I hope others will check in on this PR and provide feedback - @KABoissonneault and @ajrb might want to look through as it touches their work also. I always appreciate their eyes on the more complex changes.

Thank you @andrew-raphael-lukasik. I hope you understand. :)

@andrew-raphael-lukasik (Contributor, Author)

Thank you @Interkarma, that is exactly what I needed, and I humbly agree 100%. I will divide this up into smaller, more reviewer-friendly PRs as you said.

@andrew-raphael-lukasik (Contributor, Author)

note

I tested some of my previous assertions and came to the conclusion that some of the math-oriented changes made earlier in this PR are actually detrimental. More on that here.

@Interkarma (Owner)

Thank you for the detailed write-up. I appreciate the extra effort to confirm changes were actually beneficial. No point breaking things for the sake of it, or even for minor gains. Good job!

Anecdote time. When I was a younger person coming out of 6502 into C/C++, I stressed over every instruction and every line of code. I wrote inline assembly, unrolled loops, and wrote all my own libraries. After a while, I noticed that compilers generally did a better job at optimisation than I could, and the standard libraries were faster too. I'm not the brightest programmer (certainly not on your level), so this came as a tremendous relief to me. :) After some time, I started concentrating on making stuff work as the top priority, then optimising the biggest bottlenecks first. As I like to say, "perfect is the enemy of good".

But with your help, I think DFU will take quite a few steps beyond good. Cheers. :)

@andrew-raphael-lukasik (Contributor, Author) commented Sep 9, 2022

While reading deeper into the DFU source code, I only gained more respect for you, sir. Hats off, truly amazing job reverse-engineering all that!

Having some narrow programming skills (like mine) is definitely nowhere near enough to get all this resolved and a project of this magnitude done and working. And the end result is awesome: people will enjoy this game for decades to come, that is for sure.

One last point on optimization: compilers can optimize math, but they can't optimize how we read from memory, and that is often the most common perf bottleneck nowadays. I am merely repeating what this guy, Mike Acton, says: https://www.youtube.com/watch?v=rX0ItVEVjHc
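
To make that concrete, here is a small illustrative sketch (made-up types, not code from this PR) of why memory layout, rather than arithmetic, tends to dominate: the sum below is identical math either way, but only the struct-of-arrays layout keeps the values it reads packed into consecutive cache lines.

```csharp
// Array-of-structs: summing Health also drags the unused fields through the cache.
public struct EnemyAoS
{
    public float Health;
    public float PosX, PosY, PosZ;
    public int FactionId;
}

// Struct-of-arrays: Health values are contiguous, so every fetched cache line is fully used.
public sealed class EnemiesSoA
{
    public float[] Health;
    public float[] PosX, PosY, PosZ;
    public int[] FactionId;
}

public static class HealthSum
{
    public static float Total(EnemiesSoA enemies)
    {
        float total = 0f;
        for (int i = 0; i < enemies.Health.Length; i++)
            total += enemies.Health[i]; // sequential, prefetcher-friendly reads
        return total;
    }
}
```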

6502 ❤️
