
optimizations: game load time about twice as fast #2416

Conversation

andrew-raphael-lukasik (Contributor)

Hi guys, I've been working on this for a few weeks now and wanted to get some feedback from you.

what

  • a lot of different optimizations that all target one single issue: game load time
  • adds profiler markers (load-time-related code only)
  • adds utility methods & jobs. One worth noting here: the any_managed_array.AsNativeArray() extension method (see the sketch after this list)
  • a few unit tests to make sure those parts aren't broken (we'll see, maybe I will write more of them)
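
As a rough idea of what such an extension method could look like, here is a minimal sketch assuming Unity.Collections and a pin-and-alias approach; the exact signature, the GCHandle bookkeeping, and the class name ManagedArrayExtensions are my assumptions, not necessarily what this PR implements:

```csharp
using System.Runtime.InteropServices;
using Unity.Collections;
using Unity.Collections.LowLevel.Unsafe;

public static class ManagedArrayExtensions
{
    // Sketch only: wraps a managed array as a NativeArray without copying,
    // by pinning the managed memory. The caller must call handle.Free()
    // once the NativeArray is no longer in use.
    public static unsafe NativeArray<T> AsNativeArray<T>(this T[] managed, out GCHandle handle)
        where T : unmanaged
    {
        handle = GCHandle.Alloc(managed, GCHandleType.Pinned);
        NativeArray<T> array = NativeArrayUnsafeUtility.ConvertExistingDataToNativeArray<T>(
            (void*)handle.AddrOfPinnedObject(), managed.Length, Allocator.None);
#if ENABLE_UNITY_COLLECTIONS_CHECKS
        // A real implementation would also need to release this safety handle.
        NativeArrayUnsafeUtility.SetAtomicSafetyHandle(ref array, AtomicSafetyHandle.Create());
#endif
        return array;
    }
}
```

Usage would then be roughly `var native = managedBytes.AsNativeArray(out var handle);` on the load path, with `handle.Free()` once the jobs reading it have completed.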

why

  • load time on my machine was too high (up to 5-8 seconds)

result

  • it reduced load time to 2-4 seconds for the initial load and 1-2 seconds for subsequent ones

  • performance markers give some idea of where further optimizations can be made

  • auxiliary cores do less work now, which seems bad, but this is the result of Burst compiling jobs that previously weren't compiled, AND a general deficit of IJob-ified code to throw at them

  • Old profiler view (left): basically no useful info. New profiler view (right): detailed insight into what the CPU is doing.

  • Unit tests. Low coverage but a good start.

@andrew-raphael-lukasik changed the title from "Game load time optimizations (about twice as fast)" to "optimizations: game load time about twice as fast" on Sep 4, 2022
@Interkarma (Owner)

Hey there. :) I'm happy to see your optimisation efforts, and I do value the large amount of work you've put into this. However, the way this PR is presented makes it difficult for me to review and stage your changes across multiple releases.

The most obvious is the significant number of style-only changes that don't serve the PR intent as described. For example:

  • Changing camelCase to PascalCase - (shm to Shm)
  • Renaming fields - (paletteData to paletteBuffer)
  • Using var instead of explicit types (NativeArray<byte> tileData to var tileData)
  • Rewriting legacy properties with the lambda operator (=>) (see the illustrative sketch after this list)
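
For context, an illustrative before/after of that last kind of change (recordCount is a made-up field name, not one from the PR):

```csharp
// Before: legacy property syntax, as written by the original author.
class BeforeStyle
{
    int recordCount;
    public int RecordCount
    {
        get { return recordCount; }
    }
}

// After: expression-bodied member ("lambda operator"); behaviour is identical.
class AfterStyle
{
    int recordCount;
    public int RecordCount => recordCount;
}
```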

None of this is strictly a problem in isolation, but it very much muddies the intent in a large PR with dozens of files impacted. My preference is for contributors to follow the writing style of the original author, be it myself or someone else. Otherwise, I have to spend considerable time just unpacking which changes are functional and which are the latest contributor's style changes. You're welcome to follow your own particular style in your new code files, provided it's still easy for others to read.

I also prefer functional changes to be split across several smaller changes over time, with each stage in public hands for a while before the next. I've found this gradual, incremental approach a good way to find problems and adapt rapidly.

As an example, the way @KABoissonneault staged changes to support custom enemies over several months, spread across many releases, worked effectively. Small problems were found at each stage of this process after release, but thanks to the incremental approach the impact was limited and easily fixed in each subsequent release. If those changes had been dropped all at once, the issues would have been compounded and harder to isolate. This ultimately helps end users by limiting issues, in addition to helping me review changes.

I'm not opposed to your optimisation efforts. But I need them to be unpacked, limited to functional changes, and better distributed so they can be rolled out incrementally. I can't say that I'll merge this on this side of 1.0. And depending on where I go after finishing with DFU, I may not be the steward of the project when this is merged.

I hope others will check in on this PR and provide feedback - @KABoissonneault and @ajrb might want to look through as it touches their work also. I always appreciate their eyes on the more complex changes.

Thank you @andrew-raphael-lukasik. I hope you understand. :)

@andrew-raphael-lukasik (Contributor, Author)

Thank you @Interkarma, that is exactly what I needed, and I humbly agree 100%. I will divide this up into smaller, more reviewer-friendly PRs as you said.

@andrew-raphael-lukasik (Contributor, Author)

note

I tested some of my previous assertions and came to the conclusion that some of the math-oriented changes made earlier in this PR are actually detrimental. More on that here.

@Interkarma (Owner)

Thank you for the detailed write-up. I appreciate the extra effort to confirm changes were actually beneficial. No point breaking things for the sake of it, or even for minor gains. Good job!

Anecdote time. When I was a younger person coming out of 6502 into C/C++, I stressed over every instruction and every line of code. I wrote inline assembly, unrolled loops, and wrote all my own libraries. After a while, I noticed that compilers generally did a better job at optimisation than I could, and the standard libraries were faster too. I'm not the brightest programmer (certainly not on your level), so this came as a tremendous relief to me. :) After some time, I started concentrating on making stuff work as the top priority, then optimising the biggest bottlenecks first. As I like to say, "perfect is the enemy of good".

But with your help, I think DFU will take quite a few steps beyond good. Cheers. :)

@andrew-raphael-lukasik (Contributor, Author) commented Sep 9, 2022

While reading deeper into the DFU source code, I only gained more respect for you, sir. Hats off, truly amazing job reverse-engineering all that!

Having some narrow programming skills (like mine) is definitely nowhere near enough to get all this resolved and a project of this magnitude done and working. And the end result is awesome: people will enjoy this game for decades to come, that is for sure.

One last point on optimization: compilers can optimize math, but they can't optimize how we read from memory, and that is often the most common perf bottleneck nowadays. I am merely repeating what this guy, Mike Acton, says: https://www.youtube.com/watch?v=rX0ItVEVjHc
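
To make that concrete, here is a small illustrative sketch (made-up types, not code from this PR) of why memory layout, rather than arithmetic, tends to dominate: the sum below is identical math either way, but only the struct-of-arrays layout keeps the values it reads packed into consecutive cache lines.

```csharp
// Array-of-structs: summing Health also drags the unused fields through the cache.
public struct EnemyAoS
{
    public float Health;
    public float PosX, PosY, PosZ;
    public int FactionId;
}

// Struct-of-arrays: Health values are contiguous, so every fetched cache line is fully used.
public sealed class EnemiesSoA
{
    public float[] Health;
    public float[] PosX, PosY, PosZ;
    public int[] FactionId;
}

public static class HealthSum
{
    public static float Total(EnemiesSoA enemies)
    {
        float total = 0f;
        for (int i = 0; i < enemies.Health.Length; i++)
            total += enemies.Health[i]; // sequential, prefetcher-friendly reads
        return total;
    }
}
```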

6502 ❤️
