@QuintillusCFC

Boosts FLC import performance by 40% (what used to take 1 second now takes 0.6 seconds).

 - Minor things in ReadFlic (ToString implementation), Game.cs (move the start-up timer to the right place so I get metrics for just the FLC), and project.godot (the console can now handle reams of print statements, useful when I was printing all the images in all the FLCs to get an idea of the overall size of the task)
 - Micro-optimizations in LegacyUnit.cs that don't make much of a difference.  Basically, replacing expensive CPU instructions (multiply, divide, modulus) with cheap ones (addition, shift logical left).
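
As a rough illustration of the kind of substitution meant here (not the actual LegacyUnit.cs code; the class and method names below are made up for the example), a per-row multiply can become a running addition, and a multiply by a power of two can become a left shift. Compilers and JITs often do this themselves when the operand is a constant, which may be part of why the gain is small.

```csharp
public static class StrengthReduction
{
    // Baseline: one multiply per row to find where each row starts.
    public static int[] RowOffsetsMultiply(int width, int height)
    {
        var offsets = new int[height];
        for (int y = 0; y < height; y++)
        {
            offsets[y] = y * width;
        }
        return offsets;
    }

    // Same offsets, but each one is the previous offset plus width,
    // so the loop body only needs an addition.
    public static int[] RowOffsetsAdd(int width, int height)
    {
        var offsets = new int[height];
        int offset = 0;
        for (int y = 0; y < height; y++)
        {
            offsets[y] = offset;
            offset += width;
        }
        return offsets;
    }

    // n * 4 written as a shift left by 2 bits.
    public static int TimesFour(int n) => n << 2;
}
```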
…els takes half the time.

Obviously we don't want to merge this, but it feels good to follow the one commit, one change ideal when it so clearly demonstrates a difference.
…ching from SetPixel to LoadBmpFromBuffer.

SetPixel was taking up half the time before, about 350 ms out of just over 700.  This new method reduces that time by 80%.

One major call-out about this code is that it uses an unsafe helper function to process the bits, including pointers and pointer arithmetic.  When thinking "how do we create a byte buffer where there are various length primitives included in it?" this seemed like the way to go, as it allows us to treat shorts and ints as shorts and ints by pointer casting.
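
For anyone who hasn't seen this style of C#, here is a minimal sketch of the pointer-casting idea. It is not the helper from this PR: the class name, method name, and the assumption of an 8-bit image with a 256-entry palette are mine. It needs AllowUnsafeBlocks enabled, and it relies on the host being little-endian, since BMP stores its fields little-endian.

```csharp
public static class BmpHeaderUnsafe
{
    // Hypothetical helper: allocates a BMP-sized buffer and fills in the
    // header fields by casting a byte pointer to int* / short*.
    public static unsafe byte[] BuildHeader(int width, int height)
    {
        const int fileHeaderSize = 14;      // BITMAPFILEHEADER
        const int infoHeaderSize = 40;      // BITMAPINFOHEADER
        const int paletteSize = 256 * 4;    // assumed: 256 entries, 4 bytes each
        int rowStride = (width + 3) & ~3;   // BMP rows are padded to 4 bytes
        int pixelOffset = fileHeaderSize + infoHeaderSize + paletteSize;
        int fileSize = pixelOffset + rowStride * height;

        byte[] buffer = new byte[fileSize]; // zero-filled, so unset fields stay 0
        fixed (byte* p = buffer)
        {
            p[0] = (byte)'B';
            p[1] = (byte)'M';
            *(int*)(p + 2) = fileSize;        // total file size
            *(int*)(p + 10) = pixelOffset;    // where the pixel data starts
            *(int*)(p + 14) = infoHeaderSize; // size of the info header
            *(int*)(p + 18) = width;
            *(int*)(p + 22) = height;
            *(short*)(p + 26) = 1;            // colour planes, always 1
            *(short*)(p + 28) = 8;            // bits per pixel (assumed paletted)
        }
        return buffer;
    }
}
```

The palette and pixel bytes would then be copied in after the header before handing the buffer to LoadBmpFromBuffer.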

We might be able to work around it without an unsafe block by tossing the ints/shorts in the header in as multiple bytes, based on their constituent bytes, and using bytes from the palette as-is.  But that introduces a different level of complex reasoning.
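
A sketch of what that safe alternative could look like, assuming little-endian output as BMP requires. The names here are hypothetical, and the palette is assumed to already be in the 4-bytes-per-entry layout BMP expects, starting right after the 54 bytes of headers.

```csharp
using System;

public static class BmpBytes
{
    // Writes a 32-bit value as its four constituent bytes, lowest byte first.
    public static void WriteInt32LE(byte[] buf, int offset, int value)
    {
        buf[offset]     = unchecked((byte)value);
        buf[offset + 1] = unchecked((byte)(value >> 8));
        buf[offset + 2] = unchecked((byte)(value >> 16));
        buf[offset + 3] = unchecked((byte)(value >> 24));
    }

    // Writes a 16-bit value as its two constituent bytes, lowest byte first.
    public static void WriteInt16LE(byte[] buf, int offset, short value)
    {
        buf[offset]     = unchecked((byte)value);
        buf[offset + 1] = unchecked((byte)(value >> 8));
    }

    // Copies palette bytes as-is to just past the 14 + 40 byte headers.
    public static void CopyPalette(byte[] buf, byte[] palette)
    {
        Array.Copy(palette, 0, buf, 54, palette.Length);
    }
}
```

This avoids the unsafe block at the cost of having to reason about shifts and byte order at every header field.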

Regardless of what we decide about the specifics of this function, it shows that there are real and significant gains to be had by using byte buffers instead of SetPixel, and it was fun to dive back into the BMP format and learn how to write some low-level C-style C#.
We've added enough stuff to slow this down again over the past 5 days, so it's now slower than before (2.4 seconds).  But we're drawing more stuff, and importing a SAV, and haven't tried to optimize any of that.  Not going to try to make everything lickety-split in this branch.
…r amount if enabled, though are useful for figuring out the scope of what we import.

[network]

limits/debugger_stdout/max_chars_per_second=16384

This lets the console print more stuff. When I was printing out all the pictures we were importing, the console couldn't handle it until I added this. Might be too much on really slow systems, but it seems like a generally good thing and didn't cause any problems on my 9.9-year-old system.

@QuintillusCFC QuintillusCFC left a comment

Looks like what I expected. Removed a few debugging sysouts.

@QuintillusCFC QuintillusCFC merged commit b711c0a into Development Nov 29, 2021
@QuintillusCFC QuintillusCFC deleted the FLCPerformance branch November 29, 2021 16:20