Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some flash size optimizations related to string0.c (implementation of str/mem functions) #6397

Merged
merged 4 commits into from Oct 13, 2022

Conversation

jepler
Copy link
Member

@jepler jepler commented May 14, 2022

This saves 124 bytes on trinket m0, though at some performance cost (I didn't measure). On "full build"s it saves 24 bytes, the rest is only saved on small builds.

Taking just the extra memset() removal part of the patch saves 24 bytes of flash, removing the int-at-a-time optimization from string0 saves 100.

It's lightly tested on a pygamer with the de-optimization manually enabled (it wouldn't be enabled on a pygamer by default).

This saves 24 bytes of flash on trinket m0
This saves 100 bytes on trinket_m0 at the price of making many memset &
memcpy calls slower.
@dhalbert dhalbert added this to the 7.x.x milestone May 14, 2022
@dhalbert dhalbert modified the milestones: 7.x.x, 8.0.0 May 24, 2022
@jepler jepler requested a review from dhalbert June 27, 2022 17:54
@DavePutz
Copy link
Collaborator

I tested this on a CircuitPlayground express. The memset build took an extra 5812 bytes, but performance tested about the same between the two builds.

@jepler
Copy link
Member Author

jepler commented Jul 25, 2022

It's very surprising that this branch would take an extra 5812 bytes of flash storage. In case you looked at the size of this PR vs the size of main, I've updated the PR with the latest changes on main as of today.

@DavePutz
Copy link
Collaborator

@jepler , I was surprised as well. My numbers for a Trinket M0 build matched the ones you saw. I'll continue to investigate.

@DavePutz
Copy link
Collaborator

Re-tested CircuitPlayground Express with the latest changes. Size now looks much better, the increase was 148 bytes and performance was on a par with main. I built some boards with CIRCUITPY_FULL_BUILD = 0 (itsybitsy_m0, metro_m0_express, and qtpy_m0) and all saw reduced code sizes. The difference with CircuitPlayground Express looks to be due to it being a full build. An itsybitsy_nrf52840_express build (which is also a full build) was 208 bytes larger.

@jepler
Copy link
Member Author

jepler commented Jul 25, 2022

hmm that's still odd. If storage in flash isn't being freed up with this PR it doesn't make sense to take the change.

@dhalbert
Copy link
Collaborator

I tested the build sizes on a version of this merged with the latest main, on Trinket M0, CPX, Metro M0, and Metro M4. The savings are essentially as the same as reported in the OP, so I think this is safe to merge.

@dhalbert dhalbert merged commit 8825e7f into adafruit:main Oct 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants