Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Assorted optimizations #62

Closed
wants to merge 15 commits into from
Closed

Assorted optimizations #62

wants to merge 15 commits into from

Conversation

debrouxl
Copy link
Contributor

@debrouxl debrouxl commented Sep 1, 2017

Together, these commits shave more than 33500 bytes, i.e. more than 4.7%, over the size of the ARM binary compiled by the arm-none-eabi toolchain packaged by Debian unstable (GCC 6.3.1, binutils 2.28). They optimize the source code's size, too, but that doesn't matter :D
More than half of the optimization potential of an old version of this PR was destroyed by integrating #123 - essentially, my optimizations mostly remain, but jacobly's have disappeared.

Premature optimization is known to be bad... but so is letting known pessimization patterns (e.g. copying nontrivial classes such as Complex by value when const references do the job, copy-paste jobs, letting hundreds of trivial constructors / getters / setters / action methods out of line) stay, grow and multiply in the code base :)

I am well aware that the pull request contains way too many commits, and that some of those commits are very large (despite some splitting work). As long as I hadn't signed that that stupid, counter-productive CLA that you insist on people signing, it wasn't much of an issue, as you wouldn't have looked at the code anyway ;)
Before I signed the CLA, you couldn't integrate those changes even if they were coming from other persons (and in fact, you probably independently reinvented several changes on your side - as I wrote below, I noticed it when fixing conflicts), unless said changes were suggested by other persons before I implemented them, as occurred with a couple ideas from jacobly. But the code can still be useful to other persons for their own purposes, and can be subject to review, in the meantime.

@CLAassistant
Copy link

CLAassistant commented Sep 1, 2017

CLA assistant check
All committers have signed the CLA.

@debrouxl
Copy link
Contributor Author

debrouxl commented Dec 10, 2017

I've finished the rebase onto current master @ 8c37e8a .

Most of the optimizations I performed several months ago on 1.1.x remain valid, under some form, on 1.2.x. The largest conflicts were, probably as expected, in Poincare. The absolute savings of this PR are nearly 2 KB higher in 1.2.x than in 1.1.x, partially because I added some more optimizations, but as the whole code base is significantly larger, the relative savings are lower (5.8% -> 5.3%).

The number of patches was reduced from 18 to 16 by squashing two consecutive patches together, as well as removing a patch which added -fno-math-errno to the build flags - that used to be an optimization, it's now a 8-byte pessimization. However, the modified file count raised by several dozens.

Resolving some of the conflicts showed that a tiny subset of my changes was integrated upstream, probably independently reinvented, after I published them ;)

… on (mostly final, some singleton) classes with trivial constructors, getters and/or setters.

Signed-off-by: Lionel Debroux <lionel_debroux@yahoo.fr>
…ader, so that the compiler can leverage its prior knowledge of the fact that the destructor is trivial (noticed by disassembling the code) to greatly optimize derived classes' destructors.

Signed-off-by: Lionel Debroux <lionel_debroux@yahoo.fr>
…declared only once; optimize several strlcpy() with short strings as source; add and use strcpy() to replace strlcpy(dest, src, strlen(src)+1); add and use a function to factor out some redundant code in three app controllers.

Signed-off-by: Lionel Debroux <lionel_debroux@yahoo.fr>
… while making classes final - all of that only when it does not hurt size.

Signed-off-by: Lionel Debroux <lionel_debroux@yahoo.fr>
…, setters, forwards, while making classes final - all of that only when it does not hurt size.

Signed-off-by: Lionel Debroux <lionel_debroux@yahoo.fr>
…setters, forwards, while making classes final - all of that only when it does not hurt size.

Signed-off-by: Lionel Debroux <lionel_debroux@yahoo.fr>
…setters, forwards, while making classes final - all of that only when it does not hurt size.

Signed-off-by: Lionel Debroux <lionel_debroux@yahoo.fr>
…setters, forwards, while making classes final - all of that only when it does not hurt size.

Signed-off-by: Lionel Debroux <lionel_debroux@yahoo.fr>
…rds, while making classes final - all of that mostly when it does not hurt size.

NOTE: switching KDCoordinate from int16_t to int32_t would save nearly 1 KB.

Signed-off-by: Lionel Debroux <lionel_debroux@yahoo.fr>
…, BoundedStaticHierarchy, Complex, Integer and Symbol constructors, destructors and method bodies to the header, to optimize the code quite a bit by giving inlining a chance.

Signed-off-by: Lionel Debroux <lionel_debroux@yahoo.fr>
…ine::ComplexAndComplexReduction, Rational::Rational and their implementations + subroutines, take const Complex<T> & instead of (const) Complex<T>, to avoid inefficiently copying Complex instances around; inline most clone() methods, which saves a bit of space as well.

Signed-off-by: Lionel Debroux <lionel_debroux@yahoo.fr>
…ng the creation of instances from NAN, INFINITY and real numbers.

Signed-off-by: Lionel Debroux <lionel_debroux@yahoo.fr>
…ate() / toScalar() / delete triplets by calls to approximateToScalar(), which does exactly this.

Signed-off-by: Lionel Debroux <lionel_debroux@yahoo.fr>
…rsions, factor out methods between the FloatPairStore children, perform a bit of manual CSE.

This reduces accuracy a tiny bit, but... if accuracy were a goal, only double would be used in the first place.

Signed-off-by: Lionel Debroux <lionel_debroux@yahoo.fr>
…new[] and operator delete[]. It increases size very slightly (small code generation changes in some functions), but it allows executing a single bl instead of a bl followed by a b.

Signed-off-by: Lionel Debroux <lionel_debroux@yahoo.fr>
@debrouxl
Copy link
Contributor Author

At long last, the Epsilon licensing issue was alleviated (today's updates to #38 and the license change).
I still hate CLAs, because they're a barrier to contribution and they drive some contributors away (the GNU project, Canonical and others saw that effect)... but with NumWorks' reaffirmed and reinforced commitment to a CLA, despite the potential harm, me remaining adamant about not signing this kind of PoS isn't the most constructive behaviour I can display. Therefore, I signed the f* CLA, and let's proceed integrating some of my work to Epsilon.

I've finished the rebase onto current master ( fde4883 ), in several steps, all of which are published on my repo. However, I'll stagger force pushes to my repo's master branch, so that I can write comments here every time.

Nearly all of the optimizations I performed several months ago on 1.2.x remain valid, under some form, on 1.3.x. The absolute savings of this PR are ~200 bytes higher in 1.3.0 than in 1.2.0, but again, as the whole code base is significantly larger, the relative savings are lower (5.3% -> 5.1%).

No change on the number of patches.

Resolving some of the conflicts showed that a tiny subset of my changes was integrated upstream, probably independently reinvented, after I published them ;)

@debrouxl
Copy link
Contributor Author

debrouxl commented May 28, 2018

Guess what, nearly all optimizations still hold on 1.4.0 and 1.4.1. The absolute savings are just about the same, but the relative savings are now barely above 5.0%.

@debrouxl
Copy link
Contributor Author

Likewise for 1.5.0. The absolute savings are several dozens of bytes higher, but epsilon.elf gained over 36000 bytes, which deflates the relative savings to ~4.75%.

@EmilieNumworks
Copy link
Collaborator

I am going to close this PR as it is outdated. I am sorry we did not review it at that time.

@adriweb
Copy link
Contributor

adriweb commented Mar 6, 2020

Rebasing some of the commits for a "part 2" could likely still work, but will need some time, hmm

@debrouxl
Copy link
Contributor Author

debrouxl commented Mar 6, 2020

Some of the optimizations scattered around the PR are certainly still valid indeed, but this PR predated the usage of LTO, which was the way to go to reduce the size a lot, but which made the result of optimizations unstable over time, and as such, harder to predict and maintain in the face of code evolution...

LTO probably made a small subset of these optimizations automatically performed, but the main consequence was counter-intuitiveness and volatility over time of the results. Making the code of a function objectively better (individually smaller), or a class explicitly final (good for correctness and formerly usually for size optimization), can change link-time optimization decisions such as inlining, partial specialization and finalization, and end up causing a size increase. Explicit noinline marking could work around that in the current state of the code, but could later prove detrimental to optimization, after the code has changed to fulfill functional requirements.
Also, LTO made the build cycle much slower, and therefore, it's harder to test the result of individual optimizations, the way I did it when I worked on this PR in pre-LTO times.

Even if there were several speed-oriented changes causing minor size increases in my PR, e.g. inlining operator new / new[] / delete / delete[], the main goal has always been to decrease the size, so as to be able to cram more functionality within the N0100's low Flash memory size.

EDIT: the TL;DR of this message is that merging the whole set of changes from this PR would probably have been a bad thing, as some changes would have had adverse effect on size with LTO, though others kept being useful with LTO, as shown by issue #519 . However, I probably would have liked to know that you'd eventually be doing nothing with the changes in this PR, I would have spent less time on it ;)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants