Using fmt with Keil armcc compiler #758

Closed
Amomum opened this issue Jun 4, 2018 · 6 comments

Comments

@Amomum

Amomum commented Jun 4, 2018

That's not a bug, just a bunch of questions.

I tried to use fmt with Keil armcc, one of the major compilers for embedded ARM targets (ARMv7, Cortex-M and so on).
It lacks proper C++11 support (no standard library, UDL, etc.), so fmt 5 was out.

I was able to compile fmt 4.1 with minor patching of format.cc (I can post a patch if anybody is interested). And here's the deal:

My main looks like this:

#include "fmt/format.h"

int main(void)
{
  while(1);

  return 0;
}

(and obviously I had to reimplement fputc, ferror and a couple of other functions so stdout would actually print somewhere).
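For readers unfamiliar with retargeting, here is a minimal sketch of what such a reimplementation might look like. The names `g_uart_log`, `uart_putc` and `retarget_putc` are hypothetical and are not from the actual project; the "UART" is simulated with a string so the sketch also builds on a host. The `__ARMCC_VERSION` guard assumes that, with the armcc C library, user-provided `fputc` and `ferror` definitions replace the library ones; a real port typically needs a few more pieces that are not shown here.

```cpp
#include <cstdio>
#include <string>

// Hypothetical stand-in for a UART transmit register: on a host build the
// bytes are collected in a string so the output can be inspected.
static std::string g_uart_log;

// Hypothetical low-level output routine. On real hardware this would wait
// for the transmitter to be ready and write to a peripheral data register.
static void uart_putc(char c) { g_uart_log.push_back(c); }

// Shared body of the retargeted character output. Returns the character
// written, which is what the standard fputc returns on success.
static int retarget_putc(int ch) {
    uart_putc(static_cast<char>(ch));
    return ch;
}

#if defined(__ARMCC_VERSION)
// Assumption: defining these symbols routes stdio output through them.
extern "C" int fputc(int ch, FILE *) { return retarget_putc(ch); }
extern "C" int ferror(FILE *) { return 0; }  // this sketch never reports errors
#endif
```

Keeping the hardware access in one small function makes it easy to swap the simulated log for a real peripheral later.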

Exceptions were not really useful so I turned them off with FMT_EXCEPTIONS=0.

When compiled and linked with the --feedback option (link-time removal of unused code), this main produces a 37296-byte binary. That's kind of a lot!
I guess armcc is not capable of removing unused code from complicated templates (since no fmt functionality is actually used here). Using -O3 reduced the code size to 32 KiB, which is still a lot.

When I use plain printf to print one line, the code size is 1108 bytes. That's a big difference. I know that 37 KiB does not sound like a lot, but believe me, on an embedded device with just 64 KiB of memory it really is.
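To put those numbers side by side, here is a small worked calculation. It assumes that "64 KiB" means 65536 bytes and that the whole amount is available for code, which may not hold on a particular device; the function names are made up for this sketch.

```cpp
// Figures quoted in this post.
const unsigned long kFmtBytes    = 37296;  // empty main with fmt 4.1 linked in
const unsigned long kPrintfBytes = 1108;   // program printing one line with printf
// Assumption: "64 KiB" is 65536 bytes, all of it usable for code.
const unsigned long kDeviceBytes = 64UL * 1024UL;

// Share of the device taken by an image, in tenths of a percent.
// Integer division truncates, so 569 stands for "about 56.9%".
unsigned long share_permille(unsigned long image_bytes) {
    return image_bytes * 1000UL / kDeviceBytes;
}

// How many times larger the first image is than the second (truncated).
unsigned long size_ratio(unsigned long bigger, unsigned long smaller) {
    return bigger / smaller;
}
```

Under those assumptions the fmt build takes roughly 57% of the device, the printf build under 2%, and the fmt build is over 33 times larger.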

So my main question is: is there any way that I can reduce binary size with fmt?

@agauniyal

@Amomum is the compiler you mentioned available on godbolt? Then someone could look at how the instructions generated by the two compilers differ. For example, here I compared clang with ARM gcc and they generated a similar amount of code: https://godbolt.org/g/WwPum5

@Amomum
Author

Amomum commented Jun 5, 2018

@agauniyal no, unfortunately armcc is not available on godbolt; it's closed source and shareware.
I can post the full disassembly or a list of the functions that are linked into the binary.

I'm afraid that the assembly in your example looks similar because all the work is done in the called functions:

 call fmt::v5::vprint(fmt::v5::basic_string_view<char>, fmt::v5::format_args)

or

bl printf

and their assembly is not shown (and I'm not sure if it can be seen on godbolt at all).

@vitaut
Contributor

vitaut commented Jun 7, 2018

I agree that 30k is significant for a small embedded platform. The design ensures that per-call overhead is low (compared to printf), but I don't think any work has been done to reduce the size of the library itself. There might be some low-hanging fruit, but I don't have any good suggestions off the top of my head other than maybe not instantiating the wchar_t overloads in https://github.com/fmtlib/fmt/blob/4.x/fmt/format.cc#L477-L487 and using a subset of the library (format.cc and format.h), since it appears that the linker has trouble eliminating unused code.
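As a rough illustration of the wchar_t idea, the sketch below guards an explicit template instantiation behind a macro. `toy_formatter` and `TOY_ENABLE_WCHAR` are hypothetical names invented for this example; this is not fmt's code and fmt 4.x may not offer such a switch, so in practice it would mean editing format.cc locally.

```cpp
#include <cstddef>

// Toy stand-in for a character-generic formatter; NOT fmt's actual class.
template <typename Char>
struct toy_formatter {
    // Returns the length of a zero-terminated string of Char.
    static std::size_t length(const Char *s) {
        std::size_t n = 0;
        while (s[n] != Char()) ++n;
        return n;
    }
};

// An explicit instantiation makes the compiler emit code for this type in
// this translation unit, whether or not anything ends up calling it.
template struct toy_formatter<char>;

// Hypothetical opt-in macro: wide-character code is emitted only on request,
// so a linker that cannot drop unused code never sees it.
#if defined(TOY_ENABLE_WCHAR)
template struct toy_formatter<wchar_t>;
#endif
```

With the macro left undefined, only the `char` version ends up in the object file.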

@Amomum
Author

Amomum commented Jun 7, 2018

Here is the list of all functions that are linked in the binary; it's tab-separated because spaces and commas can be inside demangled function names.

Format is:

function name <tab> code size in bytes <tab> object file name

Maybe this will shed some light on what's going on.
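For anyone who wants to crunch such a list, here is a minimal sketch that sums code size per object file from text in the format above. The function name `size_per_object` is made up for this example, and the sketch assumes the size field is a plain decimal number; lines without three tab-separated fields are skipped.

```cpp
#include <cstdlib>
#include <map>
#include <sstream>
#include <string>

// Sums code size per object file from lines of the form
//   function name <tab> code size in bytes <tab> object file name
// Splitting on tabs only keeps spaces and commas inside demangled names intact.
std::map<std::string, unsigned long> size_per_object(const std::string &listing) {
    std::map<std::string, unsigned long> totals;
    std::istringstream in(listing);
    std::string line;
    while (std::getline(in, line)) {
        std::string::size_type first = line.find('\t');
        if (first == std::string::npos) continue;   // no size field
        std::string::size_type second = line.find('\t', first + 1);
        if (second == std::string::npos) continue;  // no object file field
        std::string size_field = line.substr(first + 1, second - first - 1);
        std::string object = line.substr(second + 1);
        totals[object] += std::strtoul(size_field.c_str(), NULL, 10);
    }
    return totals;
}
```

Sorting the resulting totals by size is a quick way to see which object files dominate the image.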

@vitaut
Contributor

vitaut commented Jun 8, 2018

Looks like sprintf used by floating-point formatting pulls in a lot of stdio functions, even scanf:

    __vfscanf                                	878	_scanf.o
    _scanf_really_hex_real                   	786	scanf_hexfp.o
    _printf_fp_hex_real                      	756	_printf_fp_hex.o
    __btod_div_common                        	696	btod.o
    _scanf_really_real                       	684	scanf_fp.o

If you don't use floating-point formatting, you might try conditionally compiling (or rather removing) the implementation of write_double in

    void basic_writer<Range>::write_double(T value, const format_specs &spec) {

i.e. making it a no-op.
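A minimal sketch of that idea, using a toy class rather than fmt's real basic_writer: `toy_writer` and the `TOY_NO_FLOAT` macro are hypothetical names, and the real function has a different signature and body. The point is only that once the stdio call is compiled out, nothing references it from this function any more.

```cpp
#include <cstdio>
#include <string>

// Toy writer used only for illustration; NOT fmt's basic_writer.
struct toy_writer {
    std::string out;

    void write_double(double value) {
#if defined(TOY_NO_FLOAT)
        // Hypothetical opt-out: the body is empty, so this function no
        // longer references the stdio formatting routines.
        (void)value;
#else
        // Default path: delegate to the C library, as the comment above
        // describes for floating-point formatting.
        char buf[32];
        std::snprintf(buf, sizeof(buf), "%g", value);
        out += buf;
#endif
    }
};
```

Building with `TOY_NO_FLOAT` defined turns floating-point output into a silent no-op, so this only makes sense if the application never formats such values.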

@vitaut
Copy link
Contributor

vitaut commented Jun 10, 2018

Closing for now, but will be happy to discuss the question of reducing the library size further.
