Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

F() Flash Macro No Longer Recognized #62

Closed
woolseyj opened this issue Dec 2, 2019 · 38 comments
Closed

F() Flash Macro No Longer Recognized #62

woolseyj opened this issue Dec 2, 2019 · 38 comments

Comments

@woolseyj
Copy link

woolseyj commented Dec 2, 2019

The F() flash macro used for storing strings in flash memory no longer appears to be working as of the 1.8.5 version. The following error is reported during compile for each use.

error: cannot convert 'const char*' to 'const __FlashStringHelper*' in return
#define F(str) (str)

@facchinm
Copy link
Member

facchinm commented Dec 3, 2019

F() is useless in megaAVR architecture (it only increases flash occupation). Which code are you trying to compile?

@matthijskooijman
Copy link
Collaborator

F() is useless in megaAVR architecture (it only increases flash occupation). Which code are you trying to compile?

But for compatibility with existing sketches, all other cores have a working F() macro as well, even when they do not actually need.

@facchinm
Copy link
Member

facchinm commented Dec 3, 2019

Also this core has it, this is why I'm asking for the code 🙂

@woolseyj
Copy link
Author

woolseyj commented Dec 3, 2019

The code is the Adafruit_IO_Arduino library.

@facchinm
Copy link
Member

facchinm commented Dec 3, 2019

I just compiled one random example from that library and I didn't get any error. Could you provide more info?

@woolseyj
Copy link
Author

woolseyj commented Dec 3, 2019

Here is an example sketch that uses the library.

@facchinm
Copy link
Member

facchinm commented Dec 3, 2019

Got it. This patch makes the project compile again

diff --git a/cores/arduino/Arduino.h b/cores/arduino/Arduino.h
index 1f02a58..3891105 100644
--- a/cores/arduino/Arduino.h
+++ b/cores/arduino/Arduino.h
@@ -28,6 +28,7 @@
 
 #undef F
 #define F(str) (str)
+#define __FlashStringHelper char
 
 #ifdef __cplusplus
 extern "C"{

But I'm not 100% sure it is the right approach.
@MCUdude do you think it could work as is?

@woolseyj
Copy link
Author

woolseyj commented Dec 3, 2019

Confirmed.

@MCUdude
Copy link
Contributor

MCUdude commented Dec 3, 2019

@facchinm even though it would fix this exact issue, it's not a very elegant solution. Will this mess up other libraries that use __FlashStringHelper in some way?

@matthijskooijman
Copy link
Collaborator

I think it will, since libraries might overload a method for both char* and __FlashStringHelper*. It might be better to make this a separate type and heave F() cast to __FlashStringHelper* as well.

How does SAM and SAMD solve this? I suspect like above?

@woolseyj
Copy link
Author

woolseyj commented Dec 3, 2019

@facchinm, could you please explain a little further why the F() macro is useless for the megaAVR architecture? I did not understand what you meant by "only increases flash occupation". Both the Uno R3 and the WiFi R2 have flash memory; are they used differently?

@facchinm
Copy link
Member

facchinm commented Dec 3, 2019

Classic AVR architecture is referred as modified Harward, because data and memory are in different address spaces (so address 0 could refer to ram or flash, and there are different functions to access them).
New AVR (rebranded megaAVR) architecture is instead more conventional as flash and ram share the same address space (like ARM) so there's no need for special treatment if a variable is only in flash (adding const modifier is enough).

@woolseyj
Copy link
Author

woolseyj commented Dec 3, 2019

Great, thanks!

@facchinm
Copy link
Member

facchinm commented Dec 3, 2019

Anyway, __FlashStringHelper is clearly not intended to be used as-is but as an helper to print/compare strings in Flash with strings in RAM on a very peculiar architecture via the F() macro. Clearly it's too late to hide it but I have no better idea than defining it as const char or char.

@woolseyj
Copy link
Author

woolseyj commented Dec 3, 2019

Does that mean that the ATmega328P in the Uno R3 also does not need to use the F() macro?

@woolseyj
Copy link
Author

woolseyj commented Dec 3, 2019

Does this also mean that instead of using
Serial.println(F(“This is a very long string”));
we should now use

const String longString = “This is a very long string”;
Serial.println(longString);

or

const char longString[] = “This is a very long string”;
Serial.println(longString);

Sorry for all of the questions.

@matthijskooijman
Copy link
Collaborator

Does that mean that the ATmega328P in the Uno R3 also does not need to use the F() macro?

What makes you ask that? The 328P uses the "original" AVR architecture, which does need the F() macro (or related PSTR() or PROGMEM) to put things into flash without copying into RAM, which is different from the megaAVR arch this core is for.

Anyway, __FlashStringHelper is clearly not intended to be used as-is but as an helper to print/compare strings in Flash with strings in RAM on a very peculiar architecture via the F() macro. Clearly it's too late to hide it but I have no better idea than defining it as const char or char.

It would certainly have been better if there was a proper public type for this with a better name, but some libraries will need to use this wrapper type to properly handle both PROGMEM and regular strings. And, as you say, it's already used in the wild, so we need to keep it working if possible.

As for the proper way to do this, I think we can just implement F() as a simple cast (like the avr version without the PROGMEM attribute/PSTR() macro), and the pgm_read_* as no-ops (I suspect avr-libc might also do that? Or does this core use a different libc?).

Looking at SAMD, it does this by overrinding the PSTR() macro (along with a bunch of others) as a no-op:

https://github.com/arduino/ArduinoCore-samd/blob/2ca7f7531d4e4e54068bc584503b9e02e9033f3b/cores/arduino/avr/pgmspace.h#L34

and defining F() just like in AVR:

https://github.com/arduino/ArduinoCore-samd/blob/2ca7f7531d4e4e54068bc584503b9e02e9033f3b/cores/arduino/WString.h#L37-L38

I think the same approach would work for megaAVR. You should probably include all of avr/pgmspace.h from the SAMD core, so all pgmspace works transparently (unless avr-libc already defines all these as no-ops, in that case you only need to define the __FlashStringHelper class and F() macro as shown above, I think).

@matthijskooijman
Copy link
Collaborator

Does this also mean that instead of using
Serial.println(F(“This is a very long string”));
we should now use

const String longString = “This is a very long string”;
Serial.println(longString);

No, that has completely different behaviour (it allocates a dynamic string on the heap).

const char longString[] = “This is a very long string”;
Serial.println(longString);

This (or just Serial.println(“This is a very long string”);) is how you would do it on most architectures. On AVR you an also do this, but this produces additional RAM usage that can be prevented with F().

The point of this issue is to ensure that code that is written for AVR and uses F() can be compiled on other architectures as well, without changes (and then would end up effectively the same as Serial.println(“This is a very long string”);.

@woolseyj
Copy link
Author

woolseyj commented Dec 3, 2019

I asked that because both the ATmega328P (Uno R3) and the ATmega4809 (Uno WiFi R2) have mega in their name. I had assumed the 328P was Classic AVR and the 4809 was the new architecture, but I then realized both had mega in their name and thought I should clarify.

@matthijskooijman
Copy link
Collaborator

I asked that because both the ATmega328P (Uno R3) and the ATmega4809 (Uno WiFi R2) have mega in their name. I had assumed the 328P was Classic AVR and the 4809 was the new architecture,

That is correct.

but I then realized both had mega in their name and thought I should clarify.

And that is, indeed, confusing :-)

@bridystone
Copy link

Hi guys,

I'm running in the same issue as mentioned above.

@matthijskooijman write:

I think it will, since libraries might overload a method for both char* and __FlashStringHelper*. It might be better to make this a separate type and heave F() cast to __FlashStringHelper* as well.

I agree that using the explicit cast also fixes the issue:
if you could change:
#define F(str) (str)
to
#define F(str) (const __FlashStringHelper*)(str)

currently I need to either use the older 1.8.4 of megaAVR or fix it locally to get my project working.

@facchinm @MCUdude :
what do you think?

@PaulStoffregen
Copy link
Sponsor Contributor

SAMD, SAM, ESP32, STM32 define it this way:

#define F(string_literal) (reinterpret_cast<const __FlashStringHelper *>(PSTR(string_literal)))

Teensy uses essentially the same define, but with C style syntax:

#define F(string_literal) ((const __FlashStringHelper *)(string_literal))

ESP8266 uses this:

#define FPSTR(pstr_pointer) (reinterpret_cast<const __FlashStringHelper *>(pstr_pointer))
#define PSTR(s)            (__extension__({static const char __c[] __attribute__((__aligned__(4))) PROGMEM = (s); &__c[0];}))
#define F(string_literal) (FPSTR(PSTR(string_literal)))

@bridystone
Copy link

bridystone commented Feb 2, 2020

Short feedback here:

#define F(string_literal) ((const __FlashStringHelper *)(string_literal))

I've tried this approach. It didn't work!
it compiled fine - but the result was awful..

I use the MFRC522 library.
And the return in Serial was BAD... only lots of question marks

When adding

#define __FlashStringHelper char

it worked here at least...

@SpenceKonde
Copy link
Contributor

SpenceKonde commented Apr 21, 2020

The reason #define F(string_literal) ((const __FlashStringHelper *)(string_literal)) does not work is that those libraries see that and use pgm_read_byte() to read it instead of just reading the variable. But variables declared PROGMEM in megaavr-land have the address offset, so it's just a pointer to the location in flash, not the location in RAM where the flash is mapped to. and pgm_read_byte() (effectively) does the corresponding adjustment (actually, it uses lpm, like the classic AVRs, but if you think of it as making the corresponding adjustment, that's conceptually equivalent) - so we're taking a pointer that points to the location in the memory address space, but then using a function to read from it that expects an address in the flash...

I think #define __FlashStringHelper char is a much better solution here... I don't see any solutions that maintain perfect compatibility but don't kneecap the advantages that the megaavr architecture has... I mean, I guess you could just go back to making it put the strings into PROGMEM and returning a __FlashStringHelper - but then we're back to wasting flash - and I think pgm_read_byte() is probably significantly slower too....

To what extent are libraries that break with that fix around in the wild AND NO LONGER MAINTAINED? Because if they're still being actively maintained, this should be fixed by the library authors IMO - should be a dead simple fix to just check for a register that only exists on megaavr (how about VPORTA? That will work for all the megaavr parts, including megaAVR 0-series, tinyAVR, and the new DA-series parts) with a #ifdef and not include the problematic functions if it's there - would probably be trivial to just PR problematic libraries. Like, in the time people have been discussing it in this thread, we could have probably found and PR'ed all the libraries people are using that the naive compatibility layer breaks...

@matthijskooijman
Copy link
Collaborator

But variables declared PROGMEM in megaavr-land have the address offset, so it's just a pointer to the location in flash, not the location in RAM where the flash is mapped to.

Oh, so there are still two separate flash and data memory spaces on megaavr, it's just that flash is mapped into data with an offset. That's rather unfortunate, but I can imagine this has compatibility advantages at some level.

I think #define __FlashStringHelper char is a much better solution here...

As I said in #62 (comment), this would break for some libraries:

think it will, since libraries might overload a method for both char* and __FlashStringHelper*. It might be better to make this a separate type and heave F() cast to __FlashStringHelper* as well.

An example of this is the Print class, but there are likely also libraries in the wild that doe this.

So making __FlashStringHelper equal to char, will end up with duplicate overloads.

The SAM/SAMD solution is: keep __FlashStringHelper as a separate type, have F() cast to this type, and have pgm_load_* cast back to char, effectively making all of this an elaborate no-op of casts.

The SAM/SAMD solution will not work for megaavr, because megaavr (as @SpenceKonde pointed out) already defines pgm_load_* functions which have different semantics, so we cannot make those plain casts.

Another way to look at it, is that the "Arduino API" currently implies that:

  1. F() returns a __FlashStringHelper*
  2. Casting that to char*Passing that into pgm_load_* returns the original string.

This is not documented API, so we could maybe change this, but preserving some compatibility with this behaviour is important IMHO.

Given that pgm_load_* is fixed on megaavr, if we want to be compatible with the above API, then__FlashStringHelper* must actually be used for flash-addressed PROGMEM strings rather than RAM-addressed strings, which would suggest the entire thing needs to be kept.

a dead simple fix to just check for a register that only exists on megaavr (how about VPORTA?

That does not sound like a good approach, we should rather define a clear API (adding some define that describes the semantics of this core) rather than relying on an arch-specific exception.

However, writing this, I think there would be an alternative fix:

  • Let F() return a char* instead of a __FlashStringHelper*.
  • Define __FlashStringHelper as on the AVR core as a dummy class.
  • Handle __FlashStringHelper* as on AVR through pgm_load_* (e.g. in Print).

This means that:

  • Things like Serial.print(F("foo")); remains working and will be efficient, F() is essentially a no-op.
  • Nothing in the Arduino core returns a __FlashStringHelper*, but it still exists for libraries to not break.
  • Libraries/sketches that define an __FlashStringHelper* overload will still have it, it will just probably be never called.
  • Libraries/sketches that allocate a PROGMEM string manually and cast it to __FlashStringHelper will still work.

This does break point 1. above, since F() now no longer returns a __FlashStringHelper*. This will break code that does something like this:

const *__FlashStringHelper* somestr = F("foo");

The above does currently work on AVR/SAM/SAMD and is used in practice (I've been using it in a few projects). So maybe this approach is not quite perfect either...

One other approach, which I think is essentially what @SpenceKonde proposes, would be to:

  • Make __FlashStringHelper an alias of char (I would use a typedef rather than a define, to observer proper scoping rules, but the result would usually be the same).
  • Let F() just return char* (or __FlashStringHelper*, which is now the same).
  • Define something like ARDUINO_FLASHSTRINGHELPER_IS_CHAR to let libraries (and the Print and String classes) know about this situation (this replaces @SpenceKonde's VPORTA proposal).

This allows libraries and sketches to, omit overloads for __FlashStringHelper when ARDUINO_FLASHSTRINGHELPER_IS_CHAR is defined, to prevent overload conflicts.

Hm, but this would then break libraries that do:

const char foo_pgm[] PROGMEM = "foo";
const __FlashStringHelper* foo = (__FlashStringHelper*)foo;

Since that ends up with a pointer into flash, casted to normal char*, which breaks.

I have also used code like this in practice, since F() cannot be used in global scope, so a construct like this is needed there.

One final approach would be to:

  • Replace pgmspace.h with a version that makes PROGMEM and pgm_read_* a no-op.
  • Use the SAM/SAMD no-op version of F().

However, that would then break applications that actually want to read from hardcoded flash offsets (e.g. to read a bootloader version from flash) or want to overwrite PROGMEM variables with pgm_write_*...

Hm. I'm not so sure what the best way forward would be here. What a mess :-S

Maybe we should byte the bullet and implement some better alternative for __FlashStringHelper that is properly documented (maybe an extension for the String class to store different types of strings?). I have some ideas for this, but no time to work on them, I'm afraid...

@bxparks
Copy link
Contributor

bxparks commented Aug 14, 2020

Another problem with changing the return type of F() with #define F(x) (x): It breaks code which has functions that accept only const __FlashStringHelper* parameters, with no corresponding overloaded const char* versions, something like:

void someFunction(const __FlashStringHelper* msg);
// void someFunction(const char* msg); // this does not exist

I have at least 24 of those in one of my libraries.

I agree that redefining __FlashStringHelper to charis not a viable solution, because this would break libraries/code that do have overloaded functions.

It seems like we need a layer indirection, a set of str*_FS(..., const __FlashStringHelper*, ...) functions corresponding to each of the str*_P() functions. On the megaAVR, the F() can be defined without the PROGMEM, and the str*_FS() functions simply delegate to the normal str*() functions after casting to const char*.

On all architectures, if someone creates a PROGMEM manually, they must use the str*_P() functions, which I think is the convention that people follow. On the megaAVR, there is the edge case of someone creating a PROGMEM manually, then also manually casting it to const __FlashStringHelper*, e.g. through the FPSTR() macro. I cannot think of anyway to detect or disallow this edge case.

I agree with @matthijskooijman. This is a bit of a mess.

@SpenceKonde
Copy link
Contributor

Worth noting that with current implementation (I think everywhere) if they cast it to a __FlashStringHelper* and put it into PROGMEM explicitly, that will work (and it won't work if treated like a const char* - but it will if you offset it by 0x4000 (megaAVR 0-series) or 0x8000 (tinyAVR 0/1-series)

@matthijskooijman
Copy link
Collaborator

Re-reading the above, I think I can see two ways to fix this:

  1. Just implement PROGMEM normally, as on regular AVR. So F() returns a PROGMEM string, so with the address in the pgm_read_*-accessible area of flash, casted to __FlashStringHelper* and casting that back to char* and feeding it to pgm_load_* would work as expected. This is probably slightly less efficient (because it uses lpm instead of regular load instructions, but it's probably not that bad. Non-portable sketches that want maximum efficiency, can also omit F() entirely, then.
  2. Replace the avr-libc-supplied pgm_load_* functions with dummy functions, and replace the PROGMEM macro with something empty. This is essentially what SAMD and other non-AVR architectures also do, just define all parts of the PROGMEM-ecosystem (PROGMEM, PSTR(), F(), pgm_load_* as dummies, while still keeping the __FlashStringHelper* tyep). On MegaAVR this is slightly harder, since avr-libc already supplies a pgmspace.h with different behaviour, but user-supplied include dirs have more priority than system default include dirs, so it would be easy to hide avr-libc's pgmspace.h with a dummy one. The only code that I could see breaking, is code that uses lpm manually (i.e. inline assembly), which I expect is rare and can be easily fixed by checking for ARDUINO_ARCH_MEGAAVR or something like that.

I would tend to prefer option 2., since the resulting binary code is more efficient and clean, but there might still be corner cases. Option 1. is probably safer, since it just makes things work just like on regular AVR.

Any thoughts on these options? Any additional breaking cornercases I forgot?

@SpenceKonde
Copy link
Contributor

I like option 2 as well.... I'll give that a spin on megaTinyCore....

@matthijskooijman
Copy link
Collaborator

Cool!

Re option 1, the performance overhead might actually be even smaller than I thought. I was thinking lpm is three cycles and ld only 2, but that's only for loads for RAM. Apparently ld from the mapped flash is also three cycles (maybe even more, documentation is unclear). Using the mapped flash might still be faster since ld has more addressing modes and lds can be used when the address is known at compiletime, though.

As for the cycle time, the AVR Instruction Set Manual says:

Cycle time for data memory access assumes internal RAM access, and are not valid for access to NVM. A
minimum of one extra cycle must be added when accessing NVM. The additional time varies dependent on
the NVM module implementation. See the NVMCTRL section in the specific devices data sheet for more
information.

So that makes at least 3 cycles for ld from flash. However, the NVMCTRL section of the Atmega4808/9 datasheet does not seem to say anything about the number of cycles to read from mapped flash...

@matthijskooijman
Copy link
Collaborator

Giving this a bit more thought, and seeing the PRs that try to implement both approaches, I suspect that maybe option 1. is the better one after all. There is some overhead, but it should really work with all code out there (AFAICS) and does not complicate the include path even more, which is worth a few cycles IMHO.

@aentinger
Copy link
Contributor

aentinger commented Dec 11, 2020

I've been going through all those issues (#53, #62), PRs (#54, #82, #85, #87) and commits (a0f6beb) again and frankly, my head hurts 🤕. All in all it seems to me that by @facchinm 's commit a0f6beb we've implemented (restored actually) option 1 as described by @matthijskooijman here. @matthijskooijman can you confirm that I copy you correctly?

Also @giulcioffi's PRs are outdated a bit by #62 (comment) (this muddled the water when investigating this issue) - but no fault of her own, rather result of the long-standing PR we failed to integrate.

@matthijskooijman
Copy link
Collaborator

All in all it seems to me that by @facchinm 's commit a0f6beb we've implemented (restored actually) option 1 as described by @matthijskooijman here

Yes, I agree. Now, -API includes the main avr-libc pgmspace.h for PROGMEM and PSTR, and defines __FlashStringHelper and F() as on all platforms, and megaavr doesn't seem to change these, so I think now progmem stuff just works as it does on AVR, which I think is the indeed the safest approach. If we want to go for option 1, I think we're done and the issues and PRs can be closed.

There were some alternative PRs to compare the impact of this approach (but of those, I think only #85, implementing option 2, is relevant, since #87 has major compatibility problems, and #82 essentially just does what current master does with a bunch of duplicated code). So if we still want to consider option 2, we might want to look at the result of #85, but to really compare, that should be rebased on current master so it really compares option 1 with option 2 (it now compares the older, broken no-op F() version with option 2). Also, I think #85 (or rather, the https://github.com/giulcioffi/ArduinoCore-API/commits/FmacroIssue branch) needs some more work to really be ready for merging, but it looks ok for testing now.

@aentinger
Copy link
Contributor

Thank you very much for your feedback @matthijskooijman. Let's go with option #1 for now and schedule another look for this issue at a later point in time 🙏 .

@woolseyj
Copy link
Author

When is the next version expected to be released that will include this fix?

@aentinger
Copy link
Contributor

@facchinm What do you reckon?

@facchinm
Copy link
Member

facchinm commented Jan 8, 2021

@aentinger version 1.8.7 was released with the fix just before the end of the year 🙂

@aentinger
Copy link
Contributor

Totally true, I was deceived because it was released via cli git tag therefore not showing up in the release section of the repo main page 😊

Screenshot from 2021-01-08 09-58-25

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

9 participants