New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Major cleanup - c_whatever is finally history. #2838
Conversation
This was a royal pain to clean up, but I believe it's now all done. If Espressif had deigned to ship the corresponding libc headers this wouldn't have been needed in the first place, but oh well. Essentially, the libc.a provided by the SDK uses non-standard memory allocation, so anything that would use the regular malloc()/free() ends up looking for things that do not exist. Why they didn't provide wrappers for the whole lot is beyond me. Even worse is the libc.a they ship includes functions which rely on the missing alloc routines! Anyway, app/libc has now been cut down to the bare minimum overrides to shadow the corresponding functions in the SDK's libc. The old c_xyz.h headerfiles have been nuked in favour of the standard <xyz.h> headers, with a few exceptions over in sdk-overrides. Again, shipping a libc.a without headers is a terrible thing to do. We're still living on a prayer that libc was configured the same was as a default-configured xtensa gcc toolchain assumes it is. That part I cannot do anything about, unfortunately, but it's no worse than it has been before. The linker ordering in app/Makefile has been updated to ensure we link in our overrides first, without needing to rely on the crutch of prefixing everything with c_ (yuck!). From here on out it's now possible (and needed) to include the standard header files, and use the typical malloc()/calloc()/realloc()/free(), the strwhatever()s and memwhatever()s. These end up, through macro and linker magic, mapped to the appropriate SDK or ROM functions. I'm still impressed the NodeMCU founders managed to get things going, considering the state of the SDK...
Thanks Johny, This was on my TODO roadmap pretty much next anyway. I'll go through this lot and post review comments separately. One challenge that we have is that the git squash and merge process doesn't deal with parallel changes to the same files. We also need to think of the impact of the periodic It's going to be quite a journey to merge the two codebases and we will need to take some big steps. To be honest, I really think that we can't afford to dance around other minor contributions if we want to do this within the next six months. For example, we currently have 12 IMO, we need to agree which of the 12 other outstanding
Of course any contributors can continue to prepare and submit their PRs in parallel so long as they realise that they will need to rebaseline their work against the restructured directories before any final merge. The reason that I've added the other committers into this discussion as reviewers is that I will implement a timeout-on-response policy to maintain some form of minimum momentum. So all, please engage and speak up or accept that we are using our best endeavours to stabilise the project's future and that we can fix up any minor issues in consequence downstream. |
// the mbedtls platform.c from their SDK. Really, this should go | ||
// somewhere else. Note that the prototype here for vPortFree differs (!) | ||
// from the one in sdk-overrides.h. That's above my pay grade. | ||
// --nwf; 2018 Feb 18 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, past-me would be so happy to see this disappearing. :)
@@ -0,0 +1,3 @@ | |||
#include <stdlib.h> | |||
void *mbedtls_calloc_wrap(size_t n, size_t sz) { return calloc(n, sz); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anything preventing us from using calloc
and free
directly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the mbedtls lib expects those two to be actual functions with standard prototypes. On the non-OS SDK they are macros which expand to vPortXyz()
things, which makes it impossible for the mbedtls lib to do stuff like void (*my_free)(void *) = MBEDTLS_WHATEVER_MACRO_FREE_IS;
. Hence the need for wrapper.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both RTOS and the SDK use the non-standard pvPortXXXX
allocators; these are separate from the ROM-based mem_XXXX
allocators and include extra parameters to enable some crude allocator diagnostics.
I think that I've discussed this elsewhere, but if you look as the disassembly of these routines, they are largely wrappers around the ROM-based mem_XXXX
allocators for most execution paths.
In this case we would want to minimise unnecessary source changes to something like mdedtls
and the compiler at -O2
will inline this into a direct calloc call anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jmattsson: I see. If there are other places in the tree where we need "actual malloc
and free
functions", maybe move this to platform.c
? But keeping them where they are is fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I considered it, but I didn't want to make everyone else suffer an extra call level just because of mbedtls. Sure, it'd be a marginal expense compared to what the functions do, but when in 99% of the cases it can all be handled with a macro, I went with penalising mbedtls only. But yes, we could put actual malloc()/free()
functions into our app/libc/
if so wanted. We could also experiment and see how the SDK's libc would handle it if we provided the missing _malloc_r()/_free_r()
functions (obviously without the reentrancy support) as that might allow us to not have to shadow functions it already has, but that's a whole 'nother barrel o' fish.
Okay, I have no idea why github decided to request reviews from everyone - I only tagged Terry. And I also have no idea why Travis is unhappy with the linux build, seeing as my branch was cleared by Travis before I PR'd. It's past 2am here, fixing this will have to wait. |
app/platform/vfs.c
Outdated
@@ -420,14 +419,14 @@ void vfs_clearerr( const char *name ) | |||
|
|||
#ifdef BUILD_SPIFFS | |||
if (fs_fns = myspiffs_realm( normname, &outname, FALSE )) { | |||
fs_fns->clearerr( ); | |||
//fs_fns->clearerr( ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like it might be a functional change, in contrast to much of the rest of this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yikes! That one snuck in by mistake! Thanks for picking it up! Can you also enlighten me why the compiler throws a
vfs.c: In function 'vfs_clearerr':
vfs.c:422:13: error: expected identifier before '(' token
fs_fns->clearerr( );
^
make[2]: *** [../../Makefile:416: .output/eagle/debug/obj/vfs.o] Error 1
at me when it's not commented out?
Changing it to:
void (*wtf)(void) = fs_fns->clearerr;
wtf();
does not cause an error.
I'm too tired to see what's up with this...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, it's gonna be a macro thing, isn't it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Go to BED. I'll take a look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Workaround pushed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, I'm gonna take Terry's suggestion and hit the sack now. A more elegant workaround would certainly not go astray.
Looks good to me, some minor nit questions notwithstanding. Thanks for doing what surely must have been tedious work. |
@jmattsson, I've gone through the diff at a scan level and it looks pretty similar to anything similar that I would have done. I think that we need some sort of validation invariant as a check. What I mean here is that these changes should only be simplifying the sources and not changing the generated binaries, so perhaps we need to diff the On balance, I think that we should merge in this #2838 before #2836, which means that I am going to have to rebasline #2836 against dev post this merge -- grrrhhhh. Given the amount of work that has gone into this and that any other merges will mean that we need to rebaseline this, I am strongly tempted (if the valiation is OK) just to JFDI and merge this one in next, but leave #2836 which actually modifies NodeMCU behaviour until after the next master drop. @marcelstoer, where are you old chap? |
Just watching in awe from the sideline while the big boys are playing 😄
👍 but we should be doing it in a team discussion as doing it here would take focus away from the actual changes |
I think I agree with @TerryE that this PR should land. It's going to conflict with a lot of other work, but this one is a non-functional change that puts everything on better footing going forward. |
That would indeed be a good idea, I like it.
There will be a couple of differences I expect. Some around the mbedtls malloc/allocs, and some around the [vs]printf()s and also strdup. Those are the ones I can think of off the top of my head. If you throw a readable mapfile diff at me, I'll try to explain/justify the changes! :)
I'm perfectly happy for you to slurp this in on the side so it becomes part of one of your PRs Terry. Just because I raised the PR here doesn't mean it's the only way to handle it. And it may very well be easier to merge this on top of yours, since any conflicts at that point should just be the slight tedium of stripping some more c_ prefixes. The value (or time cost/investment) in this PR is really around working out the what and how of shadowing functions in the SDK's libc. Then again, treating it as a separate non-functional change also has merit.
I'm on board with whichever way you guys want to go here. It's no question that it's quite the disrupting sort of change. @nwf Thanks, yeah, this took longer that I had expected (like everything else I've done the last few weeks, grr). A few of the hours were spent trying to build with @marcelstoer Don't sell yourself short - all the work you're doing with our CI and the cloud builder is hugely valuable too! |
Johny, as I said in #2836, this is a multi-step journey. We shouldn't get too hung up on whether any one step is perfect; it just needs to be good and achieve material progress towards the end goal. We will be doing more steps later. In this case what we want for the modules is for them to compile to both targets successfully using the same source, so that we can unify the codebase downstream. IMO, we should just take the hit and merge this sooner than later. I'll do that mapfile comparison later today and some quick sanity checks with the built image. If all looks OK, then I'll do the merge. |
@@ -4,8 +4,4 @@ | |||
#include "c_types.h" | |||
#include "mem.h" | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent work! When I was testing build with newer xtensa toolchain those were a real pain. Just few things:
Maybe it would be a good idea to just remove this header file and it's occurrence in Makefile?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should certainly be possible to adjust the sjson module to not depend on the SDK specific things; at the time I went with minimum possible change just to get through the bulk of it.
@@ -31,7 +31,7 @@ | |||
#define SQLITE_OMIT_PROGRESS_CALLBACK 1 | |||
|
|||
#include <c_stdlib.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just looked and there is still that one c_
header - can it be swapped to stdlib.h
? AFAIK SQLite is turned off completely so it shouldn't matter anyway - just a nit pick.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
D'oh, I must've missed that one.
OK, the good news is that this branch compiles fine. The bad news is that the two mapfiles are very hard to compare: there have been subtle changes in some of the include search lists and for some reason make does not compile wildcard lists within a directory in a collated order so editing files can change their compile order and this then effects their link order and hence their order in the binary -- bizarre. Even so doing an objdump gives the following segment sizes:
So there are small differences. I will drill down and see what they are. |
This does my head in. The differences are typically the odd 4 bytes. When I check the linker output the "size before relaxing" are the same, but the actual sizes are different. The relaxation process occurs when the linker makes cross-module storage optimisations -- for example two adjacent modules share the same constants or strings and they are PC-relative accessible to the second, then the copy constant can be dumped. On a few random samples, doing an use strict;
my $last = '';
while (<>) {
$_ = $last . $_ if ($last && / {16}0x[0-9af]{16} /);
$last = '';
$last = $1 if /^( \S+)$/;
my ($sect, $start, $len, $file) = / (\S+)\s+(\S+)\s+(\S+)\s+(\S+)/;
next unless $start =~ /^0x[0-9a-f]{16}$/;
next if $start eq '0x0000000000000000';
$start = '0xXXXXXXXX';
next if $sect =~ /\.(xtensa|comment|debug)/;
$file =~ s!^.*?/nodemcu-firmware/sdk/esp_iot_sdk_v3.0-e4434aa/lib!SDKLIB!;
$file =~ s!^.*?/xtensa-lx106-elf/lib!XTENSALIB!;
next if $sect eq '*fill*';
print "$sect $start $len $file\n";
} Anyway, I propose to merge this tomorrow morning (GMT) unless any of the committers raises an objection. |
I am delaying this another day to give the committers time to complete their team discussion. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Already given feedback inline
While browsing through the stale issues, I did some experiments with
This is probably out of scope for this PR since |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good and feels right :)
Some changes will conflict with #2836 (e.g. vfs.h
).
@@ -4,7 +4,7 @@ | |||
#include "platform.h" | |||
#include "user_interface.h" | |||
#include "c_types.h" | |||
#include "c_string.h" | |||
#include <string.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like this is a redundant include anyway. There are more instances of this type around.
I assume that the pattern based change doesn't detect that aspect - should we use this PR to clean up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been trying to find a smart way of weeding out redundant includes, but the best that I've come up with (including are searching SO etc.) is a tight interactive try deleting one and re-make. If the compile is clean it stays deleted; if it errors then restore it.
As I mention elsewhere we don't do ourselves any favours because the platform abstraction layer doesn't do it's job, IMO. Your modules still end up including the SDK headers which rather defeats the purpose of abstraction. Still one step at a time in the right direction is a good thing, I think.
Welcome to the hell I had while getting this PR done.
Yes, what you're seeing here is the broken libc in SDKs. I think what Espressif did was they built their newlib libc.a as one normally would, with its regular Therefore, we have to "shadow" any and all libc functions that are thus broken with our own copies so the linker doesn't find the broken version in the SDK's libc. And we're currently then missing As I mentioned somewhere above, it would be really interesting to try with writing wrappers for |
I just had a chance to test it and I missed at least one thing - new I2C implementation that uses |
@jmattsson: Any chance we could make our lives a little easier -- at least in terms of the error reports we get -- by ripping out anything that won't link from |
Actually two, but thanks v much @galjonsfigur. See #2845 fixes this |
The PR removed the bulk of non-newlib headers from the NodeMCU source base. app/libc has now been cut down to the bare minimum overrides to shadow the corresponding functions in the SDK's libc. The old c_xyz.h headerfiles have been nuked in favour of the standard <xyz.h> headers, with a few exceptions over in sdk-overrides. Again, shipping a libc.a without headers is a terrible thing to do. We're still living on a prayer that libc was configured the same was as a default-configured xtensa gcc toolchain assumes it is. That part I cannot do anything about, unfortunately, but it's no worse than it has been before. This enables our source files to compile successfully using the standard header files, and use the typical malloc()/calloc()/realloc()/free(), the strwhatever()s and memwhatever()s. These end up, through macro and linker magic, mapped to the appropriate SDK or ROM functions.
This was a royal pain to clean up, but I believe it's now all done.
If Espressif had deigned to ship the corresponding libc headers this
wouldn't have been needed in the first place, but oh well.
Essentially, the libc.a provided by the SDK uses non-standard memory
allocation, so anything that would use the regular malloc()/free()
ends up looking for things that do not exist. Why they didn't provide
wrappers for the whole lot is beyond me. Even worse is the libc.a they
ship includes functions which rely on the missing alloc routines! So yeah,
missing headers and missing functions. Good job, Espressif. I can see why
you don't want to maintain the non-OS SDK...
Anyway, app/libc/ has now been cut down to the bare minimum overrides to
shadow the corresponding (broken) functions in the SDK's libc. The old c_xyz.h
header files have been nuked in favour of the standard <xyz.h> headers,
with a few exceptions over in sdk-overrides. Again, shipping a libc.a
without headers is a terrible thing to do. We're still living on a
prayer that libc was configured the same was as a default-configured
xtensa gcc toolchain assumes it is. That part I cannot do anything about,
unfortunately, but it's no worse than it has been before.
The linker ordering in app/Makefile has been updated to ensure we
link in our overrides first, without needing to rely on the crutch
of prefixing everything with c_ (yuck!).
From here on out it's now possible (and needed) to include the standard
header files, and use the typical malloc()/calloc()/realloc()/free(),
the strwhatever()s and memwhatever()s. These end up, through macro
and linker magic, mapped to the appropriate SDK or ROM functions.
I'm still impressed the NodeMCU founders managed to get things going,
considering the state of the SDK...
Terry, this one is for you, to help bridge the gap between the 8266 and the 32 branches. It should cut down the number of edits required between the two branches by a lot. I've promised you this long enough that I finally felt bad enough to sit down and slog through it tonight. I don't know if it makes more sense to merge this to dev, or for you to merge into the next round of big changes in your repo and get it in that way. Have a chat with @marcelstoer about what makes sense. And of course feel free to tweak if needed.
I have built all the modules successfully, but I obviously have not been able to test them all. Most of the changes are plain replacements that should be hard to have screwed up.
dev
branch rather than formaster
.docs/*
.