RTOS-SDK, ESP32 and the way forward #1319

Closed
jmattsson opened this Issue May 27, 2016 · 62 comments

@jmattsson
Collaborator
jmattsson commented May 27, 2016 edited

Edit: the below progress update refers to the dev-rtos branch which targeted the RTOS-SDK and the ESP31B. With the final release of the ESP32, Espressif abandoned the RTOS SDK in favour of their new IoT Development Framework (IDF). While the IDF is vastly superior to the previous SDKs, it does set our porting effort back a fair bit. Progress updates on the IDF/ESP32 port of NodeMCU can be found further down in this discussion.

With the ESP32 release coming up in a few months' time, it's time to seriously start thinking about the way forward. I think it's a given that we'd all like to see NodeMCU run on the ESP32 as well. However, the ESP32 only has the RTOS SDK, which means we really need to consider how to get ourselves switched from the non-OS SDK over to RTOS.

Since $work is rather interested in shifting some of our products over to the ESP32 I've had a bit of time to investigate the effort that will be required in terms of NodeMCU. I've been "spiking" over on the DiUS dev-rtos branch to see what I can get going. Here's the overview so far:

  • Make NodeMCU compile with RTOS-SDK headers rather than non-OS SDK headers. There's a bunch of glue in the sdk-overrides/ directory which would need cleaning up, but overall this step wasn't too bad - the SDK functions are largely the same.
  • Make NodeMCU link with RTOS SDK. This took a bunch of changes, and a couple of functions needed to be stubbed for now.
  • Reimplement NodeMCU task interface on top of RTOS tasks. Thanks to Terry's earlier work, this seems to be relatively straightforward. Once it gets run-tested some issues may surface though (cue Jaws music...)
  • Complete reimplementation of our exception handler to allow constants in flash. Turns out the RTOS-SDK doesn't use any of the ROM functions for hooking exceptions (probably performance reasons), and the documented method of installing user hooks simply does not exist. Took a fair chunk of work to find a good way to hi-jack the UserExceptionVector, but on the upside it's now also a whole lot faster than the previous one.
  • Remove NodeMCU's partial libc implementation. This was conflicting with the SDK's libc and causing complete hangs. On the upside, there now is a real libc available. Almost all the various c_ prefixed functions (and a bunch of os_ prefixed ones) have been consolidated back to standard C library names.
  • Fixed SPI flash reading functions. The RTOS-SDK changed the flashchip variable from a pointer to a struct, so our use of it bombed completely...
  • Understand why printf() now doesn't work, but ets_printf() does. printf now working, without bounce buffers.
  • Get to the Lua prompt being printed. So far, so good.
  • Make UART driver RTOS compatible. The UART driver wasn't to blame, my buggy task.c implementation was. Whoopsie.
  • Get to a (mostly) working Lua prompt. This would be a major milestone, and hopefully be the starting point for others to join in the effort.
  • Deal with timer callbacks executing from a different task context. Considering that pretty much all NodeMCU code is written expecting run-to-completion semantics, switching to a preempting OS framework has huge potential for random lockups and crashes. My current approach is to dedicate a single RTOS task to running all of NodeMCU in, which should hopefully mitigate most issues. To make this happen I think I'll probably need to wrap the timer API to have the actual callbacks posted back to the NodeMCU task for execution. Having done the transition for the tmr module, the whole thing appears to be easier to just change in each place where needed than attempt to wrap everything. Besides, having high-priority timer callbacks might be useful for some drivers.
  • Understand which tasks the SDK callbacks execute from, and develop a strategy for dealing with that. Similar to the timers, but a bit more challenging, possibly. Our earlier work in not allowing Lua callbacks to run from SDK callbacks should help here and limit the amount of rework needed. I hope. As expected, callbacks seem to be called from various RTOS tasks directly, such as the rtT high priority timer task and the tiT TCP/IP task, not to mention the uiT task which is what user_init() runs in. It will be up to everyone who is taking an SDK callback to either deal with it fully within that callback without referencing data used by other tasks, or copy the necessary information from said callback and relay it back to the main nodemcu RTOS task. Appropriate locking must be done though. I've updated the sntp module as an exercise, and while it grew a little bit it was pretty straightforward. [cue everyone pointing out things now wrong with it...]
  • Make output redirection work. Needs a putc handler installed which can queue characters across into the nodemcu task.
  • Make silencing of SDK output work. Espressif didn't provide a system_set_os_print() function in the RTOS-SDK and wants you to install a putc handler to suppress everything instead (which is useless, since then we'd have to put a mutex around each printf call if we want system_set_os_print()-like functionality). I fixed this by placing all the SDK functions first in irom, and then in the wrapped printf() checking the return address - if it's in the SDK part of irom and we've flagged off SDK prints, then we suppress it. Rather sneaky, but works well and with almost no cost.
  • Revisit printf override. The internal print() function takes peculiar arguments, but we now match those to the letter I believe.
  • Find out what's using so much stack space. Currently I'm running the NodeMCU task with an unsupported stack size just to prevent everything from falling over due to the stack being smashed. If anyone has any stack analysis tools that could work for the ESP platform, I'd love to hear about them, since -fstack-usage is not available.
  • ??? (no profit guaranteed)
  • Reimplement the net module (and others) on top of lwIP API, since espconn is only partially supported on the ESP8266 RTOS, and not at all on the ESP32 RTOS. Probably look at including mbedTLS for TLS support.
  • Fix whatever other issues and races we encounter. There will be races, I'm sure. There will also be regressions.
  • Look at upgrading various components and drivers to take better advantage of the RTOS aspects.
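The return-address check used for silencing SDK output could look roughly like the sketch below. This is only an illustration, not the code on the branch: the address range constants and the names `set_sdk_print()`, `should_suppress()` and `wrapped_printf()` are made up here, and a real build would presumably take the SDK's irom bounds from linker symbols. `__builtin_return_address()` is a GCC/Clang builtin.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical bounds of the SDK code placed first in irom.
 * A real build would use linker-provided symbols instead. */
#define SDK_IROM_START ((uintptr_t)0x40210000u)
#define SDK_IROM_END   ((uintptr_t)0x40240000u)

static bool sdk_prints_enabled = true;

/* Hypothetical stand-in for the missing system_set_os_print(). */
void set_sdk_print(bool on) { sdk_prints_enabled = on; }

/* True when the caller lives in the SDK range and SDK prints are off. */
bool should_suppress(uintptr_t return_addr)
{
  bool from_sdk = return_addr >= SDK_IROM_START && return_addr < SDK_IROM_END;
  return from_sdk && !sdk_prints_enabled;
}

/* Wrapped printf: drop the output if the call came from SDK code. */
__attribute__((noinline))
int wrapped_printf(const char *fmt, ...)
{
  uintptr_t caller = (uintptr_t)__builtin_return_address(0);
  if (should_suppress(caller))
    return 0;

  va_list ap;
  va_start(ap, fmt);
  int n = vprintf(fmt, ap);
  va_end(ap);
  return n;
}
```

The per-call cost is one range comparison and one flag test, which fits the "almost no cost" observation in the list.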

If we can get our current NodeMCU to run stable on the ESP8266 with the RTOS SDK, it should be quite easy to get ESP32 support in I believe. If/when I get my hands on ESP32 hardware, I'll have an even better idea.

Oh, and the dev-rtos branch is subject to force-pushing and other unpleasant things, and it is most assuredly not ready for public consumption, but if you want to track my progress you'll see it there.

@TerryE
Collaborator
TerryE commented May 27, 2016

@jmattsson I will up the priority on the net over LwIP. I really wanted to go that way anyway, but this was the push that I needed.

About to hop on a Ferry and travelling for the rest of the day, but will post back at the weekend. 😄

@devsaurus
Collaborator

Took a fair chunk of work to find a good way to hi-jack the UserExceptionVector, but on the upside it's now also a whole lot faster than the previous one.

Is this improvement also applicable to our current firmware?

@jmattsson
Collaborator

It could probably be moved over to the regular SDK with some care, but I haven't looked at that.

@jmattsson
Collaborator

Okay, I'm at a Lua prompt now!

NodeMCU 1.5.1 build unspecified powered by Lua 5.1.4 on RTOS-SDK 1.4.0(c599790)
lua: cannot open init.lua
> 
=node.heap()
39240

The printf doesn't seem to support tabs(?!), because everything seems to end up on its own line, but that's for later... I seem to be running into the auto-bauder leaving things at 74880 until I bang characters at it, even when I've initialized the baudrate to 115200. I don't know what's up with that; @pjsg maybe you have some idea?

Is this the point where I suggest we get this branch into the official repo and let everyone loose on it to try to bang it into shape? After we make sure it's hidden in the cloud builder, of course @marcelstoer .

I'm now eagerly looking forward to getting my hands on the ESP32 dev board from @nodemcu! :)

@marcelstoer
Collaborator

Hurray Johny, sounds exciting!

Is this the point where I suggest we get this branch into the official repo and let everyone loose on it to try to bang it into shape?

Definitely! However, I suggest keeping track of the challenges you encounter in a separate issue on GitHub. Otherwise, this one will soon become confusing and hard to follow. It would help if we could define new labels ourselves @nodemcu so we could create RTOS SDK, non-OS SDK or ESP32 to distinguish between the issues.

After we make sure it's hidden in the cloud builder, of course @marcelstoer .

I used to pull active branches from the GitHub API (really cool API) and then blacklisted some. However, a few months ago I switched to statically defining master and dev only. So, no more maintenance overhead and no more fear of things potentially falling over with every new branch.

@pjsg
Collaborator
pjsg commented May 31, 2016

On 31/05/2016 00:12, Johny Mattsson wrote:

The printf doesn't seem to support tabs(?!), because everything seems to end up on its own line, but that's for later... I seem to be running into the auto-bauder leaving things at 74880 until I bang characters at it, even when I've initialized the baudrate to 115200. I don't know what's up with that; @pjsg maybe you have some idea?

That sounds very odd. I'll take a look if you tell me your branch name.
The autobaud code does not (AFAIK) touch the uart configuration until it
detects the baud rate in use....

@devsaurus
Collaborator

Almost all the various c_ prefixed functions have been consolidated back to standard C library names.

Changing c_puts() to puts() brought along two regressions.
Functionality-wise, output redirection is not supported any more with node.output(). c_puts() routed the string through output_redirect() in node.c which triggers an optional callback.
On the formatting side it generates additional newlines - libc's puts() unconditionally appends a newline char to the string which c_puts() / output_redirect() didn't do.

This is currently a show stopper for my dev environment with ESPlorer (and potentially other similar tools).
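For reference, the newline difference is plain standard C behaviour: `puts()` always appends a `'\n'`, while `fputs()` writes the string as given. A minimal sketch of a newline-free replacement follows; the name `write_str()` is made up for illustration, and this does nothing for the redirection regression.

```c
#include <stdio.h>

/* Write s to out exactly as given; unlike puts(), no '\n' is appended.
 * Returns a non-negative value on success and EOF on error, as fputs does. */
int write_str(FILE *out, const char *s)
{
  return fputs(s, out);
}
```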

@jmattsson
Collaborator
jmattsson commented Jun 1, 2016 edited

Thanks for diagnosing that!

puts() adding a newline is standards compliant, so if we're assuming it doesn't then that's our bug(s). Looking at the output_redirect() I see that it never captured characters printed via putc(), so that redirect was never complete in the first place :(

The old-style redirect we were using won't be possible with the pre-empting RTOS, as we'd at "best" end up running Lua code reentrantly from whichever RTOS tasks are printing. I suspect what we'll need is to install a proper handler via os_install_putc() and queue characters over into the Lua task. Shouldn't be too bad to do - we need to install one of those anyway to be able to implement the system_set_os_print() function.
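A rough sketch of the "queue characters over into the Lua task" idea, assuming a simple ring buffer. The names `queue_putc()` and `drain_output()` are hypothetical, the handler signature expected by the SDK is not assumed here, and there is deliberately no locking: on a preempting RTOS this would need an RTOS queue or a critical section around the index updates.

```c
#include <stdbool.h>
#include <stddef.h>

#define OUT_QUEUE_SIZE 64  /* arbitrary size for this sketch */

static char out_queue[OUT_QUEUE_SIZE];
static size_t out_head;  /* next slot to write (producer side) */
static size_t out_tail;  /* next slot to read (consumer side) */

/* Candidate putc handler: called from whichever task is printing.
 * It only enqueues; it never runs Lua code itself. */
bool queue_putc(char c)
{
  size_t next = (out_head + 1) % OUT_QUEUE_SIZE;
  if (next == out_tail)
    return false;  /* queue full: the character is dropped */
  out_queue[out_head] = c;
  out_head = next;
  return true;
}

/* Called from the Lua task: copy out up to max queued characters. */
size_t drain_output(char *dst, size_t max)
{
  size_t n = 0;
  while (n < max && out_tail != out_head) {
    dst[n++] = out_queue[out_tail];
    out_tail = (out_tail + 1) % OUT_QUEUE_SIZE;
  }
  return n;
}
```

The Lua task would then hand the drained characters to the redirect callback, so Lua code only ever runs in its own task context.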

@jmattsson
Collaborator

@marcelstoer Thanks, I hadn't realised you switched from black-list to white-list!

I've pushed a dev-rtos branch into the main repo now, and will continue work against that. I consider this a free-for-all branch for the time being - there is so much to test and fix and polish that I'd prefer to risk stepping on toes (and having toes stepped on) than using the PR mutex. I'm contemplating starting a wiki page to document what I learn as I go - would that be sufficiently useful to be worth the effort?

@devsaurus I believe I've taken care of the newline madness now. I've yet to address the output redirect.

@devsaurus
Collaborator
devsaurus commented Jun 1, 2016 edited

Thanks a bunch, Johny! That'll allow me to have some test drive with the new rtos flavour. Output redirection is not that important atm, as long as there's serial comm 😃

As a side note: I found that you need to manually download and extract ESP8266_RTOS_SDK_v1.4.0_16_02_28 (presumably) to rtos-sdk. Also update init data from rtos-sdk/bin/esp_init_data_default.bin - just to be sure.

@jmattsson
Collaborator

The RTOS SDK is a git submodule so a simple git submodule update --init should do the trick.

@devsaurus
Collaborator
devsaurus commented Jun 1, 2016 edited

node.compile() crashes 100% on an integer build with u8g enabled on top of defaults (doesn't happen for float):

> node.compile("telnet.lua")
Fatal exception (28): 
epc1=0x401008a2
epc2=0x00000000
epc3=0x40105adb
epcvaddr=0x4024e717
depc=0x00000000
rtn_add=0x4010089b
<repeating dump>

Also dofile() is broken.
I even got a meaningful error for a small script saying:

"�M�?�"(stack_size = 0,task handle = 3fff3bf0) overflow the heap_size.
"�M�?�"(stack_size = 0,task handle = 3fff3bf0) overflow the heap_size.
 ets Jan  8 2013,rst cause:4, boot mode:(3,6)

wdt reset
load 0x40100000, len 26856, room 16 
tail 8
chksum 0xfe
load 0x3ffe8000, len 2392, room 0 
tail 8
chksum 0xd4
load 0x3ffe8958, len 8, room 0 
tail 8
chksum 0x5c
csum 0x5c

@jmattsson
Collaborator
jmattsson commented Jun 2, 2016 edited

Both of those look like they're due to stack overflow (again). This seems like it's going to be our biggest pain point in this transition - we have stuff that uses way more than the 2k stack RTOS wants to let us use. If we say each function call uses ~100 bytes of stack on average, the 2k stack limit should still allow us to go roughly 20 functions deep. I suspect something is placing overly large arrays or objects on the stack, in a frequently used code path. Help tracking this down would be much appreciated.

Edit: you can increase the nodemcu RTOS task stack size in user_main.c, but that obviously directly impacts free heap, and I don't yet know why the SDK docs state that 2k is the upper limit since we're already running past that.

@jmattsson
Collaborator

@devsaurus Output redirection should now work again.

@jmattsson
Collaborator

I increased the nodemcu task stack size to 6k, see if that helps for now?

@TerryE
Collaborator
TerryE commented Jun 2, 2016

I've been pondering this and the issue that we have with Lua is that the execution engine is intrinsically single threaded. Yes, coroutining is supported but this is cooperative and non-preemptive. Yes, on a real OS you can have multiple Lua environments running but these can't interact except through OS mechanisms, and you just don't want to go there on an ESP-class processor.

All of this wasn't an issue with the non-OS SDK since this was also non-preemptive (at least in terms of non-ISR code), but as Johny has pointed out callbacks in RTOS are (or can be) invoked asynchronously in a separate C stack space, and there is a legion of bear traps here.

I think that we should be thinking about extending our model for ISRs and adopt an asymmetric Lua-land / other-land approach. We can't use a symmetric mutex approach because of the Lua process's heavy stack use; we can't (at least on the ESP8266) allow multiple tasks to demand a large stack. I feel that we should think in terms of a 1+N structure where the Lua task is "special" and that all callback tasks do a task post which queues a request to the Lua task when they need to xfer control into Lua-land.

Anyway just musings whilst I build my house. 😃
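The 1+N structure described above might be sketched as follows: callback tasks only post a (handler, parameter) request, and the single Lua task is the only place where handlers run. All names here (`lua_task_post()`, `lua_task_pump()`, `on_timer_fired()`) are hypothetical and not the existing NodeMCU task API, and the queue has no locking; a real port would need an RTOS queue or equivalent between the tasks.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*lua_task_fn)(uint32_t param);

typedef struct {
  lua_task_fn fn;
  uint32_t    param;
} lua_task_msg;

#define MSG_QUEUE_LEN 8  /* arbitrary for this sketch */

static lua_task_msg msg_queue[MSG_QUEUE_LEN];
static size_t msg_head, msg_tail;

/* Called from any callback task (the "N"): record the request and return. */
bool lua_task_post(lua_task_fn fn, uint32_t param)
{
  size_t next = (msg_head + 1) % MSG_QUEUE_LEN;
  if (next == msg_tail)
    return false;  /* queue full */
  msg_queue[msg_head].fn = fn;
  msg_queue[msg_head].param = param;
  msg_head = next;
  return true;
}

/* Called only from the Lua task (the "1"): run queued requests in order. */
size_t lua_task_pump(void)
{
  size_t handled = 0;
  while (msg_tail != msg_head) {
    lua_task_msg m = msg_queue[msg_tail];
    msg_tail = (msg_tail + 1) % MSG_QUEUE_LEN;
    m.fn(m.param);
    handled++;
  }
  return handled;
}

/* Example handler: stands in for code that would call into Lua. */
static uint32_t last_seen;
void on_timer_fired(uint32_t param) { last_seen = param; }
uint32_t last_param_seen(void) { return last_seen; }
```

Because only the pump ever invokes handlers, only the Lua task needs the large stack; the posting tasks just copy a pointer and a word.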

@devsaurus
Collaborator

I increased the nodemcu task stack size to 6k, see if that helps for now?

Of course, node.compile() works ok now and I didn't hit any other issues so far.

While compiling the fw up and down I got the impression that the Makefile processing is a bit odd. The rtos-sdk/third_party/lwip tree is traversed for each make invocation of a SUBDIR. That's a cosmetic issue in the first place, but also cleaning rtos-sdk/third_party/lwip doesn't work. It caused me some headaches when upgrading my local tool chain.

@jmattsson
Collaborator
jmattsson commented Jun 3, 2016 edited

@devsaurus Thanks, I've cleaned up the Makefile now. The clean & clobber targets have also been hooked up to the lwip in rtos-sdk. I appreciate the testing! Scratch that - we shouldn't even be rebuilding the SDK lwIP, that was just a left-over from when I was initially getting things to link.

@TerryE I agree 100%. Thinking of callbacks as being executed in interrupt context is the best (if not only) workable approach. I was briefly entertaining the notion of implementing the lua_lock() function, but considering the stack usage of the Lua VM it is not feasible to execute callbacks from other RTOS tasks, even if we got the VM locked properly (which I suspect would likely have caused some tasks to be blocked for too long while waiting for the VM to become available).

@jmattsson
Collaborator

I've started a wiki page to track overall progress. Feel free to expand it.

@devsaurus
Collaborator

I've cleaned up the Makefile now.

Thanks for that - I didn't get the intention with lwip in the first place, but it's definitely calmer now. Though I do want to clean lwip for the time being in order to recompile and check stack usage there as well 😉

In this respect, please find a ((very) clunky) approach to obtain info from -fstack-usage over at my esp-open-sdk fork. It generates the desired *.su files and they don't look too silly, but I haven't cross checked the results yet with the generated code. It's a start and might need further tweaking - hope it's useful in the end.

@jmattsson
Collaborator

Ooooh! Building new toolchain now!

@jmattsson
Collaborator
jmattsson commented Jun 3, 2016 edited

@devsaurus Not looking quite right yet:

(gdb) disass nodemcu_main 
Dump of assembler code for function nodemcu_main:
   0x40253d38 <+0>:     addi    a1, a1, -16
   0x40253d3b <+3>:     s32i.n  a0, a1, 12
   0x40253d3d <+5>:     call0   0x40253cc0 <nodemcu_init>
   0x40253d40 <+8>:     call0   0x40254fbc <task_pump_messages>
End of assembler dump.

whereas the .su says:

user_main.c:123:13:nodemcu_main 8   static

The stack pointer must always be 16-byte aligned, btw (ref Xtensa ISA reference, p587), and since the return address almost always needs to be stored I'd expect every function to at least have 16bytes of stack usage.
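As a quick sanity check on reported numbers, the alignment rule can be applied to a `.su` value like this. It is a sketch only: `estimate_frame_size()` is a made-up helper, and the 16-byte floor encodes the expectation stated above rather than anything the compiler guarantees.

```c
#include <stddef.h>

#define XTENSA_STACK_ALIGN 16u

/* Round a reported frame size up to the next multiple of 16 bytes,
 * with a floor of one 16-byte slot (assumed minimum, see text). */
size_t estimate_frame_size(size_t reported)
{
  size_t mask = (size_t)XTENSA_STACK_ALIGN - 1;
  size_t aligned = (reported + mask) & ~mask;
  return aligned < XTENSA_STACK_ALIGN ? XTENSA_STACK_ALIGN : aligned;
}
```

For the `nodemcu_main` case, a reported 8 bytes rounds to 16, which matches the `addi a1, a1, -16` in the disassembly.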

Edit: rounding up to the nearest 16byte boundary makes it look pretty good though, so this should be fairly representative:

$ find app|grep .su$|xargs cat | sort -rnk2 |head -n 40
coap_server.c:7:8:coap_server_respond   476     static
lparser.c:411:8:luaY_parser     376     static
lparser.c:611:13:body   312     static
wifi_common.c:8:6:wifi_add_sprintf_field        304     static
encoder.c:37:15:fromBase64      296     static
enduser_setup.c:1002:14:enduser_setup_http_recvcb       280     dynamic
coap_client.c:10:6:coap_client_response_handler 240     static
wifi.c:1168:12:wifi_ap_dhcp_config      192     static
httpclient.c:441:24:http_request        184     static
wifi.c:607:12:wifi_station_config       152     static
ltable.c:527:16:newkey  152     static
lstrlib.c:492:12:str_find_aux   152     static
enduser_setup.c:928:13:on_scan_done     152     static
enduser_setup.c:1459:12:enduser_setup_start     152     static
wifi.c:548:12:wifi_station_getconfig    148     static
lstrlib.c:545:12:gmatch_aux     136     static
ldblib.c:334:12:db_errorfb      136     static
coap.c:314:12:coap_request      136     static
crypto.c:23:12:crypto_sha1      128     static
wifi.c:86:13:wifi_scan_done     120     static
mdns.c:27:12:mdns_register      120     dynamic
loadlib.c:564:12:ll_module      120     static
ldo.c:184:6:luaD_callhook       120     static
ldebug.c:734:6:luaG_runerror    120     static
wifi.c:1005:12:wifi_ap_getconfig        116     static
lbaselib.c:548:12:costatus      116     static
lauxlib.c:84:17:luaL_where      116     static
lbaselib.c:125:13:getfunc       112     static
wifi.c:1018:12:wifi_ap_config   108     static
node.c:593:12:node_stripdebug   108     static
lauxlib.c:54:16:luaL_argerror   108     static
spiffs_nucleus.c:891:7:spiffs_object_append     104     static
spiffs_nucleus.c:688:7:spiffs_object_create     104     static
spiffs_nucleus.c:1132:7:spiffs_object_modify    104     static
spiffs_gc.c:376:7:spiffs_gc_clean       104     static
spiffs_gc.c:234:7:spiffs_gc_find_candidate      104     static
spiffs_check.c:830:7:spiffs_page_consistency_check      104     static
lparser.c:1362:13:chunk 104     static
encoder.c:14:15:toBase64        104     static
wifi.c:377:12:wifi_getmac       100     static

Further edit: And now I see that we're missing the -fcallgraph-info option as well. That's unfortunate, the merger of callgraph and stack usage is what would give us the proper targeting information for reducing stack usage.

@devsaurus
Collaborator

I'd expect every function to at least have 16bytes of stack usage.

That's very valuable input for a sanity check, thanks. I'll investigate later today to find out how to report gross stack usage.

@TerryE
Collaborator
TerryE commented Jun 3, 2016 edited

@jmattsson Johny, apart from the performance and code size implications, stay well away from lua_lock() for other reasons:

  • The eLua mods were all tested assuming a threadless implementation (and especially the LTR patch), that is this lua_lock() function being nulled out.
  • Ditto all our nodemcu libraries.
  • Lastly when I started my compact debug testing on a standard PC Lua build (which defaults to locks enabled), I found one case in the compiler where the locked code was nested, so the outer locked function bracketed the inner call with an unlock/lock pair to avoid a deadlock; this breaks the atomicity rules by creating two (small) unlocked regions within a nominally locked region.

@devsaurus
Collaborator

Looks better now:

user_main.c:79:6:nodemcu_init   16      static

Although it's still more guesswork than code-fu 😊

Seems that the 16 bytes penalty is still not considered, but the change fixed a severe miscalculation of the net stack use. The leaderboard changed significantly...

You will now also get *.ci files via -fcallgraph-info 😉

coap.c:232:13:coap_response_handler     1424    static
lstrlib.c:756:12:str_format     1264    static
lstrlib.c:642:12:str_gsub       1216    static
coap.c:33:13:coap_received      1184    static
struct.c:212:12:b_pack  1120    static
mqtt.c:251:13:mqtt_socket_received      1120    static
mqtt.c:1427:12:mqtt_socket_publish      1120    static
spi.c:68:12:spi_send_recv       1104    static
spi.c:175:12:spi_recv   1088    static
node.c:336:6:output_redirect    1088    static
mqtt.c:1312:12:mqtt_socket_subscribe    1088    static
mqtt.c:1200:12:mqtt_socket_unsubscribe  1088    static
ltablib.c:144:12:tconcat        1088    static
lstrlib.c:122:12:str_char       1088    static
liolib.c:343:12:read_line       1088    static
lauxlib.c:388:24:luaL_gsub      1088    static
file.c:191:12:file_g_read       1088    static
liolib.c:371:12:read_chars      1072    static
lauxlib.c:689:16:luaL_loadfsfile        1072    static
i2c.c:122:12:i2c_read   1072    static
ow.c:194:12:ow_search   1056    static
ow.c:129:12:ow_read_bytes       1056    static
mqtt.c:565:6:mqtt_socket_timer  1056    static
lstrlib.c:90:12:str_rep 1056    static
lstrlib.c:78:12:str_upper       1056    static
lstrlib.c:65:12:str_lower       1056    static
lstrlib.c:54:12:str_reverse     1056    static
lstrlib.c:144:12:str_dump       1056    static
mqtt.c:527:13:mqtt_socket_connected     1040    static
coap_server.c:7:8:coap_server_respond   496     static
lparser.c:411:8:luaY_parser     384     static
lparser.c:611:13:body   336     static
wifi_common.c:8:6:wifi_add_sprintf_field        320     static
encoder.c:37:15:fromBase64      320     static
enduser_setup.c:1002:14:enduser_setup_http_recvcb       304     dynamic
coap_client.c:10:6:coap_client_response_handler 256     static
wifi.c:1168:12:wifi_ap_dhcp_config      208     static
httpclient.c:441:24:http_request        208     static
wifi.c:607:12:wifi_station_config       176     static

@jmattsson
Collaborator

That's a lot of luaL_Buffer instances on the stack there, each with a 1k buffer (via LUAL_BUFFERSIZE -> BUFSIZ -> stdio.h).

@TerryE
Collaborator
TerryE commented Jun 10, 2016 edited

Reimplement the net module (and others) on top of lwIP API, since espconn is only partially supported on the ESP8266 RTOS, and not at all on the ESP32 RTOS. Probably look at including mbedTLS for TLS support.

@jmattsson Johny, I've just been going through a review of what I'd need to do to the net library to port it in a non-OS SDK + ESP8266 / ESP32 RTOS SDK way, and to that end I've been comparing the documentation for the two RTOS APIs. In short, the ESP32 has a few additions:

  • Sensor APIs: Temperature sensor and Touch pad sensor APIs
  • System APIs: Hardware MAC APIs : Hardware MAC address APIs
  • Driver APIs: I2S Driver APIs : I2S APIs

It also has some big omissions:

  • Force Sleep APIs : WiFi Force Sleep APIs
  • Rate Control APIs : WiFi Rate Control APIs
  • WPS APIs : WiFi WPS APIs
  • AirKiss APIs : AirKiss APIs
  • Upgrade APIs : Firmware upgrade (FOTA) APIs
  • Network Espconn APIs : Network espconn APIs
  • ESP-NOW APIs : ESP-NOW APIs
  • Mesh APIs : Mesh APIs
  • Hardware timer APIs : Hardware timer APIs

OK, the additions partially reflect new H/W capability, but what isn't clear to me is whether any specific omission is a permanent removal or simply a temporary omission because the ESP32 SDK is still in beta, in which case Espressif will add it back before a V1 production release. I really don't want to spend a lot of effort effectively reimplementing something that gets added back before I am done. I'll have a trawl around the ESP32 forum to see what I can find here, for example: Network Espconn APIs.

@jmattsson
Collaborator

There's a fair bit more undocumented support on the ESP32, especially in the RTC co-processor area (can't wait to find out how to build code to run that core too!). It's not yet clear which model(s) Espressif will support with the two main cores. I've seen references to both running the entire chip as an SMP RTOS system, but also a "split" version where the WiFi stack runs on the "pro" core and the application has the "app" core to itself. Currently there is only support for running a single-core shared RTOS on the "pro" core.

Regarding omissions, yes, there are certainly some, but they were honestly surprisingly few to me. I don't know if you've been keeping an eye on the dev-rtos branch, but I'm at the point where that branch can now build and link for the ESP32 as well (can't/haven't run it yet though; my ESP32 seems to have trouble with its flash chip - we've ordered replacements for next week to try a swap). The things I ended up ifdef'ing out for now were only espconn, RTC, RF modes and some bus drivers (SPI, I2C) due to hardware differences.

I didn't have to change any of the hardware timer (FRC) stuff, other than provide suitable compatibility macros. Not sure which timer API you are referring to here? Also worth noting that in RTOS the os_timers are very high priority and might be sufficient for some things we've hooked the hardware timer for.

In terms of upgrade APIs, I'm still not sold on the Espressif way. We already have two competing, working implementations for the non-OS SDK version. I'd rather try to make those two compatible with each other and port that over to the ESP32.

For meshing, I'm guessing this will come later, just as it did for the 8266. Considering we haven't used it yet, this should be fine. Besides, the whole ESP-NOW protocol needs to evolve and stabilise a bit further first imo - it's got some serious drawbacks which make any real life deployment challenging.

With the espconn bugbear I did see that mention of a compat layer for ESP32 RTOS, but never for the secure version (there is none available even for ESP8266 RTOS). The TLS library also looks very different. Even if they were to provide both espconns, we'd still have the issue that it's not possible to shut down a TCP server safely. Also, the lwIP native API is a much better fit in the RTOS model, since pbuf ownership is well defined and would allow us to easily and safely transfer it between tasks, giving us lower memory usage. I really don't think any time you spend on transitioning us to native lwIP is going to be wasted, Terry. At worst I could see feature parity, but far better stability and code quality in your implementation.

@TerryE
Collaborator
TerryE commented Jun 11, 2016

I really don't think any time you spend on transitioning us to native lwIP is going to be wasted, Terry. At worst I could see feature parity, but far better stability and code quality in your implementation.

OK, but I think that a sensible compromise for now is to get an LwIP non-OS SDK-based net module working, as we will need this whatever we do.

@TerryE
Collaborator
TerryE commented Jun 11, 2016

That's a lot of luaL_Buffer instances on the stack there, each with a 1k buffer (via LUAL_BUFFERSIZE -> BUFSIZ -> stdio.h).

Picking up this discussion:

  • This is an analysis of the calculated stack size for a set of unsorted functions, not the actual total size (str_upper does not call str_lower or v.v.)
  • This is also a manifestation of Lua's KISS approach to its RTL and the use of standard patterns. The luaL_Buffer balances using the stack, which is runtime efficient but possibly memory inefficient in that you need to size your stack for a worst case depth, against a malloc approach which might be more memory efficient but generates a lot of malloc/free operations.
  • There would be nothing to prevent us breaking the mapping to BUFSIZ and defining LUAL_BUFFERSIZE to be 128, other than when we are processing I/O buffers this would tend to generate extra malloc/free operations -- though this won't be the norm in the case of embedded Lua, so there are good reasons for considering this.
  • It's a pity that Lua doesn't provide a luaL_Buffer declaration that takes an initial size which it could allocate using alloca() if less than LUAL_BUFFERSIZE or heap allocate otherwise, as a lot of routines have a good estimate of the size of string needed (e.g. str_lower), but that's another story.

So out of this, the one suggestion that I think that we could consider is breaking the LUAL_BUFFERSIZE -> BUFSIZ association.
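To put a number on that suggestion, here is a small stand-alone sketch comparing two buffer sizes. The struct mirrors the field layout of Lua 5.1's `luaL_Buffer`, but the type names, the `DEFINE_BUFFER_TYPE` macro and `stack_saved_per_buffer()` are made up for the comparison, and `void *` stands in for `lua_State *`. The actual change would presumably be a single `LUAL_BUFFERSIZE` define in `luaconf.h`.

```c
#include <stddef.h>

/* Same field layout as Lua 5.1's luaL_Buffer, with the array size
 * made a parameter so two variants can be compared side by side.
 * The void *L member is a lua_State * in the real struct. */
#define DEFINE_BUFFER_TYPE(name, size) \
  typedef struct name {                \
    char *p;                           \
    int   lvl;                         \
    void *L;                           \
    char  buffer[size];                \
  } name

DEFINE_BUFFER_TYPE(buffer_1k, 1024);  /* current size: 1k via BUFSIZ */
DEFINE_BUFFER_TYPE(buffer_128, 128);  /* the 128 byte size floated above */

/* Stack bytes saved for each luaL_Buffer-style local variable. */
size_t stack_saved_per_buffer(void)
{
  return sizeof(buffer_1k) - sizeof(buffer_128);
}
```

With these two sizes the saving is 896 bytes per buffer on the stack, at the cost of more frequent spills to the heap when strings exceed 128 bytes.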

@jmattsson
Collaborator

If you've been following the ESP32 news, you might be aware that we got a WIP pre-release SDK drop a couple of weeks ago. I've finally had a bit of time to sit down and go through it. A lot of things have changed, and there's a lot more source available (yaaaaay!). They've even changed the terminology away from "SDK" over to "IDF" (IoT Development Framework) instead.

It does however present a challenge in terms of supporting both the ESP8266 and the ESP32 in the one NodeMCU branch. At this point I have no knowledge whether there will be a similar "IDF" becoming available for the ESP8266. If there is, we'll need to adopt that. If there isn't, I'll need to see whether it's feasible to massage the ESP32 IDF into the NodeMCU build structure somehow. It would probably still need NodeMCU to be based on the RTOS-SDK in such circumstances.

Interesting times...

@jmattsson
Collaborator

So today, totally unexpected, I got a parcel at work. With an extremely well-packaged ESP32 devboard inside! ./squeee!

Thanks @nodemcu, I assume that was your doing! :)

I'll continue the NodeMCU porting effort now, but in some ways it's like starting over from the beginning considering the massive changes to the build environment that the IDF introduced. Time to read up on Kconfig and see if I can come up with a clever approach to use the IDF framework to build NodeMCU for the ESP32 while using our existing approach for the ESP8266(RTOS)...

So far I've updated the pre-compiled toolchain, so if you're using the dev-rtos branch you can use the tools/toolchain/esp32/bin toolchain with the IDF environment.

@jmattsson
Collaborator

Tentatively, I'm thinking that going IDF all-out is the best way forward. Kconfig is soooo much nicer than user_config.h/user_modules.h and it could easily take over both those roles. Sure, there may be some grunt-work to get the ESP8266 RTOS-SDK compatible (unless Espressif comes through soon on that front), but I think it would be worth it. Thoughts?

@jmattsson
Collaborator

I need to sit down and do a proper write-up on all the stuff I've learned, but here's a visual update.

@luismfonseca
Contributor

Hmm! What a tasty amount of free heap it has!

@pjsg
Collaborator
pjsg commented Sep 20, 2016

The downside of that much heap is that our current approach to allocating it doesn't work any more (the GC will take much longer). The upside is that we can avoid running the GC so often!

@jmattsson
Collaborator

Speaking of pros and cons, there is good news and bad news. The good news is that the new Espressif IoT Development Framework (IDF) is really nice to work with. It feels flexible, powerful, and polished (with the occasional unfinished spot). While I've certainly got a soft spot for the cozy hackiness of the 8266 SDK, the IDF is playing in a different league, and I'm really liking what they've done here.

The bad news is that it is so different we will need a dedicated ESP32 branch, at least for the foreseeable future. Back with the RTOS SDKs I think we could've managed a single branch, but even trying to get the ESP8266 RTOS SDK to work together with the ESP32 IDF is something I see as insurmountable given our resources, let alone the non-OS SDK with the IDF.

I tried taking the dev-rtos branch and "IDF-ifying" it, but honestly, that's just not going to work. Getting NodeMCU onto the ESP32 is going to be a case of carefully lifting each module from the dev branch, tweaking it to fit in with the IDF arrangement of headers and libs, and of course making it RTOS-aware and safe. This is in many ways a lot less satisfying than cutting across large swathes of code which "mostly works", but on the other hand I think it will result in better code quality, quicker.

While I thought we had a pretty good platform abstraction layer, the ESP32 is such a significant upgrade and departure from the ESP8266 way of doing things that it will need a lot of revisiting. The way I see this playing out is that we'll run both dev branches in parallel for some time. As soon as the ESP32 branch is in any sort of reasonable shape (I'm still tidying stuff up over in the DiUS repo before I push it across to here), we'll have to do extra work whenever we're merging in things to ensure it gets applied to both branches to reduce divergence. The sheer amount of stuff that happened on dev compared to dev-rtos made it infeasible to rebase or merge without a whopping big effort, and I don't think we could cope with that if we let it happen this time around.

In the end, I see three or four possible/likely paths:

  1. We keep the ESP8266 and ESP32 branches separate indefinitely, and leave the 8266 on the non-OS SDK. Less work in the short term, but it precludes actual code sharing between the branches for the most part and is thus very costly in terms of effort over the long term.
  2. The ESP8266 branch gets moved to the RTOS SDK, which would enable code sharing between the branches for certain things, hopefully the majority of modules, subject to some #ifdef'ing. This is the middle path, with more up-front effort but only moderate on-going overhead.
  3. Espressif releases an IDF for the 8266. This would be the ideal option, as it would allow us to merge the branches. Switching the 8266 over to the IDF would take some work, but could be done by adding in ESP8266-specific features on the already-IDF'd ESP32 branch. I'd expect it to be a similar amount of up-front work as the previous option, but a whole lot less on-going. This is the option I'm hoping for, to be honest.
  4. The NodeMCU team decides that it's too much effort to ever support the ESP32, and discontinues the porting effort. Definitely not my preferred option, but something that might happen if we don't find the time/people to cope with it all.

Unless we go with the last option, we will need every module maintainer to pitch in with getting all the modules ESP32 ready (excluding stuff which is pure 8266, naturally). Over the next couple of days I'm hoping to document as much as possible of the changes needed. Some is already at the top of this thread, but there's much more I'd like to share to help form a common view of what's necessary. My plan is to put it in-repo in the docs so it's easily findable for any developer who comes in later too. I'll also keep tidying up what I've done so far, and shunt it across into the official repo. Of course, the difficulty in getting hardware will not help with getting this flying, but that pain will pass.

While the above might sound a bit gloomy, I reckon we can make it happen, and I really think we want to make it happen - the ESP32 is one nice chip! It really feels like Espressif took the list of shortcomings/annoyances of the 8266 and just fixed them all. Properly. And the documentation quality is really good (just waiting on the quantity now!)

@luismfonseca
Contributor

@jmattsson For completeness' sake, there's also a 5th option:

The NodeMCU team decides that it's too much effort to support both ESP8266 and ESP32, and discontinues the ESP8266. (Obviously also a sad option).

Anyway, I know that I've limited experience in working for NodeMCU but I'm going to help in any effort required towards having NodeMCU on ESP32. :)

@mikewen
mikewen commented Sep 21, 2016 edited

My first reaction was NO, keeping ESP8266 support is much more important! To be honest, I am disappointed with the ESP32. (It offers very little of what I need.)

On second thought, maybe the current NodeMCU is stable enough, with many features. Why not focus only on the ESP32? But my concern is that the ESP32 has not attracted as much interest as the ESP8266 did when it launched. We can run a simple poll.

@TerryE
Collaborator
TerryE commented Sep 21, 2016

Johny, given that the ESP32 part seems to be shipping for ~ $5 and this will probably fall, I can't see the ESP8266 lasting long. At best it will be shipped as a low cost "sustain" component to support existing production uses. Speaking purely personally, the $1-2 price point isn't important for me. The complete WiFi integrated SoC module is, and the ESP32 seems to address all of the annoyances and constraints of the 8266, so I would personally vote for a switch to the 32 for future development.

The RTOS / IDF stack of the 32 vs. the non-OS SDK of the 8266 will make it very difficult to maintain a common code base going forward, I think, so we've got some hard calls to make ahead.

PS. the stone skin of my new house will be finished in a few weeks and we are waiting for the plastering team to board out, so the silly 6-7 days a week should slacken off soon. I am suffering Lua withdrawal symptoms, so it's getting time to get back on-board and catch up, I think 😄

@mikewen
mikewen commented Sep 21, 2016 edited

I doubt the ESP32 will be as popular as the ESP8266. And the ESP32 is still at least 1-2 years away from being mass-production ready.

This thread has only had about 5 people since May; that says something.

@jmattsson
Collaborator

As some of you might've noticed, I've just established the dev-esp32 branch in this repo. It's the "cutting edge" of ESP32 support, and by cutting I really mean I've cut out the 8266 support completely, as mentioned above. I've been building the ESP32 NodeMCU from the ground up, and I believe I've now dealt with all the FIXMEs I had in there, and most of the TODOs. In short, I think it's at the point where others could realistically start helping with the porting effort.

As of writing, that branch has got the console UART functioning (except auto-baud), and I've just finished getting SPIFFS to work today. NodeMCU now uses an explicit partition for the filesystem, rather than magically deducing free space and dropping the fs there (an approach which wasn't a good mix with partitions!). I'm sure @pjsg could polish it further though, and over time we'll need to consider support for embedding a readonly fs within the app, but that's for later.

The next things on my list are to grab some more of the node module functionality, and some of the basic WiFi functions. Once the WiFi is up I'll grab the native-lwip net module across from the PR (or dev if it's made it in). I'm planning on treating this branch as a "cowboy" branch for a while yet, but if y'all disagree and want to start seeing PRs sooner rather than later I can do that too. There might be a lot of those in that case however.

Of the code in the esp32 branch, the one feature I can't yet enable is the FATFS option in the build since I haven't got the sdcard/spi support ported yet. Someone else is most welcome to look at that.

I've started making developer notes in the extension dev FAQ but it's rather light-weight so far. Somewhere I guess I should document the following:

git checkout dev-esp32
git submodule update --init
make menuconfig
make
make flash

which is the TL;DR for this branch. There may need to be something about PATH=$PATH:$PWD/tools/toolchains/esp32/bin, but if so that should get baked into the Makefile really.

@luismfonseca Other than porting modules over from the latest dev, one thing that would be useful is feedback/help with the porting notes. I'm kind of hoping it will grow as people start porting modules and notice shortcomings in said notes. There is also some older stuff on the wiki, of which some should go into the extension dev FAQ, some discarded as it's no longer relevant, and possibly the progress table redone.

Oh, and of course, testing the functionality that has been ported so far is always welcome, but I appreciate ESP32s are still a bit rare.

@TerryE Yeah I expect the 8266 will hang around for years to come, but for new designs the 32 is certainly a tempting option. It's not that long ago that the 8266 was priced in a similar range, and the 32 is a ridiculously powerful chip for the price-point! Good to hear the house building is progressing well; looking forward to having you back on board and butting heads with me over technical details :)

@mikewen Years away from production ready? Nah, chip production is already (finally!) rolling and module production is ramping up. By xmas I imagine dev boards will be freely available. And back when the ESP8266 was launched, few took notice of it. It took quite a while for it to really break into the hacker/maker community, largely due to lack of docs and tools. Espressif is really working with the community this time, and I expect the 32 will get an overall quicker uptake, tbh.

@devsaurus I've been hitting some stack overflows in the Lua thread even with a significantly larger stack, so if you're feeling adventurous you could look at getting the -fstack-usage patches into the xtensa-esp32-elf toolchain as well :) I'd be happy to include some patches in the prebuilt toolchains that are submodule'd into the repo.

@TerryE
Collaborator
TerryE commented Sep 22, 2016 edited

@jmattsson Johny, I've been brooding about the Lua architecture. As you know NodeMCU is built on eLua, which was built and tested on the assumption that the Lua interpreter is non-reentrant, and hence the VM can only execute a single Lua thread and any multi-threading must be cooperative. IMO, we should stick with this on the ESP32 because moving to a thread-enabled VM is going to require a lot of regression testing and fixing some subtle and unknown dependencies on the single thread assumption in the RTS libraries.

Given the asymmetric nature of two processors (though did I notice references to making the latest RTOS versions SMP?) I don't see this as a major impediment. However we need to have some clear guidelines for library writers interacting with Lua-land.

  • Whilst there will be nothing stopping device drivers running within the RTOS pre-emptive framework, the tasking mechanism and non-Lua structures will be required to synchronise with the tasking interface to pass control to Lua.
  • As single-threaded Lua cannot be pre-emptive, Lua event routines cannot have guaranteed latency unless the application is written with strict timing discipline. And I wonder whether we should make more use of coroutining to facilitate a more procedural applications interface.

More thought needed :)

@jmattsson
Collaborator

@TerryE It's almost fully SMP now. There are a handful of things which can only be done from one core or the other, but pinning a driver task/thread to that particular core is trivial (if we ever need to use those things - off the top of my head I can't remember what they are).

And yes yes yes we're sticking with a single-threaded LVM thank-you-very-much! I'm not debugging the monstrosity that would otherwise appear! :D I already ported the NodeMCU task API when I did the original RTOS work, so that side is covered. Getting everyone to remember to post/queue things from within SDK callbacks rather than calling directly into the LVM will be the challenging part. I've tried to cover this in the dev docs I've been writing so far, but it's certainly a point that bears hammering in.
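As a rough illustration of the "post, don't call" rule, here is a minimal sketch in plain C. It is not the NodeMCU task API and not an IDF or FreeRTOS queue: `event_queue`, `evq_post`, `evq_drain` and `sum_params` are hypothetical names, and the ring buffer has no locking, so a real port would need an RTOS queue or equivalent synchronisation between the callback context and the Lua task.

```c
#include <stddef.h>
#include <stdint.h>

#define EVQ_CAPACITY 8  /* hypothetical queue depth */

typedef struct {
    uint32_t id;     /* which event occurred */
    uint32_t param;  /* event-specific payload */
} event_t;

typedef struct {
    event_t slots[EVQ_CAPACITY];
    size_t  head;   /* index of the next slot to read */
    size_t  count;  /* number of queued events */
} event_queue;

typedef void (*event_handler)(const event_t *ev, void *ctx);

/* Called from a driver/SDK callback context: it only records the
 * event and never touches the Lua VM. Returns 0, or -1 if full. */
int evq_post(event_queue *q, uint32_t id, uint32_t param)
{
    if (q->count == EVQ_CAPACITY)
        return -1;
    size_t tail = (q->head + q->count) % EVQ_CAPACITY;
    q->slots[tail].id = id;
    q->slots[tail].param = param;
    q->count++;
    return 0;
}

/* Called only from the one task that owns the Lua VM: drains the
 * queue in FIFO order and dispatches each event to the handler.
 * Returns the number of events handled. */
size_t evq_drain(event_queue *q, event_handler handler, void *ctx)
{
    size_t handled = 0;
    while (q->count > 0) {
        event_t ev = q->slots[q->head];
        q->head = (q->head + 1) % EVQ_CAPACITY;
        q->count--;
        handler(&ev, ctx);
        handled++;
    }
    return handled;
}

/* Illustrative handler standing in for "call the Lua callback":
 * it just accumulates the payloads into a uint32_t. */
void sum_params(const event_t *ev, void *ctx)
{
    *(uint32_t *)ctx += ev->param;
}
```

The point of the split is that only `evq_drain` ever runs code that would enter the LVM, so the single-threaded assumption holds no matter which context the events originate from.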

I wonder if I could convince Espressif that their APIs should take a "results-posted to-this-task-please" approach over the current direct callback way...

@igrr
igrr commented Sep 22, 2016

It will rather be "Results posted to this queue please". We are, indeed, going to move away from callbacks. So I apologize for breakage caused before we reach 1.0.

@TerryE
Collaborator
TerryE commented Sep 22, 2016

@jmattsson Johny, rereading this whole thread I realise that it must seem that I am going senile because I keep repeating myself. 😆 The problem is one of bandwidth: I need to allocate ~30min every day just to keep up with what is happening on the list, and I just haven't had that time so have got out of sync. I need to do some deep reading to catch up, and avoid stating the already stated. Sorry. I'll drop you an email separately.

@jmattsson
Collaborator

@igrr That's great news, Ivan! Thanks for letting us know. That approach will certainly make everyone's life easier. Any rough idea on time frames for this to start appearing in the IDF? And which areas might get it first? I'm just trying to get an idea on how I might best plan my work.

@TerryE Hahaha, you're excused! I remember how hazy I was back when I was doing the multi-year reno/build for my house, so I'm not going to judge.

@igrr
igrr commented Sep 23, 2016

This change needs to land before 1.0 release, which should happen around Oct 1st. If you have some proposals about the way you would like to see this API, i encourage you to open an issue at https://github.com/espressif/esp-idf. I will be refactoring the startup procedure on Monday, my plan is to remove app_main callback from wifi stack. Instead, let application provide a normal C-like main, from which one can initialize WiFi/BT stacks as one would initialize any other driver. I can put event processing refactoring in the same MR.

@devsaurus
Collaborator

if you're feeling adventurous you could look at getting the -fstack-usage patches into the xtensa-esp32-elf toolchain as well

Sure, what repo/branch are you building from?

@jmattsson
Collaborator

I'm using this build script.

@devsaurus
Collaborator

The stack-usage patch is already in some fork out there. I'll cherry-pick it to mine and would check porting the callgraph info thing as well.

@jmattsson
Collaborator

Cool, you could even raise a PR against the Espressif fork when you're ready, I'm sure others would love this functionality too!

@devsaurus
Collaborator

-fstack-usage is available from: https://github.com/devsaurus/crosstool-NG

@jmattsson
Collaborator

@devsaurus cool! Dare I ask for the callgraph-info too? :D

I'll see if I can get some time next week to update the prebuilt toolchains.

@igrr
igrr commented Sep 29, 2016

Thanks, I can incorporate those changes into pre-built toolchains provided by Espressif. Oh and by the way, we do have an image in the docker hub which is made specifically for CI use: espressif/esp32-ci-env.

@jmattsson
Collaborator

@igrr I like the refactoring work you did - it made it quite easy to slip into the event queue handling neatly!

General ESP32 progress update:

  • I've been keeping up with the IDF updates and adjusting accordingly. Some handy UART macros disappeared, but other than that it's all improvements.
  • Most of the node module has been ported across by now, sans ESP8266 specific functions.
  • Got a workaround for the 2MB flash limitation issue, so the usual NodeMCU flash size detection should now be fully functional.
  • Due to $work needs I've created a small Bluetooth module for working with LE Advertisements, and am happy to report that the Bluetooth HCI on the ESP32 seems to be working just fine. We still need to wait for support for enabling both BT and WiFi at the same time however. Also, at some point there should appear a full BT stack, not just the HCI. So far, so good though...

@devsaurus
Collaborator

@jmattsson that was the plan 😃 but took some more time. It just landed at
https://github.com/devsaurus/crosstool-NG/tree/callgraph-info

@igrr do you want me to place a PR for stack-usage and callgraph-info against the Espressif repo?

@TerryE
Collaborator
TerryE commented Sep 29, 2016

All, this thread has evolved into a primary topic: the way forward for NodeMCU on the ESP32, which I support BTW. But with the new IDF and its API incompatibilities with the legacy RTOS SDK, plus the intractable resource issues of getting a functional NodeMCU implementation on the ESP8266 over RTOS, I suggest that the de facto position is that we are going to be left with two platforms going forward:

  • NodeMCU over the non-OS SDK on ESP8266 and
  • NodeMCU over the IDF/RTOS on ESP32

Given this dichotomy, what I wonder is: should we accept this and fork NodeMCU or should we at least attempt to unify these two diverse approaches at some level within the NodeMCU Lua:

  • Do we maintain a single unified distribution that in essence has two targets: ESP8266 and ESP32?
  • Do we attempt to provide at some Lua application level a unified abstraction model that enables Lua developers to migrate like-for-like applications from the ESP8266 to the ESP32 without a total rewrite at a Lua application level?

None of this discussion should detract from or impede what you guys are achieving with the ESP32 port, but I feel that in principle these objectives are more doable than might initially appear. I also feel that achieving this goal will materially ease the migration path for our developers.

The two issues are largely independent, but not entirely. In terms of the common code base, we would need two localisation hierarchies: ESP8266 and ESP32 together with separate platform abstractions with a unified abstraction API at some level.

However the ESP8266 non-OS model is non-preemptive event driven and the ESP32 RTOS model is pre-emptive and procedural. My view is that at the low C level, these are fundamentally at odds. However, surely the approach here is to see if we can define a unified abstraction at the Lua level. I think that Lua coroutining might just be the magic bullet. I don't want to hijack this thread further, but what is the best way to have this debate? As a separate/new thread? A white-paper / RFC for discussion?

@heymind
heymind commented Oct 2, 2016

Why not use mongoose as its network library? It supports many protocols including WebDAV, so it is easy to manage files in flash 💪💪

@jmattsson
Collaborator

Time for another ESP32 progress update!

Thanks to $boss at $work I've been getting some decent time to work on this, and the dev-esp32 branch is now in an almost-useful state!

  • A basic WiFi module implementation has been done, but I've yet to find time to write/update docs for it. I've taken the liberty to change the interface here somewhat, both due to changes in the IDF compared to the SDK, and based on discussions between @marcelstoer and @dnc40085. Changes include:

    • The WiFi stack is now manually started via wifi.start() and can similarly be shut down with wifi.stop(). This mirrors what the IDF provides now, and could allow for some power savings with a user not needing to have the WiFi stack up except when actually wanting to use it.
    • The event monitoring has moved to wifi.on('event', func) to better be in line with other modules (e.g. net.socket).
    • wifi.setmode() is now just wifi.mode() since it does more than just set a property, and in general "getters" now return the information in the same format as the set/configure function would take.
    • Along those lines, AP scanning interface got another polish, and now returns an array of objects, each of which could be passed to wifi.sta.config().

    Whether we want to keep this interface, revert to the old one, or have the old one as a compat layer, I'll leave for others to debate. I'm seeing the ESP32 version as a chance to clean up our APIs a bit and learn from our 8266 experiences. Moving projects from 8266 to 32 will need changes regardless, so this is the perfect time to improve our user-facing bits.

  • There is a net module based on @djphoenix's native LWIP net module, but RTOS'ified. No docs, but also no API changes. SSL not yet available. The net module can be used as a blueprint for inspiration on how to RTOSify other modules safely.

  • As mentioned before, a simple bthci module for basic Bluetooth advertising. I have documented that one.

  • Espressif have been making improvements and fixes to the IDF, and as of writing the dev-esp32 branch is using the latest available IDF ("post-0.9"). There may yet be breaking API changes, but so far I've been staying on top of everything.

  • The developer faq got an update yesterday as well.


All that said, this will be the last you see from me for a bit. Next week I'm going on annual leave and will be heading overseas, so no coding for me for a few weeks. I may be able to keep an eye on github discussions, but don't be surprised if I take a while to respond.

Once I'm back, I'm hoping to do a first cut of a gpiomux/pinmux module to provide an interface to the pin-routing capabilities of the ESP32. That done, it's the gpio module next. I'm very tempted to expose the actual GPIO numbers in the Lua interface this time, and if we have an official NodeMCU board then provide constants a la gpio.D0 to provide the mapping. Potentially it might be an idea to have a bunch of submodules for that, so you'd get gpio.nodemcu.D0 and e.g. gpio.otherboard.D0 for best ease of use. Of course, with the pin-routing capabilities on the ESP32, doing any sort of pre-canned mapping might be foolish. Flexibility - whoo!

If others want to start working on the ESP32 branch (did any of you manage to snag an Adafruit board before they sold out again?), do feel free. We're still waiting on Espressif to start providing more drivers (i2c, spi, etc), so many things aren't ripe for adding yet, but hardware agnostic modules should be pretty easy by now. Kconfig ftw.

We're also expecting to get an official interrupt-allocation API from Espressif, so if you're wiring up ISRs, I'd suggest hard coding for now (see console.c) rather than rolling our own allocator.

@jmattsson
Collaborator
jmattsson commented Nov 27, 2016 edited

I'm closing this mega-thread in favour of the individual "ESP32:" issues/todos set up yesterday. Marcel also set that up as a Project here, making it even easier to find.

@jmattsson jmattsson closed this Nov 27, 2016