
Create unit tests in Lua #2145

Open
marcelstoer opened this issue Oct 21, 2017 · 53 comments

@marcelstoer
Member

@marcelstoer marcelstoer commented Oct 21, 2017

It would be extremely helpful to have a single Lua script that anyone could run to test hardware-independent modules.

Whenever we cut a release we do so w/o having done any structured and reproducible testing. I understand and accept that it's difficult to do serious (automated) testing for sensor/actuator modules because it requires the respective hardware to be installed. However, it'd be fairly straightforward to test all other modules at least partially, e.g. WiFi (basic stuff), timer, net/tls, HTTP, sJSON, SNTP, perf, MQTT (against a public broker), file, cron, bit etc.

This is related to #516 (i.e. a light version of it).

@HHHartmann
Member

@HHHartmann HHHartmann commented Oct 22, 2017

I strongly agree.

Some quick research turned up the following page: lua-users.org/wiki/UnitTesting
I have not tested any of them, but some are marked as minimal or tiny, which would probably be a good place to start. Some might not be suitable out of the box though; minctest-lua for example uses os.clock().

I would also like to include a check for the required module(s) in each test or test file so they could be marked appropriately in the results if the prerequisites fail (or not be executed at all).

To make sure that arbitrary sets of tests can be run by individuals, I could imagine a naming scheme (like "test-MyFavoriteTest") and a script that runs all matching files.
That way only the needed tests have to be downloaded to the ESP and run, without managing any lists.
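
For illustration, a minimal sketch of such a runner, assuming NodeMCU's file.list() API (the function name and the sjson guard are just examples):

-- Run every file matching the "test-*" naming scheme.
local function run_all_tests()
  for name in pairs(file.list()) do
    if name:match("^test%-") then
      print("running " .. name)
      dofile(name)
    end
  end
end
run_all_tests()

-- Inside each test file, a guard like this could skip tests whose
-- prerequisite module is not compiled into the firmware:
if not sjson then
  print("SKIP: sjson module not available")
  return
end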

just my 2c

@marcelstoer
Member Author

@marcelstoer marcelstoer commented Oct 22, 2017

minctest-lua for example uses os.clock()

And io.*, math.*, and debug.*... I don't expect we can use any of those other than for inspiration.

include a check for the required module(s) in each test or test file so they could be marked appropriately in the results if the prerequisites fail.

👍

To make sure that arbitrary sets of tests can be run by individuals

I believe it's important to keep the execution & preparation effort minimal: copy a single file to the device and run it. If each test checks whether the required module is available, I don't see the need for arbitrary sets.

I have a few ideas how to improve this but I prefer to see working software rather than discussing the perfect solution for weeks and months 😉

@TerryE
Collaborator

@TerryE TerryE commented Oct 24, 2017

Marcel, there is already an established test suite for standard Lua and I use it quite a lot for stress testing host variants of my build, e.g. the LCD and the current LFS patches as well as my 5.3 work. The main issue is that it's got far too big a footprint to use on standard NodeMCU builds. Having an LFS will help a lot here, plus a robust Lua provisioning system to download the test overlays.

@luismfonseca
Contributor

@luismfonseca luismfonseca commented Oct 29, 2017

Nothing fancy, but I did experiment doing a minimal test framework that works on the nodemcu: https://github.com/luismfonseca/nodemcu-mispec

@marcelstoer
Member Author

@marcelstoer marcelstoer commented Oct 29, 2017

Thanks Luís, I knew that there's something like this out there but I didn't remember who in the NodeMCU community brought it up earlier. It pretty much looks like what I had in mind.

@stale

@stale stale bot commented Jul 21, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Jul 21, 2019
@marcelstoer
Member Author

@marcelstoer marcelstoer commented Jul 21, 2019

I think the more we are leaning towards Lua modules instead of C modules the more relevant this topic becomes.

@stale stale bot removed the stale label Jul 21, 2019
@ildar

@ildar ildar commented Jul 22, 2019

@nwf
Member

@nwf nwf commented Jul 22, 2019

@ildar I think it better to test over the serial link than via the network. Fewer things that need to come up first and everything can be driven by something as straightforward as expect.

@TerryE
Collaborator

@TerryE TerryE commented Jul 22, 2019

What we really need to test here is the Lua VM and its integration with NodeMCU. I will be the main user. I just don't see our typical IoT developers being interested in something like this for IoT apps themselves. Perhaps professional orgs like DiUS and @jmattsson might be interested. So, as @nwf suggests, I feel something like the Lua test suite driven over a UART interface would be a better starter.

@jmattsson
Member

@jmattsson jmattsson commented Jul 22, 2019

I've found any sort of testing of embedded devices to generally be cumbersome, tedious and/or error prone. I'd certainly be interested in seeing something in this area, but I don't have much to contribute at this point. Past experience with other embedded environments would suggest that control over serial is the most flexible and reliable option, since often you'll want to test network attach/detach and various networking error conditions.

I should mention that in the IDF on the ESP32, Espressif has included a unit testing framework. I have not looked at it other than to note that it is there. It might warrant a quick look at least.

@TerryE
Collaborator

@TerryE TerryE commented Jul 22, 2019

The other reason for using the UART is that it avoids depending on the network stack, which is complex because of all of the other network housekeeping tasks that need to go on. A big component of the standard Lua tests ensures that the Lua VM can resiliently deal with memory exhaustion. As any ESP developer will tell you: in general memory exhaustion means death, because stacks like LwIP don't currently use the Lua allocator, so we can't gracefully recover from out-of-heap when the exhaustion occurs in the network stack.

@ildar

@ildar ildar commented Jul 23, 2019

@TerryE
Collaborator

@TerryE TerryE commented Jul 23, 2019

That sounds like a good idea. How about a non-trivial example of it in use?

@ildar

@ildar ildar commented Jul 23, 2019

@HHHartmann
Member

@HHHartmann HHHartmann commented Sep 19, 2019

@marcelstoer @luismfonseca @TerryE
I am using mispec to write some tests. Where should I store them and where should mispec go?
Should we submodule https://github.com/luismfonseca/nodemcu-mispec? Or just copy the one file?
I would propose to store the tests in a new /lua_tests directory. The naming scheme could be mispec_{moduleName}.lua.
I feel that some more functionality should be added to mispec, so having the source in our repository might make sense.
Add to mispec:

  • run tests conditionally
    • only if certain hardware is attached (add a central config file describing what is connected to which pins)
    • only if the module is compiled into the firmware
  • add assertions that emit better diagnostics when they fail (sketched below)
    • (not)equals( a, b )
    • contains( list, element )
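
A rough sketch of what those diagnostic assertions could look like (names and messages illustrative, not a final API):

-- Equality assertion with a diagnostic message on failure.
local function equals(a, b)
  if a ~= b then
    error(("expected %s but got %s"):format(tostring(b), tostring(a)), 2)
  end
end

-- Membership assertion for list-like tables.
local function contains(list, element)
  for _, v in ipairs(list) do
    if v == element then return end
  end
  error(("element %s not found in list"):format(tostring(element)), 2)
end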

@HHHartmann
Member

@HHHartmann HHHartmann commented Sep 29, 2019

@marcelstoer @luismfonseca @TerryE
As nodemcu-mispec does not seem to be maintained anymore I would create a module out of it and add it to this project.

Tests would be named /lua_tests/mispec_{moduleName}.lua as mentioned in my last post.

OK, I would really like someone to have an opinion. So please, if you don't think this is a good idea, tell me now, so I won't do this for the bin.

@TerryE
Collaborator

@TerryE TerryE commented Sep 29, 2019

Have you seen my dev-lua53-2:app/lua53/host/tests? These hammer the Lua core VM. The problem with testing most modules is that you need the right hardware to be correctly integrated with an ESP module before you can sensibly execute tests. Even with the core modules which don't require additional GPIO-connected H/W, we still need other test listeners/responders to execute meaningful tests.

We could use the same approach with some sort of RPC to a host or RPi script to do equivalent on-ESP tests.

@nwf
Member

@nwf nwf commented Sep 29, 2019

I think (but am not volunteering, so my $0.02 doesn't count for much more than that) that the best approach for testing is to have a standard nodemcu test environment, which includes 802.11 access, and an expect script that flashes and then speaks to a device under test over a UART link, with some of the modem control lines hooked up, and over the network. For simplicity, I would encourage the use of the Test Anything Protocol for the communication channels; I have some experience manipulating this from expect and it's generally quite pleasant.
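
TAP itself is just line-oriented text (a plan line followed by ok/not ok lines), so the device side can emit it in a few lines of Lua. A sketch, not the actual tooling:

local n = 0
local function tap(cond, desc)
  n = n + 1
  print((cond and "ok " or "not ok ") .. n .. " - " .. desc)
end

print("1..2")                          -- TAP plan: two tests follow
tap(1 + 1 == 2, "arithmetic works")
tap(tmr ~= nil, "tmr module present")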

UART would give us the truly independent modules (bit, bloom, crypto, encoder, sjson, sqlite?, struct) as well as some "device-only" modules (cron, file, node, rtc*, tmr). The control lines would let us lightly probe at gpio and perhaps uart.

The script should take arguments for network setup, but probably we should expect to join existing infrastructure using WPA(2)-PSK. Such network connectivity lets us test at least wifi (client), net, tls (though we will need to have certificates as part of the standard test environment), and http easily enough. For testing wifi AP and friends, I would think the simplest approach would be to require a second nodemcu device under test that can be programmed to be the AP.

As to the rest... we might be needing a more comprehensive test environment.

@marcelstoer
Member Author

@marcelstoer marcelstoer commented Sep 30, 2019

OK, I would really like someone to have an opinion.

I have a strong opinion (as expressed in #516 and this issue) but a pragmatic one. Any test is better than none!

This embedded / microcontroller environment is quite far from what I'm used to in terms of CI & CD in my comfy enterprise-backend-and-frontend world. Of course, the challenges are unique if you can't abstract the hardware. However, as #2928 has shown, we currently don't even ensure that the code we release compiles in all (expected) cases.

nodemcu-mispec, and its gambiarra heritage, looks really nice. I like fluid APIs. You can see that it was likely influenced by how you write JavaScript tests these days. So yes to

I would create a module out of it

However, form follows function. Any other test "framework" that gets the job done would probably also be ok. Again, any test is better than none! And of course I fully support the ideas described by @nwf. I feel they're in line with what I (intended to) describe in this issue's initial comment.

@TerryE
Collaborator

@TerryE TerryE commented Sep 30, 2019

Marcel, doing this properly would take about 4-6 months of solid developer contribution.

Take the #2928 issue as an example: you infer that such a capability would have prevented this bug. This was a failure in a build variant enabled by WIFI_SMART_ENABLE. We have at least 20 such variants plus other side-effects in our 70 modules. To cover these we would need a fully automated system that could do some 100 or so cycles of build firmware + flash to ESP + execute test suite, one per build config variant.

I have already included the Lua Test Suite in the Lua53 build (which you don't even mention). This has about 10K test vectors covering the core Lua engine. It took me about 5 days of solid slog to clear this, and I found ~5 genuine implementation bugs in the process. I also had to change dozens of tests to accommodate valid differences between the NodeMCU and standard Lua environments. But this is only for one host environment, and not for ~100 target ones.

Let's not make the mistake of doing this bottom-up. We need to be confident that we can resource and create a minimum capability that has a net benefit to the project. We need this confidence before we should even begin. So what is that minimum capability?

Surely the first step is for someone to prepare a design spec for the whole end-to-end test system, and then we can do a consensus estimate of the work involved. Only once we have identified people willing to commit resources to this should we proceed. In the meantime, are you or Gregor prepared to draft a spec of what you envision as the minimum capability? If we can't resource writing the spec then we certainly can't resource implementing it.

@HHHartmann
Member

@HHHartmann HHHartmann commented Oct 1, 2019

OK, let me try to sketch the different approaches mentioned in this and other issues:

  1. Have a test harness run on the host which coordinates running tests on the ESP (Tango/busted, expect)
  2. Have automated tests which start after each commit on a PR
  3. Have tests run on the ESP calling the host with some sort of RPC.
  4. Just run tests on the ESP without host connection (nodemcu-mispec, gambiarra)

Having a combination of 1 and 4 would allow for automated testing of different firmware builds (flash/download/test/report) and casual running of only some tests on the device. That way the test logic would stay on the ESP, allowing also for time-sensitive operations and even blind spots where a test temporarily leaves no connection to the host.

Since both nodemcu-mispec and gambiarra seem not to be maintained, we could take either one and adapt it as we need. I like gambiarra's concept of passing a next function to indicate a finished test better than mispec's approach.
Both are under the MIT License; IIRC that was no issue?
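
For illustration, the next-function style looks roughly like this (the exact gambiarra signatures are an assumption here, not verified against its source):

-- An async test signals completion by calling the passed-in next().
test('tmr alarm fires', function(next)
  local t = tmr.create()
  t:alarm(100, tmr.ALARM_SINGLE, function()
    ok(true, 'alarm callback ran')
    next()   -- tells the runner this async test is finished
  end)
end)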

I haven't looked into the other parts, so no comment here.

So we should decide what we want as the final solution, then identify the steps with the most benefit for the project and do these first.

@TerryE
Collaborator

@TerryE TerryE commented Oct 1, 2019

Gregor, take this WIFI_SMART_ENABLE issue: as I said, we've got some 20 of this type of option which changes the firmware build, and 70 modules likewise. At the moment our tools/travis/pr-build.sh script modifies the config to enable all modules, SSL, Develop version and FATFS, then does a make. In order to get this to compile and link, it has to patch the iram1 and iram0 segments. All this is a compile-clean check, as this full firmware image is too big to load into the ESP8266 in reality.

So to get decent coverage of the 20 options and module permutations we might need perhaps 100 different make subsets; certainly more than 10 and probably fewer than 1,000. For each build, a different subset of remote tests would be applicable. We would really need a dedicated test system, for example an RPi4 + USB3 SSD + one or more USB-connected Wemos D1 Minis. Whilst it would be impractical to try to test all H/W, we should at least test minimal GPIO, I2C and say OW functionality, so we would need a small custom board on the D1 Mini. This system could do a release check by cycling around the, say, 100 makes and for each one download the firmware and run the subset of tests that applies to that build config. This would take perhaps 4 hrs elapsed on such an RPi4 config.

At the moment one of my RPis hosts my luac.cross service and it runs a batch script once a day to interrogate the current SHA1 for master and dev and does a rebuild on change. We could use an equivalent test to trigger a test suite run.

The challenge in all of this isn't procuring the H/W or cobbling together the test stack. On this last point we do have to be careful, because whatever the ESP end is, we need to make sure that the selected stack leaves enough free RAM to do usable testing.

The cost and the value is in creating an adequate test vector, all of the test stubs and the PCM (Permitted Configuration Matrix) that links these together. This is the major exercise; it is many months of engineer effort. If we don't do this, then we will end up with a toy system that has no true value.

@jmattsson
Member

@jmattsson jmattsson commented Oct 2, 2019

I'm pretty much on the same page as Terry here. Having the ability to run tests against/on hardware is great, but it's a major effort, and even at $work we often make the call that it isn't worth the investment.

I'll add one more requirement to the list as well: whatever is chosen (if anything) needs to be possible to run locally. If someone pushes a change that breaks the test, they need to be able to easily replicate the test setup so they can debug locally. Throwing random changes at Travis and seeing what works, especially with the high turn-around time of testing on hardware (even in relatively simple setups), is just not feasible, and would be a net loss in my experience.

Before diving in, we really do need to have a clear goal in mind that's realistic, and isn't going to end up costing us more time than it gains us.

@ildar

@ildar ildar commented Oct 2, 2019

@TerryE
Collaborator

@TerryE TerryE commented Oct 14, 2019

The test vectors in LTS are very dense; there are about 13K lines and over 10K actual tests (often for loops are used to generate lots of test cases in a few lines of code). Using the LTS tests in the luac.cross environment has thrown up some genuine and hard-to-pin-down errors.

The last example was that I missed a code path in generating the optional hashes for long strings. In Lua 5.3, strings of 40 bytes or shorter are interned, that is, stored uniquely, so comparison collapses to "the TStrings are identical if and only if the addresses are the same". This is not the case for strings of 41 bytes or more: you can end up with multiple copies of the same string, so their hashes are generated lazily and only if they are used as a table key. Hence simple string comparison first compares the lengths, then does a memcmp() if they have the same length; it doesn't even use the hash, because the cost of generating it is a lot more than doing the memcmp(). Table lookup uses the hash anyway, so Lua53 generates this lazily and, once generated, caches this hash in the TString -- except that this can't work in read-only LFS, so the LFS loader generates hashes for long TStrings on LFS loading. However, Lua53 follows address randomisation best practice and now adds a seed to the hashing algo, where the seed is randomly generated on boot. There is an issue here with LFS which I picked up because there was a test:

load('_012345678901234567890123456789012345678901234567890123456789 = 13')()
assert( _012345678901234567890123456789012345678901234567890123456789 == 13)

and that worked fine out of RAM, but was sometimes failing when executed out of LFS. This was bizarre because the second reference was returning nil yet

for k,v in pairs(_G) do
  if k == '_012345678901234567890123456789012345678901234567890123456789' then print(v) end
end

always prints 13. It was down to this hashing error.

Diagnosing and understanding the issue was difficult, and this was with the PC gdb where I can set multiple breaks and watchpoints and step through code freely. For LFS, the hash seed is stored in the LFS image, and if an LFS image is loaded then its stored value is used for the global seed. (I had missed one code path where the incorrect seed was being used.) Diagnosing the error was hard, but once I realised what the error was, fixing it was a few lines of change.

On the ESP8266 using the remote debugger, you only have one watch or break point for code in irom1 and you can't step over calls, so the diagnostic cost goes up 10-fold -- which is why I created the luac.cross -e environment and use it first for all Lua testing.

I plan to move this test suite to the ESP by creating an LFS image for each of the ~30 test modules and having the test do a node.flashreload() loop around these. Heavy lifting.
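
A sketch of such a loop: an init.lua that persists an index across the reboot that node.flashreload() triggers (image names are placeholders):

-- Cycle through LFS test images, one reboot per image.
local images = { "suite1.img", "suite2.img", "suite3.img" }

local idx = 1
if file.exists("next_image") then
  idx = tonumber(file.getcontents("next_image")) or 1
end

if idx > #images then
  print("all LFS test images done")
else
  file.putcontents("next_image", tostring(idx + 1))
  node.flashreload(images[idx])   -- loads the image and reboots
end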

I decided to migrate the 793-line files.lua to an ESP equivalent. Except that I soon realised that this was going to be a major piece of work. It took me a few hours to code the first 40 test cases, and that is just coding them, not running and validating them, so doing the whole suite would be maybe a week's work. And that is just for the file library, which is synchronous. Any asynchronous library like net or tmr would take factors longer. So generating the entire set will be many months of work. By comparison, writing something like mispec is a few days' work.

The effort isn't in selecting or even developing the test tool, it is in building a sensible coverage of test vectors.

And only when you start to develop the test vectors for file do the architectural weaknesses in its design become apparent. io follows the Lua way: methods do not return error statuses; they work or throw an error. You write your code assuming io works and you don't have to check and code for error returns. If you want to catch the error then you wrap the whole function in a pcall(). file just doesn't work this way. No errors are thrown. Sometimes they are returned as statuses; sometimes they are simply ignored; sometimes the result is just not defined. For example, the documentation is silent on whether you can stat() an open file and what happens if you do. What about opening the same file twice? Or an f:seek() outside the file boundaries? And the file library is in fact a wrapper around two separate filesystems, SPIFFS and FATFS, which have completely different code bases, and the behaviour of individual functions might differ between these two FS.
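
The contrast in a nutshell (the io half runs on the host/luac.cross build only, since the ESP firmware has no io library):

-- io: work-or-throw, so errors are trapped with pcall()
local ok, err = pcall(function()
  local f = assert(io.open("no_such_file", "r"))
  print(f:read("*a"))
  f:close()
end)
if not ok then print("io raised: " .. tostring(err)) end

-- file: nothing is thrown; you get a status (or nothing) back
local fd = file.open("no_such_file", "r")
if not fd then print("file.open just returned nil") end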

As I go through the files.lua test cases, I realise that there is a main-path behaviour that I understand, but outside this I just don't know what the actual behaviour will be, and I would need to experiment to find out. Just developing this one suite is going to be a major piece of work. It is probably going to be less work to port the standard io library, and then I can use the standard files.lua as a starting point.

Getting decent coverage of our libraries is going to be another order of magnitude of effort -- a few engineer-years. Where do we even begin, apart from expressing the wish that any testing is better than no testing, or wanting to end world hunger?

So my conclusion from all this is that if anyone wants to propose a test suite then they need to include a decent test vector for file and net at a minimum. If they can't, then the choice is purely aspirational and at best premature.

Sorry for this long post, and I realise that most won't be bothered to read to its end, but the cost of writing this post is maybe 4 orders of magnitude less than that of doing the work that we are talking about.

@jmattsson
Member

@jmattsson jmattsson commented Oct 14, 2019

You do raise an important point about how some of our modules currently operate. It would seem that at some point in the future we should do a major update where we get everything to follow the canonical Lua way (i.e. raise errors). Currently users (including yours truly) end up with bugs because either they don't expect an error to be raised somewhere, or they forgot to check a return code somewhere else.

And it's not like we haven't got enough on our TODO list already 🤣

@TerryE
Collaborator

@TerryE TerryE commented Oct 15, 2019

Currently users (including yours truly) end up with bugs because either they don't expect an error to be raised somewhere

IMO, the issue here is that if some code way down in the bowels of your application is failing without your code correctly handling it, then your application is doing just that: failing, but without you realising.

Bugs that occur undetected are the most dangerous because they tend never to get fixed. So often I've been involved in code reviews where a project has had major business or engineering issues because an error was occurring 20 levels down in the application, but the guy writing level 15, say, forgot to check for error statuses, so the whole application was just failing silently. Better the error that you know.

@fikin
Contributor

@fikin fikin commented Oct 15, 2019

What is the current understanding of how to test async methods? Or is there any?
My understanding is that some form of time control is needed...

I did some mocking of the net, tmr and wifi modules among others in pure Lua (nodemcu-lua-mocks) so one can unit-test app code against it, but I'm not overly happy with the current outcome.

@TerryE
Collaborator

@TerryE TerryE commented Oct 15, 2019

What is the current understanding of how to test async methods?

The tl;dr is that it is f***ing complicated. Most scripting / test frameworks are intrinsically procedural. You really need some form of event-driven framework to test an event-driven application. I think I'll be dead before we have canned best practice for this one 🤣

@fikin
Contributor

@fikin fikin commented Oct 15, 2019

I've just poked at LTS (5.3.4) and saw assert use in the Lua tests and the opentests C module.
Terry: is that what you've been running so far? Have you really managed to run opentests on the ESP itself?

@TerryE
Collaborator

@TerryE TerryE commented Oct 15, 2019

@fikin Nikolay, read through #2917, and back through the recent comments in this issue. As I said above, I am using a tweaked version of the Lua Test Suite which I first run on the PC version of NodeMCU. OK, this is built around luac.c and a small subset of libraries (string, table, debug, coroutine, math, utf8, LFS, os and io). It's not a full interactive system, but you can enter the Lua execution VM using the -e option to execute a script (including debug.debug(), which gives you a basic interactive prompt). But this is the NodeMCU Lua code including ROTables, LFS, etc. There is very little LUA_USE_HOST vs. LUA_USE_ESP conditional code in the Lua core itself, and the main changes from standard are really to support this modified Harvard style of running where you can load most of your string data and functions into RO memory.

I am currently debugging some issues in the new LFS code paths unique to the ESP, but once I've punched through this I've got a test suite of 30 LFS images which I will be running on-ESP. Still to do, but in the next day or so.

I don't understand what you mean by opentests (and neither does Google).

@fikin
Contributor

@fikin fikin commented Oct 15, 2019

opentests -> I'm referring to https://www.lua.org/tests/lua-5.3.4-tests.tar.gz, the ltests/ folder, containing C-module tests. Or is that module called something else?

@TerryE
Collaborator

@TerryE TerryE commented Oct 15, 2019

As in app/lua53/host/tests (that's the current version that will go into dev when merged)? You can use git blame to see what I've had to change to get them to run. Not a lot. And yes, I am trying to bootstrap a material subset into an ESP environment at the moment.

@HHHartmann
Member

@HHHartmann HHHartmann commented Oct 18, 2019

I prepared some tests for file based on an extended version of mispec.
I am not really happy with mispec and look at the features of busted with envy.
Even gambiarra.lua seems to be better suited and already has test cases for itself.

Usually I would split up the tests a bit more, but for a rough estimate it should do.
Also the tests are far from complete; file:* is missing and more odd paths should be checked.

If you want to have a look: https://github.com/HHHartmann/nodemcu-firmware/tree/tests/lua_tests
Just run one of the files.

@TerryE Terry, what you suggest is that someone should stand up and implement tests against the two modules, regardless of which framework (or none at all), and then we will adopt that style and everything is OK?
But as you say, rewriting tests (also) takes a lot of time, so I was aiming to first agree on "how" and then start with "what". The Lua tests are very compact, but they are hard to read and will be even harder to adapt to new requirements. So I favor using a test harness to be able to write short and clear tests, even if they take longer to execute.

@TerryE
Collaborator

@TerryE TerryE commented Oct 27, 2019

@marcelstoer @HHHartmann, I've been making progress on porting the Lua Test Suite onto ESP NodeMCU. See for example my gist Rough cut coroutine based provisioning system. I am falling in love with coroutines. Just see in the gist how they pretty much entirely remove all of the callback crap from your application logic. What I am doing is to rsync the LTS on my PC to an Apache2 directory. The sync script will pull down any updated LFS images, and a separate init.lua script cycles through these. It executes really fast. I have a bunch of technical issues that I need to bounce off you guys relating to this suite, but this isn't the issue to have this discussion on. So should I raise this as a separate LUA53-tagged issue?
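
The general shape of the trick, as a sketch (URL and names illustrative, not the gist itself): yield inside a wrapper and resume from the callback, so the calling code reads sequentially.

-- Turn the callback-style http.get into a blocking-looking call.
local function fetch(url)
  local co = coroutine.running()
  http.get(url, nil, function(code, body)
    coroutine.resume(co, code, body)
  end)
  return coroutine.yield()   -- suspends here until the callback fires
end

coroutine.wrap(function()
  local code, body = fetch("http://example.com/lts/suite1.img")
  print("fetched", code, body and #body)
end)()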

PS. Health warning. These gist scripts run on Lua 5.3 so I've been using some of the nice new 5.3 features, but it should be easy to back port them to Lua 5.1 for anyone that's interested.

@HHHartmann
Member

@HHHartmann HHHartmann commented Dec 26, 2019

@TerryE, @nwf , @marcelstoer Terry I would like to see more of your testing environment. Can you show us your init.lua which cycles through the images? What do you do with the outcomes of the tests?

In my opinion test writing and execution have to be as simple as possible, because developers tend to see it as a separate burden to write and execute them. Any extra learning that is needed, or any other complication, will keep them from writing tests.
So I think that it does not make sense to have to learn another language to write tests.
I also don't like that the actual test code in the expect scripts is torn apart by the TCL code. That makes it even harder to understand.

@nwf
Member

@nwf nwf commented Dec 26, 2019

In general, I think I'm OK with something like mispec. To further the discussion, today I will patch up our mispec implementation a bit so that it speaks TAP and then port over some expect code I have for parsing TAP messages. I'll also add TCL procs for pushing and pulling file content to the device under test, probably based on my existing telnetd pwrite and pread implementation.

Fundamentally, though, I think we need something like expect involved in the game. Some modules, notably tls and net, probably require a program running on the host that can easily interact with the device under test. (Earlier, I had proposed that we use two NodeMCU devices, one of them being the AP, but even there we require some synchronization between the two, again probably most readily done with expect. Testing tls in such an environment would be more difficult, though perhaps not impossible, than just relying on whatever ambient network is available.)

file and friends are sufficiently contained that I don't really see the need for a dedicated expect driver there, it was just easy to do; I'm not wedded to TCL being our default test language for all modules.

@HHHartmann
Member

@HHHartmann HHHartmann commented Dec 26, 2019

I just started implementing the tests with mispec as it was the only framework I was aware of at that time.
Looking at gambiarra, which @marcelstoer mentioned above, I would prefer that: it has, IIRC, better structure, self-tests and support for async stuff. So it is on my agenda to rewrite the tests for that.

As for testing net and tls: could it be possible to configure the NodeMCU as AP and station at the same time (I know you can)? But I have no experience with connecting it to itself. It sure would not make the tests any easier, as they would have to handle both sides simultaneously. But maybe coroutines could help here sufficiently.

@nwf
Member

@nwf nwf commented Dec 26, 2019

The broader problem with tls specifically is that we don't have enough RAM on the device to do both sides at once (truthfully, we barely have enough to be a client, and the requirements for being a server are higher still), and so we really need to have the device tethered to a real computer anyway.
I would be very surprised if an ESP could connect to itself as an AP; when it's transmitting it can't be receiving &c. We might be able to use loopback on the device, but I think it's substantially better to have the devices actually speak to a local network (and/or each other).

Besides, if we want to automate anything, we're going to need to parse the stream sent over the UART, so we may as well just bite that bullet now. I've written the beginnings of a test program that just sends many files to the device and then runs the last one with dofile. It expects prefixed TAP-structured replies from the device under test.

So, ultimately, as I've said before, I think we should...

  • Define a NodeMCU test environment, composed of two ESP8266-s (and eventually ESP32-s) and a handful of peripherals and interconnections.
  • Agree that all on-device tests speak TAP or something like it, for ease of host parsing
  • For modules that can be tested on one device (file, gpio, ...), write test programs in Lua that are pushed down and run. There's no real reason to drive these completely from the host, even if we can, in principle.
  • For modules that cannot be tested on one device (net, wifi, tls, ...), write test programs that send commands (and/or files) to one or both devices and/or run commands on the host as well.
  • Have some "orchestration" scripts that can build a firmware image with all the requisite modules, flash it to both devices, and run the above tests.

@HHHartmann
Member

@HHHartmann HHHartmann commented Jan 1, 2020

Besides, if we want to automate anything, we're going to need to parse the stream sent over the UART, so we may as well just bite that bullet now. I've written the beginnings of a test program that just sends many files to the device and then runs the last one with dofile. It expects prefixed TAP-structured replies from the device under test.

That sounds good. Maybe we can combine it with Terry's LFS image generation. We should use LFS because otherwise we will hit memory constraints too early.

* Define a NodeMCU test environment, composed of two ESP8266-s (and eventually ESP32-s) and a handful of peripherals and interconnections.

Agree

* Agree that all on-device tests speak TAP or something like it, for ease of host parsing

Also a good idea. gambiarra supports replacing the output generation, so changing it to TAP should be no problem (if we use it).

* For modules that can be tested on one device (`file`, `gpio`, ...), write test programs in Lua that are pushed down and run.  There's no real reason to drive these completely from the host, even if we can, in principle.

👍

* For modules that cannot be tested on one device (`net`, `wifi`, `tls`, ...), write test programs that send commands (and/or files) to one or both devices and/or run commands on the host as well.

Maybe we can have one device set up as the counterpart for these modules and not change it unless needed for new/changed tests. It could be steered by REST requests from the other device to act as station/AP/tls peer, or be a counterpart for GPIO tests, so orchestration from the host side would not be needed (or not as much).
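
For example, the counterpart device could run a tiny command listener along these lines (the commands and port are invented for illustration):

-- Minimal TCP command listener on the second device.
srv = net.createServer(net.TCP)
srv:listen(8888, function(conn)
  conn:on("receive", function(sck, data)
    if data:match("^be_ap") then
      wifi.setmode(wifi.SOFTAP)
      sck:send("ok\n")
    elseif data:match("^be_sta") then
      wifi.setmode(wifi.STATION)
      sck:send("ok\n")
    else
      sck:send("unknown\n")
    end
  end)
end)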

* Have some "orchestration" scripts that can build a firmware image with all the requisite modules, flash it to both devices, and run the above tests.

That would then only flash one device unless required otherwise. We should have as little change as possible and a stable version for this second device. Usually no changes would be needed, or just Lua updates.

@HHHartmann HHHartmann self-assigned this Jan 13, 2020
@HHHartmann
Member

@HHHartmann HHHartmann commented Jan 21, 2020

Currently I am exploring gambiarra as a test base. If you are interested have another look at PR #2984

@HHHartmann
Member

@HHHartmann HHHartmann commented Feb 3, 2020

@nwf as I read in #3032 you are actively working on tls testing. I almost have a solution for ESP-to-ESP TCP testing. We should agree on a common approach so we don't end up with two working solutions which are not compatible.

@nwf
Member

@nwf nwf commented Feb 3, 2020

All my work is available in https://github.com/nwf/nodemcu-firmware/tree/dev-active/tests (but warning, that branch gets frequently push -f'd over). I've got preliminary test modules for tls and mqtt using expect to drive socat and mosquitto on the host computer. I've also started defining what, I think, our test environment should look like in https://github.com/nwf/nodemcu-firmware/blob/dev-active/tests/NodeMCU_Test_Environment.rst

I don't anticipate any issues scaling my expect-based system to handling two devices, setting up an AP and STA configuration, and doing cross-device connectivity, but I have not had time to do it yet.

@HHHartmann
Member

@HHHartmann HHHartmann commented Jun 28, 2020

Seeing the discussion in #2984:

could we have an agreement to use gambiarra as base for our on-device tests and deprecate mispec?
I already added more functionality to gambiarra for testing failing code and have plans to ease async testing by using coroutines.

@TerryE
Collaborator

@TerryE TerryE commented Jun 28, 2020

could we have an agreement to use gambiarra as base for our on-device tests and deprecate mispec?

I am happy with this, subject to the caveat that the Lua core tests will continue to be based on the Lua test suite. It would be a huge effort to port these.

@nwf
Member

@nwf nwf commented Jun 28, 2020

I think leaving the Lua core tests alone is fine, and I think I can be happy with gambiarra.

@HHHartmann
Member

@HHHartmann HHHartmann commented Nov 26, 2020

Since we have a test system (NTest) now, @nwf is on the way to testing hardware components, and the move to the new GitHub Actions is underway, the next step would be to automatically run tests on Nathaniel's test board in the course of CI builds.

Doing this seems not too hard, since GitHub now allows self-hosted runners for Actions.

As I read it, only the runner has to be installed; the needed steps can then be defined in the .yml file:
5 lines for downloading the firmware and the Lua sources + luac.cross (or a ready-made LFS image),
2 lines for running the test suite script.

The only caveat is that it is not recommended for public repositories due to security implications.

See https://docs.github.com/en/free-pro-team@latest/actions/hosting-your-own-runners
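
A hypothetical workflow along those lines; every path and script name below is an assumption, not an existing setup:

# .github/workflows/on-device-tests.yml -- sketch only
name: on-device-tests
on: [pull_request]
jobs:
  run-on-board:
    runs-on: self-hosted          # the runner attached to the test board
    steps:
      - uses: actions/checkout@v2
      - name: Download firmware, Lua sources and luac.cross
        run: ./tests/fetch_artifacts.sh
      - name: Flash the device and run the test suite
        run: ./tests/run_suite.sh /dev/ttyUSB0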
