SDK-576 Design: Dfx testing story #354

matthewhammer · 2020-02-03T22:40:49Z

Discussion about what we mean by dfx test.

This PR:

Defines end-to-end (e2e) testing, with a defined, limited scope.
Defines functional testing, with a defined, limited scope.
Defines unit testing, with a defined, limited scope.

We focus on Motoko here, but not especially so; should work more generally.

Many open things left to nail down, especially around the color of the bike sheds. Discussion is very welcome.

docs/design/dfx-testing-story.adoc

chenyan-dfinity · 2020-02-03T22:49:52Z

A meta-level question: How is build different from test? To me, it's all about running a script and inspecting the return code. If we are designing a DSL, we should aim to solve both processes, and possibly deployment, in one shot.

hansl · 2020-02-03T22:56:17Z

> To me, it's all about running a script and inspecting the return code.

You are welcome to wrap moc around a script and run that. You don't even need to know where it is, use $(dfx cache show)/moc.

The goal of dfx is also as a general project management tool. Testing and Building have very different use cases.

matthewhammer · 2020-02-03T23:08:48Z

Re: building versus testing. In the interests of reaching agreement, I want to avoid talking about building here, which seems separable in phase, but not totally independent, of course.

To me, testing is about engaging with the replica and seeing what some previously-built wasm code does on it dynamically; building is about obtaining that wasm, and that happens earlier, before testing. It seems separable to me in that sense (distinct phases, in the nix sense).

Of course, nothing is really separate from anything in terms of "functional dependencies" (think nix again), and I think those dependencies are what Yan is alluding to above in the DSL idea (maybe?).

While I think that we do want the beautiful, general solution eventually, a more modest proposal like this may help us understand what we want in detail without having to first build the most general thing we can imagine (which is a task I'd like to avoid in the short term). This proposal can be yet another midway point to that super-general paradise in the sky that we all want to reach, some day. Let's just be more modest in the short term. :)

Finally, FWIW, I also have no appetite (now or ever) to adopt whatever technology Google thinks is best for its engineering culture. My general feeling is that Google's preferences probably don't translate well for us, given how different they are from us as an org, especially around building and testing code. We have small repos with small numbers of devs; they have the opposite. They over-engineer everything because they have "too many" engineers. That's not our problem.

matthewhammer · 2020-02-03T23:14:26Z

docs/design/dfx-testing-story.adoc

+
+```
+<dfx-result-pattern> ::= bind <test-script-var-id>
+                      | match <candid-value-pattern>


If we agree that match is the form we want, we can drop the less general bind form and say that each "dfx result pattern" is simply a candid value pattern.

I think that makes the most sense, personally. I only included both cases to point out that the simpler (first) case is also useful on its own.
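To illustrate why the bind form is subsumed by match, here is a minimal sketch in Rust. It assumes a toy `Value` type standing in for Candid values and a `Pattern` type whose `Var` case plays the role of bind; all of these names are hypothetical and are not part of dfx or Candid.

```rust
use std::collections::HashMap;

/// A tiny stand-in for Candid values (hypothetical; real Candid is much richer).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Nat(u64),
    Text(String),
    Variant(String, Box<Value>),
}

/// A value pattern. A bare `Var` is exactly the `bind` form.
#[derive(Debug, Clone)]
enum Pattern {
    Var(String),
    Nat(u64),
    Text(String),
    Variant(String, Box<Pattern>),
}

/// Test-script variables bound so far.
type Env = HashMap<String, Value>;

/// Match `value` against `pattern`, recording variable bindings in `env`.
/// Returns false on a mismatch (env may then hold partial bindings).
fn matches(pattern: &Pattern, value: &Value, env: &mut Env) -> bool {
    match (pattern, value) {
        // A variable matches anything and binds it.
        (Pattern::Var(name), v) => {
            env.insert(name.clone(), v.clone());
            true
        }
        (Pattern::Nat(a), Value::Nat(b)) => a == b,
        (Pattern::Text(a), Value::Text(b)) => a == b,
        // Variants match when the tags agree and the payloads match.
        (Pattern::Variant(tag_p, inner_p), Value::Variant(tag_v, inner_v)) => {
            tag_p == tag_v && matches(inner_p, inner_v, env)
        }
        _ => false,
    }
}
```

Under these assumptions, a pattern such as `#ok(bobId)` corresponds to `Pattern::Variant("ok", Var("bobId"))`, while the bind form is just `Pattern::Var(..)` at the top level, so a single pattern type covers both cases.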

nikclayton-dfinity

Hope the feedback's helpful.

docs/design/dfx-testing-story.adoc

ghost

Submitting a non-committal review to indicate that I'm following the conversation here but don't have strongly formed opinions yet. I guess I'd like to see some examples of where developers would put the e2e tests, how they'd be written, etc. I can extrapolate from this document but I'd love to see things explicitly stated.

matthewhammer · 2020-02-04T17:41:36Z

@stanleygjones thanks for the note. Per your suggestion, I will add some concrete examples next.

matthewhammer · 2020-02-04T21:47:37Z

I've added an example project tree, and some candidate example syntax for the test scripts.

@stanleygjones PTAL

docs/design/dfx-testing-story-example/dfx.json

matthewhammer · 2020-02-04T21:53:32Z

docs/design/dfx-testing-story-example/test/simpleAdaptonDivByZero.mo

+  };
+};
+
+//SimpleAdaptonDivByZero.go();


This is a common class of test: it's a single canister, and a single call (e.g., go() like here.)

So to be clear, this class is so common that I don't even want to write a script for any of them. The idea is that mentioning them in dfx.json is enough to get the test system to call their go function (or whatever we call it).

I call each a "functional test" since each one is much more than a unit test, but not as complex as a multi-canister test, or one that requires multiple calls. Within one call, we can do a pretty full test of a non-actor library's functionality, as illustrated here.

(Thanks for the idea @nikclayton-dfinity!)
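As a rough sketch of what "install one canister, make one call, report pass or fail" could look like, the following Rust models a functional test against an abstract replica. The `Replica` trait, `FakeReplica`, and `run_functional_test` are hypothetical names invented for illustration; they do not correspond to any existing dfx API.

```rust
/// Binary outcome of a functional test: it either passes or fails.
#[derive(Debug, PartialEq)]
enum Outcome {
    Passed,
    Failed(String),
}

/// The minimal interface a test runner needs from a replica (an assumption).
trait Replica {
    fn install(&mut self, wasm_path: &str) -> Result<String, String>;
    fn call(&mut self, canister_id: &str, method: &str) -> Result<(), String>;
}

/// Install a single canister and make a single call to its entry point.
fn run_functional_test<R: Replica>(replica: &mut R, wasm_path: &str, entry: &str) -> Outcome {
    let canister_id = match replica.install(wasm_path) {
        Ok(id) => id,
        Err(e) => return Outcome::Failed(format!("install failed: {}", e)),
    };
    match replica.call(&canister_id, entry) {
        Ok(()) => Outcome::Passed,
        Err(trap) => Outcome::Failed(format!("{} trapped: {}", entry, trap)),
    }
}

/// Fake replica for illustration only: canisters whose path mentions
/// "alwaysTraps" trap on every call; all others succeed.
struct FakeReplica;

impl Replica for FakeReplica {
    fn install(&mut self, wasm_path: &str) -> Result<String, String> {
        Ok(format!("canister-for-{}", wasm_path))
    }

    fn call(&mut self, canister_id: &str, _method: &str) -> Result<(), String> {
        if canister_id.contains("alwaysTraps") {
            Err("simulated trap".to_string())
        } else {
            Ok(())
        }
    }
}
```

With this shape, listing a test in dfx.json would only need to supply the canister and, optionally, the entry point name; the runner does the rest.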

Is that comment load bearing?

I don't understand the question.

Has that comment semantic meaning?

hansl · 2020-02-04T22:12:10Z

docs/design/dfx-testing-story-example/dfx.json

+                "AliceBob": {
+                    "_comment:": "test-script installs and scripts bots `alice` and `bob`"
+                    "test-script": "test/aliceBob/test-script"
+                    "aliceCanister": "test/aliceBob/aliceCanister.mo"


Are those (already defined canisters) necessary? I see the need for a test-script key I just don't see the need to repeat canisters that are already listed in the /canisters key.

the ideas here are that

some canisters are just for tests, and are only defined for the tests that require them, and

separate name scoping of canister names within e2e tests and their test scripts, but not between distinct e2e tests; they are independent, by assumption. So, distinct e2e tests do not share canister names, and can redefine them to point at different files, with different implementations (I don't do that here, though).

So, the aliceCanister, bobCanister and charlieCanister keys are only defined for the e2e test that defines them. So, they have to be defined if they are reused, as they are here. The idea was to show exactly this kind of sharing, given the restricted scoping of the canister names.

Are there reasons to scope them? It seems this just causes more copy-paste.

Separate namespaces.

I expect that separate namespaces will generally be more useful than one big shared global one, where you have to "mimic" the separation I want here by introducing long names for everything. In terms of implementation complexity, separate namespaces is no more complex either; there's no implementation-based reason not to support them.

In terms of user value: If I have two test scripts that introduce two canisters called testBot, implemented in two different directories, it's the filesystem structure that distinguishes them as files; as canisters that appear in test scripts, we need to distinguish them to give them the same short name, like testBot, and know that this same short name is never ambiguous. Separate namespaces (much like the scoping rules of any PL with namespaces/modules) is the answer.
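The scoping rule described here can be sketched as a two-level map: the outer key is the e2e test, the inner key is the short canister name. This is only an illustration in Rust; `E2eTests`, `define`, `resolve`, and the example paths are hypothetical.

```rust
use std::collections::HashMap;

/// Canister short name -> source file, local to one e2e test.
type Namespace = HashMap<String, String>;

/// All e2e tests, each with its own independent canister namespace.
struct E2eTests {
    tests: HashMap<String, Namespace>,
}

impl E2eTests {
    fn new() -> Self {
        E2eTests { tests: HashMap::new() }
    }

    /// Define (or redefine) a canister name within a single test's scope.
    fn define(&mut self, test: &str, canister: &str, path: &str) {
        self.tests
            .entry(test.to_string())
            .or_default()
            .insert(canister.to_string(), path.to_string());
    }

    /// Resolve a short name; only the named test's own definitions are visible.
    fn resolve(&self, test: &str, canister: &str) -> Option<&str> {
        self.tests.get(test)?.get(canister).map(String::as_str)
    }
}
```

Two tests can then both use the short name `testBot` for different files without ambiguity, and a name defined in one test is simply absent from the other.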

> It seems this just causes more copy-paste.

I don't follow this concern.

Removing some boilerplate copy-paste; aliceCanister seems to always point to test/aliceBob/aliceCanister.mo, so maybe instead of doing the name => main_file map, we could just list the canisters we want to expose. It might also be better to move those test definitions into their own files, which would free up dfx.json. WDYT?

hansl · 2020-02-04T22:13:39Z

docs/design/dfx-testing-story-example/test/aliceBob/test-script

+==> #ok(bobId)
+install aliceCanister
+==> #ok(aliceId)
+call bobId addFriend aliceId


With inter-canister calls, could you elaborate on your rationale for using a DSL instead of actual Motoko?

In Motoko:

let result = await bob.addFriend(aliceId); assert(result == Ok); // or some Motoko

The issue is scoping.

The intention of this Alice canister code is that it does not import (or know about) Bob's code statically, and vice versa. They are "test bots" written independently, and only linked dynamically, by a test script.

I realize that this tiny example could mention those names using aliases; but, the point here was to illustrate a multi-canister test that does not use aliases, and still has canisters interact.

Where does bobId come from? Is it a local identifier? How is its name derived?

Yes, bobId is a test script variable, bound by the result pattern #ok(bobId). This pattern will not match an error/trap result, only a successful one.

In general, these patterns may have variables, which may appear free in subsequent actions, as illustrated here with bobId.

Ha so ==> #ok(abc) is also a declaration for abc? I need to read the rest of this document, sorry ^_^
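The way result patterns declare variables that later actions use can be sketched as a small interpreter loop. The Rust below is an illustration under simplifying assumptions: every install succeeds, ids are made up by a fake replica, and the `Action` type and `run` function are hypothetical rather than a proposed implementation.

```rust
use std::collections::HashMap;

/// One step of a hypothetical test script.
enum Action {
    /// `install <canister>` followed by `==> #ok(<bind>)`.
    Install { canister: String, bind: String },
    /// `call <target> <method> <arg>`, where target and arg are script variables.
    Call { target: String, method: String, arg: String },
}

/// Run a script against a fake replica that invents canister ids.
/// Returns the calls made, with each variable replaced by the id bound to it.
fn run(script: &[Action]) -> Result<Vec<String>, String> {
    let mut env: HashMap<String, String> = HashMap::new();
    let mut next_id = 0;
    let mut calls = Vec::new();
    for action in script {
        match action {
            Action::Install { canister, bind } => {
                // Assume installation succeeds, so the `#ok(var)` pattern
                // matches and the variable becomes bound from here on.
                env.insert(bind.clone(), format!("{}-{}", canister, next_id));
                next_id += 1;
            }
            Action::Call { target, method, arg } => {
                // Variables must have been bound by an earlier result pattern.
                let lookup = |var: &String| -> Result<String, String> {
                    env.get(var)
                        .cloned()
                        .ok_or_else(|| format!("unbound variable: {}", var))
                };
                calls.push(format!("{}.{}({})", lookup(target)?, method, lookup(arg)?));
            }
        }
    }
    Ok(calls)
}
```

In this sketch, using `bobId` before any `==> #ok(bobId)` has bound it is reported as an error, which mirrors ordinary lexical scoping in a programming language.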

matthewhammer · 2020-02-05T23:04:07Z

Can we figure out what next steps to take here without having a meeting? (That's not a sarcastic question.)

Since the first draft, the latest idea here introduces three classes of tests:

1. unit tests: one canister does all tests, defined by a language-specific library
2. functional tests: one test = one special test canister, whose result is binary (either fails or succeeds)
3. e2e tests: one test is many canisters, whose installation may be complex and require a test-script

If we say that number 1 is a special case of number 2, and say that number 3 is for "later" (TBD), then we can focus on 2 without much more bikeshedding, I think.

The latest threads of discussion between myself and @hansl above are around the exact formal semantics (and perhaps concrete syntax) for the test scripts in item 3. I say we defer resolving those discussions until we have 1 and 2 working in some capacity.

I think this captures the collective sentiment of the standup today. What do others think? (cc @stanleygjones)

matthewhammer · 2020-02-05T23:09:36Z

In parallel with waiting for the discussion to resolve, I will make a first attempt at doing what I said just above, implement dfx test func (and variants), but not dfx test e2e (and variants).

I expect that the unit testing command dfx test unit will be a minor variation of dfx test func, perhaps with some extra assumptions or defaults.

We can wait for the full testing story, including multi-canister e2e tests, until we have a working story for simpler tests.

matthewhammer · 2020-02-11T17:06:29Z

Update: I looked into this, but I fear that dfx is not designed or implemented today to make any version of this task very simple or localized.

Today, each dfx "command" makes the assumption that the caller is a shell script or human writing something against the CLI.

That's reasonable in simple use cases, but it is simply not the case for dfx test, where those same dfx commands (start/stop, install, call, etc.) need to be "scriptable" from within ordinary Rust code, and ideally, via ordinary Rust argument and result types (not CLI-based strings).

To meet this need, as well as more automation needs, I think we need more systematic architecture changes to dfx that separate the functions for this behavior (in Rust) from the layer of Rust that processes strings from the CLI-using human or shell script. Ideally, the CLI/string-processing logic would be isolated into a very small part of the dfx source code, and not spread across every single implementation of every command, as it is today. (This plea is not new; I've been saying this for some time now.) However, given the state of the source code today, and how long this issue has persisted as we've built out commands for dfx, separating the string-processing logic from abstract behavior now is a dramatic change that touches everything. This is unfortunate.

However, as things stand now, it seems like wasted effort to implement anything for dfx test until we have fixed the architecture so that this code can be written as ordinary Rust code, using ordinary Rust modules and ordinary function types (not code that "introduces" arguments to be processed by other Rust string-processing; yuck). Building on what we have now would mean building dfx test as (morally) a bunch of "shell scripts in Rust", which seems backwards, and hence like effort that creates 100% technical debt as output.
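A minimal sketch of the separation being argued for, in Rust: the core operation takes and returns ordinary typed values, and a thin layer is the only place that parses CLI strings. The names `InstallArgs`, `CanisterId`, `install`, `parse_install`, and `cli_install` are hypothetical and do not reflect the actual dfx source; the body of `install` is a placeholder.

```rust
/// Typed arguments for a hypothetical `install` operation.
#[derive(Debug, PartialEq)]
struct InstallArgs {
    canister_name: String,
}

/// Typed result of a hypothetical `install` operation.
#[derive(Debug, PartialEq)]
struct CanisterId(String);

/// Core behavior: ordinary Rust types in and out, no string parsing.
/// Code such as a test runner would call this directly.
fn install(args: &InstallArgs) -> Result<CanisterId, String> {
    if args.canister_name.is_empty() {
        return Err("empty canister name".to_string());
    }
    // Placeholder for the real work of talking to a replica.
    Ok(CanisterId(format!("id-of-{}", args.canister_name)))
}

/// Thin CLI layer: the only place where raw argument strings are interpreted.
fn parse_install(argv: &[&str]) -> Result<InstallArgs, String> {
    match argv {
        ["install", name] => Ok(InstallArgs { canister_name: name.to_string() }),
        _ => Err("usage: install <canister>".to_string()),
    }
}

/// What a CLI entry point would do: parse, delegate, then render the result.
fn cli_install(argv: &[&str]) -> Result<String, String> {
    let args = parse_install(argv)?;
    install(&args).map(|CanisterId(id)| id)
}
```

The point of the split is that a test runner can call `install` with an `InstallArgs` value and inspect a `CanisterId`, without constructing or parsing any command-line strings.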

Even so, I assume that this separation I am attempting to advocate will also be controversial.

In any case, it is involved. It is more than a simple PR, and has a global scope, well beyond any one JIRA story. I assume that these facts are part of the reason we keep avoiding this alternative course, and have avoided even discussing it until now. Nevertheless, I'd like to discuss it before seeing dfx test get implemented based on what we have now.

matthewhammer · 2020-04-22T15:21:03Z

Update: @nomeata's demo (from Global R&D) today subsumes many of the ideas from this PR and actually implements them. It targets the ic-ref framework. It is independent of dfx, for better or for worse.

Here's a little screenshot from the demo, for posterity:

initial draft

ad0cb70

matthewhammer requested a review from a team as a code owner February 3, 2020 22:40

matthewhammer added 2 commits February 3, 2020 15:43

nits

51cea0a

nits

01c5858

hansl reviewed Feb 3, 2020

matthewhammer mentioned this pull request Feb 3, 2020

Draft: Simplify DFX behavior #350

Closed

matthewhammer added 2 commits February 3, 2020 15:53

nits

7483a22

spelling

40c45b2

matthewhammer commented Feb 3, 2020

write down loose ends

b137784

nikclayton-dfinity reviewed Feb 4, 2020

ghost reviewed Feb 4, 2020

add examples of the design

6abc559

matthewhammer commented Feb 4, 2020

add example functional tests; each is just a single canister, and a single call

caa802b

matthewhammer commented Feb 4, 2020

matthewhammer added 2 commits February 4, 2020 14:58

fix typos

2ad18e0

JSON canister names agree with those used in test-script files

c3a36d2

hansl reviewed Feb 4, 2020

matthewhammer added 4 commits February 4, 2020 15:15

clean up definitions for test categories

10df321

nit

58b05fa

add (empty) example files so that the functional tests make more sense

315a202

add new files to tree listed in doc

672ace7

hansl changed the title ~~Design: Dfx testing story~~ SDK-576 Design: Dfx testing story Feb 5, 2020

hansl mentioned this pull request Mar 20, 2020

A developer should be able to run tests with dfx test #473

Closed

matthewhammer closed this Apr 10, 2020

matthewhammer deleted the dfx-test branch April 10, 2020 16:07
