Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SDK-576 Design: Dfx testing story #354

Closed
wants to merge 14 commits into from
Closed

SDK-576 Design: Dfx testing story #354

wants to merge 14 commits into from

Conversation

matthewhammer
Copy link
Contributor

@matthewhammer matthewhammer commented Feb 3, 2020

Discussion about what we mean by dfx test.

This PR:

  • Defines end-to-end (e2e) testing, with a defined, limited scoped.
  • Defines functional testing, with a defined, limited scope.
  • Defines unit testing, with a defined, limited scope.

We focus on Motoko here, but not especially so; should work more generally.

Many open things left to nail down, especially around the color of the bike sheds. Discussion is very welcome.

@matthewhammer matthewhammer requested a review from a team as a code owner February 3, 2020 22:40
@chenyan-dfinity
Copy link
Collaborator

chenyan-dfinity commented Feb 3, 2020

A meta level question: How is build different from test? To me, it's all about running a script and inspecting the return code. If we are designing a DSL, we should aim to solve both processes, and possibly depolyment in one shot.

@hansl
Copy link
Contributor

hansl commented Feb 3, 2020

To me, it's all about running a script and inspecting the return code.

You are welcome to wrap moc around a script and run that. You don't even need to know where it is, use $(dfx cache show)/moc.

The goal of dfx is also as a general project management tool. Testing and Building have very different use cases.

@matthewhammer
Copy link
Contributor Author

Re: building versus testing. In the interests of reaching agreement, I want to avoid talking about building here, which seems separable in phase, but not totally independent, of course.

To me, testing is about engaging with the replica and seeing what some previously-built wasm code does on it dynamically; building is about obtaining that wasm, and that happens earlier, before testing. It seems separable to me in that sense (distinct phases, in the nix sense).

Of course, nothing is really separate from anything in terms of "functional dependencies" (think nix again), and I think those dependencies are what Yan is alluding to above in the DSL idea (maybe?).

While I think that we do want the beautiful, general solution eventually, a more modest proposal like this may help us understand what we want in detail without having to first build the most general thing we can imagine (which is a task I'd like to avoid in the short term). This proposal can be yet another midway point to that super-general paradise in the sky that we all want to reach, some day. Let's just be more modest in the short term. :)

Finally, FWIW, I also have no appetite (now or ever) to adopt whatever technology Google thinks is best for its engineering culture. My general feeling is that Google's preferences probably don't translate well for us, given how different they are from us as an org, especially around building and testing code. We have small repos with small numbers of devs; they have the opposite. They over-engineer everything because they have "too many" engineers. That's not our problem.


```
<dfx-result-pattern> ::= bind <test-script-var-id>
| match <candid-value-pattern>
Copy link
Contributor Author

@matthewhammer matthewhammer Feb 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we agree that match is the form we want, we can drop the less general bind form and have that each "dfx result pattern" is the same as a candid value pattern.

I think that makes the most sense, personally. I only included both cases to point out that the simpler (first) case is also useful on its own.

Copy link
Contributor

@nikclayton-dfinity nikclayton-dfinity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hope the feedback's helpful.

docs/design/dfx-testing-story.adoc Outdated Show resolved Hide resolved
docs/design/dfx-testing-story.adoc Outdated Show resolved Hide resolved
docs/design/dfx-testing-story.adoc Show resolved Hide resolved
docs/design/dfx-testing-story.adoc Show resolved Hide resolved
docs/design/dfx-testing-story.adoc Show resolved Hide resolved
docs/design/dfx-testing-story.adoc Show resolved Hide resolved
docs/design/dfx-testing-story.adoc Show resolved Hide resolved
docs/design/dfx-testing-story.adoc Outdated Show resolved Hide resolved
docs/design/dfx-testing-story.adoc Show resolved Hide resolved
Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Submitting a non-committal review to indicate that I'm following the conversation here but don't have strongly formed opinions yet. I guess I'd like to see some examples of where developers would put the e2e tests, how'd they'd be written, etc. I can extrapolate from this document but I'd love to see things explicitly stated.

@matthewhammer
Copy link
Contributor Author

@stanleygjones thanks for the note. Per your suggestion, I will add some concrete examples next.

@matthewhammer
Copy link
Contributor Author

I've added an example project tree, and some candidate example syntax for the test scripts.

@stanleygjones PTAL

};
};

//SimpleAdaptonDivByZero.go();
Copy link
Contributor Author

@matthewhammer matthewhammer Feb 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a common class of test: it's a single canister, and a single call (e.g., go() like here.)

So to be clear, this class is so common that I don't even want to write a script for any of them. The idea is that mentioning them in dfx.json is enough to get the test system to call their go function (or whatever we call it).

I call each a "functional test" since each one is much more than a unit test, but not as complex as a multi-canister test, or one that requires multiple calls. Within one call, we can do a pretty full test of a non-actor library's functionality, as illustrated here.

(Thanks for the idea @nikclayton-dfinity!)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that comment load bearing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the question.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has that comment semantic meaning?

"AliceBob": {
"_comment:": "test-script installs and scripts bots `alice` and `bob`"
"test-script": "test/aliceBob/test-script"
"aliceCanister": "test/aliceBob/aliceCanister.mo"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are those (already defined canisters) necessary? I see the need for a test-script key I just don't see the need to repeat canisters that are already listed in the /canisters key.

Copy link
Contributor Author

@matthewhammer matthewhammer Feb 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the ideas here are that

  • some canisters are just for tests, and are only defined for the tests that require them, and
  • separate name scoping of canister names within e2e tests and their test scripts, but not between distinct e2e tests; they are independent, by assumption. So, distinct e2e tests do not share canister names, and can redefine them to point at different files, with different implementations (I don't do that here, though).

So, the aliceCanister, bobCanister and charlieCanister keys are only defined for the e2e test that defines them. So, they have to be defined if they are reused, as they are here. The idea was to show exactly this kind of sharing, given the restricted scoping of the canister names.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there reasons to scope them? It seems this just causes more copy-paste.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Separate namespaces.

I expect that separate namespaces will generally be more useful than one big shared global one, where you have to "mimic" the separation I want here by introducing long names for everything. In terms of implementation complexity, separate namespaces is no more complex either; there's no implementation-based reason not to support them.

In terms of user value: If I have two test scripts that introduce two cansiters called testBot, implemented in two different directories, it's the filesystem structure that distinguishes them as files; as canisters that appear in test scripts, we need to distinguish them to give them the same short name, like testBot, and know that this same short name is never ambiguous. Separate namespaces (much like the scoping rules of any PL with namespaces/modules) is the answer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems this just causes more copy-paste.

I don't follow this concern.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing some boiler-plate copy-paste; aliceCanister seems to always point to test/aliceBob/aliceCanister.mo, so maybe instead of doing the name => main_file map, we could just list the canisters we want to expose. Maybe if we move those test definitions to their own files so they free up dfx.json would also be better. WDYT?

==> #ok(bobId)
install aliceCanister
==> #ok(aliceId)
call bobId addFriend aliceId
Copy link
Contributor

@hansl hansl Feb 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With inter-canister calls, could you elaborate on your rationale for using a DSL instead of actual Motoko?

In Motoko:

let result = await bob.addFriend(aliceId);
assert(result == Ok); // or some Motoko

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is scoping.

The intention of this Alice canister code is that it does not import (or know about) Bob's code statically, and vice versa. They are "test bots" written independently, and only linked dynamically, by a test script.

I realize that this tiny example could mention those names using aliases; but, the point here was to illustrate a multi-canister test that does not use aliases, and still has canisters interact.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does bobId come from? Is it a local identifier? How is its name derived?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, bobId is a test script variable, bound by the result pattern #ok(bobId). This pattern will not match an error/trap result, only a successful one.

In general, these patterns may have variables, which may appear free in subsequent actions, as illustrated here with bobId.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ha so ==> #ok(abc) is also a declaration for abc? I need to read the rest of this document, sorry ^_^

@hansl hansl changed the title Design: Dfx testing story SDK-576 Design: Dfx testing story Feb 5, 2020
@matthewhammer
Copy link
Contributor Author

Can we figure out what next steps to take here without having a meeting? (That's not a sarcastic question.)

Since the first draft, the latest idea here introduces three classes of tests:

  1. unit tests: one canister does all tests, defined by a language-specific library
  2. functional tests: one test = one special test canister, whose result is binary (either fails or succeeds)
  3. e2e tests: one test is many canisters, whose installation may be complex and require a test-script

If we say that number 1 is a special case of number 2, and say that number 3 is for "later" (TBD), then we can focus on 2 without much more bikeshedding, I think.

The latest threads of discussion between myself and @hansl above are around the exact formal semantics (and perhaps concrete syntax) for the test scripts in item 3. I say we defer resolving those discussions until we have 1 and 2 working in some capacity.

I think this captures the collective sentiment of the standup today. What do others think? (cc @stanleygjones)

@matthewhammer
Copy link
Contributor Author

In parallel with waiting for the discussion to resolve, I will make a first attempt at doing what I said just above, implement dfx test func (and variants), but not dfx test e2e (and variants).

I expect that the unit testing command dfx test unit will be a minor variation of dfx test func, perhaps with some extra assumptions or defaults.

We can wait for the full testing story, including multi-canister e2e tests, until we a working story for simpler tests.

@matthewhammer
Copy link
Contributor Author

Update: I looked into this, but I fear that dfx is not designed or implemented today to make any version of this task very simple or localized.

Today, each dfx "command" makes the assumption that the caller is a shell script or human writing something against the CLI.

That's reasonable in simple use cases, but it is simply not the case for dfx test, where those same dfx commands (start/stop, install, call, etc.) need to be "scriptable" from within ordinary Rust code, and ideally, via ordinary Rust argument and result types (not CLI-based strings).

To meet this need, as well as more automation needs, I think we need a more systematic architecture changes to dfx that separate functions for this behavior (in Rust) from the layer of Rust that processes strings from the CLI-using human or shell script. Ideally, the CLI/string-processing logic would be isolated into a very small part of the dfx source code, and not spread across every single implementation of every command, as it is today. (This plea is not new; I've been saying for some time now.). However, given the state of the source code today, and how long this issue has persisted as we've built out commands for dfx, separating the string-processsing logic from abstract behavior now is dramatic, and touches everything. This is unfortunate.

However, as things stand now, it seems like wasted effort to implement anything for dfx test until we have fixed the architecture so that this code that be written as ordinary Rust code, using ordinary Rust modules and ordinary function types (not code that "introduces" arguments to be processed by other Rust string-processing; yuck). But building on what we have now would mean building dfx test as (morally) a bunch of "shell scripts in Rust", which seems backwards, and hence, like effort that is creating 100% technical debt as output.

Even so, I assume that this separation I am attempting to advocate will also be controversial.

In any case, it is involved. It is more than a simple PR, and has a global scope, well beyond any one JIRA story. I assume that these facts are part of the reason we can keep avoiding this alternative course, or even discussing it earlier. Nevertheless, I'd like to discuss it before seeing dfx test get implemented based on what we have now.

@matthewhammer
Copy link
Contributor Author

matthewhammer commented Apr 22, 2020

Update: @nomeata's demo (from Global R&D) today subsumes many of the ideas from this PR and actually implements them. It targets the ic-ref framework. It is independent of dfx, for better or for worse.

Here's a little screenshot from the demo, for posterity:
image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants