Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ic-ref-test: Test stopping state #63

Merged
merged 6 commits into from Dec 15, 2021
Merged

ic-ref-test: Test stopping state #63

merged 6 commits into from Dec 15, 2021

Conversation

nomeata
Copy link
Contributor

@nomeata nomeata commented Dec 4, 2021

This allows the test driver to withhold the response to a message, and
control when they are released, in order to produce situations with
outstanding call contexts.

In an ideal world (from our pov), we could instrument and control the
system's scheduler this way, but we can't. So instead, we use some tricks.
Ideally, the details of this trick are irrelevant to the users of this
function (yay, abstraction), and if we find better tricks, we can swap them
out easily. We'll see if that holds water.

One problem with this approach is that a test failure could mean that the
system doesn't pass the test, but it could also mean that the system has a
bug that prevents this trick from working, so take care.

The current trick is: Create a canister (the "stopper"). Make it its own
controller. Tell the canister to stop itself. This call will now hang,
because a canister cannot stop itself. We can release the call (producing a
reject) by starting the canister again.

Some tests are added, and others are now using this mechanism.

This has uncovered bugs in ic-ref related to the handling of stopped.

@nomeata
Copy link
Contributor Author

nomeata commented Dec 4, 2021

This is a draft because I found some problems with the IC.Ref code, related to stopping canisters and the handlng of response messages to stopped/empty/deleted canisters, which should not cause rejects, but be treated like traps. This might even be a bug in the spec itself, didn't check yet, and running out of time now. Will pick this up somem other time.

This allows the test driver to withhold the response to a message, and
control when they are released, in order to produce situations with
outstanding call contexts.

In an ideal world (from our pov), we could instrument and control the
system's scheduler this way, but we can't. So instead, we use some tricks.
Ideally, the details of this trick are irrelevant to the users of this
function (yay, abstraction), and if we find better tricks, we can swap them
out easily. We'll see if that holds water.

One problem with this approach is that a test failure could mean that the
system doesn't pass the test, but it could also mean that the system has a
bug that prevents this trick from working, so take care.

The current trick is: Create a canister (the "stopper"). Make it its own
controller. Tell the canister to stop itself. This call will now hang,
because a canister cannot stop itself. We can release the call (producing a
reject) by starting the canister again.

This has uncovered bugs in `ic-ref` related to the handling of stopped.
@nomeata
Copy link
Contributor Author

nomeata commented Dec 4, 2021

Ok, I patched over the most egregious errors in the reference implementation, so this test now passes.

@marcin-dziadus, do you want to check if it works with the replica as well before we merge it.

We can add more tests related to outstanding messages later. In particular I would like to test upgrading with outstanding messages, or callbacks to deleted canisters etc. This is something that we don't see a lot in the wild, is not tested here a lot, and maybe not so much in the replica's tests, so it's a bit of blind spot.

I think we can make the universal canister cope with callbacks after upgrades, or after deleting/reinstallation, by serializing the callback table.

@nomeata nomeata marked this pull request as ready for review December 4, 2021 18:33
@nomeata
Copy link
Contributor Author

nomeata commented Dec 4, 2021

(this PR was created while traveling in Ghana, on a mobile phone, with external keyboard, via mosh on a rented server. Also a way to work…)

@marcin-dziadus
Copy link
Contributor

Ok, I patched over the most egregious errors in the reference implementation, so this test now passes.

@marcin-dziadus, do you want to check if it works with the replica as well before we merge it.

We can add more tests related to outstanding messages later. In particular I would like to test upgrading with outstanding messages, or callbacks to deleted canisters etc. This is something that we don't see a lot in the wild, is not tested here a lot, and maybe not so much in the replica's tests, so it's a bit of blind spot.

I think we can make the universal canister cope with callbacks after upgrades, or after deleting/reinstallation, by serializing the callback table.

Sorry for the delay, I've been offline for a while.
I'm afraid that it wouldn't work with the replica, but this PR is not to blame here - the reference implementation diverged slightly from the replica. I'd suggest moving forward with the PR. Verifying that everything works after the replica catches up with the ref implementation will be on my plate. @nomeata What do you think?

@nomeata
Copy link
Contributor Author

nomeata commented Dec 15, 2021

Depends:

if we think the tests are correct with regard to the spec, and the replica isn't quite in compliance, then we should merge this, use the feature in the replica's CI to mark these tests as known-to-fail, and fix eventually in replica.

But if they fail in the replica because the tests are not good (maybe they need to wait a bit in some place or another?) we should fix the tests.

How do the tests fail against the replica?

@marcin-dziadus
Copy link
Contributor

Depends:

if we think the tests are correct with regard to the spec, and the replica isn't quite in compliance, then we should merge this, use the feature in the replica's CI to mark these tests as known-to-fail, and fix eventually in replica.

But if they fail in the replica because the tests are not good (maybe they need to wait a bit in some place or another?) we should fix the tests.

How do the tests fail against the replica?

Thanks for the clarification! Performance counting is currently uncovered in the replica which causes the canister parsing to fail.

@marcin-dziadus
Copy link
Contributor

Depends:
if we think the tests are correct with regard to the spec, and the replica isn't quite in compliance, then we should merge this, use the feature in the replica's CI to mark these tests as known-to-fail, and fix eventually in replica.
But if they fail in the replica because the tests are not good (maybe they need to wait a bit in some place or another?) we should fix the tests.
How do the tests fail against the replica?

Thanks for the clarification! Performance counting is currently uncovered in the replica which causes the canister parsing to fail.

I think it falls under the first type of situations you described.

@nomeata
Copy link
Contributor Author

nomeata commented Dec 15, 2021

But this PR is independent of performance counters, isn't it? I'm a bit confused now :-)

@marcin-dziadus
Copy link
Contributor

marcin-dziadus commented Dec 15, 2021

But this PR is independent of performance counters, isn't it? I'm a bit confused now :-)

It is :) My point is that running the most recent version of ic-ref against the replica is doomed to fail at the moment (because of a reason unrelated to your PR). Does that make sense?

Technically it is possible to check what you asked for, but not without a hassle.

@nomeata
Copy link
Contributor Author

nomeata commented Dec 15, 2021

Oh, I see. But it’s kinda unforunate to be blocked here. And we don’t need to: The script that runs ic-ref-test against the replica has provisions to exclude individual tests, see
https://github.com/dfinity/ic/blob/master/tests/ic-ref-test/run#L107-L135

So it should be possible to bump ic-ref-test in the replica’s CI to benefit from new tests even if some are not passing, by adding them to that file.

Anyways, sounds like we can merge this

@nomeata nomeata merged commit 22f0fff into master Dec 15, 2021
@nomeata nomeata deleted the joachim/stopping branch December 15, 2021 19:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants