
Session Bin Test Fixes #296

Merged
merged 29 commits into from
Dec 13, 2021
Conversation

CMCDragonkai
Member

@CMCDragonkai commented Nov 26, 2021

Description

Fixing sessions bin tests.

Tasks

  1. Delete the parallel tests and track them in a new issue
  2. Remove nested describes
  3. Remove meta?: Metadata and replace it with auth, see Session Bin Test Fixes #296 (comment)
  4. Remove unnecessary process.stdout.write calls, see Session Bin Test Fixes #296 (comment)
  5. Move process* options to before the try block that establishes the PolykeyClient
  6. Fix up all other tests that have been affected by the changes in this PR

Final checklist

  • Domain specific tests
  • Full tests
  • Updated inline-comment documentation
  • Lint fixed
  • Squash and rebased
  • Sanity check the final build

@CMCDragonkai changed the title from Sessions bin test fixes to Session Bin Test Fixes on Nov 26, 2021
@CMCDragonkai
Member Author

CMCDragonkai commented Nov 26, 2021

@emmacasolin the nested describes haven't been removed yet. You should also be removing all the nested describes.

@emmacasolin
Contributor

This is ready for review/merge. Follow up on the concurrent tests will be done in MatrixAI/Polykey-CLI#18

@CMCDragonkai
Member Author

Can you rebase this on top of the BIP 39 PR?

@emmacasolin
Contributor

Can you rebase this on top of the BIP 39 PR?

Done

@CMCDragonkai
Member Author

Needs squash and rebase on master.

@CMCDragonkai
Member Author

This rebase wasn't done correctly. You should have squashed before rebasing. There are several merge conflicts now. @emmacasolin

@CMCDragonkai
Member Author

Rebasing now and resolving conflicts.

@CMCDragonkai force-pushed the bin-test-fixes branch 2 times, most recently from 1230250 to 40af8eb on December 7, 2021 01:12
@CMCDragonkai
Member Author

Ok fully rebased. Now there's only 1 file change.

@CMCDragonkai
Member Author

The tests/bin/sessions.test.ts is a bit strange because there's no independent command called pk sessions. So instead of refactoring it like the other commands, I'll keep it as tests/bin/sessions.test.ts and just use the pk agent status command to test the entire CARL loop.

@CMCDragonkai
Member Author

To finish the session testing properly I need:

  • tests/bin/agent/status.test.ts - we will be using the status command
  • tests/bin/agent/stop.test.ts - we will be using the stop command
  • tests/bin/sessions.test.ts - we will be using the global pkAgent here

This also means I'll be trying out https://homoly.me/posts/organizing-tests-with-jest-projects to make tests/bin a separate "project" that enforces running in band (https://jestjs.io/docs/cli#--runinband) so the same polykey agent can be shared. This will ensure that multiple tests that all use the same polykey agent don't try to run at the same time, which saves time even though those tests are not parallelised, since we don't pay the 10 second startup cost every time. Alternatively, we could use IPC fd locking in beforeAll and afterAll to ensure that certain test modules are serialised when using the global pk agent. I think the timeouts don't apply to beforeAll and afterAll @joshuakarp?
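
As a rough illustration of that project split (the project names and glob patterns below are assumptions, not this repo's actual config), the Jest configuration could look something like this:

    // jest.config.ts - a minimal sketch of splitting tests/bin into its own project;
    // the "main"/"bin" names and patterns are illustrative only
    import type { Config } from '@jest/types';

    const config: Config.InitialOptions = {
      projects: [
        {
          displayName: 'main',
          testMatch: ['<rootDir>/tests/**/*.test.ts'],
          testPathIgnorePatterns: ['<rootDir>/tests/bin/'],
        },
        {
          displayName: 'bin',
          testMatch: ['<rootDir>/tests/bin/**/*.test.ts'],
        },
      ],
    };

    export default config;

The bin project could then be run serially with something like jest --selectProjects bin --runInBand (assuming a Jest version that supports --selectProjects).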

@CMCDragonkai
Member Author

The agentStop handler was incorrect. This is the right way:

    agentStop: async (
      call: grpc.ServerUnaryCall<utilsPB.EmptyMessage, utilsPB.EmptyMessage>,
      callback: grpc.sendUnaryData<utilsPB.EmptyMessage>,
    ): Promise<void> => {
      try {
        const metadata = await authenticate(call.metadata);
        call.sendMetadata(metadata);
        const response = new utilsPB.EmptyMessage();
        // Respond first to close the GRPC connection
        callback(null, response);
      } catch (err) {
        callback(grpcUtils.fromError(err), null);
      }
      // Stop is called after GRPC resources are cleared
      await polykeyAgent.stop();
    },

This also means that when stopping the pk agent, one must also wait for the status to be DEAD before fully finishing. I can add this into our pkAgent utility function. This is because calling agent stop is an "asynchronous" call: the side effects occur after the call has already come back.
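
As a sketch of what that wait could look like in the pkAgent utility (the getStatus callback and the timeouts here are assumptions, not the actual implementation), something like this would poll until the agent reports DEAD:

    // Hypothetical sketch: poll the agent status until it reaches DEAD after
    // `pk agent stop` has already returned
    async function waitForDead(
      getStatus: () => Promise<{ status: string }>,
      timeout: number = 30000,
      interval: number = 250,
    ): Promise<void> {
      const deadline = Date.now() + timeout;
      while (Date.now() < deadline) {
        const info = await getStatus();
        if (info.status === 'DEAD') return;
        await new Promise((resolve) => setTimeout(resolve, interval));
      }
      throw new Error('Timed out waiting for agent status to be DEAD');
    }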

@joshuakarp
Contributor

I think the timeouts don't apply to the beforeAll and afterAll @joshuakarp?

As discussed, timeouts do apply to beforeAll and afterAll - worth knowing for everyone else too. This also came up when we were figuring out the timeouts for testing - another comment from me here #178 (comment)

@CMCDragonkai
Member Author

Ok in that case I'll try a "max timeout" together with fd-lock when using pkAgent. This would mean that beforeAll and afterAll lock and unlock, and therefore block all other uses of pkAgent. This allows some test modules in tests/bin to be serialised, while others can still run independently.
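
A rough sketch of that serialisation (this uses a plain lock file rather than the fd-lock library, and the lock path, timeouts, and retry interval are assumptions):

    // Sketch: serialise test modules that share the global pkAgent by acquiring
    // an advisory lock file in beforeAll and releasing it in afterAll
    import fs from 'fs';
    import os from 'os';
    import path from 'path';

    const lockPath = path.join(os.tmpdir(), 'polykey-test-agent.lock');

    async function acquireLock(timeout: number = 120000): Promise<void> {
      const deadline = Date.now() + timeout;
      while (true) {
        try {
          // The 'wx' flag fails if the file already exists, acting as the lock;
          // a stale lock file from a crashed run would need manual cleanup
          fs.writeFileSync(lockPath, String(process.pid), { flag: 'wx' });
          return;
        } catch (e) {
          if (Date.now() > deadline) {
            throw new Error('Timed out acquiring the shared pkAgent lock');
          }
          await new Promise((resolve) => setTimeout(resolve, 200));
        }
      }
    }

    // The "max timeout" matters here because Jest timeouts do apply to these hooks
    beforeAll(async () => {
      await acquireLock();
    }, 240000);

    afterAll(async () => {
      fs.rmSync(lockPath, { force: true });
    });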

@CMCDragonkai
Member Author

A few things about pk agent stop.

  • If DEAD, it should return with 0
  • If STARTING it should queue up and wait for LIVE before sending stop
  • If LIVE, it just stops and succeeds
  • If STOPPING, it should return with 0

While doing this, it can output STDERR messages about what it is doing.

This makes pk agent stop idempotent.
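
A hypothetical sketch of that dispatch (the status values follow the list above; the function and error names are assumptions, and the STARTING case is revisited in the next comment):

    // Hypothetical sketch of an idempotent `pk agent stop` dispatch
    async function agentStop(
      statusInfo: { status: 'DEAD' | 'STARTING' | 'LIVE' | 'STOPPING' },
      stopAgent: () => Promise<void>,
    ): Promise<number> {
      switch (statusInfo.status) {
        case 'DEAD':
          process.stderr.write('Agent is already dead\n');
          return 0;
        case 'STOPPING':
          process.stderr.write('Agent is already stopping\n');
          return 0;
        case 'STARTING':
          // Ideally: wait for LIVE and then stop; the follow-up comment explains
          // why this currently throws instead
          throw new Error('ErrorCLIStatusStarting');
        case 'LIVE':
          process.stderr.write('Stopping agent\n');
          await stopAgent();
          return 0;
        default:
          throw new Error(`Unexpected status: ${statusInfo.status}`);
      }
    }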

@CMCDragonkai
Member Author

Can't do this:

If STARTING it should queue up and wait for LIVE before sending stop

Because the waitFor mechanism isn't robust enough. It can end up in an infinite loop if the state changes to something else and never comes back. So for now it will throw a new error, ErrorCLIStatusStarting.

@CMCDragonkai
Member Author

The tests/status is a bit flaky right now, probably due to the waitFor.

@CMCDragonkai
Member Author

Will be addressing the waitFor testing as well to make it less flaky, as a stopgap until we have better IPC locking for #290.
