
Mutexes / external resources #2048

Closed
be5invis opened this issue Feb 22, 2019 · 18 comments

Comments

@be5invis

be5invis commented Feb 22, 2019

Sometimes we need to limit some tests to run serially, to cap the maximum number of parallel jobs, or to wait for some external resource to be prepared. Introducing such concepts would be helpful for network-oriented tests or stress tests.

Proposed API:

test('Some stress test', async t => {
    const handle = await t.resources('stress-test-capacity').acquire(1000);
    // "stress-test-capacity" is a kind of resource; we request 1000 units of it.
    //
    // All resources would be automatically released after the test ends.
    // Alternatively, use handle.release() to manually release a resource.
});
@sindresorhus
Member

@be5invis It's not clear what 1000 means in your example. And what does resources('memory') actually do? Is memory the key? Would be nice if you could elaborate a bit more about the proposed API.

@be5invis
Author

be5invis commented Feb 22, 2019

@sindresorhus

t.resource would return a resource manager for a given resource kind (e.g., memory or DB connections), which allocates resources in a given amount.

interface TestContext {
    resource(id: string): IResourceManager;
}

interface IResourceManager {
    // Acquire the resource in a given amount
    acquire(amount: number): Promise<IResourceHandle>;
}

interface IResourceHandle {
    release(): void;
}

Note that the memory resource works pretty much like a semaphore: it does not actually malloc anything; it only ensures that not too many tests run in parallel.
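The semaphore-like behaviour described above could be sketched like this. This is purely illustrative (the class and method names are not part of any proposed AVA API): acquire(amount) resolves once enough capacity is free, rejects when the request can never be satisfied, and the returned handle gives capacity back.

```javascript
// Minimal counting-semaphore sketch: capacity is an abstract quantity,
// not real memory. Assumed names (Semaphore, acquire, release) are
// illustrative only.
class Semaphore {
  constructor(capacity) {
    this.capacity = capacity;
    this.available = capacity;
    this.waiters = [];
  }

  acquire(amount) {
    if (amount > this.capacity) {
      // The request can never succeed: "unable to allocate".
      return Promise.reject(new Error('unable to allocate'));
    }
    return new Promise(resolve => {
      const tryAcquire = () => {
        if (amount <= this.available) {
          this.available -= amount;
          resolve({release: () => this.release(amount)});
          return true;
        }
        return false;
      };
      if (!tryAcquire()) {
        this.waiters.push(tryAcquire); // wait until capacity frees up
      }
    });
  }

  release(amount) {
    this.available += amount;
    // Wake queued acquirers in FIFO order while capacity allows.
    while (this.waiters.length > 0 && this.waiters[0]()) {
      this.waiters.shift();
    }
  }
}
```

A real implementation would additionally have to live outside the test workers, since this only coordinates within one process.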

@sindresorhus
Member

But what is 1000 and who enforces/controls its usage? How would AVA enforce this? AVA could easily control concurrency, but not memory usage and other things.

@be5invis
Author

be5invis commented Feb 22, 2019

@sindresorhus
1000 is simply a number representing the resource quantity the test claims it will use. It has no link to actual memory usage. (I should have named it stress-test-capacity instead of memory.)
The resource manager takes this number and decides what to do with the promise: resolve it, wait, or throw an error (unable to allocate).

@novemberborn
Member

Within a test file there's plenty of Node.js modules to help you achieve this. I assume you want to coordinate resources across test files. Could you elaborate on that use case?

@be5invis
Author

@novemberborn Yes, coordinate resources across tests.
I have a lib with some stress tests that may take a lot of memory or whatever, so I want to limit the parallelism among them -- they are distributed across multiple files, belonging to multiple sub-modules.

@lo1tuma
Contributor

lo1tuma commented Feb 25, 2019

IMHO there is no need for built-in support in AVA. We use throat to limit the number of parallel runs of tests using puppeteer.
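For illustration, here is a minimal stand-in for the core idea behind throat (this is a sketch of the pattern, not throat's actual source): limit(n) returns a wrapper that lets at most n of the wrapped async functions run at once, which you can use around the body of each puppeteer-heavy test.

```javascript
// Sketch of a throat-style concurrency limiter. The name `limit` is
// illustrative; throat itself has a similar call shape.
function limit(concurrency) {
  let active = 0;
  const queue = [];
  const next = () => {
    if (active >= concurrency || queue.length === 0) return;
    active++;
    const {fn, resolve, reject} = queue.shift();
    fn().then(resolve, reject).finally(() => {
      active--;
      next(); // start the next queued function, if any
    });
  };
  return fn => new Promise((resolve, reject) => {
    queue.push({fn, resolve, reject});
    next();
  });
}
```

In a test file you would do something like `const run = limit(2);` and then `test('…', t => run(async () => { /* puppeteer work */ }));`. Note this only coordinates within a single process.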

@be5invis
Author

@lo1tuma
Does throat work across AVA's worker processes?

@lo1tuma
Contributor

lo1tuma commented Feb 25, 2019

@be5invis good point, probably it doesn’t. I have all the puppeteer-based tests in the same file, so that wasn’t an issue.

@novemberborn
Member

I think this is a duplicate of #1366. @be5invis what do you feel about moving the discussion there?

In order to make progress on this we need to flesh out use cases further, and make the argument for why this can't be done by other modules building on top of AVA. Our proposal process is lacking at the moment, so for now bullet points in GitHub comments will suffice.

@be5invis
Author

be5invis commented Feb 25, 2019 via email

@novemberborn
Member

Yes but what if we can find a solution that works for both? Roughly, if you could start a "resource manager" that can communicate with test workers, you could use that to share random URLs or mutex states.

@be5invis
Author

@novemberborn
A resource manager could do both, but in general this issue and #1366 are two different aspects. I recommend handling them separately.

@novemberborn
Member

I'm interested in solving the commonalities between those issues, so that specific solutions can be built on top of that.

@schmod

schmod commented Jun 26, 2019

Agreed that this is a slightly different issue than #1366.

My $0.02 is that a simple string-based mutex (powered by something like live-mutex) would allow for a clean API, and wouldn't require any additional configuration/setup. Resources would simply be assigned an arbitrary (string) name by the test author, and the mutex broker wouldn't need to know anything about the tests/resources.

The only drawback is that tests wouldn't be able to request "how much" of a resource they consume.

I'm not sure how many real-world scenarios would actually benefit from quantified resource management. In my experience, I'm either trying to isolate a handful of tests that are memory hogs, or to disallow any concurrent access to a particular file, database, or database table.

I don't really see many test authors micromanaging the specific amount of memory that a single test requires, nor do I envision many use-cases where a testing database can handle 2 concurrent connections, but not 8.


I'd propose two APIs -- one that restricts concurrent access to a resource across all test processes, and another that's only scoped to a local process.

The second (local) kind of mutex would make it far easier to use existing spy/stub libraries (ie. Sinon) with Ava (#1825).

test('should limit concurrency', async (t) => {
  await t.mutex('db');
  // or this shorthand for taking out multiple mutexes at once
  await t.mutex(['db', 'lotsOfMemory']);
});
test('using stubs', async (t) => {
  await t.localMutex('userStub');
  const userStub = sinon.stub(userService, 'getUser').returns({ name: 'jane' });
});

We could also provide a call signature that lets us define a "supplier" and a "release handler", explicitly associating the mutex with an object and running any code necessary to "clean up" the shared resource when its mutex is released (either automatically at the end of a test, or by manually calling handle.release()).

test('release handler', async(t) => {
  const [handle, userStub] = await t.mutex(
    'userStub', 
    () => sinon.stub(userService, 'getUser'), 
    (userStub) => userStub.restore()
  );
});

This also creates a slightly cleaner/nicer and type-safe alternative to many uses of before and beforeEach hooks, allowing shared resources to be instead instantiated and obtained via a helper function (rather than relying on some combination of beforeEach, context and/or hoisted variables). Conceptually, this would be vaguely reminiscent of the underlying patterns behind React Hooks.
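The supplier/release-handler signature sketched above could be approximated in-process like this. All names here are illustrative, and a real implementation would need a broker (e.g. live-mutex, as suggested earlier) to coordinate across AVA's worker processes; this sketch only serializes access per string key within one process.

```javascript
// In-process sketch of mutex(name, supply, cleanup): callers queue up
// per string key; each gets the supplied resource once the previous
// holder releases, and cleanup runs on release.
const locks = new Map();

async function mutex(name, supply, cleanup) {
  // Chain onto whatever currently holds this key (or nothing).
  const previous = locks.get(name) || Promise.resolve();
  let unlock;
  locks.set(name, new Promise(resolve => { unlock = resolve; }));
  await previous;
  const resource = supply();
  const handle = {
    release() {
      cleanup(resource);
      unlock(); // let the next waiter in
    }
  };
  return [handle, resource];
}
```

Usage would mirror the example above: `const [handle, userStub] = await mutex('userStub', () => sinon.stub(…), stub => stub.restore());`, with the test framework calling handle.release() automatically at the end of the test.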

@novemberborn
Member

@schmod I like it!

To clarify, I think "local" should mean the "test file". We could name that fileMutex. But I wonder if mutex should be scoped to the test file, and then we could have a globalMutex to manage mutexes across concurrently executing files?

I like both automatically releasing mutexes, and allowing it to be managed more directly. We could support both I reckon?

@ronen
Contributor

ronen commented Nov 20, 2019

I'm not sure how many real-world scenarios would actually benefit from quantified resource-management. ... nor do I envision many use-cases where a testing database can handle 2 concurrent connections, but not 8.

I've got a scenario that would qualify, I think. I'm essentially doing integration testing on my front-end JS code communicating with a back-end server which I run locally for testing. I'd like to run my tests in parallel, with the server handling an independent connection from each test. But I may need to limit the number of simultaneous connections to the server so I don't thrash my local machine. In this case it's not a question of 2 or 8 concurrent connections but potentially many more: I've currently got ~65 such integration tests, and of course that number keeps growing.

[Careful readers will note that in #2268 I said my backend server doesn't support simultaneous requests. But that will hopefully be changing soon, in which case I'd like to take advantage of that to parallelize my tests. But I'm concerned about too much parallelism!]

Being able to limit the amount of simultaneous access to a resource feels to me to be analogous to the --concurrency CLI option & default limit, which I think nobody debates as an unreasonable thing to have :)

One simple way to handle this might be a --concurrent-tests CLI option that would limit the total number of tests run simultaneously, though that would be overly restrictive if only some of the tests connect to the resource. Even if --concurrent-tests only affected the number of concurrent tests in a single file, I could still get some overall control.

Thinking about it more, I think that case actually is the same as #1366, but with concurrency limits. Will go make a comment there...

@novemberborn
Member

AVA now has experimental shared worker support. Code runs in a worker thread, in the main process, and can communicate with test workers.

The proposals made in this issue could be implemented using this infrastructure. Please join us in #2605.
