Separate populating the queue from working off the queue #19

wvanbergen · 2017-07-14T11:50:47Z

I think I'd like to split up populating the queue, and working off the queue, so we can put a filter/order step in between. This allows us to remove tests we don't want to run, or order the list so the slowest ones get run first.

1. Retrieve the list of tests. [Implementations: Minitest discovery, File, ...]
2. Annotate tests. [E.g. get statistics about every test case]
3. Filter/order based on annotations.
4. Populate the worker queue. [Implementations: Memory, Redis]

Then, the existing worker code can work off this queue.

For this to work, we probably need to wrap every test case in a class, e.g. CI::Queue::Entry, rather than representing it as a string. This class would hold all annotations.
Annotations can come from the input list. This could be a more elaborate file format. For the minitest discovery, we could annotate test methods in the Ruby code.
Annotations can also come from external sources as a separate step. E.g. retrieve the duration data for all tests in the list from an external database.
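The four steps above could be sketched roughly as follows. This is only an illustration: CI::Queue::Entry is the class name proposed in this issue, but its interface here (id, annotations, annotate) is hypothetical, and the test names, durations and skip flag are made-up sample data.

```ruby
# Hypothetical sketch; none of this is confirmed ci-queue API.
module CI
  module Queue
    # Wraps a test identifier together with its annotations.
    class Entry
      attr_reader :id, :annotations

      def initialize(id, annotations = {})
        @id = id
        @annotations = annotations
      end

      # Returns a new entry with the extra annotations merged in.
      def annotate(extra)
        self.class.new(id, annotations.merge(extra))
      end

      def to_s
        id
      end
    end
  end
end

# 1. Retrieve the list of tests. Annotations can already come from the input.
input = {
  'FooTest#test_a' => { skip: true },
  'FooTest#test_b' => {},
  'BarTest#test_c' => {},
}
entries = input.map { |id, annotations| CI::Queue::Entry.new(id, annotations) }

# 2. Annotate from an external source (sample durations in seconds).
durations = { 'FooTest#test_a' => 0.2, 'FooTest#test_b' => 4.5, 'BarTest#test_c' => 1.1 }
entries = entries.map { |entry| entry.annotate(duration: durations.fetch(entry.id, 0.0)) }

# 3. Filter out unwanted tests and order the slowest ones first.
ordered = entries
  .reject { |entry| entry.annotations[:skip] }
  .sort_by { |entry| -entry.annotations[:duration] }

# 4. Hand plain identifiers to whatever populates the worker queue.
queue_input = ordered.map(&:to_s)
```

Converting back to strings in the last step would let the existing queue implementations stay unchanged.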

@DazWorrall @casperisfine what do you think?


casperisfine · 2017-07-17T09:46:43Z

It's already split off, here's how we initialize the queue right now:

  Minitest.queue = CI::Queue::Redis.new(
    TestGlobs.all_tests(seed: ENV['BUILDKITE_COMMIT']),
    build_id: build_id,
    worker_id: ENV.fetch('BUILDKITE_PARALLEL_JOB'),
    redis: CIRedis.connection,
    **queue_config,
  )

All queue implementations just stupidly take a list of strings (test identifiers). If we want to filter out parts of the tests, it can and should be done beforehand.

The only downside is that, for resiliency purposes, any worker can potentially be elected master, which means they all compute the list of tests, but only one actually gets to populate the shared queue with it.

So if we were to have a costly way to reduce the list it would be a bit wasteful.
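One possible way to avoid that waste is to pass the list as a block, so that only the elected master evaluates it. The sketch below uses a hypothetical in-memory LazyQueue class, not the ci-queue API, and it assumes the master is known up front, whereas a real election would happen through the shared store.

```ruby
# Hypothetical stand-in for a shared queue; not the ci-queue API.
class LazyQueue
  attr_reader :tests

  def initialize(master:)
    @master = master # assumption: election result is already known
    @tests = nil
  end

  # Evaluates the block only on the master; other workers skip the cost.
  def populate
    return self unless @master

    @tests = yield
    self
  end
end

computations = 0
expensive_list = lambda do
  computations += 1 # counts how often the costly step actually runs
  %w[FooTest#test_a BarTest#test_b]
end

# One master and two regular workers all call populate with the same block.
workers = [true, false, false].map { |is_master| LazyQueue.new(master: is_master) }
workers.each { |queue| queue.populate(&expensive_list) }
```

With this shape the costly reduction runs once, no matter how many workers start.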

> what do you think?

IMO if we want to filter out tests, it should be the responsibility of a dedicated service which can hold state, record all test run results, and refine its heuristics based on that data.

e.g. curl https://example.com/commits/abcdef092332432/tests.txt.
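On the worker side, consuming such a service could look something like this. It is a minimal sketch: the one-identifier-per-line format of tests.txt is an assumption, and fetch_test_list and parse_test_list are hypothetical helper names.

```ruby
require 'net/http'
require 'uri'

# Parses a newline-separated list of test identifiers, dropping blank lines.
# Assumption: the service returns one identifier per line.
def parse_test_list(body)
  body.each_line.map(&:strip).reject(&:empty?)
end

# Fetches the pre-filtered test list for a commit from the (hypothetical) service.
def fetch_test_list(commit, base_url: 'https://example.com')
  uri = URI("#{base_url}/commits/#{commit}/tests.txt")
  response = Net::HTTP.get_response(uri)
  raise "unexpected response: #{response.code}" unless response.is_a?(Net::HTTPSuccess)

  parse_test_list(response.body)
end
```

The resulting array of strings could then be passed as the first argument to the queue constructor, in place of TestGlobs.all_tests(...).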

casperisfine mentioned this issue Nov 16, 2017

Decouple queue instantiation and population #34

Merged

casperisfine closed this as completed in #34 Nov 17, 2017
