Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Skipping some specs #15

Closed
jankeesvw opened this issue Jun 23, 2016 · 6 comments
Closed

Skipping some specs #15

jankeesvw opened this issue Jun 23, 2016 · 6 comments

Comments

@jankeesvw
Copy link

We use Knapsack Pro with Buildkite and we have currently 8 nodes running our whole test suite, works pretty great. Except for one thing;

We have noticed that when a node fails and you click retry on that specific node another set of specs get assigned.

Is there a way to set a fixed seed value so all nodes get the same specs every time you rebuild for a certain version (maybe based on the git sha)?

@ArturT
Copy link
Member

ArturT commented Jun 23, 2016

Hi thanks for info. This is similar problem as in #12

The idea with seed sounds good. I'm wondering about doing it this way:
When you start tests for particular commit sha and total number of nodes then the first node that requested Knapsack Pro API to get list of tests will force API to generated the seed token.
Seed will have expiration timestamp like 1 hour. Other nodes will call API with the same git commit sha and total number of nodes so the Knapsack Pro API will know that for this commit there is existing seed and it will use the cached seed response with list of tests.

I will add seed token in logger output so you will be able to see if all your nodes are running based on the same seed token.

This should solve the problem with retrying the failed node with the same tests.

What do you think? 1h expiration time for seed token would be enough? We can extend cached seed response on API side by 1h every time when you read the response from API so you will be able to retry failed node as long as you do retry in 1h from last failure.

There is 1 disadvantage by caching the response by using the seed. When you run tests and we collected the data from nodes and you will run tests for the same commit within 1h you won't run them base on most up to date time execution data. You will run them based on cached seed data. We can assume it's rare case and not much useful because when tests passed then probably you don't want to run them again.

@jankeesvw
Copy link
Author

Thanks for the quick reply!

Why does the seed need to expire? One hour should work in most cases but I can imagine it's not always enough. Sometimes I take a break after pushing some changes. That will expire the seed then.

There is 1 disadvantage by caching the response by using the seed. When you run tests and we collected the data from nodes and you will run tests for the same commit within 1h you won't run them base on most up to date time execution data.

I don't think it matters that you get the same split up every time for a given SHA. I rather have a complete test suite then a fully optimised one.

If you need more input please let me know, I would love to help.

@ArturT
Copy link
Member

ArturT commented Jun 24, 2016

I don't think it matters that you get the same split up every time for a given SHA. I rather have a complete test suite then a fully optimised one.
Makes sense. We could just assume that edge cases are not important.

For instance when you run tests for the first time when there are no time execution data in API DB then you will get simple test suite split based on directory names. If we cache that split and you will retry the tests for the commit sha you will get the same split without optimisation based on time execution. You will have to push a new git commit in order to take advantage of time execution data stored on API side.

Benefit of constant split for particular git commit might be bigger than disadvantages in edge case scenarios.

@jankeesvw
Copy link
Author

That sounds like a perfect solution!

The first time you could also add some extra logging output that there is not sufficient data to split it by time yet.

@ArturT
Copy link
Member

ArturT commented Jun 28, 2016

I've just released a new knapsack_pro version 0.10.0. Please use it. It should solve the problem.

You can learn more how it works now in readme:
https://github.com/KnapsackPro/knapsack_pro-ruby#knapsack_pro_fixed_test_suite_splite-test-suite-split-based-on-seed

or check the changelog:
https://github.com/KnapsackPro/knapsack_pro-ruby/blob/master/CHANGELOG.md#0100

@jankeesvw Thanks for help to make the knapsack_pro gem better! 😄

@ArturT ArturT closed this as completed Jun 28, 2016
@jankeesvw
Copy link
Author

Great work, thanks! 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants