Manager tests failures related to Spot instances #7413
Comments
No, there isn't a magic fix: spot instances can be reclaimed during a test, and sometimes they aren't available at all. In the end it's a matter of cost. In core we also run the longer tests with on_demand, and when there is a release we do the same, but only for a very small set of regularly triggered jobs. In your case you might want the release triggers to use on_demand and the ones on master to use spot. |
Agree, sounds like a good solution in our case. |
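For illustration only, the split agreed on above could be expressed as a small lookup. This is a rough sketch, not how the jobs are actually configured: the trigger names "release" and "master", the table `PROVISION_BY_TRIGGER`, and the function `provision_type_for` are all hypothetical, and only the values on_demand and spot come from this thread.

```python
# Hypothetical mapping from job trigger to instance provisioning type.
# Only the values "on_demand" and "spot" appear in the discussion; the
# trigger names and this table are assumptions made for illustration.
PROVISION_BY_TRIGGER = {
    "release": "on_demand",  # release runs should not be lost to spot reclaims
    "master": "spot",        # regular runs on master keep the cheaper option
}


def provision_type_for(trigger: str, default: str = "spot") -> str:
    """Return the provisioning type for a trigger, defaulting to spot.

    The default mirrors the issue description, which says manager jobs
    use spot instances by default.
    """
    return PROVISION_BY_TRIGGER.get(trigger, default)


if __name__ == "__main__":
    for name in ("release", "master", "nightly"):
        print(name, "->", provision_type_for(name))
```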
I'd argue there's a difference between no spot being available and spot termination. The former - we can easily fall back to on_demand, and I think that makes sense. The latter - harder to deal with - but I'd like to hope it is less common - and happens when the tests are 1h or longer, I reckon? |
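A minimal sketch of the first case (no spot capacity at provisioning time), assuming the provisioning step can signal that condition distinctly. The exception `SpotCapacityUnavailable`, the function `provision_with_fallback`, and the `provision` callable are hypothetical names, not part of any existing framework. Spot termination in the middle of a test is deliberately not covered here.

```python
class SpotCapacityUnavailable(Exception):
    """Hypothetical error: no spot capacity was available at request time."""


def provision_with_fallback(provision, spot_type="spot", fallback_type="on_demand"):
    """Try spot first; fall back to on_demand only when spot is unavailable.

    `provision` is a hypothetical callable that takes an instance type string
    and returns an instance handle. Any other error is left to propagate, and
    spot termination during the test run is not handled by this function.
    Returns a (type_used, handle) tuple.
    """
    try:
        return spot_type, provision(spot_type)
    except SpotCapacityUnavailable:
        return fallback_type, provision(fallback_type)


def fake_provision(instance_type):
    """Stand-in provisioner for the example: spot is never available."""
    if instance_type == "spot":
        raise SpotCapacityUnavailable("no spot capacity")
    return f"instance({instance_type})"


if __name__ == "__main__":
    print(provision_with_fallback(fake_provision))
```

Keeping the fallback limited to one specific error means an unrelated provisioning failure still fails the job instead of silently moving it to the more expensive instance type.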
Those tests are longer than 1 hour, and it's currently the same, since someone needs to manually re-run them if they fail (we are not doing it automatically). |
All of them are longer than 1h? |
@mykaul - all of them. |
That's too bad. I don't have time now, but I'd be happy to review this at a later point. It makes little sense to me - we should be able to be more efficient. |
@mikliapko is working on shorter tests, but they have not been merged yet - #7456 |
@mykaul - if you wish to review - https://docs.google.com/spreadsheets/d/1enOmxToYVXQEQgGPBCPZIA0JblV5zaBBEFNRaKMS5Ho/edit#gid=605769695 |
Issue description
Over the last month I've seen multiple failures in manager tests related to Spot instances (manager jobs use this instance type by default).
Couple of examples:
Switching to the on_demand instance type solved the problem in each case.
@fruch [...] on_demand instance types.
Impact
High
How frequently does it reproduce?
I'd say ~50% of test executions last month.