🐛 node labels inconsistencies: s4l-lite services are not guaranteed anymore to run alone on autoscaled nodes #323

Closed
Tracked by #675
sanderegg opened this issue Aug 22, 2023 · 3 comments
Labels: p:high-prio, t:bug (Something isn't working)

@sanderegg
Member

This is a blocker for any production release of the machine labels!!!

With the current node labels on staging, the s4l-lite services are no longer guaranteed to run alone on autoscaled nodes.

Staging now has the following node labels defined:

  • gpu workers: [dasksidecar==true, dynamicsidecar==true, gpu==true]
  • autoscaled nodes for s4l-lite: [dynamicsidecar==true, gpu==true, io.simcore.autoscaled-node==true]

This means that s4l-lite still goes to the correct nodes, as usual.

BUT: any dynamic service created by director-v0/director-v2 might go there as well. That is not the intention and must be fixed before these labels move to production.
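The overlap can be illustrated with a minimal sketch. The two label sets are the ones listed above; the requirement placed on dynamic services (`dynamicsidecar==true` only) is an assumption made for illustration, since the actual constraints used by director-v0/director-v2 are not shown in this issue. The helper `eligible` is hypothetical.

```python
# Node labels as listed in this issue (staging).
GPU_WORKER = {"dasksidecar": "true", "dynamicsidecar": "true", "gpu": "true"}
AUTOSCALED_S4L_LITE = {
    "dynamicsidecar": "true",
    "gpu": "true",
    "io.simcore.autoscaled-node": "true",
}

# ASSUMPTION: a dynamic service only asks for dynamicsidecar==true.
DYNAMIC_SERVICE_REQUIRES = {"dynamicsidecar": "true"}


def eligible(node_labels, required):
    """Return True when the node carries every required label/value pair."""
    return all(node_labels.get(key) == value for key, value in required.items())


if __name__ == "__main__":
    for name, labels in [("gpu worker", GPU_WORKER),
                         ("autoscaled node", AUTOSCALED_S4L_LITE)]:
        print(name, "accepts dynamic services:",
              eligible(labels, DYNAMIC_SERVICE_REQUIRES))
```

Under that assumption both node types are eligible, because nothing in the requirement distinguishes an autoscaled node from a regular worker.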

@sanderegg added the t:bug (Something isn't working) and p:high-prio labels on Aug 22, 2023
@sanderegg added this to the Baklava milestone on Aug 22, 2023
@sanderegg
Member Author

Possible fixes:

  • define a negative constraint by default on all services dispatched by dv0/dv2, such as node.labels.io.simcore.autoscaled-node!=true (it still needs to be tested whether that works with dv0; dv2 should work)
  • define again some kind of standardworker label so that there is at least one difference between the autoscaled nodes and *non-*autoscaled nodes
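Both options can be compared with a simplified model of placement-constraint matching. This is only a sketch: it handles `node.labels.<key>==<value>` and `node.labels.<key>!=<value>`, treats a missing label as an empty string, and is not the real scheduler. The base constraint `node.labels.dynamicsidecar==true` and the `standardworker==true` label on regular workers are assumptions for illustration.

```python
PREFIX = "node.labels."


def parse_constraint(expr):
    """Split 'node.labels.<key>==<value>' or '...!=<value>' into (key, op, value)."""
    for op in ("!=", "=="):
        if op in expr:
            key, value = (part.strip() for part in expr.split(op, 1))
            if not key.startswith(PREFIX):
                raise ValueError(f"only node.labels.* is modelled: {expr!r}")
            return key[len(PREFIX):], op, value
    raise ValueError(f"no == or != operator in {expr!r}")


def node_matches(labels, constraints):
    """True when the node satisfies every constraint (missing label == '')."""
    for expr in constraints:
        key, op, value = parse_constraint(expr)
        equal = labels.get(key, "").lower() == value.lower()
        if (op == "==") != equal:
            return False
    return True


# Label sets from this issue; "standardworker" is hypothetical (option 2).
GPU_WORKER = {"dasksidecar": "true", "dynamicsidecar": "true", "gpu": "true"}
AUTOSCALED = {"dynamicsidecar": "true", "gpu": "true",
              "io.simcore.autoscaled-node": "true"}
GPU_WORKER_OPT2 = {**GPU_WORKER, "standardworker": "true"}

BEFORE = ["node.labels.dynamicsidecar==true"]  # ASSUMED current constraint
OPTION_1 = BEFORE + ["node.labels.io.simcore.autoscaled-node!=true"]
OPTION_2 = BEFORE + ["node.labels.standardworker==true"]

if __name__ == "__main__":
    print("before, autoscaled:", node_matches(AUTOSCALED, BEFORE))
    print("option 1, autoscaled:", node_matches(AUTOSCALED, OPTION_1))
    print("option 1, gpu worker:", node_matches(GPU_WORKER, OPTION_1))
    print("option 2, autoscaled:", node_matches(AUTOSCALED, OPTION_2))
    print("option 2, gpu worker:", node_matches(GPU_WORKER_OPT2, OPTION_2))
```

In this model, option 1 needs no relabelling of existing workers because a node without the autoscaled label already satisfies the `!=` constraint, whereas option 2 requires every regular worker to carry the new label.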

@YuryHrytsuk
Collaborator

> Possible fixes:
>
> * define a negative constraint by default on all services dispatched by dv0/dv2 such as `node.labels.io.simcore.autoscaled-node!=true` (to test if that works with dv0, dv2 should work)
>
> * define again some kind of `standardworker` label so that there is at least one difference between the autoscaled nodes and *non-*autoscaled nodes

I have updated the configuration to follow the first approach you suggested. Please see all the PRs in here.

I will need to consult @sanderegg, @GitHK and, perhaps, @pcrespov to make sure my changes properly address the issue.

@YuryHrytsuk
Collaborator

These pull requests were merged today. From now on, this issue should be fixed by the node.labels.io.simcore.autoscaled-node!=true constraints.

@mrnicegyu11 changed the title from "Blocker: node labels inconsistencies: s4l-lite services are not guaranteed anymore to run alone on autoscaled nodes" to "🐛 node labels inconsistencies: s4l-lite services are not guaranteed anymore to run alone on autoscaled nodes" on Sep 8, 2023