Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Default affinity #147

Merged
merged 8 commits into from
May 17, 2019
Merged

Default affinity #147

merged 8 commits into from
May 17, 2019

Conversation

jhamman
Copy link
Member

@jhamman jhamman commented May 6, 2019

At this point, this is for illustration purposes only. It builds on #145 and gives #146 a try. By default, this now creates dask-worker pods with the following affinity and tolerations:

spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - preference:
          matchExpressions:
          - key: k8s.dask.org/node-purpose
            operator: In
            values:
            - worker
        weight: 100
  tolerations:
  - effect: NoSchedule
    key: k8s.dask.org/dedicated
    operator: Equal
    value: worker
  - effect: NoSchedule
    key: k8s.dask.org_dedicated
    operator: Equal
    value: worker
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300

If my understanding is correct: A kubernetes cluster / node-pool with the following labels/taints should attract and accept dask-workers with this spec.

labels:

  • k8s.dask.org/node-purpose : worker

taints:

  • k8s.dask.org/dedicated=worker
  • k8s.dask.org_dedicated=worker

I should note that I have a node-pool with these options set and my dask-workers are not landing in the desired place yet so I'm probably still missing at least one piece of the puzzle.

@mrocklin
Copy link
Member

mrocklin commented May 8, 2019

@quasiben if you have a moment can I ask you to check out this PR?

Copy link
Member

@jacobtomlinson jacobtomlinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again this looks reasonable to me.

@jhamman
Copy link
Member Author

jhamman commented May 9, 2019

Thanks @jacobtomlinson - I need to add a unit test or two but now that its passed your sniff test, I can forge ahead.

@jhamman
Copy link
Member Author

jhamman commented May 9, 2019

I'm working on tests now. I've also created a binder with this branch on it that can be run on binder.pangeo.io: Binder

)
)
else:
print('TODO: need to handle the various partial states here')

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've done partial state handling in #148, doing this was really messy sadly.

Copy link

@consideRatio consideRatio May 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are very welcome to copy/paste/modify etc. to accomplish the desired outcome.

@jhamman
Copy link
Member Author

jhamman commented May 14, 2019

@consideRatio - I merged our two efforts. Let me know what you think.

Copy link

@consideRatio consideRatio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jhamman LGTM except a small technical typo (I think).

):
affinity.node_affinity.preferred_during_scheduling_ignored_during_execution = (
[]
)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assignment above should be an array I think. If we could say the type of elements it should be an array of V1PreferredSChedulingTerm objects, but as we cannot type the arrays element it should simply be an array.

I understand it currently as it is a tuple with one element: an array. I'm not confident on how the parenthesis are interpreted though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This evaluates to an empty list.

In [1]: x = ([])

In [2]: x
Out[2]: []

As you say, we can't type this further. AFAIK, there isn't a way to instantiate an empty list of V1PreferredSchedulingTerm objects, like we do for the V1NodeSelectors above.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah okay! Technically then all good!!

I just learned that () is a 0-tuple or a tuple with 0 elements, and ("test",) is a tuple with 1 element, and ("test", "more") is a tuple with two elements, but ("something") is just "something".

@jhamman jhamman changed the title [WIP] Default affinity Default affinity May 15, 2019
Copy link

@consideRatio consideRatio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@jhamman
Copy link
Member Author

jhamman commented May 16, 2019

@yuvipanda / @jacobtomlinson - I think we're all done here. Ready to merge?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants