Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix HasResource inverted boolean error #3162

Conversation

s4ke
Copy link
Contributor

@s4ke s4ke commented Jan 19, 2024

- What I did

While fixing moby/moby#44378 for named resources in #3082, we inadvertedly broke scheduling for people with discrete resources.

- How I did it

Investigated the code and added more unit tests.

- How to test it

Run the unit tests. Broken behaviour can be reproduced with a setup with gpu_test=5:

moby/moby#44378 (comment)

@s4ke
Copy link
Contributor Author

s4ke commented Jan 19, 2024

I honestly would love to add more unit tests to constraint_enforcer_test.go, but I am unsure about how to test for "no failure". I would appreciate some pointers.

Signed-off-by: Martin Braun <braun@neuroforge.de>
@s4ke s4ke force-pushed the 44378-moby-fix-constraint-enforcer-generic-resources branch from 481ead2 to d67eec8 Compare January 19, 2024 19:41
@codecov-commenter
Copy link

codecov-commenter commented Jan 19, 2024

Codecov Report

Merging #3162 (d67eec8) into master (7eb5046) will increase coverage by 61.80%.
The diff coverage is 100.00%.

Additional details and impacted files
@@             Coverage Diff             @@
##           master    #3162       +/-   ##
===========================================
+ Coverage        0   61.80%   +61.80%     
===========================================
  Files           0      155      +155     
  Lines           0    31139    +31139     
===========================================
+ Hits            0    19244    +19244     
- Misses          0    10341    +10341     
- Partials        0     1554     +1554     

@s4ke
Copy link
Contributor Author

s4ke commented Jan 19, 2024

@dperny PTAL. Kind of embarrasing that I didn't check this last time around. But boolean logic is hard, okay?

@s4ke
Copy link
Contributor Author

s4ke commented Jan 19, 2024

I additionally verified the fix by running a dind version of moby/moby with the fix applied.

root@6a234821266f:/go/src/github.com/docker/docker# docker ps
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES
root@6a234821266f:/go/src/github.com/docker/docker# docker ^C
root@6a234821266f:/go/src/github.com/docker/docker# docker network create -d overlay --scope swarm test-network
Error response from daemon: cannot create a swarm scoped network when swarm is not active
root@6a234821266f:/go/src/github.com/docker/docker# docker swarm init
Swarm initialized: current node (o6ek9v8noxrc2n0dgfw5usw3j) is now a manager.

To add a worker to this swarm, run the following command:

    docker swarm join --token SWMTKN-1-39mijudamxqjwwifamr1e7lpe8cjxxrnftkpn31342q91parvd-7g63d3i274dmhx264mkbwbfpo 172.17.0.6:2377

To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.

root@6a234821266f:/go/src/github.com/docker/docker# docker network create -d overlay --scope swarm test-network
zem9t2vf033kkcpa5si9ygukb
root@6a234821266f:/go/src/github.com/docker/docker# docker service create --network test-network --generic-resource "gpu_test=2" --name test-service-2 ubuntu tail -f /dev/n
ull
uhxbdxii0dmbtkfhlqmqhf23d
overall progress: 1 out of 1 tasks 
1/1: running   [==================================================>] 
verify: Service converged 
root@6a234821266f:/go/src/github.com/docker/docker# docker service create --network test-network --generic-resource "gpu_test=4" --name test-service-3 ubuntu tail -f /dev/n
ull
ip5x4qx6meorygcu5plo2a465
overall progress: 0 out of 1 tasks 
1/1: no suitable node (insufficient resources on 1 node) 
Operation continuing in background.
Use `docker service ps ip5x4qx6meorygcu5plo2a465` to check progress.
root@6a234821266f:/go/src/github.com/docker/docker# docker service create --network test-network --generic-resource "gpu_test=3" --name test-service-4 ubuntu tail -f /dev/n
ull
vqxyav44kkyiqriyrdi289vk4
overall progress: 1 out of 1 tasks 
1/1: running   [==================================================>] 
verify: Service converged 
root@6a234821266f:/go/src/github.com/docker/docker# 

@s4ke
Copy link
Contributor Author

s4ke commented Jan 25, 2024

@thaJeztah PTAL - this is blocking people from upgrading to >23.x from 20.x . It fell through last time I fixed something in the same area.

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@thaJeztah thaJeztah merged commit dcda100 into moby:master Jan 25, 2024
8 checks passed
@thaJeztah
Copy link
Member

This will also need a re-vendor in moby/moby 😅

@s4ke
Copy link
Contributor Author

s4ke commented Jan 25, 2024

I will take care of that if you want. The process for vendoring for moby/swarmkit into moby/moby is something that I keep having to make up as I go. Is there documentation that I missed for that?

@thaJeztah
Copy link
Member

Within the moby repo, you can edit the vendor.mod and set the latest commit from the swarmkit repository for the dependency; then run hack/vendor.sh (which updates the vendored code), and commit the results 👍

I tend to include a diff link in the description, for others to have an easier way to see what changed, but that's not required.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants