Duffy can't return node when $target node is unreachable #5

Closed
arrfab opened this issue Apr 25, 2018 · 3 comments · Fixed by #6

Comments

@arrfab
Member

arrfab commented Apr 25, 2018

By default, when someone requests a node, duffy runs a MySQL query to find which nodes are available to be "given" to projects (https://github.com/CentOS/duffy/blob/master/duffy/api_v1/views.py#L62).
The problem with that query is that it always tries the same node first, because of the "order by used_count".

So when that node is unreachable (the case we had today), duffy can't contextualize it (https://github.com/CentOS/duffy/blob/master/duffy/api_v1/views.py#L76) and so answers "Failed to allocate nodes" (https://github.com/CentOS/duffy/blob/master/duffy/api_v1/views.py#L90).
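
To illustrate the failure mode, here is a minimal sketch using an in-memory SQLite table instead of duffy's real MySQL schema. The table layout, the column names `hostname` and `state`, and the helper `pick_candidates` are assumptions made for this example; only the "order by used_count" ordering and the "Ready" status come from the description above. Because the unreachable node is never handed out, its row never changes, and every request gets the same first candidate.

```python
import sqlite3


def pick_candidates(conn, count=1):
    """Return the hostnames a used_count-ordered query offers first (hypothetical helper)."""
    rows = conn.execute(
        "SELECT hostname FROM nodes WHERE state = 'Ready' "
        "ORDER BY used_count LIMIT ?",
        (count,),
    ).fetchall()
    return [row[0] for row in rows]


# Simplified stand-in for the nodes table; not duffy's actual schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (hostname TEXT, state TEXT, used_count INTEGER)")
conn.executemany(
    "INSERT INTO nodes VALUES (?, ?, ?)",
    [("n1", "Ready", 0), ("n2", "Ready", 3), ("n3", "Ready", 5)],
)

# Suppose n1 is unreachable: contextualization fails, nothing is written back,
# so three consecutive requests all get n1 as the first (and only) candidate.
attempts = [pick_candidates(conn) for _ in range(3)]
```

With this setup, `attempts` contains `["n1"]` three times, even though `n2` and `n3` are healthy and "Ready".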

We should enhance (https://github.com/CentOS/duffy/blob/master/duffy/models/nodes.py#L56) to make sure that if we can't reach a node, it is simply skipped and marked to be reinstalled, so that it's removed from "Ready" status.

So probably by switching this line? https://github.com/CentOS/duffy/blob/master/duffy/models/nodes.py#L72

Let's discuss this

@kbsingh
Contributor

kbsingh commented Apr 25, 2018

At the very first attempt, it should mark the node as 'Failed' and move along - the node must not remain in the db as 'Ready'.

@bstinsonmhk
Collaborator

We might want to put the host back to 'Active' instead. The common case is that we had a flake with a Seamicro machine and a single reinstall will fix it.

If the machine goes back to Active and something is more seriously wrong, the install workers will mark it failed.
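
A rough sketch of that behaviour, not duffy's actual code: the `Node` class, the `allocate` function, the `is_reachable` callback and the "Deployed" state name are all hypothetical stand-ins for this example. Only the "Ready" and "Active" states and the used_count ordering are taken from this thread. An unreachable node is put back to "Active" so it leaves the "Ready" pool, and allocation moves on to the next candidate.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical stand-in for a row in the nodes table.
    hostname: str
    state: str = "Ready"
    used_count: int = 0


def allocate(nodes, is_reachable):
    """Hand out the first reachable Ready node, lowest used_count first.

    Unreachable nodes are set back to 'Active' so they drop out of the
    Ready pool and are picked up for a reinstall.
    """
    ready = sorted(
        (n for n in nodes if n.state == "Ready"),
        key=lambda n: n.used_count,
    )
    for node in ready:
        if is_reachable(node):
            node.state = "Deployed"  # assumed name for the "handed out" state
            node.used_count += 1
            return node
        node.state = "Active"  # skip it and let a reinstall be tried
    return None  # nothing reachable: the caller reports the allocation failure


# n1 has the lowest used_count but is unreachable in this example.
pool = [Node("n1", used_count=0), Node("n2", used_count=3)]
got = allocate(pool, is_reachable=lambda n: n.hostname != "n1")
```

Here `got` is the `n2` node, and `n1` ends up in "Active" rather than staying "Ready", so the next request no longer starts with it.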

@arrfab
Member Author

arrfab commented Apr 25, 2018

Yes, +1 on my side: putting it back to "Active" so that a reinstall would be tried.

arrfab added a commit to arrfab/duffy that referenced this issue Apr 25, 2018
bstinsonmhk pushed a commit that referenced this issue Apr 25, 2018
gridhead pushed a commit that referenced this issue Oct 25, 2021
Update the README and add uvicorn as a development dependency.

Fixes: #5

Signed-off-by: Nils Philippsen <nils@redhat.com>

3 participants