Fix gunicorn "Control server error" on kubernetes #7591

Open
pedro-psb wants to merge 1 commit into pulp:main from pedro-psb:fix/gunicorn-control-socket-path
Conversation

@pedro-psb
Member

gunicorn 25.1.0 introduced a control socket (gunicornc) that defaults to gunicorn.ctl relative to the working directory. Since pulpcore-content sets its CWD to WORKING_DIRECTORY (/var/lib/pulp/tmp by default), the socket lands on the shared PVC and persists across pod restarts, causing Permission denied when a new pod tries to recreate it during a rolling update.
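
The persistence described above comes from how Unix domain sockets work: binding one creates a filesystem entry that stays behind after the socket is closed. A minimal sketch, using a throwaway temporary directory in place of the shared PVC (the helper name is hypothetical and this is not gunicorn's own code):

```python
import os
import socket
import tempfile


def socket_file_survives_close() -> bool:
    """Bind a Unix socket, close it, and report whether the path still exists."""
    workdir = tempfile.mkdtemp()  # stands in for WORKING_DIRECTORY
    path = os.path.join(workdir, "gunicorn.ctl")
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        server.bind(path)  # creates the socket file on disk
    finally:
        server.close()  # closing does not remove the file
    survived = os.path.exists(path)
    # Clean up the demo files; a crashed or replaced pod never gets this far.
    if survived:
        os.unlink(path)
    os.rmdir(workdir)
    return survived
```

On a volume shared between pods, the next process has to unlink that leftover entry before it can bind again, which is where a permission error can surface.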

Default to /tmp/pulpcore-content.ctl, which is pod-local ephemeral storage. Users who want a different path can override via gunicorn.conf.py.
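
For illustration, an override in gunicorn.conf.py might look roughly like the sketch below. It assumes the setting is named `control_socket` (mirroring a `--control-socket` flag), which should be checked against the gunicorn settings reference; the environment variable `PULP_GUNICORN_CONTROL_SOCKET` is hypothetical and only shows one way to make the path configurable.

```python
# gunicorn.conf.py (sketch, not part of this PR)
import os

# Assumption: gunicorn reads the control socket path from `control_socket`.
# PULP_GUNICORN_CONTROL_SOCKET is a hypothetical variable used for illustration;
# when it is unset, fall back to the pod-local default proposed in this PR.
control_socket = os.environ.get(
    "PULP_GUNICORN_CONTROL_SOCKET", "/tmp/pulpcore-content.ctl"
)
```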

fixes: #7574
Assisted-by: Claude Code

📜 Checklist

  • Commits are cleanly separated with meaningful messages (simple features and bug fixes should be squashed to one commit)
  • A changelog entry or entries has been added for any significant changes
  • Follows the Pulp policy on AI Usage
  • (For new features) - User documentation and test coverage has been added

See: Pull Request Walkthrough

…ling updates

gunicorn 25.1.0 introduced a control socket (gunicornc) that defaults to
gunicorn.ctl relative to the working directory. Since pulpcore-content sets
its CWD to WORKING_DIRECTORY (/var/lib/pulp/tmp by default), the socket
lands on the shared PVC and persists across pod restarts, causing Permission
denied when a new pod tries to recreate it during a rolling update.

Default to /tmp/pulpcore-content.ctl, which is pod-local ephemeral storage.
Users who want a different path can override via gunicorn.conf.py.

Assisted-by: Claude Code
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
fixes: pulp#7574
@pedro-psb pedro-psb force-pushed the fix/gunicorn-control-socket-path branch from c41bf04 to d79792c April 14, 2026 18:49
@pedro-psb pedro-psb marked this pull request as ready for review April 14, 2026 18:50
Comment on lines +18 to +19
# On k8s, the default location may persist across restarts and cause permission errors
# See: <https://github.com/pulp/pulpcore/issues/7574>
Member

What was the default location?
This file belongs in /run/ somewhere.

This directory contains system information data describing the system since it was booted. Files under this directory must be cleared (removed or truncated as appropriate) at the beginning of the boot process.
[...]
System programs that maintain transient UNIX-domain sockets must place them in this [/run] directory or an appropriate subdirectory as outlined above.

https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch03s15.html

Member Author

It defaults to the current directory: https://gunicorn.org/guides/gunicornc/#start-gunicorn-with-control-socket
I wonder why.

Member Author

@pedro-psb pedro-psb Apr 15, 2026

Ok, it was changed in 25.2: benoitc/gunicorn@0ad47db

I guess we should not do anything, then.
This change was meant to improve on gunicorn's default (as of version 25.1), but they have since improved it themselves.

Member Author

Or do you think it's still worth it, to account for the case where 25.1 is installed?
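
If the workaround were kept only for that case, one option is to gate it on the installed version. A rough sketch, assuming (per the comment above) that only the 25.1 series resolves the socket relative to the working directory; the function name is hypothetical and the parsing expects a plain `major.minor[.patch]` version string:

```python
def affected_by_cwd_default(version: str) -> bool:
    """Return True for gunicorn 25.1.x, the series with the CWD-relative default.

    Assumption: releases before 25.1 have no control socket and 25.2+ use a
    different default, as discussed in this thread.
    """
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) == (25, 1)
```

The caller would pass in the installed version string, for example the value returned by `importlib.metadata.version("gunicorn")`, and set the pod-local path only when this returns True.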

Member

As I understand it, gunicorn tries XDG_RUNTIME_DIR first and falls back to HOME.
I would claim that XDG_RUNTIME_DIR should have been set. I'm not sure whether the OS in the container or the container runtime is to blame, but gunicorn's default behaviour seems sound to me, and your change makes it unnecessarily rigid.
We should probably propagate the option instead, so that it stays possible to override it.
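
To make the suggestion concrete, here is a small sketch of the lookup order described above, combined with an explicit value that always wins. It is not gunicorn's implementation: the function names are hypothetical, and placing the file directly under HOME is an assumption, since the thread does not state the exact fallback path.

```python
import os
from typing import Mapping, Optional

SOCKET_NAME = "gunicorn.ctl"  # file name of the 25.1 default


def default_control_socket(env: Mapping[str, str]) -> str:
    """Prefer XDG_RUNTIME_DIR, then fall back to HOME (assumed layout)."""
    runtime_dir = env.get("XDG_RUNTIME_DIR")
    if runtime_dir:
        return os.path.join(runtime_dir, SOCKET_NAME)
    home = env.get("HOME", os.path.expanduser("~"))
    return os.path.join(home, SOCKET_NAME)


def resolve_control_socket(user_value: Optional[str], env: Mapping[str, str]) -> str:
    """An explicitly configured path takes precedence over any default."""
    return user_value if user_value else default_control_socket(env)
```

With this shape, pulpcore-content would only pass a path through when the user supplied one, and would otherwise leave the decision to gunicorn's default.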

Development

Successfully merging this pull request may close these issues.

Control server error: [Errno 13] Permission denied 0 PermissionError(13, 'Permission denied')
