Upgrade minikube to 1.33.0 #1375
Conversation
@akalenyu do you want to review this?
Looks great! Just something to consider about leaving an escape hatch for the user to inject files/scripts.
```python
    Must be called after the cluster is started, before running any addon. Not
    needed when starting a stopped cluster.
    """
    _load_sysctl(profile)
```
So this works and is totally fine; however, it might be a tiny bit limiting.
I think it's very likely that new developers on this project will have varying hacking requirements to get minikube (or minikube plus other operators, in my case) working properly.
Would you consider executing user-placed scripts from `~/.drenv/minikube_setup`?
Running the same scripts for every cluster is too broad in scope. The way to run custom code is to add an addon in test/addons/myaddon/start and put your addon first in one of the workers in the clusters you want it to run on.
We don't yet have a mechanism to run an addon before all other addons in a cluster, but we can add something like this, maybe:

```yaml
profiles:
  - name: my-profile
    before:
      # Run before all other addons
      - name: my-before-addon
    run:
      - addons:
        # Run in parallel
      - addons:
        - name: kubevirt
      - addons:
        - name: cdi
    after:
      # Run after all other addons
      - name: my-after-addon
```
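A rough sketch of how a runner could honor such a schema, assuming the before/run/after semantics described in the comments above (`run_profile` and `run_addon` are hypothetical names, not drenv APIs):

```python
from concurrent.futures import ThreadPoolExecutor


def run_profile(profile, run_addon):
    """Run a profile's addons in three phases: before, run, after.

    `profile` is a dict matching the proposed yaml schema; `run_addon`
    is a callable executing a single addon by name (hypothetical).
    """
    # Phase 1: serial, before all other addons.
    for addon in profile.get("before", []):
        run_addon(addon["name"])

    # Phase 2: each "addons" group runs in parallel with the other
    # groups; addons inside one group run serially.
    def run_group(group):
        for addon in group.get("addons", []):
            run_addon(addon["name"])

    with ThreadPoolExecutor() as pool:
        list(pool.map(run_group, profile.get("run", [])))

    # Phase 3: serial, after all other addons.
    for addon in profile.get("after", []):
        run_addon(addon["name"])
```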
If you have a use case for running your own script for every drenv environment, please open an issue describing the use case.
.github/workflows/e2e.yaml
```yaml
- name: Setup drenv
  working-directory: test
  run: drenv setup envs/regional-dr.yaml
```
I see the CI is green, which means this was exercised, and 1.33 works?
No; since we do not yet install the tool dynamically in the CI, this just shows that the changes are compatible with 1.32.0 (the version installed in the CI runner).
test/drenv/__main__.py
```diff
@@ -124,6 +124,12 @@ def handle_termination_signal(signo, frame):
     sys.exit(1)


 def cmd_setup(env, args):
```
Should we document this step in the user-quick-start.md?
Note that in this commit message you set up with `$ drenv setup envs/regional-dr.yaml` but end up starting with `$ drenv start envs/kubevirt.yaml -v`.
> Should we document this step in the user-quick-start.md?

Yes we must, thanks!
> Note that in this commit message you setup `$ drenv setup envs/regional-dr.yaml` but end up starting `$ drenv start envs/kubevirt.yaml -v`
The issue is that command line parsing is too simplistic, so the envfile is required for every command. Like "clear", the new "setup" command requires an env file but does not use it.
I'll fix this in the next version.
> Should we document this step in the user-quick-start.md?
>
> Yes we must, thanks!
The current version sets up minikube by default when creating the virtual env (`make venv`), so this change is transparent for users and developers.
It looks like recent Python does not install setuptools, which is used in the drenv setup.py script. We probably need to switch to a more modern library, but let's first fix the error by installing setuptools. We can look at modernizing the setup later.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Let's keep main() cleaner by moving out the details of command line parsing.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Force-pushed from fa12baf to 8110e02
The previous command line parsing was too simplistic, resulting in several problems:

- The filename argument is required for all commands, even when the command does not use it. Because we validate it, you could not use a dummy string or /dev/null.
- Some command line options have no effect on some commands, but they were still documented (e.g. --skip-*).
- Commands had no help.

This change fixes the issue by creating a command line parser for every command, and unifying the command interface to accept only args. Commands operating on an envfile load it in the command.

Example help with this change:

```
$ drenv --help
usage: drenv [-h] {start,stop,delete,suspend,resume,dump,fetch,clear} ...

options:
  -h, --help            show this help message and exit

commands:
  {start,stop,delete,suspend,resume,dump,fetch,clear}
    start               Start an environment
    stop                Stop an environment
    delete              Delete an environment
    suspend             Suspend virtual machines
    resume              Resume virtual machines
    dump                Dump an environment yaml
    fetch               Cache environment resources
    clear               Cleared cached resources
```

Help for specific commands:

```
$ for cmd in start stop delete suspend resume dump fetch clear; do
    echo
    echo "\$ drenv $cmd --help"
    drenv $cmd --help
done

$ drenv start --help
usage: drenv start [-h] [-v] [--name-prefix PREFIX] [--max-workers N]
                   [--skip-tests] [--skip-addons]
                   filename

positional arguments:
  filename              path to environment file

options:
  -h, --help            show this help message and exit
  -v, --verbose         be more verbose
  --name-prefix PREFIX  prefix profile names
  --max-workers N       maximum number of workers per profile
  --skip-tests          Do not run addons 'test' hooks
  --skip-addons         Do not run addons 'start' hooks

$ drenv stop --help
usage: drenv stop [-h] [-v] [--name-prefix PREFIX] [--max-workers N]
                  [--skip-addons]
                  filename

positional arguments:
  filename              path to environment file

options:
  -h, --help            show this help message and exit
  -v, --verbose         be more verbose
  --name-prefix PREFIX  prefix profile names
  --max-workers N       maximum number of workers per profile
  --skip-addons         Do not run addons 'stop' hooks

$ drenv delete --help
usage: drenv delete [-h] [-v] [--name-prefix PREFIX] [--max-workers N] filename

positional arguments:
  filename              path to environment file

options:
  -h, --help            show this help message and exit
  -v, --verbose         be more verbose
  --name-prefix PREFIX  prefix profile names
  --max-workers N       maximum number of workers per profile

$ drenv suspend --help
usage: drenv suspend [-h] [-v] [--name-prefix PREFIX] [--max-workers N] filename

positional arguments:
  filename              path to environment file

options:
  -h, --help            show this help message and exit
  -v, --verbose         be more verbose
  --name-prefix PREFIX  prefix profile names
  --max-workers N       maximum number of workers per profile

$ drenv resume --help
usage: drenv resume [-h] [-v] [--name-prefix PREFIX] [--max-workers N] filename

positional arguments:
  filename              path to environment file

options:
  -h, --help            show this help message and exit
  -v, --verbose         be more verbose
  --name-prefix PREFIX  prefix profile names
  --max-workers N       maximum number of workers per profile

$ drenv dump --help
usage: drenv dump [-h] [-v] [--name-prefix PREFIX] [--max-workers N] filename

positional arguments:
  filename              path to environment file

options:
  -h, --help            show this help message and exit
  -v, --verbose         be more verbose
  --name-prefix PREFIX  prefix profile names
  --max-workers N       maximum number of workers per profile

$ drenv fetch --help
usage: drenv fetch [-h] [-v] [--name-prefix PREFIX] [--max-workers N] filename

positional arguments:
  filename              path to environment file

options:
  -h, --help            show this help message and exit
  -v, --verbose         be more verbose
  --name-prefix PREFIX  prefix profile names
  --max-workers N       maximum number of workers per profile

$ drenv clear --help
usage: drenv clear [-h] [-v]

options:
  -h, --help     show this help message and exit
  -v, --verbose  be more verbose
```

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
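The per-command parsers described in the commit message can be sketched with argparse sub-parsers. The command names and options below are taken from the help output; the wiring itself is an illustrative assumption, not the actual drenv code:

```python
import argparse


def build_parser():
    # One sub-parser per command, so each command documents only the
    # options it actually supports, and only envfile commands take a
    # filename argument.
    parser = argparse.ArgumentParser(prog="drenv")
    commands = parser.add_subparsers(dest="command", title="commands",
                                     required=True)

    def add_command(name, help_text, envfile=True):
        cmd = commands.add_parser(name, help=help_text)
        cmd.add_argument("-v", "--verbose", action="store_true",
                         help="be more verbose")
        if envfile:
            cmd.add_argument("filename", help="path to environment file")
            cmd.add_argument("--name-prefix", metavar="PREFIX",
                             help="prefix profile names")
            cmd.add_argument("--max-workers", metavar="N", type=int,
                             help="maximum number of workers per profile")
        return cmd

    start = add_command("start", "Start an environment")
    start.add_argument("--skip-tests", action="store_true",
                       help="Do not run addons 'test' hooks")
    start.add_argument("--skip-addons", action="store_true",
                       help="Do not run addons 'start' hooks")

    stop = add_command("stop", "Stop an environment")
    stop.add_argument("--skip-addons", action="store_true",
                      help="Do not run addons 'stop' hooks")

    add_command("delete", "Delete an environment")
    add_command("suspend", "Suspend virtual machines")
    add_command("resume", "Resume virtual machines")
    add_command("dump", "Dump an environment yaml")
    add_command("fetch", "Cache environment resources")
    # "clear" does not operate on an environment file.
    add_command("clear", "Clear cached resources", envfile=False)
    return parser
```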
Minikube supports injecting configuration files into the cluster nodes. This is very useful for solving 2 issues:

- A regression in minikube 1.33 enabling DNSSEC by default. This breaks pulling images from some registries, breaking us badly. But minikube 1.33 has other improvements we would like to consume (such as faster cluster start), so this change works around the regression.
- Random failures when starting a kubevirt VM, failing to create an inotify watch. This is caused by low fs.inotify limits, and can be fixed by increasing the limits. We use the default limits from an OpenShift worker node.

Creating the configuration files is not enough, since minikube injects the files into the VM too late. We need to reload daemons or run sysctl to apply the configuration changes when the cluster starts.

This change adds new "setup" and "cleanup" commands. The "setup" command should run once when setting up a development system. It writes the drenv configuration to $MINIKUBE_HOME/.minikube/files/, which will be used by future clusters. The "cleanup" command removes the changes made by the setup command, and will be used in CI to clean up after a build.

Example usage:

```
$ tree $MINIKUBE_HOME/.minikube/files
/data/tmp/.minikube/files

$ drenv setup -v
2024-05-07 17:49:11,444 INFO [main] Setting up minikube for drenv
2024-05-07 17:49:11,444 DEBUG [minikube] Writing drenv sysctl configuration /data/tmp/.minikube/files/etc/sysctl.d/99-drenv.conf
2024-05-07 17:49:11,445 DEBUG [minikube] Writing drenv systemd-resolved configuration /data/tmp/.minikube/files/etc/systemd/resolved.conf.d/99-drenv.conf
...

$ tree $MINIKUBE_HOME/.minikube/files
/data/tmp/.minikube/files
└── etc
    ├── sysctl.d
    │   └── 99-drenv.conf
    └── systemd
        └── resolved.conf.d
            └── 99-drenv.conf

$ drenv start envs/kubevirt.yaml -v
...
2024-05-07 17:55:04,568 DEBUG [kubevirt] Loading drenv sysctl configuration
2024-05-07 17:55:04,568 DEBUG [kubevirt] Running ['minikube', 'ssh', '--profile', 'kubevirt', 'sudo sysctl -p /etc/sysctl.d/99-drenv.conf']
2024-05-07 17:55:04,783 DEBUG [kubevirt] fs.inotify.max_user_instances = 8192
2024-05-07 17:55:04,783 DEBUG [kubevirt] fs.inotify.max_user_watches = 65536
2024-05-07 17:55:04,789 DEBUG [kubevirt] Loading drenv systemd-resolved configuration
2024-05-07 17:55:04,790 DEBUG [kubevirt] Running ['minikube', 'ssh', '--profile', 'kubevirt', 'sudo systemctl restart systemd-resolved.service']
...

$ minikube ssh -p kubevirt 'cat /etc/systemd/resolved.conf.d/99-drenv.conf'
# Added by drenv setup
[Resolve]
DNSSEC=no

$ minikube ssh -p kubevirt 'cat /etc/sysctl.d/99-drenv.conf'
# Added by drenv setup
fs.inotify.max_user_instances = 8192
fs.inotify.max_user_watches = 65536

$ drenv cleanup -v
2024-05-07 17:50:38,608 INFO [main] Cleaning up minikube
2024-05-07 17:50:38,608 DEBUG [minikube] Removed file /data/tmp/.minikube/files/etc/systemd/resolved.conf.d/99-drenv.conf
2024-05-07 17:50:38,608 DEBUG [minikube] Removed empty directory /data/tmp/.minikube/files/etc/systemd/resolved.conf.d
2024-05-07 17:50:38,609 DEBUG [minikube] Removed empty directory /data/tmp/.minikube/files/etc/systemd
2024-05-07 17:50:38,609 DEBUG [minikube] Removed file /data/tmp/.minikube/files/etc/sysctl.d/99-drenv.conf
2024-05-07 17:50:38,609 DEBUG [minikube] Removed empty directory /data/tmp/.minikube/files/etc/sysctl.d
2024-05-07 17:50:38,609 DEBUG [minikube] Removed empty directory /data/tmp/.minikube/files/etc
...
```

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Minikube 1.33.0 adds useful features, fixes, and performance improvements, but we could not use it because of a regression in systemd-resolved[1].

A critical change in 1.33.0 is upgrading the kernel to 5.10.207. This version fixes a bad bug with minikube 1.32.0, where a qemu assertion fails while starting a kubevirt VM[2] on newer Intel CPUs (i7-12700k).

Now that we set up the systemd-resolved configuration we can upgrade to minikube 1.33.0, and Alex can run the drenv kubevirt environment without manual hacks.

This change:
- Updates the docs so that the latest minikube tested is 1.33.0
- Sets up minikube by default when creating the virtual environment, so minikube 1.33.0 support is added transparently for developers
- Sets up drenv in the e2e job to support minikube 1.33.0
- Cleans up drenv in the e2e job so setup from a previous build will not leak into the next build

[1] kubernetes/minikube#18705
[2] https://gitlab.com/qemu-project/qemu/-/issues/237

Thanks: Alex Kalenyuk <akalenyu@redhat.com>
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
```python
    path = _sysctl_drenv_conf()
    data = """# Added by drenv setup
fs.inotify.max_user_instances = 8192
fs.inotify.max_user_watches = 65536
```
I opened a minikube issue for this; maybe they will adopt it and we can remove this code soon:
kubernetes/minikube#18831
Posted a PR to minikube:
kubernetes/minikube#18832
```python
    data = """# Added by drenv setup
[Resolve]
DNSSEC=no
"""
```
Minikube is adopting this workaround, so we can probably remove this very soon:
kubernetes/minikube#18830
Minikube 1.33.0 adds useful features, fixes, and performance
improvements, but we could not use it because of a regression in
systemd-resolved[1].
A critical change in 1.33.0 is upgrading the kernel to 5.10.207. This
version fixes a bad bug with minikube 1.32.0, where a qemu assertion fails
while starting a kubevirt VM[2] on newer Intel CPUs (i7-12700k).
This change:
drenv configuration includes:
With the new configuration we can upgrade to minikube 1.33.0, and Alex can run the drenv
kubevirt environment without manual hacks.
When the configuration changes are released in minikube, we can remove the code in
drenv.minikube, but I would keep the infrastructure for adding and loading configuration,
so that the next time we have an issue we will have an easy way to fix it.
[1] kubernetes/minikube#18705
[2] https://gitlab.com/qemu-project/qemu/-/issues/237