DLPX-73603 Grub should always default to the kernel version that comes with the active Delphix version #10

Merged (1 commit) on Jan 11, 2021

Contributor

@pzakha pzakha commented Jan 6, 2021

The "grub-mkconfig" utility is responsible for generating the "grub.cfg" configuration. It lists the kernel binaries installed on the system and generates entries in the grub menu for each kernel version. It also generates a "default" entry, which points to the latest kernel version. The "latest" kernel version is determined by doing a version string comparison of all the kernel versions.
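
As a rough illustration of the version-string comparison described above, the sketch below picks the highest entry from a list using GNU `sort -V`. This is not grub's own code: `pick_latest_by_version` is a hypothetical helper and the version strings are made up.

```shell
#!/bin/sh
# Sketch only: pick the "latest" kernel purely by version-string ordering.
# pick_latest_by_version is a hypothetical helper, not a grub function.
pick_latest_by_version() {
    # Print one version per line, sort with version ordering (GNU sort -V),
    # and keep the last (highest) entry.
    printf '%s\n' "$@" | sort -V | tail -n 1
}

# Made-up version strings for illustration.
pick_latest_by_version 5.4.0-42-generic 5.3.0-53-generic 5.4.0-9-generic
```

The call prints `5.4.0-42-generic`: the ordering depends only on the strings, not on which kernel shipped with the active image.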

The problem is that the "latest" kernel version as determined by string comparison will not necessarily coincide with the "latest" kernel version from the Delphix Engine's perspective. From the DE's perspective the "latest" kernel is the one that comes with the latest Delphix image, which might not necessarily have the highest version string. To figure out which kernel version is the latest, we must look at the "delphix-kernel" package name that is listed in the delphix-entire package list at "/usr/share/doc/delphix-entire-*/packages.list.gz".
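
A minimal sketch of that lookup follows. The line format of packages.list.gz is not shown in this PR, so the sketch assumes each line looks like `<package-name>=<package-version>` and that the kernel version is the part of the name after `delphix-kernel-`. The helper name `kernel_version_from_list` and the sample contents are hypothetical.

```shell
#!/bin/sh
# Sketch only. Assumed (not confirmed by the PR text): each line of
# packages.list.gz looks like "<package-name>=<package-version>", and the
# kernel version is the part of the name after "delphix-kernel-".
kernel_version_from_list() {
    # $1: path to a gzip-compressed package list
    gzip -dc "$1" | sed -n 's/^delphix-kernel-\([^=]*\)=.*$/\1/p' | head -n 1
}

# Demo against a throwaway list with made-up contents.
tmp=$(mktemp -d)
printf '%s\n' 'delphix-platform-aws=1.0.0' \
    'delphix-kernel-5.4.0-42-aws=1.0.0' | gzip -c > "$tmp/packages.list.gz"
kernel_version_from_list "$tmp/packages.list.gz"
rm -rf "$tmp"
```

With the sample list, the demo prints `5.4.0-42-aws`. If the real list uses a different line format, only the `sed` expression would need to change.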

Open question

Should we explicitly fail if version_delphix_latest() returns nothing?

Testing

  • ab-pre-push: http://selfservice.jenkins.delphix.com/job/devops-gate/job/master/job/appliance-build-orchestrator-pre-push/4565/
  • I've tested manually that the proper kernel version is being selected by simulating multiple kernel versions being installed. Grub looks at vmlinuz binaries in /boot/vmlinuz-* to determine which kernels are installed, so I just created some fake entries there with a higher version number and checked that the proper kernel version is selected by default by grub.
  • TODO: test this in a real-life scenario by doing an upgrade to a Delphix image that has a lower kernel version string.
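
The manual test in the second bullet can be rehearsed safely in a scratch directory rather than in /boot. The sketch below only sets up fake `vmlinuz-*` entries and confirms that the fake one sorts higher; it does not run grub. The version strings, the `expected` value, and the `list_kernels` helper are all made up for illustration.

```shell
#!/bin/sh
# Sketch only: rehearse the manual test in a scratch directory instead of /boot.
# The version strings and the "expected" value are made up.
boot=$(mktemp -d)
touch "$boot/vmlinuz-5.4.0-42-generic"   # pretend this ships with the image
touch "$boot/vmlinuz-5.8.0-99-generic"   # fake entry with a higher version

# List installed kernel versions by stripping the "vmlinuz-" prefix.
list_kernels() {
    for f in "$1"/vmlinuz-*; do
        [ -e "$f" ] || continue
        basename "$f" | sed 's/^vmlinuz-//'
    done
}

expected=5.4.0-42-generic
highest=$(list_kernels "$boot" | sort -V | tail -n 1)
echo "highest by version sort: $highest"
echo "expected default:        $expected"
# The fake entry must outrank the expected kernel for the test to be meaningful.
[ "$highest" != "$expected" ] && echo "setup ok: fake kernel sorts higher"
rm -rf "$boot"
```

On a real system, the remaining step would be to regenerate grub.cfg and check that the default entry still points at the expected kernel.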

Contributor

@prakashsurya prakashsurya left a comment

How do you envision this working for hotfixes? e.g. if we install a new kernel package, but don't update the delphix-entire package?

I think, based on this, we should always update the delphix-entire package when we install a hotfix (at least with the new hotfix project stuff).. in which case, this solution would work for that case too.. what do you think?

@prakashsurya
Contributor

Note that zfs-precommit has been deprecated and we now have to use ab-pre-push --zfs-tests.

I don't think this is relevant, since we still use the underlying test implementation within Jenkins, which is what I was referring to. Please correct me if I'm wrong, though.

I don't expect that to happen without appliance-build being involved and therefore delphix-entire being updated.

I'm OK with that approach, but I think this means (with this change) we do not support installing a new kernel without going through the normal upgrade/hotfix/etc. process; whereas we do support this workflow for other packages (at least, internally).

Personally, I'm good with that "drawback".

@pzakha
Contributor Author

pzakha commented Jan 7, 2021

I don't think this is relevant, since we still use the underlying test implementation within Jenkins, which is what I was referring to. Please correct me if I'm wrong, though.

Yeah, this Jenkins implementation doesn't work anymore following the linux-pkg rework, so an appliance-build with the new zfs bits is now required to run appliance-build.

I'm OK with that approach, but I think this means (with this change) we do not support installing a new kernel without going through the normal upgrade/hotfix/etc. process; whereas we do support this workflow for other packages (at least, internally).

Yeah, that's mostly what it means. We could still theoretically provide some kernel changes without modifying the kernel version (similar to what I did for the current kernel we are using on 6.0, by manually repackaging an existing kernel package and modifying the package's revision), but I find that whole process quite gross so I'd rather avoid it ;).

@prakashsurya
Contributor

OK, so I think I'm good with this idea in general, but I'd like to simplify the code a bit such that it relies on a delphix-entire package being installed.

Ideally, I'd like to throw an error if grub-mkconfig is run without delphix-entire being installed, but we'll need to verify that'll work properly for upgrade, since currently we install the new delphix-platform package first, before we install the new delphix-entire package.. e.g. if we install the new delphix-kernel package when we install the new delphix-platform package, the old delphix-entire package will be installed at that time.

@pzakha
Contributor Author

pzakha commented Jan 8, 2021

I've posted an update which now fails if the latest delphix-kernel version is not found.

I've tested that it works for initial install: http://selfservice.jenkins.delphix.com/job/devops-gate/job/master/job/appliance-build-orchestrator-pre-push/4569/

I've also tested that it fails if the delphix-appliance/platform file is missing or if it failed to retrieve the delphix-kernel version from the packages-list file:

$ sudo grub-mkconfig -o /mnt/boot/grub/grub.cfg
Sourcing file `/etc/default/grub'
Sourcing file `/etc/default/grub.d/kdump-tools.cfg'
Sourcing file `/etc/default/grub.d/override.cfg'
Generating grub configuration file ...
cat: /var/lib/delphix-appliance/platform: No such file or directory
$ sudo grub-mkconfig -o /mnt/boot/grub/grub.cfg
Sourcing file `/etc/default/grub'
Sourcing file `/etc/default/grub.d/kdump-tools.cfg'
Sourcing file `/etc/default/grub.d/override.cfg'
Generating grub configuration file ...
Error: Failed to retrieve latest delphix-kernel version from '/usr/share/doc/delphix-entire-aws/packages.list.gz'
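
The two failure modes shown above could be produced by logic along these lines. This is a sketch, not the merged code: `find_delphix_kernel` and its arguments are hypothetical, and the package-list line format (`delphix-kernel-<version>=<pkg-version>`) is an assumption. The paths in the comments and the error text are taken from the output above.

```shell
#!/bin/sh
# Sketch only: fail loudly when the kernel version cannot be determined.
# find_delphix_kernel and its arguments are hypothetical; the package-list
# line format ("delphix-kernel-<version>=<pkg-version>") is an assumption.
find_delphix_kernel() {
    platform_file=$1   # e.g. /var/lib/delphix-appliance/platform
    doc_root=$2        # e.g. /usr/share/doc
    # cat reports a missing platform file on stderr by itself.
    platform=$(cat "$platform_file") || return 1
    list="$doc_root/delphix-entire-$platform/packages.list.gz"
    version=$(gzip -dc "$list" 2>/dev/null |
        sed -n 's/^delphix-kernel-\([^=]*\)=.*$/\1/p' | head -n 1)
    if [ -z "$version" ]; then
        echo "Error: Failed to retrieve latest delphix-kernel version from '$list'" >&2
        return 1
    fi
    echo "$version"
}

# Demo: a missing platform file makes the lookup fail instead of guessing.
find_delphix_kernel /nonexistent/platform /usr/share/doc || echo "lookup failed"
```

Returning a non-zero status in both cases lets the caller abort config generation rather than silently fall back to the highest version string.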

I've looked at the upgrade workflow and we always run rootfs-container set-bootfs as the last step of the upgrade, which will take care of updating grub config.

@pzakha pzakha merged commit 5550664 into delphix:master Jan 11, 2021
pzakha added a commit to pzakha/grub2 that referenced this pull request Oct 28, 2021
pzakha added a commit to pzakha/grub2 that referenced this pull request Oct 28, 2021
prakashsurya pushed a commit that referenced this pull request Apr 1, 2025