Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update host-upgrades addon to use pharos-host-upgrades #393

Merged
merged 22 commits into from
Jun 20, 2018

Conversation

SpComb
Copy link
Contributor

@SpComb SpComb commented May 31, 2018

See https://github.com/kontena/pharos-host-upgrades

Currently using the quay.io/kontena/pharos-host-upgrades:edge build, pending testing and a tagged release.

The defaults run upgrades at midnight, waiting up to two hours for the kube lock, without rebooting the host.

Replaces the kured addon, which needs to be deprecated somehow... This functionality is optional, it needs reboot: true to enable it.

TODO

@SpComb SpComb added the enhancement New feature or request label May 31, 2018
@jakolehm jakolehm added this to the 1.2.0 milestone May 31, 2018
# minimal = yum --bugfix update-minimal
# minimal-security = yum --security update-minimal
# minimal-security-severity:Critical = --sec-severity=Critical update-minimal
update_cmd = default
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this will install all available updated package from all yum repos... my understanding is that CentOS is missing the required repo metadata for things like minimal-security to work, those are only available in RHEL:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, CentOS is missing security metadata. We lock kube/docker packages with yum-lockversion which should make this a bit more safe. For RHEL we could use security or similar?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should probably still consider testing the combination of yum-lockversion and the yum-cron configuration to make sure it doesn't upgrade pinned packages.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested, yum-versionlock works with yum-cron.


// Automatically upgrade packages from these (origin:archive) pairs
Unattended-Upgrade::Allowed-Origins {
// "${distro_id}:${distro_codename}";
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Ubuntu:xenial origin is whitelisted in the default configuration, AFAIK that corresponds to LTS point releases (16.04.x):

// Automatically upgrade packages from these (origin:archive) pairs
Unattended-Upgrade::Allowed-Origins {
	"${distro_id}:${distro_codename}";
	"${distro_id}:${distro_codename}-security";
	// Extended Security Maintenance; doesn't necessarily exist for
	// every release and this system may not have it installed, but if
	// available, the policy for updates is such that unattended-upgrades
	// should also install from here by default.
	"${distro_id}ESM:${distro_codename}";
//	"${distro_id}:${distro_codename}-updates";
//	"${distro_id}:${distro_codename}-proposed";
//	"${distro_id}:${distro_codename}-backports";
};

@jnummelin jnummelin modified the milestones: 1.2.0, 1.1.0 Jun 5, 2018
@@ -1,20 +1,25 @@
# frozen_string_literal: true

Pharos.addon 'host-upgrades' do
version '0.1.0'
version '0.0.0'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a reminder that this need to be updated.

license 'Apache License 2.0'

config {
attribute :interval, Pharos::Types::String
attribute :image_tag, Pharos::Types::String.default("edge")
attribute :schedule, Pharos::Types::String
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This absolutely needs to have some regexp for validation, because anyone configuring this is going to be more likely to get the syntax wrong rather than right, and then be left debugging their CrashLoopBackoff pods... the feedback cycle for fixing this configuration is too long and painful otherwise.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to merge this sooner than later.. if regexp takes time then we should open a new issue and mark it to the 1.2.0 milestone.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need, I already have something, it just needs specs to test it.

attribute :interval, Pharos::Types::String
attribute :image_tag, Pharos::Types::String.default("edge")
attribute :schedule, Pharos::Types::String
attribute :schedule_window, Pharos::Types::String
Copy link
Contributor Author

@SpComb SpComb Jun 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: The --schedule-window defaults to 1h within pharos-host-upgrades itself, which might be a mistake... it should still be possible to disable it by using schedule_window: 0 - or something else that makes sense given the schedule, like schedule_window: 24h.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will now get overridden with --schedule-window=0 by default, which is IMO good behavior.

@SpComb SpComb requested a review from jakolehm June 18, 2018 16:11
@SpComb
Copy link
Contributor Author

SpComb commented Jun 18, 2018

Leaving #420 as a separate issue for the kured deprecation... having both kured and host-upgrades' running with reboot: true` may or may not behave badly.

DURATION_RE = /^(\d+(\.\d+)?(ns|us|ms|s|m|h|))+$/

def schedule?(value)
!!SCHEDULE_RE.match?(value)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering should we use valid cron schedule (we have fugit gem as a dep)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pharos-cluster itself doesn't need to actually parse the cron string, but Fugit::Cron.parse would probably be better than a regexp... unfortunately, the github.com/robfig/cron package used by pharos-host-upgrades doesn't accept the standard five-field cron syntax, it expects an extended cron syntax with a leading seconds field... which is a bad idea IMO, it would be better if this used a standard cron syntax, we don't really need per-second granularity here.

Maybe we should change pharos-host-upgrades to accept a standard five-field cron syntax before releasing this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant that we could use Fugit::Cron.parse to parse standard five-field cron and then addon could convert it to something that pharos-host-upgrades understands.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that conversion should happen within pharos-host-upgrades.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like Fugit::Cron.parse also supports the alternative six-field cron syntax, which confuses things a bit now...

irb(main):003:0> Fugit::Cron.parse('0 0 0 * * *')
=> #<Fugit::Cron:0x0000000001c172c0 @original="0 0 0 * * *", @cron_s=nil, @seconds=[0], @minutes=[0], @hours=[0], @monthdays=nil, @months=nil, @weekdays=nil, @zone=nil, @timezone=nil>

# minimal = yum --bugfix update-minimal
# minimal-security = yum --security update-minimal
# minimal-security-severity:Critical = --sec-severity=Critical update-minimal
update_cmd = default
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, CentOS is missing security metadata. We lock kube/docker packages with yum-lockversion which should make this a bit more safe. For RHEL we could use security or similar?

@@ -5,16 +5,48 @@
license 'Apache License 2.0'

config {
attribute :interval, Pharos::Types::String
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we wanted to make the 1.2 upgrade smoother, we could continue to accept existing addon configurations with interval, and just translate that to schedule: "@every ...".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if it's needed if the new addon complains when schedule is missing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everyone using the existing host-upgrades addon will need to edit their cluster.yml when upgrading to 1.2. That could be avoided.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's ok to edit cluster.yml in this case.

it "fails with long cron schedule" do
result = described_class.validate({enabled: true, schedule: '0 0 0 * * *'})

expect(result).to_not be_success
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having a hard time figuring out how to get Fugit::Cron.parse to reject or normalize the extended seconds syntax without monkey-patching it...

  1) Pharos::Addons::HostUpgrades#validate schedule fails with long cron schedule
     Failure/Error: expect(result).to_not be_success
       expected `#<Dry::Validation::Result output={:enabled=>true, :schedule=>"0 0 0 * * *"} errors={}>.success?` to return false, got true
     # ./spec/pharos/addons/host_upgrades_spec.rb:49:in `block (4 levels) in <top (required)>'
     # ./spec/spec_helper.rb:121:in `block (2 levels) in <top (required)>'
     # ./spec/spec_helper.rb:111:in `block (2 levels) in <top (required)>'

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hacked this in with a trivial monkeypatch to add a missing attr_reader... validating/normalizing the cron schedule was more difficult than expected this way.


module Fugit
class Cron
attr_reader :seconds # XXX: missing from upstream
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should try to push this also to the upstream.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^ this got merged and released upstream as 1.1.2, so I'll still update this PR to use that instead.

Copy link
Contributor

@jakolehm jakolehm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jakolehm
Copy link
Contributor

@SpComb ok to merge?

@SpComb
Copy link
Contributor Author

SpComb commented Jun 20, 2018

@SpComb ok to merge?

Good to go now.

@jakolehm jakolehm merged commit 3db98c5 into master Jun 20, 2018
@jakolehm jakolehm deleted the addons/pharos-host-upgrades branch June 20, 2018 06:32
@jakolehm
Copy link
Contributor

@SpComb docs pr is still missing...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants