Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes #26393 - fix uptime calculation for btime #6843

Merged
merged 2 commits into from Jul 19, 2019

Conversation

ares
Copy link
Member

@ares ares commented Jun 14, 2019

Thet source of uptime information coming from subscription manager does
not mean uptime but boot time, so we need to convert it in uptime
seconds method.

@theforeman-bot
Copy link
Member

Issues: #26393

test 'should return nil if no uptime fact is available' do
host = FactoryBot.create(:host)
assert_nil host.uptime_seconds
end

test 'should return uptime and based on backup facts even if there are multiple other facts' do
host = FactoryBot.create(:host)
ansible_uptime_fact = FactoryBot.create(:fact_name, name: 'ansible_uptime_seconds', :type => 'FactName::Ansible')
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note for the reviewer: this is in fact needed since now we instantiate FactName model, in core we don't have these classes defined

@@ -366,7 +366,12 @@ def render_template(template:, **params)
end

def uptime_seconds
self.uptime_fact&.value&.to_i
fact = self.uptime_fact
if fact&.fact_name&.name == 'proc_stat::btime'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is super hacky. Can we please not hardcode subscription manager specific code in core?
Why do we have the fact parser models if we don't use them to parse facts?

Big 👎 from me. Sorry.

Copy link
Member

@timogoebel timogoebel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we refactor this to put the fact parsing logic into a fact parser?

@timogoebel
Copy link
Member

To explain this a little bit more:

I think we should make the uptime a field on the host model. This then allows us to use scoped search for this.
This field should be updated with every fact import as we do with all other fields during a fact import. Note that we need to clear the field when the host is put in build mode.
Impelemting this feature this way we can add proper tests in the respective plugin's code and not add more technical dept here in core. And the data is still available when the facts expire.

@lzap
Copy link
Member

lzap commented Jun 18, 2019

Let's add new fact called foreman_uptime or foreman::uptime and modify our importers to set it appropriately.

@ekohl
Copy link
Member

ekohl commented Jun 18, 2019

I like the idea of a derived fact and to store it somewhere, but there's some things to consider. For example, what would be the fact source on that fact?

I must also say that I like the boot time from subscription manager a lot more than the uptime from Facter. It doesn't change as often and to get the current uptime you need to calculate anyway. We can normalize uptime into boot time.

@lzap
Copy link
Member

lzap commented Jun 18, 2019

Good point on immutability, let's prefer immutable fact and calculate this in the UI instead!

@ares
Copy link
Member Author

ares commented Jun 24, 2019

I agree with Timo, making uptime a first class attribute of host model makes the most sense to me. It's not different to any other information we parse from facts. This will be a bit bigger patch though and this should be fixed in 1.22-stable, so it would be harder and riskier to cherry-pick. Would it be acceptable to take this for 1.22-stable and get the "refactored" PR for develop? Could I base the new work on top of this PR?

I also agree with @ekohl, the new attribute should be boottime instead of uptime to avoid unnecessary updates. But I disagree with derived fact, I belive it should be attribute of host object itself.

@timogoebel
Copy link
Member

Would it be acceptable to take this for 1.22-stable and get the "refactored" PR for develop?

Fine for me.

But I disagree with derived fact, I belive it should be attribute of host object itself.

... or a facet.

@ares
Copy link
Member Author

ares commented Jun 25, 2019

I was thinking about facet too, but what would that be representing? What other attributes we'd extract? Would it be for facts based values only or generic reported info. Boottime seems as quite generic info, such mac, ip. But the facet may be a way to solve the long standing issue of provisioned vs reported OS.

@timogoebel
Copy link
Member

I was thinking about facet too, but what would that be representing?

I was thinking about data we store from the hosts that we don't actually need. So data we collect to show it in the UI. Basically, some kind of inventory.

What other attributes we'd extract?

I'm not sure I would actually extract something but use it to collect more interesting data in the future.

@ares
Copy link
Member Author

ares commented Jun 30, 2019

I added a second commit that adds a new facet. I'm working on other fact parsers patches. I suppose I'll need to add tests here but I'd appreciate if @timogoebel could confirm this is what you had in mind. Also @ShimShtein this seems to be the first facet in core, any comments from that perspective? Since it's just one attribute, I didn't want to add new tab to host detail page.

Btw with this, people can search using boot_time > "1 week ago". If we're adding more reported data in future (e.g. OS) maybe we should prefix search terms accordingly from the beginning

FactoryBot.create(:fact_value, fact_name: fact, host: host, :value => 123)
assert_equal 123, host.uptime_seconds
boot_time = 1.day.ago
reported_data = host.create_reported_data(:boot_time => boot_time)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lint/UselessAssignment: Useless assignment to variable - reported_data.

Copy link
Member

@timogoebel timogoebel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ares: Yeah, that's what I had in mind. Looks good, just some comments.

Note that in the plugin PRs tests are missing.

@@ -1,7 +1,7 @@
module Host
class Base < ApplicationRecord
KERNEL_RELEASE_FACTS = [ 'kernelrelease', 'ansible_kernel', 'kernel::release' ]
UPTIME_FACTS = [ 'system_uptime::seconds', 'ansible_uptime_seconds', 'uptime_seconds', 'proc_stat::btime' ]
UPTIME_FACTS = [ 'system_uptime::seconds', 'ansible_uptime_seconds', 'uptime_seconds' ]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still used?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are still uptime facts which can be useful, similar to kernel release fact, now these are really just uptime facts, but if you prefer I can drop.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't use this somewhere, I'd prefer to drop it.

@@ -234,6 +235,14 @@ def set_interfaces(parser)
self.interfaces.reload
end

def set_reported_data(parser)
if parser.boot_time
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return unless parser.boot_time

@@ -234,6 +235,14 @@ def set_interfaces(parser)
self.interfaces.reload
end

def set_reported_data(parser)
if parser.boot_time
reported_data_facet = self.reported_data || self.build_reported_data
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe extract this to a method so you can just call host.reported_data_facet and get either the saved object or a new one?
Then we could just do reported_data_facet.update(boot_time: parser.boot_time)

assert_present host.reported_data.boot_time
assert_kind_of DateTime, host.reported_data.boot_time

sleep 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope this is was just for debugging. Note that rails allows you to travel in time.

@timogoebel
Copy link
Member

Note, that the facet should be destroyed when the host enters build mode. Just extend clear_data_on_build. A test for this would be great as well.

def clear_data_on_build
return unless respond_to?(:old) && old && build? && !old.build?
clear_facts
clear_reports
self.build_errors = nil
end

@ares
Copy link
Member Author

ares commented Jul 3, 2019

updated according to comments, let's see what jenkins says before I squash

@@ -366,7 +375,8 @@ def render_template(template:, **params)
end

def uptime_seconds
self.uptime_fact&.value&.to_i
boot_time = self&.reported_data&.boot_time.to_i
boot_time > 0 ? Time.zone.now.to_i - boot_time : nil

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style/TernaryParentheses: Use parentheses for ternary expressions with complex conditions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about:

boot_time = self&.reported_data&.boot_time
Time.zone.now.to_i - boot_time.to_i unless boot_time.nil? # or if boot_time

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we know what timezone the boot time is in? I wonder if this can lead to things like -3 hours uptime in the UI.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about:

I reverted to .nil? and moved to_i part lower, I think it's more readable than statement modified at the end of the method

How do we know what timezone the boot time is in? I wonder if this can lead to things like -3 hours uptime in the UI.

the bootime in /proc/stat is I believe in system time zone, the parsing uses Time.at which assumes boot time in UTC (# of seconds since unix epoch does not have time zone info). When we display it to user, current timezone is used to convert the time in Time.zone.now, the same for boot_time thanks to active record. When I was testing this, it seemed to work fine.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that the to_i removes any information about timezone, what makes maybe the Time.zone misleading. Time.now.to_i should give the same result.

@@ -132,6 +132,10 @@ def support_interfaces_parsing?
true
end

def boot_time
Time.zone.now.to_i - facts[:uptime_seconds].to_i
Copy link
Member

@ekohl ekohl Jul 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The (non-deprecated) Facter 3 fact is system_uptime:

$ facter --json system_uptime
{
  "system_uptime": {
    "days": 4,
    "hours": 100,
    "seconds": 361191,
    "uptime": "4 days"
  }
}

That means you should first look for facts[:system_uptime][:seconds] and then fall back. I don't know if the nesting properly converts keys to symbols so probably good to check.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

@ezr-ondrej ezr-ondrej self-assigned this Jul 16, 2019
Copy link
Member

@ezr-ondrej ezr-ondrej left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good already., just two more comments.
I haven't tested facter yet, if anyone would like to help out :P

@@ -276,7 +276,7 @@ def overview_fields(host)
fields += [[_("Operating System"), link_to(host.operatingsystem.to_label, hosts_path(:search => %{os_title = "#{host.operatingsystem.title}"}))]] if host.operatingsystem.present?
fields += [[_("PXE Loader"), host.pxe_loader]] if host.operatingsystem.present? && !host.image_build?
fields += [[_("Host group"), link_to(host.hostgroup, hosts_path(:search => %{hostgroup_title = "#{host.hostgroup}"}))]] if host.hostgroup.present?
fields += [[_("Uptime"), time_ago_in_words(host.uptime_seconds.seconds.from_now)]] if host.uptime_seconds.present?
fields += [[_("Boot time"), date_time_relative(host&.reported_data&.boot_time) || _('Not reported')]]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not reported is never going to show up as date_time_relative will handle nil value in the react component as 'N/A'.
Could you eighter remove the fallback, or handle it somewhat like (uglier code, but nicer value):

Suggested change
fields += [[_("Boot time"), date_time_relative(host&.reported_data&.boot_time) || _('Not reported')]]
fields += [[_("Boot time"), (boot_time = host&.reported_data&.boot_time) ? date_time_relative(boot_time) : _('Not reported')]]

Or add the option/quark default_value to the date_time_relative method and propagate it through mount_date_component to the react - the best solution 😛

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took your suggestion, also squashed all back to just 2 commits since I feel we're getting closer, they should be merged separately at the end

@@ -366,7 +375,8 @@ def render_template(template:, **params)
end

def uptime_seconds
self.uptime_fact&.value&.to_i
boot_time = self&.reported_data&.boot_time
boot_time.nil? ? Time.zone.now.to_i - boot_time.to_i : nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is reversed, test failures are most likely related to that:

Suggested change
boot_time.nil? ? Time.zone.now.to_i - boot_time.to_i : nil
boot_time.present? ? Time.zone.now.to_i - boot_time.to_i : nil
# OR
boot_time && Time.zone.now.to_i - boot_time.to_i

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I figured while fixing tests :-)

Copy link
Member

@ezr-ondrej ezr-ondrej left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, will retest with puppet after the jenkins ack and I believe we are set 👍

@ezr-ondrej ezr-ondrej self-requested a review July 16, 2019 15:03
ezr-ondrej
ezr-ondrej previously approved these changes Jul 16, 2019
@ares
Copy link
Member Author

ares commented Jul 16, 2019

[test foreman] (broken pipe)

Copy link
Member

@ezr-ondrej ezr-ondrej left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests are failing, but it's just because of the tests.

boot_time = 1.day.ago
host.create_reported_data(:boot_time => boot_time)
refute_nil host.uptime_seconds
host.clear_data_on_build
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe, we need to reload after this, the record is destroyed, but still loaded.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in fact it needed more tweaking so that clear_data_on_build would actually work (setting old and build)

assert host.import_facts(raw['facts'])
first_boot_time = host.reported_data.boot_time
refute_nil host.reported_data.boot_time
assert_kind_of DateTime, host.reported_data.boot_time
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fails, but I don't like asserting exact type match in dynamicaly typed language anyway.
Cant we predict the value and assert the values are equal? The equality should be handled more predictably.
Time is bit tricky, but something like this would maybe do?

Suggested change
assert_kind_of DateTime, host.reported_data.boot_time
assert (Time.now - host.reported_data.boot_time - raw['facts']['uptime_seconds'].to_i.seconds) < 2.seconds

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to do something smarter with freeze_time

freeze_time do
host = Host.import_host(raw['name'], 'puppet')
assert host.import_facts(raw['facts'])
first_boot_time = host.reported_data.boot_time

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lint/UselessAssignment: Useless assignment to variable - first_boot_time.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh dear, fixed cop now

test 'should import facts boot time to report data facet' do
refute Host.find_by_name('sinn1636.lan')
raw = read_json_fixture('facts/facts_with_certname.json')
first_boot_time = nil

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Layout/TrailingWhitespace: Trailing whitespace detected.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙈

raw = read_json_fixture('facts/facts_with_certname.json')
first_boot_time = nil
freeze_time do
host = Host.import_host(raw['name'], 'puppet')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should define the host variable outside of the block. It is undefined in the other one.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I did the same for first_boot_time already, importing needs to happen while time is frozen... feel free to wait with the review until tests are green, I'm monitoring jenkins output here :-)

A new Reported Data facet is introduced. This should be used for storing
reported information from various fact parsers. Fact Parser now defines
a new optional to implement method, to gather boot time in seconds. Some
fact sources will need to construct this based on uptime.

Hosts can be now search by boot_time search attribute.
@ares
Copy link
Member Author

ares commented Jul 19, 2019

I believe that failures are now unrelated and this beast can be merged :-) (please keep 2 separate commits)

Copy link
Member

@ezr-ondrej ezr-ondrej left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code LGTM, works fine.
Thanks @ares 👍 and thanks @timogoebel for most of the review, to @ekohl and @lzap for valuable notes 😊 Merging!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
7 participants