New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes #26393 - fix uptime calculation for btime #6843
Conversation
|
Issues: #26393 |
| test 'should return nil if no uptime fact is available' do | ||
| host = FactoryBot.create(:host) | ||
| assert_nil host.uptime_seconds | ||
| end | ||
|
|
||
| test 'should return uptime and based on backup facts even if there are multiple other facts' do | ||
| host = FactoryBot.create(:host) | ||
| ansible_uptime_fact = FactoryBot.create(:fact_name, name: 'ansible_uptime_seconds', :type => 'FactName::Ansible') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note for the reviewer: this is in fact needed since now we instantiate FactName model, in core we don't have these classes defined
app/models/host/base.rb
Outdated
| @@ -366,7 +366,12 @@ def render_template(template:, **params) | |||
| end | |||
|
|
|||
| def uptime_seconds | |||
| self.uptime_fact&.value&.to_i | |||
| fact = self.uptime_fact | |||
| if fact&.fact_name&.name == 'proc_stat::btime' | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is super hacky. Can we please not hardcode subscription manager specific code in core?
Why do we have the fact parser models if we don't use them to parse facts?
Big 👎 from me. Sorry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we refactor this to put the fact parsing logic into a fact parser?
|
To explain this a little bit more: I think we should make the uptime a field on the host model. This then allows us to use scoped search for this. |
|
Let's add new fact called |
|
I like the idea of a derived fact and to store it somewhere, but there's some things to consider. For example, what would be the fact source on that fact? I must also say that I like the boot time from subscription manager a lot more than the uptime from Facter. It doesn't change as often and to get the current uptime you need to calculate anyway. We can normalize uptime into boot time. |
|
Good point on immutability, let's prefer immutable fact and calculate this in the UI instead! |
|
I agree with Timo, making uptime a first class attribute of host model makes the most sense to me. It's not different to any other information we parse from facts. This will be a bit bigger patch though and this should be fixed in 1.22-stable, so it would be harder and riskier to cherry-pick. Would it be acceptable to take this for 1.22-stable and get the "refactored" PR for develop? Could I base the new work on top of this PR? I also agree with @ekohl, the new attribute should be boottime instead of uptime to avoid unnecessary updates. But I disagree with derived fact, I belive it should be attribute of host object itself. |
Fine for me.
... or a facet. |
|
I was thinking about facet too, but what would that be representing? What other attributes we'd extract? Would it be for facts based values only or generic reported info. Boottime seems as quite generic info, such mac, ip. But the facet may be a way to solve the long standing issue of provisioned vs reported OS. |
I was thinking about data we store from the hosts that we don't actually need. So data we collect to show it in the UI. Basically, some kind of inventory.
I'm not sure I would actually extract something but use it to collect more interesting data in the future. |
|
I added a second commit that adds a new facet. I'm working on other fact parsers patches. I suppose I'll need to add tests here but I'd appreciate if @timogoebel could confirm this is what you had in mind. Also @ShimShtein this seems to be the first facet in core, any comments from that perspective? Since it's just one attribute, I didn't want to add new tab to host detail page. Btw with this, people can search using |
test/models/host_test.rb
Outdated
| FactoryBot.create(:fact_value, fact_name: fact, host: host, :value => 123) | ||
| assert_equal 123, host.uptime_seconds | ||
| boot_time = 1.day.ago | ||
| reported_data = host.create_reported_data(:boot_time => boot_time) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lint/UselessAssignment: Useless assignment to variable - reported_data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ares: Yeah, that's what I had in mind. Looks good, just some comments.
Note that in the plugin PRs tests are missing.
app/models/host/base.rb
Outdated
| @@ -1,7 +1,7 @@ | |||
| module Host | |||
| class Base < ApplicationRecord | |||
| KERNEL_RELEASE_FACTS = [ 'kernelrelease', 'ansible_kernel', 'kernel::release' ] | |||
| UPTIME_FACTS = [ 'system_uptime::seconds', 'ansible_uptime_seconds', 'uptime_seconds', 'proc_stat::btime' ] | |||
| UPTIME_FACTS = [ 'system_uptime::seconds', 'ansible_uptime_seconds', 'uptime_seconds' ] | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this still used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are still uptime facts which can be useful, similar to kernel release fact, now these are really just uptime facts, but if you prefer I can drop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't use this somewhere, I'd prefer to drop it.
app/models/host/base.rb
Outdated
| @@ -234,6 +235,14 @@ def set_interfaces(parser) | |||
| self.interfaces.reload | |||
| end | |||
|
|
|||
| def set_reported_data(parser) | |||
| if parser.boot_time | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return unless parser.boot_time
app/models/host/base.rb
Outdated
| @@ -234,6 +235,14 @@ def set_interfaces(parser) | |||
| self.interfaces.reload | |||
| end | |||
|
|
|||
| def set_reported_data(parser) | |||
| if parser.boot_time | |||
| reported_data_facet = self.reported_data || self.build_reported_data | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe extract this to a method so you can just call host.reported_data_facet and get either the saved object or a new one?
Then we could just do reported_data_facet.update(boot_time: parser.boot_time)
test/models/host_test.rb
Outdated
| assert_present host.reported_data.boot_time | ||
| assert_kind_of DateTime, host.reported_data.boot_time | ||
|
|
||
| sleep 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope this is was just for debugging. Note that rails allows you to travel in time.
|
Note, that the facet should be destroyed when the host enters build mode. Just extend foreman/app/models/host/managed.rb Lines 299 to 304 in 870c50a
|
|
updated according to comments, let's see what jenkins says before I squash |
app/models/host/base.rb
Outdated
| @@ -366,7 +375,8 @@ def render_template(template:, **params) | |||
| end | |||
|
|
|||
| def uptime_seconds | |||
| self.uptime_fact&.value&.to_i | |||
| boot_time = self&.reported_data&.boot_time.to_i | |||
| boot_time > 0 ? Time.zone.now.to_i - boot_time : nil | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style/TernaryParentheses: Use parentheses for ternary expressions with complex conditions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about:
boot_time = self&.reported_data&.boot_time
Time.zone.now.to_i - boot_time.to_i unless boot_time.nil? # or if boot_timeThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do we know what timezone the boot time is in? I wonder if this can lead to things like -3 hours uptime in the UI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about:
I reverted to .nil? and moved to_i part lower, I think it's more readable than statement modified at the end of the method
How do we know what timezone the boot time is in? I wonder if this can lead to things like -3 hours uptime in the UI.
the bootime in /proc/stat is I believe in system time zone, the parsing uses Time.at which assumes boot time in UTC (# of seconds since unix epoch does not have time zone info). When we display it to user, current timezone is used to convert the time in Time.zone.now, the same for boot_time thanks to active record. When I was testing this, it seemed to work fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that the to_i removes any information about timezone, what makes maybe the Time.zone misleading. Time.now.to_i should give the same result.
app/services/puppet_fact_parser.rb
Outdated
| @@ -132,6 +132,10 @@ def support_interfaces_parsing? | |||
| true | |||
| end | |||
|
|
|||
| def boot_time | |||
| Time.zone.now.to_i - facts[:uptime_seconds].to_i | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The (non-deprecated) Facter 3 fact is system_uptime:
$ facter --json system_uptime
{
"system_uptime": {
"days": 4,
"hours": 100,
"seconds": 361191,
"uptime": "4 days"
}
}
That means you should first look for facts[:system_uptime][:seconds] and then fall back. I don't know if the nesting properly converts keys to symbols so probably good to check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty good already., just two more comments.
I haven't tested facter yet, if anyone would like to help out :P
app/helpers/hosts_helper.rb
Outdated
| @@ -276,7 +276,7 @@ def overview_fields(host) | |||
| fields += [[_("Operating System"), link_to(host.operatingsystem.to_label, hosts_path(:search => %{os_title = "#{host.operatingsystem.title}"}))]] if host.operatingsystem.present? | |||
| fields += [[_("PXE Loader"), host.pxe_loader]] if host.operatingsystem.present? && !host.image_build? | |||
| fields += [[_("Host group"), link_to(host.hostgroup, hosts_path(:search => %{hostgroup_title = "#{host.hostgroup}"}))]] if host.hostgroup.present? | |||
| fields += [[_("Uptime"), time_ago_in_words(host.uptime_seconds.seconds.from_now)]] if host.uptime_seconds.present? | |||
| fields += [[_("Boot time"), date_time_relative(host&.reported_data&.boot_time) || _('Not reported')]] | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not reported is never going to show up as date_time_relative will handle nil value in the react component as 'N/A'.
Could you eighter remove the fallback, or handle it somewhat like (uglier code, but nicer value):
| fields += [[_("Boot time"), date_time_relative(host&.reported_data&.boot_time) || _('Not reported')]] | |
| fields += [[_("Boot time"), (boot_time = host&.reported_data&.boot_time) ? date_time_relative(boot_time) : _('Not reported')]] |
Or add the option/quark default_value to the date_time_relative method and propagate it through mount_date_component to the react - the best solution 😛
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took your suggestion, also squashed all back to just 2 commits since I feel we're getting closer, they should be merged separately at the end
app/models/host/base.rb
Outdated
| @@ -366,7 +375,8 @@ def render_template(template:, **params) | |||
| end | |||
|
|
|||
| def uptime_seconds | |||
| self.uptime_fact&.value&.to_i | |||
| boot_time = self&.reported_data&.boot_time | |||
| boot_time.nil? ? Time.zone.now.to_i - boot_time.to_i : nil | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is reversed, test failures are most likely related to that:
| boot_time.nil? ? Time.zone.now.to_i - boot_time.to_i : nil | |
| boot_time.present? ? Time.zone.now.to_i - boot_time.to_i : nil | |
| # OR | |
| boot_time && Time.zone.now.to_i - boot_time.to_i |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, I figured while fixing tests :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, will retest with puppet after the jenkins ack and I believe we are set 👍
|
[test foreman] (broken pipe) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests are failing, but it's just because of the tests.
| boot_time = 1.day.ago | ||
| host.create_reported_data(:boot_time => boot_time) | ||
| refute_nil host.uptime_seconds | ||
| host.clear_data_on_build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe, we need to reload after this, the record is destroyed, but still loaded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in fact it needed more tweaking so that clear_data_on_build would actually work (setting old and build)
test/models/host_test.rb
Outdated
| assert host.import_facts(raw['facts']) | ||
| first_boot_time = host.reported_data.boot_time | ||
| refute_nil host.reported_data.boot_time | ||
| assert_kind_of DateTime, host.reported_data.boot_time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fails, but I don't like asserting exact type match in dynamicaly typed language anyway.
Cant we predict the value and assert the values are equal? The equality should be handled more predictably.
Time is bit tricky, but something like this would maybe do?
| assert_kind_of DateTime, host.reported_data.boot_time | |
| assert (Time.now - host.reported_data.boot_time - raw['facts']['uptime_seconds'].to_i.seconds) < 2.seconds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to do something smarter with freeze_time
| freeze_time do | ||
| host = Host.import_host(raw['name'], 'puppet') | ||
| assert host.import_facts(raw['facts']) | ||
| first_boot_time = host.reported_data.boot_time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lint/UselessAssignment: Useless assignment to variable - first_boot_time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh dear, fixed cop now
test/models/host_test.rb
Outdated
| test 'should import facts boot time to report data facet' do | ||
| refute Host.find_by_name('sinn1636.lan') | ||
| raw = read_json_fixture('facts/facts_with_certname.json') | ||
| first_boot_time = nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Layout/TrailingWhitespace: Trailing whitespace detected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🙈
| raw = read_json_fixture('facts/facts_with_certname.json') | ||
| first_boot_time = nil | ||
| freeze_time do | ||
| host = Host.import_host(raw['name'], 'puppet') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you should define the host variable outside of the block. It is undefined in the other one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah I did the same for first_boot_time already, importing needs to happen while time is frozen... feel free to wait with the review until tests are green, I'm monitoring jenkins output here :-)
A new Reported Data facet is introduced. This should be used for storing reported information from various fact parsers. Fact Parser now defines a new optional to implement method, to gather boot time in seconds. Some fact sources will need to construct this based on uptime. Hosts can be now search by boot_time search attribute.
|
I believe that failures are now unrelated and this beast can be merged :-) (please keep 2 separate commits) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code LGTM, works fine.
Thanks @ares 👍 and thanks @timogoebel for most of the review, to @ekohl and @lzap for valuable notes 😊 Merging!
Thet source of uptime information coming from subscription manager does
not mean uptime but boot time, so we need to convert it in uptime
seconds method.