Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HostNode statistics are missing timestamps #1908

Merged
merged 12 commits into from Mar 3, 2017

Conversation

daniel-bytes
Copy link
Contributor

Fixes 1905

In MessageHandler#on_container_stat and MessageHandler#on_node_stat, Mongoid timestamp values are now manually set, by either parsing incoming data['time'] values or defaulting to Time.now.utc.

Refactored MessageHandler#on_container_stat buffering to use a configurable buffer size.

Additional unit tests added to verify timestamp fields, buffer size refactor code, and on_container_stat calls (previously on_container_stat did not have any unit tests).

Daniel Battaglia added 4 commits February 28, 2017 10:25
…t call.

Adding MessageHandler#on_node_stat unit test.
Adding support for Mongoid timestamps when saving container stats.
Refactoring on_container_stat to use a configurable batch size
@CLAassistant
Copy link

CLAassistant commented Feb 28, 2017

CLA assistant check
All committers have signed the CLA.

Copy link
Contributor

@jakolehm jakolehm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will probably conflict with #1855 (just a note, it can be resolved after pr is merged).

@@ -188,13 +194,17 @@ def on_node_stat(grid, data)
node = grid.host_nodes.find_by(node_id: data['id'])
return unless node
data = fixnums_to_float(data)
time = data['time'] ? Time.parse(data['time']) : Time.now.utc
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't actually send time from the agent.. should we? /cc @nevalla

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, it looks like both the agent host and container stats are missing any time field:

The agent events may be queued up for long periods of time if there's e.g. websocket connection issues.

IMO either also fix the agent side as part of this PR, or file a separate issue for that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, it would be more reliable if agent sends the time. Could be also a separate PR.

@jakolehm jakolehm added this to the 1.2.0 milestone Feb 28, 2017
@daniel-bytes
Copy link
Contributor Author

Had to fix several merge conflicts as a result of refactoring done in PR 1855

@daniel-bytes
Copy link
Contributor Author

daniel-bytes commented Mar 1, 2017

@jakolehm @nevalla @SpComb I have a question: Since these data points are immutable facts, would it make more sense to only store the created_at timestamp in the database? It looks like mongoid provides this. Considering we will probably store a lot of these statistics, it could end up saving a decent amount of duplicated data.

Proposal:
In container_stat.rb and host_node_stat.rb,
change
include Mongoid::Timestamps
to
include Mongoid::Timestamps::Created

@jakolehm
Copy link
Contributor

jakolehm commented Mar 1, 2017

@daniel-bytes yes, that makes sense.

@daniel-bytes
Copy link
Contributor Author

Updated PR to handle timestamp values being sent from the agent. Added unit tests on agent side.


describe '#publish_node_stats' do
it 'sends stats via rpc' do
expect(rpc_client).to receive(:notification).once do |key, msg|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should be able to compose rspec matchers to do this, without having to use a block:

expect(rpc_client).to receive(:notification).once.with('/nodes/stats', [Hash])

Or even better for this spec:

expect(rpc_client).to receive(:notification).once.with('/nodes/stats', [hash_including(time: String)])

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, let me take a look. I was having issues with hash_including but I probably was doing it wrong.

@@ -107,23 +107,148 @@
#puts bm
total_time = Time.now.to_f - start_time
expect(grid.container_logs.count).to eq(1_000)
expect(total_time < 0.5).to be_truthy
expect(total_time).to be < 0.5
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 for much more useful rspec errors with this kind of expectation

end
end

describe '#on_container_health' do
it 'updates health status' do
describe '#save' do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did this group of #save specs get duplicated during the merge?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes good call I think there is some duplication going on in this spec file. The merge got pretty nasty....

@@ -7,6 +7,11 @@ class NodeHandler

def initialize(grid)
@grid = grid
@db_session = ContainerLog.collection.session.with(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ContainerLog looks a bit odd here, but I suppose it doesn't really make any difference. Maybe use HostNode instead, though?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that makes sense. Let me fix this.

Fixed duplicated unit test after bad merge.
@SpComb
Copy link
Contributor

SpComb commented Mar 2, 2017

Has this been tested to verify that the agent timestamps in the stats entries make it all the way to the server MongoDB intact?

@SpComb SpComb merged commit b1e3163 into master Mar 3, 2017
@SpComb SpComb deleted the fix/hostnode_statistics_missing_timestamp branch March 3, 2017 09:04
@daniel-bytes
Copy link
Contributor Author

E2E tests on my local environment looked good.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants