Show executor metrics #1096

Merged
merged 3 commits into master from philipnrmn/executor-metrics on Nov 15, 2017

philipnrmn

This PR fixes DCOS-19009

Running `dcos task metrics details <task-id>` does the following:

  1. requests container metrics from .../metrics/v0/containers/<container-id>
  2. requests app metrics from .../metrics/v0/containers/<container-id>/app
  3. joins the two and prints them out in a table

In some cases, container metrics are not present and the dcos-metrics API returns an empty 204 response. We were assuming in those cases that there were no app metrics either. That assumption does not hold for executors, such as those used by DC/OS SDK frameworks like Cassandra and Kafka.

With this PR, we catch the empty case and poll app metrics as well.
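
A minimal sketch of the flow described above, not the actual dcos-cli code. The `http_get` callable, the `base_url` parameter, the helper names, and the `datapoints` response key are assumptions made for illustration; only the two endpoint paths and the 204 behaviour come from the description.

```python
class EmptyMetricsException(Exception):
    """Stand-in for the PR's exception: the API answered 204 with no body."""

    def __init__(self):
        super().__init__('No metrics found')


def fetch_datapoints(url, http_get):
    """Return the datapoints at `url`, raising on an empty 204 response.

    `http_get` is a hypothetical callable returning (status_code, json_body).
    """
    status, body = http_get(url)
    if status == 204:
        raise EmptyMetricsException()
    # Assumed response shape: {"datapoints": [...]}
    return body.get('datapoints', [])


def task_metrics(base_url, container_id, http_get):
    """Join container and app datapoints, tolerating missing container metrics."""
    container_url = '{}/metrics/v0/containers/{}'.format(base_url, container_id)
    app_url = container_url + '/app'

    datapoints = []
    try:
        datapoints += fetch_datapoints(container_url, http_get)
    except EmptyMetricsException:
        # Executors may report no container metrics but still have
        # app metrics, so carry on instead of giving up here.
        pass

    datapoints += fetch_datapoints(app_url, http_get)
    return datapoints
```

If the app endpoint is also empty, the exception still propagates, so the caller can report that no metrics were found.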

philipnrmn self-assigned this on Nov 7, 2017
@@ -9,6 +9,11 @@
emitter = emitting.FlatEmitter()


class EmptyMetricsException(DCOSException):
    def __init__(self):
        DCOSException.__init__(self, 'No metrics found')
Contributor

Since we don't have to support Python 2, could you switch this to `super().__init__('No metrics found')`?

Contributor Author

👍
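
For illustration, the requested form might look like the sketch below. The `DCOSException` class here is a local stand-in so the snippet runs on its own; in the CLI it would be the project's existing base exception.

```python
class DCOSException(Exception):
    """Stand-in for the CLI's base exception (assumption for this sketch)."""


class EmptyMetricsException(DCOSException):
    def __init__(self):
        # Python 3 zero-argument super(): no need to name the base class.
        super().__init__('No metrics found')
```

The behaviour is the same as calling `DCOSException.__init__(self, ...)` directly; the base class name simply no longer has to be repeated.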

    except EmptyMetricsException:
        pass

    app_datapoints = _fetch_metrics_datapoints(app_url)
Contributor

Is the reverse possible: container data exists but app data does not?

Contributor Author

No, there will always be app data because we always pass two app values from the mesos module:

https://github.com/dcos/dcos-metrics/blob/3c96b31b746170981211329d35c96da3eda5843b/mesos_module/container_reader_impl.cpp#L153-L158

bamarni previously approved these changes Nov 8, 2017

LGTM. CI is currently broken, but once we merge #1094 I'll trigger CI here.

bamarni commented Nov 9, 2017

run linux integration tests

bamarni commented Nov 9, 2017

@philipnrmn: there are a few code style errors:

[linux-tests] tests/unit/test_task.py:182:42: E128 continuation line under-indented for visual indent
[linux-tests] tests/unit/test_task.py:183:42: E128 continuation line under-indented for visual indent
[linux-tests] tests/unit/test_task.py:202:1: E302 expected 2 blank lines, found 1

You can check these locally by running `tox -e py35-syntax` in the `cli` directory (after running `make env` and `source env/bin/activate`).
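
For reference, a small illustration of code that satisfies the two pycodestyle checks reported above. The function names and values are made up for the example and are not taken from `tests/unit/test_task.py`.

```python
def describe(name, value):
    return '{}={}'.format(name, value)


def describe_all(pairs):  # E302: two blank lines separate top-level definitions
    return [describe(name, value) for name, value in pairs]


# E128: the continuation line is aligned with the first argument
# after the opening parenthesis.
summary = describe('cpus',
                   0.5)
```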

By the way, I just noticed that CI carries on and runs the tests even though the build is going to fail eventually because of the syntax check. I'll look into it.

When container metrics (`/v0/containers/<c-id>`) returns a 204 empty
response, we should still check the app metrics
(`/v0/containers/<c-id>/app`) endpoint, which may have data.

This test replicates the task metrics details test, but responds with a
204 Empty followed by a 200 OK. Without the accompanying fix, it will
fail.
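
A rough sketch of how such a test could be structured with `unittest.mock`, assuming an HTTP helper that is called once per endpoint. The `fetch` and `details` functions, the URL, and the datapoint contents are hypothetical stand-ins, not the code or fixtures from this PR.

```python
from unittest import mock


class EmptyMetricsException(Exception):
    pass


def fetch(http_get, url):
    # Hypothetical helper: a 204 means there is nothing to parse.
    response = http_get(url)
    if response.status_code == 204:
        raise EmptyMetricsException()
    return response.json()['datapoints']


def details(http_get, container_url):
    datapoints = []
    try:
        datapoints += fetch(http_get, container_url)
    except EmptyMetricsException:
        pass  # no container metrics; app metrics may still exist
    return datapoints + fetch(http_get, container_url + '/app')


def test_empty_container_metrics():
    empty = mock.Mock(status_code=204)
    ok = mock.Mock(status_code=200)
    ok.json.return_value = {'datapoints': [{'name': 'example', 'value': 1}]}

    # side_effect hands out the responses in order: 204 first, then 200.
    http_get = mock.Mock(side_effect=[empty, ok])

    result = details(http_get, 'http://host/v0/containers/c-id')

    assert result == [{'name': 'example', 'value': 1}]
    assert http_get.call_count == 2
    http_get.assert_called_with('http://host/v0/containers/c-id/app')
```

Without the `try`/`except` in `details`, the first 204 would raise and the second request would never be made, which is the failure the test is meant to catch.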

bamarni commented Nov 15, 2017

run linux integration tests

bamarni merged commit 14f9706 into master on Nov 15, 2017
bamarni deleted the philipnrmn/executor-metrics branch on November 15, 2017 15:11
bamarni pushed a commit that referenced this pull request Nov 23, 2017
* Use dedicated exception class when metrics response is empty

* Add test for empty container metrics

* Catch empty metrics exception for container datapoints