This repository has been archived by the owner on Jun 10, 2021. It is now read-only.

Add process CPU usage measurement node #59

Merged
merged 4 commits into from Jan 14, 2020

Member

@mm318 mm318 commented Jan 2, 2020

This implements the node that does per-process CPU usage collection.

Example debug output:

$ ./install/system_metrics_collector/lib/system_metrics_collector/main | grep linuxProcessCpuCollector
[DEBUG] [1578005328.012000628] [linuxProcessCpuCollector]: PerformPeriodicMeasurement: nan
[DEBUG] [1578005328.012084643] [linuxProcessCpuCollector]: name=linuxProcessCpuCollector, measurement_period=1000ms, publishing_topic=not_publishing_yet, publish_period=60000ms, started=true, avg=nan, min=nan, max=nan, std_dev=nan, count=0
[DEBUG] [1578005329.011345403] [linuxProcessCpuCollector]: PerformPeriodicMeasurement: 3.670513
[DEBUG] [1578005329.011465002] [linuxProcessCpuCollector]: name=linuxProcessCpuCollector, measurement_period=1000ms, publishing_topic=not_publishing_yet, publish_period=60000ms, started=true, avg=3.670513, min=3.670513, max=3.670513, std_dev=0.000000, count=1
[DEBUG] [1578005330.011254244] [linuxProcessCpuCollector]: PerformPeriodicMeasurement: 3.843482
[DEBUG] [1578005330.011410743] [linuxProcessCpuCollector]: name=linuxProcessCpuCollector, measurement_period=1000ms, publishing_topic=not_publishing_yet, publish_period=60000ms, started=true, avg=3.756997, min=3.670513, max=3.843482, std_dev=0.086485, count=2
[DEBUG] [1578005331.011297740] [linuxProcessCpuCollector]: PerformPeriodicMeasurement: 4.094694
[DEBUG] [1578005331.011433186] [linuxProcessCpuCollector]: name=linuxProcessCpuCollector, measurement_period=1000ms, publishing_topic=not_publishing_yet, publish_period=60000ms, started=true, avg=3.869563, min=3.670513, max=4.094694, std_dev=0.174150, count=3
[DEBUG] [1578005332.025331858] [linuxProcessCpuCollector]: PerformPeriodicMeasurement: 4.664516
[DEBUG] [1578005332.025468622] [linuxProcessCpuCollector]: name=linuxProcessCpuCollector, measurement_period=1000ms, publishing_topic=not_publishing_yet, publish_period=60000ms, started=true, avg=4.068301, min=3.670513, max=4.664516, std_dev=0.375815, count=4
[DEBUG] [1578005333.028630184] [linuxProcessCpuCollector]: PerformPeriodicMeasurement: 3.307438
[DEBUG] [1578005333.028917962] [linuxProcessCpuCollector]: name=linuxProcessCpuCollector, measurement_period=1000ms, publishing_topic=not_publishing_yet, publish_period=60000ms, started=true, avg=3.916129, min=3.307438, max=4.664516, std_dev=0.453449, count=5
[DEBUG] [1578005334.011455014] [linuxProcessCpuCollector]: PerformPeriodicMeasurement: 3.533342
[DEBUG] [1578005334.011652595] [linuxProcessCpuCollector]: name=linuxProcessCpuCollector, measurement_period=1000ms, publishing_topic=not_publishing_yet, publish_period=60000ms, started=true, avg=3.852331, min=3.307438, max=4.664516, std_dev=0.437832, count=6
[DEBUG] [1578005335.012703392] [linuxProcessCpuCollector]: PerformPeriodicMeasurement: 5.471830
[DEBUG] [1578005335.012868267] [linuxProcessCpuCollector]: name=linuxProcessCpuCollector, measurement_period=1000ms, publishing_topic=not_publishing_yet, publish_period=60000ms, started=true, avg=4.083688, min=3.307438, max=5.471830, std_dev=0.696755, count=7
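The summary lines above are consistent with a running average, minimum, maximum, and population standard deviation (dividing by the sample count, not count minus one). The sketch below shows one way to keep such statistics incrementally using Welford's update. `RunningStats` is a hypothetical class written for this example, not the collector's actual implementation.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>

// Hypothetical accumulator; not the class used by the collector.
class RunningStats
{
public:
  void AddSample(double x)
  {
    ++count_;
    const double delta = x - mean_;
    mean_ += delta / static_cast<double>(count_);
    m2_ += delta * (x - mean_);  // Welford update of the sum of squared deviations
    min_ = std::min(min_, x);
    max_ = std::max(max_, x);
  }

  std::size_t count() const {return count_;}
  // Every statistic is NaN until at least one sample has been added.
  double average() const {return count_ ? mean_ : kNaN;}
  double min() const {return count_ ? min_ : kNaN;}
  double max() const {return count_ ? max_ : kNaN;}
  // Population standard deviation: divides by count, not count - 1.
  double std_dev() const
  {
    return count_ ? std::sqrt(m2_ / static_cast<double>(count_)) : kNaN;
  }

private:
  static constexpr double kNaN = std::numeric_limits<double>::quiet_NaN();
  std::size_t count_ = 0;
  double mean_ = 0.0;
  double m2_ = 0.0;
  double min_ = std::numeric_limits<double>::infinity();
  double max_ = -std::numeric_limits<double>::infinity();
};
```

Feeding in the first two measurements from the log (3.670513 and 3.843482) gives an average of about 3.756997 and a standard deviation of about 0.086485, matching the `count=2` line.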

Example topic output:

$ ros2 topic echo /not_publishing_yet
measurement_source_name: linuxProcessCpuCollector
metrics_source: 23296_cpu_percent_used
window_start:
  sec: 1578005387
  nanosec: 13896897
window_stop:
  sec: 1578005447
  nanosec: 16203847
statistics:
- data_type: 1
  data: 3.9269039630889893
- data_type: 3
  data: 10.672758102416992
- data_type: 2
  data: 2.0373220443725586
- data_type: 5
  data: 60.0
- data_type: 4
  data: 1.0283935070037842
---
.
.
.
---
measurement_source_name: linuxProcessCpuCollector
metrics_source: 23296_cpu_percent_used
window_start:
  sec: 1578005447
  nanosec: 16265587
window_stop:
  sec: 1578005507
  nanosec: 18301838
statistics:
- data_type: 1
  data: 3.6470861434936523
- data_type: 3
  data: 8.337632179260254
- data_type: 2
  data: 1.4981666803359985
- data_type: 5
  data: 60.0
- data_type: 4
  data: 1.2943096160888672
---
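The numeric `data_type` values are not explained in the output itself. The sketch below records a mapping inferred from the values shown: type 5 is 60.0, which fits a sample count for a 60 s window measured every 1000 ms, and types 2 and 3 bracket the type 1 value as a minimum and maximum would. The enum and function names are hypothetical, and the mapping should be checked against the message definition before relying on it.

```cpp
#include <cstdint>
#include <string>

// Hypothetical names; the numeric values are inferred from the echoed messages.
enum class StatisticType : std::uint8_t
{
  kAverage = 1,
  kMinimum = 2,
  kMaximum = 3,
  kStdDev = 4,
  kSampleCount = 5,
};

// Returns a short human-readable label for a data_type value.
std::string StatisticLabel(std::uint8_t data_type)
{
  switch (static_cast<StatisticType>(data_type)) {
    case StatisticType::kAverage: return "avg";
    case StatisticType::kMinimum: return "min";
    case StatisticType::kMaximum: return "max";
    case StatisticType::kStdDev: return "std_dev";
    case StatisticType::kSampleCount: return "sample_count";
    default: return "unknown";
  }
}
```

Read this way, the first message reports an average of about 3.93 percent, a minimum of about 2.04, a maximum of about 10.67, and a standard deviation of about 1.03 over 60 samples.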


double PeriodicMeasurement() override
{
LinuxProcessCpuMeasurementNode::PeriodicMeasurement();
Contributor

@dabonnie dabonnie Jan 2, 2020

Doesn't this assume that the measurement works on the machine running the integration tests? Is that an assumption we can make? I would consider stubbing out the system-specific calls and providing fake data.

Member Author

I think if we can't make the measurement on the machine running the integration test, then something is wrong either with the way we're measuring or with the test machine.

Contributor

I don't agree: what happens if we need to support multiple platforms? There's no requirement that the build system is able to make these measurements, which is why many of the Linux-specific functions have been separately stubbed out and tested independently.

Member Author

But this is in test_LINUX_process_cpu_measurement_node.cpp

Contributor

But this is in test_LINUX_process_cpu_measurement_node.cpp

Right, but the unit test makes the assumption that the machine running the test supports the Linux-specific measurements. I think it's important to point out none of the other Linux node tests make this assumption.

Contributor

I think it's important to point out none of the other Linux node tests make this assumption.

And consequently some of the functions under utilities.cpp have no code coverage.

We can always add tests where we are not making system-specific calls (e.g. GetPID). I need to understand the coverage, but the majority of the utilities functions have corresponding unit tests covering their different return cases (via test_utilities.cpp).

Member Author

Oops sorry, not utilities.cpp, but LinuxCpuMeasurementNode::MakeSingleMeasurement(), LinuxProcessMemoryMeasurementNode::PeriodicMeasurement(), and LinuxMemoryMeasurementNode::PeriodicMeasurement().

Contributor

I don't get the point you're trying to make, because we can walk through the existing unit tests and go over how all of the non-specific system calls are directly tested. Otherwise one could mock these classes solely to call these methods directly, but that kind of test exists for the superclass.

However, we're getting off topic here. Let's address the tests in this PR and improve coverage, etc in other issues.

Member Author

how all of the non-specific system calls are directly tested.

My point is why are we not testing system-specific calls?

Anyway yes, I'll disagree and commit™ to this PR comment.

Member Author

This has been mocked out via overriding LinuxProcessCpuMeasurementNode::MakeSingleMeasurement().

codecov bot commented Jan 3, 2020

Codecov Report

Merging #59 into master will increase coverage by 0.14%.
The diff coverage is 35.84%.

@@            Coverage Diff             @@
##           master      #59      +/-   ##
==========================================
+ Coverage   40.95%   41.09%   +0.14%     
==========================================
  Files          28       33       +5     
  Lines         967     1039      +72     
  Branches      566      605      +39     
==========================================
+ Hits          396      427      +31     
- Misses         54       60       +6     
- Partials      517      552      +35
Flag          Coverage Δ
#unittests    41.09% <35.84%> (+0.14%) ⬆️

Impacted Files                                           Coverage Δ
...s_collector/linux_process_cpu_measurement_node.hpp    100% <100%> (ø)
...lector/test_linux_process_cpu_measurement_node.cpp    29.34% <29.34%> (ø)
...s_collector/linux_process_cpu_measurement_node.cpp    76.92% <76.92%> (ø)
...llector/src/system_metrics_collector/utilities.cpp    35.86% <0%> (-4.35%) ⬇️
...r/test/system_metrics_collector/test_utilities.cpp    20.58% <0%> (-0.07%) ⬇️
...src/system_metrics_collector/proc_pid_cpu_data.hpp    100% <0%> (ø)
...r/test/system_metrics_collector/test_utilities.hpp    25% <0%> (ø)
... and 1 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 9a51d68...62efe92.

@zmichaels11 zmichaels11 left a comment

LGTM

#include "utilities.hpp"

#include "rclcpp/rclcpp.hpp"
#include "rcutils/logging_macros.h"
Contributor

Consider using logging utilities like those in rosbag2

Member Author

This should be a separate PR.

constexpr const char kTestTopic[] = "test_process_cpu_measure_topic";
}

class TestLinuxProcessCpuMeasurementNode : public system_metrics_collector::
Contributor

Consider calling this fake and not test. test would be useful for a fixture where you use the fake (IMO).

Member Author

Now I'm unsure if this comment was referring to the value of the kTestTopic constant or the name of the class.

Anyway, I have renamed TestLinuxProcessCpuMeasurementNode to MockLinuxProcessCpuMeasurementNode, since mock is a more accurate description than fake.

@dabonnie dabonnie merged commit 48f587b into master Jan 14, 2020
@dabonnie
Contributor

Merged given passing tests and bypassing broken codecov.

@emersonknapp emersonknapp deleted the miaofei/measure-process-cpu branch August 31, 2020 18:28
5 participants