Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cpufreq plugin: support for cpufreq/stats and hot-pluggable CPUs #1823

Open
wants to merge 17 commits into
base: main
Choose a base branch
from

Conversation

rinigus
Copy link
Contributor

@rinigus rinigus commented Jul 30, 2016

This PR adds cpufreq support for

  • hot-pluggable CPUs
  • following CPU used frequencies distribution, as provided by cpufreq/stats

On mobile platforms, CPUs are frequently switched off to save battery. As a result, at the moment when CPU is not available, the corresponding cpufreq and cpufreq/stats is unavailable. The current cpufreq plugin considers such situation as an error. This PR adds support for such configurations and flags an error only if there are no CPUs providing cpufreq data.

In Linux, an accurate cpufreq statistics is given through cpufreq/stats directory. This allows to account for usage of CPU frequencies between the statistics collection by collectd. On mobile platforms, CPU frequency is changed frequently, and this PR adds support for this.

An example graph from the collected data on CPU frequency distribution is shown below in the system with hot-pluggable CPUs.

cpufreq_stats

@rubenk rubenk added the Feature label Jul 31, 2016
src/cpufreq.c Outdated

// determine number of cpus. here, just check whether cpus are
// there. since cpus can be hot-plugged, cpufreq could be absent
// at this particualr time moment
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/particualr/particular/

@octo
Copy link
Member

octo commented Aug 7, 2016

As mentioned inline, it may be preferable to convert all the counter values to a rate, using value_to_rate() from "common.h": with a rate you can compute the percentage spent in each frequency, which may be more useful than some arbitrary "ticks per second" that you'll get otherwise. We just spend quite some time changing the cpu plugin in this direction. With percentages, you can also calculate average frequency, 90th percentile frequency, etc from these rates, which may be interesting config options.

Apart from this it would be nice if you could spend some time cleaning up the coding style so the code blends into the rest of the file. Spaces inside parenthesis (if ( foo )) caught my attention but there is other unusual whitespace (i=0, function_call( args ), …)

The init() function is nested very deep, maybe you could break some functionality out into a utility function to make that easier to read.

Best regards,
—octo

@rinigus
Copy link
Contributor Author

rinigus commented Aug 7, 2016

@octo,

I'll be happy to clean up the code - I'm just used to put spaces in few places that you guys don't seem to do.

but lets get the calculus right first. the values that are calculated are in seconds, not ticks. At least kernel documentation is rather specific about it. So, maybe I should convert it to ms as in cpusleep. I will read about the function that you propose, but so far its not clear why I cannot sum up times together and take derivative later. It seems like rather simple derivative rules that should be OK. But I'll get back after I've read about that function .

thank you for the review,

rinigus

@rinigus
Copy link
Contributor Author

rinigus commented Aug 7, 2016

@octo:

Thank you for your suggestions and review! I'd like to agree first on the calculations/algorithm and, after that, deal with the cleanup of the code.

For background: I somehow mistakenly took derive_t as a double and that's why in this and some other plugins I used casting to double asap. My mistake and I'll deal with fixing it in this code as well.

I looked through the proposed value_to_rate function and I don't think it would make the code easier. As I understand, value_to_rate takes the derivative that can be later used in plugin. In the proposed implementation, I report just the fraction of time spent by either one or all CPUs at given frequency. Now the units are not ticks, but fraction of time. So, its easy for user to see immediately how much the corresponding frequency was used. As you could see on the graph, it also takes into account that CPU was inactive in rather large fraction of time (sleep on mobile device) leading to the total less than 100%.

Now, to avoid rounding errors, I suggest to change the units and move to total_time_in_ms, as we did in cpusleep. Its a bit of a pain to change units, since I have users on Sailfish who run it on their devices, but its better to do it now, before merging. I'll just have to account for it in GUI.

When reporting total_time_in_ms, it should be fine to sum up all values and feed it to the writers as derive_t. The difference between this approach and taking derivatives separately maybe in some corner cases, like rollover. However, to have an impact, single rollover of one of the summed values should not have any impact. How multiple derive_t roll overs during a single measurement would influence the result, I cannot say at the moment.

Best wishes,

rinigus

@octo
Copy link
Member

octo commented Aug 8, 2016

Hi @rinigus,

if the CPUs are sleeping and not actually hot-plugged, then summing up derive_ts should be good enough in this case. It's important, though, that counters are not reset or disappear from view when a CPU is sleeping. Overflows are not a concern: assuming that the kernel is using 64bit wide counters, it will take ~292 million years to overflow.

Changing the type to total_time_in_ms sounds good.

I still think that there is potential in calculating the average frequency from the distribution: right now the cpufreq plugin only reports the current frequency of the CPU, which is read only during collectd's update cycle, which may skew the measurement quite a bit. By using the distribution I think we can do much better. I'm happy to file a feature request and handle this in a future PR though.

Best regards,
—octo

@rinigus
Copy link
Contributor Author

rinigus commented Aug 8, 2016

Morning @octo!

If total_time_in_ms is fine, then I'll rewrite it to use that and as much integer math as I can.

Now few remarks regarding hot-plugged, Intel CPUs, and stats availibility.

For hot-plugged CPUs, I have only one platform to test it - Nexus 4 phone running Sailfish OS, Linux kernel 3.4. On this platform, CPUs are not hot-plugged by kernel directly, but by external qcom utility. Either because of that or some implementation detail in the kernel, counters seem to reset on hot plug/unplug cycle (and corresponding disappear in /sys folder). This is in contrast to cpuidle statistics which does not disappear and stays on plugging/unplugging CPUs on Nexus 4 [the cpuplugin is ready as well and I'll submit it a bit later]. So, I consider it a kernel limitation and would have to warn the users regarding it. Whether its limited to Nexus 4 and other qcom phones, I don't know.

One other limitation is related to the new P-State drivers by Intel CPUs. On these CPUs, frequency is not really driven by kernel and the stats on them is unavailable. So, only reported value is the current one which can be significantly skew the measurements, as you mention (collectd work driving cpu frequency up, for example). For these CPUs, following idle states makes more sense and I am doing it through other plugin, cpuidle, mentioned above.

Finally, to have stats available, user has to enable corresponding kernel module, which maybe not on some distributions. On Gentoo, it depends on the user.

These are the limitations that I am aware of. I'd suggest to keep implementation still simple and warn users on possible skews in the calculated distributions. In the end, on phones, you want not exact values but get an idea on whats going on. For that, the current implementation should be reasonable, in my eyes. But let me know your opinion and let's agree on overall implementation. After agreeing, I'll work on implementing it.

Best wishes,

rinigus

@rinigus
Copy link
Contributor Author

rinigus commented Aug 9, 2016

@octo : ping! Just a reminder - waiting for your reply regarding cpufreq calculations.

In addition, do you have already some C++ map-like functionality in collectd? In C++, I could have used std::map to calculate easily the time spent by all cpus on particular frequency. Do you have similar functionality implemented already in some common files?

Best wishes,

rinigus

@octo
Copy link
Member

octo commented Aug 10, 2016

@rinigus Sorry, didn't realize you were waiting for me.

There are two options:

  • Convert the derives to a rate and sum up the rates. These rates are a percentage, so I'd use the type percentage, which is easy to grasp by users. (the time derivative of a 10ms counter is "10ms / s" = 1/100 = 1%. The rates will not necessarily sum up to 100; the remainder is the time the CPU was suspended.)
  • Postpone this, sum up the derives directly and report them as total_time_in_ms. Add code to convert to rates when it turns out that summing the derives results in too many broken datapoints. There is going to be an extra step converting the rates back to a derive (so we can continue to report total_time_in_ms), but it's not the end of the world.

Either option is okay for me, though I prefer the first one: it sounds like the plugin would benefit from calculating a rate, since CPUs disappear and re-appear from our view.

Does this unblock you?
Best regards,
—octo

@octo
Copy link
Member

octo commented Aug 10, 2016

src/daemon/utils_avltree.h is the closest we have to a std::map.

@rinigus
Copy link
Contributor Author

rinigus commented Aug 10, 2016

@octo:

yes, it does unblock me, thank you :). I'll think a bit and would do as consistent plugin as I can.

thank you,

rinigus

@octo octo added the Waiting label Aug 10, 2016
@rinigus
Copy link
Contributor Author

rinigus commented Aug 15, 2016

Dear @octo:

here is a cleaned up code and I think it is ready for the new review.

  1. I modified the stats to be reported in total_time_in_ms. As a result, precision has visibly improved.
  2. I added comments that should help the reader to get through the code easier. I looked into partitioning _init function, but couldn't identify a good partitioning that would not make the code harder to grasp.
  3. As for hot-plugged CPUs with the stats disappearing between readouts. It seems to me that this is a limitation in kernel implementation on the phone that I have and using the kernel that I am using now. Whether it is a limited case for this particular hardware/kernel/userspace combination, I don't know. In my case, I can still use the reported data, but have to treat it with caution. Unfortunately, in my case, since the data is reset between readouts, there is no data to report either. For me, the current implementation would still help me to catch cases when multiple CPUs are suddenly switched on and start working on high frequencies. I have added corresponding information to the documentation of the plugin to warn users. I'll keep my eyes open with respect of disappearing stats and maybe it should be reported as an issue to the kernel developers.

Best wishes,

rinigus

@rinigus
Copy link
Contributor Author

rinigus commented Aug 15, 2016

@octo: Please let me know if I am supposed to resolve the merge conflict. It looks that there have been large changes in the code over the whole tree, which makes older PRs problematic. So, it would be good to know what to do with this PR.

Best,

rinigus

@rubenk
Copy link
Contributor

rubenk commented Aug 15, 2016

Hey @rinigus. Yes, please resolve the conflict. If you merge master instead of rebasing that will keep the history intact. As far as I can see there's just one small conflict in cpufreq.c around https://github.com/collectd/collectd/pull/1823/files#diff-614f18f527573074470e543a663b90edL81 that should be easy to resolve. You changed the ordering of the variables, so that will make it a tiny bit harder.

src/cpufreq.c Outdated
{
status = ssnprintf (filename, sizeof (filename),
"/sys/devices/system/cpu/cpu%d/cpufreq/"
"scaling_cur_freq", num_cpu);
"/sys/devices/system/cpu/cpu%zu",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please keep the existing indentation intact. This only lines out if you use 4 space tabs. It also makes it hard to see what this actual change does.

src/cpufreq.c Outdated
if (!found)
{
hertz_size += 1;
hertz_all_cpus = (long int*)realloc (hertz_all_cpus, hertz_size * sizeof(long int));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

long int *temp;

temp = realloc (hertz_all_cpus, (hertz_size +1) * sizeof(*hertz_all_cpus));
if (temp)
{
  sfree (hertz_all_cpus);
  // other error handling
}
hertz_all_cpus = temp;
hertz_all_cpus[hertz_size] = hertz;
hertz_size++;

@octo
Copy link
Member

octo commented Sep 13, 2016

Hi @rinigus, sorry that this took a couple days longer than I though. I've left some review comments inline. Thanks for all your work!

@rinigus
Copy link
Contributor Author

rinigus commented Sep 14, 2016

Hi @octo: thank you for reviewing it! I'll make the changes, but it may take couple of days. I'll let you know when changes are in.

best,

rinigus

PS: congratulations on new release!

@octo
Copy link
Member

octo commented Sep 14, 2016

Thank you :-)

@rinigus
Copy link
Contributor Author

rinigus commented Sep 17, 2016

Dear @octo:

I think that I addressed all your comments in the last revision. Please review when you get time for it. Took a bit longer than I expected since there were significant changes in upstream :)

Best

rinigus

@rinigus
Copy link
Contributor Author

rinigus commented Oct 19, 2016

Hi @octo:

it would be great to get some ETA on when do you plan to look into this and radio plugins. If warned in advance, I could try to plan my time as well and be ready to address the questions. I think the questions are addressed though and the both plugins should be quite ready for merge.

Sorry to bug you regarding these plugins - just don't want them to get stale... Radio plugin already requires some additional patch, but it requires your decision on whether you want it all first.

Best wishes,

rinigus

@rinigus
Copy link
Contributor Author

rinigus commented Nov 8, 2016

@octo: ping, as instructed on 5.7 release schedule email :)

@rinigus
Copy link
Contributor Author

rinigus commented Nov 30, 2016

Just to let you know that the PR has been reformatted in accordance with the new formatting policy. I have great hopes that this new policy will help you out!

@collectd-bot collectd-bot changed the base branch from master to main July 3, 2020 09:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants