ipmi plugin: add more analog sensors support #2071
|
@rpv-tomsk is there any feedback on this patchset? I don't quite understand the comment |
My point is that we are doing the same work :) and I also wanted to point you to my patch, just so that you are aware of it. My feedback on your changes: In IPMI, percentage is a flag, not a unit itself. I don't like the idea of mixing different units into one "percent" type. It will be hard to group different metrics of Maybe it will be better to add a '_percent' suffix to What do you think about this? |
|
Hi @rpv-tomsk, Thank you for your feedback. I agree that mixing different units into one "percent" type is not the best idea. But the sensors with the "percentage" flag set have an "unspecified" unit type. So, in that case I suggest using the sensor type plus a _percentage suffix, e.g. memory_percentage, temperature_percentage, etc. Regards, |
|
Hi @rpv-tomsk, Actually, we cannot use my suggestion above, as the ipmi plugin may select sensors (using the percent flag) that are not supported by the collectd type db. See an example of such a sensor below: The sensor has the "memory" type and provides only the "percent" flag. The unit type of this sensor is "unspecified". For this reason, we cannot build and use Also, I think it isn't correct to select sensors just by the presence of the 'percent' flag. Thus, I'm going to make the appropriate changes to the PR to fix all the described issues. Thanks and Regards, |
|
Hi @rpv-tomsk, I've removed the logic that selects sensors by the presence of the 'percent' flag. Right now, when the sensor provides the percentage flag, the type of the sensor is set to Regards, |
Hi @vmytnykx,
thank you very much for your PR! Overall this looks good with some comments inline :)
Best regards,
—octo
src/ipmi.c
Outdated
@@ -200,6 +211,19 @@ static void sensor_read_handler(ipmi_sensor_t *sensor, int err,
  plugin_dispatch_values(&vl);
} /* void sensor_read_handler */

static const char *sensor_get_db_type(ipmi_sensor_t *sensor) {
The name of this function is not very descriptive. How about sensor_unit_to_type?
As far as I remember, using only ipmi_sensor_get_base_unit() is not enough for such a task.
Besides the base unit, the sensor also has 'modifier' and 'rate' values (ipmi_sensor_get_modifier_unit() and ipmi_sensor_get_rate_unit() respectively). These values may be rarely used, but I think we need to check them anyway; otherwise an incorrect result can be obtained.
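To illustrate the concern, here is a minimal, self-contained sketch of a unit-to-type mapping that refuses to map a sensor when a modifier or rate is in use. The enums and the `sensor_units` struct are hypothetical stand-ins for what the plugin would obtain from OpenIPMI; they are not the library's own definitions, and the function name simply follows the rename suggested above.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the unit information a sensor reports.
 * The real plugin would query OpenIPMI instead of reading a struct. */
enum unit_kind { UNIT_UNSPECIFIED, UNIT_DEGREES_C, UNIT_VOLTS,
                 UNIT_AMPS, UNIT_WATTS, UNIT_RPM };
enum modifier_use { MODIFIER_NONE, MODIFIER_DIV, MODIFIER_MULT };
enum rate_kind { RATE_NONE, RATE_PER_SECOND, RATE_PER_MINUTE };

struct sensor_units {
  enum unit_kind base;
  enum modifier_use modifier_use;
  enum rate_kind rate;
};

/* Return a collectd type name, or NULL when the reading is not a plain
 * base-unit value (for example "Watts per minute") or the base unit has
 * no obvious mapping. */
static const char *sensor_unit_to_type(const struct sensor_units *u) {
  /* A modifier or rate changes the meaning of the base unit, so the
   * base unit alone must not decide the type. */
  if (u->modifier_use != MODIFIER_NONE || u->rate != RATE_NONE)
    return NULL;

  switch (u->base) {
  case UNIT_DEGREES_C:
    return "temperature";
  case UNIT_VOLTS:
    return "voltage";
  case UNIT_AMPS:
    return "current";
  case UNIT_WATTS:
    return "power";
  case UNIT_RPM:
    return "fanspeed";
  default:
    return NULL;
  }
}
```

With this shape, a sensor reporting plain Watts maps to "power", while the same base unit combined with a rate is rejected instead of being silently reported as "power".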
src/ipmi.c
Outdated
/* if sensor provides the percentage value, add "_percent" suffix to the
 * sensor collectd type */
if (ipmi_sensor_get_percentage(sensor)) {
  ssnprintf(percent_type, sizeof(percent_type), "%s_percent", type);
Please use the existing percent type instead of introducing more specialized foo_percent types. The original type should go into the type instance. For example, "power-ac0" (type: "power", type instance: "ac0") would become "percent-power-ac0" (type: "percent", type instance: "power-ac0").
It's best to think of collectd's types as a "unit label", if that makes sense.
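The naming rule described here can be sketched as follows. This is only an illustration under assumptions: the `metric_name` struct, the buffer sizes, and the `build_metric_name` helper are hypothetical and are not part of collectd or the ipmi plugin.

```c
#include <stdio.h>

/* Hypothetical container for the two name parts discussed above. */
struct metric_name {
  char type[64];
  char type_instance[64];
};

/* Fill in type and type instance. For percentage sensors the type becomes
 * "percent" and the original type moves into the type instance as a prefix.
 * Returns 0 on success, -1 if a name did not fit into its buffer. */
static int build_metric_name(struct metric_name *out, const char *unit_type,
                             const char *sensor_name, int is_percentage) {
  int n;

  if (is_percentage) {
    snprintf(out->type, sizeof(out->type), "percent");
    n = snprintf(out->type_instance, sizeof(out->type_instance), "%s-%s",
                 unit_type, sensor_name);
  } else {
    n = snprintf(out->type, sizeof(out->type), "%s", unit_type);
    if (n < 0 || (size_t)n >= sizeof(out->type))
      return -1;
    n = snprintf(out->type_instance, sizeof(out->type_instance), "%s",
                 sensor_name);
  }

  if (n < 0 || (size_t)n >= sizeof(out->type_instance))
    return -1;
  return 0;
}
```

Called with ("power", "ac0") and the percentage flag set, this yields type "percent" and type instance "power-ac0", matching the example in the comment; without the flag, the names stay "power" and "ac0".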
I dislike mixing different types into one 'percent' type like this.
IMHO, I would not use such an implementation in my systems. For me, the direct type without a 'percent' suffix would be better than the 'percent' type.
src/ipmi.c
Outdated
/* if sensor provides the percentage value, add "_percent" suffix to the
 * sensor collectd type */
if (ipmi_sensor_get_percentage(sensor)) {
  ssnprintf(percent_type, sizeof(percent_type), "%s_percent", type);
Please use snprintf(). The ssnprintf() wrapper has been removed from master.
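As a rough sketch of what the switch to plain snprintf() could look like, the helper below formats the same string as the line under review and also reports truncation, which standard snprintf() signals through its return value. The helper name `format_percent_type` is hypothetical and only serves the illustration.

```c
#include <stdio.h>

/* Format "<type>_percent" into buf using standard snprintf().
 * Returns 0 on success, -1 on an encoding error or if the result was
 * truncated. For buf_size > 0 the buffer is always NUL-terminated. */
static int format_percent_type(char *buf, size_t buf_size, const char *type) {
  int n = snprintf(buf, buf_size, "%s_percent", type);

  /* snprintf() returns the length the full string would have had, so a
   * value >= buf_size means the output did not fit. */
  if (n < 0 || (size_t)n >= buf_size)
    return -1;
  return 0;
}
```

Checking the return value is optional for correctness of termination, but it lets the caller skip a sensor rather than dispatch a metric with a cut-off type name.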
|
P.S.: Don't worry about the merge conflicts, we'll take care of them after the review has concluded. |
This PR and my #2024 do the same thing in different ways, so one implementation completely contradicts the other. |
src/ipmi.c
Outdated
@@ -275,10 +306,17 @@ static int sensor_list_add(ipmi_sensor_t *sensor) {
    type = "fanspeed";
    break;

  default: {
    const char *sensor_type_str;
  case IPMI_SENSOR_TYPE_MEMORY:
Can you please show the 'ipmitool' output for such a sensor?
For example, by running sensor get SENSOR_NAME?
Thanks.
Hi @rpv-tomsk, see my comment above for this example: #2071 (comment)
Hi,
Sorry, I have missed that comment.
What metric does that sensor represent? What is 'MTT CPU'?
What would a value of this sensor, for example 50%, mean?
In a nutshell, this sensor represents "Memory Thermal Throttling" and is related to the memory thermal management system. Based on the DIMM thermal conditions, the system may restrict read and write traffic/bandwidth to main memory as a means of controlling power consumption. This metric is measured as a percentage, and 0% means no memory throttling occurs. When temperatures rise, the memory management system enables throttling and restricts the read or write traffic (e.g. 50%).
I think the end goal is the same here... discrete sensors don't get read... so from my perspective it's fine to go with your suggestion |
|
Let's discuss what to do with the 'percent' flag. Also, I want to point out one line from I have some doubt that IPMI has a similar limitation. |
|
We have sensor types and we have units of measurement. We have a 'temperature' type with a 'degrees C' unit. Some IPMI sensor types are mapped to Collectd types directly. To me, 'percent' is a unit, not a type. Shall we approach this task starting from the practical requirements? |
|
@rpv-tomsk There are thermal and other sensors on Intel platforms that report values in %... My understanding of types in collectd is that a type can be thought of as the unit used to measure a value (but I could be wrong) ...
I would prefer option 1. As an aside: MTT sensor is the Memory Throttling sensor on Intel platforms... which would be interesting to track from a platform performance perspective to see if an attempt to cool DRAM (by reducing the memory traffic allowed on a bus) is having an impact on the workload running on your platform. |
+1, that's exactly what I wanted to suggest. @maryamtahhan, @rpv-tomsk How exactly does this PR relate to #2024? Should we close this in favor of #2024 or do they only overlap in the bit that deals with discrete sensors? |
|
This PR is intended to add new mappings of IPMI units to Collectd types. I tried to solve this task in rpv-tomsk@ba30d9b |
What about the following solution: map sensors with a 'percent' flag to the 'percent' type, but allow configuring a custom mapping by sensor name (may be implemented later)? |
|
Just want to summarize the remaining work on the changes:
Is my understanding correct? |
|
Volodymyr, I think that your summary is correct. Will anybody comment on my proposal of a configurable mapping? Any opinions? |
|
@rpv-tomsk, could you please clarify your proposal? Do you mean adding a new bool config option to the collectd conf which enables or disables adding the prefix? Or do you mean a new option which turns the whole percent mapping on or off? |
|
@rpv-tomsk Yes, I'll review #2024 |
|
I propose to discuss adding something similar to: That will allow overriding all hardcoded mappings (which are done by sensor type or by unit type) in favour of the sensor name (the value which is currently reported as the 'type instance'). |
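The configuration syntax proposed above did not survive in this thread, so the following is only a sketch of the lookup side under assumptions: a hypothetical table of (sensor name, type) pairs, however it gets populated, is consulted first, and the hardcoded mapping is used as the fallback. None of these names exist in the plugin.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical override entry: a sensor name mapped to a collectd type.
 * In a real plugin this table would be filled from the configuration. */
struct type_override {
  const char *sensor_name;
  const char *type;
};

/* Return the configured type for sensor_name if an override exists,
 * otherwise the type chosen by the hardcoded mapping. */
static const char *lookup_type(const struct type_override *overrides,
                               size_t num_overrides, const char *sensor_name,
                               const char *default_type) {
  for (size_t i = 0; i < num_overrides; i++) {
    if (strcmp(overrides[i].sensor_name, sensor_name) == 0)
      return overrides[i].type;
  }
  return default_type;
}
```

Because the lookup result does not change while the plugin runs, it could be resolved once when a sensor is added to the list and stored alongside it, rather than repeated for every reading.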
|
@rpv-tomsk as regards your last proposal, I think you can do this with filtering chains in collectd... so even if you are not happy with the hardcoding you can change it before the metric gets published or even before it hits the cache... at least for the type instance... |
|
Filtering chains... Some time ago there was the same suggestion to use filtering chains for setting the plugin name / plugin instance in 'generic plugins'. I prefer to configure everything in one place and in the simplest way (by a form). To me, chains are closer to a hack than to a solution. Replacing values in filtering chains is also not optimal: it is much cheaper to do one match when the sensor is added than to match every reported metric. Think of this like the presence of Chains are powerful, of course, but.... |
|
How do we proceed with this? #2024 has been merged and this PR needs a rebase, ideally pulling in the existing handling of discrete sensors. Then do a final review? |
|
@octo, working on review comments changes now. Then I can re-base these changes based on the latest master. Do you prefer to do final review before re-base or after? |
|
Given the complexity of the rebase, I think a review after rebase makes more sense in this case. |
71b3398 to 7408f19
- Add support of new analog sensors: System Airflow, PS1 Input Power, MTT CPU1, MTT CPU2
- Extend list of sensors to support "memory" sensor type.
- Map sensors with a 'percent' flag to 'percent' type and add sensor type as a prefix to type_instance in this case. The type_instance of non % sensors will remain the same (just sensor name).

Signed-off-by: Mytnyk, Volodymyr <volodymyrx.mytnyk@intel.com>
7408f19 to 1f88406
|
Hi @octo, @rpv-tomsk, I've addressed the review comments and changed the implementation as discussed in #2071 (comment) The changes are re-based and squashed into one commit. Thank you all for review :) Regards, |
|
Hi, Volodymyr! Thanks for the update and your work on this. My proposal is to add the unit values to the message output, as was done here: rpv-tomsk@ba30d9b#diff-88ee790203c729461340e7b84aa31bb5R439 |
|
Hi @rpv-tomsk, the idea sounds good to me. I will fix it and update the PR. Regarding the build failure, it seems to be a Jenkins issue: Thanks and Regards, |
Change-Id: I5fc6056b2d0c5bb5cb1b28c3d24aaca163eb245b Signed-off-by: Mytnyk, Volodymyr <volodymyrx.mytnyk@intel.com>
|
Hi @rpv-tomsk, I've addressed the comments and updated the PR. Thanks for your review! Regards, |
|
Hi Volodymyr! Thanks for your work on this! |
|
Whoop whoop! Great work all, thanks for the perseverance |
Example of analog sensors:
Example of discrete sensors which are added to the internal ipmi list on init but fail to be read using the ipmi_sensor_id_get_reading() function:
Regards,
Volodymyr