Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

riello_usb.c - incorrect check for battery_charge cap value. #1685

Closed
amikot opened this issue Oct 23, 2022 · 19 comments · Fixed by #1692
Closed

riello_usb.c - incorrect check for battery_charge cap value. #1685

amikot opened this issue Oct 23, 2022 · 19 comments · Fixed by #1692
Labels
Riello Riello UPS devices (serial, usb), usually with a variant of Qx protocol Shutdowns and overrides and battery level triggers Issues and PRs about system shutdown, especially if battery charge/runtime remaining is involved USB

Comments

@amikot
Copy link
Contributor

amikot commented Oct 23, 2022

In the code we see following lines:

	if ((DevData.BatCap < 0xFFFF) &&  (DevData.BatTime < 0xFFFF)) {
		dstate_setinfo("battery.charge", "%u", DevData.BatCap);
		dstate_setinfo("battery.runtime", "%u", DevData.BatTime*60);
	}

I don't know how about BatTime, but for BatCap the value should not exceed 100 or in HEX it should be 0X64.
Instead of that IF instruction checks if BatCap is lower then 0XFFFF which equals to 65535 this is nonsense because BatCap (type uint16_t) can't exceed 65535 anyway. So that IF has no any sense - it wont ever work.

As we see in quoted code BatCap variable is being assigned to global nut variable: battery.charge
According to NUT developers guide
battery.charge is holding the value as percentage. Percentage of battery cannot exceed 100%, so 100 should be maximum value for battery.charge, so also for BatCap.

It's known problem that for some riello UPS devices the values of battery.charge are showing incorrect value of 255.
IMHO: These values should be filtered off and value of battery.charge should be not assigned.

Similar issues are with two other values: battery.runtime (DevData.BatTime) and ups.temperature (DevData.Tsystem).
First one is still limited to max value of 65535 of it's type. That value is multiplied by 60 and driver returns 3932100 - this is for problematic models of Riello.
Second (or actually third) problematic value for some Riello models is ups.temperature (DevData.Tsystem):

	if (DevData.Tsystem < 0xFF)
		dstate_setinfo("ups.temperature", "%u", DevData.Tsystem);

This time, the cut off value is 255 - and that's what we see when we query the driver with nut.

I don't know if the values of DevData are correcty assigned in riello.c - This also can be the issue, from some reason we assume the values are displayed correctly everyone but that one series. But maybe users of that series complains ? Maybe this series is most popular among linux users?

I don't know.

Personally, I would suggest to change the IFs showed above (and linked bellow) to some normal values. If the values would be exceeded then values wont be assigned. Maybe it's possible to calculate (recover) at least some missing values using current voltage and nominal capacity?
If specification says Rating (kVA) is 0.6 and Rating (kW) is 0.36 and that actually is also displayed by nut info, maybe something can be calculated.

Thanks.


if ((DevData.BatCap < 0xFFFF) && (DevData.BatTime < 0xFFFF)) {

@jimklimov
Copy link
Member

jimklimov commented Oct 24, 2022

CC @geoghegan who last changed that part of code - before a0b4c4e there was no such check. FWIW, did you investigate the post above with a recent build of NUT or with binaries made from whatever old code snapshots the OS distros package?

Even if that clause is currently "effective no-op", result would be same as before for uint16_t types capped at 0xFFFF (at least not a regression here). One thing that comes to mind that this IF excludes equality to 0xFFFF (possibly hijacked as -1 for errors?)

<DevData.BatTime> is still limited to max value of 65535 of it's type. That value is multiplied by 60 and driver returns 3932100 - this is for problematic models of Riello.

Do some units in practice return such large values (e.g. exactly 65535 that this code should now ignore)? Otherwise, do we have any reason to filter out some values at an arbitrary cut-off point? I mean it is not likely that an UPS would hold up the load for weeks, but a big UPS feeding some Raspberry Pi might...

Likewise, DevData.Tsystem selects values smaller than 255, so if NUT reports temperature as 255 it comes from some other codepath.

...for some riello UPS devices the values of battery.charge are showing incorrect value of 255.

That does sound like a firmware bug (at least if the vendor protocol is known to report these values natively as a percentage integer, and not e.g. as a fraction of byte-max with 255 being 100%). Maybe worth bringing up with Riello support too, after all they contributed the driver some years back (and engineers were active in the mailing list some time afterwards).

In any case, changes to the driver - especially workarounds to vendor bugs - are best made if proposed and tested by someone who has access to actual devices.

@jimklimov jimklimov added USB Riello Riello UPS devices (serial, usb), usually with a variant of Qx protocol labels Oct 24, 2022
@amikot
Copy link
Contributor Author

amikot commented Oct 24, 2022

<DevData.BatTime> is still limited to max value of 65535 of it's type. That value is multiplied by 60 and driver returns 3932100 - this is for problematic models of Riello.

Do some units in practice return such large values (e.g. exactly 65535 that this code should now ignore)? Otherwise, do we have any reason to filter out some values at an arbitrary cut-off point? I mean it is not likely that an UPS would hold up the load for weeks, but a big UPS feeding some Raspberry Pi might...

For RPi it may happen, but incapable models of Riello always display only maximum value, so if cap would be set to 65534 insead of 65535 - that would resolve the issue. Or it can be filtered off by model number which is discovered correctly.

Likewise, DevData.Tsystem selects values smaller than 255, so if NUT reports temperature as 255 it comes from some other codepath.

As above - maximum value 255 can be assumed as incorrect. If that temperature would appear on battery It would be fire alarm triggered already. Or again it can be model number used to determine if value should be passed to NUT.

...for some riello UPS devices the values of battery.charge are showing incorrect value of 255.

That does sound like a firmware bug (at least if the vendor protocol is known to report these values natively as a percentage integer, and not e.g. as a fraction of byte-max with 255 being 100%). Maybe worth bringing up with Riello support too, after all they contributed the driver some years back (and engineers were active in the mailing list some time afterwards).

Vendor protocols are very old (from around 2000) I don't think my device is so old, so protocol may be not up to date.
However, value 255 is incorrect anyway, so if the protocol is not giving that information it would be better for users to not display it.

In any case, changes to the driver - especially workarounds to vendor bugs - are best made if proposed and tested by someone who has access to actual devices.

I have the device, so can test the driver. I would fork it myself, but has no experience with drivers and didn't program in C since 22 years on Amiga, so I'm not sure if I would manage to compile anything on my own.

@jimklimov
Copy link
Member

jimklimov commented Oct 25, 2022

My question about the numbers reported by the device was rather if these integers are defined (in some known protocol) to mean exactly the degrees Celsius for example, and be unsigned integers at that (as in the structure).

They could in practice be signed, halving the range but all-bits-set meaning a minus-one and probably an error like missing reading, broken sensor, not available in this model, etc.

Likewise, a "255" might (with knowledge of protocol that I lack currently) mean "25.5" degC (factor of 10 is common), or be further complicated with offsets and multipliers. Or even not degC but degF?

That's why finding a protocol spec - or deriving one from practical experiments (including safe measurements of temperatures and voltages, or cross-reading vs. device panel if it reports the info, or vendor-provided program if it behaves sanely) - is crucial for workarounds and fixes.

Since Riello did cooperate earlier by contributing the NUT driver supposedly valid as of back-then, it may be worthwhile to contact them for protocol or other arcane knowledge (ask if it can also be published, or you must use it alone under some NDA). You are a paying customer after all...

@jimklimov
Copy link
Member

Building NUT nowadays should not be a problem on a large range of architectures and OSes, including Linux containers and VMs you can instantiate easily (including Ubuntu in WSL2, if you are normally on Windows).

As a quick first shot, check docs/config-prereqs.txt for packages to install on different distributions (written for a maximum build scope possible on CI farm - so you might want to skip including tools for docs generation, libgd for CGI clients, and Java for Jenkins agents to save lots of space). If possible, set up ccache so full re-builds are much faster as you iterate.

The ./ci_build.sh script should by default pick up and configure whatever it can build on the current environment; then you just iterate the code and make to produce binaries again quickly.

@jimklimov
Copy link
Member

Finally, I tend to agree that shifting the "acceptable max" by a little is something worth trying - e.g. to accept e.g. 0..253 and 0..65533 ranges to rule out a couple of fringe values if those are associated with invalid readings and others are not.

@amikot
Copy link
Contributor Author

amikot commented Oct 25, 2022

I terms of values it could be just different range or factor of 10 as you suggested if the 255 wouldn't be maximum value that can be showed. Apart of that the value never get lower ... it always stays on 255.

In terms of building - I guess I need just to clone the git to my disk and build it ? (BTW. I'm on the Mint and driver is used on Debian with OMV).

@jimklimov
Copy link
Member

Then I think it is a reasonable guess about -1 for some error reporting...

As for building - yes, git clone and ./ci_build.sh (assuming tools and prerequisite libraries are there already). There may be more manual labor to build it in a way that you'd install into the OS (configure paths to configs, binaries, temp files etc.), but to iterate and run and test in-place it would suffice. Speaking of latter, see also tests/NIT/nit.sh for testing built binaries with custom-made (and custom-located) config and pipe files.

@amikot
Copy link
Contributor Author

amikot commented Oct 27, 2022

Okay, I'm experimenting - have done some changes to driver code even compiled it, but I'm not sure how to test it.
Can I have installed modified version on my PC and use network to connect to the PC that has UPS connected?
Will be the client driver or server driver used in that case ?

Can driver be somehow tested without installing ?

@jimklimov
Copy link
Member

You should run the driver on the system that has connection to the UPS, more so for a direct-connect device. Even if you could tunnel that, it would be another layer of uncertainty that you want to avoid in debugging :)

Usually you'd need to stop the packaged driver service, to let another program (your newly built driver) exclusively access the port in the OS.

It may help to run the built driver with -h option to see the tunables, and run it with CLI configuration like

./riello_usb -s myups -DDDDDD -x port=auto

and maybe some more (e.g. start with sudo and use -x user=root option to not drop privileges and be allowed to access the port - if you did not manually configure the build with options like those used in packaging... or something like -x user=nut to match the packaging).

Not sure what the last question meant, maybe read up on the three-layered NUT architecture?

@amikot
Copy link
Contributor Author

amikot commented Oct 27, 2022

Will maybe test it after work I hope it will work - I have no any IDE - just used Geany and compiled the code.
Normally I do PHP/JS/HTML etc. so for debugging have apache/php log files and dev tools in browser. When rarely doing something for Android - Android Studio is complex solution with all what I need.
As this is just meantime project, I don't want to install IDE that I don't know anyway.
So I hope it gonna work as changes are simple and based on maths - calculated charge value and time on battery + some IFs to determine if values obtained UPS are correct or they need to be calculated.

@amikot
Copy link
Contributor Author

amikot commented Oct 28, 2022

Okay, I've tested and driver obviously works, but values from debug are different then expected, so need to work on it a bit more.

@jimklimov
Copy link
Member

No problem, I mostly use Midnight Commander editor in one terminal, and a command to build and run the tested program in shell history of another.

That said, Netbeans with C/C++ suppprt modules may be a good choice for multi-platform builds and runs (including remote debugging) when needed, e.g. to step through code.

@amikot
Copy link
Contributor Author

amikot commented Oct 30, 2022

Thanks for all advice. I've done the same - one terminal tab for compilation and tests, another tab with mc and editor.

But going back to topic - I've done all changes as I could and it looks good, but have some questions:

  1. There is unknown number of Riello models with incorrect data. My model is USV5 (iPLUG 600), but I've also seen that someone reported UNV1 which is iDIALOG 800
    I believe that there is much more models with this issue. I expect at least iPLUG 800 be one of them (don't know what model code would be discovered by NUT). In that case, I'm not sure if workaround should be enabled relying on model number as it is now in my code.

  2. Temperature cannot be calculated anyhow, so there are two options: set a certain dummy value like 0, or maybe I could push there simple n/a string that stands for non applicable.

  3. If I do my changes, how can I share them with the world? How can I get nice compiled deb production package? Now compiled files are 5 times bigger then those from packages - probably because of debug data? I've tried to replace currently installed riello_usb driver with compiled by me, but computer says "no" ;) I guess this may be caused they're from different versions - on my server I have 2.7.3 while edited one is 2.8.0.

So for now debug displays that which in my opinion is Okay:

   2.785513	[D5] send_to_all: SETINFO input.frequency "49.90"
   2.785571	[D5] send_to_all: SETINFO input.bypass.frequency "409.50"
   2.785601	[D5] send_to_all: SETINFO output.frequency "49.90"
   2.785628	[D5] send_to_all: SETINFO battery.voltage "13.6"
   2.785651	[D5] send_to_all: SETINFO battery.charge "100"
   2.785679	[D5] send_to_all: SETINFO battery.runtime "840"
   2.785703	[D5] send_to_all: SETINFO ups.temperature "N/A"
   2.785728	[D5] send_to_all: SETINFO input.voltage "249"
   2.785751	[D5] send_to_all: SETINFO input.bypass.voltage "4095"
   2.785775	[D5] send_to_all: SETINFO output.voltage "245"
   2.785798	[D5] send_to_all: SETINFO output.power.percent "0"
   2.785814	[D5] send_to_all: SETINFO ups.load "0"
   2.785832	[D5] send_to_all: SETINFO ups.status "OL"

I have done some tests disconnecting the input power, and values are being updated as well, but could not check most recent updates because beeper would alarm whole house, so this will be done tomorrow.

@jimklimov
Copy link
Member

jimklimov commented Oct 30, 2022

A common approach is to add a configuration file tunable (in this case probably a boolean flag mapped from text config to C variable - for one practical example see pollonly setting and handling in usbhid-ups codebase, and there are many more all across the codebase). So for "UPS where user sees such and such behavior, try adding X to its ups.conf section and restart the NUT driver".

As for the temperature, reported values should be valid (per docs/nut-names.txt, lacking some formal schema currently). Graphing and other programs may rely on the value being a number if present at all. A zero might be a reasonable report for an error; otherwise just detecting that it does not make sense (and/or relying on user-provided option to decide) and thus not calling dstate method to create/update the value should also work - the temperature would be just absent from the list of values.

@jimklimov
Copy link
Member

Regarding installation - size may differ due to debugging data built in and/or compiler optimizations not applied. Also some code could have grown in the years between 2.7.3 release and nowadays.

In case of NUT USB drivers, also note that since release 2.8.0 NUT supports a choice of building against libusb-0.1 and libusb-1.0. Earlier code only supported libusb-0.1. It is possible that the server with 2.7.3 does not even have libusb-1.0 installed while your build system did and preferred it - so the binary program would fail to dynamically link at run-time.

To "share with the world", set a uniquely named Git branch locally, commit the changes to it, post a pull request here and if it passes CI and makes sense otherwise - it would be merged and become part of NUT "master" branch and eventually next official release, and OS distributed packages based on that.

For local in-place upgrades and packaging, find the recipes used by your OS distro (NUT has many configure options to tune - and older packaging might not account for new options like choice of libusb), apply those to your build workspace, and either sudo make install to replace existing files (as root) or make install DESTDIR=/tmp/nut-inst to create a "proto area" under e.g. /tmp/nut-inst to tarball them and transfer to another machine. The latter is what packaging recipes typically do, in order to package the files that appear in that dedicated proto area.

Default configure options use paths mostly under /usr/local/ups/... to not conflict with distribution-provided files, so custom builds may coexist with packaging (may have to fiddle with service startup so only one implementation actually runs, and configuration file location - usually by --sysconfig=/etc/... option or symlinks).

@amikot
Copy link
Contributor Author

amikot commented Oct 30, 2022

Great!
Thanks for all your help. Pull request created, hopefully it will be accepted.
This is very new for me, so I'm very excited.

@amikot
Copy link
Contributor Author

amikot commented Nov 12, 2022

I would be pleased if you could somehow merge the changes from my branch into master. Sorry to say, but I don't understand the manual ... I'm too old or too stupid for that.

@amikot
Copy link
Contributor Author

amikot commented Nov 13, 2022

Hi again,
I feel I've lost in the jungle of github.
New commits appeared on main branch, but I cant pull them because of conflict.
I don't understand what conflicts are there if changes in mine fork and changes in main nut repo are not even in the same files.
For me github is like black bagic. I can and want to do some programming, but I can't learn github - sorry :(

If someone could pull changes from riello_ups.c that are on my fork amikot/nut - would be great.
Otherwise everything will be just wasted. :(

@jimklimov
Copy link
Member

Don't be harsh on yourself; git and its multiverse-like concepts take time to get adjusted to... not unlike jumping to OOP from "usual" programming. You may try for months in vain, and then something just clicks and all becomes obvious :\

Here it is not so much about github as one of many platforms to centralize the development effort on otherwise decentralized repos (each workspace is one, unique and independent), and more about git communications generally.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Riello Riello UPS devices (serial, usb), usually with a variant of Qx protocol Shutdowns and overrides and battery level triggers Issues and PRs about system shutdown, especially if battery charge/runtime remaining is involved USB
Projects
None yet
2 participants