Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Please port to libusb 1.0 #300

Closed
bigon opened this issue Jul 18, 2016 · 138 comments
Closed

Please port to libusb 1.0 #300

bigon opened this issue Jul 18, 2016 · 138 comments
Assignees
Milestone

Comments

@bigon
Copy link
Contributor

bigon commented Jul 18, 2016

Hi,

nut currently uses libusb 0.1 which is deprecated for quite some time. Nut should be ported to libusb 1.0

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=810449

@jblachly
Copy link

Agree; I was trying to debug a nut installation on illumos and it took me quite some time to even locate the source code for libusb 0.1.

@aquette
Copy link
Member

aquette commented Aug 1, 2016

thanks for the report guys, this is on my TODO stack for too long, and time has come to move on...
I'll start pushing a libusb-1.0 branch over the day, and will start to call for testing as soon as possible.

@clepple
Copy link
Member

clepple commented Aug 1, 2016

@aquette I am curious if this is the best use of our time (both development and testing). libusb-1.0 is designed to overcome API issues relating to asynchronous calls (NUT operates synchronously, in a single thread), and to add support for USB 2.0+ features (Remember that most UPSes are USB 1.x, and many are 1.5 MBit/sec). If nothing else, this sounds like an excellent use case for libusb-compat, which is not marked deprecated, just that the 0.1 API is not recommended for new code.

If we are going to require users to test their setup again, let's at least make it worth their time.

@clepple
Copy link
Member

clepple commented Aug 1, 2016

@jblachly not sure how illumos does their package mirroring, but for future reference, the upstream source is at SourceForge and Debian keeps mirrors of the tarballs that go into their packages.

@aquette
Copy link
Member

aquette commented Aug 3, 2016

@jimklimov you may be able to shed some light on the Illumos package for libusb 0.1...

@jimklimov
Copy link
Member

@jblachly @aquette : I should start by reminding that "illumos" is not an OS by itself, but rather a kernel and some basic tooling. Things like libusb are part of userland which comes (or not) with specific distributions, of which there are a dozen or so.

I mostly work with OpenIndiana (and its bleeding-edge Hipster subproject), which currently has recipes mostly from back in 2012, for versions 0.1.8 of libusb-wrapper and libusb-ugen, with sources for both coming from http://libusb.sf.net/ - see https://github.com/OpenIndiana/oi-userland/tree/oi/hipster/components/libusb/ugen and https://github.com/OpenIndiana/oi-userland/tree/oi/hipster/components/libusb/wrapper .

As I see on an OmniOS Bloody box, they have driver/usb and driver/usb/ugen as well as system/header/header-usb and system/header/header-ugen packages, though I can't quickly determine the source and version for those.

I believe both of these sets of packages/recipes can plausibly date back to Sun OpenSolaris packaging (SUNWlibusbugen SUNWugen{,u} SUNWusb{,s,u,vc} etc.) including non-bumped versions.

@bigon
Copy link
Contributor Author

bigon commented Aug 3, 2016

I don't think libusb-compat is in the debian archive

@jblachly
Copy link

jblachly commented Aug 3, 2016

@jimklimov I am working on SmartOS, but more importantly I am going to put some effort into the illumos kernel USB drivers in the area of HID-power. As you know, illumos recognizes HID-power class USB devices as "hid", but illumos' / opensolaris usba driver hid has only drivers for kb and mouse, IIRC. So in order to use an HID-power UPS with any illumos distribution and nut, one has to load the ugen driver.

My earlier comment about "locating the source" referred not to getting nut installed (SmartOS and pkgsrc 'nut' package got everything installed no problem) , but rather when trying to track down why nut didn't work for me (the problem being of course with the hid driver versus the ugen driver), as I needed to review the libusb 0.1 source code to figure out that it was looking for /dev/usb/<vid>,<pid>/ directory tree that doesn't exist when USB device is owned by hid driver.

Anyway, please let me know if I can help in any way, and thanks everyone for working on NUT.

@aquette
Copy link
Member

aquette commented Aug 5, 2016

@jblachly having worked / assisted / contributed to ugen on Solaris with the developer (Jan Van Bruen), back in the early 2000, and also on the Linux kernel implementation, I can help. But I don't think this effort is worth. As for NUT, most USB (and HID) userland implementation just needs a generic USB access, and implement HID support on their own. Otherwise, take a look at libhid and similar userland efforts. Moreover, HID kernel support tends to be overweighted.
It would be better, as I wanted for years, to develop hotplug script (iirc, we have something to complete in NUT) so that ugen is loaded when a device requires it...

@aquette
Copy link
Member

aquette commented Aug 5, 2016

worklog on the port: I've committed the configure checks:
...
checking for libusb version via pkg-config... 1.0.19 found
checking for libusb cflags... -I/usr/include/libusb-1.0
checking for libusb ldflags... -lusb-1.0
checking for libusb.h... yes
checking for libusb_init... yes
checking for libusb_detach_kernel_driver... yes
...

Configuration summary:

...
build USB drivers: yes (libusb-1.0)
...

Actual implementation for the 1.0 backend underway... stay tuned.

@jblachly
Copy link

jblachly commented Aug 5, 2016

@aquette Thanks -- Perhaps best solution is to ensure that HID-Power class devices get the ugen driver loaded instead of hid, without having to put vendor specific usb<vid>,<pid> strings in /etc/driver_aliases?

@aquette
Copy link
Member

aquette commented Aug 9, 2016

@jblachly: indeed, that would be optimal... but not sufficient. NUT has support for USB/HID PDC (power device class) and also for XCP, Qx, and more (drivers: bcmxcp_usb, blazer_usb, nutdrv_atcl_usb, richcomm_usb, riello_usb, tripplite_usb, nutdrv_qx)

worklog on the port: usbhid-ups port complete (available on the https://github.com/networkupstools/nut/tree/libusb-1.0 branch).
I've made some non-regression on an Eaton Protection Station (Vs 2.7.4), and tested settings (upsrw) too. Everything's fine so far, but tests and feedback welcome.

@aquette
Copy link
Member

aquette commented Aug 10, 2016

Worklog tracker

  • configure checks, and libusb 1.0 / 0.1 detection and activation
  • usbhid-ups port
  • bcmxcp_usb port
  • blazer_usb port
  • nutdrv_atcl_usb port
  • richcomm_usb port
  • riello_usb port
  • tripplite_usb port
  • nutdrv_qx port
  • nut-scanner port

@clepple
Copy link
Member

clepple commented Aug 10, 2016 via email

@aquette
Copy link
Member

aquette commented Aug 10, 2016

indeed @clepple , thx for pointing ;)
should be good for the current scope, however bear in mind that the port is not finished, so buildbot will fail to compile the tree, while it's not complete...

@aquette
Copy link
Member

aquette commented Aug 11, 2016

all,
I've completed the port and further reviews/adjustments, which should cover most if not all things, hopefully.
There may still be a few things left, so I'll be reviewing a last time, but you can anyhow review and test on your side, prior to considering the mainline merge.
Otherwise, it compiles and distcheck-light fine on both 1.0 and 0.1.
usbhid-ups is good and tested (not big tests, but good coverage still);
I'll try to test bcmxcp_usb too if I can get a hand on an XCP/USB.

@aquette aquette added this to the 2.7.5 milestone Aug 12, 2016
@aquette
Copy link
Member

aquette commented Aug 12, 2016

@cleppe: I've completely missed your comment on "the use of our time". I indeed think that switching to libusb 1.0 (not compat, which can't be marked as deprecated, since part of libusb 1.0) is a good thing. Even if we're not using isochronous endpoint, 1.0 is the way forward, and as you saw in the commit list, the port was not a huge deal. This will ease some integration, esp. on Windows and probably some BSD too (inc. Darwin). So we're now ready for the future, on the USB front...

@aquette aquette added ready / code review Author (and CI) consider the PR worthy of human rewievers' time and removed WIP labels Aug 12, 2016
@clepple
Copy link
Member

clepple commented Aug 12, 2016

@aquette there may not have been much to commit, but there is a lot to test, and we can't simulate all of the USB error conditions that users may encounter.

I am still not sure this is a big win for users. On OS X, we still have the same problem with USB HID devices: libusb-style access competes with the HID driver. BSD has had a problematic mix of libusb 0.1 and 1.0 APIs, and again, error conditions have been just different enough to cause problems. And I am not sure that moving to libusb 1.0 automatically solves the USB access problem on Windows (though I would like to be proven wrong on that front).

Let's not rush the 2.7.5 release. This will need a lot of testing.

@clepple
Copy link
Member

clepple commented Aug 12, 2016

@aquette can you please check on this? might only be needed for libusb-0.1.

../../../drivers/libusb1.c:70:19: warning: unused function 'typesafe_control_msg' [-Wunused-function]

@clepple
Copy link
Member

clepple commented Aug 14, 2016

@aquette I commented over on the libusb1.c patch, but it isn't showing up here. Slightly updated:

HAVE_LIBUSB_DETACH_KERNEL_DRIVER matches libusb_detach_kernel_driver() rather than libusb_set_auto_detach_kernel_driver(). FreeBSD 10.1-10.3 have the former, but not the latter. I'm guessing we use the auto variant if it exists, and fall back to the earlier one (or nothing) if not.

Reference: https://www.freebsd.org/cgi/man.cgi?format=html&query=libusb(3)

@aquette
Copy link
Member

aquette commented Aug 16, 2016

@clepple

  • I've fixed unused function 'typesafe_control_msg'
  • thanks for shedding light on the detach Vs auto_detach, I'll check the situation and rework that
  • on the need (or not) to rush for pushing this code: my stance is that if this code is released, it has a wider availability so more chances to be packaged (dedicated build) and tested, than github branches. That said, I'm wondering if we shouldn't force default to 0.1 to avoid people thinking they run 0.1 while it's 1.0...

aquette added a commit that referenced this issue Aug 17, 2016
libusb 1.0 has introduced a new function (libusb_set_auto_detach_kernel_driver),
beside from the explicit kernel driver detachment request
(libusb_detach_kernel_driver). However, the former is not available on all
systems. As an example, FreeBSD 10.1-10.3 does not have this. The detachment and
interface claiming has been reworked to handle this case (reported by Charles
Lepple)

Reference: #300
@clepple
Copy link
Member

clepple commented Aug 20, 2016

@aquette the last commit doesn't cover richcomm_usb, which doesn't use the normal libusb code: http://buildbot.networkupstools.org/public/nut/builders/FreeBSD-x64/builds/407/steps/compile/logs/stdio

@clepple
Copy link
Member

clepple commented Aug 22, 2016

@aquette Rather than make the altsetting code only build on libusb-0.1, it should probably use this: http://libusb.sourceforge.net/api-1.0/group__dev.html#ga3047fea29830a56524388fd423068b53

@abratchik
Copy link
Contributor

USB protocol is as old as mammoth manure so it meant to be confusing :) On a bit serious note, it would be probably nice to follow some standards in order to make our life a bit easier.

I guess, 0x8 or 0x9 means the packet size in bytes. 1000 means timeout interval in milliseconds. Probably should be a const variable rather than a macro.

EP 0 is important in the USB, it is used for control communication with the device. Some more light reading here

@clepple
Copy link
Member

clepple commented Nov 17, 2021

@abratchik wrote:

I guess, 0x8 or 0x9 means the packet size in bytes. 1000 means timeout interval in milliseconds. Probably should be a const variable rather than a macro.

No need to guess: https://www.cs.unm.edu/~hjelmn/libusb_hotplug_api/group__syncio.html#gadb11f7a761bd12fc77a07f4568d56f38

EP 0 is important in the USB, it is used for control communication with the device.

I think Jim was referring to interface number zero here, rather than EP 0, because PR 1044 is about configurable interface numbers.

@jimklimov wrote:

My question to libusb experts is: can (and should) these magical numbers (and mathematical additions) be replaced with "proper" named macros, bit-flag operations, etc. to reduce confusion and simplify maintenance? If the integer values are multi-purpose (e.g. flags in high bits, interface or endpoint number in lower bits), the variable "hid_ep_in" should be range-checked, and/or some union type used?

https://github.com/libusb/libusb/blob/master/libusb/libusb.h#L594 describes the format of the endpoint number.

Something symbolic sounds good. I have no preference on the implementation, though using | for bitfields seems more idiomatic than +. (Ideally, the endpoint numbers are fetched from interface descriptors anyway, but I imagine there are a number of devices with broken i/f descriptors.)

@jimklimov
Copy link
Member

jimklimov commented Nov 17, 2021

I am not fully sure what I was asking about (vague a bit about terminology and concepts) so risk saying something stupid or misleading, but I saw weirdness in (I think) argument formulations for both endpoints (and some "1" often there) and interfaces. Having all of them referenced in an understandable manner is beneficial :)

@marcan
Copy link

marcan commented Nov 18, 2021

Question to people who dealt with non-zero USB interface numbers (CC: @abratchik, @zykh, @clepple, @aquette OTOH) and LibUSB in general: we have several drivers and libusbN.c code which follows the same pattern, to deal with USB in master branch or these libusb-1.0 related branches, and part of that pattern had a number of operations dealing with either a literal 0x81 argument, or some macro (with version-dependent name) resolving to 0x80 and either adding or bitwise-OR'ing a 1 as a number.

If you see 0x80, that's an IN endpoint number, not an interface number.

What the right way to deal with endpoint numbers is depends on the devices involved. Often you can get away with hardcoding endpoint numbers if they never change between device variants. Sometimes it's better to look at the descriptors and identify the endpoints by their position inside an interface, which itself can be identified by index or by its class or in some other way.

usb_clear_halt(dev_h, 0x81)

That's IN endpoint 1.

ret = usb_interrupt_read(udev, 0x81, (usb_ctrl_char)tmp, 8, 1000);

Same, IN endpoint 1 (which must be of interrupt type in this case). The 8 later is the buffer length and 1000 is a timeout.

ep = 0x81 | USB_ENDPOINT_IN

This is actually wrong; USB_ENDPOINT_IN should be equal to 0x80. It should be 1 | USB_ENDPOINT_IN to denote IN endpoint 1.

(note that IN 1 (0x81) and OUT 1 (0x01) are distinct, separate endpoints)

ret = usb_interrupt_read(udev, USB_ENDPOINT_IN | 1, (usb_ctrl_char)&buf[i], 8, 1000);

Same story again, IN endpoint 1, length 8, timeout 1000.

ret = libusb_interrupt_transfer(udev,
	LIBUSB_ENDPOINT_IN + usb_subdriver.hid_ep_in,
	(unsigned char *)buf, bufsize, &bufsize, timeout);

In this case the HID input endpoint is, presumably, dynamically determined based on descriptors or some other way, so it comes from a structure, but it presumably lacks the IN bit so that gets added here (ORed would be more appropriate though).

res = usb_control_msg(udev,
                                USB_ENDPOINT_IN + 1, // <<< this
                                USB_REQ_GET_DESCRIPTOR,
                                (USB_DT_HID << 8) + usb_subdriver.hid_desc_index,
                                usb_subdriver.hid_rep_index,
                                buf, 0x9, USB_TIMEOUT);
               /* in libusb0 was: USB_ENDPOINT_IN + 1 */
               res = libusb_control_transfer(udev,
                       LIBUSB_ENDPOINT_IN|LIBUSB_REQUEST_TYPE_STANDARD|LIBUSB_RECIPIENT_INTERFACE,
                       LIBUSB_REQUEST_GET_DESCRIPTOR,
                       (LIBUSB_DT_HID << 8) + usb_subdriver.hid_desc_index,
                       usb_subdriver.hid_rep_index,
                       buf, 0x9, USB_TIMEOUT);

I'm pretty sure the libusb0 code is written incorrectly (but works by accident). The prototype for that function is:

int usb_control_msg(usb_dev_handle *dev, int requesttype, int request,
	int value, int index, char *bytes, int size, int timeout);

So the first parameter is the request type. But USB_ENDPOINT_IN + 1 reads like it's trying to specify that the control request should be sent to IN endpoint 1, which doesn't make much sense (sending control requests to non-zero endpoints is exceedingly rare and not even supported by this API), and isn't what that parameter does. However, if you evaluate LIBUSB_ENDPOINT_IN|LIBUSB_REQUEST_TYPE_STANDARD|LIBUSB_RECIPIENT_INTERFACE, you'll see it is 0x81 indeed (which does not mean IN 1 here, rather it's a bitfield describing the control request type). So in the end both versions do the same thing, but the libusb0 one is written in a confusing way and the author may not have understood it properly.

and examples above also show a magical 0x9 that is sometimes 0x8 in other calls.

That's a length.

My question to libusb experts is: can (and should) these magical numbers (and mathematical additions) be replaced with "proper" named macros, bit-flag operations, etc. to reduce confusion and simplify maintenance? If the integer values are multi-purpose (e.g. flags in high bits, interface or endpoint number in lower bits), the variable "hid_ep_in" should be range-checked, and/or some union type used?

The 0x81 stuff can be replaced with LIBUSB_ENDPOINT_IN|1 in most cases. That said, endpoint numbers are such a fundamental part of USB that I think anyone working on this stuff probably understands that 0x81 means IN 1 and 0x01 means OUT 1. It would be more useful to have a define specific to each driver, e.g. something like:

#define FOODEV_EP_CMD (LIBUSB_ENDPOINT_IN | 1)
#define FOODEV_EP_REPLY (LIBUSB_ENDPOINT_OUT | 1)

Which makes it clear that these are hardcoded endpoint numbers for that given device.

Addition should almost always be a logical OR instead; it doesn't make a difference for combining bitfields, but addition will break if e.g. one component already had the IN bit set, and OR is more idiomatic.

Otherwise it is quite annoying to not know if e.g. two branches of same or similar-looking logic do same things or not, and whether that is intentional or just historic/other discrepancy (in short: should I overwrite one with another? Looking at code, I have no idea nor clues to nudge me in the right direction so far).

And also, since there were several efforts in these libusb-1.x branches and otherwise (e.g. in PR #1044) to avoid the hardcoded USB interface number zero in favor of configurable use of other values, I can't say now if all of NUT codebase magically benefits from existence of this configuration, or if some code continues to interact with "number zero". Likewise, I do not really know if the "number zero" is somehow magical for USB (e.g. maybe some mgmt operations must be done through it, and data operations may be at other numbers).

Endpoint zero is magical and used for control transfers almost always (and nothing else). For simple devices with only two data endpoints, those are often going to be either IN 1 (0x81) and OUT 1 (0x01), or something like IN 1 (0x81) and OUT 2 (0x02) (some devices prefer to use separate endpoint numbers for IN/OUT; this could be a USB device controller hardware limitation, or just caused by confused embedded engineers thinking they can't do that even though it's fully allowed). For a device with one interface, they would be part of interface 0, and you can usually get away with hardcoding them in that case. Devices with multiple interfaces though could end up having endpoint numbers change between device revisions / hardware; in that case it's a better idea to get the endpoint numbers from the interface descriptor.

HID is a particularly special case, because it's a standard and often implemented alongside other things. Since many devices implement HID, you can't hardcode interface or endpoint numbers. The correct way to do this is to look through all interface descriptors and find the one that claims to have the HID class; then use that interface number and get the endpoint descriptors from that interface descriptor. That way you don't have to hardcode anything for specific devices.

@jimklimov
Copy link
Member

Thanks a lot for great details, should help us laymen understand the context... Would any PRs from you be possible to rectify this "correctly"? ;)

@almereyda
Copy link

After the recent merge of master into #1184, the installation instructions in #300 (comment) seem still valid, while little adjustments were needed here on a supposedly more blank Ubuntu system to reach the end:

  • In addition to the requirements stated above, an apt install autoconf libtool helped providing further necessities.
  • The configure command parameters reference two different architectures. It was preferable to stick with x86_64 in both cases, while others will only find aarch64 useful.

The service is running and no broadcast messages about lost Communications were displayed again up to this point. Will return and update here, in case that happens again.

@jimklimov
Copy link
Member

jimklimov commented Dec 21, 2021

Further working to converge the candidate libusb-1.0(+0.1) branches to get one of them merged into master, I realized that indeed libusb-1.0+0.1 is a superset of "plain" libusb-1.0, and beside some fixes applied in C code (xmalloc/xstrdup mostly), they differed in configure.ac and m4 included code to enable libusb-1.0 (or allow choosing the implementation).

I proceeded currently with additional branches https://github.com/networkupstools/nut/tree/fightwarn-libusb-1.0+0.1 and https://github.com/networkupstools/nut/tree/fightwarn-libusb-1.0 to minimize the "non-functional differences" of the two branches (in particular, using the same M4 code for configure script now), and to proceed with the "Fightwarn effort" started a year ago with #823 - to have to code build cleanly without warnings, iterating to survive more and more warnings enabled. The latter is important, to allow for peace of mind when we add new features and refactor old bugs, to know that we did not introduce regressions for some platform and tool/language revision.

To this effect, the latest changes I applied to those two efforts are a bit questionable to myself, but still seem the most reasonable way forward for the fightwarn effort and even for the goal of cleanly gaining libusb-1.0+0.1 in master as well.

Namely, with our closely-related implementations for libusb0, libusb1, libshut, libhid and hidparser sources, we juggle more or less the same logical content all with different integer and pointer types as dictated by this or that backend. Some of those libraries were already converged to size_t and similar standard types during the past year of fightwarn, but for others trying to do so caused an explosion of issued warnings about inconsistent use of the types. Eventually the easiest way to complete the fightwarn-medium level for a confident NUT 2.7.5 release was to revive the libusb1.0 integration (otherwise tentatively delayed to 2.7.6) because it had introduced an abstraction for "usb_ctrl_char" type, and to build up from that - typedef'ing other typical arguments of the USB API and peppering the code with range-checked casts to those names.

This way, at least as long as each driver is ultimately built into a single binary (no shared objects involved to mix the ABIs) - maybe starting with same sources (and different macro settings via Makefiles and configure script results), but using different object files where appropriate (e.g. SHUT vs USB builds of otherwise mostly-same drivers), each such built codepath uses the types of the respective backend API either "natively" or safely cast, with no compiler warnings in the end.

In fact, this seems like a good stepping stone toward a more reasonable implementation - with NUT code defining and using some same standard integer types for these values (probably picking some "least common denominator" for the native backend APIs and ultimately wire protocols, using uint8_t, uint16_t, uint32_t, size_t etc. and adding more range-checked casts where appropriate in the libraries) so the ultimate consumer code in the actual drivers would not have to bother about the typedefs and their value ranges, but would use the C-standard numeric types and allow shared objects and compatible ABIs. But now that goal can be achieved separately (and maybe in another release), by redefining these typedefs to same tokens and making the codebase variants build and work with that new definition.

Another separate effort would be to revise and reword the "magic 0x81" and bit maths, detailed in a few recent comments above.

Perhaps one large last thing to take care of before landing these changes into master is to check that these "rather theoretical" codebase fixes did not break functionality in practice! So whoever can test https://github.com/networkupstools/nut/tree/fightwarn-libusb-1.0+0.1 with their USB and/or SHUT devices using various libusb versions - please do so and chime back here :)

@jimklimov
Copy link
Member

Hm, auto-close was not anticipated... but more or less matches reality - the branches were merged...

buildroot-auto-update pushed a commit to buildroot/buildroot that referenced this issue Jul 23, 2022
Removed dependency to host-perl, not needed anymore after
networkupstools/nut@3f7d050

Removed autoreconf, the tarball provided by upstream contains a
configure script.

This configure script however adds /usr/local/include to some tests
which fail for us, instead of patching configure.ac and going through
the autoreconf nightmare again we patch the configure script directly.

Added configure option --without-doc

Added more autoreconf variables to fix cross-compile issues.

Changed "--with-drivers=all" to "--with-drivers=auto" because some new
drivers need packages not provided by buildroot, for example:
configure: error: Powerman client libraries not found, required for
 Powerman PDU client driver

Added optional dependency to freeipmi.

Added optional dependency to libusb, for details see
networkupstools/nut#300

Signed-off-by: Bernd Kuhls <bernd.kuhls@t-online.de>
Signed-off-by: Arnout Vandecappelle (Essensium/Mind) <arnout@mind.be>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests