Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes missing subpackets in OS X #435

Merged
merged 1 commit into from
Oct 28, 2015
Merged

Conversation

fran6co
Copy link
Contributor

@fran6co fran6co commented Oct 20, 2015

Had some issues in OS X with libusb missing some subpackets for no good reason and some other weird errors as noted in => #390

Looking at libfreenect 1.x code I found that https://github.com/OpenKinect/libfreenect/blob/master/src/usb_libusb10.h#L33 uses different parameters for OS X, just ported that from there.

@fran6co fran6co changed the title Fixes missing packets in OS X Fixes missing subpackets in OS X Oct 20, 2015
@xlz
Copy link
Member

xlz commented Oct 20, 2015

Overall if this improves packet loss on Mac OSX, I'd say it's good to merge.

Though changing the size of transfer pool is risky if you don't understand why this change works. I thought about doing this long for several previous bugs where errors were reported for not enough bandwidth. It seems hard to provide a guarantee it works across all hardware.

My basic understanding of is transfer pool will reserve USB iso bandwidth with guaranteed latency w.r.t the size of the transfer pool. Reserving too much will cause not enough bandwidth error. Reserving too little will risk having lots of packet loss. And there is certain abstraction from usbfs buffer to "transfer" (Linux) so the number of transfers is not simply the bandwidth reserved.

Hopefully this can be explained with USB 3.0 spec and libusb code later.

@fran6co
Copy link
Contributor Author

fran6co commented Oct 20, 2015

That's why I changed it only for OS X, no idea how it works for the rest of the platforms but it works fine on mine and libfreenect 1 works fine too so I figured we could use the same config. I tried to keep PKTS_PER_XFER* NUM_XFERS around the same, it's 128 * 4 = 512 for OS X and 60 * 8 = 480 for the rest.

@xlz
Copy link
Member

xlz commented Oct 21, 2015

This uses 128 packets per transfer. I don't know your max_iso_packet_size (in libfreenect2.cpp). Assuming 0x8400, that is 4224KB per transfer. A single depth image requires raw data of 10x352x424 bytes, 1457.5 KB. So a single transfer contains data for 2 depth images, and incomplete data for the 3rd depth image, which would have to wait for the next transfer.

I think here unnecessary and non-uniform latency is introduced.

@xlz
Copy link
Member

xlz commented Oct 21, 2015

kIOReturnIsoTooOld comes from this pseudo stack trace:

  • ReadIsochPipeAsync() libusb/os/darwin_usb.c:1587
  • IOUSBInterfaceClass::ReadIsochPipeAsync(), IOUSBFamily...
  • IOConnectCallAsyncScalarMethod(..., kUSBInterfaceUserClientReadIsochPipe, ...)
  • IOUSBInterfaceUserClientV2::_ReadIsochPipe
  • DoIsochPipeAsync()
  • IOUSBPipe::Read(???)
  • IsocIO()
  • DoIsocTransfer()
  • IsocTransaction()
  • AppleUSBXHCI::UIMCreateIsochTransfer() - no public source code, use AppleUSBEHCI as reference

UIMCreateIsochTransfer() is the only place that generates kIOReturnIsoTooOld. kIOReturnIsoTooOld is returned immediately for the async read when command->GetStartFrame() < pEP->firstAvailableFrame or command->GetStartFrame() < (curFrameNumber - maxOffset).

Not sure what this means yet.

There are some more things you can try. I see a lot of USBLog() lines which kprintf to "kernel log" (is this a thing on Mac OSX?) Probably you can dig that out with https://developer.apple.com/hardwaredrivers/ "Debug releases of the IOUSBFamily kernel extension".

The second thing is this kIOReturnIsoTooOld seems to be generated by some static conditions. Maybe explore how frame changes in libusb/os/darwin_usb.c:1587

    kresult = (*(cInterface->interface))->ReadIsochPipeAsync(
          cInterface->interface, pipeRef, transfer->buffer, frame,
          transfer->num_iso_packets, tpriv->isoc_framelist, darwin_async_io_callback,
          itransfer);

and maybe certain conditions really make frame here outdated somehow.

@fran6co
Copy link
Contributor Author

fran6co commented Oct 21, 2015

I'm increasing it to 136, that should cover it.
In any case I'm going to setup a machine to run the kinect2 all day and see if I get the kIOReturnIsoTooOld error or the missing packets. The master version throws them from the go, so this is an improvement already.

@fran6co
Copy link
Contributor Author

fran6co commented Oct 21, 2015

It's been around 2 hours and no errors so far, it looks like this fixed the issue.

@xlz
Copy link
Member

xlz commented Oct 21, 2015

I think you should decrease it to reduce latency.

@fran6co
Copy link
Contributor Author

fran6co commented Oct 21, 2015

I don't understand what's the problem with the latency... I mean, the only problem is the time it takes to retrieve the extra data for the next frame, I don't think that extra time is that big compared to all the processing it's doing. In any case I'll reduce it to fix a single depth image in the transmission.

@xlz
Copy link
Member

xlz commented Oct 21, 2015

The application is realtime and latency sensitive. If you batch two depth images in one libusb transfer, the two images are processed together, resulting in non-uniform performance down the pipeline. And by the time the second image is processed, the first image is already outdated.

@fran6co
Copy link
Contributor Author

fran6co commented Oct 21, 2015

Ok, I reduced it to 64 that's around 1 depth image and it looks to be working.

@xlz
Copy link
Member

xlz commented Oct 22, 2015

Can you try libusb with this change (with 8 packets per transfer setup):

  /* schedule for a frame a little in the future */
  frame += 64;

frame += 64 was originally added in libfreenect ("+= 2 was too short"):
OpenKinect/libfreenect@562a7b2

I looked at AppleUSBXHCI, and frame should be scheduled a little bit a head of pEP->firstAvailableFrameNumber, otherwise other transfers could race beyond that. (But I don't see any race condition though). If 4 is not enough, try 64.

@floe floe added this to the 0.1 milestone Oct 22, 2015
@fran6co
Copy link
Contributor Author

fran6co commented Oct 22, 2015

I'll check it out, in any case keep it mind that the errors are totally gone with this PR using homebrewed libusb 1.0.20, tested in 10.10 and 10.11.

@fran6co
Copy link
Contributor Author

fran6co commented Oct 22, 2015

Ok, just tested. I'm not getting the kIOReturnIsoTooOld errors anymore but I still get all the rest.

@xlz
Copy link
Member

xlz commented Oct 22, 2015

OK, let's go with this for Mac users for now with increased latency. It's still a mystery why 60 concurrent transfers are giving MacOSX a hard time though.

@xlz
Copy link
Member

xlz commented Oct 22, 2015

One more note, when receiving two depth images together the depth processor will not be ready for the next one, resulting in "skipping depth packet" for every other depth image. I tested 128 packets per transfer on Linux and got that.

@fran6co
Copy link
Contributor Author

fran6co commented Oct 22, 2015

Uhm, I think the only way to solve that is to add some buffering in between. Any other ideas?

@xlz
Copy link
Member

xlz commented Oct 22, 2015

Sorry, there was an error in my calculation.

It's 352x424 shorts, not 352x424 bytes. Therefore for each depth images, there are 10*(352*2*424+38*4) = 2986480 bytes transferred. Assuming packet size of 0x8400, that is 89 packets at minimum for each depth image. There is idle room between each "subpacket", so 128 packets per transfer is actually probably the correct number if you want to fit one depth image in one transfer. Sorry for the confusion.

@fran6co
Copy link
Contributor Author

fran6co commented Oct 22, 2015

Back to 128. I'm getting skipped packets warnings but not more than before.

floe added a commit that referenced this pull request Oct 28, 2015
Fixes missing subpackets in OS X
@floe floe merged commit cdeb079 into OpenKinect:master Oct 28, 2015
@fran6co fran6co deleted the fix-apple branch October 28, 2015 13:36
@vcastellm
Copy link

I think I'm suffering this problem in OSX 10.11 but with libfreenect, The device just stop working with glview.

Tested the Kinect 360 with another Mac with 10.10 and it works well so I thinks it's related to the libusb 1.0.20

Relevant logs https://gist.github.com/victorcoder/cb259ce2eb3b04682134

@xlz
Copy link
Member

xlz commented Jan 13, 2016

@Victorcoder I think you should report the issue with https://github.com/OpenKinect/libfreenect/issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants