Blind fix of raspberry halt due to unhandled int #195

Closed

@spiliot
spiliot commented Jan 18, 2013

Following this forum thread:
http://www.raspberrypi.org/phpBB3/viewtopic.php?f=28&t=23544&sid=5e54cf47d02ac7f344957b19d4f944b2&start=132
(the first message on page 6 should be the one by spiliot)

I blindly edited the interrupt handler so that it does something with the interrupt instead of just exiting. This now works in my case without any apparent complications.

The first time I send something to /dev/usb/lp0 (with debug enabled) I get:
usblp0: nonzero read bulk status received: -5
This probably explains why the dwc_otg driver was entering the interrupt loop in the first place, but I lack the knowledge to pursue it further.
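
For readers decoding the status above: kernel-style status values are negated errno numbers, and on Linux errno 5 is EIO (a generic I/O error). Below is a minimal sketch of a lookup helper. The name `status_name` and the handful of codes it lists are assumptions chosen for illustration; nothing here comes from the usblp or dwc_otg sources.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical helper: map a kernel-style status (0 or a negated errno)
 * to a short symbolic name. Only a few illustrative codes are covered. */
const char *status_name(int status)
{
    assert(status <= 0); /* kernel-style statuses are zero or negative */

    switch (-status) {
    case 0:
        return "OK";
    case EIO:
        return "EIO";       /* generic I/O error; 5 on Linux */
    case EPIPE:
        return "EPIPE";
    case EPROTO:
        return "EPROTO";
    case ETIMEDOUT:
        return "ETIMEDOUT";
    default:
        return "UNKNOWN";
    }
}
```

On a Linux system, `status_name(-5)` yields the EIO entry. That says only that the read failed at the I/O level; it does not identify the underlying cause.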

@spiliot spiliot Blind fix of raspberry halt due to unhandled int
7517eeb
@P33M
Contributor
P33M commented Jan 18, 2013

Ahoy there.

I think we went down this route previously when figuring out why USB-serial was so broken on r-pi.

From http://www.raspberrypi.org/phpBB3/viewtopic.php?f=28&t=16280&start=25

The gist of it is that a data toggle error interrupt can occur at the same time as a channel halted interrupt. The hc_chhltd handler doesn't check for data toggle errors and does nothing to reset the interrupt, so it loops forever. According to "the silicon documentation", which I have yet to see, a data toggle error should not cause a halt in the host channel.
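
The looping behaviour described above can be reproduced with a toy model. This is only a sketch under stated assumptions: the bit positions, the `fake_channel` struct and every function name are invented for illustration and do not reflect the real DWC OTG register layout or driver code. It shows that a handler which acknowledges only the channel-halted bit leaves the data toggle bit pending, so the interrupt is raised again on every pass, whereas a handler that masks the toggle error lets the line go quiet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions only; NOT the real HCINT layout. */
#define INT_CHHLTD     (1u << 0)
#define INT_DATATGLERR (1u << 1)

/* Hypothetical stand-in for one host channel's interrupt registers. */
struct fake_channel {
    uint32_t hcint;    /* pending interrupt status bits */
    uint32_t hcintmsk; /* enabled interrupt sources */
};

/* Models a halt handler that knows nothing about data toggle errors:
 * it acknowledges only the channel-halted bit. */
static void chhltd_handler_unaware(struct fake_channel *ch)
{
    ch->hcint &= ~INT_CHHLTD;
}

/* Models a handler that also deals with the toggle error by masking it,
 * so that it can no longer re-raise the interrupt. */
static void chhltd_handler_aware(struct fake_channel *ch)
{
    ch->hcint &= ~INT_CHHLTD;
    if (ch->hcint & INT_DATATGLERR)
        ch->hcintmsk &= ~INT_DATATGLERR;
}

/* Re-enter the handler while any unmasked bit is still pending.
 * Returns the number of passes taken, capped at max_passes if the
 * interrupt never goes quiet. */
int passes_until_quiet(bool aware, int max_passes)
{
    struct fake_channel ch = {
        .hcint = INT_CHHLTD | INT_DATATGLERR,
        .hcintmsk = INT_CHHLTD | INT_DATATGLERR,
    };
    int passes = 0;

    assert(max_passes > 0);
    while ((ch.hcint & ch.hcintmsk) && passes < max_passes) {
        if (aware)
            chhltd_handler_aware(&ch);
        else
            chhltd_handler_unaware(&ch);
        passes++;
    }
    return passes;
}
```

In this model, `passes_until_quiet(false, 1000)` runs into the cap of 1000 passes, while `passes_until_quiet(true, 1000)` returns after a single pass. The cap exists only so the simulation terminates; a real level-triggered interrupt has no such limit.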

As your case is reproducible on your setup, could you apply the following patch?

diff --git a/drivers/usb/host/dwc_otg/dwc_otg_hcd_intr.c b/drivers/usb/host/dwc_otg/dwc_otg_hcd_intr.c
index 3e762e2..df03c3f 100644
--- a/drivers/usb/host/dwc_otg/dwc_otg_hcd_intr.c
+++ b/drivers/usb/host/dwc_otg/dwc_otg_hcd_intr.c
@@ -1918,8 +1918,32 @@ static int32_t handle_hc_datatglerr_intr(dwc_otg_hcd_t * hcd,
                                         dwc_otg_hc_regs_t * hc_regs,
                                         dwc_otg_qtd_t * qtd)
 {
+       /* A data toggle error in a BULK or INTR transaction is benign and continuing
+        * the transaction will (as per USB spec) result in resynchronisation.
+        * In DMA mode the channel may also be halted automatically by the host -
+         * Therefore there is nothing to do here but cleanup host-side and try again
+        */
+       // FIXME: This code is for TEST PURPOSES to solve infinite looping in an interrupt
+       char * eptype;
        DWC_DEBUGPL(DBG_HCDI, "--Host Channel %d Interrupt: "
                    "Data Toggle Error--\n", hc->hc_num);
+       switch (hc->ep_type) {
+               case DWC_OTG_EP_TYPE_BULK:
+                       eptype = "BULK";
+                       break;
+               case DWC_OTG_EP_TYPE_INTR:
+                       eptype = "INTERRUPT";
+                       break;
+               case DWC_OTG_EP_TYPE_ISOC:
+                       eptype = "ISOCHRONOUS";
+                       break;
+               case DWC_OTG_EP_TYPE_CONTROL:
+                       eptype = "CONTROL";
+                       break;
+               default:
+                       eptype = "NO IDEA";
+       }
+       DWC_ERROR("Data Toggle Error - on endpoint type %s\n", eptype);

        if (hc->ep_is_in) {
                qtd->error_count = 0;
@@ -1927,9 +1951,10 @@ static int32_t handle_hc_datatglerr_intr(dwc_otg_hcd_t * hcd,
                DWC_ERROR("Data Toggle Error on OUT transfer,"
                          "channel %d\n", hc->hc_num);
        }
-
+
+       /* No choice but to disable and restart DMA channel as core has halted */
        disable_hc_int(hc_regs, datatglerr);
-
+       halt_channel(hcd, hc, qtd, DWC_OTG_HC_XFER_NO_HALT_STATUS);
        return 1;
 }

@@ -2078,6 +2103,8 @@ static void handle_hc_chhltd_intr_dma(dwc_otg_hcd_t * hcd,
                handle_hc_babble_intr(hcd, hc, hc_regs, qtd);
        } else if (hcint.b.frmovrun) {
                handle_hc_frmovrun_intr(hcd, hc, hc_regs, qtd);
+       } else if (hcint.b.datatglerr) {
+               handle_hc_datatglerr_intr(hcd, hc, hc_regs, qtd);
        } else if (!out_nak_enh) {
                if (hcint.b.nyet) {
                        /*
@@ -2120,6 +2147,7 @@ static void handle_hc_chhltd_intr_dma(dwc_otg_hcd_t * hcd,
                                halt_channel(hcd, hc, qtd,
                                             DWC_OTG_HC_XFER_PERIODIC_INCOMPLETE);
                        } else {
+                               /* BULK or CONTROL */
                                DWC_ERROR
                                    ("%s: Channel %d, DMA Mode -- ChHltd set, but reason "
                                     "for halting is unknown, hcint 0x%08x, intsts 0x%08x\n",
@@ -2127,6 +2155,8 @@ static void handle_hc_chhltd_intr_dma(dwc_otg_hcd_t * hcd,
                                     DWC_READ_REG32(&hcd->
                                                    core_if->core_global_regs->
                                                    gintsts));
+                               dump_stack();
+                               halt_channel(hcd, hc, qtd, DWC_OTG_HC_XFER_NO_HALT_STATUS);
                        }

                }

Note that this patch was made against a very old version of the Pi kernel, so some mixing and matching may be required to reproduce the same result.

@spiliot
spiliot commented Jan 19, 2013

Indeed, it looks like a data toggle error is occurring.

After incorporating the debug code, I get an endless loop of:
ERROR::handle_hc_datatglerr_intr:1948 Data Toggle Error - on endpoint type BULK

@ghollingworth
Contributor

This should be closed; it was fixed with a bunch of error handling.
