Skip to content
This repository

USB devices causing "XactErr without NYET/NAK/ACK" #35

Closed
tomstrummer opened this Issue June 07, 2012 · 8 comments

6 participants

Thom Nichols ghollingworth jstsch Zealot Filippo Valsorda popcornmix
Thom Nichols

I have two USB devices that fail to work when plugged into any of the onboard USB ports. The following appears in dmesg:

DEBUG:handle_hc_chhltd_intr_dma:: XactErr without NYET/NAK/ACK

One is the aluminum Apple USB keyboard, the other is a realtek wifi adapter. Both of the devices work if plugged into a powered USB hub (I haven't tried when the hub is not AC powered.) Some folks are suggesting that this is not a power problem though, that it could be a USB issue that is inadvertently solved when a hub is used.

So - is this a power issue or a USB timing issue?

jstsch
jstsch commented June 16, 2012

I have seen the same errors quite often. I think this is similar to issue #29 and I doubt it is power related.

Zealot
kghost commented June 27, 2012

I have tested, this is not power issue, my powered hub don't drain any power from pi (measured 0V over usb fuse), still got this error.

On the contract, when plugin the usb device on pi directly, it drains 80-100mA from pi (measured 0.4V over usb fuse), but works perfect

Filippo Valsorda

I am affected too, with a rt73usb wireless dongle over a powered hub.

Filippo Valsorda

For me it happened during boot or as soon as the device was connected and never stopped. Now preloading usbcore (from /etc/modules) it shows up for a couple of seconds and then everything get working (also the Wi-Fi).

Filippo Valsorda

Nevermind, it stopped working as soon as I tried to connect to a network (scanning was working). Now dmesg is filled up with these errors and everything that tries to use the device freeze until I plug it out.

popcornmix
Owner

Can you try with latest firmware and with:
dwc_otg.microframe_schedule=1
in cmdline.txt

popcornmix
Owner

Is this still a problem with latest firmware?

Stephen Warren swarren referenced this issue from a commit September 28, 2012
Commit has since been removed from the repository and is no longer available.
Stephen Warren swarren referenced this issue from a commit October 02, 2012
Commit has since been removed from the repository and is no longer available.
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi May 09, 2013
Masami Hiramatsu (Hitachi) ftrace, kprobes: Fix a deadlock on ftrace_regex_lock
Fix a deadlock on ftrace_regex_lock which happens when setting
an enable_event trigger on dynamic kprobe event as below.

----
sh-2.05b# echo p vfs_symlink > kprobe_events
sh-2.05b# echo vfs_symlink:enable_event:kprobes:p_vfs_symlink_0 > set_ftrace_filter

=============================================
[ INFO: possible recursive locking detected ]
3.9.0+ #35 Not tainted
---------------------------------------------
sh/72 is trying to acquire lock:
 (ftrace_regex_lock){+.+.+.}, at: [<ffffffff810ba6c1>] ftrace_set_hash+0x81/0x1f0

but task is already holding lock:
 (ftrace_regex_lock){+.+.+.}, at: [<ffffffff810b7cbd>] ftrace_regex_write.isra.29.part.30+0x3d/0x220

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(ftrace_regex_lock);
  lock(ftrace_regex_lock);

 *** DEADLOCK ***
----

To fix that, this introduces a finer regex_lock for each ftrace_ops.
ftrace_regex_lock is too big of a lock which protects all
filter/notrace_hash operations, but it doesn't need to be a global
lock after supporting multiple ftrace_ops because each ftrace_ops
has its own filter/notrace_hash.

Link: http://lkml.kernel.org/r/20130509054417.30398.84254.stgit@mhiramat-M0-7522

Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Tom Zanussi <tom.zanussi@intel.com>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
[ Added initialization flag and automate mutex initialization for
  non ftrace.c ftrace_probes. ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
f04f24f
ghollingworth ghollingworth closed this July 17, 2013
ghollingworth
Owner

There was no further report of this case

popcornmix popcornmix referenced this issue from a commit July 10, 2013
drm/radeon: add missing ttm_eu_backoff_reservation to radeon_bo_list_…
…validate

Op 10-07-13 12:03, Markus Trippelsdorf schreef:
> On 2013.07.10 at 11:56 +0200, Maarten Lankhorst wrote:
>> Op 10-07-13 11:46, Markus Trippelsdorf schreef:
>>> On 2013.07.10 at 11:29 +0200, Maarten Lankhorst wrote:
>>>> Op 10-07-13 11:22, Markus Trippelsdorf schreef:
>>>>> By simply copy/pasting a big document under LibreOffice my system hangs
>>>>> itself up. Only a hard reset gets it working again.
>>>>> see also: https://bugs.freedesktop.org/show_bug.cgi?id=66551
>>>>>
>>>>> I've bisected the issue to:
>>>>>
>>>>> commit ecff665
>>>>> Author: Maarten Lankhorst <m.b.lankhorst@gmail.com>
>>>>> Date:   Thu Jun 27 13:48:17 2013 +0200
>>>>>
>>>>>     drm/ttm: make ttm reservation calls behave like reservation calls
>>>>>
>>>>>     This commit converts the source of the val_seq counter to
>>>>>     the ww_mutex api. The reservation objects are converted later,
>>>>>     because there is still a lockdep splat in nouveau that has to
>>>>>     resolved first.
>>>>>
>>>>>     Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
>>>>>     Reviewed-by: Jerome Glisse <jglisse@redhat.com>
>>>>>     Signed-off-by: Dave Airlie <airlied@redhat.com>
>>>> Hey,
>>>>
>>>> Can you try current head with CONFIG_PROVE_LOCKING set and post the
>>>> lockdep splat from dmesg, if any? If there is any locking issue
>>>> lockdep should warn about it.  Lockdep will turn itself off after the
>>>> first splat, so if the lockdep splat happens before running the
>>>> affected parts those will have to be fixed first.
>>> There was an unrelated EDAC lockdep splat, so I simply disabled it.
>>>
>>> This is what I get:
>>>
>>> Jul 10 11:40:44 x4 kernel: ================================================
>>> Jul 10 11:40:44 x4 kernel: [ BUG: lock held when returning to user space! ]
>>> Jul 10 11:40:44 x4 kernel: 3.10.0-08587-g496322b #35 Not tainted
>>> Jul 10 11:40:44 x4 kernel: ------------------------------------------------
>>> Jul 10 11:40:44 x4 kernel: X/211 is leaving the kernel with locks still held!
>>> Jul 10 11:40:44 x4 kernel: 2 locks held by X/211:
>>> Jul 10 11:40:44 x4 kernel: #0:  (reservation_ww_class_acquire){+.+.+.}, at: [<ffffffff813279f0>] radeon_bo_list_validate+0x20/0xd0
>>> Jul 10 11:40:44 x4 kernel: #1:  (reservation_ww_class_mutex){+.+.+.}, at: [<ffffffff81309306>] ttm_eu_reserve_buffers+0x126/0x4b0
>>> Jul 10 11:40:52 x4 kernel: SysRq : Emergency Sync
>>> Jul 10 11:40:53 x4 kernel: Emergency Sync complete
>>>
>> Thanks, exactly what I thought. I missed a backoff somewhere..
>>
>> Does the below patch fix it?
> Yes. Thank you for your quick reply.

8<------
If radeon_cs_parser_relocs fails ttm_eu_backoff_reservation doesn't get called.
This left open a bug where ttm_eu_reserve_buffers succeeded but the bo's were
not unlocked afterwards:

Jul 10 11:40:44 x4 kernel: ================================================
Jul 10 11:40:44 x4 kernel: [ BUG: lock held when returning to user space! ]
Jul 10 11:40:44 x4 kernel: 3.10.0-08587-g496322b #35 Not tainted
Jul 10 11:40:44 x4 kernel: ------------------------------------------------
Jul 10 11:40:44 x4 kernel: X/211 is leaving the kernel with locks still held!
Jul 10 11:40:44 x4 kernel: 2 locks held by X/211:
Jul 10 11:40:44 x4 kernel: #0:  (reservation_ww_class_acquire){+.+.+.}, at: [<ffffffff813279f0>] radeon_bo_list_validate+0x20/0xd0
Jul 10 11:40:44 x4 kernel: #1:  (reservation_ww_class_mutex){+.+.+.}, at: [<ffffffff81309306>] ttm_eu_reserve_buffers+0x126/0x4b0
Jul 10 11:40:52 x4 kernel: SysRq : Emergency Sync
Jul 10 11:40:53 x4 kernel: Emergency Sync complete

This is a regression caused by commit ecff665.
"drm/ttm: make ttm reservation calls behave like reservation calls"

Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Tested-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
1b6e5fd
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi September 27, 2013
x86: Improve the printout of the SMP bootup CPU table
As the new x86 CPU bootup printout format code maintainer, I am
taking immediate action to improve and clean (and thus indulge
my OCD) the reporting of the cores when coming up online.

Fix padding to a right-hand alignment, cleanup code and bind
reporting width to the max number of supported CPUs on the
system, like this:

 [    0.074509] smpboot: Booting Node   0, Processors:      #1  #2  #3  #4  #5  #6  #7 OK
 [    0.644008] smpboot: Booting Node   1, Processors:  #8  #9 #10 #11 #12 #13 #14 #15 OK
 [    1.245006] smpboot: Booting Node   2, Processors: #16 #17 #18 #19 #20 #21 #22 #23 OK
 [    1.864005] smpboot: Booting Node   3, Processors: #24 #25 #26 #27 #28 #29 #30 #31 OK
 [    2.489005] smpboot: Booting Node   4, Processors: #32 #33 #34 #35 #36 #37 #38 #39 OK
 [    3.093005] smpboot: Booting Node   5, Processors: #40 #41 #42 #43 #44 #45 #46 #47 OK
 [    3.698005] smpboot: Booting Node   6, Processors: #48 #49 #50 #51 #52 #53 #54 #55 OK
 [    4.304005] smpboot: Booting Node   7, Processors: #56 #57 #58 #59 #60 #61 #62 #63 OK
 [    4.961413] Brought up 64 CPUs

and this:

 [    0.072367] smpboot: Booting Node   0, Processors:    #1 #2 #3 #4 #5 #6 #7 OK
 [    0.686329] Brought up 8 CPUs

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Libin <huawei.libin@huawei.com>
Cc: wangyijing@huawei.com
Cc: fenghua.yu@intel.com
Cc: guohanjun@huawei.com
Cc: paul.gortmaker@windriver.com
Link: http://lkml.kernel.org/r/20130927143554.GF4422@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
646e29a
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi September 30, 2013
x86/boot: Further compress CPUs bootup message
Turn it into (for example):

[    0.073380] x86: Booting SMP configuration:
[    0.074005] .... node   #0, CPUs:          #1   #2   #3   #4   #5   #6   #7
[    0.603005] .... node   #1, CPUs:     #8   #9  #10  #11  #12  #13  #14  #15
[    1.200005] .... node   #2, CPUs:    #16  #17  #18  #19  #20  #21  #22  #23
[    1.796005] .... node   #3, CPUs:    #24  #25  #26  #27  #28  #29  #30  #31
[    2.393005] .... node   #4, CPUs:    #32  #33  #34  #35  #36  #37  #38  #39
[    2.996005] .... node   #5, CPUs:    #40  #41  #42  #43  #44  #45  #46  #47
[    3.600005] .... node   #6, CPUs:    #48  #49  #50  #51  #52  #53  #54  #55
[    4.202005] .... node   #7, CPUs:    #56  #57  #58  #59  #60  #61  #62  #63
[    4.811005] .... node   #8, CPUs:    #64  #65  #66  #67  #68  #69  #70  #71
[    5.421006] .... node   #9, CPUs:    #72  #73  #74  #75  #76  #77  #78  #79
[    6.032005] .... node  #10, CPUs:    #80  #81  #82  #83  #84  #85  #86  #87
[    6.648006] .... node  #11, CPUs:    #88  #89  #90  #91  #92  #93  #94  #95
[    7.262005] .... node  #12, CPUs:    #96  #97  #98  #99 #100 #101 #102 #103
[    7.865005] .... node  #13, CPUs:   #104 #105 #106 #107 #108 #109 #110 #111
[    8.466005] .... node  #14, CPUs:   #112 #113 #114 #115 #116 #117 #118 #119
[    9.073006] .... node  #15, CPUs:   #120 #121 #122 #123 #124 #125 #126 #127
[    9.679901] x86: Booted up 16 nodes, 128 CPUs

and drop useless elements.

Change num_digits() to hpa's division-avoiding, cell-phone-typed
version which he went at great lengths and pains to submit on a
Saturday evening.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: huawei.libin@huawei.com
Cc: wangyijing@huawei.com
Cc: fenghua.yu@intel.com
Cc: guohanjun@huawei.com
Cc: paul.gortmaker@windriver.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20130930095624.GB16383@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
a17bce4
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.