Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PW_SID:657598] [v3,1/3] Bluetooth: Add support for devcoredump #992

Closed
wants to merge 22 commits into from

Conversation

BluezTestBot
Copy link
Owner

From: Abhishek Pandit-Subedi abhishekpandit@chromium.org

Add devcoredump APIs to hci core so that drivers only have to provide
the dump skbs instead of managing the synchronization and timeouts.

The devcoredump APIs should be used in the following manner:

  • hci_devcoredump_init is called to allocate the dump.
  • hci_devcoredump_append is called to append any skbs with dump data
    OR hci_devcoredump_append_pattern is called to insert a pattern.
  • hci_devcoredump_complete is called when all dump packets have been
    sent OR hci_devcoredump_abort is called to indicate an error and
    cancel an ongoing dump collection.

The high level APIs just prepare some skbs with the appropriate data and
queue it for the dump to process. Packets part of the crashdump can be
intercepted in the driver in interrupt context and forwarded directly to
the devcoredump APIs.

Internally, there are 5 states for the dump: idle, active, complete,
abort and timeout. A devcoredump will only be in active state after it
has been initialized. Once active, it accepts data to be appended,
patterns to be inserted (i.e. memset) and a completion event or an abort
event to generate a devcoredump. The timeout is initialized at the same
time the dump is initialized (defaulting to 10s) and will be cleared
either when the timeout occurs or the dump is complete or aborted.

Signed-off-by: Abhishek Pandit-Subedi abhishekpandit@chromium.org
Signed-off-by: Manish Mandlik mmandlik@google.com
Reviewed-by: Abhishek Pandit-Subedi abhishekpandit@chromium.org

Changes in v3:

  • Add attribute to enable/disable and set default state to disabled

Changes in v2:

  • Move hci devcoredump implementation to new files
  • Move dump queue and dump work to hci_devcoredump struct
  • Add CONFIG_DEV_COREDUMP conditional compile

include/net/bluetooth/coredump.h | 110 +++++++
include/net/bluetooth/hci_core.h | 5 +
net/bluetooth/Makefile | 2 +
net/bluetooth/coredump.c | 504 +++++++++++++++++++++++++++++++
net/bluetooth/hci_core.c | 9 +
net/bluetooth/hci_sync.c | 2 +
6 files changed, 632 insertions(+)
create mode 100644 include/net/bluetooth/coredump.h
create mode 100644 net/bluetooth/coredump.c

pyma1 and others added 22 commits June 2, 2022 17:25
It is 13d3:3568 for MediaTek MT7922 USB Bluetooth chip.

T:  Bus=03 Lev=01 Prnt=01 Port=02 Cnt=01 Dev#=  2 Spd=480 MxCh= 0
D:  Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=13d3 ProdID=3568 Rev=01.00
S:  Manufacturer=MediaTek Inc.
S:  Product=Wireless_Device
S:  SerialNumber=...
C:  #Ifs= 3 Cfg#= 1 Atr=e0 MxPwr=100mA
I:  If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E:  Ad=81(I) Atr=03(Int.) MxPS=  16 Ivl=125us
E:  Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:  If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=   0 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=   0 Ivl=1ms
I:  If#= 2 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=(none)
E:  Ad=0a(O) Atr=03(Int.) MxPS=  64 Ivl=125us
E:  Ad=8a(I) Atr=03(Int.) MxPS=  64 Ivl=125us

Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This fixes the return value of qca_wakeup(), since
.wakeup work inversely with original .prevent_wake.

Fixes: 4539ca6 (Bluetooth: Rename driver .prevent_wake to .wakeup)
Signed-off-by: Sai Teja Aluvala <quic_saluvala@quicinc.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Add property, "brcm,requires-autobaud-mode", to enable autobaud mode
selection.

Some devices (e.g. CYW5557x) require autobaud mode to enable FW loading.
Autobaud mode can also be required on some boards where the controller
device is using a non-standard baud rate when first powered on.

Signed-off-by: Hakan Jansson <hakan.jansson@infineon.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Use the presence of a DT property, "brcm,requires-autobaud-mode", to enable
startup in autobaud mode. If the property is present, the device is started
in autobaud mode by asserting RTS (BT_UART_CTS_N) prior to powering on the
device.

Also prevent the use of unsupported commands for devices started in
autobaud mode. Only a limited subset of HCI commands are supported in
autobaud mode.

Some devices (e.g. CYW5557x) require autobaud mode to enable FW loading.
Autobaud mode can also be required on some boards where the controller
device is using a non-standard baud rate in normal mode when first powered
on.

Signed-off-by: Hakan Jansson <hakan.jansson@infineon.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
If a hardware error occurs and the connections are flushed without a
disconnection_complete event being signaled, the temporary linkkeys are
not flushed.

This change ensures that any outstanding flushable linkkeys are flushed
when the connection are flushed from the hash table.

Additionally, this also makes use of test_and_clear_bit to avoid
multiple attempts to delete the link key that's already been flushed.

Signed-off-by: Alain Michaud <alainm@chromium.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
As platform_driver_register() could fail, it should be better
to deal with the return value in order to maintain the code
consisitency.

Fixes: 1ab1f23 ("Bluetooth: hci_intel: Add support for platform driver")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The HCI command, event, and data packet processing workqueue is drained
to avoid deadlock in commit
76727c0 ("Bluetooth: Call drain_workqueue() before resetting state").

There is another delayed work, which will queue command to this drained
workqueue. Which results in the following error report:

Bluetooth: hci2: command 0x040f tx timeout
WARNING: CPU: 1 PID: 18374 at kernel/workqueue.c:1438 __queue_work+0xdad/0x1140
Workqueue: events hci_cmd_timeout
RIP: 0010:__queue_work+0xdad/0x1140
RSP: 0000:ffffc90002cffc60 EFLAGS: 00010093
RAX: 0000000000000000 RBX: ffff8880b9d3ec00 RCX: 0000000000000000
RDX: ffff888024ba0000 RSI: ffffffff814e048d RDI: ffff8880b9d3ec08
RBP: 0000000000000008 R08: 0000000000000000 R09: 00000000b9d39700
R10: ffffffff814f73c6 R11: 0000000000000000 R12: ffff88807cce4c60
R13: 0000000000000000 R14: ffff8880796d8800 R15: ffff8880796d8800
FS:  0000000000000000(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000c0174b4000 CR3: 000000007cae9000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 ? queue_work_on+0xcb/0x110
 ? lockdep_hardirqs_off+0x90/0xd0
 queue_work_on+0xee/0x110
 process_one_work+0x996/0x1610
 ? pwq_dec_nr_in_flight+0x2a0/0x2a0
 ? rwlock_bug.part.0+0x90/0x90
 ? _raw_spin_lock_irq+0x41/0x50
 worker_thread+0x665/0x1080
 ? process_one_work+0x1610/0x1610
 kthread+0x2e9/0x3a0
 ? kthread_complete_and_exit+0x40/0x40
 ret_from_fork+0x1f/0x30
 </TASK>

To fix this, we can add a new HCI_DRAIN_WQ flag, and don't queue the
timeout workqueue while command workqueue is draining.

Fixes: 76727c0 ("Bluetooth: Call drain_workqueue() before resetting state")
Reported-by: syzbot+63bed493aebbf6872647@syzkaller.appspotmail.com
Signed-off-by: Schspa Shi <schspa@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
When a userchannel socket is released, we should check whether the hdev
is already unregistered before sending out an IndexAdded.

Signed-off-by: Abhishek Pandit-Subedi <abhishekpandit@chromium.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
When HCI_USERCHANNEL is used, unregister the suspend notifier when
binding and register when releasing. The userchannel socket should be
left alone after open is completed.

Signed-off-by: Abhishek Pandit-Subedi <abhishekpandit@chromium.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The BCM4349B1, aka CYW/BCM89359, is a WiFi+BT chip and its Bluetooth
portion can be controlled over serial.
Extend the binding with its DT compatible.

Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The BCM4349B1, aka CYW/BCM89359, is a WiFi+BT chip and its Bluetooth
portion can be controlled over serial.

Two subversions are added for the chip, because ROM firmware reports
002.002.013 (at least for the chips I have here), while depending on
patchram firmware revision, either 002.002.013 or 002.002.014 is
reported.

Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Preserve the error code from hci_register_suspend_notifier().  Don't
return success.

Fixes: d6bb2a9 ("Bluetooth: Unregister suspend with userchannel")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Similar to the handling of l2cap_ecred_connect in commit d3715b2
("Bluetooth: use memset avoid memory leaks"), we thought a patch
might be needed here as well.

Use memset to initialize structs to prevent memory leaks
in l2cap_le_connect

Signed-off-by: Xiaohui Zhang <xiaohuizhang@ruc.edu.cn>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Both dev_name and short_name are not guaranteed to be NULL terminated so
this instead use strnlen and then attempt to determine if the resulting
string needs to be truncated or not.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=216018
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
The scan response and advertising data needs to be tracked on a per
instance (adv_info) since when these instaces are removed so are their
data, to fix that new flags are introduced which is used to mark when
the data changes and then checked to confirm when the data needs to be
synced with the controller.

Tested-by: Tedd Ho-Jeong An <tedd.an@intel.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Set the connection data before calling get_conn_info_sync, so it can be
verified the connection is still connected, before refreshing cached
values.

Fixes: 47db6b4 ("Bluetooth: hci_sync: Convert MGMT_OP_GET_CONN_INFO")
Signed-off-by: Zhengping Jiang <jiangzp@google.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Commit ce64b3e ("Bluetooth: mt7921s: Support wake on bluetooth")
adds the wake on bluethooth via a dedicated GPIO.

Extend the wake-on-bluetooth to use the SDIO DAT1 pin (in-band wakeup),
when supported by the SDIO host driver.

Co-developed-by: Yake Yang <yake.yang@mediatek.com>
Signed-off-by: Yake Yang <yake.yang@mediatek.com>
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
`cancel_work_sync(&hdev->power_on)` was moved to hci_dev_close_sync in
commit [1] to ensure that power_on work is canceled after HCI interface
down.

But, in certain cases power_on work function may call hci_dev_close_sync
itself: hci_power_on -> hci_dev_do_close -> hci_dev_close_sync ->
cancel_work_sync(&hdev->power_on), causing deadlock. In particular, this
happens when device is rfkilled on boot. To avoid deadlock, move
power_on work canceling out of hci_dev_do_close/hci_dev_close_sync.

Deadlock introduced by commit [1] was reported in [2,3] as broken
suspend. Suspend did not work because `hdev->req_lock` held as result of
`power_on` work deadlock. In fact, other BT features were not working.
It was not observed when testing [1] since it was verified without
rfkill in place.

NOTE: It is not needed to cancel power_on work from other places where
hci_dev_do_close/hci_dev_close_sync is called in case:
* Requests were serialized due to `hdev->req_workqueue`. The power_on
work is first in that workqueue.
* hci_rfkill_set_block which won't close device anyway until HCI_SETUP
is on.
* hci_sock_release which runs after hci_sock_bind which ensures
HCI_SETUP was cleared.

As result, behaviour is the same as in pre-dd06ed7 commit, except
power_on work cancel added to hci_dev_close.

[1]: commit ff7f292 ("Bluetooth: core: Fix missing power_on work cancel on HCI close")
[2]: https://lore.kernel.org/lkml/20220614181706.26513-1-max.oss.09@gmail.com/
[2]: https://lore.kernel.org/lkml/1236061d-95dd-c3ad-a38f-2dae7aae51ef@o2.pl/

Fixes: ff7f292 ("Bluetooth: core: Fix missing power_on work cancel on HCI close")
Signed-off-by: Vasyl Vavrychuk <vasyl.vavrychuk@opensynergy.com>
Reported-by: Max Krummenacher <max.krummenacher@toradex.com>
Reported-by: Mateusz Jonczyk <mat.jonczyk@o2.pl>
Tested-by: Max Krummenacher <max.krummenacher@toradex.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
This patch adds workflow files for ci:

[schedule_work.yml]
 - The workflow file for scheduled work
 - Sync the repo with upstream repo and rebase the workflow branch
 - Review the patches in the patchwork and creates the PR if needed

[ci.yml]
 - The workflow file for CI tasks
 - Run CI tests when PR is created

Signed-off-by: Tedd Ho-Jeong An <tedd.an@intel.com>
Add devcoredump APIs to hci core so that drivers only have to provide
the dump skbs instead of managing the synchronization and timeouts.

The devcoredump APIs should be used in the following manner:
 - hci_devcoredump_init is called to allocate the dump.
 - hci_devcoredump_append is called to append any skbs with dump data
   OR hci_devcoredump_append_pattern is called to insert a pattern.
 - hci_devcoredump_complete is called when all dump packets have been
   sent OR hci_devcoredump_abort is called to indicate an error and
   cancel an ongoing dump collection.

The high level APIs just prepare some skbs with the appropriate data and
queue it for the dump to process. Packets part of the crashdump can be
intercepted in the driver in interrupt context and forwarded directly to
the devcoredump APIs.

Internally, there are 5 states for the dump: idle, active, complete,
abort and timeout. A devcoredump will only be in active state after it
has been initialized. Once active, it accepts data to be appended,
patterns to be inserted (i.e. memset) and a completion event or an abort
event to generate a devcoredump. The timeout is initialized at the same
time the dump is initialized (defaulting to 10s) and will be cleared
either when the timeout occurs or the dump is complete or aborted.

Signed-off-by: Abhishek Pandit-Subedi <abhishekpandit@chromium.org>
Signed-off-by: Manish Mandlik <mmandlik@google.com>
Reviewed-by: Abhishek Pandit-Subedi <abhishekpandit@chromium.org>
Since device/firmware dump is a debugging feature, we may not need it
all the time. Add a sysfs attribute to enable/disable bluetooth
devcoredump capturing. The default state of this feature would be
disabled and it can be enabled by echoing 1 to enable_coredump sysfs
entry as follow:

$ echo 1 > /sys/class/bluetooth/<device>/enable_coredump

Signed-off-by: Manish Mandlik <mmandlik@google.com>
Intercept debug exception events from the controller and put them into
a devcoredump using hci devcoredump APIs. The debug exception contains
data in a TLV format and it will be parsed in userspace.

Signed-off-by: Abhishek Pandit-Subedi <abhishekpandit@chromium.org>
Signed-off-by: Manish Mandlik <mmandlik@google.com>
Reviewed-by: Abhishek Pandit-Subedi <abhishekpandit@chromium.org>
Reviewed-by: Chethan Tumkur Narayan <chethan.tumkur.narayan@intel.com>
@github-actions
Copy link

github-actions bot commented Jul 7, 2022

CheckPatch
Test ID: checkpatch
Desc: Run checkpatch.pl script with rule in .checkpatch.conf
Duration: 4.51 seconds
Result: FAIL
Output:

[v3,1/3] Bluetooth: Add support for devcoredump\Traceback (most recent call last):
  File "scripts/spdxcheck.py", line 6, in <module>
    from ply import lex, yacc
ModuleNotFoundError: No module named 'ply'
WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#133: 
new file mode 100644

WARNING:SPDX_LICENSE_TAG: Improper SPDX comment style for 'include/net/bluetooth/coredump.h', please use '/*' instead
#138: FILE: include/net/bluetooth/coredump.h:1:
+// SPDX-License-Identifier: GPL-2.0-only

WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1
#138: FILE: include/net/bluetooth/coredump.h:1:
+// SPDX-License-Identifier: GPL-2.0-only

WARNING:SPLIT_STRING: quoted string split across lines
#589: FILE: net/bluetooth/coredump.c:300:
+				    "Devcoredump complete with size %u "
+				    "(expect %u)",

WARNING:SPLIT_STRING: quoted string split across lines
#608: FILE: net/bluetooth/coredump.c:319:
+				    "Devcoredump aborted with size %u "
+				    "(expect %u)",

WARNING:OOM_MESSAGE: Possible unnecessary 'out of memory' message
#691: FILE: net/bluetooth/coredump.c:402:
+	if (!skb) {
+		bt_dev_err(hdev, "Failed to allocate devcoredump init");

WARNING:OOM_MESSAGE: Possible unnecessary 'out of memory' message
#734: FILE: net/bluetooth/coredump.c:445:
+	if (!skb) {
+		bt_dev_err(hdev, "Failed to allocate devcoredump pattern");

WARNING:OOM_MESSAGE: Possible unnecessary 'out of memory' message
#760: FILE: net/bluetooth/coredump.c:471:
+	if (!skb) {
+		bt_dev_err(hdev, "Failed to allocate devcoredump complete");

WARNING:OOM_MESSAGE: Possible unnecessary 'out of memory' message
#782: FILE: net/bluetooth/coredump.c:493:
+	if (!skb) {
+		bt_dev_err(hdev, "Failed to allocate devcoredump abort");

total: 0 errors, 9 warnings, 0 checks, 670 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

/github/workspace/src/12910303.patch has style problems, please review.

NOTE: Ignored message types: UNKNOWN_COMMIT_ID

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

[v3,2/3] Bluetooth: Add sysfs entry to enable/disable devcoredump\WARNING:CONSIDER_KSTRTO: simple_strtol is obsolete, use kstrtol instead
#131: FILE: net/bluetooth/hci_sysfs.c:111:
+	if (simple_strtol(buf, NULL, 10))

total: 0 errors, 1 warnings, 0 checks, 45 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

/github/workspace/src/12910304.patch has style problems, please review.

NOTE: Ignored message types: UNKNOWN_COMMIT_ID

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.

@github-actions
Copy link

github-actions bot commented Jul 7, 2022

GitLint
Test ID: gitlint
Desc: Run gitlint with rule in .gitlint
Duration: 3.02 seconds
Result: PASS

@github-actions
Copy link

github-actions bot commented Jul 7, 2022

SubjectPrefix
Test ID: subjectprefix
Desc: Check subject contains "Bluetooth" prefix
Duration: 2.72 seconds
Result: PASS

@github-actions
Copy link

github-actions bot commented Jul 7, 2022

BuildKernel
Test ID: buildkernel
Desc: Build Kernel with minimal configuration supports Bluetooth
Duration: 32.04 seconds
Result: PASS

@github-actions
Copy link

github-actions bot commented Jul 7, 2022

BuildKernel32
Test ID: buildkernel32
Desc: Build 32bit Kernel with minimal configuration supports Bluetooth
Duration: 27.65 seconds
Result: PASS

@github-actions
Copy link

github-actions bot commented Jul 7, 2022

Incremental Build with patches
Test ID: incremental_build
Desc: Incremental build per patch in the series
Duration: 52.40 seconds
Result: PASS

@github-actions
Copy link

github-actions bot commented Jul 7, 2022

TestRunner: Setup
Test ID: testrunnersetup
Desc: Setup environment for running Test Runner
Duration: 474.56 seconds
Result: PASS

@github-actions
Copy link

github-actions bot commented Jul 7, 2022

TestRunner: l2cap-tester
Test ID: testrunnerl2cap-tester
Desc: Run test-runner with l2cap-tester
Duration: 16.85 seconds
Result: PASS
Output:

Total: 40, Passed: 40 (100.0%), Failed: 0, Not Run: 0

@github-actions github-actions bot force-pushed the workflow branch 28 times, most recently from 6daff20 to ceebe87 Compare August 6, 2022 23:38
@BluezTestBot BluezTestBot deleted the 657598 branch August 6, 2022 23:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet