Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[dbus] fix Scan crashing #1814

Merged
merged 2 commits into from
May 17, 2023
Merged

Conversation

jdswensen
Copy link
Contributor

@jdswensen jdswensen commented Apr 5, 2023

I ran into an issue while trying to use the dbus Scan API. For my platform (imx6ull, built with yocto), I was able to repeatably cause otbr to error by running:

dbus-send --reply-timeout=30000 --print-reply --system --type="method_call" --dest=io.openthread.BorderRouter.wpan0 /io/openthread/BorderRouter/wpan0 io.openthread.BorderRouter.Scan

Callstack:

[ 00 ] __libc_do_syscall
[ 01 ] __pthread_kill_implementation ( pthread_kill.c:43 )
[ 02 ] raise ( raise.c:26 )
[ 03 ] abort ( abort.c:79 )
[ 04 ] 0x76ee4f63
[ 05 ] free_mem
[ 06 ] 0x76ed56a5
[ 07 ] 0x76eedbbe
[ 08 ] 0x76eec106
[ 09 ] 0x76ed5927
[ 10 ] 0x76ef27e2
[ 11 ] 0x76eedbbe
[ 12 ] 0x76eecbe2
[ 13 ] 0x76eec106
[ 14 ] 0x76eec106
[ 15 ] 0x76ec0d15
[ 16 ] 0x76eedbbe
[ 17 ] 0x76ec0bdf
[ 18 ] 0x76eedbbe
[ 19 ] otbr::DBus::DBusMessageEncode(DBusMessageIter*, bool) ( dbus_message_helper.cpp:114 )
[ 20 ] otbr::DBus::DBusMessageEncode(DBusMessageIter*, otbr::DBus::ActiveScanResult const&) ( dbus_message_helper_openthread.cpp:96 )
[ 21 ] otbr::DBus::DBusThreadObject::ReplyScanResult(otbr::DBus::DBusRequest&, otError, std::vector<otActiveScanResult, std::allocator<otActiveScanResult> > const&) ( dbus_message_helper.hpp:475 )
[ 22 ] otbr::agent::ThreadHelper::ActiveScanHandler(otActiveScanResult*) ( std_function.h:590 )
[ 23 ] ot::Mac::Mac::PerformActiveScan() ( mac.cpp:298 )
[ 24 ] otSysMainloopProcess ( system.cpp:393 )
[ 25 ] otbr::Ncp::ControllerOpenThread::Process(otSysMainloopContext const&) ( ncp_openthread.cpp:237 )
[ 26 ] otbr::MainloopManager::Process(otSysMainloopContext const&) ( mainloop_manager.cpp:57 )
[ 27 ] otbr::Application::Run() ( application.cpp:166 )
[ 28 ] realmain ( main.cpp:282 )
[ 29 ] main ( main.cpp:321 )
[ 30 ] __libc_start_call_main ( libc_start_call_main.h:58 )
[ 31 ] __libc_start_main_impl ( libc-start.c:389 )
[ 32 ] _start
[ 33 ] 0x76f1189f

There are three intertwined problems with the current dbus Scan API

ActiveScanResult is uninitialized when forming a reply message.
When cross-compiling for a Cortex-A7 platform, the boolean values were not being set causing the uninitialized values to be used during message encoding. This resulted in the boolean values being invalid data and causing dbus error messages when attempting to send the message.

arguments to dbus_message_iter_append_basic() were incorrect, assertion "*bool_p == 0 || *bool_p == 1"

I tested the same thing on my x86_64 host machine at one point, but it appears that the compiler chose to initialize the values to zero, so there were no issues.

The introspect contract and the scan reply message do not match.
The introspect document lists five variables, but the actual return has 12 variables. For example, after fixing the boolean return:

      struct {
         uint64 10547911602365048480
         string ""
         uint64 0
         array [
         ]
         uint16 57578
         uint16 0
         byte 11
         byte 199
         byte 206
         byte 0
         boolean false
         boolean false
      }

However, the definition in the introspect is:

    <!-- Scan: Perform a Thread network scan.
      @scan_result: array of scan results.

      The result struture definition is:
      <literallayout>
        struct {
          uint64 ext_address
          uint16 panid
          uint16 channel
          uint16 rssi
          uint8 lqi
        }
      </literallayout>
    -->
    <method name="Scan">
      <arg name="scan_result" type="a(tqqqy)" direction="out"/>
    </method>

I have assumed that the introspect document is the source of truth and removed the excess data from both the encoding step and the data structure itself. The data is not present (XPAN is 0), so it looks like it isn't being populated by anything anyway.

The introspect contract data types do not match the data being extracted from the data structure.
For example, the rssi is an 8-bit value in the ActiveScanResult data structure, but it is a 16-bit value in the dbus return definition. This causes dbus to error with:

Array or variant type requires that type uint16 be written, but byte was written

Given that the older definitions (channel, rssi) were correct, I chose to fix this by reverting the definitions in the introspect document.

Please let me know if anything needs to be changed!

@jwhui jwhui requested review from bukepo and superwhd April 6, 2023 02:20
src/dbus/common/types.hpp Outdated Show resolved Hide resolved
@jdswensen
Copy link
Contributor Author

Finally got around to addressing the feedback (sorry for the delay). Please let me know if either of you would like additional changes. 😄

@jdswensen jdswensen requested review from bukepo and superwhd May 11, 2023 01:12
Copy link
Contributor

@superwhd superwhd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

superwhd

This comment was marked as duplicate.

superwhd

This comment was marked as duplicate.

@jdswensen
Copy link
Contributor Author

@bukepo, would you have time to give this another look? 😃

@jwhui jwhui changed the title Fix dbus Scan crashing [dbus] fix Scan crashing May 16, 2023
Copy link
Member

@jwhui jwhui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! 👍🏼

@jwhui
Copy link
Member

jwhui commented May 16, 2023

@jdswensen , if you would like to keep the commits separate after merge, can you add the PR number to the commit header? For example:

[dbus] init ActiveScanResult to zero (#1814)

Alternatively, I can squash-merge and GitHub will automatically add the PR number to the commit header.

Let me know what you prefer. Thanks!

Avoid crashes when the compiler does not automatically initialize the
scan results to zero.

This behavior was not expressing itself on x86_64 platforms, but did
cause crashes on an ARM Cortex-A7.
The dbus encoding and decoding methods were attempting to send the
incorrect amount of data over dbus. This breaks the dbus introspect
contract and causes runtime errors.

To fix the issue:
- Re-introduce the entire `ActiveScanResult` data structure to the
  introspect document.
- Coerce the RSSI value into an int16 when encoding and decoding
  an `ActiveScanResult`.
@jdswensen
Copy link
Contributor Author

@jwhui, commits have been updated and PR has been rebased onto the latest.

@jwhui jwhui merged commit d78e768 into openthread:main May 17, 2023
31 checks passed
jwhui pushed a commit that referenced this pull request May 17, 2023
Avoid crashes when the compiler does not automatically initialize the
scan results to zero.

This behavior was not expressing itself on x86_64 platforms, but did
cause crashes on an ARM Cortex-A7.
@jdswensen jdswensen deleted the jds/fix-upstream-scan branch July 24, 2023 11:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants