Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

softflowd: do_page_fault(): sending SIGSEGV to softflowd for invalid read access from ..... #19655

Closed
HolgerHees opened this issue Oct 21, 2022 · 5 comments

Comments

@HolgerHees
Copy link

HolgerHees commented Oct 21, 2022

Maintainer: @jesferman , @rvandegrift
Environment: OpenWrt 22.03.0, TP-Link Archer C7 v2, Qualcomm Atheros QCA9558 ver 1 rev 0

@rvandegrift this is not related to my ticket from yesterday. It is a different device running at a friend's house.

Description:

I'm running softflowd on a router where I get no flows when I call "softflowctl dump-flows"

When I call "softflowctl statistics" I see

root@OpenWrt:~# softflowctl statistics
softflowd[6688]: Accumulated statistics since 2022-10-21T07:12:06 UTC:
Number of active flows: 112
Packets processed: 20685
Fragments: 42
Ignored packets: 2 (2 non-IP, 0 too short)
Flows expired: 46 (0 forced)
Flows exported: 46 (79 records) in 4 packets (0 failures)
Packets received by libpcap: 21226
Packets dropped by libpcap: 0
Packets dropped by interface: 0

Expired flow statistics:  minimum       average       maximum
  Flow bytes:                  60        120147       1738147
  Flow packets:                 1           344          6350
  Duration:                  0.00s        47.31s        63.06s

Expired flow reasons:
       tcp =        14   tcp.rst =         1   tcp.fin =         0
       udp =         6      icmp =         0   general =         0
   maxlife =        25
over 2 GiB =         0
  maxflows =         0
   flushed =         0

Per-protocol statistics:     Octets      Packets   Avg Life    Max Life
           icmp (1):          10920          130      60.01s      60.02s
            tcp (6):         715552         2075      43.78s      63.06s
           udp (17):        4800272        13600      43.77s      63.01s

but when I call "softflowctl dump-flows" I get

root@OpenWrt:~# softflowctl dump-flows
softflowd[6688]: Dumping flow data:

an a kernel log message like

Fri Oct 21 07:13:47 2022 kern.info kernel: [745743.417947] do_page_fault(): sending SIGSEGV to softflowd for invalid read access from 00000326
Fri Oct 21 07:13:47 2022 kern.info kernel: [745743.427175] epc = 77de0dc0 in libc.so[77d62000+a9000]
Fri Oct 21 07:13:47 2022 kern.info kernel: [745743.432682] ra  = 77de2b04 in libc.so[77d62000+a9000]
Fri Oct 21 07:13:47 2022 kern.info kernel: [745743.443552] device br-lan.1 left promiscuous mode
Fri Oct 21 07:13:47 2022 kern.info kernel: [745743.448485] device br-lan left promiscuous mode
Fri Oct 21 07:13:52 2022 kern.info kernel: [745748.464316] device br-lan.1 entered promiscuous mode
Fri Oct 21 07:13:52 2022 kern.info kernel: [745748.469774] device br-lan entered promiscuous mode
@HolgerHees
Copy link
Author

and this are the last logs when I run softflowd with strace

read(7, "dump-flows\n", 1024)           = 11
getpid()                                = 6867
writev(7, [{iov_base="softflowd[6867", iov_len=14}, {iov_base="]: Dumping flow data:\n", iov_len=22}], 2) = 36
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x97} ---
+++ killed by SIGSEGV +++
Segmentation fault

@micmac1
Copy link
Contributor

micmac1 commented Oct 21, 2022

I can reproduce this on same target platform (mips24kc). Build log shows some time_t-related warnings.

softflowd.c: In function 'format_flow':
softflowd.c:307:13: warning: format '%ld' expects argument of type 'long int', but argument 15 has type 'suseconds_t' {aka 'long long int'} [-Wformat=]
  307 |             "seq:%" PRIu64 " [%s]:%hu <> [%s]:%hu proto:%u "
      |             ^~~~~~~
......
  315 |             (flow->flow_start.tv_usec + 500) / 1000, fin_time,
      |             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                                              |
      |                                              suseconds_t {aka long long int}
softflowd.c:309:27: note: format string is defined here
  309 |             "start:%s.%03ld finish:%s.%03ld tcp>:%02x tcp<:%02x "
      |                       ~~~~^
      |                           |
      |                           long int
      |                       %03lld
softflowd.c:307:13: warning: format '%ld' expects argument of type 'long int', but argument 17 has type 'suseconds_t' {aka 'long long int'} [-Wformat=]
  307 |             "seq:%" PRIu64 " [%s]:%hu <> [%s]:%hu proto:%u "
      |             ^~~~~~~
......
  316 |             (flow->flow_last.tv_usec + 500) / 1000, flow->tcp_flags[0],
      |             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                                             |
      |                                             suseconds_t {aka long long int}
softflowd.c:309:43: note: format string is defined here
  309 |             "start:%s.%03ld finish:%s.%03ld tcp>:%02x tcp<:%02x "
      |                                       ~~~~^
      |                                           |
      |                                           long int
      |                                       %03lld
softflowd.c: In function 'dump_flows':
softflowd.c:1173:16: warning: format '%ld' expects argument of type 'long int', but argument 4 has type 'time_t' {aka 'long long int'} [-Wformat=]
 1173 |                "EXPIRY EVENT for flow %" PRIu64 " in %ld seconds\n",
      |                ^~~~~~~~~~~~~~~~~~~~~~~~~
 1174 |                expiry->flow->flow_seq, (long int) expiry->expires_at - now);
      |                                        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                                                                      |
      |                                                                      time_t {aka long long int}
softflowd.c:1173:56: note: format string is defined here
 1173 |                "EXPIRY EVENT for flow %" PRIu64 " in %ld seconds\n",
      |                                                      ~~^
      |                                                        |
      |                                                        long int
      |                                                      %lld

So source code is assuming time_t is 32 bit because target is 32 bit platform, but because of musl in 22.03 always using 64 bit for time_t this isn't correct.

I added a patch for this and recompiled. Then no more segfault.
030-time_t.txt

Can you try and confirm from your end?

Regards,
Seb

@Ansuel
Copy link
Member

Ansuel commented Oct 21, 2022

btw i found a very similar bug for the ripe atlas probe...

the fix is to just cast everything related to time_t to %lld and cast them to (long long) to handle both 32 and 64 bit

@HolgerHees
Copy link
Author

@micmac1 the patch works for me. Thanks a lot. Hope it will be merged soon.

micmac1 added a commit to micmac1/softflowd that referenced this issue Oct 21, 2022
Compiling softflowd against musl 1.2.x for 32 bit targets shows some
warnings:

softflowd.c: In function 'format_flow':
softflowd.c:307:13: warning: format '%ld' expects argument of type 'long int', but argument 15 has type 'suseconds_t' {aka 'long long int'} [-Wformat=]
  307 |             "seq:%" PRIu64 " [%s]:%hu <> [%s]:%hu proto:%u "
      |             ^~~~~~~
......
  315 |             (flow->flow_start.tv_usec + 500) / 1000, fin_time,
      |             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                                              |
      |                                              suseconds_t {aka long long int}
softflowd.c:309:27: note: format string is defined here
  309 |             "start:%s.%03ld finish:%s.%03ld tcp>:%02x tcp<:%02x "
      |                       ~~~~^
      |                           |
      |                           long int
      |                       %03lld
softflowd.c:307:13: warning: format '%ld' expects argument of type 'long int', but argument 17 has type 'suseconds_t' {aka 'long long int'} [-Wformat=]
  307 |             "seq:%" PRIu64 " [%s]:%hu <> [%s]:%hu proto:%u "
      |             ^~~~~~~
......
  316 |             (flow->flow_last.tv_usec + 500) / 1000, flow->tcp_flags[0],
      |             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                                             |
      |                                             suseconds_t {aka long long int}
softflowd.c:309:43: note: format string is defined here
  309 |             "start:%s.%03ld finish:%s.%03ld tcp>:%02x tcp<:%02x "
      |                                       ~~~~^
      |                                           |
      |                                           long int
      |                                       %03lld
softflowd.c: In function 'dump_flows':
softflowd.c:1173:16: warning: format '%ld' expects argument of type 'long int', but argument 4 has type 'time_t' {aka 'long long int'} [-Wformat=]
 1173 |                "EXPIRY EVENT for flow %" PRIu64 " in %ld seconds\n",
      |                ^~~~~~~~~~~~~~~~~~~~~~~~~
 1174 |                expiry->flow->flow_seq, (long int) expiry->expires_at - now);
      |                                        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                                                                      |
      |                                                                      time_t {aka long long int}
softflowd.c:1173:56: note: format string is defined here
 1173 |                "EXPIRY EVENT for flow %" PRIu64 " in %ld seconds\n",
      |                                                      ~~^
      |                                                        |
      |                                                        long int
      |                                                      %lld

Dumping flow data results in a segmentation fault:

root@OpenWrt:~# softflowctl dump-flows
softflowd[6688]: Dumping flow data:

No further output. But in the system log a segmentation fault is
visible:

Fri Oct 21 07:13:47 2022 kern.info kernel: [745743.417947] do_page_fault(): sending SIGSEGV to softflowd for invalid read access from 00000326
Fri Oct 21 07:13:47 2022 kern.info kernel: [745743.427175] epc = 77de0dc0 in libc.so[77d62000+a9000]
Fri Oct 21 07:13:47 2022 kern.info kernel: [745743.432682] ra  = 77de2b04 in libc.so[77d62000+a9000]
Fri Oct 21 07:13:47 2022 kern.info kernel: [745743.443552] device br-lan.1 left promiscuous mode
Fri Oct 21 07:13:47 2022 kern.info kernel: [745743.448485] device br-lan left promiscuous mode
Fri Oct 21 07:13:52 2022 kern.info kernel: [745748.464316] device br-lan.1 entered promiscuous mode
Fri Oct 21 07:13:52 2022 kern.info kernel: [745748.469774] device br-lan entered promiscuous mode

This was reported on the OpenWrt "packages" repository's issue tracker
([1]).

musl 1.2 has time64 support, meaning it always uses 64 bit time_t (also
when compiling for 32 bit targets), to be Y2K38 safe. This commit
changes the format specifier from "%ld" to "%lld" to fix the compiler
warnings and make the segmentation fault go away.

[1] openwrt/packages#19655

Signed-off-by: Sebastian Kemper <sebastian_ml@gmx.net>
micmac1 added a commit to micmac1/packages that referenced this issue Oct 22, 2022
Add upstreamed patch to fix segmentation fault reported in [1].

[1] openwrt#19655

Signed-off-by: Sebastian Kemper <sebastian_ml@gmx.net>
micmac1 added a commit to micmac1/packages that referenced this issue Oct 22, 2022
Add upstreamed patch to fix segmentation fault reported in [1].

[1] openwrt#19655

Signed-off-by: Sebastian Kemper <sebastian_ml@gmx.net>
(cherry picked from commit 9a1d8ff)
@HolgerHees
Copy link
Author

as this bug will be fixed in another ticket, I will close this one.

PolynomialDivision pushed a commit that referenced this issue Oct 23, 2022
Add upstreamed patch to fix segmentation fault reported in [1].

[1] #19655

Signed-off-by: Sebastian Kemper <sebastian_ml@gmx.net>
micmac1 added a commit to micmac1/packages that referenced this issue Oct 23, 2022
Add upstreamed patch to fix segmentation fault reported in [1].

[1] openwrt#19655

Signed-off-by: Sebastian Kemper <sebastian_ml@gmx.net>
(cherry picked from commit 7d01963)
neheb pushed a commit that referenced this issue Oct 24, 2022
Add upstreamed patch to fix segmentation fault reported in [1].

[1] #19655

Signed-off-by: Sebastian Kemper <sebastian_ml@gmx.net>
(cherry picked from commit 7d01963)
1582130940 pushed a commit to 1582130940/OpenWrt-Lean-Packages that referenced this issue Nov 10, 2022
Add upstreamed patch to fix segmentation fault reported in [1].

[1] openwrt/packages#19655

Signed-off-by: Sebastian Kemper <sebastian_ml@gmx.net>
1582130940 pushed a commit to 1582130940/OpenWrt-Lean-Packages that referenced this issue Nov 10, 2022
Add upstreamed patch to fix segmentation fault reported in [1].

[1] openwrt/packages#19655

Signed-off-by: Sebastian Kemper <sebastian_ml@gmx.net>
stokito pushed a commit to stokito/packages that referenced this issue Dec 6, 2022
Add upstreamed patch to fix segmentation fault reported in [1].

[1] openwrt#19655

Signed-off-by: Sebastian Kemper <sebastian_ml@gmx.net>
SibrenVasse pushed a commit to SibrenVasse/packages that referenced this issue Feb 26, 2023
Add upstreamed patch to fix segmentation fault reported in [1].

[1] openwrt#19655

Signed-off-by: Sebastian Kemper <sebastian_ml@gmx.net>
(cherry picked from commit 7d01963)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants