Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Never finding streams #11

Closed
ghostmansd opened this issue Jul 11, 2014 · 12 comments
Closed

Never finding streams #11

ghostmansd opened this issue Jul 11, 2014 · 12 comments

Comments

@ghostmansd
Copy link

During libntoh testing, it was shown that libntoh never actually finds a stream, but always creates a new one. Here is an example source file.
https://gist.github.com/ghostmansd/5cb5b6f41fe72e554696

@ngo
Copy link

ngo commented Jul 11, 2014

Expected behaviour: create stream once, find it 9 times.
Real behaviour: create 10 streams.

@ngo
Copy link

ngo commented Jul 11, 2014

Hello again. We made some investigations - this appears to be a very interesting and subtle bug.

The problem lies in the tcp_getkey returning different values for the same session and tuple when called from different places in the code. The culprits are sfhash_3words and sfhash functions from sfhash.c. Gcc inlines sfhash.c inside sfhash_3words, and apparently does some tricky instruction reordering. This results in aux variable from sfhash_3words being uninitialized when used in sfhash! gcc even warns about this at build-time (see below).

We verified this by adding attribute ((noinline)) to the definition of sfhash function. This makes the result of tcp_getkey consistent and resolves our bug. The bug is also resolved if we use clang instead of gcc: SET (CMAKE_C_COMPILER "clang" )

We failed to understand whether the code itself is incorrect (i.e. contains undefined behavior) or the gcc makes some wrong assumptions about what can be inlined and reordered.

GCC version is:

gcc (Ubuntu/Linaro 4.8.1-10ubuntu9) 4.8.1

OS is:

Linux devspace 3.11.0-15-generic #25-Ubuntu SMP Thu Jan 30 17:22:01 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

Warnings during build:

/home/ngo/libntoh/src/sfhash.c:4:23: warning: ‘aux’ is used uninitialized in this function [-Wuninitialized]
 #define get16bits(d) (*((const unsigned short *) (d)))
                       ^
/home/ngo/libntoh/src/sfhash.c:63:15: note: ‘aux’ was declared here
  unsigned int aux[3] = {a,b,c};
               ^
/home/ngo/libntoh/src/sfhash.c:4:23: warning: ‘*((void *)&aux+2)’ is used uninitialized in this function [-Wuninitialized]
 #define get16bits(d) (*((const unsigned short *) (d)))
                       ^
/home/ngo/libntoh/src/sfhash.c:63:15: note: ‘aux’ was declared here
  unsigned int aux[3] = {a,b,c};
               ^
/home/ngo/libntoh/src/sfhash.c:4:23: warning: ‘*((void *)&aux+4)’ is used uninitialized in this function [-Wuninitialized]
 #define get16bits(d) (*((const unsigned short *) (d)))
                       ^
/home/ngo/libntoh/src/sfhash.c:63:15: note: ‘aux’ was declared here
  unsigned int aux[3] = {a,b,c};
               ^
/home/ngo/libntoh/src/sfhash.c:4:23: warning: ‘*((void *)&aux+6)’ is used uninitialized in this function [-Wuninitialized]
 #define get16bits(d) (*((const unsigned short *) (d)))
                       ^
/home/ngo/libntoh/src/sfhash.c:63:15: note: ‘aux’ was declared here
  unsigned int aux[3] = {a,b,c};
               ^
/home/ngo/libntoh/src/sfhash.c:4:23: warning: ‘*((void *)&aux+8)’ is used uninitialized in this function [-Wuninitialized]
 #define get16bits(d) (*((const unsigned short *) (d)))
                       ^
/home/ngo/libntoh/src/sfhash.c:63:15: note: ‘aux’ was declared here
  unsigned int aux[3] = {a,b,c};
               ^
/home/ngo/libntoh/src/sfhash.c:4:23: warning: ‘*((void *)&aux+10)’ is used uninitialized in this function [-Wuninitialized]
 #define get16bits(d) (*((const unsigned short *) (d)))
                       ^
/home/ngo/libntoh/src/sfhash.c:63:15: note: ‘aux’ was declared here
  unsigned int aux[3] = {a,b,c};

@ghostmansd
Copy link
Author

Everything works well if either clang or tcc is used; seems to be an UB case.

@ngo
Copy link

ngo commented Jul 11, 2014

Removing _HIDDEN attribute from sfhash* function also resolves the bug

@ngo
Copy link

ngo commented Jul 11, 2014

Final thougths:
we also think that using linux/sfhash.h instead of copy-pasting the implementation is probably a better idea. Or even use jhash (because sfhash.h is no more present in kernel headers)

@ngo
Copy link

ngo commented Jul 11, 2014

I tried removing sfhash.c and replacing sfhash.h with version from kernel (https://gitorious.org/wive-rtnl-ralink-rt305x-routers-firmware/wive-rtnl-ralink-rt305x-routers-firmware/source/81de0c05c5046b0c73eb07e8c43b68579b6cbf48:linux-2.6.21.x/include/linux/sfhash.h)

the bug is still there. I'm more and more inclined to think of a probable gcc bug O_O.

@sch3m4
Copy link
Owner

sch3m4 commented Jul 12, 2014

First of all, thank you ghostmansd and ngo.
I've tested the initial code provided by ghostmansd and this is the output:

10.0.0.233:48515 -> 173.194.71.94:80 (nil)
10.0.0.233:48515 -> 173.194.71.94:80 0x2233d00
10.0.0.233:48515 -> 173.194.71.94:80 0x2233d00
10.0.0.233:48515 -> 173.194.71.94:80 0x2233d00
10.0.0.233:48515 -> 173.194.71.94:80 0x2233d00
10.0.0.233:48515 -> 173.194.71.94:80 0x2233d00
10.0.0.233:48515 -> 173.194.71.94:80 0x2233d00
10.0.0.233:48515 -> 173.194.71.94:80 0x2233d00
10.0.0.233:48515 -> 173.194.71.94:80 0x2233d00
10.0.0.233:48515 -> 173.194.71.94:80 0x2233d00
10.0.0.233:48515 -> 173.194.71.94:80 0x2233d00

All goes well, using gcc version 4.7.2 (Debian 4.7.2-5) ...
I cannot reproduce the bug, have you tried to test the initial source code against another gcc version?
On this point, what are the alternatives? As both of you said:

  • Remove _HIDDEN attribute from sfhash* function
  • Use clang/tcc instead of gcc
  • Try to replace sfhash by jhash
    I'll review the last option, so please, keep an eye on the repository to test the changes, and let me know about any progress on this issue, please.

Thank you both!

@sch3m4
Copy link
Owner

sch3m4 commented Jul 12, 2014

Hi again, I've created a mailing list for libntoh development issues, if you want to join please, send me your email. The libntoh-dev mailing list is: libntoh-dev@safetybits.net

@bachwehbi
Copy link

Hi,
I have the same problem with gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
Removing _HIDDEN attribute from sfhash function fixes it!

@hoxnox
Copy link

hoxnox commented Mar 16, 2015

Hi. The same for me (gentoo, gcc-4.9.2). Removing _HIDDEN helps.

@luongnv89
Copy link

Hi,
Same for me: gcc (Ubuntu 15.04)
Removing _HIDDEN attribute from the functions in sfhash.c fixes it.

luongnv89 added a commit to luongnv89/libntoh that referenced this issue Jul 28, 2015
@sch3m4
Copy link
Owner

sch3m4 commented Nov 25, 2015

So, removed _HIDDEN attribute. Thank you all!

@sch3m4 sch3m4 closed this as completed Nov 25, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants