Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support swapcontext #189

Closed
ramosian-glider opened this issue Aug 31, 2015 · 33 comments
Closed

support swapcontext #189

ramosian-glider opened this issue Aug 31, 2015 · 33 comments

Comments

@ramosian-glider
Copy link
Member

@ramosian-glider ramosian-glider commented Aug 31, 2015

Originally reported on Google Code with ID 189

AddressSanitizer does not fully support swapcontext. 
Sometimes, swapcontext causes the entire shadow region (16T) 
to be written by asan-internal routines (e.g. __asan_handle_no_return)
because the location of the stack changes w/o asan noticing it.
This may cause the machine to die or hang for a long time. 

I am not at all sure if asan can fully support swapcontext, 
but we at least should collect more test cases. 

Reported by konstantin.s.serebryany on 2013-05-22 07:40:59

@ramosian-glider
Copy link
Member Author

@ramosian-glider ramosian-glider commented Aug 31, 2015

http://llvm.org/viewvc/llvm-project?rev=182456&view=rev adds a workaround and a better
test.

Full fix may require a significant surgery, so I'd like to see if a simple thing
is enough. 

Reported by konstantin.s.serebryany on 2013-05-22 09:04:27

@ramosian-glider
Copy link
Member Author

@ramosian-glider ramosian-glider commented Aug 31, 2015

I've got a test case that gives a false-positive error around swapcontext:

"ERROR: AddressSanitizer: SEGV on unknown address 0x000000000"

When I blacklist the file making that call, the code then prints a warning referring
to this bug:

"WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: . . . 
False positive error reports may follow
For details see http://code.google.com/p/address-sanitizer/issues/detail?id=189"

It's from the test suite of the Charm++ parallel runtime system (http://charmplusplus.org).
If test cases for this would be useful, I'd be happy to help in understanding that
code. If you want it reduced, I can probably do some, but it's a fairly large system
with a lot of cross-dependencies.

Reported by unmobile on 2014-01-08 20:30:01

@ramosian-glider
Copy link
Member Author

@ramosian-glider ramosian-glider commented Aug 31, 2015

Does asan actually report false positives after the warning about swapcontext?
A minimized test is always welcome, but we can not promise that we'll fix it -- 
swapcontext is a really tricky beast. 

Reported by konstantin.s.serebryany on 2014-01-09 04:50:37

@ramosian-glider
Copy link
Member Author

@ramosian-glider ramosian-glider commented Aug 31, 2015

Note that it generally makes little sense in blacklisting the code that performs a NULL
dereference.

Reported by ramosian.glider on 2014-01-10 10:39:22

@ramosian-glider
Copy link
Member Author

@ramosian-glider ramosian-glider commented Aug 31, 2015

Here is false positive.

When you destroy a std::exception_ptr allocated from another stack without rethrowing
it, then it crashes.

GCC 4.9.2 (on Gentoo). Boost 1.56.0 compiled with C++11 support.

{{{
==26409==WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: 0x7fff0420b000;
bottom 0x63100000f000; size: 0x1cef041fc000 (31812891951104)
False positive error reports may follow
For details see http://code.google.com/p/address-sanitizer/issues/detail?id=189
=================================================================
==26409==ERROR: AddressSanitizer: stack-buffer-underflow on address 0x6310000104a0
at pc 0x7fd9fccdcde3 bp 0x631000010320 sp 0x63100000fac8
WRITE of size 240 at 0x6310000104a0 thread T0
    #0 0x7fd9fccdcde2 (/usr/lib/gcc/x86_64-pc-linux-gnu/4.9.2/libasan.so.1+0x2fde2)
    #1 0x7fd9fbe8b046 in _Unwind_Resume (/usr/lib/gcc/x86_64-pc-linux-gnu/4.9.2/libgcc_s.so.1+0x10046)
    #2 0x406dc9 in my_coroutine(boost::coroutines::pull_coroutine<std::__exception_ptr::exception_ptr>&)
(/tmp/a.out+0x406dc9)
    #3 0x41e7f4 in boost::coroutines::detail::push_coroutine_object<boost::coroutines::pull_coroutine<std::__exception_ptr::exception_ptr>,
std::__exception_ptr::exception_ptr, void (&)(boost::coroutines::pull_coroutine<std::__exception_ptr::exception_ptr>&),
boost::coroutines::basic_standard_stack_allocator<boost::coroutines::stack_traits>
>::run(std::__exception_ptr::exception_ptr*) (/tmp/a.out+0x41e7f4)
    #4 0x41bb88 in void boost::coroutines::detail::trampoline_push<boost::coroutines::detail::push_coroutine_object<boost::coroutines::pull_coroutine<std::__exception_ptr::exception_ptr>,
std::__exception_ptr::exception_ptr, void (&)(boost::coroutines::pull_coroutine<std::__exception_ptr::exception_ptr>&),
boost::coroutines::basic_standard_stack_allocator<boost::coroutines::stack_traits>
> >(long) (/tmp/a.out+0x41bb88)
    #5 0x7fd9fc89e710 in make_fcontext (/usr/lib64/libboost_context-cxx11-gcc4_9_2.so.1.56.0+0x710)

0x6310000104a0 is located 64672 bytes inside of 65536-byte region [0x631000000800,0x631000010800)
allocated by thread T0 here:
    #0 0x7fd9fcd04787 in malloc (/usr/lib/gcc/x86_64-pc-linux-gnu/4.9.2/libasan.so.1+0x57787)
    #1 0x414890 in boost::coroutines::basic_standard_stack_allocator<boost::coroutines::stack_traits>::allocate(boost::coroutines::stack_context&,
unsigned long) (/tmp/a.out+0x414890)
    #2 0x40d975 in boost::coroutines::push_coroutine<std::__exception_ptr::exception_ptr>::push_coroutine<void
(&)(boost::coroutines::pull_coroutine<std::__exception_ptr::exception_ptr>&)>(void
(&)(boost::coroutines::pull_coroutine<std::__exception_ptr::exception_ptr>&), boost::coroutines::attributes
const&) (/tmp/a.out+0x40d975)
    #3 0x406ecf in main (/tmp/a.out+0x406ecf)
    #4 0x7fd9fbaf8dc4 in __libc_start_main (/lib64/libc.so.6+0x24dc4)
}}}

Reported by vdavid@vizrt.com on 2014-12-10 18:51:48


- _Attachment: [test_coroutine.cpp](https://storage.googleapis.com/google-code-attachments/address-sanitizer/issue-189/comment-6/test_coroutine.cpp)_
@ramosian-glider
Copy link
Member Author

@ramosian-glider ramosian-glider commented Aug 31, 2015

Reported by ramosian.glider on 2015-07-30 09:05:31

  • Labels added: ProjectAddressSanitizer
@ramosian-glider
Copy link
Member Author

@ramosian-glider ramosian-glider commented Aug 31, 2015

Adding Project:AddressSanitizer as part of GitHub migration.

Reported by ramosian.glider on 2015-07-30 09:06:55

@hbowden
Copy link

@hbowden hbowden commented Jan 11, 2016

I ran into this bug as well and made a test case. It's derived from the test suite in a fuzzer I'm writing. https://github.com/2trill2spill/nextgen . This was tested on Mac OSX 10.11.12, and below is the output from clang --version.

nahs-MBP:desktop nah$ clang --version
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.2.0
Thread model: posix

And the test case, which was compiled with clang -fsanitize=address -o test.c test.

#include <setjmp.h>
#include <stdio.h>
#include <string.h>
#include <signal.h>

static jmp_buf return_jump;

static void signal_handler(int sig)
{
    longjmp(return_jump, 1);
}

static void setup_test_sig_handler(void)
{
    struct sigaction sa;
    sigset_t ss;
    unsigned int i;

    for(i = 1; i < 512; i++)
    {
        (void)sigfillset(&ss);
        sa.sa_flags = SA_RESTART;
        sa.sa_handler = signal_handler;
        sa.sa_mask = ss;
        (void)sigaction((int)i, &sa, NULL);
    }

    return;
}

int main(void)
{
    int rtrn = setjmp(return_jump);
    if(rtrn < 0)
    {
        perror("setjmp");
        return (-1);
    }

    setup_test_sig_handler();

    /* Cause signal. */
    memmove(NULL, "123456789", 9);

    return (0);
}
@kcc
Copy link
Contributor

@kcc kcc commented Jan 12, 2016

2trill2spill, why is the previous comment related to this bug?
The code does not even have swapcontext call.
Please open a separate bug explaining what exactly went wrong.

@hbowden
Copy link

@hbowden hbowden commented Jan 12, 2016

The error message I get from running the test case points to this page, so I assumed that It was the same issue.

ASAN:SIGSEGV
=================================================================
==8425==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x000105ab39e2 bp 0x7fff5a1ab9a0 sp 0x7fff5a1ab128 T0)
    #0 0x105ab39e1 in __sanitizer::internal_memmove(void*, void const*, unsigned long) (/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/7.0.2/lib/darwin/libclang_rt.asan_osx_dynamic.dylib+0x569e1)
    #1 0x105a54a9e in main (/Users/nah/Desktop/./test+0x100000a9e)
    #2 0x7fff95a7c5ac in start (/usr/lib/system/libdyld.dylib+0x35ac)
    #3 0x0  (<unknown module>)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV ??:0 __sanitizer::internal_memmove(void*, void const*, unsigned long)
==8425==ABORTING
==8425==WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: 0x7fff5a1ac000; bottom 0x000106b5d000; size: 0x7ffe5364f000 (140730297544704)
False positive error reports may follow
For details see http://code.google.com/p/address-sanitizer/issues/detail?id=189
ASAN:SIGSEGV
==8425==AddressSanitizer: while reporting a bug found another one. Ignoring.
`
@kcc
Copy link
Contributor

@kcc kcc commented Jan 12, 2016

I don't see such message on Linux, so it might be a OSX-specific issue,
unrelated to swapcontext. Please file a separate bug, will discuss it there.

ghost pushed a commit to facebook/folly that referenced this issue Mar 17, 2016
Summary:ASAN needs to know about the stack extents. Currently it has no knowledge of
fibers and so it can give false positives, particularly in cases of no-return
(think exceptions).

See: google/sanitizers#189

This change along with a related ASAN diff fixes that, and I've verified it
fixes false positive test failures I'm seeing when throws happen from fibers.

Also rips out some hacks that attempted to work around the limitations of
ASAN these changes should fix.

This change depends on:
D3017630
D2952854
D3017619

And will also depend on rollout of libasan.so to /usr/local/fbcode platform dirs on all machines.

Reviewed By: andriigrynenko

Differential Revision: D2952899

fb-gh-sync-id: 19da779227c6c0f30c5755806325aa4cba364cfe
shipit-source-id: 19da779227c6c0f30c5755806325aa4cba364cfe
@felixguendling
Copy link

@felixguendling felixguendling commented Apr 6, 2016

Can anyone explain how facebook/folly@2ea64dd works? I could not find anything like "__asan_enter_fiber"?

Is there something like VALGRIND_STACK_REGISTER / VALGRIND_STACK_DEREGISTER available for address sanitizer?

@blastrock
Copy link

@blastrock blastrock commented May 16, 2016

Since there is nothing on the Internet about those functions, I guess that facebook has a fork of clang where they have implemented them.

I tried to implement the functions myself here, but I have very little knowledge of how things work, feedback is welcome. Note that the function prototypes are a little different from those used by folly. I tested this code with an implementation of coroutines on top of boost context v2, the warning about handle_no_return has indeed disappeared and it seems to work.

@felixguendling
Copy link

@felixguendling felixguendling commented May 17, 2016

Thank you! I will try your modified version as soon as possible. You just include the "asan_interface_internal.h" header in your binary or do you use the approach from facebook/folly@2ea64dd (dlsym)?

@avikivity
Copy link

@avikivity avikivity commented May 17, 2016

@blastrock
Copy link

@blastrock blastrock commented May 17, 2016

I only did some tests for the moment where I just declared the functions in my project:

extern "C"
{
void __asan_enter_fiber(void const* stack_top, void const* stack_bottom);
void __asan_exit_fiber();
}

Of course this fails to link if I don't compile with asan. I think folly's approach (with the weak symbol attributes and the fallback on dlsym) should work too.

@blastrock
Copy link

@blastrock blastrock commented Jun 29, 2016

FYI my patch finally passed review (a lot of things were fixed in the meantime), http://llvm.org/viewvc/llvm-project?view=revision&revision=273260 . Now let's wait for clang 3.9 :)

@felixguendling
Copy link

@felixguendling felixguendling commented Jun 29, 2016

That's great news! Thank you very much for your effort! 👍

@ioquatix
Copy link

@ioquatix ioquatix commented Jul 4, 2017

I just ran into this issue. I see the solution is to notify asan if switching stacks? I'm implementing coroutines.

@avikivity
Copy link

@avikivity avikivity commented Jul 4, 2017

@ioquatix on gcc? Whoa!

I think gcc 7 and latest clang have better support for makecontext and friends.

@blastrock
Copy link

@blastrock blastrock commented Jul 4, 2017

Yes, the idea is to annotate your code to notify asan when you switch context.

You can find some documentation here https://github.com/llvm-mirror/compiler-rt/blob/master/include/sanitizer/common_interface_defs.h#L166 and an example in the tests https://github.com/llvm-mirror/compiler-rt/blob/master/test/asan/TestCases/Linux/swapcontext_annotation.cc .

Since the test is still there, I don't think swapcontext has got any more support for asan.

@ioquatix
Copy link

@ioquatix ioquatix commented Sep 17, 2017

Cool- I'm not using makecontext/swapcontext but using ASM to switch stacks directly. I'll try out the annotations.

@ioquatix
Copy link

@ioquatix ioquatix commented Sep 18, 2017

Okay, so I've tried to implement this and it appars to be compiling, but I'm having some issues.

First, the changes I made:

Fiber::resume which swaps from main stack to fiber stack (and potentially nested fibers):

https://github.com/kurocha/concurrent/blob/6315ca4da220bdffec8fd292a04150a9eacea41d/source/Concurrent/Fiber.cpp#L64-L75

Fiber::yield which swaps from fiber stack back to main stack (or potentially parent stack if nested):

https://github.com/kurocha/concurrent/blob/6315ca4da220bdffec8fd292a04150a9eacea41d/source/Concurrent/Fiber.cpp#L97-L108

cocall which is the first function executed on the stack:

https://github.com/kurocha/concurrent/blob/6315ca4da220bdffec8fd292a04150a9eacea41d/source/Concurrent/Fiber.hpp#L160-L165

So, the order is always balanced, e.g. call resume, start stack, then in cocall, finish stack, then in yield, start stack, back to resume exit stack, finish.

Is this a reasonable implementation?

I wasn't entirely sure what I should be doing with all the stack pointers/sizes, I guess that start stack should be details of the stack you are transferring to, and finish stack should be the details of the stack you came from. However, what is the purpose of fake stack and how should I handle it given that fibers can transfer in a non-nested way?

Finally, even thought this seems to work, I now get a error:

--- Concurrent::Fiber ---
__sanitizer_start_switch_fiber (resume)
__sanitizer_finish_switch_fiber (call)
__sanitizer_start_switch_fiber (yield)
__sanitizer_finish_switch_fiber (resume)
[it should resume] 1 passed out of 1 total
__sanitizer_start_switch_fiber (resume)
__sanitizer_finish_switch_fiber (call)
__sanitizer_start_switch_fiber (yield)
__sanitizer_finish_switch_fiber (resume)
__sanitizer_start_switch_fiber (resume)
__sanitizer_finish_switch_fiber (yield)
__sanitizer_start_switch_fiber (yield)
__sanitizer_finish_switch_fiber (resume)
[it should yield] 2 passed out of 2 total
__sanitizer_start_switch_fiber (resume)
__sanitizer_finish_switch_fiber (call)
__sanitizer_start_switch_fiber (yield)
__sanitizer_finish_switch_fiber (resume)
==14488==WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: 0x000000000000; bottom 0x7ffff98be000; size: 0xffff800006742000 (-140737380081664)
False positive error reports may follow
For details see https://github.com/google/sanitizers/issues/189
[it should throw exceptions] 1 passed out of 1 total
__sanitizer_start_switch_fiber (resume)
__sanitizer_finish_switch_fiber (call)
__sanitizer_start_switch_fiber (yield)
__sanitizer_finish_switch_fiber (resume)
__sanitizer_start_switch_fiber (resume)
__sanitizer_finish_switch_fiber (yield)
__sanitizer_start_switch_fiber (yield)
__sanitizer_finish_switch_fiber (resume)
[it can be stopped] 4 passed out of 4 total
__sanitizer_start_switch_fiber (resume)
__sanitizer_finish_switch_fiber (call)
=================================================================
==14488==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fa0ccbfebc8 at pc 0x55baaadf659f bp 0x7fa0ccbfea20 sp 0x7fa0ccbfea18
WRITE of size 8 at 0x7fa0ccbfebc8 thread T0
    #0 0x55baaadf659e in std::exception_ptr::exception_ptr() /usr/bin/../include/c++/v1/exception:143:59
    #1 0x55baaadf659e in Concurrent::Fiber::Fiber<Concurrent::$_4::operator()(UnitTest::Examiner&) const::{lambda()#1}::operator()() const::{lambda()#1}>(Concurrent::$_4::operator()(UnitTest::Examiner&) const::{lambda()#1}::operator()() const::{lambda()#1}&&, unsigned long) include/Concurrent/Fiber.hpp:53
    #2 0x55baaadf5941 in Concurrent::$_4::operator()(UnitTest::Examiner&) const::{lambda()#1}::operator()() const concurrent/test/Concurrent/Test.Fiber.cpp:112:12
    #3 0x55baaadf43d3 in Concurrent::Coentry<Concurrent::$_4::operator()(UnitTest::Examiner&) const::{lambda()#1}>::cocall(void*) include/Concurrent/Fiber.hpp:169:4
    #4 0x55baaae94396 in coro_init concurrent/source/Concurrent/coro.c:97:3
    #5 0x7fa0cf71ed3f  (/usr/lib/libc.so.6+0x35d3f)

Address 0x7fa0ccbfebc8 is located in stack of thread T0 at offset 104 in frame
    #0 0x55baaadf55bf in Concurrent::$_4::operator()(UnitTest::Examiner&) const::{lambda()#1}::operator()() const concurrent/test/Concurrent/Test.Fiber.cpp:109

  This frame has 2 object(s):
    [32, 144) 'inner' <== Memory access at offset 104 is inside this variable
    [176, 184) 'ref.tmp'
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow /usr/bin/../include/c++/v1/exception:143:59 in std::exception_ptr::exception_ptr()
Shadow bytes around the buggy address:
  0x0ff499977d20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ff499977d30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ff499977d40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ff499977d50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ff499977d60: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
=>0x0ff499977d70: 00 00 00 00 00 00 00 00 00[f3]f3 f3 00 00 f2 f2
  0x0ff499977d80: f2 f2 00 f3 f3 f3 f3 f3 00 00 00 00 00 00 00 00
  0x0ff499977d90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ff499977da0: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
  0x0ff499977db0: 00 f2 f2 f2 00 f2 f2 f2 00 f3 f3 f3 00 00 00 00
  0x0ff499977dc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==14488==ABORTING

It appears as if exception_ptr of an exception thrown on another stack is not working? However, that stack is not deallocated yet, until after the exception is handled, AFAIK. I will review further but just wondering if anyone can give me feedback on my implementation.

@ioquatix
Copy link

@ioquatix ioquatix commented Sep 18, 2017

Okay, I changed all fake_stack to nullptr and I no longer get any error, but I still get warning

==14488==WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: 0x000000000000; bottom 0x7ffff98be000; size: 0xffff800006742000 (-140737380081664)
False positive error reports may follow
For details see https://github.com/google/sanitizers/issues/189

So, I guess I'm doing something a bit wrong. I'll read documentation a bit more. Any ideas appreciated.

@ioquatix
Copy link

@ioquatix ioquatix commented Sep 18, 2017

Okay, so after reading the documentation and playing around a bit, I tried implementing it as so:

#if defined(VARIANT_SANITIZE)
		void * fake_stack = nullptr;
		__sanitizer_start_switch_fiber(&fake_stack, _stack.base(), _stack.allocated_size());
		std::cerr << "__sanitizer_start_switch_fiber (resume, fake_stack=" << fake_stack << ")" << std::endl;
#endif

		coro_transfer(&_caller->_context, &_context);

#if defined(VARIANT_SANITIZE)
		std::cerr << "__sanitizer_finish_switch_fiber (resume, fake_stack=" << fake_stack << ")" << std::endl;
		__sanitizer_finish_switch_fiber(fake_stack, nullptr, nullptr);
#endif

The first (odd?) thing is that the pointer is always null? Is it using the address of the pointer to mark the stack somehow?

Secondly, I followed the instructions regarding the last __sanitizer_start_switch_fiber having nullptr as the first argument. It seems to work as expected. Still, I'm not completely confident I understand how it should be used. The test code is a bit messy, it would be nice to have a simple annotated example, not a test designed to stress test asan.

Finally, I still couldn't get the warning to go away.

@blastrock
Copy link

@blastrock blastrock commented Sep 18, 2017

The first argument (the fake stack save) is only used when the fake stack is enabled.

To enable it, define the variable ASAN_OPTIONS=detect_stack_use_after_return=1 .

It should be unrelated to the warning you get. You get this warning when you used the API wrong, or when there is a bug in ASAN...

I am not sure (since things have changed since my patch), but I think you need to put the stack of the coroutine you come from, here https://github.com/kurocha/concurrent/blob/6315ca4da220bdffec8fd292a04150a9eacea41d/source/Concurrent/Fiber.cpp#L74

EDIT: same for yield

@ioquatix
Copy link

@ioquatix ioquatix commented Sep 18, 2017

Thanks for the useful information, I will try it out and report back.

@ioquatix
Copy link

@ioquatix ioquatix commented Sep 18, 2017

Okay I managed to get it to work with no warnings.

https://github.com/kurocha/concurrent/blob/663aacb14430777fc61c86c03b5eb10b8a93611c/source/Concurrent/Fiber.hpp#L110-L120

Conceptually, I had to break the functions into 4 variations.

Once I did this, I could understand and reason about how they should be called from resume, yield, transfer and so on. It makes sense now.

@ioquatix
Copy link

@ioquatix ioquatix commented Sep 19, 2017

So, I got the core code working without warnings/errors, but found some issue here:

/home/samuel/Documents/kurocha/async/teapot/platforms/development/linux-sanitize/test/Async-tests

--- Async::Protocol::Buffer ---
[it can append data] 2 passed out of 2 total
[it can read data from a file] 1 passed out of 1 total
[it can read data into non-contiguous buffer] 12 passed out of 12 total
[it can read data from a file in chunks] 3 passed out of 3 total

--- Async::Notification ---
Fiber::start_push_stack(resume, 0x7f7844efe000, 4202496)
Fiber::finish_push_stack(cocall, 0x7ffef3e66000, 8388608)
Fiber::start_pop_stack(yield, 0x7ffef3e66000, 8388608, 0)
Fiber::finish_pop_stack(resume, 0x7f7844efe000, 4202496)
Fiber::start_push_stack(resume, 0x7f7844efe000, 4202496)
Fiber::finish_push_stack(yield, 0x7ffef3e66000, 8388608)
Fiber::start_pop_stack(coreturn, 0x7ffef3e66000, 8388608, 1)
Fiber::finish_pop_stack(resume, 0x7f7844efe000, 4202496)
[it can notify the fiber to continue] 1 passed out of 1 total

--- Async::Writable ---
Fiber::start_push_stack(resume, 0x7f7844efe000, 4202496)
Fiber::finish_push_stack(cocall, 0x7ffef3e66000, 8388608)
Fiber::start_pop_stack(yield, 0x7ffef3e66000, 8388608, 0)
Fiber::finish_pop_stack(resume, 0x7f7844efe000, 4202496)
Fiber::start_push_stack(resume, 0x7f78442fb000, 4202496)
Fiber::finish_push_stack(cocall, 0x7ffef3e66000, 8388608)
Fiber::start_pop_stack(coreturn, 0x7ffef3e66000, 8388608, 1)
Fiber::finish_pop_stack(resume, 0x7f78442fb000, 4202496)
Fiber::start_push_stack(resume, 0x7f7844efe000, 4202496)
Fiber::finish_push_stack(yield, 0x7ffef3e66000, 8388608)
Fiber::start_pop_stack(coreturn, 0x7ffef3e66000, 8388608, 1)
Fiber::finish_pop_stack(resume, 0x7f7844efe000, 4202496)
[it can wait for writing] 1 passed out of 1 total

--- Async::Readable ---
Fiber::start_push_stack(resume, 0x7f7844efe000, 4202496)
Fiber::finish_push_stack(cocall, 0x7ffef3e66000, 8388608)
Fiber::start_pop_stack(yield, 0x7ffef3e66000, 8388608, 0)
Fiber::finish_pop_stack(resume, 0x7f7844efe000, 4202496)
Fiber::start_push_stack(resume, 0x7f78442fb000, 4202496)
Fiber::finish_push_stack(cocall, 0x7ffef3e66000, 8388608)
Fiber::start_pop_stack(coreturn, 0x7ffef3e66000, 8388608, 1)
Fiber::finish_pop_stack(resume, 0x7f78442fb000, 4202496)
Fiber::start_push_stack(resume, 0x7f7844efe000, 4202496)
Fiber::finish_push_stack(yield, 0x7ffef3e66000, 8388608)
Fiber::start_pop_stack(coreturn, 0x7ffef3e66000, 8388608, 1)
Fiber::finish_pop_stack(resume, 0x7f7844efe000, 4202496)
[it can wait for reading] 2 passed out of 2 total

--- Async::Job ---
Fiber::start_push_stack(resume, 0x7f7844efe000, 4202496)
Fiber::finish_push_stack(cocall, 0x7ffef3e66000, 8388608)
=================================================================
==13117==AddressSanitizer CHECK failed: /build/llvm/src/llvm-4.0.1.src/projects/compiler-rt/lib/asan/asan_thread.cc:320 "((ptr[0] == kCurrentStackFrameMagic)) != (0)" (0x0, 0x0)
    #0 0x55e6f67cb527 in __asan::AsanCheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) (/home/samuel/Documents/kurocha/async/teapot/platforms/development/linux-sanitize/test/Async-tests+0x1d0527)
    #1 0x55e6f67e7655 in __sanitizer::CheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) (/home/samuel/Documents/kurocha/async/teapot/platforms/development/linux-sanitize/test/Async-tests+0x1ec655)
    #2 0x55e6f67d076a in __asan::AsanThread::GetStackFrameAccessByAddr(unsigned long, __asan::AsanThread::StackFrameAccess*) (/home/samuel/Documents/kurocha/async/teapot/platforms/development/linux-sanitize/test/Async-tests+0x1d576a)
    #3 0x55e6f6718148 in __asan::AddressDescription::AddressDescription(unsigned long, unsigned long, bool) (/home/samuel/Documents/kurocha/async/teapot/platforms/development/linux-sanitize/test/Async-tests+0x11d148)
    #4 0x55e6f671a8e0 in __asan::ErrorGeneric::ErrorGeneric(unsigned int, unsigned long, unsigned long, unsigned long, unsigned long, bool, unsigned long) (/home/samuel/Documents/kurocha/async/teapot/platforms/development/linux-sanitize/test/Async-tests+0x11f8e0)
    #5 0x55e6f67cac9e in __asan::ReportGenericError(unsigned long, unsigned long, unsigned long, unsigned long, bool, unsigned long, unsigned int, bool) (/home/samuel/Documents/kurocha/async/teapot/platforms/development/linux-sanitize/test/Async-tests+0x1cfc9e)
    #6 0x55e6f67cbe5b in __asan_report_store8 (/home/samuel/Documents/kurocha/async/teapot/platforms/development/linux-sanitize/test/Async-tests+0x1d0e5b)
    #7 0x55e6f688df37 in std::__1::function<void ()>::function<Async::$_0::operator()(UnitTest::Examiner&) const::{lambda()#1}::operator()() const::{lambda()#1}, void>(Async::$_0::operator()(UnitTest::Examiner&) const::{lambda()#1}::operator()() const::{lambda()#1}) /usr/bin/../include/c++/v1/functional:1763:7
    #8 0x55e6f688d8b3 in Async::$_0::operator()(UnitTest::Examiner&) const::{lambda()#1}::operator()() const /home/samuel/Documents/kurocha/async/test/Async/Test.Job.cpp:32:23
    #9 0x55e6f688c2ff in Concurrent::Coentry<Async::$_0::operator()(UnitTest::Examiner&) const::{lambda()#1}>::cocall(void*) /home/samuel/Documents/kurocha/async/test/../teapot/platforms/development/linux-sanitize/include/Concurrent/Fiber.hpp:175:4
    #10 0x55e6f69605d6 in coro_init /home/samuel/Documents/kurocha/async/teapot/packages/development/concurrent/source/Concurrent/coro.c:97:3
    #11 0x7f7847da4d3f  (/usr/lib/libc.so.6+0x35d3f)

Task #<TaskClassForAsyncTests_47339649637900:0x00561c3e102e20> failed: "Async-tests" exited with status 256
Task #<TaskClassForAsyncTests_47339649637900:0x00561c3e1c0c90> failed: Children tasks failed!
Task #<TaskClassForAsyncTests_47339649637900:0x00561c3f58c058> failed: Children tasks failed!

It's also.. a little bit odd.. in that if I only run that test, it fails a bit later:

/home/samuel/Documents/kurocha/async/teapot/platforms/development/linux-sanitize/test/Async-tests Async::Job

--- Async::Job ---
Fiber::start_push_stack(resume, 0x7fb119ef6000, 4202496)
Fiber::finish_push_stack(cocall, 0x7ffdf69a5000, 8388608)
Fiber::start_pop_stack(yield, 0x7ffdf69a5000, 8388608, 0)
Fiber::finish_pop_stack(resume, 0x7fb119ef6000, 4202496)
Fiber::start_push_stack(resume, 0x7fb119ef6000, 4202496)
Fiber::finish_push_stack(yield, 0x7ffdf69a5000, 8388608)
Fiber::start_pop_stack(coreturn, 0x7ffdf69a5000, 8388608, 1)
Fiber::finish_pop_stack(resume, 0x7fb119ef6000, 4202496)
[it can wait for result] 1 passed out of 1 total
Fiber::start_push_stack(resume, 0x7fb119ef6000, 4202496)
Fiber::finish_push_stack(cocall, 0x7ffdf69a5000, 8388608)
=================================================================
==13215==AddressSanitizer CHECK failed: /build/llvm/src/llvm-4.0.1.src/projects/compiler-rt/lib/asan/asan_thread.cc:320 "((ptr[0] == kCurrentStackFrameMagic)) != (0)" (0x0, 0x0)
    #0 0x55c23ae6d527 in __asan::AsanCheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) (/home/samuel/Documents/kurocha/async/teapot/platforms/development/linux-sanitize/test/Async-tests+0x1d0527)
    #1 0x55c23ae89655 in __sanitizer::CheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) (/home/samuel/Documents/kurocha/async/teapot/platforms/development/linux-sanitize/test/Async-tests+0x1ec655)
    #2 0x55c23ae7276a in __asan::AsanThread::GetStackFrameAccessByAddr(unsigned long, __asan::AsanThread::StackFrameAccess*) (/home/samuel/Documents/kurocha/async/teapot/platforms/development/linux-sanitize/test/Async-tests+0x1d576a)
    #3 0x55c23adba148 in __asan::AddressDescription::AddressDescription(unsigned long, unsigned long, bool) (/home/samuel/Documents/kurocha/async/teapot/platforms/development/linux-sanitize/test/Async-tests+0x11d148)
    #4 0x55c23adbc8e0 in __asan::ErrorGeneric::ErrorGeneric(unsigned int, unsigned long, unsigned long, unsigned long, unsigned long, bool, unsigned long) (/home/samuel/Documents/kurocha/async/teapot/platforms/development/linux-sanitize/test/Async-tests+0x11f8e0)
    #5 0x55c23ae6cc9e in __asan::ReportGenericError(unsigned long, unsigned long, unsigned long, unsigned long, bool, unsigned long, unsigned int, bool) (/home/samuel/Documents/kurocha/async/teapot/platforms/development/linux-sanitize/test/Async-tests+0x1cfc9e)
    #6 0x55c23ae6dcab in __asan_report_store1 (/home/samuel/Documents/kurocha/async/teapot/platforms/development/linux-sanitize/test/Async-tests+0x1d0cab)
    #7 0x55c23af4bb06 in UnitTest::Expectation<UnitTest::Examiner, Async::$_1::operator()(UnitTest::Examiner&) const::{lambda()#1}::operator()() const::{lambda()#2}>::Expectation(UnitTest::Examiner&, {lambda()#1} const&, bool) /home/samuel/Documents/kurocha/async/test/../teapot/platforms/development/linux-sanitize/include/UnitTest/Expectation.hpp:22:120
    #8 0x55c23af43ea4 in UnitTest::Expectation<UnitTest::Examiner, Async::$_1::operator()(UnitTest::Examiner&) const::{lambda()#1}::operator()() const::{lambda()#2}> UnitTest::Examiner::expect<Async::$_1::operator()(UnitTest::Examiner&) const::{lambda()#1}::operator()() const::{lambda()#2}>(Async::$_1::operator()(UnitTest::Examiner&) const::{lambda()#1}::operator()() const::{lambda()#2} const&) /home/samuel/Documents/kurocha/async/test/../teapot/platforms/development/linux-sanitize/include/UnitTest/UnitTest.hpp:53:11
    #9 0x55c23af428cd in Async::$_1::operator()(UnitTest::Examiner&) const::{lambda()#1}::operator()() const /home/samuel/Documents/kurocha/async/test/Async/Test.Job.cpp:63:15
    #10 0x55c23af4102f in Concurrent::Coentry<Async::$_1::operator()(UnitTest::Examiner&) const::{lambda()#1}>::cocall(void*) /home/samuel/Documents/kurocha/async/test/../teapot/platforms/development/linux-sanitize/include/Concurrent/Fiber.hpp:175:4
    #11 0x55c23b0025d6 in coro_init /home/samuel/Documents/kurocha/async/teapot/packages/development/concurrent/source/Concurrent/coro.c:97:3
    #12 0x7fb120e3ed3f  (/usr/lib/libc.so.6+0x35d3f)

Task #<TaskClassForAsyncTests_47187663006060:0x0055d5794d4ff0> failed: "Async-tests" exited with status 256
Task #<TaskClassForAsyncTests_47187663006060:0x0055d5792c1ab0> failed: Children tasks failed!
Task #<TaskClassForAsyncTests_47187663006060:0x0055d5792c2dc0> failed: Children tasks failed!

These tests check that a fiber adding a job to a thread pool works as expected. The tests pass without sanity checks.

The only thing I can think of, is that between tests sometimes stacks are allocated at the same address. Perhaps there is something left over from a previous invocation that's causing it to fail?

@ioquatix
Copy link

@ioquatix ioquatix commented Sep 19, 2017

So, I checked, and individually the tests work fine.

@ioquatix
Copy link

@ioquatix ioquatix commented Sep 19, 2017

Okay, I updated from clang 4.x to 5.x and the problem is... gone.

qemu-deploy pushed a commit to qemu/qemu that referenced this issue Feb 7, 2018
It helps ASAN to detect more leaks on coroutine stacks, and to get rid
of some extra warnings.

Before:

tests/test-coroutine -p
/basic/lifecycle
/basic/lifecycle: ==20781==WARNING: ASan doesn't fully support
makecontext/swapcontext functions and may produce false positives in
some cases!
==20781==WARNING: ASan is ignoring requested __asan_handle_no_return:
stack top: 0x7ffcb184d000; bottom 0x7ff6c4cfd000; size: 0x0005ecb50000
(25446121472)
False positive error reports may follow
For details see google/sanitizers#189
OK

After:

tests/test-coroutine -p /basic/lifecycle
/basic/lifecycle: ==21110==WARNING: ASan doesn't fully support
makecontext/swapcontext functions and may produce false positives in
some cases!
OK

A similar work would need to be done for sigaltstack & windows fibers
to have similar coverage. Since ucontext is preferred, I didn't bother
checking the other coroutine implementations for now.

Update travis to fix the build with ASAN annotations.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180116151152.4040-4-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
@morehouse
Copy link
Member

@morehouse morehouse commented Jun 5, 2018

@kcc: Seems that we have a workaround now with fiber annotations. Can we close this?

@kcc
Copy link
Contributor

@kcc kcc commented Jun 5, 2018

let's close. If anyone sees a remaining problem, please open a new bug with new details.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
8 participants
You can’t perform that action at this time.