Skip to content
This repository has been archived by the owner on Jun 12, 2018. It is now read-only.

Stack corruption? #14

Closed
SumolX opened this issue Aug 30, 2015 · 23 comments
Closed

Stack corruption? #14

SumolX opened this issue Aug 30, 2015 · 23 comments

Comments

@SumolX
Copy link

SumolX commented Aug 30, 2015

I've been experiencing stack and heap collisions with my software and I've narrowed it down to server_http.hpp. Running valgrind with the following parameters reports some extremely large stack usage on my software and explains why I am consistently crashing in the same spot.

I decided to run the same check on the http_examples.cpp and found extremely high usage of the stack as well. By the way I want to thank you for creating this simple but yet elegant library, it was exactly what I was looking for to serve REST/HTML web services for my project.

Lastly, I run your library on both OSX and Linux and the crash only occurs on Linux. I will have to install and run valgrind on OSX and see if the stack behavior is the same on apple.

valgrind --log-file=valgrind.log --leak-check=full --track-origins=yes --show-reachable=yes ./http_examples

==3641== Memcheck, a memory error detector
==3641== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==3641== Using Valgrind-3.10.0.SVN and LibVEX; rerun with -h for copyright info
==3641== Command: ./http_examples
==3641== Parent PID: 2733
==3641==
==3641== Warning: client switching stacks? SP change: 0x7aed548 --> 0x4036f98
==3641== to suppress, use: --max-stackframe=61564336 or greater
==3641== Warning: client switching stacks? SP change: 0x4036d98 --> 0x7aed550
==3641== to suppress, use: --max-stackframe=61564856 or greater
==3641== Warning: client switching stacks? SP change: 0x7aed508 --> 0x4036da0
==3641== to suppress, use: --max-stackframe=61564776 or greater
==3641== further instances of this message will not be shown.
==3641==

@eidheim
Copy link
Owner

eidheim commented Aug 31, 2015

Is this from running an unmodified version of http_examples.cpp?

If not, and you are using async_flush, please use flush instead. I guess async_flush in a for/while loop could result in this.

My last thought is about buffer overflows that can corrupt the streams. Be sure there are no such errors in your code.

@SumolX
Copy link
Author

SumolX commented Aug 31, 2015

I ran valgrind against your unmodified http_examples.cpp test.

Now it could be that it is not stack corruption but your library requires a large amount of stack memory (stack/heap collision). I can modify the server_http.hpp to reserve a large stack size when a thread is spawned to handle an inbound request and see if it makes any difference on my end.

I did run a quick test to see the addresses from where the task is spawned and when the resource function is executed and found the difference between the function minus the start to be rather excessive.

function which is executed:

void foo(response, request)
{
int x = 0;
printf("%d\n", &x);
}

threaded initialized:

threads.emplace_back( this
{
int y = 0;
printf("%d\n", &y);

io_service.run();
}

Simple stack size check:

diff = x - y

Edit

I ran valgrind on my app with the same options on OSX and no memory issues were reported and the stack size was roughly 2 MBytes.

@eidheim
Copy link
Owner

eidheim commented Aug 31, 2015

I also tried the same valgrid options on both OS X and Debian Stable, and no memory issues we reported. What g++ and boost version are you using on your Linux Distro, and which distro is it? I have a few distros, so I can probably test it there.

@SumolX
Copy link
Author

SumolX commented Aug 31, 2015

I am using Ubuntu 14.04 with g++ 4.9.2 and libboost-all-dev 1.55

Currently testing with boost 1.59 and I'm getting some interesting behavior where it is no longer constantly crashing on the same spot, but is now completely random.

@eidheim
Copy link
Owner

eidheim commented Aug 31, 2015

Are you just starting the http_examples.cpp, or are you running several requests towards the server?

@SumolX
Copy link
Author

SumolX commented Aug 31, 2015

I'm using Chrome to test http_examples and disabled the client portion within the file. I increase the main thread sleep from 1 second to 10 and the use Chrome to perform the get localhost:8080.

While the crash does not happen in the http_examples, i can still see the large stack usage when using chome browser with valgrind.

==2289== Memcheck, a memory error detector
==2289== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==2289== Using Valgrind-3.10.0.SVN and LibVEX; rerun with -h for copyright info
==2289== Command: ./http_examples
==2289== Parent PID: 2155
==2289==
==2289== Warning: client switching stacks? SP change: 0x72ec308 --> 0x4036f98
==2289== to suppress, use: --max-stackframe=53171056 or greater
==2289== Warning: client switching stacks? SP change: 0x4036d98 --> 0x72ec310
==2289== to suppress, use: --max-stackframe=53171576 or greater
==2289== Warning: client switching stacks? SP change: 0x72ec2c8 --> 0x4036da0
==2289== to suppress, use: --max-stackframe=53171496 or greater
==2289== further instances of this message will not be shown.
==2289==
==2289== HEAP SUMMARY:
==2289== in use at exit: 72,704 bytes in 1 blocks
==2289== total heap usage: 4,761 allocs, 4,760 frees, 225,798 bytes allocated
==2289==
==2289== 72,704 bytes in 1 blocks are still reachable in loss record 1 of 1
==2289== at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2289== by 0x591126F: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
==2289== by 0x4010139: call_init.part.0 (dl-init.c:78)
==2289== by 0x4010222: _dl_init (dl-init.c:36)
==2289== by 0x4001309: ??? (in /lib/x86_64-linux-gnu/ld-2.19.so)
==2289==
==2289== LEAK SUMMARY:
==2289== definitely lost: 0 bytes in 0 blocks
==2289== indirectly lost: 0 bytes in 0 blocks
==2289== possibly lost: 0 bytes in 0 blocks
==2289== still reachable: 72,704 bytes in 1 blocks
==2289== suppressed: 0 bytes in 0 blocks
==2289==
==2289== For counts of detected and suppressed errors, rerun with: -v
==2289== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

@SumolX
Copy link
Author

SumolX commented Sep 2, 2015

Running on Ubuntu 14.04 with g++ 4.9.2 and boost v1.59 with 0 client threads, I believe I have narrowed down the issue. The following logic within async_flush() is causing a double free within dispatch:

            //Wait until previous async_flush is finished
            strand->dispatch([this](){
                if(*async_writing) {
                    *async_waiting=true;
                    try {
                        async_timer->async_wait(yield);
                    }
                    catch(std::exception& e) {
                    }
                    *async_waiting=false;
                }
            });

@eidheim
Copy link
Owner

eidheim commented Sep 2, 2015

Thank you! I'll remove the async_flush mechanism then, it probably will never be used anyway (and I never liked the dirty code that might be needed). I'll try do it Tonight.

@SumolX
Copy link
Author

SumolX commented Sep 2, 2015

Keep in mind that I tried commenting it out, but then ran into other problems.

@eidheim
Copy link
Owner

eidheim commented Sep 2, 2015

I've removed the async_flush mechanism, just need to test it. Committing the changes in about 30 minutes.

@SumolX
Copy link
Author

SumolX commented Sep 2, 2015

I'm more than glad to test out your changes without having you check-in new versions. Just add at gmail dot com to my username.

@eidheim
Copy link
Owner

eidheim commented Sep 2, 2015

There, removal of async_flush commited.

@eidheim
Copy link
Owner

eidheim commented Sep 2, 2015

Have also fixed the default_resource example, which now closes the file on connection interruption.

@SumolX
Copy link
Author

SumolX commented Sep 2, 2015

The following two new functions will greatly reduce memory, which made it possible for me to now use both sqlite for acquiring data and rapidxml to generate the properly formatted xml responses back to the client without any stack or heap corruptions.

Response& operator<<(const char* p) {
stream.write(&p[0], strlen(p));
return *this;
}

Response& operator<<(const std::string& t) {
stream.write(&t[0], t.length());
return *this;
}

@eidheim
Copy link
Owner

eidheim commented Sep 2, 2015

This is strange, should not the following produce similar code:

template <class T>
Response& operator<<(const T& t) {
    stream << t;
    return *this;
}

Edit: forgot the template

@eidheim
Copy link
Owner

eidheim commented Sep 2, 2015

Hmm, I have to check Tomorrow what happens with stream after this operation in void flush:

response << stream.rdbuf();

@eidheim
Copy link
Owner

eidheim commented Sep 2, 2015

What if you replace the void flush with:

void flush() {
    boost::system::error_code ec;
    boost::asio::async_write(*socket, streambuf, yield[ec]);

    if(ec)
        throw std::runtime_error(ec.message());
}

Does that make any difference?

@eidheim
Copy link
Owner

eidheim commented Sep 2, 2015

Without Response& operator<<(const char* p) and Response& operator<<(const std::string& t) that is.

@SumolX
Copy link
Author

SumolX commented Sep 2, 2015

Works perfectly without the overloaded operators and with your new flush() method. Makes sense since there was essentially a duplication of data but in a different container type.

@eidheim eidheim closed this as completed in 3b85cd0 Sep 3, 2015
@eidheim
Copy link
Owner

eidheim commented Sep 3, 2015

Great! What happens is that all the pointers from your stream was copied to the response stream that was used to send data to client, and this could lead to stack overflow since the pointers resides on the stack. At least I think. The crash you experienced should be stack overflow, or am I incorrect?

@eidheim
Copy link
Owner

eidheim commented Sep 3, 2015

Fixed in latest commit by the way.

@SumolX
Copy link
Author

SumolX commented Sep 3, 2015

Correct! The behavior was exactly that and with your latest changes I am now running rock solid. Thank you for this class as it is convenient and extremely easy to add RESTful web services to my or any application.

Question, Is the intent of the flush to send data from the server to the client right away? At the end of my code block should I be manually flushing the response?

@eidheim
Copy link
Owner

eidheim commented Sep 5, 2015

The flush method is just there to avoid building too large memory buffers when sending data to clients. For instance if you want to send a file larger than the available memory, or send several files that together would take up all the available memory of the server.

The flush method is called automatically when a resource function returns (reaches the end).

Keep in mind that flushing data to the clients includes some overhead, so only use it when needed.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants