Possible space leak in warp #649

Closed
horus opened this Issue Oct 20, 2017 · 13 comments

horus commented Oct 20, 2017

The symptom can be reproduced by compiling and running the following code with Warp >= 3.2.11.2:

{-# LANGUAGE OverloadedStrings #-}

module Main (main) where

import Network.HTTP.Types
import Network.Wai
import Network.Wai.Handler.Warp

main :: IO ()
main =
  run 8081 $ \_ resp -> do
    resp $ responseLBS status200 [] "Hello World"

A simple keep-alive client will do (I don't think clients that close the connection can make this happen):

#!/usr/bin/env perl

use strict;
use warnings;
use LWP::UserAgent qw();

my $http = LWP::UserAgent->new(keep_alive => 1);
while (1) {
   my $resp = $http->get("http://127.0.0.1:8081/");
   if ($resp->is_success) {
       #print $resp->decoded_content, "\n";
   }
   select undef, undef, undef, 0.05;
}

Monitor the app using top(1) for a while.

If compiled with 3.2.11.1, the RES column will be more or less stable; otherwise it will keep growing.

  PID USERNAME    THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
72065 admin         4  20    0 48896K  8456K uwait  23   0:09   1.79% warptest

I encountered this issue while running a Scotty app on CentOS 6.6. I later tested the code above on FreeBSD 11 with GHC 8.2.1, 8.0.2, and 7.10.2.

EDIT:

The original title was misleading.

After some experiments, I think the problem lies deeper than I first thought, but I did get some results back.

I moved those functions out of the where clause and limited the stack size with -K1K:

** Exception (reporting due to +RTS -xc): (THUNK_STATIC), stack trace:
  Main.app,
  called from Network.Wai.Handler.Warp.Run.processRequest,
  called from Network.Wai.Handler.Warp.Run.http1,
  called from Network.Wai.Handler.Warp.Run.serveConnection,
  called from Network.Wai.Handler.Warp.Settings.defaultSettings,
  called from Network.Wai.Handler.Warp.Settings.settingsFork,
  called from Network.Wai.Handler.Warp.Run.fork,
  called from Network.Wai.Handler.Warp.Run.acceptConnection,
  called from Network.Wai.Handler.Warp.FileInfoCache.withFileInfoCache,
  called from Network.Wai.Handler.Warp.FdCache.withFdCache,
  called from Network.Wai.Handler.Warp.Date.withDateCache,
  called from Network.Wai.Handler.Warp.Run.runSettingsConnectionMakerSecure,
  called from Network.Wai.Handler.Warp.Run.runSettingsConnectionMaker,
  called from Network.Wai.Handler.Warp.Run.runSettingsConnection,
  called from Network.Wai.Handler.Warp.Run.runSettingsSocket,
  called from Network.Wai.Handler.Warp.Run.runSettings,
  called from Network.Wai.Handler.Warp.Run.run,
  called from Main.main
stack overflow
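
(The trace above came from running the test program with +RTS -xc and a 1 KB stack limit; the exact build command isn't shown here, but since -xc needs a profiling build and passing RTS flags needs -rtsopts, it should be roughly:)

ghc -prof -fprof-auto -rtsopts Main.hs
./Main +RTS -xc -K1K -RTS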

Please also refer to these graphs. I used different heap profiling options (-hm vs. -hc), but they are enough to demonstrate the problem:

[heap profile graph: warptest-1]
[heap profile graph: warptest-2]

@horus horus changed the title from Possible space leak after 3.2.11.2 to Possible space leak after warp-3.2.11.2 Oct 20, 2017

@horus horus changed the title from Possible space leak after warp-3.2.11.2 to Possible space leak in warp Oct 24, 2017

@kazu-yamamoto kazu-yamamoto self-assigned this Dec 28, 2017

kazu-yamamoto commented Jan 10, 2018

@snoyberg We have only #618 between 3.2.11.1 and 3.2.11.2.
I suspect this patch causes Warp to not release connections.
So, if Warp receives a lot of connections at the same time, it looks as if the connections are leaking.
Would you take a look?

snoyberg added a commit that referenced this issue Jan 17, 2018

snoyberg commented Jan 17, 2018

I wasn't able to reliably reproduce the error, but I think I see what the issue is: the exception handlers are stacking up. I've rewritten the code to hoist the exception handler out of the loop and restore the functions to being proper tail calls. Can you try out #667?
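
A minimal sketch of the difference (this is not Warp's actual code; step stands in for handling one request and reporting whether to keep the connection alive):

module HandlerSketch where

import Control.Exception (SomeException, handle)

ignore :: SomeException -> IO ()
ignore _ = return ()

-- Handler installed inside the loop: every level leaves a catch frame on
-- the stack until its body returns, so the stack (and the chain of
-- handlers) grows with each request on a keep-alive connection.
leakyLoop :: IO Bool -> IO ()
leakyLoop step = handle ignore $ do
  more <- step
  if more then leakyLoop step else return ()

-- Handler hoisted out of the loop: go calls itself in tail position and a
-- single handler covers the whole connection.
fixedLoop :: IO Bool -> IO ()
fixedLoop step = handle ignore go
  where
    go = do
      more <- step
      if more then go else return ()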

horus commented Jan 19, 2018

@snoyberg I think this solves the issue.

I've recompiled my test case with Warp 3.2.15. The resulting graph is attached below. No crazy climbing lines any more. :-)

[heap profile graph after upgrading to warp-3.2.15]

Thank you very much for your time!

snoyberg commented Jan 19, 2018

Awesome, thanks!

@kazu-yamamoto do you want me to cut a release, or did you want to?

kazu-yamamoto commented Jan 19, 2018

@snoyberg I will review this, too. After it's approved and merged, please release it yourself. :-)

kazu-yamamoto added a commit to kazu-yamamoto/wai that referenced this issue Jan 19, 2018

kazu-yamamoto commented Jan 19, 2018

Merged.

@snoyberg Please release Warp whenever you like.

ryantrinkle commented Feb 27, 2018

I'm still seeing this issue reliably with warp-3.2.17. Here's my test setup:

  • Same main.hs as given above
  • Run with +RTS -S
  • Hit the running server with ab -c 1 -n 100000 http://localhost:8081/

Desired: column 3 of the output ("Live bytes") should be approximately constant.
Actual: "Live bytes" grows approximately linearly with each request.

Note: Eventually, the amount of live bytes will decrease suddenly. This seems to happen more often if the settingsTimeout is set to a lower value.
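
For reference, a lower timeout can be set with setTimeout when building the settings; here is a minimal variant of the test program above (the 1-second value is only for making the purge easy to observe; Warp's default is 30 seconds):

{-# LANGUAGE OverloadedStrings #-}

module Main (main) where

import Network.HTTP.Types
import Network.Wai
import Network.Wai.Handler.Warp

main :: IO ()
main =
  -- setTimeout takes seconds; 1 makes the periodic purge of connection
  -- bookkeeping happen quickly (the default is 30).
  runSettings (setPort 8081 $ setTimeout 1 defaultSettings) $ \_ resp ->
    resp $ responseLBS status200 [] "Hello World"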

ryantrinkle commented Feb 27, 2018

Another note: in order to see this leaked memory in the heap profile, I had to run with +RTS -xt. Otherwise, it would appear in -S's output, but not in -hc's output.

ryantrinkle commented Feb 27, 2018

@kazu-yamamoto @snoyberg Let me know if you'd prefer I open a new issue, since this seems like it may have a different underlying cause from the previous one.

snoyberg commented Mar 2, 2018

Doesn't look like a leak to me. It appears that the slowloris protection is keeping track of the connections that have been opened and then flushing that information each time it does a purge.
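
A very rough model of that behaviour, only to illustrate its shape (this is not Warp's actual timeout code): each accepted connection registers an entry, and entries are only dropped in bulk when the reaper wakes up, so live bytes climb between purges and then fall suddenly.

module PurgeSketch where

import Control.Concurrent (forkIO, threadDelay)
import Control.Monad (forever)
import Data.IORef

-- One entry per connection; False once the connection has finished.
newtype Entry = Entry { entryActive :: Bool }

-- Start a reaper that wakes up every purgeMicros microseconds and throws
-- away entries for finished connections, all at once.
newRegistry :: Int -> IO (IORef [Entry])
newRegistry purgeMicros = do
  ref <- newIORef []
  _ <- forkIO $ forever $ do
    threadDelay purgeMicros
    atomicModifyIORef' ref (\es -> (filter entryActive es, ()))
  return ref

-- Every accepted connection adds an entry immediately, but nothing is
-- removed until the next purge, so memory grows between purges.
register :: IORef [Entry] -> Entry -> IO ()
register ref e = atomicModifyIORef' ref (\es -> (e : es, ()))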

ryantrinkle commented Mar 2, 2018

@snoyberg Thanks for taking a look! That makes sense. It looks like ab wasn't using Keep-Alive (and I couldn't get it to use it properly, even with -k), and when I manually send a stream of requests on a single connection, there's no increase in memory usage.
