Memory leak #169

AlexDaniel · 2017-04-01T00:27:11Z

Sometimes around ≈20MB per fetched page.

Feel free to try it yourself (I was using blead rakudo):

use HTTP::UserAgent;
my $ua = HTTP::UserAgent.new;
loop {
    my $response = $ua.get(‘https://perl6.org’);
    say $response.content.chars
}

MasterDuke17 · 2017-04-01T05:32:02Z

With sergot/openssl#37 applied the leak is significantly reduced. I changed the code to loop just ten times and profiled with heaptrack. It showed 10Gb allocated before and only 268Mb allocated after.

AlexDaniel · 2017-04-01T05:42:01Z

@MasterDuke17 sounds awesome!

That said, I don't think it is good enough.

MasterDuke17 · 2017-04-01T12:59:42Z

sergot/openssl#38 reduces the memory used even further, down to 187Mb.

jonathanstowe · 2017-04-01T14:19:10Z

Leaving aside the SSL thing, there is a built-in growth in memory consumption that is by design (though arguably not entirely sensible design,) as it saves every response in order to attempt to detect redirect loops, This is appended to the attribute history on the UserAgent object, so you can reset that whenever you see fit.

Testing against a non-SSL URI with an additional $ua.history = [] in the loop, the memory usage stabilises and stays stable after the first couple of requests.

I am pretty certain that this isn't documented (I had actually forgotten that it did this until you raised it,) and I will fix that this afternoon. I also think that there probably should be some switch on the UserAgent to control the behaviour as it would all become a bit of a faff in long runing software that uses the same UA for a large number of requests.

AlexDaniel · 2017-04-01T15:42:53Z

@jonathanstowe ok, that makes sense!

However, I'm not sure how redirect loops are related to this “feature”? If I'm doing a new request, why would it need something from the history?

jonathanstowe · 2017-04-01T16:07:06Z

I'm not entirely convinced either but the code does :

when $.max-redirects < +@.history && all(@.history.reverse[0..$.max-redirects]>>.code)

Which basically checks whether the last max-redirects responses were re-directs. It's actually a bit tricky. I guess the best that you could do (and preserve this check,) is keep a count of the redirects and reset it when you get a non-redirect. Of course better suggestions would be appreciated (and being Saturday my brane is somewhat entertainment impaired :)

moritz · 2017-04-01T16:11:40Z

Or for a start, only store the headers, not the bodies...

jonathanstowe · 2017-04-01T16:16:46Z

there's a long standing comment in there that say's just that :)

AlexDaniel · 2017-04-01T17:14:19Z

I'm still not sure why it has to store anything after .get(…) is finished.

moritz · 2017-04-01T17:22:05Z

pull request forthcoming, as a basis for discussion.

See #169. We used to store the full history of the response, just to count redirects. This used up lots of RAM for very little gain. This commit replaces it by a simple integer that stores the count of redirects in a row. Since @.history was a public attribute, this changes the public API, and as such is subject to discussion.

jonathanstowe · 2017-04-01T17:45:35Z

It seems to pass the tests and I quite like it, but I was going to go for something slightly more conservative and use an :save-history on the request, which is only set by default on on the redirect (so that part can remain unchanged) and if it isn't set rest history at the end of request. That way the same strange behaviour can be retained but not by default.

moritz · 2017-04-01T19:02:19Z

@jonathanstowe there are two things here I don't understand. First, who benefits from retaining the strange behavior, particularly if it's not always enabled? At least inside the public ecosystem, there's no usage:

moritz@hack:~/p6/perl6-all-modules$ git ls-files -z|xargs -0  grep -lF UserAgent|xargs grep -l --word history 2>/dev/null
grep: Tux/CSV/csv-rust-csvrdr: No such file or directory
grep: Tux/CSV/csv-rust-libcsv: No such file or directory
grep: Tux/CSV/csv-rust-qckrdr: No such file or directory
avuserow/perl6-webservice-lastfm/README.md
jonathanstowe/META6/t/data/projects.json
sergot/http-useragent/lib/HTTP/UserAgent.pm6
zoffixznet/perl6-RT-REST-Client/lib/RT/REST/Client.pm6

(The mention of history in zoffixznet/perl6-RT-REST-Client/lib/RT/REST/Client.pm6 is part of an URL, and unrelated to the .history method).

And second, how does that work? You don't know whether to save history until after the request, and IMHO that's too late to lead to predictable behavior.

ugexe · 2017-04-01T19:46:01Z

I think the original intention was eventually keep just the headers and allow some way for the Request to transparently store the content history to and from disk. But there have been lots of improvements to the architecture of HTTP::UserAgent since the redirect code was written, so module space could probably provide the content storage portion easily now (although I'm all for it being implemented directly if someone else is doing it).

As for why to keep the headers instead of just a redirect counter - more enhanced redirect logic (especially involving caching) requires more information than just "how many times have I been redirected?".

jonathanstowe · 2017-04-02T08:21:35Z

As I noted on the #170 I'd be inclined to move the redirect chain into the response object for those cases where something more sophisticated is required ( this could be as simple as adding a previous-response attribute to request and populating this in the next-request, however right now as we are simply doing a count on the previous redirects anyway @moritz PR seems to be a sound solution. the @.history is neither explicitly tested nor documented, so quite relaxed about dropping it myself.

jonathanstowe · 2017-04-03T18:26:38Z

I've merged the #170 I'm going to look at ssupplying the redirect history on a per-response basis under separate cover.

MasterDuke17 · 2017-04-08T21:00:02Z

sergot/openssl#38 was just merged.

jonathanstowe · 2017-08-12T08:16:35Z

Are we happy with this now so it can be closed?

AlexDaniel · 2017-08-12T12:38:25Z

No, I am not. It is still leaking memory. Just run the example and see yourself.
It may be due to RT #131879, I don't know.

MasterDuke17 · 2017-08-12T13:43:14Z

i'm only seeing a tiny leak. Memory usage goes up for a while, but then pretty much stabilizes.

jonathanstowe · 2017-08-12T16:51:22Z

Yeah I'm not seeing the memory use monotonously increasing, it goes up for a bit then stays flat as, I guess, the garbage collection starts to take back the unused objects.

AlexDaniel · 2017-08-12T18:03:54Z

Actually, yes! It turns out that I was using an old version. D'oh! Sorry!

IMO it's closable. I don't know if you want to have tests for this, so feel free to reopen.

jonathanstowe · 2017-08-12T18:36:57Z

Yeah I'd like to have a test but I can't see how to do it without shelling out to some highly platform dependent tool.

AlexDaniel · 2017-08-12T19:48:59Z

Check the OS and maybe use Linux::Proc::Statm on linux. It's much better to have a test that only runs on Linux than to have no test at all.

moritz mentioned this issue Apr 1, 2017

Save memory by not storing full request history #170

Merged

AlexDaniel closed this as completed Aug 12, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Memory leak #169

Memory leak #169

AlexDaniel commented Apr 1, 2017

MasterDuke17 commented Apr 1, 2017

AlexDaniel commented Apr 1, 2017

MasterDuke17 commented Apr 1, 2017

jonathanstowe commented Apr 1, 2017

AlexDaniel commented Apr 1, 2017

jonathanstowe commented Apr 1, 2017

moritz commented Apr 1, 2017

jonathanstowe commented Apr 1, 2017

AlexDaniel commented Apr 1, 2017

moritz commented Apr 1, 2017

jonathanstowe commented Apr 1, 2017

moritz commented Apr 1, 2017

ugexe commented Apr 1, 2017 •

edited

Loading

jonathanstowe commented Apr 2, 2017

jonathanstowe commented Apr 3, 2017

MasterDuke17 commented Apr 8, 2017

jonathanstowe commented Aug 12, 2017

AlexDaniel commented Aug 12, 2017

MasterDuke17 commented Aug 12, 2017

jonathanstowe commented Aug 12, 2017

AlexDaniel commented Aug 12, 2017

jonathanstowe commented Aug 12, 2017

AlexDaniel commented Aug 12, 2017 •

edited

Loading

Memory leak #169

Memory leak #169

Comments

AlexDaniel commented Apr 1, 2017

MasterDuke17 commented Apr 1, 2017

AlexDaniel commented Apr 1, 2017

MasterDuke17 commented Apr 1, 2017

jonathanstowe commented Apr 1, 2017

AlexDaniel commented Apr 1, 2017

jonathanstowe commented Apr 1, 2017

moritz commented Apr 1, 2017

jonathanstowe commented Apr 1, 2017

AlexDaniel commented Apr 1, 2017

moritz commented Apr 1, 2017

jonathanstowe commented Apr 1, 2017

moritz commented Apr 1, 2017

ugexe commented Apr 1, 2017 • edited Loading

jonathanstowe commented Apr 2, 2017

jonathanstowe commented Apr 3, 2017

MasterDuke17 commented Apr 8, 2017

jonathanstowe commented Aug 12, 2017

AlexDaniel commented Aug 12, 2017

MasterDuke17 commented Aug 12, 2017

jonathanstowe commented Aug 12, 2017

AlexDaniel commented Aug 12, 2017

jonathanstowe commented Aug 12, 2017

AlexDaniel commented Aug 12, 2017 • edited Loading

ugexe commented Apr 1, 2017 •

edited

Loading

AlexDaniel commented Aug 12, 2017 •

edited

Loading