[Critical] E2 Guardian eating up RAM after latest 4.1 updates #266

Closed
forid786 opened this issue Jul 23, 2017 · 97 comments

Comments

Contributor

@forid786 forid786 commented Jul 23, 2017

Hi,

E2 Guardian seems to eat up all the RAM: the process takes over 5 GB of RAM (and growing) and starts swapping. Before the latest few updates, E2 Guardian only used about 300 MB of RAM for me. This doesn't happen immediately; it seems to build up over about two days after a fresh install. Restarting the E2 Guardian process frees the RAM instantly, but after a few hours the usage is back. I'm not sure whether it's a memory leak or something else, but this needs looking into.
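
To pin down how fast the process grows, its resident set size can be sampled at intervals. The following is only a minimal sketch and not part of e2guardian: the helper names `rss_kb` and `log_rss`, the log path, and the sampling interval are all assumptions. `ps -o rss= -p PID` reports the resident size in 1024-byte units on both FreeBSD and Linux.

```shell
#!/bin/sh
# Sketch: print the resident set size (KiB) of a process.
# rss_kb and log_rss are hypothetical helper names, not e2guardian tools.
rss_kb() {
    # "rss=" selects the RSS column and suppresses the header line
    ps -o rss= -p "$1" | tr -d ' '
}

# Append one "epoch-seconds rss-kib" sample for the given PID to a log file.
log_rss() {
    pid=$1
    logfile=$2
    printf '%s %s\n' "$(date +%s)" "$(rss_kb "$pid")" >> "$logfile"
}
```

A loop such as `while sleep 60; do log_rss "$(pgrep -o e2guardian)" /tmp/e2g-rss.log; done` would then record one sample per minute, which should show whether the growth is steady or comes in bursts.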

According to @marcelloc this has happened on Squid too, in a similar fashion.

I am running this on pfSense (BSD x64). The package is created by @marcelloc from the 4.1 branch.

Contributor

@fredbcode fredbcode commented Jul 24, 2017

Contributor

@fredbcode fredbcode commented Jul 26, 2017

Do you know git commands? If so, you can easily try something:

https://github.com/e2guardian/e2guardian/commits/develop
Compare against the last commit known to work (e.g. 98cf9cf), then revert the commits after it one by one, recompiling after each.

The package is created by @marcelloc from the 4.1 branch

Maybe there are only a few small changes.
It would really help if we could narrow down roughly when the problem started.

Is there no message in syslog? No crash?
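
The commit-by-commit search described above can also be driven by `git bisect`, which halves the range at each step. This is a sketch under assumptions: it takes 98cf9cf (from the comment above) as the good commit and the current checkout as the bad one, and `start_bisect` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: start a bisect between a known-good commit and a bad one.
# start_bisect is a hypothetical helper; the bad commit defaults to HEAD.
start_bisect() {
    good=$1
    bad=${2:-HEAD}
    # Print how many commits lie between the two points
    git rev-list --count "$good..$bad"
    # git bisect start takes the bad commit first, then the good one
    git bisect start "$bad" "$good"
}

# Example (run inside an e2guardian checkout):
#   start_bisect 98cf9cf
```

After each step, build and run that revision, then mark it with `git bisect good` or `git bisect bad`; `git bisect reset` returns to the original checkout. Because the leak reportedly takes hours or days to show, each step would be slow, so reducing the number of builds matters.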

Contributor

@fredbcode fredbcode commented Jul 26, 2017

Can you post the output of e2guardian -v and a screenshot of top taken when the issue appears?

Contributor

@fredbcode fredbcode commented Jul 26, 2017

Doh! I forgot. This is not a fix at all, but it can help in the meantime.

If you rotate your logs every day, add a restart to your logrotate configuration; only large downloads will be interrupted.

Something like:

/var/log/e2guardian/access.log {
    daily
    compress
    rotate 5
    prerotate
        /etc/init.d/e2guardian stop > /dev/null 2>&1 || true
    endscript
    postrotate
        /etc/init.d/e2guardian start > /dev/null 2>&1
    endscript
}

I do this with squid; with some versions it helps a lot :)

Contributor Author

@forid786 forid786 commented Jul 26, 2017

Sorry about the late reply again. I've already got log rotation enabled, and here's the screenshot of top: http://i.imgur.com/SHrf02V.png

Funny thing is, I've reinstalled E2 Guardian twice and it's been going for a little while without the memory issue popping up again. Do you think I should wait a little longer and see if it's properly fixed?

[2.3.4-RELEASE][root@pfSense.kortex]/root: e2guardian -v
e2guardian 4.1.2

Built with: '--localstatedir=/var' '--with-logdir=/var/log' '--with-piddir=/var/run' '--with-zlib=/usr' '--enable-fancydm' '--disable-clamd' '--enable-commandline' '--enable-dnsauth' '--disable-email' '--disable-icap' '--disable-kavd' '--enable-ntlm' '--enable-sslmitm' '--enable-trickledm' '--prefix=/usr/local' '--mandir=/usr/local/man' '--disable-silent-rules' '--infodir=/usr/local/info/' '--build=amd64-portbld-freebsd10.3' 'build_alias=amd64-portbld-freebsd10.3' 'CXX=c++' 'CXXFLAGS=-O2 -pipe -I/usr/local/include -D__SSLMITM -D__SSLCERT -DLIBICONV_PLUG -fstack-protector -fno-strict-aliasing -DLIBICONV_PLUG -std=c++1y' 'LDFLAGS= -lssl -lcrypto -fstack-protector' 'LIBS=' 'CPPFLAGS=-I/usr/local/include -DLIBICONV_PLUG' 'CC=cc' 'CFLAGS=-O2 -pipe -I/usr/local/include -D__SSLMITM -D__SSLCERT -DLIBICONV_PLUG -fstack-protector -fno-strict-aliasing' 'CPP=cpp' 'PKG_CONFIG=pkgconf'
[2.3.4-RELEASE][root@pfSense.kortex]/root

Contributor Author

@forid786 forid786 commented Jul 27, 2017

I'm seeing swap usage going up again. This is very strange; I didn't have this problem before. E2 Guardian with around 10 clients shouldn't be using 8-9 GB of RAM.

Contributor

@fredbcode fredbcode commented Jul 27, 2017

Contributor

@fredbcode fredbcode commented Jul 27, 2017

Sorry about the late reps again. I've already got log rotation enabled,

I mean a daily logrotate that also restarts e2guardian.

Contributor Author

@forid786 forid786 commented Jul 28, 2017

You can see the memory usage of E2 Guardian in my previous reply, it's in the screenshot. After nearly a week it's caused my box to completely crash.

In terms of the log rotate, I'll speak to Marcelloc about having the option in our pfSense gui. For now do you recommend using cron?

Contributor

@fredbcode fredbcode commented Jul 28, 2017

Yes, for now a cron job at midnight.
Please post the result of

ps -edfL | grep e2 | WC -l

Is there no indication in syslog, messages, kern.log, etc.?
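
The midnight restart could be wrapped in a small script. The sketch below reuses the stop/start calls from the logrotate example earlier in the thread; the `RC_SCRIPT` default, the helper name `restart_e2guardian`, and the install path in the crontab line are assumptions, and the init script location on pfSense is likely to differ.

```shell
#!/bin/sh
# Sketch of a nightly restart wrapper. RC_SCRIPT defaults to the init script
# used in the logrotate example; the pfSense path may differ (assumption).
RC_SCRIPT=${RC_SCRIPT:-/etc/init.d/e2guardian}

restart_e2guardian() {
    # Stop may fail if the daemon already died; carry on and start it anyway
    "$RC_SCRIPT" stop > /dev/null 2>&1 || true
    "$RC_SCRIPT" start > /dev/null 2>&1
}

# Hypothetical crontab entry (midnight, every day), assuming this file is
# saved as /usr/local/sbin/e2g-restart.sh with a final line that calls
# restart_e2guardian:
#   0 0 * * * /usr/local/sbin/e2g-restart.sh
```

As with the logrotate variant, this only masks the growth; downloads in progress at midnight would be interrupted.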

Contributor Author

@forid786 forid786 commented Jul 28, 2017

There's nothing in the syslog or anything; however, when we eventually get a kernel panic, E2 Guardian is shown as loads of processes.

The output of ps -edfL | grep e2 | WC -l is that WC is not found xD
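
The command fails because command names are case-sensitive: the line counter is `wc`, in lowercase. Note also that `-L` is the per-thread flag of the Linux procps `ps`; on FreeBSD (and therefore pfSense) threads are listed with `-H` instead. A minimal sketch of a portable counter follows, where `count_e2_lines` is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: count e2guardian lines in a ps listing read from stdin.
# count_e2_lines is a hypothetical helper name.
count_e2_lines() {
    # [e] keeps the grep process itself from matching in a live ps listing;
    # grep -c exits non-zero on zero matches, hence "|| true"
    grep -c '[e]2guardian' || true
}

# Linux (procps):  ps -eLf | count_e2_lines   # -L prints one line per thread
# FreeBSD/pfSense: ps -axH | count_e2_lines   # -H prints one line per thread
```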

Contributor

@fredbcode fredbcode commented Jul 29, 2017

Contributor Author

@forid786 forid786 commented Jul 29, 2017

Output is : 0 right now

Contributor

@fredbcode fredbcode commented Jul 29, 2017

Contributor Author

@forid786 forid786 commented Jul 29, 2017

Here's the output:

[2.3.4-RELEASE][root@pfSense.kortex]/root: ps -edfl
UID PID PPID CPU PRI NI VSZ RSS MWCHAN STAT TT TIME COMMAND
0 42715 1 0 52 0 43440 2588 wait Is v0 0:00.00 TERM=cons25 login [pam] (login)
0 42986 42715 0 52 0 17000 2468 wait I v0 0:00.00 - USER=root LOGNAME=root HOME=/root SHELL=/bi
0 43118 42986 0 52 0 17000 2356 ttyin I+ v0 0:00.00 -- LOGNAME=root FTP_PASSIVE_MODE=YES MAIL=/v
0 57982 1 0 52 20 17000 2332 wait SN v0- 0:10.33 PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/local
0 27198 25463 0 25 0 17000 2468 wait Ss 0 0:00.00 USER=root LOGNAME=root HOME=/root MAIL=/var/m
0 27530 27198 0 52 0 17000 2360 wait S 0 0:00.00 - SSH_CLIENT=172.16.1.8 50786 22 LOGNAME=root
0 43996 27530 0 52 0 17340 3220 pause S 0 0:00.00 -- SSH_CLIENT=172.16.1.8 50786 22 LOGNAME=ro
0 57129 43996 0 72 0 18676 2244 - R+ 0 0:00.00 `-- SSH_CLIENT=172.16.1.8 50786 22 LOGNAME=

Contributor

@fredbcode fredbcode commented Jul 29, 2017

How many httpworkers are set in e2guardian.conf?

Contributor Author

@forid786 forid786 commented Jul 29, 2017

256 HTTP workers

Contributor

@fredbcode fredbcode commented Jul 29, 2017

OK, can you try with 1500, please?
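
To confirm which value is actually in effect before and after such a change, the setting can be read back from the configuration file. This is a sketch only: `get_httpworkers` is a hypothetical helper, it assumes the usual `httpworkers = N` line format, and the path to e2guardian.conf varies by platform.

```shell
#!/bin/sh
# Sketch: print the httpworkers value from an e2guardian.conf-style file.
# get_httpworkers is a hypothetical helper; the conf path varies by platform.
get_httpworkers() {
    # Match "httpworkers = N", skip commented-out lines, keep the last match
    sed -n 's/^[[:space:]]*httpworkers[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1" | tail -n 1
}

# Example (path is an assumption):
#   get_httpworkers /usr/local/etc/e2guardian/e2guardian.conf
```

Comparing this number with the live thread count would show whether the worker pool is anywhere near exhausted.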

Contributor Author

@forid786 forid786 commented Jul 29, 2017

Alright, but won't that cause more swap issues?

Contributor

@fredbcode fredbcode commented Jul 29, 2017

Contributor Author

@forid786 forid786 commented Jul 29, 2017

Alright. This machine isn't super powerful; it's used in a home setting. 2 GB RAM and a dual-core 3.1 GHz CPU.

Usually memory usage never goes past 30%.

Contributor Author

@forid786 forid786 commented Jul 29, 2017

I'm running 1500 now, it seems to work normally for now

Contributor Author

@forid786 forid786 commented Jul 29, 2017

I think it has slowly started swapping already: 2% swap usage. It was 0% before.

Contributor

@fredbcode fredbcode commented Jul 29, 2017

OK, with 2 GB I guess there is no difference at startup.
I don't see this issue on Debian here. If we find nothing, I will try to reproduce your configuration (not for at least 2 weeks).

Contributor

@fredbcode fredbcode commented Jul 29, 2017

Do you have a big blacklist? Anything new?

Contributor Author

@forid786 forid786 commented Jul 29, 2017

ShallaList, that's the only blacklist I use.

Thank you for checking it out :)

Contributor

@fredbcode fredbcode commented Jul 29, 2017

In your screenshot you have more than 1 GB free and swap at 2%.
I guess nothing is wrong here.

Contributor

@fredbcode fredbcode commented Jul 29, 2017

No recent changes to rules, sslmitm, regular expressions, etc.?

Contributor Author

@forid786 forid786 commented Jul 29, 2017

Nope, nothing changed recently. The problem started occurring when I updated E2 Guardian.

@lupusrex lupusrex commented Jul 31, 2017

I also see e2guardian 4.1.1 and 4.1.2 eating up RAM under Debian Jessie and Debian Stretch.
On my systems the e2guardian process is killed automatically after a few hours and it needs to be restarted manually. I switched back to e2guardian 3.5.1 - it runs smoothly without any problems.

Contributor

@philipianpearce philipianpearce commented Aug 7, 2017

Contributor

@fredbcode fredbcode commented Aug 7, 2017

Contributor Author

@forid786 forid786 commented Aug 8, 2017

@fredbcode After turning off MITM the problem is fully fixed. Overall RAM usage is only 20% right now which is great.

In terms of configuration, mine is quite simple. Pornography is blocked via content checking and a URL blacklist. For the kids group, YouTube restricted mode is enforced. The only other real thing is that I have added exceptions for a lot of popular apps so they still work with SSL pinning, even when MITM is on.

Contributor

@fredbcode fredbcode commented Aug 16, 2017

Here is a complete trace with SSLMITM: http://numsys.eu/TMP/valgrind2.txt

@forid786 are you using BYPASS or antivirus?

Contributor

@fredbcode fredbcode commented Aug 16, 2017

Or identification, ntlm ?

Contributor

@fredbcode fredbcode commented Aug 16, 2017

I tried on my test system:

With sslmitm = on and a loop over 10 different websites (5 http and 5 https) + clamav AV + basic identification

I pushed the loop to 250 simultaneous wget processes (recursive mode)

Result:

In syslog:
Warning system is full : max httpworkers: 500 Used: 500

In top
Memory used by e2guardian is 3.7% of 4 GB; CPU usage is almost 100%
21982 e2guard+ 20 0 5155492 151748 5364 S 97,7 3,7 1:28.78 e2guardian

After 15 minutes I killed the wget processes; memory usage by e2guardian is now more or less 3%

Debian Jessie 64 bits, Squid 3.5.25 and e2guardian 4.1.3 (dev version)

So as far as I can tell, SSLMITM alone is not the culprit; perhaps SSLMITM in some special case?
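
A load loop of the kind described in this test could look roughly like the sketch below. It is not the script that was actually used: `PROXY`, `FETCH`, and `run_load` are names made up for illustration, the proxy address is an assumed e2guardian listener, and the URL file would hold the 10 test sites (25 clients per site gives 250 simultaneous fetchers).

```shell
#!/bin/sh
# Sketch of a proxy load loop. PROXY, FETCH and run_load are assumptions
# for illustration; adjust the proxy address to your own setup.
PROXY=${PROXY:-http://127.0.0.1:8080}
FETCH=${FETCH:-wget}

# run_load CLIENTS URLFILE: start CLIENTS background fetchers per URL.
run_load() {
    clients=$1
    urlfile=$2
    while read -r url; do
        i=0
        while [ "$i" -lt "$clients" ]; do
            # -r recursive, -l 1 one level deep, --delete-after discards files
            $FETCH -q -r -l 1 --delete-after --no-check-certificate \
                -e use_proxy=yes -e http_proxy="$PROXY" -e https_proxy="$PROXY" \
                "$url" &
            i=$((i + 1))
        done
    done < "$urlfile"
    wait    # block until every background fetcher has exited
}

# Example: run_load 25 urls.txt
```

Watching the e2guardian process in top while this runs, and again after the fetchers exit, reproduces the before/after comparison reported here. It should only be pointed at sites you are allowed to load-test.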

Contributor Author

@forid786 forid786 commented Aug 19, 2017

@fredbcode Yes, I use BYPASS very often, as the content filtering isn't perfect yet. There are a lot of false positives, which is why I opened another issue asking for the phrase lists to be updated. I am not using the AV at this time.

Contributor

@fredbcode fredbcode commented Aug 23, 2017

@lupusrex Debian 64 bits ? not 32 ?

@lupusrex lupusrex commented Aug 23, 2017

Contributor

@fredbcode fredbcode commented Aug 23, 2017

Contributor Author

@forid786 forid786 commented Aug 23, 2017

I'm on pfSense 64bit, and am still seeing this issue.

Contributor

@fredbcode fredbcode commented Sep 22, 2017

Can someone try the latest changes from develop?

Contributor

@marcelloc marcelloc commented Sep 27, 2017

I will. Sorry for the delay in answering.

Contributor

@fredbcode fredbcode commented Oct 9, 2017

No one ?

@Anthony-76 Anthony-76 commented Oct 9, 2017

Hi,

For my part, I have never had this problem.

Contributor

@fredbcode fredbcode commented Oct 9, 2017

With sslmitm enabled ?

@Anthony-76 Anthony-76 commented Oct 9, 2017

During my testing, I ran a lot of tests with sslmitm enabled and didn't see any problem.

Contributor Author

@forid786 forid786 commented Oct 9, 2017

That's part of the issue, @Anthony-76: it happens randomly. Sometimes it takes a couple of days.

I've updated with Marcelloc's latest patches. My system has been fine for the past 3 days without issues. I'll leave it for a while and see how it goes. E2 Guardian is a great piece of software and one of the main reasons I have a firewall at home, but it just makes pfSense go crazy sometimes (kernel panics). This is actually leading me to think about running pfSense virtually... But people have a lot of different opinions on that.

@fredbcode I'll try to keep you updated as much as possible; currently I'm finding it hard to make time. If possible, could you please speak to Philip about fixing the phrase lists? They're too outdated / buggy. Trying to create one yourself is quite a mammoth task; I think Philip may have some updated ones.

Contributor Author

@forid786 forid786 commented Oct 10, 2017

@marcelloc I'm still getting kernel panics; this time it took 5 days for it to happen.

@Anthony-76 Anthony-76 commented Oct 10, 2017

For my part, as I said in my last post, I'm not affected by this issue.
(Debian 8 with the official Debian kernel, Squid, etc.)

But each night I reload the E2guardian configuration in order to load a new blacklist.

Maybe this reload is what keeps me from hitting the issue.

Contributor

@philipianpearce philipianpearce commented Oct 10, 2017

Contributor Author

@forid786 forid786 commented Oct 10, 2017

Hi, just as an FYI it doesn't always take a few days. Sometimes it can happen a few times in a couple of hours. And that's what really gets me sometimes.

Honestly I'm not sure if this is BSD related, but pfsense becomes really unstable with E2 Guardian. It gets kernel crashes and sometimes it crashes and doesn't recover. Then you gotta go and manually hard reset the box.

Contributor Author

@forid786 forid786 commented Oct 11, 2017

Hoping V5 fixes all these problems. 3.5.1 worked fine without crashes or these memory leak issues.

@philipianpearce What about updating the word lists? Have you got an updated list?

Contributor

@philipianpearce philipianpearce commented Oct 11, 2017

Contributor Author

@forid786 forid786 commented Oct 15, 2017

Initially I always had the RAM spiking until the entire box would crash. To combat this, I enabled log rotation on both squid and E2 Guardian. Now it seems to crash at weird intervals. I'm not sure why more people aren't getting this; maybe it's the context I have everything set up in. I've got this set up at home with pfSense, and most devices are mobiles / tablets, which means that rather than normal browser traffic, it would be app traffic.
Today I upgraded to pfSense 2.4, which now runs on BSD 11; I'm hoping something there fixes it. Otherwise I'm not sure this can be fixed without proper debugging on BSD.

Contributor Author

@forid786 forid786 commented Oct 23, 2017

Currently on pfSense 2.4 (BSD 11) and still getting crashes. It seems like sometimes the E2 Guardian threads go a bit crazy.
@marcelloc have you investigated any of this or encountered it in your testing?

I initially started by using 3.5.1, which was pretty much flawless except that SAN broke MITM. After upgrading to 4/4.1 I really started to get full system crashes. And I'm pretty much stuck due to my lack of debugging knowledge :(

@ebin-dev ebin-dev commented Jan 16, 2018

@forid786 @fredbcode @philipianpearce

I did some tests with two SBCs - both running debian stretch (RaspberryPi 3, Kernel 4.9.59 (32bit armv7 according to /proc/cpuinfo) and EspressoBin, Kernel 4.4.111 (64bit armv8), both SBCs have just 1GB of RAM). E2guardian was configured exactly the same way on both systems (e2guardian.conf and e2guardianf1.conf are attached: default configuration plus sslmitm).

Upon reboot, memory consumption of e2guardian 4.1.4 differs greatly between the two systems: 6.4% RAM on the RaspberryPi 3 (and stable) and 25.8% on the EspressoBin (and rising until RAM is filled).

e2guardian 4.1.4 on the EspressoBin also runs out of threads after a short time, even with just a single user. No issues with e2guardian 4.1.4 on the RaspberryPi 3...

This was clearly not the case with e2guardian 3.5.1: there were no issues on either SBC.

I have the impression that the RAM issue of e2guardian 4.1.4 may be related to the interaction of e2guardian with 64bit linux kernel versions.

The following compilation options were used on both SBCs (sslmitm):

e2guardian -v
e2guardian 4.1.4

Built with: '--prefix=/usr' '--enable-clamd=yes' '--with-proxyuser=e2guardian' '--with-proxygroup=e2guardian' '--sysconfdir=/etc' '--localstatedir=/var' '--enable-icap=yes' '--enable-commandline=yes' '--enable-email=yes' '--enable-ntlm=yes' '--enable-trickledm=yes' '--mandir=/share/man' '--infodir=/share/info' '--enable-' 'CXXFLAGS=-g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security' 'LDFLAGS=-Wl,-z,relro' 'CPPFLAGS=-D_FORTIFY_SOURCE=2' 'CFLAGS=-g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security' '--enable-sslmitm=yes' '--enable-pcre=yes' '--enable-locallists=yes'

e2guardian.txt
e2guardianf1.txt

Contributor Author

@forid786 forid786 commented Jan 22, 2018

I've managed to somewhat mitigate the problems with E2 Guardian by setting pfSense to use "Aggressive" firewall optimization. What this does is drop idle connections faster.
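
One way to check whether idle connections really are piling up is to count established connections on the proxy port before and after changing that setting. This is a sketch under assumptions: `count_established` is a hypothetical helper, and port 8080 in the example is only assumed to be the e2guardian listener.

```shell
#!/bin/sh
# Sketch: count ESTABLISHED TCP connections involving a given port in
# netstat -an output read from stdin. count_established is hypothetical.
count_established() {
    port=$1
    # FreeBSD prints addr.port, Linux prints addr:port; accept either form
    grep -E "[.:]${port}[[:space:]].*ESTABLISHED" | wc -l | tr -d ' '
}

# Example: netstat -an | count_established 8080
```

If the count keeps climbing while few clients are active, that would be consistent with idle connections holding worker threads.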

@marcelloc hasn't updated the package for BSD in a while, so it's become quite ancient.

Contributor Author

@forid786 forid786 commented Apr 15, 2018

Issue seemed to be solved in V5. I've had no such extravagant RAM usage or system crashes. Closing ticket - Great work on V5!

@forid786 forid786 closed this Apr 15, 2018
7 participants