fail2ban status <jail> is slow #2819

Closed
3 tasks done
allanwind opened this issue Aug 27, 2020 · 8 comments

Comments

@allanwind

allanwind commented Aug 27, 2020

Environment:

  • Fail2Ban version (including any possible distribution suffixes): 0.10.2-2.1
  • OS, including release name/version: Debian 10
  • Fail2Ban installed via OS/distribution mechanisms
  • You have not applied any additional foreign patches to the codebase
  • Some customizations were done to the configuration (provide details below if so)

The issue:

I use munin's fail2ban plugin to graph the number of IPs that have been blocked per jail. All the plugins run every 5 minutes via cron in the Debian default configuration. The fail2ban plugin is a simple script that runs fail2ban-client status to get the list of jails, then fail2ban-client status <jail> to get the count for each jail.

  1. When I switched from iptables to nftables the fail2ban plugin would time out. The workaround is to increase the timeout from the default 60 seconds to 300 seconds in /etc/munin/plugin-conf.d/fail2ban:
[fail2ban]
timeout 300
  2. fail2ban-client status is ~43 times slower than getting the data from nft directly:
# time for jail in apache-noscript dovecot postfix sshd; do fail2ban-client status $jail | grep 'Currently banned'; done
   |- Currently banned: 34
   |- Currently banned: 189
   |- Currently banned: 1415
   |- Currently banned: 7378

real    0m2.161s
user    0m1.803s
sys     0m0.313s
# time nft list table inet filter -j | jq -r ".nftables[] | select(has(\"set\")).set |  [.name, (.elem | length) ] | \"\(.[0]) \(.[1])\""
f2b-sshd 7378
f2b-apache-noscript 34
f2b-postfix 1415
f2b-dovecot 189

real    0m0.050s
user    0m0.051s
sys     0m0.012s

I shared claim 1 with you mainly for context, as a way to document the workaround, and in case it helps troubleshoot. What I am asking is to see if there is a way to speed up 2, say, from 43x to <10x slower. There is a lot of value in fail2ban-client status abstracting away the backend (iptables vs nftables), so I do think it's worth looking into.
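For reference, the jq pipeline in claim 2 could also be written as a small script, which is roughly what a plugin would need if it read the counts from nft directly. This is only a sketch: the function name `count_set_elements` and the sample document are made up for illustration, and it assumes the JSON layout implied by the jq filter above (a top-level `nftables` array whose `set` objects carry a `name` and an optional `elem` list).

```python
import json


def count_set_elements(nft_json):
    """Return {set name: number of elements} for every set in an nft JSON dump."""
    doc = json.loads(nft_json)
    counts = {}
    for item in doc.get("nftables", []):
        nft_set = item.get("set")
        if nft_set is None:
            continue  # skip tables, chains, rules and other objects
        # Assumption: a set without elements may lack the "elem" key entirely.
        counts[nft_set["name"]] = len(nft_set.get("elem", []))
    return counts


# Hypothetical sample, shaped like the output of `nft list table inet filter -j`.
SAMPLE = json.dumps({
    "nftables": [
        {"table": {"family": "inet", "name": "filter"}},
        {"set": {"name": "f2b-sshd", "elem": ["192.0.2.1", "192.0.2.2"]}},
        {"set": {"name": "f2b-dovecot"}},
    ]
})

if __name__ == "__main__":
    for name, count in count_set_elements(SAMPLE).items():
        print(name, count)
```

Reading from nft this way ties the script to the nftables backend, which is exactly the abstraction that fail2ban-client status avoids.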

Steps to reproduce

Expected behavior

Observed behavior

Any additional information

Configuration, dump and another helpful excerpts

Any customizations done to /etc/fail2ban/ configuration

Relevant parts of /var/log/fail2ban.log file:

Relevant lines from monitored log files in question:

@sebres
Contributor

sebres commented Sep 7, 2020

> When I switched from iptables to nftables the fail2ban plugin would timeout.

I have no idea what could cause that. In any case, I guess it is rather a matter of how munin's fail2ban plugin is implemented.

> fail2ban-client status is ~43 times slower than getting the data from nft directly:

Well, fail2ban-client status $jail does not only return banned IPs (info about bans), but also the status of the filter (info about failures).
In the normal case, though, the execution time is almost negligible; for example, the test below shows that it is very fast:

import timeit
from fail2ban.server.jail import Jail, Actions

# create dummy jail:
jail = Jail('test')
acts = Actions(jail)
# simulate 65K bans (255 * 255 = 65025 addresses):
for i in range(1,256):
  for j in range(1,256):
    acts.addBannedIP('192.0.%s.%s' % (i, j))

# banned count:
>>> len(acts.status()[-1][-1])
65025
# status output:
>>> timeit.timeit(acts.status, number=1)
0.0026350021362304688

and filter status is still faster, BUT...

Some information is retrieved while holding a lock, so when some jails are persistently flooded with failures, their filter may stay busy for a longer time, and it can then take a while to obtain the lock (jail.status suddenly becomes slow).
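The locking effect described here can be reproduced with a toy model. This is a sketch only, not fail2ban's code: `FilterSketch`, `process_failures`, `timed_status` and `measure` are hypothetical names, and the "flood of failures" is simulated with a sleep while the lock is held.

```python
import threading
import time


class FilterSketch:
    """Toy stand-in for a jail filter; not fail2ban's actual class."""

    def __init__(self):
        self._lock = threading.Lock()
        self._failures = 0

    def process_failures(self, busy_seconds, started):
        # Simulates the filter being busy with failures while holding the lock.
        with self._lock:
            started.set()
            time.sleep(busy_seconds)
            self._failures += 1

    def status(self):
        # The reader needs the same lock, so it has to wait for the writer.
        with self._lock:
            return self._failures


def timed_status(flt):
    """Return how long a single status() call takes, in seconds."""
    t0 = time.perf_counter()
    flt.status()
    return time.perf_counter() - t0


def measure(busy_seconds=0.2):
    """Return (idle, contended) durations of status() for one toy filter."""
    flt = FilterSketch()
    idle = timed_status(flt)
    started = threading.Event()
    worker = threading.Thread(
        target=flt.process_failures, args=(busy_seconds, started))
    worker.start()
    started.wait()  # make sure the worker already holds the lock
    contended = timed_status(flt)
    worker.join()
    return idle, contended


if __name__ == "__main__":
    idle, contended = measure()
    print("idle: %.4f s, contended: %.4f s" % (idle, contended))
```

The status call itself does almost no work in either case; the contended call is slow only because it waits for the lock.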

I'll take a look at what we can do here (we don't need exact info there, so maybe the locking can be removed).

But anyway, I don't think that status (as it is implemented now) is a proper way to obtain only the count of banned tickets from a jail.

@sebres
Contributor

sebres commented Sep 8, 2020

Hmm... yesterday I investigated this a bit deeper and the issue is more complex: for communication between server and client fail2ban uses the pickle protocol, so almost half of the time goes to this conversion (pickle.dumps/pickle.loads). I was also able to find another bottleneck and solve it (ca. 40% - 50% of the whole execution time)...
But I guess to make status $jail more suitable for a large number of banned IPs, we should limit the maximum count of IPs in the output of flavor basic (to something like 100)... It'd also make the output more readable.
For the full list of IPs, in a newer version one could use:

  • either fail2ban-client status $jail full (default basic)
  • or fail2ban-client get $jail banned
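To get a feeling for the pickle cost mentioned above, the round trip can be timed in isolation. This is a rough sketch under assumptions: it pickles a plain list of address strings with Python's default protocol, whereas the real server/client exchange may carry other objects and settings, so the numbers will not match fail2ban exactly. The helper name `round_trip` is made up for illustration.

```python
import pickle
import timeit

# Same 255 * 255 = 65025 dummy addresses as in the test earlier in this thread.
ips = ['192.0.%s.%s' % (i, j) for i in range(1, 256) for j in range(1, 256)]


def round_trip(obj):
    """Serialize and deserialize, as a server/client exchange would."""
    return pickle.loads(pickle.dumps(obj))


if __name__ == "__main__":
    seconds = timeit.timeit(lambda: round_trip(ips), number=1)
    print("%d IPs, pickle round trip: %.4f s" % (len(ips), seconds))
```

Timings depend heavily on the machine and the Python version, so no expected value is given here.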

sebres added a commit to sebres/fail2ban that referenced this issue Sep 8, 2020
sebres added a commit to sebres/fail2ban that referenced this issue Sep 8, 2020
…100 in output by basic flavor in order to speedup it and to provide more clear output (fail2bangh-2819);

for full list of IPs in newer version one could use:
  - `fail2ban-client status $jail full`
  - `fail2ban-client get $jail banned` or `fail2ban-client banned`
@allanwind
Author

allanwind commented Sep 8, 2020 via email

@sebres
Contributor

sebres commented Sep 8, 2020

> Did you by any chance compare nft vs iptables?

What would be the goal of such a comparison?

> Does pickle/unpickle of ~10k items really take a second?

In my case (i7-4790) it is 65k IPs that really take 1 second (2 seconds for the whole status execution).

> Maybe it would make sense to retain the existing behavior for status for backwards compatibility

Maybe, but status is pretty unusable for a human to read this way (as the default). I must think a bit about that.

> I obviously don't look through a list of 10k IPs but I do routinely pipe it to less to see if a given IP is on the list.

In the new version there are 2 other ways to read it from a script or programmatically, see 54b2208 (or #2725 (comment)).

@allanwind
Author

allanwind commented Sep 8, 2020 via email

sebres added a commit to sebres/fail2ban that referenced this issue Sep 9, 2020
…`: output total and current counts only, without banned IPs list in order to speedup it and to provide more clear output (fail2bangh-2819), flavor `basic` (still default) is unmodified for backwards compatibility;

it can be changed later to `short`, so for full list of IPs in newer version one should better use:
  - `fail2ban-client status $jail basic`
  - `fail2ban-client get $jail banned` or `fail2ban-client banned`
sebres added a commit to sebres/fail2ban that referenced this issue Sep 10, 2020
…`: output total and current counts only, without banned IPs list in order to speedup it and to provide more clear output (fail2bangh-2819), flavor `basic` (still default) is unmodified for backwards compatibility;

it can be changed later to `short`, so for full list of IPs in newer version one should better use:
  - `fail2ban-client status $jail basic`
  - `fail2ban-client get $jail banned` or `fail2ban-client banned`
@sebres
Contributor

sebres commented Sep 10, 2020

I rebased it in f381b98 with changed behavior, so it is backwards compatible now, with a new flavor short, which provides shorter output with counts only (without the IP list):

fail2ban-client status $jail short
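A script that only needs the count (such as the munin plugin from the original report) could use it roughly like this. This is a sketch under assumptions: it requires a fail2ban version that already has the short flavor, it assumes the short output still contains a "Currently banned:" line like the one shown in the report, and the function names are made up for illustration.

```python
import re
import subprocess

# Matches lines such as "   |- Currently banned: 7378".
_BANNED_RE = re.compile(r"Currently banned:\s*(\d+)")


def parse_currently_banned(status_text):
    """Extract the 'Currently banned' count from status output, or None."""
    match = _BANNED_RE.search(status_text)
    return int(match.group(1)) if match else None


def currently_banned(jail):
    """Ask fail2ban-client for the short status of a jail and return the count."""
    # Assumption: the installed fail2ban-client supports the `short` flavor.
    out = subprocess.run(
        ["fail2ban-client", "status", jail, "short"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_currently_banned(out)
```

Keeping the parser separate from the subprocess call makes it possible to check the parsing against captured output without a running fail2ban server.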

We should probably switch the default flavor (from basic to short) in some version, for the aforementioned reasons. So if someone uses it programmatically to obtain the list of IPs, it would be better to change it as follows to be safe in the future:

-fail2ban-client status $jail
+fail2ban-client status $jail basic

fail2ban-client status is rather something for human eyes.

@allanwind
Author

allanwind commented Sep 10, 2020 via email

@sebres
Contributor

sebres commented Sep 23, 2020

Merged, thus closing.

@sebres sebres closed this as completed Sep 23, 2020