| .. __ _ _ ___ _ | |
| / _|__ _(_) |_ ) |__ __ _ _ _ | |
| | _/ _` | | |/ /| '_ \/ _` | ' \ | |
| |_| \__,_|_|_/___|_.__/\__,_|_||_| | |
| ================================================================================ | |
| Developing Filters | |
| ================================================================================ | |
| Filters are tricky. They need to: | |
| * work with a variety of the versions of the software that generates the logs; | |
| * work with the range of logging configuration options available in the | |
| software; | |
| * work with multiple operating systems; | |
| * not make assumptions about the log format in excess of the software | |
| (e.g. do not assume a username doesn't contain spaces and use \S+ unless | |
| you've checked the source code); | |
| * account for how future versions of the software will log messages | |
| (e.g. guess what would happen to the log message if different authentication | |
| types are added); | |
| * not be susceptible to DoS vulnerabilities (see Filter Security below); and | |
| * match intended log lines only. | |
| Please follow the steps from Filter Test Cases to Developing Filter Regular | |
| Expressions and submit a GitHub pull request (PR) afterwards. If you get stuck, | |
| you can push your unfinished changes and still submit a PR: describe what you | |
| have done and what the hurdle is, and we'll attempt to help (the PR will be | |
| updated automatically with any further commits you push to complete it). | |
| Filter Test Cases | |
| ================= | |
| Purpose | |
| ------- | |
| Start by finding the log messages that the application generates related to | |
| some form of authentication failure. If you are adding to an existing filter | |
| think about whether the log messages are of a similar importance and purpose | |
| to the existing filter. If you were a user of Fail2Ban, and did a package | |
| update of Fail2Ban that started matching new log messages, would anything | |
| unexpected happen? Would the bantime/findtime for the jail be appropriate for | |
| the new log messages? If not, the new messages may belong in a separate filter | |
| definition; for example, the exim filter targets authentication failures and | |
| exim-spam targets log messages related to spam. | |
| Even if it is a new filter you may consider separating the log messages into | |
| different filters based on purpose. | |
| Cause | |
| ----- | |
| Are some of the log lines a result of the same action? For example, is a PAM | |
| failure log message, followed by an application specific failure message the | |
| result of the same user/script action? If you add regular expressions for | |
| both you would end up with two failures for a single action. | |
| Therefore, select the most appropriate log message and document the other log | |
| message(s) with a test case that must not match, along with a description of | |
| why you chose one over the other. | |
| With the selected log lines, consider what action caused those log | |
| messages and whether they could have been generated by accident. Could | |
| the log message be occurring due to the first step towards the application | |
| asking for authentication? Could the log messages occur often? If some of | |
| these are true make a note of this in the jail.conf example that you provide. | |
| Samples | |
| ------- | |
| It is important to include log file samples so any future change in the regular | |
| expression will still work with the log lines you have identified. | |
| The sample log messages are provided in a file under testcases/files/logs/ | |
| with the same name as the corresponding filter (but without the .conf | |
| extension). Each log line should have a line of failJSON metadata directly | |
| above it (so that the log line is tested in the test suite). If there is | |
| any specific information about the log message, such as version or an | |
| application configuration option that is needed for the message to occur, | |
| include this in a comment (line beginning with #) above the failJSON metadata. | |
| Include one example of each form of log message, and definitely no more than | |
| three. If log messages differ between versions of the application, samples | |
| that show the differences are encouraged. | |
| Also attempt to inject an IP into the application (e.g. by specifying | |
| it as a username) to see whether Fail2Ban could detect the IP | |
| from user input rather than the true origin. See the Filter Security section | |
| and the top example in testcases/files/logs/apache-auth as to how to do this. | |
| Once you have discovered that this is possible, correct the regex so it doesn't | |
| match and provide this as a test case with "match": false (see failJSON below). | |
| If the mechanism to create the log message isn't obvious, provide a | |
| configuration and/or sample scripts in testcases/files/config/{filtername} and | |
| reference these in the comments above the log line. | |
| FailJSON metadata | |
| ----------------- | |
| failJSON metadata is a comment line immediately above the log message. It will | |
| look like:: | |
| # failJSON: { "time": "2013-06-10T10:10:59", "match": true , "host": "93.184.216.119" } | |
| Time should match the time of the log message, in the specific format | |
| Year-Month-Day'T'Hour:Minute:Second. If your log message does not include a | |
| year, like the example below, list the year as 2005 if the date falls before | |
| Sun Aug 14 10am UTC, and as 2004 if afterwards. Here is an example failJSON | |
| line preceding a sample log line:: | |
| # failJSON: { "time": "2005-03-24T15:25:51", "match": true , "host": "198.51.100.87" } | |
| Mar 24 15:25:51 buffalo1 dropbear[4092]: bad password attempt for 'root' from 198.51.100.87:5543 | |
| The "host" in failJSON should contain the IP or domain that should be blocked. | |
| For long lines that you do not want to be matched (e.g. from log injection | |
| attacks) and any log lines to be excluded (see "Cause" section above), set | |
| "match": false in the failJSON and describe the reason in the comment above. | |
| After developing regexes, the following command will test all failJSON metadata | |
| against the log lines in all sample log files:: | |
| ./fail2ban-testcases testSampleRegex | |
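As a rough illustration of the failJSON format described above, the following Python sketch reads one failJSON comment as ordinary JSON and checks that its time field follows the Year-Month-Day'T'Hour:Minute:Second layout. The helper name parse_failjson is hypothetical and is not part of the Fail2Ban test suite; the sketch only assumes that the text after the "# failJSON:" prefix is valid JSON.

```python
import json
from datetime import datetime

PREFIX = "# failJSON:"

def parse_failjson(comment_line):
    """Parse one failJSON comment line into a dict (hypothetical helper)."""
    if not comment_line.startswith(PREFIX):
        raise ValueError("not a failJSON line")
    meta = json.loads(comment_line[len(PREFIX):])
    # Raises ValueError if the time is not in Year-Month-Day'T'Hour:Minute:Second form.
    datetime.strptime(meta["time"], "%Y-%m-%dT%H:%M:%S")
    return meta

meta = parse_failjson(
    '# failJSON: { "time": "2005-03-24T15:25:51", "match": true , "host": "198.51.100.87" }'
)
print(meta["match"], meta["host"])
```

A quick check like this can catch a malformed comment (for example a missing quote) before running the full test suite.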
| Developing Filter Regular Expressions | |
| ===================================== | |
| Date/Time | |
| --------- | |
| At the moment, Fail2Ban depends on log lines having time stamps. That is why, | |
| before starting to develop a failregex, you should check whether your log line | |
| format is known to Fail2Ban. Copy the time component from the log line, append | |
| an IP address, and test with the following command:: | |
| ./fail2ban-regex "2013-09-19 02:46:12 1.2.3.4" "<HOST>" | |
| The output of this command should contain something like:: | |
| Date template hits: | |
| |- [# of hits] date format | |
| | [1] Year-Month-Day Hour:Minute:Second | |
| Ensure that the template description matches the time/date elements in your log | |
| line time stamp. If no format matches, a date template needs to be added to | |
| server/datedetector.py. Ensure that the new template is added in an order such | |
| that more specific matches occur first and that there is no confusion between a | |
| Day and a Month. | |
| Filter file | |
| ----------- | |
| The filter is specified in a config/filter.d/{filtername}.conf file. A filter | |
| file can have the sections INCLUDES (optional) and Definition as follows:: | |
| [INCLUDES] | |
| before = common.conf | |
| after = filtername.local | |
| [Definition] | |
| failregex = .... | |
| ignoreregex = .... | |
| This is also documented in the man page jail.conf (section 5). Other definitions | |
| can be added to make failregexes more readable and maintainable; they are used | |
| through string interpolation (see http://docs.python.org/2.7/library/configparser.html). | |
| General rules | |
| ------------- | |
| Use "before" if you need to include a common set of rules, like syslog or if | |
| there is a common set of regexes for multiple filters. | |
| Use "after" if you wish to allow the user to overwrite a set of customisations | |
| of the current filter. This file doesn't need to exist. | |
| Try to avoid using ignoreregex, mainly for performance reasons. Use it only | |
| when avoiding it would leave you with an unreadable failregex. | |
| Syslog | |
| ------ | |
| If your application logs to syslog you can take advantage of log line prefix | |
| definitions present in common.conf. So as a base use:: | |
| [INCLUDES] | |
| before = common.conf | |
| [Definition] | |
| _daemon = app | |
| failregex = ^%(__prefix_line)s | |
| In this example common.conf defines __prefix_line which also contains the | |
| _daemon name (in syslog terms the service) you have just specified. _daemon | |
| can also be a regex. | |
| For example, to capture the following line _daemon should be set to "dovecot":: | |
| Dec 12 11:19:11 dunnart dovecot: pop3-login: Aborted login (tried to use disabled plaintext auth): rip=190.210.136.21, lip=113.212.99.193 | |
| and then ``^%(__prefix_line)s`` would match "Dec 12 11:19:11 dunnart dovecot: | |
| ". Note it matches the trailing space(s) as well. | |
| Substitutions (AKA string interpolations) | |
| ----------------------------------------- | |
| We have used string interpolations in the examples above. They are useful for | |
| making regexes more readable, for reusing generic patterns in multiple | |
| failregex lines, and for deferring the definition of regex parts to specific | |
| filters or even to the user. The general principle is that the value of a _name | |
| variable replaces occurrences of %(_name)s within the same section, or anywhere | |
| in the config file if it is defined in the [DEFAULT] section. | |
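The substitution principle can be tried out with Python's standard configparser module, as in the sketch below. This is only an illustration: Fail2Ban reads filters through its own configuration reader, and the option names and the failregex in FILTER_TEXT are made up for the example rather than taken from a real filter.

```python
import configparser

# Hypothetical filter fragment; the option names and regex are illustrative only.
FILTER_TEXT = r"""
[Definition]
_daemon = app
__prefix = \s*%(_daemon)s:\s+
failregex = ^%(__prefix)sfailed login from <HOST>$
"""

parser = configparser.ConfigParser()
parser.read_string(FILTER_TEXT)

# get() expands %(name)s references using other options in the same section.
expanded = parser.get("Definition", "failregex")
print(expanded)
```

Printing the expanded value is a convenient way to see the regex that the interpolation actually produces, including nested references such as __prefix referring to _daemon.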
| Regular Expressions | |
| ------------------- | |
| Regular expressions (failregex, ignoreregex) assume that the date/time has been | |
| removed from the log line (this is just how fail2ban works internally ATM). | |
| If the format is like '<date...> error 1.2.3.4 is evil' then you need to match | |
| the < and > left at the start, so the regex should be similar to | |
| '^<> error <HOST> is evil$', using <HOST> where the IP/domain name appears in | |
| the log line. | |
| The following general rules apply to regular expressions: | |
| * ensure regexes start with a ^ and are as restrictive as possible. E.g. do not | |
| use .* if \d+ is sufficient; | |
| * use functionality of Python regexes defined in the standard Python re library | |
| http://docs.python.org/2/library/re.html; | |
| * make regular expressions readable (as much as possible). E.g. | |
| (?:...) represents a non-capturing regex but (...) is more readable, thus | |
| preferred. | |
| If you have only a basic knowledge of regular expressions we advise you to read | |
| http://docs.python.org/2/library/re.html first. It doesn't take long and will | |
| remind you, e.g., which characters you need to escape and which you don't. | |
| Developing/testing a regex | |
| -------------------------- | |
| You can develop a regex in a file or using command line depending on your | |
| preference. You can also use samples you have already created in the test cases | |
| or test them one at a time. | |
| The general tool for testing Fail2Ban regexes is fail2ban-regex. To see how to | |
| use it run:: | |
| ./fail2ban-regex --help | |
| Take note of -l heavydebug / -l debug and -v as they might be very useful. | |
| .. TIP:: | |
| Take a look at the source code of the application you are developing | |
| failregex for. You may see optional or extra log messages, or parts | |
| thereof, that need to form part of your regex. It may also reveal how some | |
| parts are constrained and different formats depending on configuration or | |
| less common usages. | |
| .. TIP:: | |
| For looking through source code - http://sourcecodebrowser.com/ . It has | |
| call graphs and can browse different versions. | |
| .. TIP:: | |
| Some applications log spaces at the end. If you are not sure add \s*$ as | |
| the end part of the regex. | |
| If your regex is not matching, http://www.debuggex.com/?flavor=python can help | |
| to tune it. fail2ban-regex -D ... will present Debuggex URLs for the regexes | |
| and sample log files that you pass into it. | |
| In general, when using regex debuggers to develop fail2ban filters: | |
| * use the regex from the ./fail2ban-regex output (to ensure all substitutions | |
| are done) | |
| * replace <HOST> with (?&.ipv4) | |
| * make sure that the regex type is set to Python | |
| * for the test data put your log output with the date/time removed | |
| When you have fixed the regex put it back into your filter file. | |
| Please spread the good word about Debuggex - Serge Toarca is kindly continuing | |
| its free availability to Open Source developers. | |
| Finishing up | |
| ------------ | |
| If you've added a new filter, add a new entry in config/jail.conf. The theory | |
| here is that a user will create a jail.local with [filtername]\nenabled = true | |
| to enable your jail. | |
| So more specifically in the [filtername] section in jail.conf: | |
| * ensure that you have "enabled = false" (users will enable as needed); | |
| * use "filter =" set to your filter name; | |
| * use a typical action to disable ports associated with the application; | |
| * set "logpath" to the usual location of application log file; | |
| * if the default findtime or bantime isn't appropriate to the filter, specify | |
| more appropriate choices (possibly with a brief comment line). | |
| Submit a GitHub pull request (see "Pull Requests" above) to | |
| github.com/fail2ban/fail2ban containing your great work. | |
| Filter Security | |
| =============== | |
| Poor filter regular expressions are susceptible to DoS attacks. | |
| When a remote user is able to introduce text that matches a filter's failregex, | |
| with the inserted text matching the <HOST> part, they are able to deny service | |
| to any host they choose. | |
| So the <HOST> part must be anchored on text generated by the application, and | |
| not the user, to an extent sufficient to prevent the user from inserting the | |
| entire text matching this or any other failregex. | |
| Ideally a filter regex should anchor at both the beginning and the end of the | |
| log line. However, as more applications log fixed text at the beginning than at | |
| the end, anchoring the beginning is more important. If the log file used by the | |
| application is shared with other applications, like system logs, ensure the | |
| other applications that use that log file do not log user-generated text at the | |
| beginning of the line, or, if they do, ensure the regexes of the filter are | |
| sufficient to mitigate the risk of insertion. | |
| Examples of poor filters | |
| ------------------------ | |
| 1. Too restrictive | |
| We find a log message:: | |
| Apr-07-13 07:08:36 Invalid command fial2ban from 1.2.3.4 | |
| We make a failregex:: | |
| ^Invalid command \S+ from <HOST> | |
| Now think evil. The user does the command 'blah from 1.2.3.44' | |
| The program diligently logs:: | |
| Apr-07-13 07:08:36 Invalid command blah from 1.2.3.44 from 1.2.3.4 | |
| And fail2ban matches 1.2.3.44 as the IP to ban. A DoS attack was successful. | |
| The fix here is that the command can be anything so .* is appropriate:: | |
| ^Invalid command .* from <HOST> | |
| Here the .* will first match until the end of the string. The regex engine then | |
| realises it has more to match, i.e. "from <HOST>", and backtracks until it finds | |
| this. It will then ban 1.2.3.4 correctly. Since the <HOST> is always at the end, | |
| end the regex with a $:: | |
| ^Invalid command .* from <HOST>$ | |
| Note if we'd just had the expression:: | |
| ^Invalid command \S+ from <HOST>$ | |
| Then, provided the user put a space in their command, they would never have | |
| been banned. | |
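The three variants above can be compared with a few lines of Python. This is a minimal sketch, not how Fail2Ban itself evaluates filters: the HOST pattern is a simplified IPv4-only stand-in for <HOST>, the helper banned_host is hypothetical, and the log line is assumed to have had its timestamp removed already.

```python
import re

# Simplified IPv4-only stand-in for <HOST>; Fail2Ban's real pattern is broader.
HOST = r"(?P<host>\d{1,3}(?:\.\d{1,3}){3})"

def banned_host(failregex, line):
    """Return the host captured by failregex on line, or None if no match."""
    match = re.search(failregex.replace("<HOST>", HOST), line)
    return match.group("host") if match else None

# Injected log line, timestamp already removed.
injected = "Invalid command blah from 1.2.3.44 from 1.2.3.4"

too_restrictive = banned_host(r"^Invalid command \S+ from <HOST>", injected)
fixed = banned_host(r"^Invalid command .* from <HOST>$", injected)
never_matches = banned_host(r"^Invalid command \S+ from <HOST>$", injected)

print(too_restrictive)  # the injected address
print(fixed)            # the true origin
print(never_matches)    # None: a command containing a space escapes the filter
```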
| 2. Unanchored regex can match other user injected data | |
| From the Apache vulnerability CVE-2013-2178 | |
| ( original ref: https://vndh.net/note:fail2ban-089-denial-service ). | |
| An example bad regex for Apache:: | |
| failregex = [[]client <HOST>[]] user .* not found | |
| The user can make a GET request for:: | |
| GET /[client%20192.168.0.1]%20user%20root%20not%20found HTTP/1.0 | |
| Host: remote.site | |
| Now the log line will be:: | |
| [Sat Jun 01 02:17:42 2013] [error] [client 192.168.33.1] File does not exist: /srv/http/site/[client 192.168.0.1] user root not found | |
| As this log line doesn't match any other expression, it matches the above regex | |
| at the injected text and blocks 192.168.0.1, an address chosen by the HTTP | |
| requester (192.168.33.1), resulting in a denial of service. | |
| 3. Over greedy pattern matching | |
| From: https://github.com/fail2ban/fail2ban/pull/426 | |
| An example ssh log (simplified):: | |
| Sep 29 17:15:02 spaceman sshd[12946]: Failed password for user from 127.0.0.1 port 20000 ssh1: ruser remoteuser | |
| As we assume the username can include anything, including spaces, it's prudent | |
| to put .* here. The remote user can also be anything, so let's not make | |
| assumptions again:: | |
| failregex = ^%(__prefix_line)sFailed \S+ for .* from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$ | |
| So this works. The problem arises if the text after "ruser" is injected by the | |
| user to be 'from 1.2.3.4'. The resultant log line is:: | |
| Sep 29 17:15:02 spaceman sshd[12946]: Failed password for user from 127.0.0.1 port 20000 ssh1: ruser from 1.2.3.4 | |
| Testing with:: | |
| fail2ban-regex -v 'Sep 29 17:15:02 Failed password for user from 127.0.0.1 port 20000 ssh1: ruser from 1.2.3.4' '^ Failed \S+ for .* from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$' | |
| .. TIP:: I've removed the bit that matches __prefix_line from the regex and log. | |
| Shows:: | |
| 1) [1] ^ Failed \S+ for .* from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$ | |
| 1.2.3.4 Sun Sep 29 17:15:02 2013 | |
| It should have matched 127.0.0.1. The first .* in the regex is greedy, so it | |
| matched until the end of the string. There was no "from <HOST>" after that, so | |
| the regex engine worked backwards from the end of the string until this was | |
| matched. The result was that 1.2.3.4, injected by the user, was matched and the | |
| wrong IP was banned. | |
| The solution here is to make the first .* non-greedy with .*?. Here it matches | |
| as little as required and the fail2ban-regex tool shows the output:: | |
| fail2ban-regex -v 'Sep 29 17:15:02 Failed password for user from 127.0.0.1 port 20000 ssh1: ruser from 1.2.3.4' '^ Failed \S+ for .*? from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$' | |
| 1) [1] ^ Failed \S+ for .*? from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$ | |
| 127.0.0.1 Sun Sep 29 17:15:02 2013 | |
| So the general case here is a log line that contains:: | |
| (fixed_data_1)<HOST>(fixed_data_2)(user_injectable_data) | |
| Where the regex that matches fixed_data_1 is greedy and matches the entire | |
| string before backtracking, and user_injectable_data can contain text that | |
| matches the rest of the pattern. | |
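The greedy and non-greedy behaviour can also be reproduced directly with Python's re module. As before, this is a sketch under assumptions: HOST is a simplified IPv4-only stand-in for <HOST>, banned_host is a hypothetical helper, and the log line has the timestamp and the __prefix_line part removed, as in the fail2ban-regex invocation above.

```python
import re

# Simplified IPv4-only stand-in for <HOST>; Fail2Ban's real pattern is broader.
HOST = r"(?P<host>\d{1,3}(?:\.\d{1,3}){3})"

def banned_host(failregex, line):
    """Return the host captured by failregex on line, or None if no match."""
    match = re.search(failregex.replace("<HOST>", HOST), line)
    return match.group("host") if match else None

# Timestamp and syslog prefix removed; "from 1.2.3.4" is supplied by the remote user.
line = " Failed password for user from 127.0.0.1 port 20000 ssh1: ruser from 1.2.3.4"

greedy = r"^ Failed \S+ for .* from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$"
lazy = r"^ Failed \S+ for .*? from <HOST>( port \d*)?( ssh\d+)?(: ruser .*)?$"

greedy_host = banned_host(greedy, line)
lazy_host = banned_host(lazy, line)
print(greedy_host)  # the injected address
print(lazy_host)    # the true origin
```

The only difference between the two patterns is the ? after the first .*, which makes the engine try the earliest "from <HOST>" first instead of the last one.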
| Another case | |
| ------------ | |
| ref: https://www.debuggex.com/r/CtAbeKMa2sDBEfA2/0 | |
| A webserver logs the following without URL escaping:: | |
| [error] 2865#0: *66647 user "xyz" was not found in "/file", client: 1.2.3.1, server: www.host.com, request: "GET ", client: 3.2.1.1, server: fake.com, request: "GET exploited HTTP/3.3", host: "injected.host", host: "www.myhost.com" | |
| regex:: | |
| failregex = ^ \[error\] \d+#\d+: \*\d+ user "\S+":? (?:password mismatch|was not found in ".*"), client: <HOST>, server: \S+, request: "\S+ .+ HTTP/\d+\.\d+", host: "\S+" | |
| The .* matches to the end of the string. It finds that it can't continue to | |
| match ", client ... so it backtracks from the end and finds the user-injected | |
| web URL:: | |
| ", client: 3.2.1.1, server: fake.com, request: "GET exploited HTTP/3.3", host: "injected.host | |
| In this case there is a fixed host: "www.myhost.com" at the end, so the solution | |
| is to anchor the regex at the end with a $. | |
| If this weren't the case, the first .* would need to be changed so that it | |
| can't capture beyond <HOST>. | |
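The effect of the end anchor on this sample can be checked with the following Python sketch. It relies on the same assumptions as any quick experiment outside Fail2Ban: HOST is a simplified IPv4-only stand-in for <HOST> and banned_host is a hypothetical helper. Note that on this particular sample line the anchored regex matches nothing at all, so the injected address is no longer captured.

```python
import re

# Simplified IPv4-only stand-in for <HOST>; Fail2Ban's real pattern is broader.
HOST = r"(?P<host>\d{1,3}(?:\.\d{1,3}){3})"

def banned_host(failregex, line):
    """Return the host captured by failregex on line, or None if no match."""
    match = re.search(failregex.replace("<HOST>", HOST), line)
    return match.group("host") if match else None

# Sample line from above, with the leading space left after date removal.
line = (
    ' [error] 2865#0: *66647 user "xyz" was not found in "/file", '
    'client: 1.2.3.1, server: www.host.com, request: "GET ", '
    'client: 3.2.1.1, server: fake.com, request: "GET exploited HTTP/3.3", '
    'host: "injected.host", host: "www.myhost.com"'
)

unanchored = (
    r'^ \[error\] \d+#\d+: \*\d+ user "\S+":? '
    r'(?:password mismatch|was not found in ".*"), client: <HOST>, '
    r'server: \S+, request: "\S+ .+ HTTP/\d+\.\d+", host: "\S+"'
)
anchored = unanchored + "$"

unanchored_host = banned_host(unanchored, line)
anchored_host = banned_host(anchored, line)
print(unanchored_host)  # the injected client address
print(anchored_host)    # None on this sample
```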
| 4. Application generates two identical log messages with different meanings | |
| If the application generates the following two messages under different | |
| circumstances:: | |
| client <IP>: authentication failed | |
| client <USER>: authentication failed | |
| Then it's obvious that a regex of ``^client <HOST>: authentication | |
| failed$`` will still cause problems if the user can trigger the second | |
| log message with a <USER> of 123.1.1.1. | |
| Here there's nothing to do except request/change the application so it logs | |
| messages differently. | |