Issue filtering with new GeoIP2 module #3109
Hello, I'm having the same issue. Fluent Bit v1.7 deployed as a DaemonSet on Kubernetes. The MaxMind database is hosted on a persistent volume and mounted into the pods. The DB is managed and updated by MaxMind's geoipupdate tool, also deployed on the cluster (image maxmindinc/geoipupdate). Mounting the volume on a pod with a terminal, I can confirm that the DB files are available. Here is the ConfigMap storing the Fluent Bit config.
The daemonset manifest:
Volume manifest:
Not shown in the screenshot, but the field traefik.ClientHost used as the Lookup_key in filter-geoip.conf contains the client IP (X-Forwarded-For/X-Real-Ip).
Editing my last remark: I made a stupid mistake in the configuration where I didn't specify the lookup key in the Record field, and now this is working as expected for me. @frenchviking I'm wondering if you still need the … @proffalken, are you seeing any errors in the log file? Potentially not being able to read the DB? Configuration:
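A sketch of what that fix looks like in the Fluent Bit GeoIP2 filter syntax (paths and key names here are illustrative, not taken from the thread). Per the filter's documented `Record <new_key> <lookup_key> <pattern>` form, each Record entry must name the lookup key it resolves against, which was the piece missing above:

```ini
[FILTER]
    Name       geoip2
    Match      *
    Database   /geoip/GeoLite2-City.mmdb
    Lookup_key remote_addr
    # Each Record line is: <new_key> <lookup_key> <pattern>.
    # Omitting the lookup key in these lines was the misconfiguration.
    Record     country remote_addr %{country.names.en}
    Record     isocode remote_addr %{country.iso_code}
    Record     city    remote_addr %{city.names.en}
```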
JSON record: {"date":1614192785.0,"host":"24.32.25.22","user":"-","method":"GET","path":"/apps/cart.jsp?appID=3790","code":"200","size":"4968","referer":"http://carey.info/","agent":"Mozilla/5.0 (Macintosh; U; PPC Mac OS X 10_5_0) AppleWebKit/5351 (KHTML, like Gecko) Chrome/15.0.814.0 Safari/5351","country":"United States","isocode":"US","city":"Truckee","latitude":39.3385,"longitude":-120.1729,"postal_code":"96161","region-code":"CA","region-name":null}
@agup006 The only issue in the logs is a …
Sending the data to STDOUT suggests that this is due to the … I'm wondering if it's to do with where I'm calling the parser. I notice that you're calling the "apache" parser in the input and then matching against that; I'm matching against the SYSLOG parser, then filtering against the IP Tables parser I created, and then filtering again against the GeoIP2 parser. My assumption was that data would flow through the filters from top to bottom, but I'm now wondering if this assumption is correct? My config is as follows:
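The top-to-bottom assumption is how Fluent Bit works: filters apply in the order they appear in the config, to every record whose tag matches. A minimal sketch of the pipeline shape described above (parser name, paths, and lookup key are assumptions for illustration):

```ini
[INPUT]
    Name     tail
    Path     /var/log/syslog
    Parser   syslog-rfc3164
    Tag      syslog.*

# Filters run in declaration order for records whose tag matches.
[FILTER]
    Name     parser
    Match    syslog.*
    Key_Name message
    Parser   iptables          # hypothetical custom parser

[FILTER]
    Name       geoip2
    Match      syslog.*
    Database   /geoip/GeoLite2-City.mmdb
    Lookup_key src
    Record     country src %{country.names.en}
```

Note that the second filter only sees the fields the first one produced, so the GeoIP2 Lookup_key must match a key name emitted by the upstream parser.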
The output from the above is a blank …
@agup006 I tried the lookup_key with and without the "traefik" prefix, with no success. I also changed the processing order of my filters as follows:
If the filters are parsed and events processed in this order, all records should have the traefik prefix, so now I'm using …
Still empty fields and no warn/error output. Since it's working for you, I guess there is a misconfiguration on my side. EDIT: I must specify that I need the traefik prefix because my original record contains a "log" field with the nested JSON containing the ClientHost field:
Digging with log_level debug. Weird event while reading the log containing my traefik events: while I can see that my filters 1 & 2 are loaded, I don't see any record regarding the third filter, the geoip one.
Since the records are added, the filter is loaded. I'm at a dead end, for now.
Another thought @frenchviking is that it's nested JSON, and the lookup key might need to reflect the nested structure.
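For a nested field, Fluent Bit's record accessor syntax is the usual way to address it. A sketch of what that would look like here, with the caveat raised later in this thread that it was unclear at the time whether the GeoIP2 filter honors record accessors at all (field names are taken from the records described above, the DB path is an assumption):

```ini
# $log['ClientHost'] addresses ClientHost inside the nested "log" map.
# Filters such as grep support this syntax; whether geoip2 does was
# exactly the open question in this issue.
[FILTER]
    Name       geoip2
    Match      traefik.*
    Database   /geoip/GeoLite2-City.mmdb
    Lookup_key $log['ClientHost']
    Record     country $log['ClientHost'] %{country.names.en}
```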
Thanks for this @agup006, always nice to know that your assumptions are valid! 🤣 The output without the geoip module is as follows:
I can see the …
So I have commented out all the Kubernetes input and filter, and the Elasticsearch output, replaced by a stdout config. This is what comes out (reformatted for reading):
Everything is looking good, isn't it? Here is the version with the geoip filter:
@frenchviking, thanks for providing that info. The main reason I think this is failing is that I'm unsure whether the GeoIP2 filter supports record accessor, which would use the following for the lookup key:
OK, shotgun-debugging has led me to the following: the "fault" is raised at https://github.com/fluent/fluent-bit/blob/master/plugins/filter_geoip2/geoip2.c#L259, however I think it's actually failing at https://github.com/fluent/fluent-bit/blob/master/plugins/filter_geoip2/geoip2.c#L233 because the … My C is very rusty, but I'm wondering if there's a way to get more debug statements both in here and in the libmaxmind code to help us troubleshoot this? For example, if I add the following code at lines 220 and 260 accordingly, I get the output below:
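The debug code added above isn't shown in the thread, but a sketch of the kind of diagnostics that helps around that lookup is below. This assumes libmaxminddb's documented `MMDB_lookup_string()` signature and Fluent Bit's `flb_error` logging macro; the `ctx->mmdb` and `ip` names are illustrative stand-ins for whatever the plugin actually holds at that point:

```c
/* Sketch only: extra diagnostics around the MaxMind lookup in
 * plugins/filter_geoip2/geoip2.c. MMDB_lookup_string() reports two
 * distinct failure modes: getaddrinfo errors (gai_error) and database
 * errors (mmdb_error); logging both tells you which one fired. */
int gai_error = 0, mmdb_error = 0;
MMDB_lookup_result_s result =
    MMDB_lookup_string(&ctx->mmdb, ip, &gai_error, &mmdb_error);

if (gai_error != 0) {
    flb_error("[filter_geoip2] getaddrinfo failed for '%s': %s",
              ip, gai_strerror(gai_error));
}
if (mmdb_error != MMDB_SUCCESS) {
    flb_error("[filter_geoip2] lookup failed for '%s': %s",
              ip, MMDB_strerror(mmdb_error));
}
if (!result.found_entry) {
    flb_error("[filter_geoip2] no entry for '%s' in the database", ip);
}
```

Distinguishing these three cases matters here: a gai_error means the lookup key wasn't a parseable address (e.g. it was still an escaped JSON blob), while a clean "no entry" means the key was fine but the DB had no match.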
output
Note that I have updated my config file to ship dummy data as follows (this now mirrors the example):
And the visualiser (amazing tool btw!) shows the following (link.calyptia.com/wbv):
OK, I've gone back to the beginning. Fluent Bit 1.7.1 with the configuration from the docs works; I've no idea why it didn't before:
Output:
As soon as I update it to the desired config with the IP Tables filter, it stops working for everything apart from my own public IP address:
Output (grepping for the external interface):
If I run the mmdblookup tool against the other source IP, it finds the data:
So I tried the following for my geoip2 filter: one with the "log" value in case it was not replaced by the merge-log key, and one without it.
and
Still no luck. And I'm not able to have the Nest filter lift keys up with the following:
I tested it with only the tail INPUT and then the nest FILTER. Could it be because my original JSON has backslashes?
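For reference, a minimal sketch of the Nest filter in lift mode (tag and prefix are assumptions, the rest follows the filter's documented parameters):

```ini
[FILTER]
    Name         nest
    Match        traefik.*
    Operation    lift
    Nested_under log
    Add_prefix   log_
```

The backslashes question is likely the key: lift only works if "log" is already a real map in the record. If it is still an escaped JSON string, there is nothing to lift, which would explain the filter appearing to do nothing.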
@proffalken The only difference between the 1.7.1 config and the previous one looks to be the path to the DB. If we take one of the JSON outputs from the IP Tables parser and set that as the new …
@frenchviking potentially, we could use another filter to convert escaped JSON into JSON before sending off to the GeoIP filter. https://docs.fluentbit.io/manual/pipeline/parsers/decoders#getting-started |
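Following that linked decoders page, a sketch of a parser that turns the escaped JSON string in "log" into a real JSON object before later filters see it (parser name and time format are assumptions; `Decode_Field_As json log` is the documented decoder form):

```ini
[PARSER]
    Name        docker_escaped
    Format      json
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%L
    # Decode the escaped JSON string stored under "log" into a map,
    # so downstream filters can address its keys directly.
    Decode_Field_As json log
```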
@agup006 you may have stumbled on something here. I don't seem to get the output from the iptables parser as JSON; I get it as what looks like a hash:
This is with the following config:
I'm now wondering if I'm configuring the filter correctly, given that the key/value pairs are separated by …
I finally got it working with a specific FILTER parser and by specifying the "log" Key_Name.
Thank you for your help on this!
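The working setup isn't quoted in the thread, but a hedged reconstruction of its shape, based on the description ("a specific FILTER parser and specifying the 'log' Key_Name"), looks like this; tags, parser name, and DB path are assumptions:

```ini
[FILTER]
    Name         parser
    Match        traefik.*
    Key_Name     log
    Parser       docker          # any JSON parser for the nested payload
    Reserve_Data On              # keep the other top-level keys

[FILTER]
    Name       geoip2
    Match      traefik.*
    Database   /geoip/GeoLite2-City.mmdb
    Lookup_key ClientHost
    Record     country ClientHost %{country.names.en}
```

This likely explains the "why" asked later: the parser filter replaces the record with the parsed contents of "log", so ClientHost becomes a top-level key that a plain Lookup_key can see, with no record accessor needed.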
@agup006 after our chat last night I've tried adding in an extra filter, and now it's working. I'd love to know why!
This results in the following:
I'll run this as a test config for the next few days and see what happens. Thanks again for all your help!
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.
This issue was closed because it has been stalled for 5 days with no activity. |
@agup006 could you please respond to the "why" question? The workaround works, but it's ugly to do for all fields.
Bug Report
Describe the bug
When the GeoIP2 filter is enabled, I cannot get fluent-bit to augment my logs in the way I would expect.
To Reproduce
Configure Fluent-bit as follows:
Expected behavior
All log data should be augmented with fb_<field name> with the values populated as appropriate.
Screenshots
Note: fields circled in red are from Fluent Bit; fields circled in green are from fluentd with the GeoIP filter. Both solutions point to the same GEOIP2city.mmdb file and use the filtered source field as the IP address to look up against.
Your Environment
Additional context
Trying to get the same functionality from Fluent-bit that I currently get from Fluentd to remove Fluentd from my logging stack!