Normalization Effort: Part 1: Remove string ip fields **WIP** #734
There are alerts and visualizations that use the string version, especially for aggregations. I don't see a need to do this, since it just reduces flexibility and leaves us open to issues in ES where it misinterprets an IP (this has happened before, especially in aggregations). Let's discuss before further work.
@jeffbryner FYI, there are no alerts that actually use this field; they all use sourceipaddress or destinationipaddress. As mentioned in my OP, ES now supports both versions of IP addresses (link for reference: https://www.elastic.co/blog/indexing-ipv6-addresses-in-elasticsearch), so the issue seen previously should no longer be a problem. We can determine which version an IP is within the code without having to explode our mapping.
Might not be alerts, but there is definitely other code using these fields. Again, I don't see a need, but I do see lots of regression testing that would be necessary.
I completely agree on that point. I'll add further test cases to the OP.
Hey, caught up with Tristan on this. My apologies for not realizing this is part of the normalization effort; I should have looked for context before commenting. https://github.com/mozilla/MozDef/search?q=ipaddress&unscoped_q=ipaddress may be a useful start for finding all the places that may be assuming string/ipv4. In particular, we should be cautious about offering up an ipv6 to a routine that is not expecting it.
I do understand the concern, and I do plan on ensuring everything is fully tested and scoped before any of this is implemented. I was using vscode to search for occurrences of "ipv" in our repo.
Going to close this PR for now. Until Kibana adds a method for aggregating on IPv6 IPs (it only aggregates on IPv4), we can't remove the text fields, as doing so would cause problems.
Do not merge!
Purpose of this change:
With ES 5.x the ip field type now supports both ipv4 and ipv6. This reduces the need to add these additional fields as string fields. This change will reduce the 6 field types we currently use to track IP information to only 2.
Left to do:
Test:
Testing Completed:
Number of Alerts that depend on the removed fields: 0
Files with ipv4 or ipv6 strings:
It seems like we can still use netaddr to check whether an IP is ipv4 or ipv6 in these scripts, so the version-specific values don't necessarily need to be fields within the mapping.
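As a rough illustration of doing the version check in code instead of in the mapping, here is a minimal sketch. It uses Python's standard-library `ipaddress` module rather than netaddr so it runs without extra dependencies (netaddr's `IPAddress` objects expose a `version` attribute as well). The helper names `ip_version` and `is_ipv4` are hypothetical and not taken from the MozDef codebase.

```python
import ipaddress


def ip_version(value):
    """Return 4 or 6 for a valid IP address string, or None otherwise.

    Hypothetical helper: a stand-in for the netaddr-based check
    discussed above, written against the stdlib ipaddress module.
    """
    if not isinstance(value, str):
        return None
    try:
        return ipaddress.ip_address(value).version
    except ValueError:
        # Not parseable as either an IPv4 or an IPv6 address.
        return None


def is_ipv4(value):
    """Guard for routines that only expect IPv4 input."""
    return ip_version(value) == 4
```

A routine that is not expecting IPv6 could call `is_ipv4(event_ip)` before processing, which addresses the earlier concern about handing an ipv6 address to code that assumes ipv4.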
New Finding:
Seems like aggregation of ipv6 is supported in ES, but not in Kibana using the IP range bucket.
To aggregate the ip field you'd need to use "Terms", but "significant terms" will omit the IP field.
I created a post on their site to get more information around this.
https://discuss.elastic.co/t/no-ipv6-range-aggregation-bucket/144928
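For reference, the "Terms" approach can also be expressed directly as an Elasticsearch query body, bypassing Kibana's IP range bucket. The sketch below only builds the request body as a Python dict. It assumes the field (here `sourceipaddress`) is mapped with the `ip` type; the function name and the aggregation name `top_ips` are made up for illustration.

```python
import json


def terms_aggregation_body(field, size=10):
    """Build a search body that buckets documents by the values of `field`.

    Hypothetical helper; "top_ips" is an arbitrary aggregation name.
    """
    return {
        "size": 0,  # return only aggregation results, no document hits
        "aggs": {
            "top_ips": {
                "terms": {"field": field, "size": size}
            }
        },
    }


body = terms_aggregation_body("sourceipaddress")
print(json.dumps(body, indent=2))
```

The resulting body could be sent to an index's `_search` endpoint with whatever client the scripts already use.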