Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NMS-13373: add CortexIo #31

Merged
merged 1 commit into from
Sep 8, 2021
Merged

NMS-13373: add CortexIo #31

merged 1 commit into from
Sep 8, 2021

Conversation

swachter
Copy link
Contributor

@swachter swachter commented Jul 19, 2021

Issues:

This PR

  • adds a new submodule that provides a Cortex sink
  • allows to configure Nephron's pipeline to output to Cortex
  • allows to configure the pipeline to accumulate flow summaries in-depend of the use of Cortex. In that case other sinks also benefit from a reduced number of generated flow summaries
  • extends the benchmark application to run Nephron's pipeline over different combinations of argument settings
  • add an option (cortexConsideredHosts) that allow to specify if / what samples for aggregations for hosts are written.

@swachter swachter requested review from fooker and j-white July 19, 2021 14:43
@swachter swachter force-pushed the jira/NMS-13373 branch 2 times, most recently from be69d56 to 9bfef59 Compare August 3, 2021 07:30
@swachter swachter marked this pull request as ready for review August 3, 2021 16:33
@swachter
Copy link
Contributor Author

swachter commented Aug 3, 2021

Implemented storage scheme as supposed in NMS-13374.

@j-white
Copy link

j-white commented Aug 12, 2021

Can we include the storage schema described in https://issues.opennms.org/browse/NMS-13374 as part of the README, or documentation in this project?

@swachter
Copy link
Contributor Author

I added a separate persistence.md document. I did not (yet) document the ES and Kafka output, though. Let me know if I should add sections for these, too.

Comment on lines 65 to 63
@Description("Regular expression for selecting hosts that are included in aggregations for hosts and possibly conversations.")
@Default.String("")
String getCortexConsideredHosts();
void setCortexConsideredHosts(String value);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the semantics of this filter? "Only persist entries matching this filter out of the top-k over all hosts" vs "Perists the top-k of the hosts matching this filter"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The filter is applied after "topK" calculation. Results stored in ES are not affected.

(In the meantime I found that there is an IPAddressRange class in OpenNMS. Maybe that should be used to filter hosts instead of a RegEx.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The filter is applied after "topK" calculation. Results stored in ES are not affected.

Makes sense. Follow up question: is this filter matching IPs only or does it match hostnames, too?
Maybe this needs some rewording: including into aggregations sounds like filter before aggregation. Maybe a native speaker could help out?

(In the meantime I found that there is an IPAddressRange class in OpenNMS. Maybe that should be used to filter hosts instead of a RegEx.)

It would. 10.0.1.15 - 10.0.2.12 is super hard to express as an regex.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I switched to using the IPAddressRange machinery of OpenNMS. This now also supports CIDR notation.

The IPAddressRange machinery boils down to using InetAddress.getByName(dottedNotation). Therefore, I think that host names can also be used when specifying the filter.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The IPAddressRange machinery boils down to using InetAddress.getByName(dottedNotation). Therefore, I think that host names can also be used when specifying the filter.

Not exactly hat I was asking: If I specify a hostname in this parameter, will this hostname be used to match the hostnames that nephron is transporting besides the IPs of a flow?

Hostname resolution is something we should avoid (or at least not talk about) as it is done on the wrong system.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The filter checks the IP addresses of flows. Host names in the cortexConsideredHosts argument are resolved on the machine where Nephron is started. You are right, that might not always be what is expected. We can prohibit the use of host names if you prefer.

BTW: The flow classification engine that assigns applications to flows also supports host names in its rules. In that case host names are resolved where the classification engine runs. This may also be surprising.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW: The flow classification engine that assigns applications to flows also supports host names in its rules. In that case host names are resolved where the classification engine runs. This may also be surprising.

Doubt. Maybe the constructor allows this. But it is prevented in the frontend.

We can prohibit the use of host names if you prefer.

This is not about what I prefer but what makes most sense. So, let's have a discussion by exchanging arguments. If there are no arguments and this is just an oversight - go ahead and change it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I copied the corresponding address validation from OpenNMS into Nephron. Host names are no more valid.

main/src/main/java/org/opennms/nephron/CortexOptions.java Outdated Show resolved Hide resolved
@swachter swachter requested a review from fooker August 17, 2021 09:22
cortex/pom.xml Show resolved Hide resolved
main/src/main/java/org/opennms/nephron/Pipeline.java Outdated Show resolved Hide resolved
main/src/main/java/org/opennms/nephron/CortexOptions.java Outdated Show resolved Hide resolved
@swachter swachter requested a review from fooker September 7, 2021 06:45
Copy link
Member

@fooker fooker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make sure that the generated code contains the license header, too.

NMS-13452: GC global state of CortexIo write transform
@swachter swachter merged commit bdd3714 into master Sep 8, 2021
@swachter swachter deleted the jira/NMS-13373 branch September 8, 2021 14:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants