Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: prepare data structure to fetch prefix len/net mask from BMP #676

Merged
merged 6 commits into from
May 27, 2023

Conversation

margau
Copy link
Contributor

@margau margau commented May 5, 2023

Currently, the NetMask for Source/Destination is not set from the BMP component, but instead directly when parsing the flow.
It might be wanted to set the NetMask as result of the BMP Lookup.

This PR does the following changes:

  • do not set the NetMask fields in protobuf when parsing the flows, but prepare the fields into a flow message
  • add config options where the NetMask should come from
  • overwrite the netmask in the flow during flow enrichment if configured
  • append the netmask to protobuf during final conversion of the FlowMessage struct to protobuf

@vincentbernat
Copy link
Member

It would need to be configurable for people getting this information from multiple sources (BMP and flow), similar to what is done for the AS number.

@margau margau force-pushed the feature/netmask-preparation branch from a9b9e24 to 6b31237 Compare May 5, 2023 14:59
@margau
Copy link
Contributor Author

margau commented May 5, 2023

I've implemented an first draft for that option.
No tests have been implemented yet, I'm also not sure about the 0 net mask as default, because that would indicate a default route being used, which is a valid choice/selection.

@vincentbernat
Copy link
Member

I think 0 is fine to ignore. If the default route matches, it's like we don't really have the prefix length.

@margau margau force-pushed the feature/netmask-preparation branch from 6b31237 to 1a49665 Compare May 8, 2023 06:00
inlet/bmp/lookup.go Outdated Show resolved Hide resolved
@margau margau force-pushed the feature/netmask-preparation branch from 1a49665 to d887f35 Compare May 8, 2023 06:12
@margau margau requested a review from vincentbernat May 8, 2023 08:12
@margau
Copy link
Contributor Author

margau commented May 8, 2023

Tests are there, should be ready from my side.

inlet/bmp/lookup.go Outdated Show resolved Hide resolved
@vincentbernat
Copy link
Member

You don't seem to use data from BMP yet? The current changes look good.

@margau
Copy link
Contributor Author

margau commented May 8, 2023

You don't seem to use data from BMP yet? The current changes look good.

We're currently experimenting with a local bmp setup, which also returns a bmp.LookupResult.

@margau margau force-pushed the feature/netmask-preparation branch from 155cf21 to 17209df Compare May 8, 2023 08:43
@margau margau marked this pull request as draft May 8, 2023 13:35
@margau
Copy link
Contributor Author

margau commented May 8, 2023

I think 0 is fine to ignore. If the default route matches, it's like we don't really have the prefix length.

We've discussed this again, and would prefer if there is a certain way to differ between "we have no route information for this flow" and "this flow was sent out the default route".
What do you think about setting the NetMask by default to a value >= 129, and filter on the query side if we have a prefix (0-128) or the prefix "None" (Mask >=129)?

@vincentbernat
Copy link
Member

If you have a default route, you'll never get the second case, as the BMP component will fallback to the default route. Testing if != 255 is an additional operation ClickHouse will have to do when working with network prefixes, making it slower.

@margau
Copy link
Contributor Author

margau commented May 8, 2023

The BMP component should either report "no prefix" (=255) in this case, or have an LPM result (once implemented). The LPM result could be an default route, e.g. for non-fulltable installations.
But on the other hand, if I have e.g. a sflow message without net mask, this would currently result in an implicit default route, which could be clearly wrong, if there is a more specific that was not transmitted with the router.
The query performance is another topic, but as the query already has an conditional branching, I'd say thats okay.

@vincentbernat
Copy link
Member

The BMP component will fallback to any matching prefix, whatever this is the right router or the right nexthop. As soon as you have a default route somewhere, it will be matched. I don't think this is worth adding a second conditional (a test that would run on all rows for everybody).

@tobikris
Copy link

tobikris commented May 8, 2023

I totally understand your point about the performance impact - this sounds like a problem we do not want to create. However, I do not understand the idea behind enriching the flow with any matching data instead of making sure that the "real"/original details are used. I currently do not have any specific implementation in mind, but it seems worthwhile to try and find one solving the correctness issue at hand while minimizing the performance impact.
Do you agree?

@vincentbernat
Copy link
Member

Is the correctness issue about unknown prefix length or about BMP falling back to any available route? The later is intentional as it seems better to have an info that is likely to be correct (on the paper, routers in the same AS should share the same routing table) than having the info missing. This could be made configurable.

@tobikris
Copy link

tobikris commented May 9, 2023

I think a config option for this behavior would be useful. Even better would be to store the information about falling back in some kind of "uncertainty flag", I think. This would allow to differentiate between strictly correct and hopefully correct data after the recording.

However, I have the impression that this is only related the current MR and does not really solve my concern here. I would rather handle the failing prefix lookup explicitly instead of relying on some other component/behavior to ensure useful data.

Do we have any option where we differentiate between lookup failures and falling back to the default values for the data type?

@margau margau marked this pull request as ready for review May 9, 2023 13:28
@margau
Copy link
Contributor Author

margau commented May 9, 2023

For this MR, I'd propose to continue as it is now, using 0 (implicit default route) as the default/fallback value.

@margau margau requested a review from vincentbernat May 9, 2023 13:29
@vincentbernat
Copy link
Member

No, there is no such mechanism. BGP being a distributed system, strictness seems a goal that wouldn't work anyway. A route may disappear a reappear a few seconds later and only the router would know the exact route used at the moment the flow was generated. The fallback makes it work for most people instead of having an option that makes the setup not work and me handling the support for this.

Copy link
Member

@vincentbernat vincentbernat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Little point in accepting this PR right now: it needs to implement the TODO part.

@margau
Copy link
Contributor Author

margau commented May 16, 2023

Little point in accepting this PR right now: it needs to implement the TODO part.

I have implemented the prefix length/netmask to the BMP component, however, due to the lack of a proper BMP testing setup, this is not fully verified yet.

@margau margau requested a review from vincentbernat May 22, 2023 12:28
@vincentbernat vincentbernat merged commit 694a40a into akvorado:main May 27, 2023
@vincentbernat
Copy link
Member

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants