As @reuteran discussed with you during the Hackathon, we want to extend BGPStream to support RPKI Prefix Origin Validation. The first step is validation of live data. Before we submit a PR, we want to clarify two questions.
To store the ROA payload for validated prefixes, we need a data structure that contains all origin ASNs and prefixes of the covering ROAs.
My current solution consists of two structs implementing two nested dynamic arrays that contain origin ASNs and the corresponding prefixes (https://github.com/swp16/bgpstream/blob/master/lib/bgpstream_elem.h#L115).
Since it does not reuse any existing data structure, it needs a few helper functions of its own (https://github.com/swp16/bgpstream/blob/master/lib/bgpstream_elem.c#L426).
My question: do you prefer this solution, or should I extend an existing data structure instead (and if so, which one)?
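For readers who do not want to follow the links, here is a minimal sketch of the nested layout described above: an outer dynamic array of origin ASNs, each owning an inner dynamic array of prefixes. All names (`roa_payload_t`, `roa_origin_t`, `roa_payload_add`, `roa_payload_free`) are hypothetical and are not the identifiers used in the linked branch; prefixes are stored as strings only to keep the example short.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ROA_PFX_STRLEN 64 /* assumed upper bound for a textual prefix */

/* Inner level: one origin ASN and a growable array of its prefixes. */
typedef struct {
  uint32_t asn;
  char (*pfxs)[ROA_PFX_STRLEN];
  size_t pfx_cnt;
  size_t pfx_cap;
} roa_origin_t;

/* Outer level: a growable array of origins; zero-initialize before use. */
typedef struct {
  roa_origin_t *origins;
  size_t origin_cnt;
  size_t origin_cap;
} roa_payload_t;

/* Add a prefix under the given origin ASN, creating the origin entry on
 * first use. Returns 0 on success, -1 on error. If allocating the prefix
 * array fails, a newly created origin may remain with zero prefixes. */
int roa_payload_add(roa_payload_t *p, uint32_t asn, const char *pfx)
{
  roa_origin_t *o = NULL;
  size_t i;

  assert(p != NULL && pfx != NULL);
  if (strlen(pfx) >= ROA_PFX_STRLEN) {
    return -1;
  }

  /* Linear search is fine here: few ROAs cover any single prefix. */
  for (i = 0; i < p->origin_cnt; i++) {
    if (p->origins[i].asn == asn) {
      o = &p->origins[i];
      break;
    }
  }

  if (o == NULL) {
    if (p->origin_cnt == p->origin_cap) {
      size_t cap = p->origin_cap ? p->origin_cap * 2 : 4;
      roa_origin_t *tmp = realloc(p->origins, cap * sizeof(*tmp));
      if (tmp == NULL) {
        return -1;
      }
      p->origins = tmp;
      p->origin_cap = cap;
    }
    o = &p->origins[p->origin_cnt++];
    memset(o, 0, sizeof(*o));
    o->asn = asn;
  }

  if (o->pfx_cnt == o->pfx_cap) {
    size_t cap = o->pfx_cap ? o->pfx_cap * 2 : 4;
    char (*tmp)[ROA_PFX_STRLEN] = realloc(o->pfxs, cap * sizeof(*tmp));
    if (tmp == NULL) {
      return -1;
    }
    o->pfxs = tmp;
    o->pfx_cap = cap;
  }

  strcpy(o->pfxs[o->pfx_cnt++], pfx);
  return 0;
}

/* Release both levels and reset the payload to its empty state. */
void roa_payload_free(roa_payload_t *p)
{
  size_t i;

  assert(p != NULL);
  for (i = 0; i < p->origin_cnt; i++) {
    free(p->origins[i].pfxs);
  }
  free(p->origins);
  memset(p, 0, sizeof(*p));
}
```

The two levels need separate grow and free logic, which is the kind of helper code the question above refers to.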
The current output of BGP Elem Format should be extended to show ROA data. My proposal
where ROA-payload separates multiple ROAs by ";" and splits origin ASN and prefix(es) by ",", e.g.,
What do you think?
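To make the separators concrete, here is a rough sketch of how such a ROA-payload string could be assembled. It assumes each ROA is written as the origin ASN followed by its prefixes, comma-separated, with ";" between ROAs. The type `roa_entry_t`, the function `roa_payload_snprintf`, and the bare-number ASN notation are assumptions for illustration, not part of the existing BGPStream code.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical flat view of one ROA: origin ASN plus its prefixes. */
typedef struct {
  uint32_t asn;
  const char **pfxs;
  size_t pfx_cnt;
} roa_entry_t;

/* Write "asn,pfx[,pfx...];asn,pfx..." into buf.
 * Returns the number of characters written (excluding the terminating
 * NUL), or -1 if the output does not fit into len bytes. */
int roa_payload_snprintf(char *buf, size_t len, const roa_entry_t *roas,
                         size_t roa_cnt)
{
  size_t off = 0;
  size_t i, j;
  int n;

  assert(buf != NULL && len > 0);
  buf[0] = '\0';

  for (i = 0; i < roa_cnt; i++) {
    /* ";" separates ROAs, so it is skipped before the first one. */
    n = snprintf(buf + off, len - off, "%s%" PRIu32, i ? ";" : "",
                 roas[i].asn);
    if (n < 0 || (size_t)n >= len - off) {
      return -1;
    }
    off += (size_t)n;

    /* "," separates the ASN from each of its prefixes. */
    for (j = 0; j < roas[i].pfx_cnt; j++) {
      n = snprintf(buf + off, len - off, ",%s", roas[i].pfxs[j]);
      if (n < 0 || (size_t)n >= len - off) {
        return -1;
      }
      off += (size_t)n;
    }
  }
  return (int)off;
}
```

Under these assumptions, two ROAs for AS64496 (192.0.2.0/24 and 198.51.100.0/24) and AS64497 (203.0.113.0/24) would be rendered as `64496,192.0.2.0/24,198.51.100.0/24;64497,203.0.113.0/24`.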
Ping @alistairking @chiaraorsini?
@waehlisch thanks for the reminder about this, and sorry for the delay in getting back to you.
@salsh thanks for working on this. We really appreciate your time and effort to make BGPStream better.
We've had a chat about how best to integrate this into BGPStream and I have a few comments.
Firstly, let me give you an idea of where we are going at an architectural level. We would like to (by the time this code is part of a BGPStream release) have the concept of an "annotation" for elems. These would be things (like your data) that are not extracted directly from BGP, but are instead computed/derived based on the BGP data (geolocation is another example). We would have a "plugin"-like API to make adding annotation providers (e.g. RTRlib) easy. The annotation API would approximately include functions to:
In this model, annotation providers would be optional both at compile time (and disabled by default if they depend on external libraries), and at run time (disabled unless explicitly enabled before the stream starts).
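As a way to picture the model, here is one possible shape for such a provider interface. This is purely a sketch: every name in it (`annotator_t`, `demo_stream_t`, `demo_elem_t`, and the functions) is made up for illustration, and nothing here reflects an actual or planned BGPStream API. It only demonstrates the run-time rule stated above, namely that a provider stays disabled unless it is explicitly enabled before the stream starts.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for an elem; a real elem carries far more than this. */
typedef struct {
  const char *prefix;
  const char *annotation; /* NULL until a provider sets it */
} demo_elem_t;

/* A provider is a name, an enabled flag, and a callback. */
typedef struct annotator {
  const char *name;
  int enabled; /* disabled (0) by default */
  int (*annotate)(struct annotator *self, demo_elem_t *elem);
} annotator_t;

typedef struct {
  annotator_t *providers;
  size_t cnt;
  int started; /* set once the stream has been started */
} demo_stream_t;

/* Enable a provider by name. Fails (-1) if the stream has already
 * started or if no provider with that name was compiled in. */
int demo_stream_enable_annotator(demo_stream_t *s, const char *name)
{
  size_t i;

  assert(s != NULL && name != NULL);
  if (s->started) {
    return -1;
  }
  for (i = 0; i < s->cnt; i++) {
    if (strcmp(s->providers[i].name, name) == 0) {
      s->providers[i].enabled = 1;
      return 0;
    }
  }
  return -1;
}

/* Run every enabled provider on one elem.
 * Returns the number of providers applied, or -1 if one of them fails. */
int demo_stream_annotate(demo_stream_t *s, demo_elem_t *elem)
{
  size_t i;
  int applied = 0;

  assert(s != NULL && elem != NULL);
  for (i = 0; i < s->cnt; i++) {
    annotator_t *a = &s->providers[i];
    if (!a->enabled) {
      continue;
    }
    if (a->annotate(a, elem) != 0) {
      return -1;
    }
    applied++;
  }
  return applied;
}

/* Placeholder provider: a real one would query a validator here. */
int demo_rpki_annotate(annotator_t *self, demo_elem_t *elem)
{
  (void)self;
  elem->annotation = "placeholder";
  return 0;
}
```

Compile-time optionality is not shown; it could be handled by only adding a provider to the `providers` array when its library was found at configure time.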
If anyone has any comments/suggestions about this model, we'd love to hear them.
We're definitely not asking/requiring you to implement this framework before you submit a pull request. Instead, we'll make a couple of suggestions as to how you could structure your code to make the transition to this framework easier:
As another comment, please carefully consider the names of your data structures. Currently you have bgpstream_elem_valid_asn_t and bgpstream_elem_asn_t. These are really misleading for users, since they don't mention RPKI/RTRlib at all. I suggest something like bgpstream_rtrlib_result_t, or bgpstream_rpki_validation_result_, or some combination of the two (I'm not familiar enough with the RPKI/ROA field to know exactly what the result represents).
Lastly, you asked about the elem output format. I think for now your proposed format is fine. We may consider revising it depending on the exact details of how the annotation framework is implemented.
Actually, the annotation concept sounds good to me!
@alistairking thank you very much for your review and the effort. The annotation approach sounds pretty good. I'm working on your suggestions and will push the changes as soon as possible.
Live RPKI Origin Validation Annotation
- BGPStream will be extended with live RPKI Origin Validation
- All details concerning the provided functions, annotation elements
and output format are described in issue CAIDA/bgpstream#19