Live RPKI Origin Validation #19

Open
salsh opened this Issue Apr 19, 2016 · 4 comments


@salsh
Contributor
salsh commented Apr 19, 2016

As @reuteran discussed with you during the Hackathon, we want to extend BGPStream to support RPKI Prefix Origin Validation. The first step is validation of live data. Before we submit a PR, we want to clarify two questions.

Data structure

To store the ROA payload for validated prefixes, we need a data structure that contains all origin ASNs and prefixes of the covering ROAs.

My current solution consists of two structs that implement two nested dynamic arrays: the outer one holds the origin ASNs and the inner one holds the prefixes for each ASN (https://github.com/swp16/bgpstream/blob/master/lib/bgpstream_elem.h#L115).
Since it does not reuse existing data structures, some helper functions are necessary (https://github.com/swp16/bgpstream/blob/master/lib/bgpstream_elem.c#L426).

My question is whether you prefer this solution or whether I should extend an existing data structure (and if so, which one).
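For readers without the linked files at hand, here is a minimal sketch of what two nested dynamic arrays for ROA payload could look like. The type and function names (`roa_entry_t`, `roa_payload_t`, `roa_payload_add`, `roa_payload_clear`) are hypothetical and are not the ones in the linked repository. Prefixes are kept as strings only to keep the example short; real code would presumably use BGPStream's own prefix type.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names for illustration; not the structs from the linked code. */
typedef struct roa_entry {
  uint32_t origin_asn; /* origin ASN taken from the covering ROA(s) */
  char **prefixes;     /* inner dynamic array: prefixes for this ASN */
  size_t prefix_cnt;
} roa_entry_t;

typedef struct roa_payload {
  roa_entry_t *entries; /* outer dynamic array: one entry per origin ASN */
  size_t entry_cnt;
} roa_payload_t;

/* Return the entry for asn, appending a new one if needed (NULL on OOM). */
static roa_entry_t *roa_payload_get_entry(roa_payload_t *p, uint32_t asn)
{
  for (size_t i = 0; i < p->entry_cnt; i++) {
    if (p->entries[i].origin_asn == asn) {
      return &p->entries[i];
    }
  }
  roa_entry_t *tmp = realloc(p->entries, (p->entry_cnt + 1) * sizeof(*tmp));
  if (tmp == NULL) {
    return NULL;
  }
  p->entries = tmp;
  roa_entry_t *e = &p->entries[p->entry_cnt++];
  e->origin_asn = asn;
  e->prefixes = NULL;
  e->prefix_cnt = 0;
  return e;
}

/* Record that a ROA with origin asn covers prefix. Returns 0, or -1 on OOM. */
int roa_payload_add(roa_payload_t *p, uint32_t asn, const char *prefix)
{
  roa_entry_t *e = roa_payload_get_entry(p, asn);
  if (e == NULL) {
    return -1;
  }
  size_t len = strlen(prefix) + 1;
  char *copy = malloc(len);
  if (copy == NULL) {
    return -1;
  }
  memcpy(copy, prefix, len);
  char **tmp = realloc(e->prefixes, (e->prefix_cnt + 1) * sizeof(*tmp));
  if (tmp == NULL) {
    free(copy);
    return -1;
  }
  e->prefixes = tmp;
  e->prefixes[e->prefix_cnt++] = copy;
  return 0;
}

/* Free everything owned by the payload and reset it to the empty state. */
void roa_payload_clear(roa_payload_t *p)
{
  for (size_t i = 0; i < p->entry_cnt; i++) {
    for (size_t j = 0; j < p->entries[i].prefix_cnt; j++) {
      free(p->entries[i].prefixes[j]);
    }
    free(p->entries[i].prefixes);
  }
  free(p->entries);
  p->entries = NULL;
  p->entry_cnt = 0;
}
```

A payload starts out zero-initialized (`roa_payload_t p = {NULL, 0};`), is filled with `roa_payload_add`, and is released with `roa_payload_clear`.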

Output format

The current output of the BGP Elem format should be extended to show ROA data. My proposal:

<dump-type>|<elem-type>|<record-ts>|<project>|<collector>|<peer-ASn>|<peer-IP>|<prefix>|<next-hop-IP>|<AS-path>|<origin-AS>|<communities>|<old-state>|<new-state>|<ROA-payload>

where ROA-payload separates multiple ROAs with ";", separates the origin ASN from its prefix(es) with ",", and separates multiple prefixes of the same ASN with a space, e.g.,

  • ASN1,PFX1.1 PFX1.2 PFX1.3;ASN2,PFX2.1 PFX2.2 PFX2.3
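As an illustration of the proposed field, the following sketch serializes ROA data into that layout. The `roa_group_t` type and the `roa_payload_snprintf` function are hypothetical names made up for this example, and the ASNs and prefixes in the usage note are documentation values, not real ROA data.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical input type: one origin ASN plus the prefixes it may originate. */
typedef struct roa_group {
  uint32_t origin_asn;
  const char *const *prefixes;
  size_t prefix_cnt;
} roa_group_t;

/* Write "ASN1,PFX1.1 PFX1.2;ASN2,PFX2.1" into buf.
 * Returns the number of characters written (excluding the terminating NUL),
 * or -1 if buf is too small. */
int roa_payload_snprintf(char *buf, size_t len, const roa_group_t *groups,
                         size_t group_cnt)
{
  size_t off = 0;
  if (len == 0) {
    return -1;
  }
  buf[0] = '\0';
  for (size_t i = 0; i < group_cnt; i++) {
    /* ";" separates ROAs, so it precedes every group except the first */
    int n = snprintf(buf + off, len - off, "%s%" PRIu32, i > 0 ? ";" : "",
                     groups[i].origin_asn);
    if (n < 0 || (size_t)n >= len - off) {
      return -1;
    }
    off += (size_t)n;
    for (size_t j = 0; j < groups[i].prefix_cnt; j++) {
      /* "," separates the ASN from its first prefix, " " separates prefixes */
      n = snprintf(buf + off, len - off, "%c%s", j > 0 ? ' ' : ',',
                   groups[i].prefixes[j]);
      if (n < 0 || (size_t)n >= len - off) {
        return -1;
      }
      off += (size_t)n;
    }
  }
  return (int)off;
}
```

With AS64496 covering 192.0.2.0/24 and 198.51.100.0/24, and AS64497 covering 203.0.113.0/24, the function produces `64496,192.0.2.0/24 198.51.100.0/24;64497,203.0.113.0/24`, which would then be appended to the elem line after the final "|".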

What do you think?

@alistairking
Member

@waehlisch thanks for the reminder about this, and sorry for the delay in getting back to you.
@salsh thanks for working on this. We really appreciate your time and effort to make BGPStream better.

We've had a chat about how best to integrate this into BGPStream and I have a few comments.

Background
Firstly, let me give you an idea of where we are going at an architectural level. We would like to (by the time this code is part of a BGPStream release) have the concept of an "annotation" for elems. These would be things (like your data) that are not extracted directly from BGP, but are instead computed/derived based on the BGP data (geolocation is another example). We would have a "plugin"-like API to make adding annotation providers (e.g. RTRlib) easy. The annotation API would approximately include functions to:

  • initialize an annotation provider (passing any needed config information). This function would be called once before starting the stream. It would be something like: bgpstream_enable_annotation_provider(BGPSTREAM_ANNOTATION_PROVIDER_RTRLIB, "<options>")
  • request the result of an annotation given an elem. This would be an accessor on the elem. Something like: bgpstream_elem_get_annotation(BGPSTREAM_ANNOTATION_PROVIDER_RTRLIB). Internally, the annotation framework would check a cache in the elem (encapsulated in an opaque structure) to see if the annotation has already been computed. If it has, the cached value would be returned directly; if not, the annotation provider plugin would be asked to compute the annotation. The advantage of this design is that annotations are only computed when needed, but results are cached once they are computed.

In this model, annotation providers would be optional both at compile time (and disabled by default if they depend on external libraries), and at run time (disabled unless explicitly enabled before the stream starts).
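To make the model concrete, here is a rough sketch of a provider table with run-time enabling and a per-elem cache. None of this is existing BGPStream API: every name (`demo_elem_t`, `demo_enable_annotation_provider`, `demo_elem_get_annotation`, and so on) is hypothetical, and the compute callback is passed in explicitly only so the example is self-contained; in the design described above the provider would be compiled in and selected by its ID.

```c
#include <stddef.h>

/* All names below are hypothetical; this is not BGPStream API. */
typedef enum {
  DEMO_PROVIDER_RTRLIB = 0,
  DEMO_PROVIDER_CNT
} demo_provider_id_t;

typedef struct demo_elem demo_elem_t;

/* A provider computes an annotation for an elem from its configuration. */
typedef void *(*demo_compute_fn)(const demo_elem_t *elem, const char *options);

typedef struct demo_provider {
  demo_compute_fn compute;
  const char *options;
  int enabled; /* run-time switch: providers are disabled by default */
} demo_provider_t;

struct demo_elem {
  unsigned int origin_asn;         /* stand-in for the real BGP fields */
  void *cache[DEMO_PROVIDER_CNT];  /* cached annotation per provider */
  int cached[DEMO_PROVIDER_CNT];   /* 1 once the annotation was computed */
};

static demo_provider_t demo_providers[DEMO_PROVIDER_CNT];

/* Called once before the stream starts. Returns 0 on success, -1 on error. */
int demo_enable_annotation_provider(demo_provider_id_t id,
                                    demo_compute_fn compute,
                                    const char *options)
{
  if ((unsigned int)id >= DEMO_PROVIDER_CNT || compute == NULL) {
    return -1;
  }
  demo_providers[id].compute = compute;
  demo_providers[id].options = options;
  demo_providers[id].enabled = 1;
  return 0;
}

/* Accessor on the elem: compute on first use, serve from the cache after. */
void *demo_elem_get_annotation(demo_elem_t *elem, demo_provider_id_t id)
{
  if ((unsigned int)id >= DEMO_PROVIDER_CNT || !demo_providers[id].enabled) {
    return NULL; /* provider not enabled: no annotation available */
  }
  if (!elem->cached[id]) {
    elem->cache[id] =
      demo_providers[id].compute(elem, demo_providers[id].options);
    elem->cached[id] = 1;
  }
  return elem->cache[id];
}

/* Stand-in provider used only to exercise the sketch; it counts its calls. */
int demo_compute_calls = 0;

void *demo_compute(const demo_elem_t *elem, const char *options)
{
  (void)elem;
  (void)options;
  demo_compute_calls++;
  return (void *)"demo-annotation";
}
```

Calling `demo_elem_get_annotation` twice on the same elem invokes the provider only once, and an elem that nobody asks about never pays for the computation.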

If anyone has any comments/suggestions about this model, we'd love to hear them.

Reality
We're definitely not asking/requiring you to implement this framework before you submit a pull request. Instead, we'll make a couple of suggestions as to how you could structure your code to make the transition to this framework easier:

  1. Please create an additional bgpstream_elem_annotations_t structure in bgpstream_elem.h, and add an instance of it to the bgpstream_elem_t structure.
  2. Please move your "result" reference from the elem structure to the new annotations structure.
  3. Please move the functions that you created for manipulating your result objects into an appropriate class in utils (think of your result object as being similar to a bgpstream_as_path_t, or bgpstream_community_set_t).
  4. [Bonus points] Instead of populating the result value automatically for every elem, you may want to add an accessor function (e.g. bgpstream_elem_get_XXX) which does the validation unless it is already complete.
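A compact sketch of how suggestions 1, 2 and 4 could fit together follows. It uses `demo_` stand-ins rather than the real `bgpstream_elem_t`, the validation itself is a toy placeholder instead of an RTRlib query, and all names are assumptions made for illustration.

```c
#include <stdlib.h>

/* Hypothetical result type; per suggestion 3 it would live in utils. */
typedef enum {
  DEMO_RPKI_NOT_FOUND,
  DEMO_RPKI_VALID,
  DEMO_RPKI_INVALID
} demo_rpki_status_t;

typedef struct demo_rpki_result {
  demo_rpki_status_t status;
} demo_rpki_result_t;

/* Suggestion 1: a dedicated annotations structure... */
typedef struct demo_elem_annotations {
  /* Suggestion 2: ...that owns the validation result (NULL = not yet run) */
  demo_rpki_result_t *rpki_result;
} demo_elem_annotations_t;

/* ...with one instance embedded in the elem. */
typedef struct demo_elem {
  unsigned int origin_asn; /* stand-in for the real elem fields */
  demo_elem_annotations_t annotations;
} demo_elem_t;

int demo_validation_runs = 0;

/* Toy placeholder for the real validation: only AS64496 counts as valid. */
static demo_rpki_result_t *demo_validate(const demo_elem_t *elem)
{
  demo_rpki_result_t *r = malloc(sizeof(*r));
  if (r == NULL) {
    return NULL;
  }
  demo_validation_runs++;
  r->status =
    (elem->origin_asn == 64496) ? DEMO_RPKI_VALID : DEMO_RPKI_NOT_FOUND;
  return r;
}

/* Suggestion 4: validate on first access instead of for every elem. */
const demo_rpki_result_t *demo_elem_get_rpki_result(demo_elem_t *elem)
{
  if (elem->annotations.rpki_result == NULL) {
    elem->annotations.rpki_result = demo_validate(elem);
  }
  return elem->annotations.rpki_result;
}

/* Must run whenever the elem is cleared or reused, so that a stale result
 * from a previous elem is never returned. */
void demo_elem_clear_annotations(demo_elem_t *elem)
{
  free(elem->annotations.rpki_result);
  elem->annotations.rpki_result = NULL;
}
```

Keeping the result behind the annotations structure means that only the accessor and the clear function would need to change if a generic annotation framework later replaces the direct pointer.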

As another comment, please carefully consider the names of your data structures. Currently you have bgpstream_elem_valid_asn_t and bgpstream_elem_asn_t. These are really misleading for users (they don't mention RPKI/RTRlib at all). I suggest something like bgpstream_rtrlib_result_t, or bgpstream_rpki_validation_result_, or some combination of the two (I'm not familiar enough with the RPKI/ROA field to know exactly what the result represents).

Lastly, you asked about the elem output format. I think for now your proposed format is fine. We may consider revising it depending on the exact details of how the annotation framework is implemented.

@waehlisch

Actually, the annotation concept sounds good to me!

@salsh
Contributor
salsh commented May 4, 2016

@alistairking thank you very much for your review and your effort. The annotation approach sounds pretty good. I'm working on your suggestions and will push the changes as soon as possible.

@salsh salsh added a commit to swp16-final/bgpstream that referenced this issue Jul 11, 2016
@salsh salsh Live RPKI Origin Validation Annotation
- The BGPStream will be extended by Live RPKI Origin Validation
Annotation

- All details concerning the provided functions, annotation elements
and output format are described in issue CAIDA/bgpstream#19
d9ad3cc