Reducing Configuration Representation Boilerplate #346

XAMPPRocky · 2021-07-22T04:51:41Z

Currently most of the boilerplate in writing a filter, is writing a configuration that is statically and dynamically configurable. Right now, you essentially have to write twice, once in Rust and another in Protobuf. I'd like to propose that we try to generate one these representations from the other, so that we can much more easily iterate and add to a filter's configuration. This would come in the form of a tool that is either run separately, or as part of the build process.

Protobuf -> Rust
- The issue with this route is that the protobuf parsers in Rust, aren't at the same level quality as the Rust parser libraries. So we'd have to put more work in the parser than I think would be worth it.
Rust -> Protobuf
- I think this the most realistic both in terms of the implementation as well as flexibility, as we can do a lot more in Rust than we can in Protobuf.

One way we could do this is have a procedural macro for the configuration struct, that generates protobuf files from their type definitions, and then dump the generated definition into the folder, so you'd have to run cargo build after changing a configuration struct to generate the update protobuf, but that seems reasonable and in return we wouldn't have to manually edit protobuf definitions.

// src/lib.rs
quilkin::generate_config_protobufs!("proto/quilkin");

// src/filters/capture_bytes.rs

#[derive(quilkin::ConfigEnum)]
enum Strategy {
    #[quilkin(rename = "PREFIX")]
    /// Looks for the set of bytes at the beginning of the packet
    Prefix = 0,
    #[quilkin(rename = "SUFFIX")]
    /// Look for the set of bytes at the end of the packet
    Suffix = 1,
}

#[derive(quilkin::Config)]
struct Config {
    #[quilkin(default)]
    strategy: Strategy,
    /// the number of bytes to capture
    #[quilkin(rename = "size")]
    size: usize,
    /// the key to use when storing the captured bytes in the filter context
    #[quilkin(rename = "metadataKey")]
    #[quilkin(default = "default_metadata_key")]
    metadata_key: String,
    /// whether or not to remove the set of the bytes from the packet once captured
    #[quilkin(default = "default_remove")]
    remove: bool,
}

markmandel · 2021-07-26T21:03:47Z

I love this idea, because the double definition of models for each config type annoys me no end. 🥳

I'm also on the side of Rust -> Protobuf - the other thing we could do is have some opinionated defaults on how the protobuf gets generated (or maybe even start with not even letting people have the option).

For example, above we have #[quilkin(rename = "metadataKey")] - but if we have strong opinions on how the protobuf should be generated, and we would be doing code reviews to ensure that people align with that standard - we might not want to give them the option to do anything other than convert metadata_key ➡️ metadataKey ... or maybe a better idea is that we could build in linting tools into the macro that would let you write snake_case -- or something like that 😄 .

🤔 I might lean towards this being part of our build.rs than a tool that is manually run, since it will get rebuilt automatically on each compile, and there is less for new developers to do -- but we could also build it into our CI pipeline to run the tool and see if there is any difference in the .proto files (which helps us, but not others that use this as a library 🤔 ).

What are your thoughts on build.rs vs manual tool?

XAMPPRocky · 2021-07-27T07:38:58Z

Thinking on it more, I think it makes sense to run on every build. Also to clarify, I don't think this would run in build.rs, we should run this as a proc macro, as it makes it easier to catch all of the configuration in the crate. Like the above syntax should work as-is with this macro IMO.

the other thing we could do is have some opinionated defaults on how the protobuf gets generated (or maybe even start with not even letting people have the option).

I don't have any opinions on Protobuf style (yet 😄) as this is my first production project using it. But I do think we should try to use or have some convention that we follow and make that the easy path. I don't see a problem with having a rename attribute as an escape hatch if you need it but it should be the uncommon case.

XAMPPRocky · 2021-08-11T09:34:16Z

Working #358 I've realised that this problem is actually a bit bigger than the filter configuration, as there are types we want to use in configurations as well as use in filters such as the Endpoint type. With the addition of #360 now requiring a Go representation of the configuration, as well as #13 requiring a representation that can safely used across a VM/language barrier.

This brings us up to five potential different representations of a type (Rust, Go, Protobuf, YAML/JSON, FFI). Implementing all of those for each type would be a lot of work, and is not really sustainable to maintain. What I'd like to propose is instead that we create a Quilkin data model layer between the types and the formats (similar to how serde works), that essentially represents the same types as JSON/YAML, but we can use this layer to have all types that implement Quilkin's data model have a single implementation and generate the five different representations from this data model. This would also make it easier to replace, add, or remove formats in the future as new formats only need to implement translating from Quilkin's data model to their format.

markmandel · 2021-08-12T19:20:50Z

What I'd like to propose is instead that we create a Quilkin data model layer between the types and the formats (similar to how serde works), that essentially represents the same types as JSON/YAML, but we can use this layer to have all types that implement Quilkin's data model have a single implementation and generate the five different representations from this data model.

To confirm - I'm assuming this would still be a series of macros (much like serde/what is above) that would be applied to our config data models as well as our Filter config, and users that write their own custom models as well.

This makes a lot of sense to me.

Small thing:

This brings us up to five potential different representations of a type (Rust, Go, Protobuf, YAML/JSON, FFI)

I think the Go representation would come from the generated proto, so I don't think we need to concern ourselves with that layer in Quilkin.

@iffyio Would this affect the xDS layer at all? Make that more complicated / easier / the same? I can't think of any concerns, but figured I would ask.

iffyio · 2021-08-13T07:26:19Z

@XAMPPRocky could you give some pseudocode+config examples of what this would look like? I'm not too sure I understand what's being proposed beyond the rust->protobuf side of things unfortunately. My current assumption is that we still specify a config as a struct then the macro generates pretty much all formats or maybe only some formats (if e.g the go would still come from protobuf)? Then where does the quilkin data model layer and translation between layers fit in (is it the macro also generating code that translate between any two layers)? Or is this something else entirely? I don't think I have a picture of it yet unfortunately.

XAMPPRocky · 2021-08-13T11:16:28Z

I think the Go representation would come from the generated proto, so I don't think we need to concern ourselves with that layer in Quilkin.

Looking at #360 there's types here that aren't part of the protobuf that we need to generate.

quilkin/examples/xds-management-server/pkg/resources/resources.go

Lines 23 to 49 in 2787e96

    
           // Endpoint represents an xds endpoint 
        
           type Endpoint struct { 
        
           	IP       string 
        
           	Port     int 
        
           	Metadata map[string]interface{} 
        
           } 
        
           // Cluster represents an xds cluster 
        
           type Cluster struct { 
        
           	Endpoints []Endpoint 
        
           } 
        
           // FilterConfig represents a filter's config 
        
           type FilterConfig struct { 
        
           	Name string 
        
           	Config interface{} 
        
           } 
        
           // Resources represents an xds resource. 
        
           type Resources struct { 
        
           	Clusters map[string]Cluster 
        
           	FilterChain []FilterConfig 
        
           } 
        
           type debugFilterConfig struct { 
        
           	Id string 
        
           }

could you give some pseudocode+config examples of what this would look like?

Sure, the configuration as it stands today should be entirely the same, the only things we're changing is what code is generated, and how we encode/decode to a format like YAML or Protobuf. At compile-time, the macro generates three representations of the config. Protobuf schema, Go, and Rust. The Rust generated code uses a layer between the type and the runtime format, so that we can decouple the types from the format.

I've written a basic example of what this could look like for boolean types, in reality we'd have probably at least the same kind of types as YAML. While this pattern has a bit of boilerplate it's incredibly powerful, and hopefully it shows enough to demonstrate how we can take advantage of this kind of layout going forward. We only implement formats in terms of Quilkin's layer, which is then transformed into the desired format. This design means that now whenever we have any bool like type, we just call serialise_bool on it, and through the traits we now have a single implementation of that type that works with JSON, YAML, and Protobuf. This also means we can add and remove formats as we want without also having to refactor the types in the codebase, you just add or remove the Serializer/Deserializer implementations and they support all types that implement the traits.

// Abstract traits that represent encoding and decoding a type into Quilkin's "data layer".
pub trait Serializer {
    fn serialize_bool(&mut self, value: bool) -> Result<()>;
}

pub trait Deserializer {
    fn deserialize_bool(&mut self) -> Result<bool>;
}

pub trait Serialize {
    fn encode<S>(&self, encoder: &mut S) -> Result<()> where S: Serializer;
}

pub trait Deserialize {
    fn decode<D>(decoder: &mut D) -> Result<Self> where D: Deserializer;
}

// Implementations for types

impl Deserialize for bool {
    fn decode<D>(decoder: &mut D) -> Result<Self> where D: Deserializer {
        decoder.decode_bool()
    }
}

impl Serialize for bool {
    fn encode<S>(&self, encoder: &mut S) -> Result<()> where S: Serializer {
        encoder.encode_bool(self)
    }
}

// Serializer formats

struct JsonEncoder {
    ser: serde_json::Serializer,
}

struct YamlEncoder {
    ser: serde_yaml::Serializer,
}

struct ProtobufEncoder {
    buf: Vec<u8>
}

impl Serializer for Json {
    fn serialize_bool(&mut self, value: bool) -> Result<()> {
         self.ser.encode_bool(value)
    }
}

impl Serializer for Yaml {
    fn serialize_bool(&mut self, value: bool) -> Result<()> {
         self.ser.encode_bool(value)
    }
}

impl Serializer for Protobuf {
    fn serialize_bool(&mut self, value: bool) -> Result<()> {
         prost::Message::encode(value, &mut self.buf)
    }
}

// Deserializer formats

struct JsonDecoder {
    de: serde_json::Deserializer,
}

struct YamlDecoder {
    de: serde_yaml:: Deserializer,
}

struct ProtobufDecoder<'input> {
    buf: &'input mut [u8]
}

impl Deserializer for JsonDecoder {
    fn deserialize_bool(&mut self) -> Result<bool> {
         self.de.decode_bool()
    }
}

impl Deserializer for YamlDecoder {
    fn deserialize_bool(&mut self) -> Result<bool> {
         self.de.decode_bool()
    }
}

impl Deserializer for ProtobufDecoder<'_> {
    fn serialize_bool(&mut self, value: bool) -> Result<()> {
         prost::Message::decode_length_delimited::<bool>(value, self.buf)
    }
}

markmandel · 2021-08-13T19:36:27Z

This definitely makes a lot of sense to me.

On the point of Go types, I'm going to go back to #360 - because either (a) those specific types (like Endpoint), don't changes, they are set to the specifications of xDS or (b) we can generate the Filter specific configs from protobuf - but I think that's a minor point to the overall strategy, which makes perfect sense to me 👍🏻

XAMPPRocky added kind/design Proposal discussing new features / fixes and how they should be implemented area/filters Related to Quilkin's filter API. priority/low Issues that don't need to be addressed in the near term. labels Jul 22, 2021

XAMPPRocky added priority/medium Issues that we want to resolve, but don't require immediate resolution. and removed priority/low Issues that don't need to be addressed in the near term. labels Aug 11, 2021

XAMPPRocky changed the title ~~Reducing Filter Configuration Boilerplate~~ Reducing Configuration Representation Boilerplate Aug 12, 2021

XAMPPRocky added the area/user-experience Pertaining to developers trying to use Quilkin, e.g. cli interface, configuration, etc label Aug 12, 2021

markmandel mentioned this issue Aug 12, 2021

XDS Management Server #360

Merged

3 tasks

XAMPPRocky mentioned this issue Aug 17, 2021

Add wrapper function filter factories #301

Closed

markmandel mentioned this issue Aug 17, 2021

Filter idea: Scriptable filter #13

Open

XAMPPRocky mentioned this issue Nov 8, 2021

Changing dynamic metadata from Any. #433

Closed

XAMPPRocky added this to To do in Kanban Board via automation Apr 20, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Reducing Configuration Representation Boilerplate #346

Reducing Configuration Representation Boilerplate #346

XAMPPRocky commented Jul 22, 2021

markmandel commented Jul 26, 2021 •

edited

XAMPPRocky commented Jul 27, 2021

XAMPPRocky commented Aug 11, 2021 •

edited

markmandel commented Aug 12, 2021

iffyio commented Aug 13, 2021

XAMPPRocky commented Aug 13, 2021 •

edited

markmandel commented Aug 13, 2021

Reducing Configuration Representation Boilerplate #346

Reducing Configuration Representation Boilerplate #346

Comments

XAMPPRocky commented Jul 22, 2021

markmandel commented Jul 26, 2021 • edited

XAMPPRocky commented Jul 27, 2021

XAMPPRocky commented Aug 11, 2021 • edited

markmandel commented Aug 12, 2021

iffyio commented Aug 13, 2021

XAMPPRocky commented Aug 13, 2021 • edited

markmandel commented Aug 13, 2021

markmandel commented Jul 26, 2021 •

edited

XAMPPRocky commented Aug 11, 2021 •

edited

XAMPPRocky commented Aug 13, 2021 •

edited