Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

declarative openapi validation rules #129

Closed
clux opened this issue Feb 13, 2020 · 26 comments · Fixed by #647
Closed

declarative openapi validation rules #129

clux opened this issue Feb 13, 2020 · 26 comments · Fixed by #647
Labels
derive kube-derive proc_macro related

Comments

@clux
Copy link
Member

clux commented Feb 13, 2020

It would be really cool if we had a way to create declarative openapi validation rules ala:

e.g. something like:

#[derive(OpenapiValidation, Deserialize, Serialize, Clone)]
pub struct FooSpec {
    #[openapi(MinLength=5)
    name: String,
    info: String,
    #[openapi(Minimum=1, Maximum=10)]
    replicas: i32,
}

The thing this most reminds me of is serde attributes, but this feels a whole lot dumber than that. We just need to generate: { type: object, properties: ... } from a struct, so it's easy to attach onto the the crd json!

The problem with this is that it would necessarily conflict with serde field attributes, which are also frequently used. A dumb proc macro that wasn't aware of serde wouldn't be hard to do, but I imagine it also would cause a lot of duplication between between serde attrs like skip_serializing_if, default with validation rules.

@clux clux added help wanted Not immediately prioritised, please help! question Direction unclear; possibly a bug, possibly could be improved. labels Feb 13, 2020
@clux clux added the api Api abstraction related label Feb 26, 2020
@clux
Copy link
Member Author

clux commented Feb 27, 2020

@clux
Copy link
Member Author

clux commented Mar 1, 2020

This could be linked to #159

@clux
Copy link
Member Author

clux commented Mar 5, 2020

Definitely something to do in a proc_macro.

https://book.kubebuilder.io/reference/markers/crd-validation.html has one api set. Though we may be able to re-use a crate for it.

@clux clux added derive kube-derive proc_macro related and removed api Api abstraction related labels Mar 5, 2020
@clux
Copy link
Member Author

clux commented Mar 22, 2020

We definitely need something like this to make server side apply useful. The apiserver will default all optionals as null when missing from a yaml patch passed to server side apply otherwise.

@clux clux added api Api abstraction related apply patch or server-side apply labels Mar 22, 2020
@praveenperera
Copy link
Contributor

@clux do you think this could be useful: https://docs.rs/rweb/0.5.3/rweb/openapi/index.html

@clux
Copy link
Member Author

clux commented Jul 14, 2020

That looks promising. Seems the related proc-macro module is here: https://github.com/kdy1/rweb/tree/master/macros/src/openapi . From a quick glance it also seems tied to a web framework for customizability, and can't tell if it's openapi v2 or v3. But that said, #[derive(Schema)] would be very nice.

@praveenperera
Copy link
Contributor

praveenperera commented Jul 14, 2020

@clux cool i’m going to look at that more if I get a chance Tomorrow.

I’m relatively new to rust, and haven’t worked with proc macros before. Would it make sense copy those files over and make it work for our needs? Wonder how hard that would be.

@clux
Copy link
Member Author

clux commented Jul 15, 2020

If you're willing to try it out and see if the schema it produces is usable and can be put onto the newly required schemas property, then that certainly would be valuable. Experimentation on that front would be very appreciated.

That said, I know that proc-macros are definitely not an easy entry-point into rust. It's meta code that generates real code (frequently inside other macros like quote!) so it can be a bit confusing to read if you're not already familiar with rust's syntax. But if you feel that you can read and understand it, go for it!

The proc-macro crate we have in this repo is kube-derive which you can try it out with cargo run --examples crd_derive and with cargo-expand you can see what code it generates. Might be a decent starting point to see how it works.

@praveenperera
Copy link
Contributor

praveenperera commented Jul 16, 2020

I've done some experimenting and I've learned it is the OpenAPI v3 spec.

Here is what I got:

Hand written

description: "Issuer"
required:
  - domainName
  - dnsProvider
type: object
properties:
  domainName:
    type: string
  dnsProvider:
    type: object
    properties:
      provider:
        type: string
        enum:
          - digtalocean
          - cloudflare
      key:
        type: string
      secretKey:
        type: string
  secretName:
    type: string
    nullable: true
  namespaces:
    type: array
    items:
      type: string

Generated

    let schema = <CertIssuerSpec as Entity>::describe();
    println!("SCHEMA: \n{}\n", serde_yaml::to_string(&schema)?);
type: object
properties:
  domainName:
    type: string
  dnsProvider:
    type: object
    properties:
      provider:
        enum:
          - DigitalOcean
          - Cloudflare
      key:
        type: string
      secretKey:
        type: string
  secretName:
    type: string
    nullable: true
  namespaces:
    type: array
    items:
      type: string

So its already pretty close

Problems:

  1. #[serde(rename_all = "lowercase")], does not work on Enums, you have to call #[serde(rename = "")] on each field
  2. On enums type: string is not generated, which is required
  3. required field is not generated
  4. Not sure how to add description field

I'm also trying to figure out how to add the required fields in automatically.

Updates

  1. This works for renaming even though rename_all lowercase doesn't
enum DnsProvider {
    #[serde(rename = "digtalocean")]
    DigitalOcean,
    #[serde(rename = "cloudflare")]
    Cloudflare,
}

@clux
Copy link
Member Author

clux commented Jul 16, 2020

Hey this is really promising! Sounds like this is potentially possible to port into kube-derive in a modified form (or have another proc-macro crate/entrypoint defining #[openapi] attrs and invoking that from kube-derive). Thanks a lot for looking into this.

Some quick comments on the problems for now:

  1. it is possible to act an struct-level serde attrs so we can parse #[serde(rename_all = ...)] (i think)
  2. enums might be a pain point we have to caveat anyway given that we can do fancy genrealized enums, and go cannot. But not sure what good defaults are atm (did not find any enum docs on the crd page)
  3. This is probably something we can also generate from the struct by doing a loop over all the fields and looking for nullable equivalent things like Option (or things that have serde(default))
  4. description might need a special attr if we even care for it (looks like it can apply to both fields and the main spec struct?) - probably fine to ignore in the first PoC

@praveenperera
Copy link
Contributor

proc-macro crate defining #[openapi] attrs and invoking that from kube-derive

I think this would be a great approach

  1. This is probably something we can also generate from the struct by doing a loop over all the fields and looking for nullable equivalent things like Option (or things that have serde(default))

I he is doing something like this for the ResponseEntity but I guess he just didn't do it for schema?
https://github.com/kdy1/rweb/blob/master/src/openapi/mod.rs#L287

description might need a special attr if we even care for it (looks like it can apply to both fields and the main spec struct?) - probably fine to ignore in the first PoC

Agreed! Definitely not needed for POC

@clux clux removed api Api abstraction related apply patch or server-side apply question Direction unclear; possibly a bug, possibly could be improved. labels Jul 16, 2020
@clux
Copy link
Member Author

clux commented Jul 16, 2020

Yeah, it looks like there's tons of things the proc macro could fill in by looking at the root Schema definition. (At least that's the crate rweb uses, seems to be a soft fork of https://github.com/softprops/openapi - probably because of softprops/openapi#22)

@praveenperera
Copy link
Contributor

praveenperera commented Jul 31, 2020

Hey a heads up, Kubernetes has some extra rules which means some valid OpenAPIV3 schemas are actually invalid in K8s world. It has to do with structural schemas: https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#specifying-a-structural-schema

Example:

Rust:

#[derive(Serialize, Deserialize, Clone, Debug, Schema)]
#[serde(tag = "provider")]
#[serde(rename_all = "lowercase")]
pub enum DnsProvider {
    DigitalOcean(BasicAuth),
    Cloudflare(BasicAuth),
    Route53(Route53),
}

#[derive(Serialize, Deserialize, Clone, Debug, Schema)]
#[serde(rename_all = "camelCase")]
pub struct BasicAuth {
    key: String,
    secret_key: String,
}

#[derive(Serialize, Deserialize, Clone, Debug, Schema)]
#[serde(rename_all = "camelCase")]
pub struct Route53 {
    access_key: String,
    secret_access_key: String,
    region: String,
    profile: Option<String>,
    hosted_zone_id: Option<String>,
}

rweb produces this, but it is invalid as a structural schema:

dnsProvider:
  oneOf:
    - type: object
      properties:
        key:
          type: string
        secretKey:
          type: string
    - type: object
      properties:
        key:
          type: string
        secretKey:
          type: string
    - type: object
      properties:
        accessKey:
          type: string
        secretAccessKey:
          type: string
        region:
          type: string
        profile:
          type: string
          nullable: true
        hostedZoneId:
          type: string
          nullable: true

Instead it would have to be written something like this:

dnsProvider: 
  type: object
  required: ["provider"]
  properties:
    provider:  { "type": "string" }
    key: { "type": "string" }
    secretKey: { "type": "string" }
    accessKey: { "type": "string" }
    secretAccessKey: { "type": "string" }
    region: { "type": "string" }
    hostedZoneId: { "type": "string" }
  oneOf:
    - properties:
        provider:
          enum: ["digitalocean"]
      required: ["provider", "key", "secretKey"]
    - properties:
        provider:
          enum: ["cloudflare"]
      required: ["provider", "key", "secretKey"]
    - properties:
        provider:
          enum: ["route53"]
      required: ["accessKey", "secretAccessKey", "region"]

@clux
Copy link
Member Author

clux commented Aug 3, 2020

Hey, thanks for figuring this out. That is annoying that it doesn't map straight onto what we need from existing crate, but it sounds like this is possible to override for enums (which look like the offending type?).

I'm wondering how much is required for this work. Technically, we don't even need to even parse anything into openapi schemas we just need to generate json from struct types that matches what kubernetes expects. But it would be good to use valid openapiv3 schemas from the definitional crate (if what kubernetes accepts is within spec, sounds like it still is?) so we can add the tons of #[openapi(...)] field level attrs.

How salvageable do you feel the rweb part is? If you have anything lying around I would love to see a PR!

@kazk
Copy link
Member

kazk commented Dec 22, 2020

Some validations equivalent to kubebuilder's are supported by #348.

For the remaining, I'm hoping we can work with schemars to add support:

One way of specifying validations that #[derive(JsonSchema)] could hook into would be using something like the validator crate's validate attributes.

GREsau/schemars#12 (comment)

Alternatively, I could just add more options to #[schemars(...)] attributes to allow setting more properties on the generated schema

GREsau/schemars#12 (comment)


Supported

  • +kubebuilder:default sets the default value for this field
    • Use #[serde(default)]
  • +kubebuilder:validation:Required specifies that this field is required
    • Non-nullable (not Option) fields without default is required
  • +kubebuilder:validation:Optional specifies that this field is optional
    • Nullable field (Option) or non-nullable with default is optional
  • +kubebuilder:validation:Format specifies additional "complex" formatting for this field.
    • schemars primitives
      • ipv4 (Ipv4Addr)
      • ipv6 (Ipv6Addr)
      • ip (IpAddr)
      • all the numeric types (integers, floats)
    • schemars chrono feature
      • date-time (DateTime<Utc>)
      • date (NaiveDate)
      • partial-date-time (NaiveDateTime, NaiveTime)
    • schemars uuid feature
      • uuid
    • Can implement JsonSchema for types as needed. See how chrono is supported.
  • +kubebuilder:validation:Enum specifies that this (scalar) field is restricted to the exact values specified
    • Use enum with scalar variants. enum Abc { A, B, C }

TODO

Strings

  • +kubebuilder:validation:Pattern specifies that this string must match the given regular expression.
  • +kubebuilder:validation:MinLength specifies the minimum length for this string.
  • +kubebuilder:validation:MaxLength specifies the maximum length for this string.

Numeric

  • +kubebuilder:validation:Minimum specifies the minimum numeric value that this field can have.
  • +kubebuilder:validation:Maximum specifies the maximum numeric value that this field can have.
  • +kubebuilder:validation:ExclusiveMinimum indicates that the minimum is "up to" but not including that value.
  • +kubebuilder:validation:ExclusiveMaximum indicates that the maximum is "up to" but not including that value.
  • +kubebuilder:validation:MultipleOf specifies that this field must have a numeric value that's a multiple of this one.

List

  • +kubebuilder:validation:MinItems specifies the minimun length for this list
  • +kubebuilder:validation:MaxItems specifies the maximum length for this list

Others

  • +kubebuilder:validation:EmbeddedResource marks a fields as an embedded resource with apiVersion, kind and metadata fields.

I just realized we can probably implement JsonSchema for newtype and use that to get any specific set of validations meaningful for that type, but haven't confirmed (confirmed). For example:

  • struct Replicas(i32); and get type: integer, format: int32, minimum: 1, maximum: 10
  • struct Hex6(String); and get type: string, pattern: "^#[0-9a-f]{6}$" etc.

@clux
Copy link
Member Author

clux commented Dec 22, 2020

ooh, manual newtype impls like impl JsonSchema for Replicas would allow users to get the extra validations. That sounds very nice, given that we just passed on JsonSchema as a thing for users to derive anyway. Is it particularly verbose? An example for this would be great!

@kazk

This comment has been minimized.

@kazk
Copy link
Member

kazk commented Dec 24, 2020

We should add some examples showing possible validations (#355).

The above can be simplified with json!:

#[derive(Serialize, Deserialize, Debug, Default, PartialEq, Eq, Clone, Copy)]
struct Replicas(u32);
impl JsonSchema for Replicas {
    // ...
    fn json_schema(_: &mut SchemaGenerator) -> Schema {
        serde_json::from_value(serde_json::json!({
            "type": "integer",
            "minimum": 0,
            "maximum": 10,
        }))
        .unwrap()
    }
}

By the way, k8s.io/kube-openapi/pkg/generators (used by Operator SDK?) supports something very similar:

import openapi "k8s.io/kube-openapi/pkg/common"
type Time struct {
    time.Time
}

func (_ Time) OpenAPIDefinition() openapi.OpenAPIDefinition {
    return openapi.OpenAPIDefinition{
        Schema: spec.Schema{
            SchemaProps: spec.SchemaProps{
                Type:   []string{"string"},
                Format: "date-time",
            },
        },
    }
}

@clux
Copy link
Member Author

clux commented Dec 25, 2020

That looks great. A little verbose, but definitely usable.

If the pattern you've found generalises to most cases, it probably wouldn't be impossible for schemars to add a small helper macro for things like this to condense this:

struct Replicas(u32);
implJsonSchema!(Replicas, ({"type": "integer", "maximum": 10, "minimum": 0})

clux added a commit that referenced this issue Jan 25, 2021
an addon, and illustration of the more manual parts of #129
@clux
Copy link
Member Author

clux commented Jan 25, 2021

A small extra note; if anyone needs to specify the kubernetes specific openapi extensions for deciding merge strategy of structs/maps/lists, this can be done with the same manual impl JsonSchema for your types. See #390 for an example.

@thedodd
Copy link

thedodd commented Feb 4, 2021

I wanted to mention a difficulty I recently ran into while generating a CRD which used an enum similar to the following:

struct MySpec {
    values: Vec<VariantType>,
}

enum VariantType {
    VaraintA(VariantAModel),
    VariantB(VariantBModel),
}
// ... other code elided ...

For brevity I'm eliding the derives and such. There were two issues I ran into with this. The list type generated for the values field was missing some needed type info. And the schemas for the inner variant types were non-structural.

I'm wondering if perhaps we need to use the OpenAPI discriminators for each of the individual schemas and such. Doing so would probably require that we have users follow the #[serde(tag = "...")] pattern on their enum (which we can verify in our derive macros, I've done it before :)), and then we generate the corresponding discriminator yaml block which matches the serde tag being used to describe the discriminant.

Thoughts?

@clux
Copy link
Member Author

clux commented Feb 7, 2021

Thanks for the heads up @thedodd . Have confirmed that the default JsonSchemas for enum variants are not well received by the kubernetes api (see #398). Tried to use a tagged enum as you suggested but did not get it to work. If you have suggestions in the PR it would be very much appreciated!

@kazk
Copy link
Member

kazk commented Feb 12, 2021

Update: In addition to newtype, you can also use #[schemars(schema_with = "func")] to get custom schema.

https://github.com/clux/kube-rs/blob/170a11b6504e174a87ff93410614edfa0ba27fa3/examples/crd_derive_schema.rs#L83-L99

@clux
Copy link
Member Author

clux commented Sep 19, 2021

As per schemars@0.8.4 you can use the validator crate to get schemars to handle #[validate(...)] attributes, so that also closes another big chunk of missing features.

I believe the only couple of things left are structural enums, and possibly a better way to declare x-kubernetes properties (both of which have some workarounds, but the enum workaround leaves the most to be desired). Maybe it's worth closing this issue and make a new issue for missing behaviour, since this issue is really old and filled with a year of history.

@kazk
Copy link
Member

kazk commented Sep 20, 2021

Yeah, we should close this and open separate issues (and add an example using #[validate(...)]).

For structural schema, I'd look into https://github.com/kubernetes/apiextensions-apiserver/tree/release-1.22/pkg/apiserver/schema . Maybe it's possible to transform the schema to structural (https://github.com/kubernetes/apiextensions-apiserver/blob/release-1.22/pkg/apiserver/schema/convert.go), and validate (https://github.com/kubernetes/apiextensions-apiserver/blob/release-1.22/pkg/apiserver/schema/validation.go).

@clux clux removed the help wanted Not immediately prioritised, please help! label Oct 9, 2021
clux added a commit that referenced this issue Oct 9, 2021
Signed-off-by: clux <sszynrae@gmail.com>
@clux clux linked a pull request Oct 9, 2021 that will close this issue
@clux clux closed this as completed in #647 Oct 10, 2021
clux added a commit that referenced this issue Oct 10, 2021
* add example using validator - closes #129

Signed-off-by: clux <sszynrae@gmail.com>

* fix doc link, fmt, and ignore aggressive clippy warns

Signed-off-by: clux <sszynrae@gmail.com>

* account for choices and add warnings

Signed-off-by: clux <sszynrae@gmail.com>

* also validate locally and add comments here

Signed-off-by: clux <sszynrae@gmail.com>
@clux
Copy link
Member Author

clux commented Oct 10, 2021

This Issue is closed, the core support is here now that #129 shows an example of using #[validate]. The last remaining issue (structural enums) is tracked in #648 to avoid the noise. If there are other issues with the kube-derive setup please raise other issues :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
derive kube-derive proc_macro related
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants