Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replaces invalid enum identifiers with generated ones #103

Merged
merged 1 commit into from
Nov 12, 2022

Conversation

Dav1dde
Copy link
Member

@Dav1dde Dav1dde commented Nov 11, 2022

Related: #92 and #93

Initial draft that simply replaces invalid enum variants with auto generated names.

Things to discuss:

  • What to do with with empty "" variants, we could call them Empty (maybe a separate issue)
  • How do we name generated Variants?
  • Should there be a duplicate check for auto generated variants (what if Variant{i} is a valid variant defined by the CRD (unlikely))

I also wasn't sure where to do the renaming, some of the renaming is done in Container::rename and some is done in main and there is also a little bit of logic in analyze_enum_properties.

This currently also changes integer enum generation:

#[derive(Serialize, Deserialize, Clone, Debug, JsonSchema)]
pub enum HTTPRouteRulesBackendRefsFiltersRequestRedirectStatusCode {
    #[serde(rename = "301")]
    r#_301,
    #[serde(rename = "302")]
    r#_302,
}

But I think int enums need to be generated differently anyways using serde_repr e.g.:

#[derive(Serialize_repr, Deserialize_repr, Clone, Debug, JsonSchema)]
#[repr(u64)]
pub enum HTTPRouteRulesBackendRefsFiltersRequestRedirectStatusCode {
    r#_301 = 301,
    r#_302 = 302,
}

src/output.rs Outdated
Comment on lines 85 to 86
} else if syn::parse_str::<syn::Ident>(&m.name).is_err() {
// name is not a valid rust identifier -> generate one
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this catch keywords? we currently have a hardcoded keyword list in main 🙃

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it does, we need to move the keyword check in here first and handle that separately.

I think before we move on to a full Variant rename, we should see if underscore prefixing in a rawstring is a valid identifier (because that's going to be the majority case). If that's not the case, then we should do Variants (and we should probably do Variants for ALL members at that point because otherwise we'd have to check variant names against regexes and it feels brittle).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this catch keywords? we currently have a hardcoded keyword list in main 🙃

Good point, I'll have to check. Is the plan to move all the renaming then from main.rs into the Container::rename? I think it would be best to have everything in one place.

I think before we move on to a full Variant rename, we should see if underscore prefixing in a rawstring is a valid identifier (because that's going to be the majority case).

Makes sense.

If that's not the case, then we should do Variants (and we should probably do Variants for ALL members at that point because otherwise we'd have to check variant names against regexes and it feels brittle)

I don't understand what you mean here, if one Variant can't be prefixed with r#_ use the generic name for all (even the ones that can be escaped with a raw string)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keywords refactor and Container::rename

Yeah, agreed. if we are starting to cover keyword related things inside output.rs then we should factor all of that out of main if possible.

if one Variant can't be prefixed with r#_ use the generic name for all (even the ones that can be escaped with a raw string)?

Yeah. Because otherwise you could end up with enums like:

enum Blah {
  SomeSmartName,
  SomeOtherName,
  Variant1
  SomeFourthName
}

and devs would likely goto upstream docs only to find that Variant1 does not exist there and get confused. If we renamed ALL of them to Variant{i} then it's a little more clear that these types do not match directly to upstream and are more likely to read the #serde(rename)] attrs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does make the loop a little more convoluted though... Maybe if we just used less generic names (like VariantAlias{i} or KopiumAlias{i} this wouldn't be a problem 🤔

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should just go with Kopium* and assume there wont be any collisions, this require less logic in the generating code and should be safe enough to never have collisions (until we have a Kopium CRD).

@clux
Copy link
Member

clux commented Nov 12, 2022

Needs the DCO applied, but change is otherwise great. Thank you!
The check for enums is good. I will make a proper change to use discriminants as this might serialize the integers as "301"? Anyway, nothing blocking the pr, it was broken after my pr. Will fix before a release.

How do we name generated Variants?

I think this convention here is fine. We can't really guess. If users do not like it, they can use the new --elide flag to let them override the struct.

What to do with with empty "" variants, we could call them Empty (maybe a separate issue)

I think we should try to call it Empty, but we can check against the collected member names (before the loop) to see if any members contain the literal string empty (case-insensitively), and if that's found we could either bail! or choose an ugly name like KopiumEmpty to let users know it's probably time to override.

kind of leaning towards bailing if users have both "" and "empty" as a enum values tbh...

Should there be a duplicate check for auto generated variants (what if Variant{i} is a valid variant defined by the CRD (unlikely))

Similar to the "" case, yeah, we could do some basic checking, but I think we should have an either all or nothing approach to variant renames (either all get Variant{d} renames when any member name is invalid - even after rawstring with underscore prefix, OR none of them get variant renames - only some rawstring prefixing) because otherwise it's very confusing to users to have them intermingle.

@Dav1dde
Copy link
Member Author

Dav1dde commented Nov 12, 2022

I think we should try to call it Empty, but we can check against the collected member names (before the loop) to see if any members contain the literal string empty (case-insensitively), and if that's found we could either bail! or choose an ugly name like KopiumEmpty to let users know it's probably time to override.

We also can just do nothing and generate an enum with 2 Variants as the same name. This will fail to compile and be pretty obvious to fix. Bailing is a bit annoying because then you have to edit the CRD first in order to generate something at all.

@clux
Copy link
Member

clux commented Nov 12, 2022

compile fail on two equal names

Yeah, it's true. We could try to rely on compile failures on the user end, but that means the output of kopium is not super reliable (even when it succeeds) and people will accidentally publish broken specs.

(Maybe down the line I think doing some basic syntax check on the file would be a nice optional extra.)

need to edit the crd if we bail

Yeah, this is a good point. We should probably always generate something, but I think what we generate should be valid. If we can't do that, we should bail.

@Dav1dde
Copy link
Member Author

Dav1dde commented Nov 12, 2022

I pushed a new version with the renaming completely moved from main.rs to output.rs, it also handles keywords via syn::parse_str (it fails on keywords).

It also does not need the special casing via parse::<u64>() anymore, because syn::parse_str also fails on that. The rename order is now {name} -> r#{name} -> r#_{name}, which should always succeed for fields and may fail for enums in which case there is the last fallback to KopiumVariant{i}.

I decided to go with the KopiumVariant namespace to prevent collisions, I think it is valid to assume that there won't be any enum value named KopiumVariant{i} and it is clear that this variant was named/generated by Kopium. The same applies to KopiumEmpty.

src/output.rs Outdated Show resolved Hide resolved
src/output.rs Show resolved Hide resolved
src/output.rs Show resolved Hide resolved
@clux
Copy link
Member

clux commented Nov 12, 2022

I decided to go with the KopiumVariant namespace to prevent collisions, I think it is valid to assume that there won't be any enum value named KopiumVariant{i} and it is clear that this variant was named/generated by Kopium. The same applies to KopiumEmpty.

Yeah, that is a very sensible starting point 👍

Signed-off-by: David Herberth <github@dav1d.de>
@Dav1dde
Copy link
Member Author

Dav1dde commented Nov 12, 2022

Fixed the panic message.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants