Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pascal case operation names #1827

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

david-perez
Copy link
Contributor

And hence pascal case operation error struct names too, like we're
currently doing with operation input and output struct names.

Fixes #1826.

Checklist

  • I have updated CHANGELOG.next.toml if I made changes to the smithy-rs codegen or runtime crates
  • I have updated CHANGELOG.next.toml if I made changes to the AWS SDK, generated SDK code, or SDK runtime crates

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

And hence pascal case operation error struct names too, like we're
currently doing with operation input and output struct names.

Fixes #1826.
@david-perez david-perez requested review from a team as code owners October 7, 2022 13:36
@david-perez
Copy link
Contributor Author

Let's see what the "damage" to the SDK diff is.

Changelog entries missing.

@github-actions
Copy link

github-actions bot commented Oct 7, 2022

A new generated diff is ready to view.

A new doc preview is ready to view.

@david-perez
Copy link
Contributor Author

It looks like the following 34 operations in the AWS Rust SDK are affected:

AddClientIDToOpenIDConnectProvider
CreateOpenIDConnectProvider
CreateSAMLProvider
CreateVirtualMFADevice
DeactivateMFADevice
DeleteOpenIDConnectProvider
DeleteSAMLProvider
DeleteSSHPublicKey
DeleteVirtualMFADevice
EnableMFADevice
EnableVolumeIO
GetOpenIDConnectProvider
GetSAMLProvider
GetSSHPublicKey
ListMFADeviceTags
ListMFADevices
ListOpenIDConnectProviderTags
ListOpenIDConnectProviders
ListSAMLProviderTags
ListSAMLProviders
ListSSHPublicKeys
ListVirtualMFADevices
RemoveClientIDFromOpenIDConnectProvider
ResyncMFADevice
TagMFADevice
TagOpenIDConnectProvider
TagSAMLProvider
UntagMFADevice
UntagOpenIDConnectProvider
UntagSAMLProvider
UpdateOpenIDConnectProviderThumbprint
UpdateSAMLProvider
UpdateSSHPublicKey
UploadSSHPublicKey

Even if the breaking change is large, we have to do something. We're being inconsistent right now. For example, we're currently generating:

in aws-sdk-iam-0.19.0.

There's an argument to be made that we should leave operation names for input/output/error structs as they are in the input model, because it's likely that we would deviate from the official API docs (e.g. the official API documentation for this particular operation is GetSAMLProvider: https://docs.aws.amazon.com/IAM/latest/APIReference/API_GetSAMLProvider.html)

@awslabs/rust-sdk-owners thoughts?

@jdisanti
Copy link
Collaborator

Definitely agree we need consistency here. However, after examining the other SDKs and the SDK documentation, I think we should go in the opposite direction: preserve the operation name rather than convert it to Pascal case. The reason being is the overwhelming majority of the other SDKs take this approach, and maintaining the case from the model will make it consistent with the API reference docs.

What we could do is change the input/output shape creation to use operationName + suffix instead of calling toPascalCase.

Does this work for the server side?

@jdisanti
Copy link
Collaborator

The breaking change will also need to get called out in the SDK changelog.

@david-perez
Copy link
Contributor Author

Agree, we should preserve the case.

@david-perez
Copy link
Contributor Author

@jdisanti

I'm in the middle of implementing this. What is it exactly that what we want? Personally I'd just keep every name as modeled in Smithy, but it's going to be a huge change if so.

  • Enum names (includes union shapes, string shapes with @enum, IDL v2 enum shapes, and combined operation error names e.g. GetSAMLProviderError):
    • Currently: toPascalCase().
    • Want: ?
  • Enum variant names (includes union shapes, string shapes with @enum, IDL v2 enum shapes):
    • Currently: toPascalCase().
    • Want: ?
  • Event stream marshaller/unmarshaller:
    • Currently: toPascalCase().
    • Want: ? (doesn't really matter, since the event_stream_serde module is not part of the public API).
  • Service name (only applies to servers, clients don't render the service name anywhere except in Rustdocs):
    • Currently: toPascalCase().
    • Want: as modeled.
  • Operation struct names:
    • Currently: as modeled.
    • Want: as modeled.
  • Structure names (includes synthetic operation input/output structure shapes and @error structure shapes)
    • Currently: toPascalCase().
    • Want: as modeled.
  • Paginator names:
    • Currently: as modeled.
    • Want: as modeled.

We're lacking tests in many places where we're retaining names as modeled. If you have shape names containing underscores, we trigger non_camel_case_types warnings. I suspect this is the reason why we initially started using toPascalCase() everywhere, since it gets rid of the underscores, inadvertently changing things like GetSAMLProvider to GetSamlProvider, which we now don't want. So, in all cases where we want to keep names as modeled, we should #[allow(non_camel_case_types)].

@jdisanti
Copy link
Collaborator

I think we should use the modeled name for every name where both Smithy models and Rust convention call for PascalCase/PascalACRONYMCase. So given that, the only one we would need to call toPascalCase() on would be the enum variants, since those are all modeled as UPPER_SNAKE_CASE.

We may also want to add a sanity check that we're outputting names in the correct Rust convention. For example, if someone calls their operation foo_bar in the Smithy model, we should correct that to FooBar in codegen. But if they provide the name FOOBar, then we should just keep that as is.

@david-perez
Copy link
Contributor Author

The Smithy spec's ABNF says that shape names are Identifiers:

Identifier =
    IdentifierStart *IdentifierChars

IdentifierStart =
    *"_" ALPHA

IdentifierChars =
    ALPHA / DIGIT / "_"

@enum names in IDL v1 follow the same rules, i.e. the ^[a-zA-Z_]+[a-zA-Z_0-9]*$ regular expression.

Rust identifiers' grammar is a strict superset.

However, Rust API guidelines say types and enum variants should be UpperCamelCase (aka PascalCase).

To convert shape names to type and enum variant names, we can't simply:

  1. uppercase the first character,
  2. uppercase every character after an underscore; and
  3. remove underscores,

as you suggest, as that could introduce conflicts (e.g. distinct __my_shape and _my__shape shapes would both convert to MyShape). Note that we currently have this bug with toPascalCase() too. We need either a smarter renaming strategy, or retain model names and #[allow(non_camel_case_types)].

@jdisanti
Copy link
Collaborator

I think Smithy is opinionated by default on naming conventions, right? You would need to suppress warnings to encounter these edge cases if I recall correctly.

@rcoh
Copy link
Collaborator

rcoh commented Oct 31, 2022

the original failure to rename operations was an oversight (not an intentional choice).

Personally I don't think we should necessarily preserve what's modeled—IIRC it's not totally consistent across all the models and can lead to weird results.

My preference would be:

  1. Move operations to PascalCase. This has the nice property of adhering to Rust's naming guidelines:

In UpperCamelCase, acronyms and contractions of compound words count as one word: use Uuid rather than UUID, Usize rather than USize or Stdin rather than StdIn. In snake_case, acronyms and contractions are lower-cased: is_xid_start.

  1. Add a detector for the _ conflict case. We can either fix it with the reserved words or hard bail.

Personally, I'm not bothered at all by casing differences with the official docs but happy to be convinced

@david-perez
Copy link
Contributor Author

If we're going to rename, the devil is in the details: what renaming strategy do we choose? toPascalCase gives rise to additional conflicts, e.g. two distinct Smithy operations GetSAMLProvider and GetSamlPROVider would both be renamed as GetSamlProvider.

@david-perez
Copy link
Contributor Author

We now lean towards using Smithy identifiers as-is in generated Rust code. However, this is blocked by #2340.

@rcoh rcoh requested review from a team as code owners November 14, 2023 02:21
@jmklix
Copy link
Collaborator

jmklix commented Apr 12, 2024

we've decided to begin closing old PRs that we don't plan to work on in the near future. Please let us know if you still think this PR is needed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Operation input and output struct names are pascal cased, but operation error struct names are not
6 participants