Fix filename clashes between resources and functions. #12453
Changelog
[uncommitted] (2023-03-29)
Bug Fixes
c3c4e24 to c3119f5
We should be careful here as it looks like these changes would result in API docs pages living at different URLs, which would result in a whole lot of 404s without a solid redirect strategy.
c3119f5 to 0768ffb
This doesn't match what we discussed in Slack, but precedence-based conflict resolution is also reasonable as long as the conflicts can only be across those boundaries.
However, unless I'm misreading this, we could also see conflicts if two functions both have similar names with different casing: GetName and getName. This will use "fn-getname" for one and "getname" for the other, and I don't know if we have a guarantee that which function gets the prefix is always the same.
pkg/codegen/docs/gen.go
Outdated
    var namePrefixed string
    for _, prefix := range prefixes {
        namePrefixed = prefix + name
        if _, exists := seen[namePrefixed]; exists {
            // Unset namePrefixed because it isn't valid.
            namePrefixed = ""
Style: Setting the variable and then unsetting it like that is a bit off in terms of flow.
How about:

    var found bool
    for _, prefix := range prefixes {
        candidate := prefix + name
        if _, exists := seen[candidate]; exists {
            continue // already taken, try the next prefix
        }
        name = candidate
        found = true
        seen[name] = struct{}{}
        break
    }
    if !found {
        return
    }

The target is set only if the candidate is valid.
(My example doesn't have a namePrefixed because we don't use the original "name" variable again after this point, and the prefixed version effectively replaces it.)
> similar names with different casing: GetName and getName.
I feel like this pattern of naming doesn't exist in practice (and shouldn't exist in practice) because it's not idiomatic in any supported language. I would expect if someone created a provider with these names in the schema, we should encourage them to get a little more creative with their names. I would expect we could issue guidance around naming conventions in the schema to mitigate this concern. Offering guidance feels like something we should do to maintain the ecosystem's health and ensure quality in our providers.
Just my 2c, feel free to ignore me :)
> I feel like this pattern of naming doesn't exist in practice (and shouldn't exist in practice) because it's not idiomatic in any supported language
You'd think so! It's definitely not idiomatic, but it's not unsupported—in that it's not impossible to do. There's nothing stopping one from doing this.
Story time: A service schema system I maintained in the past made the same assumption. It normalized getName to GetName because it was generating Go code. Some users started reporting broken code specifically because they had two fields with different casing for the same name. The reasons why they did that were not important, but a workaround was necessary. (We had the ability for fields to be annotated. We basically added the ability for you to set an annotation on conflicting fields to specify an alternative Go-level name.)
In general, if we're taking user input from one set of possibilities and normalizing it into a smaller set of possibilities, we have to account for conflicts.
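To make the point about normalization concrete, here is a minimal sketch (not code from this PR) that groups names by their lowercased form and reports every group with more than one member. The helper name `findCaseCollisions` and the sample names are made up for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// findCaseCollisions groups names by their lowercased form and returns
// only the groups that contain more than one name.
// Hypothetical helper, not part of pkg/codegen/docs.
func findCaseCollisions(names []string) map[string][]string {
	groups := make(map[string][]string)
	for _, n := range names {
		key := strings.ToLower(n)
		groups[key] = append(groups[key], n)
	}
	for key, members := range groups {
		if len(members) < 2 {
			delete(groups, key) // no conflict for this key
			continue
		}
		sort.Strings(members) // stable output regardless of input order
	}
	return groups
}

func main() {
	collisions := findCaseCollisions([]string{"GetName", "getName", "getZone"})
	for key, members := range collisions {
		fmt.Printf("%q is claimed by %v\n", key, members)
	}
}
```

A check like this could run before any file is written, so the conflict is visible up front rather than discovered from a missing page.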
pkg/codegen/docs/gen.go
Outdated
    if name != "" && namePrefixed == "" {
        return
This will cause a silent failure if by some chance or bug, we cannot get a unique name for something.
We should return an error here, and have the addFile callers fail if an error is returned.
Adding a hard failure on this could change the behavior for downstream users of this such as the registry. I'm not sure if we should modify the behavior in this way.
If we can't fail here, at least a warning is merited.
Silently skipping files will just make these conflicts harder to debug for users.
A bug report that includes the warning in the log output will be easier for us too.
Since it's in codegen, I think it's okay if we fail in the registry. If that happens, the Docs team will open a fix or a P1, and we can resolve that before they upgrade.
A warning might also suffice here; does anyone review the logs during registrygen? I think the answer is no. If it were a manual task, then we'd guarantee eyes on the warning.
Again, just my 2c. Not my hill to die on. :)
Yep, a failure would do too. A warning is slightly extra steps, but it's still more than silence.
Silently ignoring the failure is the only case I'd push back against.
From my understanding, this indeterminate behavior already existed, but this PR disambiguates only across module, resource, and function docs. I think that the number of potential URLs being changed in the registry by this PR could be high and I'm not sure if we want to stack that change in behavior on top of it for the SEO reasons that @cnunciato mentions.
I'm not suggesting we rename all entities. I'm suggesting we increase the scope of this change from the conflict in the original bug report to conflicts in these file names in general.

    seen := make(map[string]struct{})
    nameFor := func(name, kind string) (result string) {
        // Always put the final result into the map.
        defer func() { seen[result] = struct{}{} }()
        if _, ok := seen[name]; !ok {
            return name // no conflict
        }
        name = kind + "-" + name // pkg-foo
        if _, ok := seen[name]; !ok {
            return name
        }
        // Still conflicts.
        // Might have "FooBar" and "Foobar" in the same package.
        i := 2
        for {
            candidate := name + strconv.Itoa(i)
            if _, ok := seen[candidate]; !ok {
                return candidate
            }
            i++
        }
    }

This will guarantee a unique lowercase name in all cases, with no possibility of error (which also makes the other error discussion moot). The last piece for this conflict resolution is deterministic ordering.

    resources := mod.resources
    sort.Slice(resources, func(i, j int) bool {
        return resourceName(resources[i]) < resourceName(resources[j])
    })
    for _, r := range resources {
        // ...

This will guarantee that we always generate files for resources in a fixed order,
I like this change, but it suffers from the potential for a provider upgrade inserting a new conflict and changing the ordering.
I think it's worth adding the log message and postponing this until we're aware of a user running into this issue for same-kind conflicts because this does make decisions around API and registry docs URL naming
I'm happy with postponing that part of the change. By rights, schemas shouldn't even allow functions or types to differ just on case (and my and Daniel's idea for a new naming system would enforce that).
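The ordering concern raised above can be illustrated with a small self-contained sketch. It is not the PR's code: `assignNames` is a hypothetical helper that sorts names, then applies the bare name, kind prefix, numeric suffix scheme discussed in this thread. The sample names are invented.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// assignNames sorts the entity names, then gives each one a unique lowercase
// file name: bare name first, then kind-prefixed, then a numeric suffix.
// Hypothetical helper for illustration only.
func assignNames(kind string, entities []string) map[string]string {
	sorted := append([]string(nil), entities...) // copy so the input is not mutated
	sort.Strings(sorted)

	seen := make(map[string]struct{})
	out := make(map[string]string, len(sorted))
	for _, e := range sorted {
		candidate := strings.ToLower(e)
		if _, taken := seen[candidate]; taken {
			candidate = kind + "-" + candidate
		}
		base := candidate
		for i := 2; ; i++ {
			if _, taken := seen[candidate]; !taken {
				break
			}
			candidate = base + strconv.Itoa(i)
		}
		seen[candidate] = struct{}{}
		out[e] = candidate
	}
	return out
}

func main() {
	before := assignNames("res", []string{"Foobar", "FooBar"})
	after := assignNames("res", []string{"Foobar", "FooBar", "FOObar"})
	for _, e := range []string{"FooBar", "Foobar"} {
		fmt.Printf("%s: %s -> %s\n", e, before[e], after[e])
	}
}
```

In this sketch, adding `FOObar` moves `FooBar` from `foobar` to `res-foobar` and `Foobar` from `res-foobar` to `res-foobar2`. Sorting makes each run deterministic, but it does not keep existing URLs stable across schema changes.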
0768ffb to 68e3774
@@ -0,0 +1,43 @@
{
We should add a conflicting module to this as well, to check that it also works as expected.
look good?
pkg/codegen/testing/test/testdata/docs-collision/docs/_index.md
Outdated
That sounds like a plan.
👍
pkg/codegen/testing/test/testdata/docs-collision/docs/_index.md
Outdated
Naming is hard, so I'd appreciate name suggestions.
I think the approach is good and I just need to work on:
- correctness/writing tests
- tidying up the names
This is a 2000+ line file with globals and a lot of stateful stuff.
752c40d to 4d85a15
The underlying logic is a bit messy, but this change looks fine.
Ok. I've added a comment to better label what
f58ec54 to dcec674
Before we merge, would y'all object to running a diff of the filesystem before and after a docs build with these changes? Just want to be sure we err on the side of caution here; I’d like it to be clear what actually changes as a result of this change, and ideally get redirects in place if it seems warranted. I'm happy to take a swing at that now if you think this is close to being done.
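One way to run that before/after comparison, sketched under the assumption that both builds have been written to separate directories. The tool name `docsdiff` and the helpers `listFiles` and `diffPaths` are invented for this example. It compares paths only, not file contents.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"sort"
)

// listFiles returns the set of non-directory paths under root, relative to root.
func listFiles(root string) (map[string]struct{}, error) {
	files := make(map[string]struct{})
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() {
			return nil
		}
		rel, err := filepath.Rel(root, path)
		if err != nil {
			return err
		}
		files[filepath.ToSlash(rel)] = struct{}{}
		return nil
	})
	return files, err
}

// diffPaths reports paths present only in before (removed) and only in after (added).
func diffPaths(before, after map[string]struct{}) (removed, added []string) {
	for p := range before {
		if _, ok := after[p]; !ok {
			removed = append(removed, p)
		}
	}
	for p := range after {
		if _, ok := before[p]; !ok {
			added = append(added, p)
		}
	}
	sort.Strings(removed)
	sort.Strings(added)
	return removed, added
}

func main() {
	if len(os.Args) != 3 {
		fmt.Fprintln(os.Stderr, "usage: docsdiff <before-dir> <after-dir>")
		os.Exit(2)
	}
	before, err := listFiles(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	after, err := listFiles(os.Args[2])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	removed, added := diffPaths(before, after)
	for _, p := range removed {
		fmt.Println("- " + p) // candidate for a redirect
	}
	for _, p := range added {
		fmt.Println("+ " + p)
	}
}
```

Every removed path is a URL that would start returning 404, so that list is the input for any redirect rules.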
6d1d5c3 to f82aa62
If it helps, here's a draft PR (feel free to push to it if you like) that runs a full build based on a commit hash from this repo: pulumi/docs#8787
f82aa62 to e8d68a7
name. Docs URLs are case-insensitive, so modules, functions, and resources may have name collisions. If a filename is taken by another class of document, it will be prefixed with "mod", "res", or "fn", and docs will point to this new unique link. The priority is as follows: 1. module 2. resource 3. function
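A minimal sketch of the priority rule described above, not the actual gen.go implementation. The `doc` type and `resolve` function are hypothetical, and the `-` separator is assumed from the "fn-getname" example earlier in the thread. Same-kind conflicts are deliberately not handled, matching the scope agreed in this review.

```go
package main

import (
	"fmt"
	"strings"
)

// doc is a hypothetical record for one documentation page.
type doc struct {
	kind string // "mod", "res", or "fn"
	name string
}

// resolve assigns file names in priority order: modules, then resources,
// then functions. A document whose lowercased name is already taken by a
// higher-priority class gets its kind as a prefix.
// Conflicts within the same kind are not handled in this sketch.
func resolve(docs []doc) map[doc]string {
	priority := []string{"mod", "res", "fn"}
	seen := make(map[string]struct{})
	out := make(map[doc]string, len(docs))
	for _, kind := range priority {
		for _, d := range docs {
			if d.kind != kind {
				continue
			}
			name := strings.ToLower(d.name)
			if _, taken := seen[name]; taken {
				name = kind + "-" + name
			}
			seen[name] = struct{}{}
			out[d] = name
		}
	}
	return out
}

func main() {
	docs := []doc{{"fn", "getName"}, {"res", "GetName"}, {"mod", "getname"}}
	names := resolve(docs)
	for _, d := range docs {
		fmt.Printf("%s %s -> %s\n", d.kind, d.name, names[d])
	}
}
```

Because the pass is driven by the priority list rather than input order, which class keeps the bare name does not depend on how the schema happens to be iterated.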
I've talked with @cnunciato about potential docs changes and we don't see anything different. I don't see any significant outstanding work on this and I have an approval so I'm going to merge this PR. Feel free to
bors merge
Build succeeded:
Description
Docs paths are lowercased so functions and resources may have name clashes. This PR adds behavior to docs generation to prefix docs with mod, res, fn if there is a name conflict between documentation types, in that order.
Fixes #10495
Checklist
make changelog and committed the changelog/pending/<file> documenting my change