New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Continues searching if a non-symbol matches a symbol link #901
Continues searching if a non-symbol matches a symbol link #901
Conversation
rdar://126580675
@swift-ci please test |
@swift-ci please test |
if node.identifier == nil { | ||
throw Error.unfindableMatch(node) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this check needed if the identifier
property inside theNode
class is an implicitly unwrapped optional?
fileprivate(set) var identifier: ResolvedIdentifier!
If there's a possibility for this value to be nil, then shouldn't it be an optional value?
All the tests pass without this check, so I'm not clear on whether it's needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code didn't change, I just moved it.
If there's a possibility for this value to be nil, then shouldn't it be an optional value?
Yes, it can be nil
. If you look at this code (PathHierarchy.swift L302-L305) you can see that for the nodes initialized from the symbol graph file, a node is only assigned an identifier if it has a symbol.
The reason why this exists is that the convert service build documentation with a "sparse" symbol graph file that only contains a single symbol. However, this symbol may have path components that represent symbols that aren't in that symbol graph. For example, if the SGF only contains someMethod()
but that symbol is a member of SomeClass
then that path hierarchy will create an "unfindable" node for "SomeClass" and use it to build up a documentation hierarchy. Since there isn't a symbol for that node, the convert service provides information about how that path component should be disambiguated via the knownDisambiguatedPathComponents
argument. Having a node for the "SomeClass" path components enabled the content to write links like SomeClass/someMethod()
or SomeClass-class/someMethod()
despite the path hierarchy not knowing that there is a "SomeClass" symbol.
The reason for Error.unfindableMatch(node)
is that in a convert service build with only one symbol, it's allowed to traverse past "SomeClass" but since SomeClass
doesn't exist in the symbol graph, it's not allowed to resolve a link to only SomeClass
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a comment elsewhere in the path hierarchy implementation that talk about this:
Placeholder values, also called "unfindable elements" or "sparse nodes", are created when constructing a path hierarchy for a "partial" symbol graph file.
When the
ConvertService
builds documentation for a single symbol with multiple path components, the path hierarchy fills in placeholder nodes for the other path components. This ensures that the nodes in the hierarchy are connected and that there's the same number of nodes—with the same names—between the module node and the non-placeholder node as there would be in the full symbol graph.The placeholder nodes can be traversed up and down while resolving a link—to reach a non-placeholder node—but the link will be considered "not found" if it ends at a placeholder node.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, but I agree with @sofiaromorales this is confusing or even dangerous. We need to be 100% sure that the identifier will never be nil before trying to access it... Maybe that's the entire purpose of this function? To insure that no other code that uses PathHierarchy::Node
's will ever see a nil identifier? Are nodes ever returned in any other code paths?
@@ -18,14 +18,7 @@ extension PathHierarchy { | |||
/// - Returns: Returns the unique identifier for the found match or raises an error if no match can be found. | |||
/// - Throws: Raises a ``PathHierarchy/Error`` if no match can be found. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before, it was very easy to state that this method returned a PathHierarchy/Error
. However, now that this method calls another very extensive method with its own inner method calls, it's more difficult to assure that this - Throws: Raises a PathHierarchy/Error if no match can be found.
is always true.
I checked all the other throw statements inside findNode
, and it seems like they all throw a PathHierarchy/Error
. However, since there are multiple nested calls, I find it very easy to mistakenly throw another type of error without noticing this comment.
Any thoughts about this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment still holds true. All errors raised from PathHierarchy/find(path:parent:onlyFindSymbols:)
are PathHierarchy/Error
. In the future when we eventually update to Swift 6 we can adopt typed throws here to have the compiler enforce this.
@@ -305,7 +310,8 @@ extension PathHierarchy { | |||
} | |||
|
|||
// Couldn't resolve the collision by look ahead. | |||
return try handleCollision(node: node, remaining: remaining, collisions: collisions, onlyFindSymbols: onlyFindSymbols, rawPathForError: rawPathForError) | |||
// return try handleCollision(node: node, remaining: remaining, collisions: collisions, onlyFindSymbols: onlyFindSymbols, rawPathForError: rawPathForError) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// return try handleCollision(node: node, remaining: remaining, collisions: collisions, onlyFindSymbols: onlyFindSymbols, rawPathForError: rawPathForError) |
@swift-ci please test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good and I was able to verify the fix works using your instructions. However, reviewing this was a bit tricky since it's not obvious how moving the calls to validateReturnValue
actually fixes the problem. I wasn't able to understand the impact of this change until I discovered in the debugger that we catch the exceptions in various places and continue on with the algorithm.
I wonder if this code could be more easily understood if we changed the return value of searchForNode
to be a new structure that contains the node and maybe some error info, that would only be returned by validateReturnValue
. This would make it less likely that some future developer would accidentally return a node from some new code path they added in searchForNode
. Right now it seems error prone to assume that all possible code paths (and there are many!) from searchForNode
call validateReturnValue
. Adding a new struct could allow the Swift compiler to help us remember to call validateReturnValue
in the future.
What do you think?
I don't quite know what I think... I got curious so I locally rewrote Also, since the current implementation already calls and catches some of the internal errors, the + let children: DisambiguationContainer
+ let pathComponent: PathComponent
+ do {
- let (children, pathComponent) = try findChildContainer(node: &node, remaining: remaining, rawPathForError: rawPathForError)
+ (children, pathComponent) = try findChildContainer(node: &node, remaining: remaining, rawPathForError: rawPathForError)
+ } catch let error as PathHierarchy.Error {
+ return .failure(error)
+ } catch {
+ fatalError("PathHierarchy only throws PathHierarchy.Error")
+ } Or the implementation needs to nest multiple do statements to have certain parts that catch some internal errors do {
// Do work where we want to return if it raises an error
do {
// Do work where we want to handle certain errors specifically
} catch DisambiguationContainer.Error.lookupCollision(let collisions) {
do {
// Do more work where we want to return if it raises an error
}
}
} catch let error as PathHierarchy.Error {
return .failure(error)
} catch {
fatalError("PathHierarchy only throws PathHierarchy.Error")
} The latter can be written like above, where the inner
but that makes the implementation longer and full of One way to solve all this extra internal error handling syntax and make the implementation read better again is to change the the other internal methods to return |
I think if we want to ensure that private func searchForNode(descendingFrom startingPoint: Node, pathComponents: ArraySlice<PathComponent>, onlyFindSymbols: Bool, rawPathForError: String) throws -> Node {
let node = try _searchForNode(descendingFrom: startingPoint, pathComponents: pathComponents, onlyFindSymbols: onlyFindSymbols, rawPathForError: rawPathForError)
/// Ensure that the return value is valid for this search.
if node.identifier == nil {
throw Error.unfindableMatch(node)
}
if onlyFindSymbols, node.symbol == nil {
throw Error.nonSymbolMatchForSymbolLink(path: rawPathForError)
}
return node
}
private func _searchForNode(descendingFrom startingPoint: Node, pathComponents: ArraySlice<PathComponent>, onlyFindSymbols: Bool, rawPathForError: String) throws -> Node {
// original implementation without validateReturnValue goes here.
} If we want to be really, really, sure that no-one accidentally calls private func searchForNode(descendingFrom startingPoint: Node, pathComponents: ArraySlice<PathComponent>, onlyFindSymbols: Bool, rawPathForError: String) throws -> Node {
func _innerImplementation(descendingFrom startingPoint: Node, pathComponents: ArraySlice<PathComponent>, onlyFindSymbols: Bool, rawPathForError: String) throws -> Node {
// original implementation without validateReturnValue goes here.
}
let node = try _innerImplementation(descendingFrom: startingPoint, pathComponents: pathComponents, onlyFindSymbols: onlyFindSymbols, rawPathForError: rawPathForError)
/// Ensure that the return value is valid for this search.
if node.identifier == nil {
throw Error.unfindableMatch(node)
}
if onlyFindSymbols, node.symbol == nil {
throw Error.nonSymbolMatchForSymbolLink(path: rawPathForError)
}
return node
} That gives the most guarantees from the compiler but it may be a new pattern to the reader. However, once the pattern is understood, the inner implementation can be read independently and the reader can know that the extra return value validation will always run. I'll change to code to follow this pattern. |
@swift-ci please test |
1 similar comment
@swift-ci please test |
"a new structure that contains the node and maybe some error info" ha ha I was describing the |
Like I said above, I did rewrite two different versions of the search algorithm that used the |
Yes I saw that you tried rewriting this in a couple of different ways. Thank for looking into this idea! I suppose what I mean by "larger rewrite" would be to look for an algorithm that doesn't use the error handling approach at all, or at least not with exceptions. Maybe what you meant with:
I think what you ended up with here this week is great. |
Bug/issue #, if applicable: rdar://126580675
Summary
This fixes a bug where DocC would stop searching as soon as it found any match while resolving a link, even if that match wasn't valid for the search (for example a non-symbol match for a symbol link). This resulted in "Symbol links can only resolve symbols" errors for some symbol links if there was a closer non-symbol match to the page that contained the link.
Now, DocC will check that the match is valid for the search, and if it's not; DocC will remember the error message from the first match but continue searching to see if there's another valid match.
Dependencies
None
Testing
In a project with some top-level symbol,
Add an article with a heading (or topic section) with the same name as the symbol and also link to that symbol with a symbol link. For example:
Build documentation for the project.
Change the link to a general documentation link (
<doc:SomeClass>
) and build documentation again.Add another heading to the article and link to it using a symbol link
Build documentation for the project again
Checklist
Make sure you check off the following items. If they cannot be completed, provide a reason.
./bin/test
script and it succeededUpdated documentation if necessaryNot applicable