Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Continues searching if a non-symbol matches a symbol link #901

Merged
merged 4 commits into from
May 3, 2024
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -46,7 +46,7 @@ extension PathHierarchy {
///
/// Includes information about:
/// - The path to the non-symbol match.
case nonSymbolMatchForSymbolLink(path: Substring)
case nonSymbolMatchForSymbolLink(path: String)

/// Encountered an unknown disambiguation for a found node.
///
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -18,14 +18,7 @@ extension PathHierarchy {
/// - Returns: Returns the unique identifier for the found match or raises an error if no match can be found.
/// - Throws: Raises a ``PathHierarchy/Error`` if no match can be found.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before, it was very easy to state that this method returned a PathHierarchy/Error. However, now that this method calls another very extensive method with its own inner method calls, it's more difficult to assure that this - Throws: Raises a PathHierarchy/Error if no match can be found. is always true.

I checked all the other throw statements inside findNode, and it seems like they all throw a PathHierarchy/Error. However, since there are multiple nested calls, I find it very easy to mistakenly throw another type of error without noticing this comment.

Any thoughts about this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment still holds true. All errors raised from PathHierarchy/find(path:parent:onlyFindSymbols:) are PathHierarchy/Error. In the future when we eventually update to Swift 6 we can adopt typed throws here to have the compiler enforce this.

func find(path rawPath: String, parent: ResolvedIdentifier? = nil, onlyFindSymbols: Bool) throws -> ResolvedIdentifier {
let node = try findNode(path: rawPath, parentID: parent, onlyFindSymbols: onlyFindSymbols)
if node.identifier == nil {
throw Error.unfindableMatch(node)
}
if onlyFindSymbols, node.symbol == nil {
throw Error.nonSymbolMatchForSymbolLink(path: rawPath[...])
}
return node.identifier
return try findNode(path: rawPath, parentID: parent, onlyFindSymbols: onlyFindSymbols).identifier
}

private func findNode(path rawPath: String, parentID: ResolvedIdentifier?, onlyFindSymbols: Bool) throws -> Node {
Expand Down Expand Up @@ -219,10 +212,21 @@ extension PathHierarchy {
var node = startingPoint
var remaining = pathComponents[...]

/// Ensure that the return value is valid for this search.
func validateReturnValue(_ node: Node) throws -> Node {
if node.identifier == nil {
throw Error.unfindableMatch(node)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this check needed if the identifier property inside theNode class is an implicitly unwrapped optional?

fileprivate(set) var identifier: ResolvedIdentifier!

If there's a possibility for this value to be nil, then shouldn't it be an optional value?

All the tests pass without this check, so I'm not clear on whether it's needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code didn't change, I just moved it.

If there's a possibility for this value to be nil, then shouldn't it be an optional value?

Yes, it can be nil. If you look at this code (PathHierarchy.swift L302-L305) you can see that for the nodes initialized from the symbol graph file, a node is only assigned an identifier if it has a symbol.

The reason why this exists is that the convert service build documentation with a "sparse" symbol graph file that only contains a single symbol. However, this symbol may have path components that represent symbols that aren't in that symbol graph. For example, if the SGF only contains someMethod() but that symbol is a member of SomeClass then that path hierarchy will create an "unfindable" node for "SomeClass" and use it to build up a documentation hierarchy. Since there isn't a symbol for that node, the convert service provides information about how that path component should be disambiguated via the knownDisambiguatedPathComponents argument. Having a node for the "SomeClass" path components enabled the content to write links like SomeClass/someMethod() or SomeClass-class/someMethod() despite the path hierarchy not knowing that there is a "SomeClass" symbol.

The reason for Error.unfindableMatch(node) is that in a convert service build with only one symbol, it's allowed to traverse past "SomeClass" but since SomeClass doesn't exist in the symbol graph, it's not allowed to resolve a link to only SomeClass.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a comment elsewhere in the path hierarchy implementation that talk about this:

Placeholder values, also called "unfindable elements" or "sparse nodes", are created when constructing a path hierarchy for a "partial" symbol graph file.

When the ConvertService builds documentation for a single symbol with multiple path components, the path hierarchy fills in placeholder nodes for the other path components. This ensures that the nodes in the hierarchy are connected and that there's the same number of nodes—with the same names—between the module node and the non-placeholder node as there would be in the full symbol graph.

The placeholder nodes can be traversed up and down while resolving a link—to reach a non-placeholder node—but the link will be considered "not found" if it ends at a placeholder node.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, but I agree with @sofiaromorales this is confusing or even dangerous. We need to be 100% sure that the identifier will never be nil before trying to access it... Maybe that's the entire purpose of this function? To insure that no other code that uses PathHierarchy::Node's will ever see a nil identifier? Are nodes ever returned in any other code paths?

if onlyFindSymbols, node.symbol == nil {
throw Error.nonSymbolMatchForSymbolLink(path: rawPathForError)
}
return node
}

// Third, search for the match relative to the start node.
if remaining.isEmpty {
// If all path components were consumed, then the start of the search is the match.
return node
return try validateReturnValue(node)
}

// Search for the remaining components from the node
Expand All @@ -238,11 +242,12 @@ extension PathHierarchy {
remaining = remaining.dropFirst()
if remaining.isEmpty {
// If all path components are consumed, then the match is found.
return child
return try validateReturnValue(child)
}
} catch DisambiguationContainer.Error.lookupCollision(let collisions) {
func handleWrappedCollision() throws -> Node {
try handleCollision(node: node, remaining: remaining, collisions: collisions, onlyFindSymbols: onlyFindSymbols, rawPathForError: rawPathForError)
let match = try handleCollision(node: node, remaining: remaining, collisions: collisions, onlyFindSymbols: onlyFindSymbols, rawPathForError: rawPathForError)
return try validateReturnValue(match)
}

// When there's a collision, use the remaining path components to try and narrow down the possible collisions.
Expand Down Expand Up @@ -270,13 +275,13 @@ extension PathHierarchy {
}
}
// A wrapped error would have been raised while iterating over the collection.
return uniqueCollisions.first!.value
return try validateReturnValue(uniqueCollisions.first!.value)
}

// Look ahead one path component to narrow down the list of collisions.
// For each collision where the next path component can be found unambiguously, return that matching node one level down.
let possibleMatchesOneLevelDown = collisions.compactMap {
return try? $0.node.children[String(nextPathComponent.name)]?.find(nextPathComponent.disambiguation)
try? $0.node.children[String(nextPathComponent.name)]?.find(nextPathComponent.disambiguation)
}
let onlyPossibleMatch: Node?

Expand All @@ -296,7 +301,7 @@ extension PathHierarchy {
remaining = remaining.dropFirst(2)
if remaining.isEmpty {
// If that was the end of the path we can simply return the result.
return onlyPossibleMatch
return try validateReturnValue(onlyPossibleMatch)
} else {
// Otherwise we continue looping over the remaining path components.
node = onlyPossibleMatch
Expand All @@ -305,7 +310,7 @@ extension PathHierarchy {
}

// Couldn't resolve the collision by look ahead.
return try handleCollision(node: node, remaining: remaining, collisions: collisions, onlyFindSymbols: onlyFindSymbols, rawPathForError: rawPathForError)
return try handleWrappedCollision()
}
}
}
Expand Down
43 changes: 43 additions & 0 deletions Tests/SwiftDocCTests/Infrastructure/PathHierarchyTests.swift
Original file line number Diff line number Diff line change
Expand Up @@ -1454,6 +1454,49 @@ class PathHierarchyTests: XCTestCase {
try assertFindsPath("Inner/InnerClass/something()", in: tree, asSymbolID: "s:5Inner0A5ClassC5OuterE9somethingyyF")
}

func testContinuesSearchingIfNonSymbolMatchesSymbolLink() throws {
let exampleDocumentation = Folder(name: "CatalogName.docc", content: [
JSONFile(name: "ModuleName.symbols.json", content: makeSymbolGraph(moduleName: "ModuleName", symbols: [
("some-class-id", .swift, ["SomeClass"], .class),
])),

TextFile(name: "Some-Article.md", utf8Content: """
# Some article

An article with a heading with the same name as a symbol and another heading.

### SomeClass

- ``SomeClass``

### OtherHeading
"""),
])
let catalogURL = try exampleDocumentation.write(inside: createTemporaryDirectory())
let (_, _, context) = try loadBundle(from: catalogURL)
let tree = context.linkResolver.localResolver.pathHierarchy

XCTAssert(context.problems.isEmpty, "Unexpected problems \(context.problems.map(\.diagnostic.summary))")

let articleID = try tree.find(path: "/CatalogName/Some-Article", onlyFindSymbols: false)

XCTAssertEqual(try tree.findNode(path: "SomeClass", onlyFindSymbols: false, parent: articleID).symbol?.identifier.precise, nil,
"A general documentation link will find the heading because its closer than the symbol.")
XCTAssertEqual(try tree.findNode(path: "SomeClass", onlyFindSymbols: true, parent: articleID).symbol?.identifier.precise, "some-class-id",
"A symbol link will skip the heading and continue searching until it finds the symbol.")

XCTAssertEqual(try tree.findNode(path: "OtherHeading", onlyFindSymbols: false, parent: articleID).symbol?.identifier.precise, nil,
"A general documentation link will find the other heading.")
XCTAssertThrowsError(
try tree.findNode(path: "OtherHeading", onlyFindSymbols: true, parent: articleID),
"A symbol link that find the header but doesn't find a symbol will raise the error about the heading not being a symbol"
) { untypedError in
let error = untypedError as! PathHierarchy.Error
let referenceError = error.makeTopicReferenceResolutionErrorInfo() { context.linkResolver.localResolver.fullName(of: $0, in: context) }
XCTAssertEqual(referenceError.message, "Symbol links can only resolve symbols")
}
}

func testSnippets() throws {
let (_, context) = try testBundleAndContext(named: "Snippets")
let tree = context.linkResolver.localResolver.pathHierarchy
Expand Down