@d-ronnqvist
Contributor

Bug/issue #, if applicable: rdar://163326857

Summary

This is the first slice broken out of #1366.

Follow-up PRs will incrementally add the other changes that are already present in #1366.


This adds a new internal library target for rendering Markdown content into static HTML.

Unlike a generic Markdown-to-HTML renderer, which might be a good fit for Swift Markdown itself, this renderer has a LinkProvider that it can query for information about linked pages (such as their title and type of page). It uses that information to render links and symbol links.
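As a rough illustration of the idea (not the PR's actual API), the sketch below shows a renderer asking a provider for page information before it renders a symbol link. `LinkedPage`, `PageLinkProvider`, `DictionaryLinkProvider`, `renderSymbolLink`, and the `"example-usr"` identifier are all hypothetical names made up for this example. It builds plain strings without HTML escaping to stay short.

```swift
import Foundation

// Hypothetical, simplified shapes. The real LinkProvider has different members.
struct LinkedPage {
    var path: URL
    var title: String
    var kind: String  // for example "class" or "article"
}

protocol PageLinkProvider {
    /// Returns information about the page for a symbol, or `nil` if the symbol is unknown.
    func page(forSymbolID usr: String) -> LinkedPage?
}

struct DictionaryLinkProvider: PageLinkProvider {
    var pages: [String: LinkedPage]
    func page(forSymbolID usr: String) -> LinkedPage? { pages[usr] }
}

/// Renders a symbol link as an anchor, falling back to plain code text for unknown symbols.
func renderSymbolLink(usr: String, fallbackName: String, using provider: some PageLinkProvider) -> String {
    guard let page = provider.page(forSymbolID: usr) else {
        return "<code>\(fallbackName)</code>"
    }
    return "<a href=\"\(page.path.relativeString)\"><code>\(page.title)</code></a>"
}

let provider = DictionaryLinkProvider(pages: [
    "example-usr": LinkedPage(path: URL(string: "documentation/mykit/myclass")!, title: "MyClass", kind: "class")
])
let resolved = renderSymbolLink(usr: "example-usr", fallbackName: "MyClass", using: provider)
let unresolved = renderSymbolLink(usr: "unknown", fallbackName: "Other", using: provider)
```

Keeping the lookup behind a protocol means the renderer never needs to load symbols or catalogs itself, which is presumably why it can be tested in isolation.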

The rendered output can either focus on "richness" or on "conciseness". Most of the code is the same for both, but a few code paths diverge:

  • the "rich" HTML output inserts <wbr/> elements at semantically meaningful places in symbol names so that they wrap more nicely on the rendered page.
  • the "rich" HTML output makes heading elements wrap their content in an anchor that references the heading itself so that individual headings can be clicked and linked to.
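The heading difference can be sketched with Foundation's XML types. This is a minimal example under stated assumptions: `ExampleRenderGoal` and `renderHeading` are hypothetical names, the anchor is derived by only replacing spaces with hyphens, and placing the `id` on the heading element is a guess rather than the PR's confirmed output.

```swift
import Foundation
#if canImport(FoundationXML)
import FoundationXML  // XMLElement lives here on non-Apple platforms
#endif

// Hypothetical names; the PR's real types and output may differ.
enum ExampleRenderGoal { case richness, conciseness }

/// Renders a heading either as a plain `<hN>` element or, for the rich output,
/// with its text wrapped in an anchor that links to the heading itself.
func renderHeading(level: Int, title: String, goal: ExampleRenderGoal) -> XMLElement {
    let heading = XMLElement(name: "h\(level)")
    switch goal {
    case .conciseness:
        heading.stringValue = title
    case .richness:
        // Assumption: a simplified anchor that only replaces spaces with hyphens.
        let id = title.replacingOccurrences(of: " ", with: "-")
        let anchor = XMLElement(name: "a", stringValue: title)
        anchor.setAttributesWith(["href": "#\(id)"])
        heading.setAttributesWith(["id": id])
        heading.addChild(anchor)
    }
    return heading
}

let concise = renderHeading(level: 2, title: "Getting Started", goal: .conciseness)
let rich = renderHeading(level: 2, title: "Getting Started", goal: .richness)
```

Both code paths share the element construction and only diverge in what goes inside the heading, which matches the description that most of the code is the same for both goals.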

The core of the HTML renderer is based on Foundation's XMLNode. The main reason for this is that it doesn't add any new dependencies, which makes it significantly easier to get it into the toolchain. This doesn't add any public API, so we are free to completely rewrite the core renderer to use either an external dependency or a new library that we write ourselves without risking any breaking changes.

Dependencies

None.

Testing

Nothing in particular for this PR. It only adds an internal library target that's not used yet. See #1366 for how it does eventually get used.

Checklist

Make sure you check off the following items. If they cannot be completed, provide a reason.

  • Added tests
  • Ran the ./bin/test script and it succeeded
  • Updated documentation if necessary

@d-ronnqvist
Contributor Author

@swift-ci please test

@d-ronnqvist
Contributor Author

@swift-ci please test

Contributor

@patshaughnessy left a comment

Amazing work!

I left some coding nitpicks but this looks great. Maybe add some comments with examples of the HTML that each visit function generates. Even though you have that in the tests, it would be nice to see a simple example just in front of each function I think.

.product(name: "Markdown", package: "swift-markdown"),
.target(name: "SwiftDocCTestUtilities"),
],
swiftSettings: [.swiftLanguageMode(.v6)]
Contributor

Do we need language mode v6 to support Swift Testing?

Contributor Author

No, we just need to specify swift-tools-version:6.0 in the package manifest, which we already do. Because this code doesn't load any symbols or documentation catalogs, it can start using Swift Testing right away. Most other tests in DocC probably need the updated test helpers in #1362.

My reason for using the Swift 6 language mode here is that it enables a number of features by default and is strict about concurrency warnings. That way, any new code has to follow the stricter concurrency checks from the start, rather than being adapted afterwards.
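For reference, a minimal manifest sketch that separates the two settings discussed here: the tools version on the first line and the per-target language mode. The package and target names are placeholders, not DocC's real targets.

```swift
// swift-tools-version:6.0
import PackageDescription

// Placeholder names for illustration only.
let package = Package(
    name: "ExamplePackage",
    targets: [
        // The language mode is opted into per target, so new targets can be strict
        // about concurrency without forcing older targets to migrate at the same time.
        .target(
            name: "ExampleHTML",
            swiftSettings: [.swiftLanguageMode(.v6)]
        ),
        .testTarget(
            name: "ExampleHTMLTests",
            dependencies: ["ExampleHTML"],
            swiftSettings: [.swiftLanguageMode(.v6)]
        ),
    ]
)
```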

func pathForSymbolID(_ usr: String) -> URL?

/// Provide information about an asset, or `nil` if the asset can't be found.
func assetNamed(_ assetName: String) -> LinkedAsset?
Contributor

What are assets? Images? Videos? Both? Other objects? Maybe a bit of explanation here or where you define LinkedAsset below.

Contributor Author

I briefly documented this in d89d3e5

}
}

package struct LinkedAsset {
Contributor

See the previous comment. Should this be named LinkedImages? Or, if we expect there to be videos or other types of media, is having a single images attribute incorrect?

Contributor Author

Videos and, I think, downloadable files would also be considered assets. I'll update this comment and add a TODO for a future PR to verify that videos and downloads work, but that probably requires support for the @Video and @CallToAction directives first.

Contributor Author

I briefly documented this in d89d3e5

self.linkProvider = linkProvider
}

func visit(_ paragraph: Paragraph) -> XMLNode {
Contributor

I realize you have extensive unit tests with detailed examples, but it would be very helpful to paste an example here in a comment above each visit function to indicate what it produces. This one is trivial, but the following functions are quite complex and hard to understand here.

Contributor Author

I added examples to the various visit(_:) functions and reformatted the tests to match in e0becf0
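A small sketch of the documentation style being requested, using a hypothetical `renderParagraph` helper instead of the PR's actual `visit(_:)` method: the doc comment shows the HTML right above the function that produces it.

```swift
import Foundation
#if canImport(FoundationXML)
import FoundationXML  // XMLElement lives here on non-Apple platforms
#endif

/// Renders inline text as a paragraph.
///
/// For example, the text "Hello, world" becomes:
///
///     <p>Hello, world</p>
///
func renderParagraph(_ text: String) -> XMLElement {
    XMLElement(name: "p", stringValue: text)
}

let paragraph = renderParagraph("Hello, world")
```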

return .element(named: "h\(level)", children: content)

case .richness:
let id = urlReadableFragment(plainTextTitle().lowercased())
Contributor

Should the call to lowercased() happen inside of urlReadableFragment?

Contributor Author

Because the urlReadableFragment comes from elsewhere in DocC I'd rather have it be exactly the same.

I wonder if we even need to lowercase the fragment/anchor here. I was trying to match what DocC Render does, but it only lowercases anchors to known sections like "parameters" or "topics" whereas authored headings don't get a lowercased anchor.

Contributor Author

I stopped lowercasing the anchors in 922e279
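To make the effect of that change concrete, here is a simplified stand-in for the fragment helper. `simplifiedFragment` is a hypothetical function written for this example and is not DocC's `urlReadableFragment`; it only follows the character sets visible in the diff (whitespace and dashes become a hyphen, other punctuation and back-ticks are dropped).

```swift
import Foundation

/// A simplified stand-in for a URL-readable fragment. It replaces runs of whitespace
/// and dashes with a single hyphen and drops other punctuation and back-ticks.
func simplifiedFragment(_ title: String) -> String {
    let separators = CharacterSet.whitespaces
        .union(CharacterSet(charactersIn: "-\u{2013}\u{2014}"))  // hyphen-minus, en dash, em dash
    let dropped = CharacterSet.punctuationCharacters
        .union(CharacterSet(charactersIn: "`"))
        .subtracting(CharacterSet(charactersIn: "-"))

    var result = ""
    var pendingSeparator = false
    for scalar in title.unicodeScalars {
        if separators.contains(scalar) {
            // Remember the separator, but never start the result with one.
            pendingSeparator = !result.isEmpty
        } else if dropped.contains(scalar) {
            continue
        } else {
            if pendingSeparator { result.append("-") }
            pendingSeparator = false
            result.unicodeScalars.append(scalar)
        }
    }
    return result
}

let authoredAnchor = simplifiedFragment("Getting Started")
let lowercasedAnchor = simplifiedFragment("Getting Started".lowercased())
let punctuated = simplifiedFragment("Using `async` \u{2014} safely!")
```

With lowercasing, an authored heading "Getting Started" would get the anchor `getting-started` instead of `Getting-Started`. Since element IDs are matched case-sensitively, the two renderers need to agree on the casing for links to authored headings to land on the right element.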

attributes: ["href": destination.absoluteString]
)
} else {
// If this is an unresolved documentation link, try to display only the name of the linked symbol; without the rest of its path and without its disambiguation
Contributor

YES THANK YOU for not rendering "doc:"
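A naive sketch of the fallback described in that code comment, for readers who want to see the intent: strip the `doc:` scheme, keep the last path component, and drop a trailing disambiguation suffix. `displayName(forUnresolvedLink:)` is a hypothetical helper, and its "cut at the last hyphen" rule is an assumption that would also truncate names that legitimately contain hyphens, so the real implementation has to be more careful.

```swift
import Foundation

/// Returns a short display name for an unresolved documentation link.
/// Assumption: anything after the last "-" in the final path component is disambiguation.
func displayName(forUnresolvedLink destination: String) -> String {
    var text = Substring(destination)
    if text.hasPrefix("doc:") {
        text = text.dropFirst(4)
    }
    // Keep only the last path component.
    if let lastSlash = text.lastIndex(of: "/") {
        text = text[text.index(after: lastSlash)...]
    }
    // Drop a trailing "-disambiguation" suffix, if there is one.
    if let dash = text.lastIndex(of: "-") {
        text = text[..<dash]
    }
    return String(text)
}

let methodName = displayName(forUnresolvedLink: "doc:MyKit/MyClass/myMethod(_:)-swift.method")
let className = displayName(forUnresolvedLink: "doc:MyClass")
```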

return children
}

var elements = Array(container)
Contributor

Maybe extract the following code into a separate function. It's quite complex and performing a specific operation related to inline HTML. Extracting a function would also give you a chance to name this functionality, and a place to add some comments explaining what this is doing and why.

Contributor Author

For now I added a number of code comments and a FIXME in 519274b

continue
}

// Gradually increase the content to try and parse
Contributor

Can you extract the following inner loop into a separate function also? The outer/inner labels are quite confusing. Is there a simpler way to express this algorithm?
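One possible shape for such an extraction, sketched under assumptions rather than taken from the PR: the inner loop becomes a function that appends one fragment at a time and returns as soon as the accumulated text parses as a single element. `parseLeadingElement(in:startingAt:)` is a hypothetical name, and the fragments are plain strings here instead of Markdown inline nodes.

```swift
import Foundation
#if canImport(FoundationXML)
import FoundationXML  // XMLElement lives here on non-Apple platforms
#endif

/// Tries to parse a growing run of fragments, beginning at `start`, as one XML element.
/// Returns the parsed element and the index just past the last fragment it consumed,
/// or `nil` if no run starting at `start` parses.
func parseLeadingElement(in fragments: [String], startingAt start: Int) -> (element: XMLElement, end: Int)? {
    guard start < fragments.count else { return nil }
    var candidate = ""
    for index in start..<fragments.count {
        // Gradually increase the content to try and parse.
        candidate += fragments[index]
        if let element = try? XMLElement(xmlString: candidate) {
            return (element, index + 1)
        }
    }
    return nil
}

let fragments = ["<b>", "bold", "</b>", " tail"]
let parsed = parseLeadingElement(in: fragments, startingAt: 0)
let unparsed = parseLeadingElement(in: ["<b>", "never closed"], startingAt: 0)
```

The caller's outer loop can then advance to the returned `end` index on success or emit the fragment as-is on failure, which would remove the need for labeled `continue` statements.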

.union(CharacterSet(charactersIn: "`")) // Also consider back-ticks as punctuation. They are used as quotes around symbols or other code.
.subtracting(CharacterSet(charactersIn: "-")) // Don't remove hyphens. They are used as a whitespace replacement.
static let whitespaceAndDashes = CharacterSet.whitespaces
.union(CharacterSet(charactersIn: "-–—")) // hyphen, en dash, em dash
Contributor

Are these different characters here in the Swift source file? I think I see two but not sure. Is there another way to write this?

Contributor Author

This code is taken from elsewhere in DocC. In the future we might remove the duplication by either moving this functionality down into the Common target or adding it to the LinkProvider protocol so that the renderer can ask the calling code to create the anchor.

Regarding alternative ways to write this, we could either do 3 separate unions:

.union(CharacterSet(charactersIn: "-")) // hyphen
.union(CharacterSet(charactersIn: "–")) // en dash
.union(CharacterSet(charactersIn: "—")) // em dash

or we could use Unicode escapes for each character so that it's clear that there are 3:

.union(CharacterSet(charactersIn: "\u{002D}\u{2013}\u{2014}")) // hyphen, en dash, em dash
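When dashes are hard to tell apart in a source file, listing the code points of the literal is a quick check. The sketch below uses a hypothetical `codePoints(in:)` helper. Note that the hyphen typed on a keyboard is U+002D (hyphen-minus), which is a different character from U+2010 (hyphen).

```swift
import Foundation

/// Lists the code points in a string, for example to check what a pasted literal contains.
func codePoints(in text: String) -> [String] {
    text.unicodeScalars.map { scalar in
        let hex = String(scalar.value, radix: 16, uppercase: true)
        let padding = String(repeating: "0", count: max(0, 4 - hex.count))
        return "U+" + padding + hex
    }
}

// Hyphen-minus, en dash, and em dash, written as escapes so that each one is visible.
let dashLiteral = "\u{002D}\u{2013}\u{2014}"
let dashCodePoints = codePoints(in: dashLiteral)

let whitespaceAndDashes = CharacterSet.whitespaces
    .union(CharacterSet(charactersIn: dashLiteral))

let containsEnDash = whitespaceAndDashes.contains("\u{2013}")
let containsUnicodeHyphen = whitespaceAndDashes.contains("\u{2010}")
```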

return result
}

/// Returns the language specific symbol names sorted by the language.
Contributor

Why is this not a method of LinkedElement.Names?

Contributor Author

Because later PRs call this for many non-name values, for example [SourceLanguage: [ParameterInfo]], [SourceLanguage: [any Markup]], and [SourceLanguage: [DeclarationFragment]]
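The reasoning can be illustrated with a generic free function that works for any value type keyed by language. This is a sketch only: `ExampleLanguage` stands in for DocC's `SourceLanguage`, `sortedByLanguage` is a hypothetical name, and sorting by the language's `id` is an assumed ordering that may differ from the real one.

```swift
// A stand-in for DocC's SourceLanguage, reduced to what the example needs.
struct ExampleLanguage: Hashable {
    var id: String
    var name: String
}

/// Returns the language-specific values sorted by language.
/// Assumption: languages are ordered by their `id`.
func sortedByLanguage<Value>(_ valuesByLanguage: [ExampleLanguage: Value]) -> [(language: ExampleLanguage, value: Value)] {
    valuesByLanguage
        .sorted { $0.key.id < $1.key.id }
        .map { (language: $0.key, value: $0.value) }
}

let swift = ExampleLanguage(id: "swift", name: "Swift")
let objectiveC = ExampleLanguage(id: "occ", name: "Objective-C")

// The same function serves names and any other per-language value.
let names = sortedByLanguage([swift: "init(name:)", objectiveC: "initWithName:"])
let parameterLists = sortedByLanguage([swift: ["name"], objectiveC: ["name"]])
```

Because the function is generic over `Value`, making it a method on a names type would have required duplicating it for each of the other dictionaries.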

@d-ronnqvist
Contributor Author

@swift-ci please test

@d-ronnqvist
Contributor Author

@swift-ci please test

@d-ronnqvist merged commit 4876d1c into swiftlang:main Dec 4, 2025
2 checks passed
@d-ronnqvist deleted the output-html-1 branch December 4, 2025 17:33
2 participants