Add name_source_location() and value_source_location() to Attribute #313

Merged
kornelski merged 2 commits into cloudflare:main from gmalette:gm/attribute-source-locations on Apr 28, 2026

@gmalette
Contributor

Exposes document-absolute byte ranges for attribute names and values via the existing SourceLocation type.

We need this to perform byte-level splice rewrites on attribute values without re-parsing the start tag. I know we're holding it wrong, but this is the fastest and most correct way to achieve our goals :)
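
To make the intended use concrete, here is a minimal, std-only sketch of a byte-level splice on an attribute value. It does not call lol_html: the `Option<Range<usize>>` argument stands in for whatever byte range `value_source_location()` yields, `splice_value` is a hypothetical helper, and the sketch assumes the range covers the value without its surrounding quotes. `None` models an attribute added programmatically via set_attribute, which has no position in the source document.

```rust
use std::ops::Range;

/// Hypothetical helper: replace the bytes in `range` with `replacement`.
/// `None` (no source location) or an invalid range leaves the document unchanged.
fn splice_value(doc: &str, range: Option<Range<usize>>, replacement: &str) -> String {
    match range {
        Some(r)
            if r.start <= r.end
                && r.end <= doc.len()
                && doc.is_char_boundary(r.start)
                && doc.is_char_boundary(r.end) =>
        {
            let mut out = String::with_capacity(doc.len() - r.len() + replacement.len());
            out.push_str(&doc[..r.start]); // bytes before the value, copied verbatim
            out.push_str(replacement); // the new value
            out.push_str(&doc[r.end..]); // bytes after the value, copied verbatim
            out
        }
        _ => doc.to_owned(),
    }
}

fn main() {
    let doc = r#"<script type="importmap"></script>"#;
    // Assumed range of the attribute value; a real scanner would report it.
    let start = doc.find("importmap").unwrap();
    let range = start..start + "importmap".len();
    println!("{}", splice_value(doc, Some(range), "importmap-shim"));
}
```

Everything outside the range is copied through untouched, which is the point of splicing by byte offset instead of re-serializing the tag.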

@gmalette gmalette requested review from a team, Noah-Kennedy, jasnell and orium as code owners April 23, 2026 12:23
@kornelski kornelski force-pushed the gm/attribute-source-locations branch from eff9dd6 to e605251 Compare April 27, 2026 19:56
@kornelski
Contributor

Thanks for the PR.

The change isn't too bad, but it makes me wonder why aren't you using the rewriter to do rewriting?

gmalette and others added 2 commits April 28, 2026 12:00
Exposes document-absolute byte ranges for attribute names and values
via the existing SourceLocation type. Returns None for attributes
added programmatically via set_attribute.

This enables downstream consumers to perform byte-level splice
rewrites on attribute values without re-parsing the start tag.
@kornelski kornelski force-pushed the gm/attribute-source-locations branch from e605251 to b33681f Compare April 28, 2026 11:11
@kornelski kornelski enabled auto-merge (rebase) April 28, 2026 11:12
@kornelski kornelski merged commit bc0667b into cloudflare:main Apr 28, 2026
3 checks passed
@gmalette
Contributor Author

> but it makes me wonder why aren't you using the rewriter to do rewriting?

Yes, this is a very good question, and I should've led with it.

The reason is that we rewrite HTML bodies we didn't write. We used to do this with regexps, which was very imprecise: we often targeted nodes unrelated to the reason for the rewrite. For example, if your body contained type="importmap", we injected the polyfill script before the very first <script>, even if that script was unrelated.

Changing the payload we output is like turning a cargo ship around. While we want to move to the full rewriter, right now we also use it as a scanner for those cases: we record node positions and perform the string manipulation once everything is computed.

It's not great, and I know it's not how you want us to use it, but it's also 2-7x faster and approximately infinity times more precise.
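
The scan-then-splice flow described here could look roughly like the following std-only sketch. It is an illustration under assumptions, not our actual code: `Edit` and `apply_edits` are hypothetical names, the offsets are found with `str::find` instead of coming from a scanner, and `polyfill.js` is a placeholder. Edits are collected first and then applied in one left-to-right pass over the original bytes, so every recorded offset stays valid.

```rust
use std::ops::Range;

/// Hypothetical record of one pending change; an empty range is a pure insertion.
struct Edit {
    range: Range<usize>,
    replacement: String,
}

/// Apply all edits in a single pass over the original document.
/// Returns `None` if edits overlap, run out of bounds, or split a UTF-8 character.
fn apply_edits(doc: &str, mut edits: Vec<Edit>) -> Option<String> {
    // Stable sort: edits at the same position keep their recorded order.
    edits.sort_by_key(|e| (e.range.start, e.range.end));
    let mut out = String::with_capacity(doc.len());
    let mut cursor = 0;
    for e in &edits {
        if e.range.start < cursor || e.range.end < e.range.start || e.range.end > doc.len() {
            return None;
        }
        out.push_str(doc.get(cursor..e.range.start)?); // untouched bytes before the edit
        out.push_str(&e.replacement);
        cursor = e.range.end; // skip the replaced bytes
    }
    out.push_str(doc.get(cursor..)?); // untouched tail
    Some(out)
}

fn main() {
    let doc = r#"<p>hi</p><script type="importmap"></script>"#;
    // Positions a scanner would have recorded; located by search here for brevity.
    let tag_start = doc.find("<script").unwrap();
    let value_start = doc.find("importmap").unwrap();
    let edits = vec![
        Edit {
            range: value_start..value_start + "importmap".len(),
            replacement: "importmap-shim".to_owned(),
        },
        Edit {
            range: tag_start..tag_start,
            replacement: r#"<script src="polyfill.js"></script>"#.to_owned(),
        },
    ];
    println!("{:?}", apply_edits(doc, edits));
}
```

Rejecting overlapping edits up front is a deliberate choice in this sketch: two rewrites claiming the same bytes usually signal a scanning bug, and failing loudly seems safer than guessing an order.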

@gmalette
Contributor Author

gmalette commented May 5, 2026

@kornelski would you cut a new release please? 🙏
