Contributor

@nikneym nikneym commented Oct 7, 2025

This PR overhauls URL handling by introducing ada-url to the codebase.

  • Removes all std.Uri usage.
  • Href strings are normalized as WHATWG requires.
  • Refactors HTMLAnchorElement by introducing object_data in Page.

The CDP changes require extra attention; this is the first time I'm making a change in that part of the code. 👀

@nikneym nikneym force-pushed the nikneym/ada-url-experiment branch from 8906c63 to 8cde0f0 Compare October 9, 2025 07:12
@nikneym nikneym force-pushed the nikneym/ada-url-experiment branch from 7dd5ee0 to c371538 Compare October 14, 2025 13:43
karlseguin added a commit that referenced this pull request Oct 16, 2025
Using the `call_arena` here is unsafe in the case of a failure. It's possible
for the call_arena to be reset during module processing, making the log crash.

The issue is that the lifetime of a URL is often conditional. If the stitched
URL has already been seen (i.e. is in the module_cache), then it can be
short-lived. EXCEPT, URL.stitch might require an allocation... and then you
start to think, well, if URL.stitch is going to allocate anyway... If we stitch
with the `page.arena`, and end up not needing a long lifetime, we've wasted
memory. If we stitch with `page.call_arena` and end up needing a long lifetime,
we need to dupe.

It's a bit messy, and I'd like to take a stab at improving it after:
#1127.

I'm thinking that we need a URL intern pool. HashMap with a composite key of
base + path -> resolved. Then all URLs are resolved using the page.arena, but
we don't have any duplicates, so it isn't wasteful.
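
The intern pool idea described above could look roughly like the following. This is only a sketch under stated assumptions: `UrlPool`, `Key`, `KeyContext`, `resolve` and the simplified `stitch` are hypothetical names invented for illustration, and the real `URL.stitch` performs proper URL resolution rather than plain concatenation.

```zig
const std = @import("std");

// Placeholder for real URL resolution: this sketch simply concatenates.
fn stitch(allocator: std.mem.Allocator, base: []const u8, path: []const u8) ![]const u8 {
    return std.mem.concat(allocator, u8, &.{ base, path });
}

// Hypothetical intern pool keyed on (base, path); not the project's actual type.
const UrlPool = struct {
    const Key = struct { base: []const u8, path: []const u8 };

    const KeyContext = struct {
        pub fn hash(_: KeyContext, key: Key) u64 {
            var hasher = std.hash.Wyhash.init(0);
            hasher.update(key.base);
            hasher.update("\x00"); // separator so ("ab", "c") differs from ("a", "bc")
            hasher.update(key.path);
            return hasher.final();
        }

        pub fn eql(_: KeyContext, a: Key, b: Key) bool {
            return std.mem.eql(u8, a.base, b.base) and std.mem.eql(u8, a.path, b.path);
        }
    };

    const Map = std.HashMap(Key, []const u8, KeyContext, std.hash_map.default_max_load_percentage);

    arena: std.mem.Allocator,
    map: Map,

    fn init(arena: std.mem.Allocator) UrlPool {
        return .{ .arena = arena, .map = Map.init(arena) };
    }

    // Returns the interned URL for (base, path), resolving it at most once.
    fn resolve(self: *UrlPool, base: []const u8, path: []const u8) ![]const u8 {
        if (self.map.get(.{ .base = base, .path = path })) |resolved| return resolved;

        // The pool owns copies of the key parts, so callers may pass short-lived slices.
        const key = Key{
            .base = try self.arena.dupe(u8, base),
            .path = try self.arena.dupe(u8, path),
        };
        const resolved = try stitch(self.arena, base, path);
        try self.map.put(key, resolved);
        return resolved;
    }
};
```

With a pool like this, every resolved URL lives as long as the arena, and a repeated (base, path) pair returns the already-stored slice instead of allocating again.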
Not fully sure how we should implement those; I believe we should move forward with nullable functions and put the null-check logic outside of the wrappers.
Yet another thing we should figure out: IMO a cookie can own its URL, which would make it a lot simpler to use and deinitialize.
Sadly, this must now be done at runtime; the good thing is that it doesn't add much, and `getHref` can be used everywhere without pointer lifetime concerns.
Now that we allocate for URLs, we know that the lifetime of `href` is the same as that of the URL itself, so we don't need to keep a separate `raw` string.

The only difference is that `href` is normalized whereas `raw` is not. Most uses of `raw` require normalized URLs anyway, so this change is fine.
Prefer new URL implementation with separate store for object data.
@nikneym nikneym marked this pull request as ready for review October 17, 2025 16:39
This was causing an issue with the ld linker but not with Mach-O.
Link the ada library to the ada module rather than building it alongside the main module.
This was a regression introduced while testing things.
Changes are made to `host`, `port` and `hostname`; the definitions are taken from MDN.
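
For reference, MDN defines `hostname` as the host without the port, `port` as the port string (empty when none is set or when it is the scheme's default), and `host` as the hostname followed by `:` and the port when the port is non-empty. A minimal sketch of that relationship follows; `formatHost` is a hypothetical helper written for illustration and is not part of this PR or of ada-url.

```zig
const std = @import("std");

// Hypothetical helper: builds `host` from `hostname` and `port` per the MDN definitions.
fn formatHost(buf: []u8, hostname: []const u8, port: []const u8) ![]const u8 {
    // With no port, `host` and `hostname` are the same string.
    if (port.len == 0) return hostname;
    // Otherwise `host` is "hostname:port", written into the caller's buffer.
    return std.fmt.bufPrint(buf, "{s}:{s}", .{ hostname, port });
}
```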
complete,
};

const ObjectDataMap = std.HashMapUnmanaged(
Collaborator

Is this being replaced with the state, e.g. getOrCreateNodeState?

Contributor Author

@nikneym nikneym Oct 21, 2025

No, my idea was to store only the URL that HTMLAnchorElement needs. I think it should live inside State.

Edit: State seems to hold many things... I'm not sure which part should be used.

};
pub fn getHostname(self: URL) []const u8 {
    const hostname = ada.getHostnameNullable(self.internal);
    return hostname.data[0..hostname.length];
Collaborator

presumably this can be null?
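
One way the null case could be handled is sketched below. `StringView` and `sliceOrEmpty` are hypothetical stand-ins invented for illustration; the actual type returned by `ada.getHostnameNullable` and its field types may differ, and returning an empty slice (rather than an optional) is just one possible choice.

```zig
const std = @import("std");

// Hypothetical stand-in for the string view returned by the ada bindings.
const StringView = struct {
    data: ?[*]const u8,
    length: usize,
};

// Converts a possibly-null view into a Zig slice, mapping a null pointer to "".
fn sliceOrEmpty(view: StringView) []const u8 {
    const ptr = view.data orelse return "";
    return ptr[0..view.length];
}
```

A getter could instead return `?[]const u8` and leave the decision to the caller, which matches the "null-check logic outside of the wrappers" approach mentioned earlier in the thread.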
