Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enable declarationMap in tsconfig.json and include src in the published package #525

Closed
DetachHead opened this issue Aug 22, 2022 · 7 comments

Comments

@DetachHead
Copy link

i'm trying to locate elements in a browser dom (for my playwright selector engine) and think i've identified an issue that's specific to firefox.

however debugging is difficult even when i enable source maps in my project because this package doesn't have the source code included in the npm package, so i can only step into minified js.

i was going to try and make a PR to do this, but then i realised there's a custom build script so i'm not really sure how to go about it

@DrRataplan
Copy link
Collaborator

Hey @DetachHead,

When we (speaking of myself and my colleagues working on FontoXPath) run into any issue that might be related to the engine itself, we usually do not attempt to debug the binaries themselves. They are just too complex to debug: the google closure compiler rewrites code significantly, sourcemaps have not been as useful as we want.

Looking at your selector engine, I think I see a part of the issue: https://github.com/DetachHead/playwright-xpath/blob/master/src/browser/main.ts#L5. Namespaces are notorious in XPath, especially when working with HTML (which is not XML). Your 'trick' to translate any prefix (including the empty prefix) to the xhtml namespace works and is also what we do when we use FontoXPath for (x)html. You can choose to make it a bit more correct by returning the xhtml uri only for the empty prefix, returning undefined for others. This will make it better to work with svg elements for example, where the URI should be http://www.w3.org/2000/svg.

Does this help you on your way?

Kind regards,

Martin Middel

@DetachHead
Copy link
Author

DetachHead commented Sep 3, 2022

@DrRataplan thanks for the reply

When we (speaking of myself and my colleagues working on FontoXPath) run into any issue that might be related to the engine itself, we usually do not attempt to debug the binaries themselves. They are just too complex to debug: the google closure compiler rewrites code significantly, sourcemaps have not been as useful as we want.

i ended up compiling it myself with esbuild and the source maps it created helped me figure out what was going on.

Looking at your selector engine, I think I see a part of the issue: https://github.com/DetachHead/playwright-xpath/blob/master/src/browser/main.ts#L5. Namespaces are notorious in XPath, especially when working with HTML (which is not XML). Your 'trick' to translate any prefix (including the empty prefix) to the xhtml namespace works and is also what we do when we use FontoXPath for (x)html.

so the issue i found was that document.lookupNamespaceURI(null) returns null in firefox, and "http://www.w3.org/1999/xhtml" in chrome. i added that line to workaround it which seemed to solve the problem, though admittedly i have no idea how it works or why that fixed it lol.

You can choose to make it a bit more correct by returning the xhtml uri only for the empty prefix, returning undefined for others. This will make it better to work with svg elements for example, where the URI should be http://www.w3.org/2000/svg.

i gave this a go but it doesn't seem to work on svg's. any idea what i'm doing wrong? DetachHead/playwright-xpath#11

@DrRataplan
Copy link
Collaborator

Oh that, yuck. Did not know that... Thanks for pointing this one out, I'll document it as a caveat. This is why I recommend explicitly passing a namespace resolver. Namespaces are not always implemented 100% the same. Because nametests in XPath will simply not match, they are terrible to debug.

A short explanation on namespaces that I like is here: https://developer.mozilla.org/en-US/docs/Web/SVG/Namespaces_Crash_Course.

Because elements are identified with a localname plus a namespace uri, XPath needs to be told how the prefixes (including the empty prefix) should be translated to their corresponding URI. For example, a query like //div/svg will not match SVGs in div elements. It'll try to find svg elements in the html namespace instead. The best way to match all svg elements in div elements it to write a query like //div/svg:svg with the namespaceResolver returning the html uri for no prefix, and the svg uri for svg elements. You might need to process xlink mathml(?) in the same way.

@DetachHead
Copy link
Author

@DrRataplan what about if namespaceResolver took a second argument for the node name? that way, i could make it return a different namespace when the node is an svg.

i'd be happy to make a PR if you're ok with this idea

@DrRataplan
Copy link
Collaborator

That was also an earlier idea of mine, but then child::svg:* would be broken. Same with XPaths that (attempt to) refer to the different namespaces using the same prefix. I foresee madness.

Namespaces are something that's forced unto you when using XML technology. In some cases they actually make sense and solve problems, in most cases they are just a hurdle to jump over. The built-in XPath 1.0 engine in the browser also runs into this: https://developer.mozilla.org/en-US/docs/Web/XPath/Introduction_to_using_XPath_in_JavaScript#implementing_a_default_namespace_resolver.

My approach would be the following:

  • Document that the svg prefix should be used when addressing SVG elements (like descendant::svg:stroke)
  • Document that for addressing html element, no prefix is needed (like descendant::p/span)
  • Document that you can combine the two: descendant::section[@class="figure"]//svg:path

Another approach I see in use in Selenium for example is one I do not like as much: they instruct to use name() and local-name(): //*[name() = "svg"]. This is too verbose in my opinion.

So in the end your namespaceResolver could look like

(prefix) => {
  switch (prefix) {
    case 'svg':
      return 'http://www.w3.org/2000/svg';
    case '':
      return 'http://www.w3.org/1999/xhtml'
    default:
      console.warn(`Unused prefix used in XPath: ${prefix}. Only the 'svg' and the default namespace are registered.`)
      return undefined;
    }
  }
}

Does this help you on your way?

@DetachHead
Copy link
Author

it does. i've released 1.2.0 with the changes you suggested.

thanks for all your help :)

@DrRataplan
Copy link
Collaborator

Awesome, I;ll go ahead and close this issue! please do reach out if I can be of service!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants