Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Positional info #24

Closed
wooorm opened this issue Jan 4, 2023 · 6 comments · Fixed by #25
Closed

Positional info #24

wooorm opened this issue Jan 4, 2023 · 6 comments · Fixed by #25
Assignees

Comments

@wooorm
Copy link

wooorm commented Jan 4, 2023

Hi Ryan! 👋

I’m looking for a well-tested, mostly-spec-compliant, XML parser, with importantly positional info (specifically starting and ending line/columns/offset into the whole string), for xast.

I see that this project has positional info on errors, but I don’t see it on nodes. Would it be of interest to add that data on nodes in this project?

@rgrove
Copy link
Owner

rgrove commented Jan 4, 2023

Hey Titus!

Adding whole-string offset info to nodes would probably be pretty easy since the offset is already available during parsing, but line and column positions are currently only calculated when an error occurs (and not very efficiently).

There are currently several places where the parser consumes newlines, so keeping track of the line and column efficiently would probably take some refactoring. I'm open to this in theory, but I'm concerned about the potential performance impact given that many users aren't likely to need this functionality. I haven't looked into it deeply though, so I'm mostly working from memory.

@wooorm
Copy link
Author

wooorm commented Jan 4, 2023

https://github.com/vfile/vfile-location can do the transform! Should be more efficient that the current algorithm for errors, as it does most of the work once upfront, but it would likely still be faster to track everything directly when parsing

@wooorm
Copy link
Author

wooorm commented Jan 4, 2023

If you’re worried about performance (which makes sense), this could be put behind an option.
When that option is turned on, vfile-location is used at the start to generate the index, and then later for each start/end of a node you can do if (options.position) location.toPosition(byteOffset) or so?
I think that for me the perf “bottle neck” is fine

@rgrove
Copy link
Owner

rgrove commented Jan 4, 2023

True. Unfortunately, integrating vfile-location into parse-xml would mean adding several runtime dependencies. Having zero runtime dependencies is one of parse-xml's differentiating features, and I'd prefer to preserve that.

One alternative might be to add events or callback hooks that would give vfile-location (or anything else) an opportunity to annotate nodes during parsing. Then this optional behavior could be part of a higher level package without requiring parse-xml itself to add any dependencies.

Or, if parse-xml were to annotate each node with its string offset, maybe another option would be for a higher level package to traverse the node tree after parsing and use vfile-location at that point?

@wooorm
Copy link
Author

wooorm commented Jan 5, 2023

Byte offsets into the source document is good enough for me!

@rgrove
Copy link
Owner

rgrove commented Jan 17, 2023

@wooorm How does this look? #25

@rgrove rgrove closed this as completed in 657cc11 Feb 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants