Convenience methods suggestions for DOM parser #12

Closed
helikopterodaktyl opened this issue Jul 5, 2018 · 4 comments
Labels
enhancement New feature or request

Comments

helikopterodaktyl commented Jul 5, 2018

DOM parser could use some methods to simplify usage of the API:

  1. Methods such as getElementsByTagName, getFirstChild etc.

  2. Extraction of attributes could be made more user-friendly by adding the option to query attributes by name, something like:

DOMEntity!string entity = parseXML(`);

writeln(entity.attrs["a"]);

jmdavis added the enhancement label Aug 4, 2018
Owner

jmdavis commented Aug 4, 2018

What would you expect a function like getElementsByTagName to do? Are you looking for something that does the equivalent of

auto result = entity.children.filter!(a => (a.type == EntityType.elementStart ||
                                            a.type == EntityType.elementEmpty) &&
                                           a.name == tagName)();

Or are you looking for something else?

For getFirstChild, how would it be different from just calling entity.children[0]?

As for entity.attributes["a"], the attributes property of a DOMEntity returns a dynamic array of attributes. So, you can get basically the same thing by simply doing something like

auto attr = entity.attributes.find!(a => a.name == "a")();

So, I'm not sure that adding a function just to get a specific attribute really makes much sense. Also, using the subscript syntax poses two problems. First, it implies an O(1) operation, which would only be possible if DOMEntity provided something like an associative array for the attributes. Second, it implies that a RangeError would be thrown if the attribute weren't there, which is likely to be very problematic, particularly since the application generally has no control over which attributes are in the XML and can't assume that any particular attribute is present. Having an AA for the attributes might make sense (and it seems like that's essentially what you're looking for), but it would be inefficient from a memory standpoint, and the dynamic array would almost always be short enough that it wouldn't cost much to just linearly search it for a particular element anyway. I'm inclined to think that getAttrs (which will be in dxml 0.4) will present a better solution (commit 5a77fd3) than doing anything with AAs, and if a linear search is good enough, then find already does the job. An example of getAttrs would be

auto xml = `<root a="foo" b="19" c="true" d="rocks"/>`;
auto range = parseXML(xml);
assert(range.front.type == EntityType.elementEmpty);

string a;
int b;
bool c;

getAttrs(range.front.attributes, "a", &a, "b", &b, "c", &c);
assert(a == "foo");
assert(b == 19);
assert(c == true);

It will work with attributes from both an EntityRange and a DOMEntity. Getting a single attribute would then be something like

string str;
getAttrs(entity.attributes, "a", &str);

which is a bit more verbose than entity.attributes["a"], but it's more flexible and is more in line with the fact that DOMEntity stores the attributes as a dynamic array rather than an associative array.
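For readers landing here after calling parseDOM(), the range-based approach described in this comment can be sketched end to end as follows. This is only a minimal sketch, not an official dxml example: it assumes dxml is available as a dub dependency, the XML input and the "item" and "id" names are made up for illustration, and it only looks at direct children rather than walking the whole tree.

```d
import std.algorithm : filter, find;
import std.stdio : writeln;

import dxml.dom : parseDOM;
import dxml.parser : EntityType;

void main()
{
    // Hypothetical input, purely for illustration.
    auto xml = `<root><item id="1"/><item id="2">text</item><other/></root>`;
    auto dom = parseDOM(xml);

    // The top-level DOM entity holds the root element as its first child here.
    auto root = dom.children[0];

    // Rough stand-in for getElementsByTagName("item"), direct children only.
    // The type check comes first so that name is never queried on a text node.
    auto items = root.children.filter!(c =>
        (c.type == EntityType.elementStart ||
         c.type == EntityType.elementEmpty) &&
        c.name == "item");

    foreach (item; items)
    {
        // Linear search of the attribute array; an empty result means
        // the attribute is absent, so nothing is thrown for missing names.
        auto found = item.attributes.find!(a => a.name == "id");
        if (found.length != 0)
            writeln(found[0].value);
    }
}
```

Checking the length of the result of find before indexing is what avoids the RangeError concern raised above for a subscript-style lookup.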

@JesseKPhillips

I'm against adding helper functions as requested. As noted, the algorithm functions already assist. I see these as just mimicking other DOM libraries, and the range interface is already way better.

@helikopterodaktyl
Author

"Mimicking other DOM libraries": it's a bit more than that. DOM APIs are basically standardized on these names because that's what browser APIs use.

Good point about the std algorithms. Perhaps it should be mentioned in the documentation :) Frankly, I was quite stumped after calling parseDOM(); I didn't know exactly how to extract any meaningful data out of it.

@JesseKPhillips

@helikopterodaktyl, yeah the way I phrased that was harsh. I just hate the "standard" and much prefer the D standards.
