Add XML::Node.normalized_text #4020
Why would you call this normalized? That's definitely not self-explanatory. In the context of XML nodes, normalization usually refers to merging adjacent text nodes in a tree and removing empty ones. Besides naming, I am not sure why this would be necessary, anyway. No XML parser API I could think of provides such a method. What's your use case? I can't think of a common application where you would need only text content from direct descendants. Seems to me like this would probably mean a flawed XML schema... When it's needed, you can just easily write |
@straight-shoota: Also it is not true that no XML parser does this. The ruby XML parser does this by default for the |
@Rinkana What's this Ruby library that has this |
By the way:
I also checked Nokogiri's source code, the |
@Rinkana Do you mean |
@asterite I've checked. It's REXML that does this: require "rexml/document"
include REXML
string = <<EOF
<foo>text<bar>another</bar></foo>
EOF
doc = Document.new string
REXML::XPath.first(doc, "/foo").text # text |
According to the REXML API
That's something different than what you are proposing with |
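To make the distinction concrete, here is a minimal Ruby sketch using REXML. It assumes a sample document with a trailing `tail` text node, which is added purely for illustration and is not part of the snippet posted above. `Element#text` returns only the first direct text node, whereas joining `Element#texts` concatenates every direct text child, which seems closer to what the proposed method is meant to return.

```ruby
require "rexml/document"

# Hypothetical sample: the trailing "tail" is added for illustration only.
doc = REXML::Document.new("<foo>text<bar>another</bar>tail</foo>")
foo = doc.root

# REXML's Element#text: the first direct text node only.
first_text = foo.text

# All direct text children, skipping the text nested inside <bar>.
direct_text = foo.texts.map(&:value).join

puts first_text   # text
puts direct_text  # texttail
```

With a single leading text node the two results coincide, which is why the `<foo>text<bar>another</bar></foo>` sample alone cannot tell them apart.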
I can give you an example of a use case where it is useful. I'm using it to parse the OpenGL specs file (gl.xml). In this file, types are defined this way: <types>
<type>typedef unsigned int <name>GLenum</name>;</type>
<type>typedef double <name>GLdouble</name>;</type>
</types> In this use case I just need the typedefs but without the nested This is also how I came across it. But this is just a proposal; if you feel that this function would be better suited to another name, I'd gladly change it. |
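As a rough sketch of this use case, the following Ruby/REXML snippet separates each `<type>` element's own text from its nested `<name>` element. The hash keys `name` and `own_text` are hypothetical names chosen for this example, and the approach is only one possible way to do it, not the implementation proposed in the PR.

```ruby
require "rexml/document"

xml = <<~XML
  <types>
    <type>typedef unsigned int <name>GLenum</name>;</type>
    <type>typedef double <name>GLdouble</name>;</type>
  </types>
XML

doc = REXML::Document.new(xml)

typedefs = REXML::XPath.match(doc, "/types/type").map do |type|
  {
    # Text of the nested <name> element.
    name: type.elements["name"].text,
    # Only the text nodes that are direct children of <type>.
    own_text: type.texts.map(&:value).join,
  }
end

typedefs.each { |t| puts "#{t[:name]}: #{t[:own_text]}" }
```

Note that the direct text keeps the trailing `;` from each typedef, so a caller that wants only the C type would still need to trim it.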
Well in that case you could just go with |
Well yeah, you are correct about this case. However, that's not the point that I want to address. Crystal's |
Yeah, they do. But I don't see why anyone would want it to return what REXML does. That doesn't make any sense and can be accomplished in a straightforward way. The |
Have to agree with @straight-shoota here. The Still, thank you for the contribution @Rinkana, and please feel free to comment if you think of another use case or argument of why this particular implementation should be included. |
XML::Node has three text functions that all return the same thing: #content, #text, and #inner_text. However, they all return the current node's text including the text from its children.
This function returns only the text that belongs to the current node itself.
Example: