Guidance Needed: Removing XML Declaration in etree #135

Closed
call-stack opened this issue May 4, 2024 · 4 comments

Comments

@call-stack

call-stack commented May 4, 2024

Hello,

I am using this library for XML processing in Go and am having trouble removing the XML declaration from the output string. Despite several attempts, I haven't found a straightforward way to omit the XML declaration when using the WriteToString function.

Could someone please guide me on how to achieve this? Are there specific settings or procedures within this lib that I should use to ensure the XML declaration is consistently omitted from the output?

Thank you in advance for your assistance!

@beevik
Owner

beevik commented May 4, 2024

It is admittedly somewhat difficult to manipulate non-element tokens in an etree document, but it is possible. For example, the following code searches the document's top-level child tokens for an "xml" processing instruction (i.e. the XML declaration) and removes it:

for _, t := range doc.Child {
	if p, ok := t.(*etree.ProcInst); ok && p.Target == "xml" {
		doc.RemoveChild(p)
		break
	}
}

@call-stack
Author

Thank you for the detailed explanation and the code snippet. It clarifies how to manipulate non-element tokens using etree.
Given this approach, would it make sense to expose this functionality as a method in the library itself? That could simplify things for users who need to manipulate processing instructions or other non-element tokens in similar ways.

@call-stack
Author

The code provided successfully removes the XML declaration. However, when I call doc.WriteToString() afterward, the output begins with an empty line before the document content.

@beevik
Owner

beevik commented May 5, 2024

Yes, each element's Child tokens include whitespace-only text tokens. You can either continue scanning for CharData tokens after finding the ProcInst and remove them as well, or you can simply re-indent the document using doc.Indent() after removing the declaration.
