XElement formatting issue #29260

Closed
VBAndCs opened this issue Apr 15, 2019 · 14 comments

VBAndCs commented Apr 15, 2019

XElement behaves badly with XML nodes that contain literal text next to XML elements. In that case, XElement deletes all whitespace between the elements, including line separators, regardless of the reader and writer settings and of the method used to read the XElement content (a reader or ToString).
For example:

Dim cshtml = 
<cshtml>
   <p>Test</p>
   @foreach (var i in items)
   {
     <h>item: </h>
     <div>
       <p>i</p>
     </div>
   }
   <p>End Test</p>
</cshtml>

Dim S = cshtml.ToString()

The resulting string in S will be:

<cshtml>
  <p>Test</p>
                             @foreach (var i in items)
                             {
                             <h>item: </h><div><p>i</p></div>
                             
                             }
                                 <p>End Test</p></cshtml>

The formatting is damaged because of the C# text! If you omit it, everything works fine. Why?

krwq (Member) commented Apr 15, 2019

@VBAndCs, the XML spec allows whitespace around text to be normalized (I believe it should not touch whitespace within the text, but I'm not 100% sure without looking at the spec).

The situation should improve if you put xml:space="preserve" on the element containing such text, or if you use CDATA.

An XML parser is still allowed to normalize line endings regardless of that setting.

Your best bet is to use XmlTextReader/XmlTextWriter with the WhitespaceHandling option. It should work best when whitespace is important (you shouldn't need xml:space="preserve" with that).

Please let me know if this fixes your issue.

krwq self-assigned this Apr 15, 2019

VBAndCs (Author) commented Apr 15, 2019

@krwq
Thanks for your help. This is the result using xml:space="preserve". It doesn't preserve line breaks in my case:

<cshtml xml:space="preserve">
  <p>Test</p>
   @foreach (var i in items)
   {
     <h>item: </h><div xml:space="preserve"><p>i</p></div>
   }
   <p>End Test</p></cshtml>

I tried XmlTextReader/XmlTextWriter before and got the same results. The only thing that worked for me is this workaround:

<cshtml>
  <p>Test</p>
   <zmlitem0/>
   <zmlitem1/>
     <h>item: </h>
     <div>
          <p>i</p>
     </div>
   <zmlitem2/>
   <p>End Test</p>
</cshtml>

Then I replace the <zmlitemN/> placeholders in the output string.
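
A minimal C# sketch of this placeholder workaround, under stated assumptions: the class name PlaceholderDemo, the Expand helper, and the fragment list are hypothetical and are not the code actually used here. One detail worth hedging on: XElement normally serializes an empty element with a space before the slash (<zmlitem0 />), so the pattern allows optional whitespace.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
using System.Xml.Linq;

static class PlaceholderDemo
{
    // Hypothetical helper: serialize the template, then swap each
    // <zmlitemN/> placeholder for the N-th code fragment.
    public static string Expand(XElement template, IReadOnlyList<string> fragments)
    {
        string xml = template.ToString();
        // Allow optional whitespace before "/>" because the writer
        // emits empty elements as "<zmlitem0 />".
        return Regex.Replace(
            xml,
            @"<zmlitem(\d+)\s*/>",
            m => fragments[int.Parse(m.Groups[1].Value)]);
    }

    static void Main()
    {
        var template = XElement.Parse(
            "<cshtml><p>Test</p><zmlitem0/><zmlitem1/>" +
            "<h>item: </h><zmlitem2/><p>End Test</p></cshtml>");

        // Assumed fragments: the C# lines that the placeholders stand for.
        var fragments = new[] { "@foreach (var i in items)", "{", "}" };

        Console.WriteLine(Expand(template, fragments));
    }
}
```

Because the template contains only elements, the writer is free to indent it, and the code text is put back only after serialization.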

krwq (Member) commented Apr 15, 2019

@VBAndCs could you paste the code you are using to read and write the XML with XmlTextReader/XmlTextWriter? I want to double-check whether there is anything else we can do to improve this.

One other thing you might want to check: the writer might be writing new lines as \n instead of \r\n, and some editors might not display that correctly.

VBAndCs (Author) commented Apr 15, 2019

@krwq
I deleted the writer code since it didn't solve the issue. I tried different approaches until I realized that if I erase the text parts, I get correctly formatted XML.
I checked for \n and tried to replace it manually, and also tried the replace option. Nothing changed as long as the XML contained that literal C# code!
I am satisfied with my workaround for now, but I am reporting this so you can check what is happening and fix it for all developers.

krwq (Member) commented Apr 15, 2019

@VBAndCs we will need a complete code sample that does not work correctly for this to be actionable. Note that XML is one of the largest libraries in corefx, and there are multiple different readers and writers.

VBAndCs (Author) commented Apr 15, 2019

I already gave the sample using VB.NET XML literals. You can try it as is:

Dim cshtml = 
<cshtml>
   <p>Test</p>
   @foreach (var i in items)
   {
     <h>item: </h>
     <div>
       <p>i</p>
     </div>
   }
   <p>End Test</p>
</cshtml>

Dim S = cshtml.ToString()
Console.WriteLine(S)

You can trace XElement.ToString() and see why it produces that output.
If you want a C# sample:

var cshtml =
@"<cshtml>
   <p>Test</p>
   @foreach (var i in items)
   {
     <h>item: </h>
     <div>
       <p>i</p>
     </div>
   }
   <p>End Test</p>
</cshtml>";

var x = System.Xml.Linq.XElement.Parse(cshtml);
var S = cshtml.ToString();

You can trace Parse and ToString.
Thanks.

krwq (Member) commented Apr 16, 2019

@VBAndCs, running the C# sample above with an added Console.WriteLine, here is the output I get:

<cshtml>
   <p>Test</p>
   @foreach (var i in items)
   {
     <h>item: </h>
     <div>
       <p>i</p>
     </div>
   }
   <p>End Test</p>
</cshtml>

This looks the same as the input to me...

VBAndCs (Author) commented Apr 16, 2019

Sorry, it prints exactly the input itself :D
Change var S = cshtml.ToString(); to var S = x.ToString(); to print the XML string.

VBAndCs (Author) commented Apr 16, 2019

var cshtml =
@"<cshtml>
   <p>Test</p>
   @foreach (var i in items)
   {
     <h>item: </h>
     <div>
       <p>i</p>
     </div>
   }
   <p>End Test</p>
</cshtml>";

var xml = System.Xml.Linq.XElement.Parse(cshtml);
var s = xml.ToString();
Console.WriteLine(s);

krwq (Member) commented Apr 16, 2019

@VBAndCs I didn't notice that, I just quickly copy-pasted... 🤦‍♂️

krwq (Member) commented Apr 16, 2019

@VBAndCs try this out: XElement.Parse(cshtml, LoadOptions.PreserveWhitespace);

<cshtml>
   <p>Test</p>
   @foreach (var i in items)
   {
     <h>item: </h>
     <div>
       <p>i</p>
     </div>
   }
   <p>End Test</p>
</cshtml>
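
A minimal C# sketch contrasting the two parse modes on a shortened version of the sample. The class name WhitespaceDemo, the RoundTrip helper, and the use of SaveOptions.DisableFormatting are illustrative assumptions that go beyond what was run in this thread; DisableFormatting is added only so the result does not depend on the writer's indenting rules.

```csharp
using System;
using System.Xml.Linq;

static class WhitespaceDemo
{
    // Hypothetical helper: parse while keeping whitespace-only text nodes,
    // then serialize without asking the writer to re-indent.
    public static string RoundTrip(string xml) =>
        XElement.Parse(xml, LoadOptions.PreserveWhitespace)
                .ToString(SaveOptions.DisableFormatting);

    static void Main()
    {
        const string cshtml =
            "<cshtml>\n   <p>Test</p>\n   @foreach (var i in items)\n   {\n" +
            "     <h>item: </h>\n   }\n   <p>End Test</p>\n</cshtml>";

        // Default parse: whitespace-only nodes are dropped, and ToString()
        // re-indents only where no text is mixed in with the elements.
        Console.WriteLine(XElement.Parse(cshtml).ToString());

        // Whitespace preserved: the layout should match the input, apart
        // from possible line-ending normalization by the writer.
        Console.WriteLine(RoundTrip(cshtml));
    }
}
```

The line endings in the second result may come out as the platform's newline sequence rather than \n, so compare after normalizing them if an exact match matters.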

VBAndCs (Author) commented Apr 16, 2019

@krwq
It works. Thanks a lot. Of course it will not work for XML literals, because VB.NET does the parsing internally! Using a string literal as a workaround is not always possible (the XML literal can contain embedded VB expressions). I will ask about this in the VB.NET repo.
Thanks again.

VBAndCs closed this as completed Apr 16, 2019

krwq (Member) commented Apr 16, 2019

Glad it works for you 😄 (still facepalming on my first response though)

VBAndCs (Author) commented Apr 16, 2019

It was my typo in the first place :) .. Sleep-time errors :D

msftgits transferred this issue from dotnet/corefx Feb 1, 2020
msftgits added this to the 3.0 milestone Feb 1, 2020
ghost locked as resolved and limited conversation to collaborators Dec 13, 2020