<source> with only spaces are emptied on deserialization #11

Closed
ysavourel opened this issue Feb 14, 2016 · 6 comments

Comments

@ysavourel

It looks like the content of <source> is removed on reading when it consists only of spaces.
For example, given this:

String data = "<xliff srcLang='en' version='2.0' xmlns='urn:oasis:names:tc:xliff:document:2.0'>"
    + "<file id='f1'><unit id='u1'>"
    + "<segment><source>Sentence 1.</source></segment>"
    + "<ignorable><source> </source></ignorable>"
    + "<segment><source>Sentence 2.</source></segment>"
    + "</unit></file></xliff>";
using (System.IO.MemoryStream ms = new System.IO.MemoryStream(Encoding.UTF8.GetBytes(data)))
{
    XliffReader reader = new XliffReader();
    XliffDocument doc = reader.Deserialize(ms);
    foreach (XliffElement e in doc.CollapseChildren<XliffElement>() )
    {
        Console.WriteLine("Type: " + e.GetType().ToString());
        if ( e is PlainText )
        {
            PlainText pt = (PlainText)e;
            Console.WriteLine("Content: '" + pt.Text + "'");
        }
    }
}

We get this output (no content for <ignorable>):

Type: Localization.Xliff.OM.Core.File
Type: Localization.Xliff.OM.Core.Unit
Type: Localization.Xliff.OM.Core.Segment
Type: Localization.Xliff.OM.Core.Source
Type: Localization.Xliff.OM.Core.PlainText
Content: 'Sentence 1.'
Type: Localization.Xliff.OM.Core.Ignorable
Type: Localization.Xliff.OM.Core.Source
Type: Localization.Xliff.OM.Core.Segment
Type: Localization.Xliff.OM.Core.Source
Type: Localization.Xliff.OM.Core.PlainText
Content: 'Sentence 2.'

While I would expect this output:

Type: Localization.Xliff.OM.Core.File
Type: Localization.Xliff.OM.Core.Unit
Type: Localization.Xliff.OM.Core.Segment
Type: Localization.Xliff.OM.Core.Source
Type: Localization.Xliff.OM.Core.PlainText
Content: 'Sentence 1.'
Type: Localization.Xliff.OM.Core.Ignorable
Type: Localization.Xliff.OM.Core.Source
Type: Localization.Xliff.OM.Core.PlainText
Content: ' '
Type: Localization.Xliff.OM.Core.Segment
Type: Localization.Xliff.OM.Core.Source
Type: Localization.Xliff.OM.Core.PlainText
Content: 'Sentence 2.'

This also happens for <segment> elements.

@RyanKing77
Member

Thanks for reporting this issue. If you would like to contribute a fix, please do so via a pull request. Otherwise, we will evaluate and prioritize the fix as appropriate.

@RyanKing77
Member

Upon further examination of the issue, this is by design. If you declare xml:space="preserve" then you will get the expected output. Perhaps you are expecting the default "processing mode" of the OM to be to preserve whitespace? In which case, this is not a bug but a design change.

@RyanKing77
Member

To clarify a bit more: this is the default behavior of the OM because it is based on the .NET XmlReader/XmlWriter, which do not preserve whitespace by default.
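
To illustrate the point about xml:space, here is a minimal sketch using only System.Xml. It is not the OM's actual code, and the class name WhitespaceProbe is hypothetical. It lists the node types that XmlReader reports for a whitespace-only <source>, with and without xml:space='preserve'. A consumer that keeps only Text and SignificantWhitespace nodes would drop the plain Whitespace node, which may be what happens here.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper for illustration; not part of the XLIFF OM.
public static class WhitespaceProbe
{
    // Returns the node types XmlReader reports for the fragment, in document order.
    public static List<XmlNodeType> NodeTypes(string xml)
    {
        var types = new List<XmlNodeType>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
            {
                types.Add(reader.NodeType);
            }
        }
        return types;
    }

    public static void Main()
    {
        // Default handling: the single space is reported as Whitespace.
        Console.WriteLine(string.Join(", ", NodeTypes("<source> </source>")));
        // prints "Element, Whitespace, EndElement"

        // With xml:space='preserve' the same space is SignificantWhitespace.
        Console.WriteLine(string.Join(", ", NodeTypes("<source xml:space='preserve'> </source>")));
        // prints "Element, SignificantWhitespace, EndElement"
    }
}
```

In the example from this issue, the equivalent change would be to write <source xml:space='preserve'> </source> inside <ignorable>.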

@ysavourel
Author

Looking more closely at the XML specification, I have to agree: default means the reader does whatever it wants. I had read it as: "The reader can normalize or preserve". But it seems that complete deletion is valid too.

So it's not a bug.

But I would change this to a request for a change in behavior. While removing spaces in outer content is fine, completely removing spaces inside content elements like <source> and <target> is probably unwise. I would expect whitespace there to be either normalized or preserved.

I'll also post a note answering your email on the XLIFF list.

@RyanKing77
Member

Thanks for reporting this issue. If you would like to contribute a fix, please do so via a pull request. Otherwise, we will evaluate and prioritize the design change request appropriately.

@RyanKing77
Member

RyanKing77 commented Jan 4, 2017

Fixed with the following commits:
b0dadfe
08178a6
