Reading _Xceed_ results in 컭 #1154

Closed
VDBBjorn opened this Issue Mar 6, 2019 · 2 comments

VDBBjorn commented Mar 6, 2019

Do you want to request a feature or report a bug?

  • Bug
  • Feature

If you are logging a possible bug or feature request, please test with the latest development build first.

Version of ClosedXML

0.94.2

What is the current behavior?

When cell A1 of my Excel file contains, for example, "_Xceed_Something", the string read from this cell is "컭Something".

What is the expected behavior or new feature?

The resulting string should be "_Xceed_Something", as this is just a string value.

Did this work in previous versions of our tool? Which versions?

I don't think so, but untested

Reproducibility

When running the code below, the result will be "컭Something" instead of "_Xceed_Something".

Code to reproduce problem:

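using System;
using System.IO;
using System.Linq;
using ClosedXML.Excel;

public void Main()
{
    // Open the workbook that has "_Xceed_Something" in cell A1.
    using (var stream = new FileStream(@"path\to\your\XceedTest.xlsx", FileMode.Open))
    using (var workbook = new XLWorkbook(stream))
    {
        var sheet = workbook.Worksheets.First();

        // Read column A of the first used row as a string.
        var cell = sheet.RowsUsed().First().Cell("A");
        Console.WriteLine(cell.GetString());
    }
}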

Member

igitur commented Mar 6, 2019

Interesting.

When saving string values, Unicode characters that are not supported by XML are encoded in the _x0000_ format, where the four characters after the x are the character's hexadecimal Unicode code. ClosedXML uses the DecodeName and EncodeName methods of the System.Xml.XmlConvert class for this decoding and encoding.

See https://github.com/dotnet/corefx/blob/master/src/System.Private.Xml/tests/XmlConvert/EncodeDecodeTests.cs for examples.

Usually the encoding is done with a lowercase x, but according to https://github.com/dotnet/corefx/blob/master/src/System.Private.Xml/src/System/Xml/XmlConvert.cs#L96 the uppercase X is also valid.

That implies that _Xceed_ is a valid escaped Unicode character (the 컭), with CEED being the hexadecimal code for that character.
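
For anyone who wants to see this outside ClosedXML, the following is a minimal sketch that calls System.Xml.XmlConvert.DecodeName directly. The class name DecodeNameDemo and the sample strings other than "_Xceed_Something" are made up for illustration; the behavior for the uppercase X is what the linked XmlConvert source suggests, so treat it as something to confirm on your own runtime.

```csharp
using System;
using System.Xml;

public static class DecodeNameDemo
{
    public static void Main()
    {
        // Conventional lowercase escape: _x0020_ stands for the space character.
        Console.WriteLine(XmlConvert.DecodeName("Order_x0020_Details"));

        // Value from this issue: "ceed" is read as a hexadecimal code point
        // because DecodeName also accepts an uppercase X.
        string decoded = XmlConvert.DecodeName("_Xceed_Something");
        Console.WriteLine(decoded);

        // Print the code point of the first decoded character in hex.
        Console.WriteLine(((int)decoded[0]).ToString("X4"));
    }
}
```

If the second line printed starts with 컭 and the third line is CEED, the substitution happens in XmlConvert itself and not in the workbook parsing.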

I'll prepare a fix to explicitly ignore upper case _X.
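
One possible shape for such a fix, sketched here purely as an illustration and not as the change that actually went into ClosedXML: decode with a case-sensitive pattern so that only the lowercase _x form is treated as an escape. The class LowercaseOnlyDecoder and its Decode method are hypothetical names, and the sketch only handles the four-digit form.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class LowercaseOnlyDecoder
{
    // Case-sensitive on the x: matches _x, four hex digits, and a closing underscore.
    private static readonly Regex Escape =
        new Regex("_x([0-9A-Fa-f]{4})_", RegexOptions.CultureInvariant);

    // Hypothetical helper: replaces each lowercase escape with the character it encodes.
    public static string Decode(string value)
    {
        return Escape.Replace(value, m =>
            ((char)int.Parse(m.Groups[1].Value, NumberStyles.HexNumber,
                             CultureInfo.InvariantCulture)).ToString());
    }

    public static void Main()
    {
        // Uppercase X is not matched, so the text comes back unchanged.
        Console.WriteLine(Decode("_Xceed_Something"));

        // Lowercase escape is still decoded (_x0020_ is a space).
        Console.WriteLine(Decode("Order_x0020_Details"));

        // An escaped underscore (_x005F_) protects a literal "_x0041_" in the text.
        Console.WriteLine(Decode("_x005F_x0041_"));
    }
}
```

Because Regex.Replace does not rescan its own output, the escaped-underscore case yields the literal text _x0041_ and is not decoded a second time.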


Member

igitur commented Mar 6, 2019

Obviously, the interim workaround for you is to avoid _X in your cell values.
