Unicode characters not written to excel #743
Pull latest abap2xlsx changes
Remove doma ZEXCEL_BOOLE01 (abap2xlsx#738)
AUnit warning zcl excel reader huge file "# au > "#au abap2xlsx#739 (abap2xlsx#742)
Fix abap2xlsx#672 ("vietnamese") and abap2xlsx#688 ("emoji"). It works on a recent system, but unfortunately I can't test it on an old system to make sure that the new method `is_support_non_xml_characters` returns false.
Can you extend or provide a test report to replicate this behaviour?
I propose to extend the Hello world program ZDEMO_EXCEL1 to write "Hello world" in the 10 most-spoken languages in the world + emoji 👋🌎, 👋🌍, 👋🌏 (waving hand + 3 parts of the world).
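As a side note on why the emoji make a useful test case: the waving hand and the globes lie above U+FFFF, so each one takes two UTF-16 code units (a surrogate pair). The following is a minimal Python sketch for illustration only, not part of the PR; the helper name `utf16_units` is made up here, and it assumes the ABAP side holds text as UTF-16 code units, as Unicode systems do.

```python
# Illustration only: shows that the proposed emoji need surrogate pairs in UTF-16.
greeting = "\U0001F44B\U0001F30E"  # waving hand + globe showing the Americas


def utf16_units(text: str) -> list[int]:
    """Return the UTF-16 code units of text (hypothetical helper)."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


units = utf16_units(greeting)

# Two characters, but four UTF-16 code units: D83D DC4B D83C DF0E
print(len(greeting), [f"{u:04X}" for u in units])
```

Any writer that treats each code unit as a separate character will see four surrogates here rather than two characters.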
Good suggestion. Please add it to this PR.
Done.
@sandraros how old should the system be?
@AndreaBorgia-Abo I tested on a 7.52 system so 7.40 is fine.
@AndreaBorgia-Abo I guess Excel doesn't say the file is corrupted. Could you force r_result = abap_true to see what happens?
@AndreaBorgia-Abo So, that was the perfect test with SAP kernel having the bug, and correctly detected by abap2xlsx and handled the best way it can. Thanks!
Any particular notes to check or kernel version?
@AndreaBorgia-Abo SKIP_NON_XML_CHARACTERS was made available in note 1750204 and Kernel must be as per note 2220720.
1750204: SAP_BASIS is SAPKB74003, higher than required |
@AndreaBorgia-Abo Your system should have valid support for "non-XML characters" according to the SAP notes, but the method says your SAP system doesn't support them, and the test you did by forcing true via debug confirms it. What value do you get in
3C3F786D6C2076657273696F6E3D22312E30223F3E3C524F4F543EEDA080EDB0803C2F524F4F543E
@AndreaBorgia-Abo Well, EDA080 and EDB080 correspond to invalid UTF-8 conversion of the two (invalid) code points U+D800 and U+DC00, but if the SAP kernel supports it correctly, it should return F0908080 (as on my 7.52 system), which corresponds to the UTF-8 code value of character U+10000. I don't know what to think. Maybe UCCP doesn't work well for surrogate code points U+D800 and U+DC00 in old systems. I have
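The byte values discussed above can be checked outside SAP. This is a small Python sketch for illustration only (it is not the ABAP code from the PR): it encodes each surrogate on its own, encodes the combined code point U+10000, and confirms that the hex dump quoted earlier in this thread is rejected by a strict UTF-8 decoder. The helper `is_valid_utf8` is a name made up for this example.

```python
# Hex dump quoted earlier in this thread (output on the 7.40 system).
dump = bytes.fromhex(
    "3C3F786D6C2076657273696F6E3D22312E30223F3E3C524F4F543EEDA080EDB0803C2F524F4F543E"
)

# Each surrogate encoded on its own as a 3-byte sequence (not valid UTF-8).
high = "\ud800".encode("utf-8", "surrogatepass")
low = "\udc00".encode("utf-8", "surrogatepass")

# The surrogate pair combined into one code point, then encoded as UTF-8.
code_point = 0x10000 + ((0xD800 - 0xD800) << 10) + (0xDC00 - 0xDC00)
expected = chr(code_point).encode("utf-8")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes under strict UTF-8 rules (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Swap the two 3-byte sequences for the single 4-byte sequence and decode.
repaired = dump.replace(high + low, expected).decode("utf-8")

print(high.hex().upper(), low.hex().upper())  # EDA080 EDB080
print(expected.hex().upper())                 # F0908080
print(is_valid_utf8(dump))                    # False
print(repaired == '<?xml version="1.0"?><ROOT>\U00010000</ROOT>')  # True
```

The dump is an XML declaration and a `ROOT` element whose only content is the two separately encoded surrogates, which matches the description of the old-kernel behaviour given above.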
@AndreaBorgia-Abo The current code does what I expect it to do, but I didn't expect that it wouldn't work in your system configuration. I think I have just found out why: it may not work in any version before 7.52, as explained in note 2922674 - Support for Unicode Characters U+10000 to U+10FFFF in the iXML kernel library / ABAP package SIXML. Hence, this completely new code might be more universal. Could you test it please?
Thanks a lot.
@gregorwolf my comment refers to an out-of-PR version that @sandraros wanted me to test, I believe, before including it in the PR. As the master branch stands now, those changes are not yet present.
Sorry. Do you already have a PR for the improvements?
Let's wait for @sandraros, it's her code and her PR.
@AndreaBorgia-Abo Sorry for the delay. I just created bug #756, which corresponds to the symptom you have seen in 7.40, and PR #757, which implements the same solution as the one you tested, but the code is a little bit different so that (I hope) there's less risk of error in non-Unicode systems (+ comments). Could you test this new PR please?