Trailing whitespace #425

Closed
rgruebel opened this issue Aug 6, 2019 · 3 comments

rgruebel commented Aug 6, 2019

When I read the value of a cell, the resulting string is padded with trailing whitespace until it is as long as the longest value in the column.
Is this the expected behavior?

Collaborator

andersnm commented Aug 6, 2019

Hi,

This is not the expected behavior, unless there are actually spaces in the cells. Any chance you could attach a file for repro? (Assuming it's an xls, please zip it first, otherwise GitHub won't accept the upload.)

If you cannot share the file, it would be somewhat helpful to have a hex dump of the first 8 bytes of the file, plus the bytes before and around a string in the file that is being returned with trailing whitespace.
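
For anyone who needs to produce such a dump, here is a minimal Python sketch. The helper names (`header_hex`, `guess_container`, `context_hex`) are hypothetical and not part of ExcelDataReader. The two signatures are the well-known OLE2 compound file and zip magic numbers. Searching the raw bytes for a string is only meaningful for an uncompressed xls, and the caller has to pick the encoding of the search bytes (8-bit or UTF-16LE), which is an assumption here.

```python
# Magic numbers at the start of the two common Excel containers.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # legacy .xls (OLE2 compound file)
ZIP_MAGIC = bytes.fromhex("504B0304")           # .xlsx (zip archive)


def header_hex(data: bytes, count: int = 8) -> str:
    """Return the first `count` bytes as space-separated uppercase hex."""
    return data[:count].hex(" ").upper()


def guess_container(data: bytes) -> str:
    """Guess the container format from the leading bytes."""
    if data.startswith(OLE2_MAGIC):
        return "xls (OLE2 compound file)"
    if data.startswith(ZIP_MAGIC):
        return "xlsx (zip archive)"
    return "unknown"


def context_hex(data: bytes, needle: bytes, before: int = 16, after: int = 16) -> str:
    """Hex dump of the bytes around the first occurrence of `needle`.

    Returns an empty string when `needle` does not occur in `data`.
    """
    pos = data.find(needle)
    if pos < 0:
        return ""
    start = max(0, pos - before)
    return data[start:pos + len(needle) + after].hex(" ").upper()


# Usage (the path is a placeholder):
#   raw = open("workbook.xls", "rb").read()
#   print(header_hex(raw), "->", guess_container(raw))
#   print(context_hex(raw, "SHR-ZYL".encode("utf-16-le")))
```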

Author

rgruebel commented Aug 6, 2019

Thank you for the quick response. When I open the file with Excel there are no spaces. As soon as I make a change to the original file, the spaces also disappear during import. It's an xlsx file, so I unpacked it, and the spaces are already present in the XML. So I don't think it's a bug in ExcelDataReader. I just wonder why the spaces are not shown when I open the file with Excel.

<x:c s="0" t="str">
    <x:v>SHR-ZYL-DIN84-MS-M3X16                  </x:v>
</x:c>
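
The same check can be done without unpacking the file by hand. The following is a rough Python sketch using only the standard library. It assumes the `x:` prefix in the snippet above is bound to the main SpreadsheetML namespace (the declaration is not shown in the excerpt) and that worksheet parts live under `xl/worksheets/`. The function names are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Assumed namespace for the x: prefix; not visible in the excerpt above.
MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"


def padded_values(sheet_xml: bytes):
    """Yield (cell_reference, raw_value) for <v> elements whose text ends in whitespace."""
    root = ET.fromstring(sheet_xml)
    for cell in root.iter(f"{{{MAIN_NS}}}c"):
        value = cell.find(f"{{{MAIN_NS}}}v")
        if value is not None and value.text and value.text != value.text.rstrip():
            # The "r" attribute (e.g. "A1") is optional, so this may be None.
            yield cell.get("r"), value.text


def scan_workbook(path_or_file):
    """Collect (part_name, cell_reference, raw_value) for every padded value in an .xlsx."""
    found = []
    with zipfile.ZipFile(path_or_file) as archive:
        for name in archive.namelist():
            if name.startswith("xl/worksheets/") and name.endswith(".xml"):
                for ref, text in padded_values(archive.read(name)):
                    found.append((name, ref, text))
    return found
```

Printing each hit with `repr()` makes the trailing spaces visible, for example `for hit in scan_workbook("workbook.xlsx"): print(repr(hit))`, where the file name is a placeholder.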

Collaborator

andersnm commented Aug 6, 2019

Curious. If I create an XLSX with spaces, it's saved to the shared string table like this, and the spaces are preserved upon save and reload:

<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" count="1" uniqueCount="1">
  <si>
    <t xml:space="preserve">                 asdad a                      </t>
  </si>
</sst>

Sounds like ExcelDataReader could implement support for the special "xml:space" attribute to handle this perfectly, and trim by default if it's omitted. Backlog material :-)
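
The rule proposed here can be sketched in a few lines of Python. This only illustrates the idea (keep whitespace when `xml:space="preserve"` is set, trim otherwise). It is not ExcelDataReader's code and may not match what the eventual fix does. `read_text` and `shared_strings` are hypothetical names, and rich-text runs inside `<si>` are ignored for brevity.

```python
import xml.etree.ElementTree as ET

MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
# ElementTree exposes the predefined xml: prefix under this namespace URI.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def read_text(t_element, trim_by_default=True):
    """Return the text of a <t> element, honoring xml:space="preserve"."""
    text = t_element.text or ""
    if t_element.get(XML_SPACE) == "preserve":
        return text  # whitespace was written deliberately, keep it as is
    return text.strip() if trim_by_default else text


def shared_strings(sst_xml):
    """Read plain <si><t>...</t></si> entries from a shared string table."""
    root = ET.fromstring(sst_xml)
    result = []
    for si in root.iter(f"{{{MAIN_NS}}}si"):
        t = si.find(f"{{{MAIN_NS}}}t")  # direct child only; rich-text runs are skipped
        result.append(read_text(t) if t is not None else "")
    return result
```

With this rule, the padded value from the earlier comment would come back trimmed because its element carries no `xml:space` attribute, while the shared string above would keep its leading and trailing spaces.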

appel1 added the bug label Dec 3, 2022
appel1 added this to the 3.7 milestone Jun 18, 2023
appel1 closed this as completed in f6ae4cc Jun 18, 2023