Reading fixed length string dataset ignores charset #113

Closed
JCzogalla opened this issue Oct 16, 2019 · 2 comments
Labels
bug Something isn't working

Comments

@JCzogalla
Contributor

Describe the bug/missing feature
When reading a dataset of fixed-length string type (i.e. StringData), the charset is ignored and the bytes are always decoded as US_ASCII (DatasetReader l. 336). HDFView appears to have the same problem.

To Reproduce
Use the attached file: utf8-fixed-length.zip
HDFView and jhdf will both show broken characters instead of umlauts.
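
The symptom can be illustrated without jhdf or the attached file. The following is a minimal sketch in plain Java; the class and method names are hypothetical and only stand in for whatever decoding the reader performs. It encodes three umlauts as UTF-8 and then decodes the same bytes once as US_ASCII and once as UTF-8.

```java
import java.nio.charset.StandardCharsets;

public class AsciiDecodeSymptom {

    // Mimics the reported behaviour: bytes are always treated as US_ASCII.
    // Bytes outside the 7-bit range cannot be mapped and are replaced.
    public static String decodeAsAscii(byte[] raw) {
        return new String(raw, StandardCharsets.US_ASCII);
    }

    // Decodes with the charset the data was actually written in.
    public static String decodeAsUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Unicode escapes for a-umlaut, o-umlaut, u-umlaut, so the example
        // does not depend on the source file encoding.
        String original = "\u00e4\u00f6\u00fc";
        byte[] raw = original.getBytes(StandardCharsets.UTF_8);

        System.out.println("ASCII decode matches original: "
                + decodeAsAscii(raw).equals(original));
        System.out.println("UTF-8 decode matches original: "
                + decodeAsUtf8(raw).equals(original));
    }
}
```

Only the UTF-8 decode round-trips; the US_ASCII decode yields replacement characters, which matches the broken characters described above.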

Expected behaviour
DatasetReader should take the string type's charset into account.

Please complete the following information:

  • jhdf version: 0.4.8
  • Java version: 1.8
  • Stack trace/problem site: DatasetReader l. 336

Additional context
StringData knows its charset, and the call to the private method fillFixedLengthStringData (DatasetReader l. 134) could use it as a parameter.
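
As a rough sketch of that suggestion, the helper below takes the charset as a parameter instead of hard-coding US_ASCII. It is not jhdf's actual fillFixedLengthStringData: the class name, method name, signature, and the NUL-padding handling are all assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class FixedLengthStringSketch {

    // Hypothetical helper: splits the buffer into slots of `width` bytes and
    // decodes each slot with the charset supplied by the caller.
    public static String[] decodeFixedLength(ByteBuffer buffer, int width, Charset charset) {
        int count = buffer.remaining() / width;
        String[] result = new String[count];
        byte[] slot = new byte[width];
        for (int i = 0; i < count; i++) {
            buffer.get(slot);
            // Assumption: values are NUL-padded, so stop at the first zero byte.
            int length = 0;
            while (length < width && slot[length] != 0) {
                length++;
            }
            result[i] = new String(slot, 0, length, charset);
        }
        return result;
    }

    public static void main(String[] args) {
        int width = 8;
        // allocate() zero-fills the buffer, which provides the NUL padding.
        ByteBuffer buffer = ByteBuffer.allocate(2 * width);
        buffer.put("Gr\u00fc\u00dfe".getBytes(StandardCharsets.UTF_8));
        buffer.position(width);
        buffer.put("\u00d6l".getBytes(StandardCharsets.UTF_8));
        buffer.rewind();

        for (String s : decodeFixedLength(buffer, width, StandardCharsets.UTF_8)) {
            System.out.println(s);
        }
    }
}
```

In the real reader the charset would come from the string type itself rather than being chosen by the caller, as the issue proposes.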

@jamesmudd jamesmudd added the bug Something isn't working label Oct 16, 2019
@jamesmudd jamesmudd self-assigned this Oct 16, 2019
@jamesmudd
Owner

Reproduced the issue. Should be an easy fix...

@jamesmudd
Owner

I think I have fixed this one. If you get a chance, give it a try and let me know.

JCzogalla pushed a commit to rapidminer/jhdf that referenced this issue Jan 24, 2020