Implement Windows-1252 fallback logic for Encoding.Default #10190

Merged: 30 commits merged on Jun 10, 2024

@radeusgd radeusgd commented Jun 5, 2024

Pull Request Description

Important Notes

Checklist

Please ensure that the following checklist has been satisfied before submitting the PR:

  • The documentation has been updated, if necessary.
  • Screenshots/screencasts have been attached, if there are any visual changes. For interactive or animated visual changes, a screencast is preferred.
  • All code follows the Scala, Java, TypeScript, and Rust style guides. In case you are using a language not listed above, follow the Rust style guide.
  • Unit tests have been written where possible.

@radeusgd radeusgd self-assigned this Jun 5, 2024
@radeusgd radeusgd force-pushed the wip/radeusgd/10148-win-1252-fallback branch from fcabb8e to 6ef7aae Compare June 5, 2024 17:43
@radeusgd radeusgd force-pushed the wip/radeusgd/10148-win-1252-fallback branch from 6ef7aae to a059d88 Compare June 7, 2024 16:21
@radeusgd radeusgd marked this pull request as ready for review June 7, 2024 16:25
@radeusgd radeusgd commented Jun 8, 2024

Tested the use case of loading the dataset https://www.kaggle.com/datasets/vivek468/superstore-dataset-final

image

The screenshot shows that reading the file with the default encoding works, while explicitly setting the encoding to UTF-8 yields invalid characters. This confirms that Encoding.Default correctly falls back to Windows-1252.
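For readers unfamiliar with this kind of fallback, the idea can be sketched in a few lines of Python. This is only an illustration of the general technique (try strict UTF-8 first, then decode as Windows-1252 if invalid bytes are found), not the code added in this PR. The function name `decode_with_fallback` is hypothetical, and replacing bytes that Windows-1252 leaves undefined is an assumption made for this sketch.

```python
from typing import Tuple


def decode_with_fallback(data: bytes) -> Tuple[str, str]:
    """Decode bytes as UTF-8, falling back to Windows-1252 on invalid input.

    Returns the decoded text and the name of the encoding that was used.
    """
    try:
        # Strict decoding raises on any byte sequence that is not valid UTF-8.
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Assumption for this sketch: bytes that Windows-1252 leaves
        # undefined (such as 0x81) are replaced instead of raising again.
        return data.decode("cp1252", errors="replace"), "cp1252"


# "café" encoded as Windows-1252: the lone 0xE9 byte is not valid UTF-8,
# so the fallback branch is taken.
legacy_bytes = "café".encode("cp1252")
text, used_encoding = decode_with_fallback(legacy_bytes)
```

A file that is valid UTF-8 is decoded unchanged, so the fallback only affects inputs that strict UTF-8 decoding would reject, which matches the behaviour described in the comment above.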

test/Table_Tests/src/IO/Delimited_Write_Spec.enso (outdated, resolved)
test/Table_Tests/src/IO/Delimited_Read_Spec.enso (outdated, resolved)
@radeusgd radeusgd added the CI: Ready to merge This PR is eligible for automatic merge label Jun 10, 2024
@mergify mergify bot merged commit 41d02e9 into develop Jun 10, 2024
35 checks passed
@mergify mergify bot deleted the wip/radeusgd/10148-win-1252-fallback branch June 10, 2024 10:49
Development

Successfully merging this pull request may close these issues.

Fallback to Windows-1252 encoding with Encoding.Default if invalid UTF-8 characters are encountered
3 participants