Support non-UTF-8 encodings and fail gracefully with unexpected encoding #7

chrispomeroyhale · 2023-12-02T22:21:23Z

A customer tried to import a Windows-1252 encoded document, and the parser got stuck in a loop when it encountered the unexpected encoding. For privacy reasons I will not post the particular document.

Two parts:

1. At the very least, we should gracefully handle documents with unexpected encodings so that client apps don't hang and can display an error.
2. Ideally, support multiple encodings. Although we may expect modern CSV to be UTF-8 encoded, the whole idea of the CSV Dialect is interoperability with existing documents in their various flavors.
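
To illustrate the two parts, here is a minimal sketch in Python. It is not this library's API: `parse_csv` and `EncodingError` are hypothetical names chosen for the example, and it assumes the whole document is available in memory as bytes. Decoding strictly up front turns a bad encoding into an error the client app can display, and the optional `encoding` parameter covers documents that are not UTF-8.

```python
import csv
import io


class EncodingError(ValueError):
    """Hypothetical error type: the bytes are not valid in the requested encoding."""


def parse_csv(data: bytes, encoding: str = "utf-8") -> list[list[str]]:
    """Decode `data` strictly, then parse it as CSV.

    Raises EncodingError instead of looping or guessing when the bytes
    cannot be decoded with `encoding`.
    """
    try:
        # errors="strict" makes invalid byte sequences raise immediately.
        text = data.decode(encoding, errors="strict")
    except UnicodeDecodeError as exc:
        raise EncodingError(
            f"input is not valid {encoding}: bad byte at offset {exc.start}"
        ) from exc
    # newline="" lets the csv module handle line endings itself.
    return list(csv.reader(io.StringIO(text, newline="")))


# A Windows-1252 document ("cp1252" is Python's name for that codec).
windows_1252_doc = "name,city\nZoë,Málaga\n".encode("cp1252")

# Part 2: the caller names the encoding explicitly.
rows = parse_csv(windows_1252_doc, encoding="cp1252")

# Part 1: with the default UTF-8 the same bytes fail fast with a clear error.
try:
    parse_csv(windows_1252_doc)
except EncodingError as err:
    message = str(err)
```

The sketch deliberately does not try to detect the encoding automatically. Guessing can silently produce wrong text, so it leaves the choice to the caller and reports a failure otherwise.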

chrispomeroyhale added the bug and enhancement labels Dec 2, 2023

chrispomeroyhale pushed a commit that referenced this issue Dec 4, 2023

GH-7: Fix hang when encountering an unexpected encoding.

4e04316

chrispomeroyhale pushed a commit that referenced this issue Dec 4, 2023

GH-7: Optionally specify import data encoding.

89c6c11
