Skip to content

Consider making obj_maybe_translate_encoding() always translate to UTF-8 #1246

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
DavisVaughan opened this issue Sep 8, 2020 · 0 comments · Fixed by #1249
Closed

Consider making obj_maybe_translate_encoding() always translate to UTF-8 #1246

DavisVaughan opened this issue Sep 8, 2020 · 0 comments · Fixed by #1249
Assignees

Comments

@DavisVaughan
Copy link
Member

DavisVaughan commented Sep 8, 2020

Would likely make #1245 more straightforward

We currently let a vector that is consistently in a non-UTF-8 encoding (like an all Latin1 vector) through obj_maybe_translate_encoding() without translation. We do this:

a) For performance reasons with those vectors
b) This is what base R does

However, the new vec_order() will always convert to UTF-8, and it would certainly simplify things if we always converted to UTF-8 everywhere.

I don't think it would change the actual behavior of any functions that use this translation helper, since you'd still end up with a consistent encoding that you can compare in.

I think this would remove the need for obj_maybe_translate_encoding2(), because you could always expect that the result of obj_maybe_translate_encoding() would be in UTF-8, making it comparable to any other UTF-8 vector.

It would also greatly simplify the translation code, especially for lists, which already have weird translation rules (it does a first pass through the list to check if any element needs translation, then does a second pass to translate all elements if any elements required translation).


Also consider renaming it to proxy_translate_utf8()

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant