API change: Support different file encodings #376
Comments
I'd rather ask users to convert their files/packages to UTF-8. |
I agree that one should use UTF-8 in principle, but I am not sure whether styler is the right place to tell people what encoding they should use 😊 |
See checks in r-pkgs, second bullet point ;-) This is about |
Ok, thanks. |
As a result of this poll, which is probably not entirely free of bias, only 60% of R users use utf8 for sure. Around 35% use some default. |
If the default for Windows users is the native encoding (and hence most likely latin1), I think we can't ignore this. |
Yeah, or change that default ;-) Maybe we could supply another package that reencodes as UTF-8 first, and have styler detect non-UTF-8 encoding and direct users there? Could be a part of usethis, too -- can you please check if there is something related to this and perhaps open an issue if not? |
You mean changing the RStudio default encoding for new R Scripts? I mean for packages, I think it might be utf8 already, but we also have users that are not package developers. Is utf8 encoding a standard or planned to become one in the tidyverse? |
We should teach users/package authors/... to use the One True Character Encoding. I forgot about user scripts, but these too should be UTF-8 in my opinion. That standard has been around long enough, it doesn't seem anything else is going to replace it anytime soon. Windows users still won't be able to use characters from non-native locales (e.g., can't use a Chinese letter on a US-English Windows) due to limitations of |
That sounds like we should consider doing a PR to tidyverse/style, adding a rule to use utf8 throughout. |
Sounds good. (And maybe help them convert their existing code.) |
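For illustration, converting an existing file could look roughly like the sketch below. It assumes the source encoding is known (latin1 here) and uses only base R. `reencode_to_utf8()` is a hypothetical helper name, not a function from styler, usethis, or enc.

```r
# Hypothetical helper (not part of any package mentioned in this thread):
# rewrite `path` in place as UTF-8, assuming it is currently encoded as `from`.
reencode_to_utf8 <- function(path, from = "latin1") {
  # readLines() does not re-encode; it returns the bytes as found in the file.
  lines <- readLines(path, warn = FALSE)
  converted <- iconv(lines, from = from, to = "UTF-8")
  # iconv() returns NA for elements it cannot convert.
  if (anyNA(converted)) {
    stop("Could not convert '", path, "' from ", from, " to UTF-8.")
  }
  # Binary mode plus useBytes = TRUE writes the UTF-8 bytes unchanged.
  con <- file(path, open = "wb")
  on.exit(close(con))
  writeLines(converted, con, useBytes = TRUE)
  invisible(path)
}
```

Guessing the source encoding automatically is a separate and harder problem; this sketch leaves that choice to the caller.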
I filed an issue in tidyverse/style. If we get approval there, I think I can open one in usethis for conversion. |
I think in the light of tidyverse/style#71 (comment), we should re-consider offering a |
I'm not convinced. Why? |
Because I don't see how we can end the encoding mess of computer science with styler -.- As far as R goes, the issue will most likely be there for a very long time unless some key players in the R world change their mind and enforce ASCII or UTF-8 or some other standard. Then, I think it would be reasonable to only support this encoding. And since RStudio's default encoding is not UTF-8 on Windows if you are not inside a project that is a package, we have a large group of R users who are excluded from using styler if we insist on UTF-8. |
ASCII is already somewhat enforced by RStudio's default file encoding is UTF-8, also on Windows. Just checked. |
If we decide to stick to UTF-8, we should at least update help files accordingly and maybe even give a warning if we detect (how?) a file is not UTF-8 encoded. |
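On the "how?": one possible approach, sketched here with base R only, is to check whether the bytes of each line form valid UTF-8 using `validUTF8()`. `is_utf8_file()` is a hypothetical helper name, not something styler provides. The check also has a known limit: a pure-ASCII file passes, and some non-UTF-8 byte sequences happen to be valid UTF-8 by coincidence.

```r
# Hypothetical helper (not part of styler): TRUE if every line of `path`
# consists of valid UTF-8 byte sequences.
is_utf8_file <- function(path) {
  # readLines() does not re-encode, so validUTF8() sees the original bytes.
  lines <- readLines(path, warn = FALSE)
  all(validUTF8(lines))
}
```

A caller could then emit a warning before styling whenever `is_utf8_file(path)` is `FALSE`.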
Anyways, let's add a note in the help file that we currently only support UTF-8. |
Final decision. UTF8 only as per https://yihui.name/en/2018/11/biggest-regret-knitr/. |
Migrated from #374 (comment).
@krlmlr Do you think we should support more file encodings? Is the package enc limited to utf8 and latin1? If not, we could add an argument file_encoding to the exported style_*() functions. As mentioned in #374, formatR seems to handle the problems outlined in #374 well.
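To make the proposal concrete, a `file_encoding` argument could behave roughly like the sketch below: read the file in its declared encoding, hand UTF-8 text to the styling step, then write the result back in the original encoding. This is only an illustration under stated assumptions. `with_file_encoding()` and its `transform` argument are hypothetical names, and `transform` stands in for whatever styling function would be applied. None of this is styler's actual API.

```r
# Hypothetical wrapper (not styler's API). `transform` is a stand-in for a
# styling function that takes and returns a character vector of UTF-8 lines.
with_file_encoding <- function(path, file_encoding = "UTF-8",
                               transform = identity) {
  # Read the bytes as-is, then convert explicitly to UTF-8.
  raw_lines <- readLines(path, warn = FALSE)
  utf8_lines <- iconv(raw_lines, from = file_encoding, to = "UTF-8")
  if (anyNA(utf8_lines)) {
    stop("'", path, "' is not valid ", file_encoding, ".")
  }
  styled <- transform(utf8_lines)
  # Convert back so the file keeps its original encoding on disk.
  out <- iconv(styled, from = "UTF-8", to = file_encoding)
  if (anyNA(out)) {
    stop("Styled text cannot be represented in ", file_encoding, ".")
  }
  con <- file(path, open = "wb")
  on.exit(close(con))
  writeLines(out, con, useBytes = TRUE)
  invisible(path)
}
```

The round trip keeps the styling step itself encoding-agnostic, at the cost of failing when the styled text contains characters the target encoding cannot represent.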