Files and Resources do not handle UTF-8 files with BOM #345
Comments
Original comment posted by kevinb@google.com on 2010-04-09 at 07:05 PM Good to know. Based on this, it will probably make sense for us to check for a byte-
Original comment posted by fry@google.com on 2011-01-28 at 04:03 PM (No comment entered for this change.) Status:
Original comment posted by mail4danny on 2011-02-04 at 04:18 PM No, the JDK just quietly ignores this ;-)
Original comment posted by finnw1 on 2011-02-04 at 09:01 PM The JDK does detect (and strip) the BOM for some encodings, e.g.
Original comment posted by kevinb@google.com on 2011-07-13 at 06:18 PM (No comment entered for this change.) Status:
Original comment posted by fry@google.com on 2011-12-10 at 03:45 PM (No comment entered for this change.) Labels:
Original comment posted by fry@google.com on 2012-02-16 at 07:17 PM (No comment entered for this change.) Status:
Original comment posted by kevinb@google.com on 2012-06-22 at 06:16 PM (No comment entered for this change.) Status:
Original comment posted by j...@durchholz.org on 2012-12-27 at 10:01 AM
Original comment posted by NikolayMetchev on 2014-01-07 at 11:57 AM This was filed as a bug in the JDK. They decided not to fix it there for backward compatibility reasons:
Any progress on this? Won't Guava help us read a BOM?
have their input stream remove any char with the value of 65279 (U+FEFF) at index 0. It's not pointless: Notepad uses it to easily determine which UTF encoding the file is in. To be honest, I think this is what file headers are made for. Why not just have a file header with the string utf-x in front of it? It only takes a couple of bytes. But I didn't make the UTF protocol.
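The workaround suggested in the last comment (drop the character with value 65279, i.e. U+FEFF, when it appears at index 0) could be sketched roughly as follows. This is only an illustration using plain JDK classes, not Guava's API or any fix adopted in this issue; the class name `BomStripping` and both method names are hypothetical.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.Reader;

// Hypothetical helper, not part of Guava or the JDK.
final class BomStripping {
    // U+FEFF, decimal 65279: what a UTF-8 BOM decodes to.
    private static final char BOM = (char) 0xFEFF;

    private BomStripping() {}

    /** Returns the text without a leading U+FEFF, if one is present at index 0. */
    static String stripLeadingBom(String text) {
        if (!text.isEmpty() && text.charAt(0) == BOM) {
            return text.substring(1);
        }
        return text;
    }

    /** Wraps a Reader so that a leading U+FEFF is consumed before the caller reads. */
    static Reader skipLeadingBom(Reader in) throws IOException {
        PushbackReader pushback = new PushbackReader(in, 1);
        int first = pushback.read();
        // Put the character back unless it is the BOM or the stream is empty.
        if (first != -1 && first != BOM) {
            pushback.unread(first);
        }
        return pushback;
    }
}
```

Only the first character is inspected, so a U+FEFF that appears later in the text (where it is a zero-width no-break space) is left alone.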
Original issue created by kai@google.com on 2010-04-08 at 07:59 PM
By the UTF-8 definition, UTF-8 files are allowed to have an optional leading
BOM. This BOM is stupid and pointless, but many Windows apps seem to
generate UTF-8 files with the BOM. Guava's classes Files and Resources do
not handle UTF-8 files with a BOM. I'm not sure where this fix belongs, or
whether it should even be fixed at all (since Windows is being stupid, and
people are rightly sick and tired of working around Windows issues). BTW, I
don't personally use Windows. I'm reporting this issue only because I
maintain a library that uses Guava, and there are some Windows users of my
library that are running into this issue.
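To make the symptom concrete, here is a small sketch of what happens when a BOM-prefixed UTF-8 file is decoded with the JDK's standard UTF-8 charset. It assumes that Guava's Files and Resources end up relying on that decoder, which is what the comments in this thread suggest; the class `Utf8BomDemo` and its methods are hypothetical names used only for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, JDK only (no Guava calls).
final class Utf8BomDemo {
    private Utf8BomDemo() {}

    /** Bytes of a tiny "file": the UTF-8 BOM (EF BB BF) followed by ASCII "hi". */
    static byte[] sampleFileWithBom() {
        return new byte[] {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
    }

    /** Decodes the bytes with the JDK's UTF-8 charset, which keeps the BOM as U+FEFF. */
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    /** True if the decoded text still starts with U+FEFF. */
    static boolean startsWithBom(String text) {
        return !text.isEmpty() && text.charAt(0) == (char) 0xFEFF;
    }
}
```

With this input, `decodeUtf8(sampleFileWithBom())` yields a three-character string whose first character is U+FEFF, so a comparison against `"hi"` fails even though the file looks identical in an editor.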