A helper class to skip Unicode BOMs at the beginning of input streams.
I initially released this class as a Stack Overflow answer, and it apparently got copy-pasted into several Java projects already. However, code posted in answers on Stack Overflow is licensed under CC-BY-SA 3.0, which may not suit everybody.
Many years have passed since I wrote this class, and today Java still doesn't properly deal with UTF-8 Unicode Byte Order Marks (BOMs) at the beginning of data. In 2001, someone opened bug JDK-4508058 with the sound expectation that Java should detect and skip UTF-8 BOMs at the beginning of UTF-8 streams, the same way it does for e.g. UTF-16.
Bug JDK-4508058 remained open for a while, then got fixed and ultimately reverted because some other great programmers relied on that exact same bug:
the Java EE 5 RI and SJSAS 9.0 has been relying on detecting a BOM, setting the appropriate encoding, and discarding the BOM bytes before reading the input
See, they're complaining because shipped code breaks if/when JDK behavior changes. And instead of fixing JDK-4508058 and accepting that this would be an annoyance only for Java EE 5 RI and SJSAS 9.0 users, people in charge at Sun decided we're all living in a better world if JDK-4508058 gets closed as "won't fix". Because fuck you, just skip the BOM yourself.
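To see the behavior described above for yourself, here is a small sketch that decodes a BOM-prefixed byte sequence with the stock `InputStreamReader`. The class name `BomDemo` and the helper `firstChar` are made up for this illustration and are not part of this library.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, only meant to illustrate the JDK behavior.
public class BomDemo {

    // Decodes the bytes with the given charset and returns the first
    // decoded character as an int, or -1 if nothing could be read.
    static int firstChar(byte[] data, Charset charset) {
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(data), charset)) {
            return reader.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "hi" preceded by a UTF-8 BOM (EF BB BF)
        byte[] utf8 = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
        // "hi" preceded by a big-endian UTF-16 BOM (FE FF)
        byte[] utf16 = {(byte) 0xFE, (byte) 0xFF, 0, 'h', 0, 'i'};

        // The UTF-8 decoder hands the BOM back as a regular character.
        System.out.printf("UTF-8:  U+%04X%n", firstChar(utf8, StandardCharsets.UTF_8));
        // The UTF-16 decoder consumes the BOM and starts at 'h'.
        System.out.printf("UTF-16: U+%04X%n", firstChar(utf16, StandardCharsets.UTF_16));
    }
}
```

With UTF-8 the first character is U+FEFF (the BOM itself), while with UTF-16 it is U+0068 (`h`).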
Wrap your input stream in a UnicodeBOMInputStream and use its skipBOM() method.
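As a rough illustration of what skipping a BOM by hand involves, here is a minimal sketch built on `PushbackInputStream`. It is not the implementation of `UnicodeBOMInputStream`: it only handles the UTF-8 BOM, and the names `Utf8BomSkipper`, `skipUtf8Bom` and `decode` are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch, UTF-8 only; not the library's own code.
public class Utf8BomSkipper {

    private static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    // Returns a stream positioned just after a leading UTF-8 BOM.
    // If there is no BOM, the bytes already read are pushed back.
    static InputStream skipUtf8Bom(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, UTF8_BOM.length);
        byte[] head = new byte[UTF8_BOM.length];

        // read() may return fewer bytes than requested, so loop.
        int n = 0;
        while (n < head.length) {
            int r = pushback.read(head, n, head.length - n);
            if (r < 0) {
                break; // end of stream before three bytes were available
            }
            n += r;
        }

        boolean hasBom = n == UTF8_BOM.length && Arrays.equals(head, UTF8_BOM);
        if (!hasBom && n > 0) {
            pushback.unread(head, 0, n); // not a BOM: give the bytes back
        }
        return pushback;
    }

    // Convenience helper for the example: skips the BOM, then decodes as UTF-8.
    static String decode(byte[] data) {
        try (InputStream in = skipUtf8Bom(new ByteArrayInputStream(data))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[256];
            for (int r; (r = in.read(buffer)) != -1; ) {
                out.write(buffer, 0, r);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
        System.out.println(decode(withBom)); // prints "hi"
    }
}
```

The pushback buffer is sized to the BOM length, so a stream without a BOM is returned with all of its bytes intact.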
If you find this library useful and decide to use it in your own projects, please drop me a line @gpakosz.
If you use it in a commercial project, consider using Gittip.