



JVM Platform Status
OpenJDK (Temurin) Current Linux Build (OpenJDK (Temurin) Current, Linux)
OpenJDK (Temurin) LTS Linux Build (OpenJDK (Temurin) LTS, Linux)
OpenJDK (Temurin) Current Windows Build (OpenJDK (Temurin) Current, Windows)
OpenJDK (Temurin) LTS Windows Build (OpenJDK (Temurin) LTS, Windows)


The jeucreader package provides an interface for reading Unicode codepoints one at a time.


  • Unicode codepoint reader interface.
  • High coverage test suite.
  • Written in pure Java 17 with no dependencies.
  • OSGi-ready.
  • JPMS-ready.
  • ISC license.


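The Java standard library does not expose an interface for reading individual Unicode codepoints from an I/O stream: java.io.Reader returns UTF-16 code units, so a codepoint outside the Basic Multilingual Plane arrives as two separate char values. The standard library does provide methods to, for example, read text into a String and then iterate over the codepoints of that String.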
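For comparison, the String-based workaround can be sketched using only the JDK. The class and method names here (CodePointsViaString, readAllCodePoints) are hypothetical and not part of jeucreader; this is only an illustration of the approach described above.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration class; not part of jeucreader.
public final class CodePointsViaString {

  // Buffers the entire reader into a String, then returns its codepoints.
  public static List<Integer> readAllCodePoints(Reader r) throws IOException {
    var w = new StringWriter();
    r.transferTo(w); // Reader.transferTo is available since Java 10
    return w.toString().codePoints().boxed().collect(Collectors.toList());
  }

  public static void main(String[] args) throws IOException {
    // "a" followed by U+1F600, which occupies two UTF-16 code units.
    var text = "a\uD83D\uDE00";
    try (var r = new StringReader(text)) {
      for (int cp : readAllCodePoints(r)) {
        System.out.printf("U+%04X%n", cp);
      }
    }
  }
}
```

The drawback of this approach is that the whole input must be held in memory before the first codepoint can be inspected, which does not suit large or incremental input.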

The jeucreader package attempts to provide this missing functionality.


Given a java.io.Reader r, instantiate a UnicodeCharacterReaderType and use it to read individual codepoints:

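// r is any java.io.Reader, obtained elsewhere.
Reader r;

try (var u = UnicodeCharacterReader.newReader(r)) {
  // Each call consumes and returns the next Unicode codepoint.
  int c0 = u.readCodePoint();
  int c1 = u.readCodePoint();
  int c2 = u.readCodePoint();
}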

On consuming malformed text, the reader may raise subtypes of IOException such as InvalidSurrogatePair, MissingLowSurrogate, OrphanLowSurrogate, and others.
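To illustrate the kinds of malformed input those exceptions describe, here is a minimal sketch of decoding UTF-16 surrogate pairs from a Reader using only JDK calls. It is not jeucreader's implementation: the class and method names are hypothetical, it throws plain IOException instead of the library's specific subtypes, and returning -1 at end of stream is an assumption of this sketch.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical illustration class; not part of jeucreader.
public final class SurrogateDecodingSketch {

  // Returns the next codepoint, or -1 at end of stream (sketch assumption).
  public static int nextCodePoint(Reader r) throws IOException {
    int first = r.read();
    if (first == -1) {
      return -1;
    }
    char hi = (char) first;
    if (Character.isLowSurrogate(hi)) {
      // A low surrogate that no high surrogate precedes.
      throw new IOException("Orphaned low surrogate");
    }
    if (!Character.isHighSurrogate(hi)) {
      // An ordinary BMP character is its own codepoint.
      return first;
    }
    int second = r.read();
    if (second == -1) {
      throw new IOException("High surrogate at end of stream");
    }
    char lo = (char) second;
    if (!Character.isLowSurrogate(lo)) {
      throw new IOException("High surrogate not followed by a low surrogate");
    }
    // Combine the pair into a single supplementary codepoint.
    return Character.toCodePoint(hi, lo);
  }

  public static void main(String[] args) throws IOException {
    try (var r = new StringReader("a\uD83D\uDE00")) {
      int cp;
      while ((cp = nextCodePoint(r)) != -1) {
        System.out.printf("U+%04X%n", cp);
      }
    }
  }
}
```

Each of the three failure branches corresponds to a distinct way a UTF-16 stream can be malformed, which is why a codepoint reader benefits from reporting them as separate exception types.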