A UTF-8 Encoding Validator.

A UTF-8 Validator State Machine

Provides a state machine implementation for validating UTF-8 encoded strings. Clients may request that encoding errors be reported in one of two ways:

  • a simple true / false indicator

  • a raised exception
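
The two reporting styles can be illustrated with a minimal, self-contained sketch. Everything here (the `Utf8Sketch` module, the `InvalidEncoding` class, the `valid?` method and its `raise_on_error` flag) is a hypothetical name chosen for illustration and is not this gem's API. The transition table follows the well-formed byte ranges published in the Unicode Standard, which may differ in detail from the algorithm this gem actually uses.

```ruby
# Illustrative sketch only: all names are hypothetical, not the gem's API.
module Utf8Sketch
  class InvalidEncoding < StandardError; end

  # For each state, the byte ranges that are legal next and the state
  # they lead to. :start means "at a character boundary".
  TRANSITIONS = {
    :start => [
      [0x00..0x7F, :start],  # ASCII
      [0xC2..0xDF, :tail1],  # 2-byte lead (0xC0/0xC1 would be overlong)
      [0xE0..0xE0, :e0],     # 3-byte lead, restricted second byte
      [0xE1..0xEC, :tail2],
      [0xED..0xED, :ed],     # 3-byte lead, must avoid surrogates
      [0xEE..0xEF, :tail2],
      [0xF0..0xF0, :f0],     # 4-byte lead, restricted second byte
      [0xF1..0xF3, :tail3],
      [0xF4..0xF4, :f4]      # 4-byte lead, must stay <= U+10FFFF
    ],
    :tail1 => [[0x80..0xBF, :start]],
    :tail2 => [[0x80..0xBF, :tail1]],
    :tail3 => [[0x80..0xBF, :tail2]],
    :e0    => [[0xA0..0xBF, :tail1]],
    :ed    => [[0x80..0x9F, :tail1]],
    :f0    => [[0x90..0xBF, :tail2]],
    :f4    => [[0x80..0x8F, :tail2]]
  }

  # Returns true or false by default; raises InvalidEncoding instead of
  # returning false when raise_on_error is true.
  def self.valid?(string, raise_on_error = false)
    state = :start
    index = 0
    string.each_byte do |byte|
      pair = TRANSITIONS[state].find { |range, _| range.include?(byte) }
      if pair.nil?
        if raise_on_error
          raise InvalidEncoding, "invalid byte 0x%02X at offset %d" % [byte, index]
        end
        return false
      end
      state = pair[1]
      index += 1
    end
    return true if state == :start
    raise InvalidEncoding, "truncated sequence at end of input" if raise_on_error
    false
  end
end

Utf8Sketch.valid?("abc")        # => true
Utf8Sketch.valid?("\xC0\x80")   # => false (overlong encoding of NUL)
```

Passing `true` as the second argument switches the same check from returning `false` to raising, so a caller can choose whichever style fits its error handling.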

What This Gem Does Not Provide

  • UTF-8 Encoding

  • UTF-8 Decoding

That functionality is left as an exercise for the reader.

Thanks To

The Unicode Consortium

At unicode.org/ for all the information published there.

Frank Yung-Fong Tang

For the state machine algorithm. See: unicode.org/mail-arch/unicode-ml/y2003-m02/att-0467/01-The_Algorithm_to_Valide_an_UTF-8_String

Markus Kuhn

For invalid test data. www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt

Useful Information

Other interesting and/or useful information can be found:

A Word On Ruby Versions

It is expected that this validator will be used in Ruby environments prior to 1.9.x. However, nothing prohibits use with Ruby 1.9, 2.0, or 2.1. The tests detect these environments and adjust their behavior accordingly.

Testing for 1.8.x done using:

  • 1.8.5_231

  • 1.8.6_383

  • 1.8.7_299

Testing for 1.9.x done using a variety of versions, including:

  • 1.9.1p378

  • 1.9.1p431

  • 1.9.2p188

  • 1.9.3p358

and others.

Testing for 2.0.x done using:

  • 2.0.0p0

Testing for 2.1.x done using:

  • 2.1.1p76
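
As a rough illustration of how code can adapt to the Ruby version it runs on, the sketch below uses feature detection rather than comparing version strings. The helper names `encoding_aware_strings?` and `builtin_check` are hypothetical and are not taken from this gem's test suite. `String#force_encoding` and `String#valid_encoding?` are core methods available from Ruby 1.9 onward.

```ruby
# Illustrative sketch only: helper names are hypothetical.

# Ruby 1.9+ strings carry an encoding; 1.8 strings are plain byte arrays
# and do not respond to valid_encoding?.
def encoding_aware_strings?
  "".respond_to?(:valid_encoding?)
end

# On 1.9+, ask the interpreter whether the bytes are valid UTF-8.
# On 1.8, return nil to signal that no built-in check is available.
def builtin_check(bytes)
  return nil unless encoding_aware_strings?
  bytes.dup.force_encoding("UTF-8").valid_encoding?
end
```

On an encoding-aware Ruby, a test could compare a validator's result against `builtin_check` for the same input and skip that comparison when it returns `nil`.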

Reporting Issues

Please report issues on the tracker at GitHub:

Web Based Documentation

Human readable documentation can be found at:

Contributing to the utf8_validator gem

  • Check out the latest master to make sure the feature hasn't been implemented or the bug hasn't been fixed yet.

  • Check out the issue tracker to make sure someone hasn't already requested it and/or contributed it.

  • Fork the project.

  • Start a feature/bugfix branch.

  • Commit and push until you are happy with your contribution.

  • Make sure to add tests for it. This is important so it does not break unintentionally in a future version.

  • Please try not to modify the Rakefile or VERSION file. If you require your own version, please isolate the version update to its own commit so cherry-pick or rebase can be used to skip it.

  • Request a pull.

Copyright

Copyright © 2011-2014 Guy Allard. See LICENSE.txt for further details.
