Syntax is a Ruby library for simple syntax highlighting.



A syntax highlighting library for Ruby.


This fork is maintained, and version 1.1.0 has been published from it. However, there is currently little or no new development here, and the original author, @jamis, recommends using CodeRay over this library.


This is a simple syntax highlighting library for Ruby. It is a naive syntax analysis tool, meaning that it does not “understand” the syntaxes of the languages it processes, but merely does some semi-intelligent pattern matching.
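To make "semi-intelligent pattern matching" concrete, here is a minimal sketch of the general idea using Ruby's standard StringScanner. It is not the library's actual implementation: the RULES table, the group names, and the naive_tokens helper are all hypothetical and cover only a tiny Ruby-like subset.

```ruby
require 'strscan'

# Hypothetical rules: [group, pattern] pairs tried in order at the current
# scan position. Nothing here is taken from the Syntax library itself.
RULES = [
  [:comment, /#.*/],
  [:string,  /"(?:\\.|[^"\\])*"/],
  [:keyword, /(?:def|end|if|else|return)\b/],
  [:number,  /\d+/],
  [:ident,   /[A-Za-z_]\w*/],
  [:space,   /\s+/]
].freeze

# Split text into [group, lexeme] pairs by pattern matching alone.
# No grammar is consulted, which is what makes the approach "naive".
def naive_tokens(text)
  scanner = StringScanner.new(text)
  tokens = []
  until scanner.eos?
    # scan advances the scanner only when the pattern matches here.
    rule = RULES.find { |_, pattern| scanner.scan(pattern) }
    if rule
      tokens << [rule.first, scanner.matched]
    else
      # Unrecognized character: emit it as plain text and move on.
      tokens << [:normal, scanner.getch]
    end
  end
  tokens
end

naive_tokens('def x # hi').each do |group, lexeme|
  puts "#{group}: #{lexeme.inspect}"
end
```

Because the rules are only regular expressions, constructs that need real parsing (string interpolation or heredocs, for example) are handled approximately at best, which is the trade-off the paragraph above describes.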


There are primarily two uses for the Syntax library:

  • Convert text from a supported syntax to a supported highlight format (like HTML).

  • Tokenize text in a supported syntax and process the tokens directly.

Highlighting a supported syntax

require 'syntax/convertors/html'

convertor = Syntax::Convertors::HTML.for_syntax "ruby"
puts convertor.convert( File.read( "file.rb" ) )

The above snippet will emit HTML, using spans and CSS to indicate the different highlight “groups”. (Sample CSS files are included in the “data” directory.)

Tokenize text

require 'syntax'

tokenizer = Syntax.load "ruby"
tokenizer.tokenize( File.read( "file.rb" ) ) do |token|
  puts "group(#{token.group}, #{token.instruction}) lexeme(#{token})"
end

Tokenizing is a straightforward process. Each time a new token is discovered by the tokenizer, it is yielded to the given block.

  • token.group is the lexical group to which the token belongs. Each supported syntax may have its own set of lexical groups.

  • token.instruction is an instruction used to determine how this token should be treated. It will be :none for normal tokens, :region_open if the token starts a nested region, and :region_close if it closes the last opened region.

  • token is itself a subclass of String, so you can use it just as you would a string. It represents the lexeme that was actually parsed.
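As a rough illustration of processing tokens directly, the sketch below counts tokens per lexical group and tracks how deeply regions nest. So that it runs without the gem installed, it uses a hypothetical stand-in class, SketchToken, which models only the properties listed above. The sample tokens and their group names are made up for illustration and are not real tokenizer output.

```ruby
# Hypothetical stand-in for the library's token: a String subclass that
# carries a lexical group and an instruction, as described in the list above.
class SketchToken < String
  attr_reader :group, :instruction

  def initialize(text, group, instruction = :none)
    super(text)
    @group = group
    @instruction = instruction
  end
end

# Count normal tokens per group and record the deepest region nesting.
def summarize(tokens)
  counts = Hash.new(0)
  depth = 0
  max_depth = 0
  tokens.each do |token|
    case token.instruction
    when :region_open
      depth += 1
      max_depth = depth if depth > max_depth
    when :region_close
      depth -= 1
    else
      counts[token.group] += 1
    end
  end
  { counts: counts, max_depth: max_depth }
end

# Made-up sample stream: an identifier followed by a string region.
sample = [
  SketchToken.new("puts", :ident),
  SketchToken.new("", :string, :region_open),
  SketchToken.new("hi", :string),
  SketchToken.new("", :string, :region_close)
]

summary = summarize(sample)
puts summary[:counts].inspect
puts summary[:max_depth]
```

With the real library, the same summarize logic could presumably be driven from inside the tokenize block instead of from an array, since each yielded token exposes the same properties.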