Syntax is a Ruby library for simple syntax highlighting.

A syntax highlighting library for Ruby.

This fork is maintained, and version 1.1.0 has been published from it. However, little or no new development is currently happening here, and the original author, @jamis, recommends using CodeRay instead of this library.


This is a simple syntax highlighting library for Ruby. It is a naive syntax analysis tool, meaning that it does not “understand” the syntaxes of the languages it processes, but merely does some semi-intelligent pattern matching.


There are primarily two uses for the Syntax library:

  • Convert text from a supported syntax to a supported highlight format (like HTML).

  • Tokenize text in a supported syntax and process the tokens directly.

Highlighting a supported syntax

require 'syntax/convertors/html'

convertor = Syntax::Convertors::HTML.for_syntax "ruby"
puts convertor.convert( File.read( "file.rb" ) )

The above snippet will emit HTML, using spans and CSS to indicate the different highlight “groups”. (Sample CSS files are included in the “data” directory.)
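Because the convertor emits an HTML fragment rather than a full page, the fragment still needs a stylesheet around it. The following is a minimal sketch of one way to do that, not part of Syntax itself: `STYLES`, `stylesheet`, and `wrap_page` are hypothetical names, the wrapping `div` is an addition of this sketch, and the class names (`keyword`, `comment`, `string`) are an assumption that the span classes match the lexical group names. Check the sample CSS files for the names actually used.

```ruby
# Assumed group names and colors, for illustration only; the sample CSS
# files shipped with the library are the authoritative source.
STYLES = {
  "keyword" => "color: #a00; font-weight: bold",
  "comment" => "color: #888; font-style: italic",
  "string"  => "color: #070"
}.freeze

# Builds one CSS rule per group, scoped under a wrapper class named
# after the syntax (for example ".ruby .keyword").
def stylesheet(syntax, styles = STYLES)
  styles.map { |group, rule| ".#{syntax} .#{group} { #{rule} }" }.join("\n")
end

# Wraps an already-highlighted HTML fragment in a minimal page.
def wrap_page(fragment, syntax: "ruby", styles: STYLES)
  <<~HTML
    <!DOCTYPE html>
    <html>
    <head>
    <style>
    #{stylesheet(syntax, styles)}
    </style>
    </head>
    <body>
    <div class="#{syntax}">
    #{fragment}
    </div>
    </body>
    </html>
  HTML
end
```

With the library installed, the string returned by `convertor.convert` would be passed as `fragment`, and the result written to a file with `File.write`.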

Tokenize text

require 'syntax'

tokenizer = Syntax.load "ruby"
tokenizer.tokenize( File.read( "file.rb" ) ) do |token|
  puts "group(#{token.group}, #{token.instruction}) lexeme(#{token})"
end

Tokenizing is a straightforward process. Each time the tokenizer discovers a new token, it yields that token to the given block.

  • token.group is the lexical group to which the token belongs. Each supported syntax may have its own set of lexical groups.

  • token.instruction is an instruction used to determine how this token should be treated. It will be :none for normal tokens, :region_open if the token starts a nested region, and :region_close if it closes the last opened region.

  • token is itself a subclass of String, so you can use it just as you would a string. It represents the lexeme that was actually parsed.
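The attributes described above can be tried out without the library installed. The following is a minimal sketch, assuming tokens behave as described: a String subclass with `group` and `instruction` readers. `StubToken`, `summarize`, and the sample token list are hypothetical stand-ins, not part of Syntax, and the group names in the sample are made up for illustration.

```ruby
# Hypothetical stand-in for the tokens the tokenizer yields: a String
# subclass carrying a lexical group and an instruction.
class StubToken < String
  attr_reader :group, :instruction

  def initialize(text, group, instruction = :none)
    super(text)
    @group = group
    @instruction = instruction
  end
end

# Counts normal tokens per group and tracks how deeply regions nest.
def summarize(tokens)
  counts = Hash.new(0)
  depth = 0
  max_depth = 0

  tokens.each do |token|
    case token.instruction
    when :region_open
      depth += 1
      max_depth = depth if depth > max_depth
    when :region_close
      depth -= 1
    else
      counts[token.group] += 1
    end
  end

  { counts: counts, max_depth: max_depth, balanced: depth.zero? }
end

# Made-up token stream, roughly what a string literal might produce.
SAMPLE_TOKENS = [
  StubToken.new("puts", :ident),
  StubToken.new(" ", :normal),
  StubToken.new("\"", :punct),
  StubToken.new("", :string, :region_open),
  StubToken.new("hi", :string),
  StubToken.new("", :string, :region_close),
  StubToken.new("\"", :punct)
]

summary = summarize(SAMPLE_TOKENS)
puts summary[:counts].inspect
puts "max depth: #{summary[:max_depth]}, balanced: #{summary[:balanced]}"
```

With the real tokenizer, the same `summarize` method could be fed an array collected inside the `tokenize` block, provided the yielded tokens expose `group` and `instruction` as described.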
