An independent OCaml lexer, extracted from OCaml/Camlp4

Camllexer is an enhanced lexer for Caml dialects.

The lexer has been extracted from the Camlp4 (> 3.10) lexer, which in turn was reimplemented as a derivative of the lexer from the OCaml compiler.

This lexer has the following particularities:

  • Correct and complete: as far as testing has gone (~800_000 distinct lines over ~3_000_000 lines of Caml-like files).
  • Supports most Caml dialects:
    • By reusing the lexer of Camlp4, this lexer works on any extension of the OCaml language made with Camlp4. In particular, it supports quotations and anti-quotations.
    • Works fine on lexers and parsers (ocamllex, ocamlyacc), except when they use C-style comments.
  • Lossless: every single bit of the input file is kept. Blanks, comments, newlines, lexical conventions for writing literals, all of it is kept in the returned token stream. Undesired information can easily be thrown out of the stream.
  • Keyword independent: there is no token for keywords. It is up to you to cast some LIDENTs and some SYMBOLs into proper keyword tokens of your own token type.
  • Fault tolerant: errors are part of the token stream, which makes it possible to write fault-tolerant translations.
  • Flexible warnings: the lexer warns about some corner cases of the lexical conventions that the user might want to avoid. Again, warnings are part of the token stream, so you can easily control everything.
  • A simple lexer program is provided. It enables quick debugging and simple stream editing using Unix pipelines!
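
The lossless and keyword-independent points above can be illustrated with a minimal, self-contained sketch. The `token` type below is a simplified stand-in invented for this example; it is not the actual type exported by `Camltoken`, and the names `my_token`, `keywords`, `classify`, `significant` and `to_string` are likewise hypothetical. The sketch only shows the general idea: layout tokens stay in the stream until you filter them out, and keywords are obtained by casting some identifiers and symbols into your own token type.

```ocaml
(* Hypothetical, simplified token type: NOT the real Camltoken type. *)
type token =
  | LIDENT of string   (* lowercase identifier *)
  | UIDENT of string   (* uppercase identifier *)
  | SYMBOL of string   (* operators and punctuation *)
  | INT of string      (* literal kept exactly as written *)
  | BLANKS of string   (* spaces and tabs *)
  | NEWLINE
  | COMMENT of string  (* comment body, without the delimiters *)
  | ERROR of string    (* assumption: carries the offending source text *)

(* Your own token type, which adds the keywords the lexer does not know. *)
type my_token =
  | Kwd of string
  | Tok of token

(* An illustrative keyword table; pick whatever your dialect needs. *)
let keywords = ["let"; "in"; "if"; "then"; "else"; "fun"; "="; "->"]

(* Cast some LIDENTs and SYMBOLs into keyword tokens. *)
let classify = function
  | (LIDENT s | SYMBOL s) when List.mem s keywords -> Kwd s
  | t -> Tok t

(* Throw undesired layout information out of the stream. *)
let is_layout = function
  | BLANKS _ | NEWLINE | COMMENT _ -> true
  | _ -> false

let significant toks = List.filter (fun t -> not (is_layout t)) toks

(* Lossless: concatenating the text of every token rebuilds the input. *)
let to_string = function
  | LIDENT s | UIDENT s | SYMBOL s | INT s | BLANKS s | ERROR s -> s
  | NEWLINE -> "\n"
  | COMMENT s -> "(*" ^ s ^ "*)"

(* Hand-written stream for the input "let x = 1 (* one *)\n". *)
let example =
  [ LIDENT "let"; BLANKS " "; LIDENT "x"; BLANKS " "; SYMBOL "=";
    BLANKS " "; INT "1"; BLANKS " "; COMMENT " one "; NEWLINE ]

let rebuilt = String.concat "" (List.map to_string example)
let parsed = List.map classify (significant example)
```

Here `rebuilt` is the original input text, while `parsed` holds only the significant tokens, with `let` and `=` promoted to `Kwd`. Because errors are ordinary tokens in such a design, a translation can pass an `ERROR` through unchanged instead of aborting.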