newLISP parser

cormullion edited this page Sep 18, 2011 · 1 revision

The newLISP parser reads newLISP source code and converts it to a nested list structure, preserving everything it can about the original file. Since newLISP uses lists for code and data, this is quite easy.

To run the parser, load its source file, then call Nlex:parse-newlisp on a string of source text:

(Nlex:parse-newlisp source-text)

The result is a list of pairs, organised into a tree-like structure. For example:

> (Nlex:parse-newlisp "(+ 2 2)")
 (((Nlex:LeftParen "(") 
   (Nlex:Symbol "+") 
   (Nlex:WhiteSpace "IA==") 
   (Nlex:Integer 2)
   (Nlex:WhiteSpace "IA==")
   (Nlex:Integer 2) 
   (Nlex:RightParen ")")))

As you can see, the resulting list is very verbose, and it even preserves formatting by base64-encoding the white space of the original. This is deliberate: the tree can be used for displaying source code as well as for analysing and modifying its structure.
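The white-space payloads are ordinary base64 strings, so newLISP's built-in base64-dec and base64-enc can read and write them. A minimal sketch, using the "IA==" value from the example above (the variable name ws is only for illustration):

```newlisp
; Decode the white-space payload shown in the parsed tree.
(set 'ws (base64-dec "IA=="))
(println (length ws))            ; prints 1
(println (= ws " "))             ; prints true: the payload is one space

; Encoding the same text again gives back the original payload.
(println (= (base64-enc ws) "IA=="))   ; prints true
```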

Once you've generated this tree of code, you can convert it back to its original form using:

> (Nlex:nlx-to-plaintext l)

where l is the tree.

With the source in a tree, you can use newLISP's list indexing functions, along with the powerful match and unify functions, to extract or manipulate data. For example:

(ref-all '(Nlex:Comment *) tree match true)

produces a list of every comment:

  (Nlex:Comment "; more faithful to original format than newLISP's float?") 
  (Nlex:Comment "; try hex first") 
  (Nlex:Comment "; scientific notation if there's an e") 
  (Nlex:Comment "; float?") 
  (Nlex:Comment "; newLISP's float function isn't quite what we want here     ") 
  (Nlex:Comment "; octal, not hex or float? 017 is OK, 019 is read as 10")
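If you only want the comment text, you can map over the pairs that ref-all returns. The sketch below runs without the parser loaded: the tree is written by hand in the same shape as the parser output shown earlier, which is an assumption about how a real tree looks, and the names comments and texts are only for illustration.

```newlisp
; Hand-built tree in the shape of parser output (an assumption, so that
; this snippet runs on its own without loading the parser).
(set 'tree
  '(((Nlex:LeftParen "(")
     (Nlex:Symbol "+")
     (Nlex:WhiteSpace "IA==")
     (Nlex:Integer 2)
     (Nlex:WhiteSpace "IA==")
     (Nlex:Integer 2)
     (Nlex:RightParen ")"))
    (Nlex:WhiteSpace "IA==")
    (Nlex:Comment "; add two numbers")))

; Collect every (Nlex:Comment ...) pair; the final true asks ref-all
; for the matching elements rather than their index vectors.
(set 'comments (ref-all '(Nlex:Comment *) tree match true))

; Keep only the second item of each pair, the comment text.
(set 'texts (map last comments))
(println texts)
```

Leaving out the final true makes ref-all return index vectors instead, which is useful when you want to know where each comment sits in the tree.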

You can use set-ref-all to change or delete every item in a category. For example, to replace each run of white space with a single space:

(set-ref-all '(Nlex:WhiteSpace *) tree (list 'Nlex:WhiteSpace (base64-enc " ")) match)

The resulting code, translated back to plain text, runs as valid newLISP, but is compressed. Or, to remove all comments:

(set-ref-all '(Nlex:Comment *) tree '(Nlex:Comment "") match)

Here, each comment pair is replaced with a comment pair holding an empty string. Alternatively, you could replace every comment pair with nil and filter the nils out later.
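The nil approach could look something like the following sketch. It again uses a hand-built tree in the shape of parser output (an assumption), and strip-nils is a hypothetical helper written for this example, not part of the parser.

```newlisp
; Hand-built tree in the shape of parser output (an assumption).
(set 'tree
  (list
    (list 'Nlex:Symbol "x")
    (list 'Nlex:WhiteSpace (base64-enc "    "))
    (list 'Nlex:Comment "; a note")
    (list (list 'Nlex:Integer 1)
          (list 'Nlex:WhiteSpace (base64-enc "\n\t"))
          (list 'Nlex:Comment "; nested note"))))

; Collapse every run of white space to a single space.
(set-ref-all '(Nlex:WhiteSpace *) tree
             (list 'Nlex:WhiteSpace (base64-enc " ")) match)

; Replace every comment pair with nil.
(set-ref-all '(Nlex:Comment *) tree nil match)

; Hypothetical helper: drop nil elements at every level of nesting.
(define (strip-nils lst)
  (map (fn (x) (if (list? x) (strip-nils x) x))
       (clean nil? lst)))

(set 'tree (strip-nils tree))
(println tree)
```

After this runs, the tree holds no comment pairs, and every white-space pair carries the payload for a single space. Note that set-ref-all changes the list in place, so copy the tree first if you need the original.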