<!DOCTYPE html>
<html>
<head>
<meta content="text/html;charset=UTF-8" http-equiv="Content-type" />
<title>parslet - Overview</title>
<meta content="Kaspar Schiess (http://absurd.li)" name="author" />
<link href="images/favicon3.ico" rel="shortcut icon" />
<link href="/parslet/stylesheets/site.css" rel="stylesheet" /><link href="/parslet/stylesheets/sh_whitengrey.css" rel="stylesheet" /><script src="http://code.jquery.com/jquery-2.1.4.min.js"></script><script src="/parslet/javascripts/toc.js"></script><script src="/parslet/javascripts/sh_main.min.js"></script><script src="/parslet/javascripts/sh_ruby.min.js"></script>
</head>
<body class="code" onload="sh_highlightDocument(); $('#toc').toc({selectors: 'h2'});">
<div id="everything">
<div class="main_menu">
<img src="/parslet/images/parsley_logo.png" alt="Parslet Logo" />
<ul>
<li>
<a href="/parslet/">about</a>
</li>
<li>
<a href="/parslet/get-started.html">get started</a>
</li>
<li>
<a href="/parslet/install.html">install</a>
</li>
<li>
<a href="/parslet/documentation.html">docs</a>
</li>
<li>
<a href="/parslet/contribute.html">contribute</a>
</li>
<li>
<a href="/parslet/projects.html">projects</a>
</li>
</ul>
</div>
<div class="content">
<h1>
Overview
</h1>
<p>Parslet is a library with a clear philosophy: it makes parser writing easy and
testable. On top of that, it provides understandable error messages (rather than
the likes of “General Protection Fault”) to you, the language writer. By extension,
you will hopefully manage to provide good error messages to your users. Together, we
can create a better world!</p>
<p>Traditional texts on the subject will have you write a compiler or interpreter
for a language in several stages:</p>
<ul>
<li>Parsing or Lexing/Parsing</li>
<li>Abstract Syntax Tree construction</li>
<li>Optimization and checking of the tree</li>
<li>Generation of code / Execution</li>
</ul>
<p>This library will be good for the first two stages only. After that, you’ll
be on your own.</p>
<p>The parsing step has been implemented by hundreds, if not thousands, of
clever people; there is no lack of
<a href="http://jeffreykegler.github.io/Ocean-of-Awareness-blog/individual/2016/08/timeline2.html">alternatives</a>.
Even in Ruby, you’ll have the choice among a handful of libraries. There’s a
distinction to make on this level as well:</p>
<p><strong>LR(k) parsers and related fields</strong> Parsers in this class use a lexical analyzer
(a <em>lexer</em>) to transform the input text into tokens (<em>tokenizing</em>). They
then check whether your stream of tokens has a corresponding tree that conforms to
the grammar. Most of these parsers allow grammars that are ambiguous and will
provide a mechanism for resolving the arising ambiguities. These are the
earliest parser generators and the most widely used (<em>yacc</em>, <em>bison</em>, …);
chances are, you’ll have more than one of these libraries installed on your
system.</p>
<p><strong><span class="caps">PEG</span> or packrat parsers</strong> Parsers in this class are based on a slightly more
modern algorithm, which mirrors what one does when writing a parser by hand
in top-down fashion. This is what we programmers do all the time, methods calling
methods – and it’s also (roughly) what these parsers do to recognize input.
Left recursion is impossible to express in these grammars. No lexers are
required – lexical tokenizing and parsing are one single step. Ruby has
several implementations of this algorithm in library form:
<a href="http://github.com/nathansobo/treetop">Treetop</a>,
<a href="http://github.com/mjijackson/citrus">Citrus</a>,
<a href="http://wiki.github.com/luikore/rsec/">rsec</a> and of course
<a href="http://kschiess.github.com/parslet">Parslet</a>.</p>
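<p>To make the “methods calling methods” idea concrete, here is a minimal sketch of a
hand-written top-down parser in plain Ruby. It is not how any of the libraries above
are implemented; the class name <code>SumParser</code> and the tiny grammar for sums
are made up for illustration. It uses only <code>StringScanner</code> from the standard
library, and it evaluates the sum while recognizing it to keep the example short.</p>

```ruby
require 'strscan'

# Hypothetical hand-written top-down parser for sums such as "1+2+3".
# Each grammar rule is one method; rules call other rules as plain methods.
class SumParser
  def initialize(text)
    @scanner = StringScanner.new(text)
  end

  # Rule: sum := integer ('+' integer)*
  # A left-recursive formulation (sum := sum '+' integer) would make this
  # method call itself before consuming any input and never return, so the
  # rule is written with repetition instead.
  def sum
    total = integer
    while @scanner.scan(/\+/)
      total += integer
    end
    raise ArgumentError, "unexpected input at #{@scanner.pos}" unless @scanner.eos?
    total
  end

  # Rule: integer := [0-9]+
  # Characters are matched directly; there is no separate lexer stage.
  def integer
    digits = @scanner.scan(/[0-9]+/)
    raise ArgumentError, "expected a digit at #{@scanner.pos}" unless digits
    Integer(digits, 10)
  end
end

SumParser.new('1+2+3').sum # => 6
```

<p>Note how tokenizing and parsing happen in the same methods: the <code>integer</code>
rule reads characters itself instead of receiving tokens from a lexer.</p>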
<p>All of these generators are different in small ways; yet most implement common
patterns and provide almost identical APIs. We believe this is an error:
choice is good, but there should be visible attributes distinguishing the
choices.</p>
<p>Parslet is not like the others; in fact, it is radically different in some
key respects.</p>
<h2>Writing a language</h2>
<p>Whether you write a language for a configuration file or a new computer
language (the holy grail), the steps are always the same:</p>
<ol>
<li>Create a grammar: <em>What should be legal syntax?</em></li>
<li>Annotate the grammar: <em>What is important data?</em></li>
<li>Create a transformation: <em>How do I want to work with that data?</em></li>
</ol>
<p>The creation of grammars and the associated concepts are covered
in <a href="parser.html">Parslet::Parser</a>.</p>
<p>Transformation of the resulting intermediary tree is covered in
<a href="transform.html">Parslet::Transform</a>.</p>
</div>
<div class="copyright">
<p><span class="caps">MIT</span> License, 2010-2018, © <a href="http://absurd.li">Kaspar Schiess</a><br/></p>
</div>
<script>
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-16365074-2']);
_gaq.push(['_trackPageview']);
(function() {
var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
})();
</script>
</div>
</body>
</html>