11This directory contains data needed by Bison.
22
3- * Skeletons
4- Bison skeletons: the general shapes of the different parser kinds,
5- that are specialized for specific grammars by the bison program.
3+ # Directory content
4+ ## Skeletons
5+ Bison skeletons: the general shapes of the different parser kinds, that are
6+ specialized for specific grammars by the bison program.
67
78Currently, the supported skeletons are:
89
@@ -22,19 +23,18 @@ Currently, the supported skeletons are:
2223- glr.cc
2324 A Generalized LR C++ parser. Actually a C++ wrapper around glr.c.
2425
25- These skeletons are the only ones supported by the Bison team.
26- Because the interface between skeletons and the bison program is not
27- finished, *we are not bound to it*. In particular, Bison is not
28- mature enough for us to consider that "foreign skeletons" are
29- supported.
26+ These skeletons are the only ones supported by the Bison team. Because the
27+ interface between skeletons and the bison program is not finished, *we are
28+ not bound to it*. In particular, Bison is not mature enough for us to
29+ consider that "foreign skeletons" are supported.
3030
31- * m4sugar
32- This directory contains M4sugar, sort of an extended library for M4,
33- which is used by Bison to instantiate the skeletons.
31+ ## m4sugar
32+ This directory contains M4sugar, sort of an extended library for M4, which
33+ is used by Bison to instantiate the skeletons.
3434
35- * xslt
36- This directory contains XSLT programs that transform Bison's XML output
37- into various formats.
35+ ## xslt
36+ This directory contains XSLT programs that transform Bison's XML output into
37+ various formats.
3838
3939- bison.xsl
4040 A library of routines used by the other XSLT programs.
@@ -48,13 +48,132 @@ into various formats.
4848- xml2xhtml.xsl
4949 Conversion into XHTML.
5050
51+ # Implementation note about the skeletons
52+
53+ "Skeleton" in Bison parlance means "backend": the bison executable feeds a
54+ skeleton with LR tables, facts about the symbols, etc., and the skeleton
55+ generates the output (say parser.cc, parser.hh, location.hh, etc.).
56+ Skeletons are only in charge of generating the parser and its auxiliary
57+ files; they do not generate the XML output, the parser.output reports, or
58+ the graphical rendering.
59+
60+ The bits of information passed from bison to the backend are named
61+ "muscles". Muscles are passed to M4 via its standard input as a set of
62+ m4 definitions. To see them, use `--trace=muscles`.
63+
64+ Except for muscles, whose names are generated by bison, the skeletons have
65+ no constraint at all on the macro names: there is no technical or
66+ theoretical limitation, and as long as you generate the output, you can do
67+ what you want. However, it would be a bad idea if, say, the C and C++
68+ skeletons used different approaches and had completely different
69+ implementations. That would be a maintenance nightmare.
70+
71+ Below, we document some of the macros that we use in several of the
72+ skeletons. If you write a new skeleton, please implement them for your
73+ language. Overall, be sure to follow the same patterns as the existing
74+ skeletons.
75+
76+ ## Symbols
77+
78+ ### `b4_symbol(NUM, FIELD)`
79+ In order to unify the handling of the various aspects of symbols (tag, type
80+ name, whether terminal, etc.), bison.exe defines one macro per (token,
81+ field), where field can be `has_id`, `id`, etc.: see
82+ `prepare_symbols_definitions()` in `src/output.c`.
83+
84+ The macro `b4_symbol(NUM, FIELD)` gives access to the following FIELDS:
85+
86+ - `has_id`: 0 or 1.
87+
88+ Whether the symbol has an id.
89+
90+ - `id`: string
91+ If has_id, the id (prefixed by api.token.prefix if defined), otherwise
92+ defined as empty. Guaranteed to be usable as a C identifier.
93+
94+ - `tag`: string.
95+ A representation of the symbol. Can be 'foo', 'foo.id', '"foo"' etc.
96+
97+ - `user_number`: integer
98+ The external number as used by yylex. Can be ASCII code when a character,
99+ some number chosen by bison, or some user number in the case of
100+ %token FOO <NUM>. Corresponds to yychar in yacc.c.
101+
102+ - `is_token`: 0 or 1
103+ Whether this is a terminal symbol.
104+
105+ - `number`: integer
106+ The internal number (computed from the external number by yytranslate).
107+ Corresponds to yytoken in yacc.c. This is the same number that serves as
108+ key in b4_symbol(NUM, FIELD).
109+
110+ In bison, symbols are first assigned increasing numbers in order of
111+ appearance (but tokens first, then nterms). After grammar reduction,
112+ unused nterms are then renumbered to appear last (i.e., first tokens, then
113+ used nterms and finally unused nterms). This final number NUM is the one
114+ contained in this field, and it is the one used as key in `b4_symbol(NUM,
115+ FIELD)`.
116+
117+ The code of the rule actions, however, is emitted before we know what
118+ symbols are unused, so they use the original numbers. To avoid confusion,
119+ they actually use "orig NUM" instead of just "NUM". bison also emits
120+ definitions for `b4_symbol(orig NUM, number)` that map from original
121+ numbers to the new ones. `b4_symbol` also resolves `orig NUM` for the
122+ other fields, i.e., `b4_symbol(orig 42, tag)` would return the tag of the
123+ symbol whose original number was 42.
124+
125+ - `has_type`: 0, 1
126+ Whether the symbol has a semantic value.
127+
128+ - `type_tag`: string
129+ When api.value.type=union, the generated name for the union member.
130+ yytype_INT etc. for symbols that has_id, otherwise yytype_1 etc.
131+
132+ - `type`
133+ If it has a semantic value, its type tag, or, if variants are used,
134+ its type.
135+ In the case of api.value.type=union, type is the real type (e.g. int).
136+
137+ - `has_printer`: 0, 1
138+ - `printer`: string
139+ - `printer_file`: string
140+ - `printer_line`: integer
141+ If the symbol has a printer, everything about it.
142+
143+ - `has_destructor`, `destructor`, `destructor_file`, `destructor_line`
144+ Likewise.
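
The renumbering described for the `number` field can be modeled with a small sketch. This is not Bison's code: the function `renumber`, its arguments and the toy grammar are hypothetical, and the special symbols that bison adds on its own are left out. It only shows how original numbers (tokens first, then nterms) map to final numbers once unused nterms are moved last, which is the role of `b4_symbol(orig NUM, number)`.

```python
def renumber(symbols, unused):
    """Toy model of symbol numbering (hypothetical helper, not Bison code).

    symbols: list of (name, is_token) pairs in order of appearance.
    unused:  set of nterm names found to be unused by grammar reduction.
    Returns (original order, final order, {original number: final number}).
    """
    tokens = [name for name, is_token in symbols if is_token]
    nterms = [name for name, is_token in symbols if not is_token]

    # Original numbers: order of appearance, but tokens first, then nterms.
    original = tokens + nterms

    # Final numbers: tokens, then used nterms, and finally unused nterms.
    used = [name for name in nterms if name not in unused]
    dropped = [name for name in nterms if name in unused]
    final = tokens + used + dropped

    final_number = {name: num for num, name in enumerate(final)}
    orig_to_final = {num: final_number[name] for num, name in enumerate(original)}
    return original, final, orig_to_final


# Toy grammar: "dead" is an nterm that grammar reduction found unused.
symbols = [("exp", False), ("NUM", True), ("dead", False),
           ("PLUS", True), ("term", False)]
original, final, orig_to_final = renumber(symbols, {"dead"})
print(original)       # ['NUM', 'PLUS', 'exp', 'dead', 'term']
print(final)          # ['NUM', 'PLUS', 'exp', 'term', 'dead']
print(orig_to_final)  # {0: 0, 1: 1, 2: 2, 3: 4, 4: 3}
```

In this model, a rule action that refers to `orig 4` (the nterm `term`) ends up with final number 3, because `dead` was pushed to the end.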
145+
146+ ### `b4_symbol_value(VAL, [SYMBOL-NUM], [TYPE-TAG])`
147+ Expansion of $$, $1, $<TYPE-TAG>3, etc.
148+
149+ The semantic value from a given VAL.
150+ - `VAL`: some semantic value storage (typically a union). e.g., `yylval`
151+ - `SYMBOL-NUM`: the symbol number from which we extract the type tag.
152+ - `TYPE-TAG`: the type tag forced by the user with `<TYPE-TAG>`.
153+
154+ The result can be used safely: it is put in parens to avoid nasty
155+ precedence issues.
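
As a rough illustration of how the three arguments could combine, here is a sketch that models the expansion as string building. It is not the M4 definition used by the skeletons: the function `symbol_value`, the `symbols` table and the member names `ival` and `fval` are hypothetical. It also assumes that an explicit `TYPE-TAG` takes precedence over the type looked up from `SYMBOL-NUM`, and that the storage is returned unchanged when no type is known.

```python
def symbol_value(val, symbols, symbol_num=None, type_tag=None):
    """Model of a b4_symbol_value-like expansion (hypothetical, not Bison code).

    val:        name of the semantic value storage, e.g. "yylval".
    symbols:    {symbol number: {"has_type": 0 or 1, "type": str}}.
    symbol_num: symbol whose type tag selects the member, if any.
    type_tag:   tag forced by the user, as in $<TYPE-TAG>3.
    """
    if type_tag:
        # Assumption: a user-forced tag wins over the symbol's own type.
        return f"({val}.{type_tag})"
    if symbol_num is not None:
        sym = symbols[symbol_num]
        if sym["has_type"]:
            # Parenthesized so the result is safe inside larger expressions.
            return f"({val}.{sym['type']})"
    # Assumption: without any type information, the storage is used as is.
    return val


symbols = {
    3: {"has_type": 1, "type": "ival"},
    4: {"has_type": 0, "type": ""},
}
print(symbol_value("yylval", symbols, 3))          # (yylval.ival)
print(symbol_value("yylval", symbols, 3, "fval"))  # (yylval.fval)
print(symbol_value("yylval", symbols, 4))          # yylval
```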
156+
157+ ### `b4_lhs_value(SYMBOL-NUM, [TYPE])`
158+ Expansion of `$$` or `$<TYPE>$`, for symbol `SYMBOL-NUM`.
159+
160+ ### `b4_rhs_data(RULE-LENGTH, POS)`
161+ The data corresponding to the symbol `#POS`, where the current rule has
162+ `RULE-LENGTH` symbols on RHS.
163+
164+ ### `b4_rhs_value(RULE-LENGTH, POS, SYMBOL-NUM, [TYPE])`
165+ Expansion of `$<TYPE>POS`, where the current rule has `RULE-LENGTH` symbols
166+ on RHS.
167+
51168-----
52169
53170Local Variables:
54- mode: outline
171+ mode: markdown
172+ fill-column: 76
173+ ispell-dictionary: "american"
55174End:
56175
57- Copyright (C) 2002, 2008-2015, 2018 Free Software Foundation, Inc.
176+ Copyright (C) 2002, 2008-2015, 2018-2019 Free Software Foundation, Inc.
58177
59178This file is part of GNU Bison.
60179