forked from westes/flex
Start a HOWTO on writing target-language back ends.
#8 in the retargeting patch series
1 parent
673f2ca
commit e58cdd1
Showing
2 changed files
with
77 additions
and
1 deletion.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,75 @@ | ||
= How to add support for a new language to flex | ||
|
||
= Theory | ||
|
||
The flex code was historically written to generate parsers in C, but | ||
it has been factored to isolate knowledge of the specifics of each | ||
target language from the logic for building the lexer state tables | ||
as much as possible. | ||
|
||
The only assumption that is absolutely baked into all of flex is that | ||
the bodies of initializers for arrays of integers consist of decimal | ||
numeric literals separated by commas (and optional whitespace). | ||
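
To make that assumption concrete, here is a minimal sketch in C of a helper that produces such an initializer body. The function name format_initializer_body and its signature are invented for illustration and are not part of flex; the point is only that the body itself is language-neutral, while the declaration wrapped around it belongs to the back end.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, NOT part of flex: writes the body of an
 * integer-array initializer (decimal literals separated by ", ")
 * into buf. Returns the number of characters written, or -1 if
 * buf is too small to hold the whole body. */
int format_initializer_body(const int *values, size_t count,
                            char *buf, size_t bufsize)
{
    size_t used = 0;

    assert(buf != NULL && bufsize > 0);
    buf[0] = '\0';
    for (size_t i = 0; i < count; i++) {
        /* Every element after the first is preceded by a separator. */
        int n = snprintf(buf + used, bufsize - used, "%s%d",
                         i == 0 ? "" : ", ", values[i]);
        if (n < 0 || (size_t)n >= bufsize - used)
            return -1;
        used += (size_t)n;
    }
    return (int)used;
}
```

A C back end would wrap the resulting body in something like "static const int name[] = { ... };", while a back end for another language would supply its own opening and closing text around the same body.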
|
||
Otherwise, knowledge of each target language's syntax lives in two | ||
places: (1) a table of language-specific syntax-generator methods, | ||
and (2) a language-specific skeleton file. | ||
|
||
For example, the C and C++ back end consists of methods in a source | ||
file named cpp_backend.c (so named because both languages use the C | ||
preprocessor) and a skeleton file named cpp-flex.skl. | ||
|
||
Syntactically C-like languages such as Go, Rust, and Java should be easy | ||
targets. Almost anything generally descended from Algol shouldn't be | ||
much more difficult; this certainly includes the whole | ||
Pascal/Modula/Oberon family. | ||
|
||
= Writing a new backend | ||
|
||
All the code that accesses language-specific code generators goes | ||
through a global pointer named "backend" to a method table. The | ||
results of these generators are used to fill in some parts of the | ||
language-specific skeleton file and to conditionalize others. | ||
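
The method-table pattern described above can be sketched in C roughly as follows. This is an illustration under stated assumptions, not the real interface: struct demo_backend and its members table_open and table_close are invented here, and the actual struct backend_t in src/flexdefs.h has its own members. Only the idea of a global pointer named "backend" comes from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical method table; the real struct backend_t differs. */
struct demo_backend {
    const char *name;
    /* Formats the opening of an integer table declaration into out. */
    int (*table_open)(char *out, size_t outsize, const char *table_name);
    /* Text that closes a table declaration. */
    const char *table_close;
};

static int c_table_open(char *out, size_t outsize, const char *table_name)
{
    return snprintf(out, outsize, "static const int %s[] = {", table_name);
}

static int go_table_open(char *out, size_t outsize, const char *table_name)
{
    return snprintf(out, outsize, "var %s = [...]int{", table_name);
}

const struct demo_backend demo_c_backend = { "c", c_table_open, "};" };
const struct demo_backend demo_go_backend = { "go", go_table_open, "}" };

/* Generic code reaches every generator through this one pointer. */
const struct demo_backend *backend = &demo_c_backend;

/* Language-neutral caller: it knows the initializer body but not the
 * syntax wrapped around it. Returns the length written, or -1. */
int emit_table(char *out, size_t outsize, const char *table_name,
               const char *body)
{
    char open[128];
    int n;

    assert(backend != NULL);
    n = backend->table_open(open, sizeof open, table_name);
    if (n < 0 || (size_t)n >= sizeof open)
        return -1;
    return snprintf(out, outsize, "%s %s %s", open, body,
                    backend->table_close);
}
```

Pointing "backend" at a different table changes the emitted syntax without touching emit_table, which is the property that keeps the table-building logic independent of the target language.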
|
||
Read the definition of struct backend_t in src/flexdefs.h, and the | ||
attached comments, to get a feel for the methods. Don't worry | ||
about understanding table generator names at first. | ||
|
||
To write support for a language, you'll want to do the following | ||
steps: | ||
|
||
1. Clone one of the existing back-end/skeleton pairs. If the language | ||
you are supporting is named "foo", you should create files named | ||
foo_backend.c and foo-flex.skl. | ||
|
||
2. Add foo_backend.c to COMMON_SOURCES in src/Makefile.am. Add the | ||
name of your skeleton file to EXTRA_DIST. | ||
|
||
3. Add a production to src/Makefile.am parallel to the one that | ||
produces cpp-skel.h. Your objective is to make a string list | ||
initializer from your skeleton file that can be linked with flex | ||
and is pointed at by the skel member of your language back end. | ||
|
||
4. Add some logic to main.c that enables the new back end with a | ||
new command-line option. Following this step you should be | ||
able to run flex on a specification and get code out in the | ||
language of whatever back end you cloned. | ||
|
||
5. The interesting part: mutate your new back end and skeleton so they | ||
produce code in your desired target language. | ||
|
||
6. Write a test suite for your back end. You should be able to clone | ||
one of the existing sets of test loads to get good coverage. Note | ||
that it is highly unlikely your back end will be accepted into the | ||
flex distribution without a test suite. | ||
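
Steps 3 and 4 can be pictured with the following C sketch. It assumes, without confirmation from the flex sources, that the generated skeleton header expands to a NULL-terminated list of string literals, and that the back end is chosen by name. The names demo_backend, demo_backends, select_backend and skel_line_count are all hypothetical, and the skeleton lines are placeholders.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for what generated headers such as cpp-skel.h or a new
 * foo-skel.h might expand to: one string per skeleton line, with a
 * NULL terminator (an assumption made for this sketch). */
static const char *cpp_skel[] = {
    "/* placeholder cpp skeleton line */",
    NULL
};

static const char *foo_skel[] = {
    "-- placeholder foo skeleton line 1",
    "-- placeholder foo skeleton line 2",
    NULL
};

/* Hypothetical reduced back end: just a name and a skel member. */
struct demo_backend {
    const char *name;
    const char **skel;
};

static const struct demo_backend demo_backends[] = {
    { "cpp", cpp_skel },
    { "foo", foo_skel },
};

/* Returns the back end whose name matches the option value,
 * or NULL if no such back end is registered. */
const struct demo_backend *select_backend(const char *name)
{
    size_t count = sizeof demo_backends / sizeof demo_backends[0];

    assert(name != NULL);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(demo_backends[i].name, name) == 0)
            return &demo_backends[i];
    }
    return NULL;
}

/* Counts skeleton lines up to the NULL terminator. */
size_t skel_line_count(const struct demo_backend *b)
{
    size_t n = 0;

    assert(b != NULL && b->skel != NULL);
    while (b->skel[n] != NULL)
        n++;
    return n;
}
```

In this picture, the option-handling code added to main.c would pass the option's value to something like select_backend and store the result in the global "backend" pointer. How main.c actually parses its options is not shown here.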
|
||
A hint about step 5: | ||
|
||
* Don't bother supporting non-reentrant parser generation. | ||
The interface of original lex with all those globals hanging out | ||
needs to be supported in C for backwards compatibility, but | ||
there | ||
|
||
|
||
|
||
|