If you just want to build the variant of Racket that runs on Chez
Scheme, then you probably meant to read "./c/README.txt" instead of
this file.
If you're working on the implementation of Racket-on-Chez, then it's
more convenient to work in this directory, so keep reading here.
Requirements
------------
* Chez Scheme --- for now, use a fork at
https://github.com/mflatt/ChezScheme
but we will eventually return to the current development version
from
https://github.com/cisco/ChezScheme
If this build of Chez Scheme is not installed so that plain
`scheme` on the command line runs your installation, you can use
`make SCHEME=...` to set the command for `scheme`.
* Racket --- a recent version
By default, `make` will use the enclosing Racket build. Go back to
the root of this repository/distribution and build so that at least
the "compiler-lib" package is installed, either with just `make`
(for a full build) or with
make PKGS="compiler-lib"
Note that if you build as described in "./c/README.txt", then you
don't need the "compiler-lib" package.
If you'd like to use an existing installation of Racket, instead,
you can use `make RACKET=...` to set the command for `racket`.
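As a rough sketch of combining the two overrides above, the following shell function passes both `SCHEME` and `RACKET` to `make` in one invocation. The install paths and the function name `build_racket_on_chez` are hypothetical placeholders, and it is an assumption that the makefile accepts both variables together (the text above documents each one separately).

```shell
# Hypothetical install locations (assumptions); replace with your own.
SCHEME_BIN="/opt/chez/bin/scheme"
RACKET_BIN="/usr/local/bin/racket"

# build_racket_on_chez [TARGET...]: run `make` in this directory with
# both commands overridden; extra arguments pass through unchanged.
build_racket_on_chez() {
  make SCHEME="$SCHEME_BIN" RACKET="$RACKET_BIN" "$@"
}
```

For example, `build_racket_on_chez expander-demo` would run the demo target with those two commands.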
Building
--------
Running `make` will build the Racket-on-Chez implementation, although
not in stand-alone form. Use `make expander-demo` to run a demo that
loads `racket/base` from source.
Use `make setup` (or `make setup-v` for a verbose version) to build
".zo" files for collection-based libraries.
If you want to control the `raco setup` that `make setup` runs, supply
an `ARGS` variable to make, such as
make setup ARGS="-l typed/racket" # only sets up TR
make setup ARGS="--clean -Dd" # clears ".zo" files
make setup ARGS="--fail-fast" # stop at the first error
Machine Code versus JIT
-----------------------
Racket on Chez Scheme currently supports two modes:
* Machine-code mode --- The compiled form of a module is machine code
generated by compiling either whole linklets (for small enough
linklets) or functions within linklets (with a "bytecode"
interpreter around the compiled parts). Compiled ".zo" files in
this format are written to a subdirectory of "compiled" using the
Chez Scheme platform name (e.g., "a6osx").
Select this mode by setting the `PLT_CS_MACH` environment variable,
although it is currently the default.
* JIT mode --- The compiled form of a module is an S-expression where
individual `lambda`s are compiled on demand. Compiled ".zo" files
in this format are written to a "cs" subdirectory of "compiled".
Select this mode by setting the `PLT_CS_JIT` environment variable.
Set the `PLT_ZO_PATH` environment variable to override the path used
for ".zo" files. For example, you may want to preserve a normal build
while also building in machine-code mode with `PLT_CS_DEBUG` set, in
which case setting `PLT_ZO_PATH` to something like "a6osx-debug" could
be a good idea.
In machine-code mode, set `PLT_CS_COMPILE_LIMIT` to set the maximum
size of forms to compile. The default is 10000.
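The `PLT_ZO_PATH` suggestion above might look like the following in a shell. This is only a sketch: the function name `setup_debug_zo` is made up, the value `1` assumes that any non-empty value counts as setting `PLT_CS_DEBUG`, and "a6osx-debug" is just the example name from the text.

```shell
# setup_debug_zo [MAKE-ARG...]: build debug ".zo" files without
# disturbing the ".zo" files of a normal build. The body runs in a
# subshell, so the exports do not leak into the calling shell.
setup_debug_zo() (
  export PLT_CS_DEBUG=1              # assumed: any non-empty value works
  export PLT_ZO_PATH="a6osx-debug"   # example name from the text above
  make setup "$@"
)
```

Calling `setup_debug_zo ARGS="--fail-fast"` would then run `make setup` with both variables set for that one build only.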
Running
-------
Use `make run ARGS="..."` to run Racket on Chez Scheme in the same way
as plain `racket`, with the command-line arguments supplied in
`ARGS`.
Structure
---------
The reimplementation on Chez Scheme is meant to export the same
interface as the traditional Racket virtual machine in "../racket":
the macro expander and primitive modules such as `#%kernel` and
`#%network`.
The implementation is in layers. The immediate layer over Chez Scheme
is called "Rumble", and it implements delimited continuations,
structures, chaperones and impersonators, engines (for threads), and
similar base functionality. The Rumble layer is implemented in Chez
Scheme.
The rest of the layers are implemented in Racket:
thread
io
regexp
expander
schemify
Each of those layers is implemented in a sibling directory of this
one. Each layer is expanded (using "expander", of course) and then
compiled to Chez Scheme (using "schemify") to implement Racket.
The fully expanded form of each layer must not refer to any
functionality of the layers listed after it. For example, the expanded
form of "thread" must not refer to functionality implemented by "io",
"regexp", or "expander", while the expanded form of "io" must not
refer to "regexp" or "expander" functionality. Each layer can use `racket/base`
functionality, but beware that code from `racket/base` will be
duplicated in each layer.
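The layering rule can be modeled as an ordering check. The sketch below is illustrative only and is not part of the build: the function names are invented, the rule is inferred from the "thread" and "io" examples in the text, and "schemify" is left out because those examples do not say where it falls.

```shell
# Layers in the order listed above (lowest first). "schemify" is omitted
# because the examples in the text only cover these four.
LAYERS="thread io regexp expander"

# layer_index NAME: print the position of NAME in LAYERS, or fail.
layer_index() {
  i=0
  for l in $LAYERS; do
    if [ "$l" = "$1" ]; then echo "$i"; return 0; fi
    i=$((i + 1))
  done
  return 1
}

# may_refer FROM TO: succeed if the expanded form of layer FROM may use
# functionality of layer TO, that is, if TO is not listed after FROM.
# Returns 2 for an unknown layer name.
may_refer() {
  from=$(layer_index "$1") || return 2
  to=$(layer_index "$2") || return 2
  [ "$to" -le "$from" ]
}
```

Under this model, `may_refer io thread` succeeds, while `may_refer thread io` and `may_refer io regexp` fail.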
The "io" layer relies on a shared library, rktio, to provide a uniform
interface to OS resources. The rktio source is in a "rktio" sibling
directory to this one.
Files in this directory:
*.sls - Chez Scheme libraries that provide implementations of Racket
primitives, building up to the Racket expander. The
"rumble.sls" library is implemented directly in Chez Scheme.
For most other cases, a corresponding "compiled/*.scm" file
contains the implementation extracted from expanded and
flattened Racket code. Each "*.sls" file is built to "*.so".
rumble/*.ss - Parts of "rumble.sls" (via `include`) to implement data
structures, immutable hash tables, structs, delimited
continuations, engines, impersonators, etc.
compiled/*.rktl (generated) - A Racket library (e.g., to implement
regexps) that has been fully macro expanded and flattened
into a linklet from its source in "../*". A linklet's only
free variables are primitives that will be implemented by
various ".sls" libraries in lower layers.
For example, "../thread" contains the implementation (in
Racket) of the thread and event subsystem.
compiled/*.scm (generated) - A conversion from a ".rktl" file to be
`include`d into an ".sls" library.
../build/so-rktio/rktio.rktl (generated) and
../../lib/librktio.{so,dylib,dll} (generated) - Created when building
the "io" layer, the "rktio.rktl" file contains FFI descriptions
that are `include`d by "io.sls" and "librktio.{so,dylib,dll}"
is the shared library that implements rktio.
CAUTION: The makefile here doesn't track dependencies for
rktio, so use `make rktio` if you change its implementation.
primitive/*.ss - for "expander.sls", tables of bindings for
primitive linklet instances; see "From primitives to modules"
below for more information.
convert.rkt - A "schemify"-based linklet-to-library-body compiler,
which is used to convert a ".rktl" file to a ".scm" file for
inclusion in an ".sls" library.
demo/*.ss - Chez Scheme scripts to check that a library basically
works. For example, "demo/regexp.ss" runs the regexp matcher
on a few examples. To run "demo/*.ss", use `make *-demo`.
other *.rkt - Racket scripts like "convert.rkt" or comparisons like
"demo/regexp.rkt". For example, you can run "demo/regexp.rkt"
and compare the reported timing to "demo/regexp.ss".
From Primitives to Modules
--------------------------
The "expander" layer, as turned into a Chez Scheme library by
"expander.sls", synthesizes primitive Racket modules such as
`'#%kernel` and `'#%network`. The content of those primitive _modules_
at the expander layer is based on primitive _instances_ (which are just
hash tables) as populated by tables in the "primitive" directory. For
example, "primitive/network.ss" defines the content of the
`'#%network` primitive instance, which is turned into the primitive
`'#%network` module by the expander layer, which is reexported by the
`racket/network` module that is implemented as plain Racket code. The
Racket implementation in "../racket" provides those same primitive
instances to the macro expander.
Running "demo/expander.ss"
--------------------------
Running `make expander-demo` builds and tries the expander on simple
examples, including loading `racket/base` from source.
Dumping Linklets and Schemified Linklets
----------------------------------------
Set the `PLT_LINKLET_SHOW` environment variable to pretty print each
linklet generated by the expander and its schemified form that is
passed on to Chez Scheme.
By default, `PLT_LINKLET_SHOW` does not distinguish gensyms that have
the same base name, so the schemified form is not really accurate. Set
`PLT_LINKLET_SHOW_GENSYM` instead (or in addition) to get more
accurate output.
In JIT mode, the schemified form is shown after a conversion to
support JIT mode. Set `PLT_LINKLET_SHOW_PRE_JIT` to see the
pre-conversion form. Set `PLT_LINKLET_SHOW_JIT_DEMAND` to see forms as
they are compiled on demand.
In machine-code mode, set `PLT_LINKLET_SHOW_LAMBDA` to see individual
compiled terms when a linklet is not compiled whole; set
`PLT_LINKLET_SHOW_POST_LAMBDA` to see the linklet reorganized around
those compiled parts; and/or set `PLT_LINKLET_SHOW_POST_INTERP` to see
the "bytecode" form.
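One way to turn on linklet dumping for a single run is sketched below. The function name `show_linklets` is invented, and the value `1` assumes that any non-empty value counts as setting the variable; the `make run ARGS="..."` form is the one described under "Running".

```shell
# show_linklets RACKET-ARGS: run Racket on Chez Scheme with linklet
# dumping enabled for that one run. The body runs in a subshell, so the
# export does not leak into the calling shell.
show_linklets() (
  export PLT_LINKLET_SHOW_GENSYM=1   # assumed: any non-empty value works
  make run ARGS="$1"
)
```

For example, `show_linklets "demo.rkt"` would run a (hypothetical) "demo.rkt" with gensym-accurate linklet output. Other `PLT_LINKLET_SHOW_...` variables could be exported the same way.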
Development Mode
----------------
If you make changes to files in "rumble", you should turn off
`[RUMBLE_]UNSAFE_COMP` in the makefile.
You may want to turn on `DEBUG_COMP` in the makefile, so that
backtraces provide expression-specific source locations instead of
just procedure-specific source locations. Enabling `DEBUG_COMP` makes
the Racket-on-Chez implementation take up twice as much memory and
take twice as long to load.
Turning on `DEBUG_COMP` affects only the Racket-on-Chez
implementation. To preserve per-expression locations on compiled
Racket code, set `PLT_CS_DEBUG`. See also "Machine Code versus JIT"
for a suggestion on setting `PLT_ZO_PATH`.
When you change "rumble" or other layers, you can continue to use
Racket modules that were previously compiled to ".zo" form... usually,
but inlining optimizations and similar compiler choices can break
compatibility. Set `compile-as-independent?` to #t in "expander.sls"
to make compiled Racket modules reliably compatible with changes to
the layers here (at the expense of some performance).
FFI Differences
---------------
* The `make-sized-byte-string` function always raises an exception,
because a foreign address cannot be turned into a byte string whose
content is stored in the foreign address. The options are to copy
the foreign content to/from a byte string or use `ptr-ref` and
`ptr-set!` to read and write at the address.
* When `_bytes` is used as an argument type, beware that a byte
string is not implicitly terminated with a NUL byte. When `_bytes`
is used as a result type, the C result is copied into a fresh byte
string.
* The 'atomic-interior allocation mode returns memory that is allowed
to move after the cpointer returned by allocation becomes
unreachable.
* A `_gcpointer` can only refer to the start of an allocated object,
and never the interior of an 'atomic-interior allocation. Like
traditional Racket, `_gcpointer` is equivalent to `_pointer` for
sending values to a foreign procedure, return values from a
callback that is called from foreign code, or for `ptr-set!`. For
the other direction (receiving a foreign result, `ptr-ref`, and
receiving values in a callback), the received pointer must
correspond to the content of a byte string or vector.
* Calling a foreign function implicitly uses atomic mode and also
disables GC. If the foreign function calls back to Racket, the
callback runs in atomic mode with the GC still disabled.
* An immobile cell must be modified only through its original pointer
or a reconstructed `_gcpointer`. If it is cast or reconstructed as
a `_pointer`, setting the cell will not cooperate correctly with
the garbage collector.
* Memory allocated with 'nonatomic works only in limited ways. It
cannot be usefully passed to foreign functions, since the layout is
not actually an array of pointers.
Status and Thoughts on Various Racket Subsystems
------------------------------------------------
* Applicable structs work by adding an indirection to each function
call when the target is not obviously a plain procedure; with the
analysis in "../schemify/schemify.rkt", the indirection is not
needed often in a typical program, and the overhead appears to be
light when it is needed.
* Racket's delimited continuations, continuation marks, threads, and
events are mostly in place (see "rumble/control.ss",
"rumble/engine.ss", and the source for "thread.rktl").
* The "rktio" library fills the gap between Racket and Chez Scheme's
native I/O. The "rktio" library provides a minimal, non-blocking,
non-GCed interface to OS-specific functionality. It's compiled to a
shared library and loaded into Chez Scheme, and then Racket's I/O
API is implemented in Racket by calling rktio as a kind of foreign
library.
* The Racket and Chez Scheme numeric systems likely differ in some
ways, and I don't know how much work that will be.
* For futures, Chez Scheme exposes OS-level threads with limited
safety guarantees. An implementation of futures can probably take
advantage of threads with thread-unsafe primitives wrapped to
divert to a barrier when called in a future.
* GC-based memory accounting similarly seems to require new support,
but that can wait a while.
* Extflonums will probably exist only on the Racket VM for a long
while.
* For now, `make setup` builds platform-specific ".zo" files in a
subdirectory of "compiled" named by the Chez Scheme platform name
(e.g., "a6osx"). Longer term, although bytecode as it currently
exists goes away, platform-independent ".zo" files might contain
fully expanded source (possibly also run through Chez Scheme's
source-to-source optimizer) with `raco setup` gaining a new step in
creating platform-specific compiled code.
Performance Notes
-----------------
The best-case scenario for performance is currently the default
configuration:
* `UNSAFE_COMP` is enabled in "Makefile" --- currently on by default.
Effectiveness: Matters the most for "rumble.so", which has its own
setting, but otherwise by itself affects a from-source
`racket/base` expansion by about 5%. See also the interaction with
`compile-as-independent?`.
* `RUMBLE_UNSAFE_COMP` is enabled in "Makefile" --- applies to
"rumble.so" even if `UNSAFE_COMP` is disabled.
Effectiveness: Can mean a 10-20% improvement in loading
`racket/base` from source. Since the Rumble implementation is in
pretty good shape, `RUMBLE_UNSAFE_COMP` is enabled by default.
* `compile-as-independent?` is #f in "expander.sls" --- currently set
to #f by default. See "Development Mode" above for more
information.
Effectiveness: Without also enabling `UNSAFE_COMP`, setting
`compile-as-independent?` to #f slows down tasks like loading
`racket/base` from source, but substantially improves programs
where the Chez Scheme optimizer needs to recognize uses of
primitives (e.g., microbenchmarks). Combining with `UNSAFE_COMP`
speeds up loading `racket/base` from source, too.
The combination of `UNSAFE_COMP` and `compile-as-independent?`
enables inlining of unsafe function bodies. For example,
`variable-ref/no-check` inlines as lots of code in safe mode and
little code in unsafe mode; lots of code doesn't run more slowly,
but it compiles more slowly.
* `DEBUG_COMP` not enabled --- or, if you enable it, run `make
strip`.
Effectiveness: Avoids increasing the load time for the Rumble and
other layers by 30-50%.
* `PLT_CS_DEBUG` not set --- an environment variable similar to
`DEBUG_COMP`, but applies to code compiled by Racket-on-Chez.
Effectiveness: Gives up improved stack traces, but also avoids
increasing the load time and memory use of Racket programs by as much
as 50%.