Mercury Grades
==============
Paul Bone <paul@plasmalang.org>
v0.1, March 2018: Draft.
Copyright (C) 2018 Plasma Team
License: CC BY-SA 4.0
link:https://github.com/PlasmaLang/plasma/tree/master/docs/grades.txt[Contribute
to this page]
Plasma is written in Mercury
(at least until we get to a https://plasmalang.org/roadmap.html[self
hosting] stage),
which means that if you want to compile Plasma (to contribute to it) you'll
need to build Mercury. There are a couple of shortcuts, but the long way
means navigating the Mercury grade system.
Mercury supports many different "grades"; each one is a collection of
settings for how to build and link a Mercury program or library.
Each grade is made up of grade components separated by periods, for example
+asm_fast.gc+.
Shortcuts:
* If you just want to run Plasma, without compiling it, then try this
https://plasmalang.org/plasma-static.tgz[static build].
(TODO: https://github.com/PlasmaLang/plasma/issues/9[better static builds]).
* If you want to build Plasma on x86 or x86_64 on a .deb based Linux
system, then use the https://dl.mercurylang.org/deb/[Debian packages], edit
the Makefile to uncomment optional settings such as debugging, then type
"make".
* If you want to build Plasma on a non-.deb system on x86 or x86_64 then
you'll have to build Mercury. I suggest installing the +asm_fast.gc+ and
+asm_fast.gc.decldebug.stseg+ grades. Remember to tell +./configure+ which
grades you need, otherwise it'll
http://yfl.bahmanm.com/Members/ttmrichter/yfl-blog/mercury-time-to-hello-world[try to build all of them and could take a long time]
(TODO: https://github.com/PlasmaLang/plasma/issues/8[provide detailed
instructions]).
* If you have some other type of system, or are building something other than
Plasma but found this document, then read on.
The Mercury project documents its grade components
https://www.mercurylang.org/information/doc-latest/mercury_user_guide/Grades-and-grade-components.html#Grades-and-grade-components[here (retrieved 2018-03-04)],
and I will be clarifying some points made there.
This manual, when I retrieved it, mentioned a few grade components that are
not worth attempting to use. These are:
+hl+::
The +hl+ grade component is just like the +hlc+ grade but uses a different
data format on the heap. It doesn't provide a significant advantage over
+hlc+ so isn't considered useful.
+il+:: A deleted .NET backend.
+agc+:: A bit-rotten garbage collector.
+threadscope+:: A bit-rotten profiling system; the latest version of the
viewer component can no longer open profiles generated by Mercury.
+mm+ and probably others:: Alternative evaluation strategies for logic
programming. You probably don't need these, and if you do, someone else will
tell you.
+rbmm+:: Region-based memory management.
An advanced optimisation for memory allocation. AIUI it only works for
single-module programs, so it isn't practically useful.
There are many other `secret' grade components not covered here or in the
User's guide.
They are mostly experimental and include grades like +rbmm+.
If you think they should be documented here then please
https://www.plasmalang.org/contact.html[let us know].
== Base grade
Everything starts with a base grade.
The base grade selects which compilation backend you wish to use.
Some backends have more than one base grade, and there are two C backends.
Exactly one base grade must be part of every valid grade string.
Low-level C:: +none+, +reg+, +jump+, +asm_jump+, +fast+ or +asm_fast+
High-level C:: +hlc+
C#:: +csharp+
Java:: +java+
Erlang:: +erlang+
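
The "exactly one base grade" rule can be checked mechanically.
The following is a minimal Python sketch, not part of Mercury or Plasma: the
names +BASE_GRADES+ and +split_grade+ are made up for illustration, and the
table only contains the base grades listed above.

```python
# Base grades from the list above, mapped to their backend.
# This table is an illustration only; it is not read from Mercury itself.
BASE_GRADES = {
    "none": "low-level C",
    "reg": "low-level C",
    "jump": "low-level C",
    "asm_jump": "low-level C",
    "fast": "low-level C",
    "asm_fast": "low-level C",
    "hlc": "high-level C",
    "csharp": "C#",
    "java": "Java",
    "erlang": "Erlang",
}


def split_grade(grade):
    """Split a grade string into (base grade, list of other components).

    Raises ValueError unless exactly one base grade is present.
    """
    components = grade.split(".")
    bases = [c for c in components if c in BASE_GRADES]
    if len(bases) != 1:
        raise ValueError(
            f"expected exactly one base grade in {grade!r}, found {bases}"
        )
    others = [c for c in components if c not in BASE_GRADES]
    return bases[0], others
```

For example, +split_grade("asm_fast.gc.decldebug.stseg")+ returns
+("asm_fast", ["gc", "decldebug", "stseg"])+, while +"gc.stseg"+ (no base
grade) and +"hlc.java"+ (two base grades) are rejected.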
If you need to call C#, Java or Erlang foreign code then the choice is
fairly obvious.
If you need to work with C foreign code, as the Plasma compiler does,
then things are more complicated.
For a long time the Low-level C backend generated faster code than the
High-level one, at least when comparing the +asm_fast+ and +hlc+ grades.
These days, due to changes in the C compilers, it depends on the program
being run.
=== Choosing a low-level C grade
Assuming you might use the low-level C grade, read this section.
The low-level C grade uses a combination of three optimisations ('hacks')
provided by GCC.
With all three disabled, the base grade is +none+, with all three enabled
it's +asm_fast+.
.Low-level C Optimisations
|========================
| Grade | GCC global registers | GCC Non-local GOTOs | ASM Labels | Useful
| +none+ | N | N | N | Y
| +reg+ | Y | N | N | Y
| +jump+ | N | Y | N | N
| +fast+ | Y | Y | N | N
| +asm_jump+ | N | Y | Y | N
| +asm_fast+ | Y | Y | Y | Y
|==========================================================================
Of course you want as much optimisation as possible, so +asm_fast+ is the
first choice. However, not all compilers (including GCC) fully support these
GCC extensions, so these grades may not work.
Note that ASM labels cannot be used without GCC non-local gotos, so there
are no grades that enable ASM labels without them.
Note also that I've included a "Useful" column. The grades marked Y are the
ones worth testing; the others are only of interest to researchers, since if
they work it's almost certain that +asm_fast+ works too.
So choose in order of preference: +asm_fast+, +reg+ then +none+. On x86 and
x86_64 on Linux with GCC or Clang, +asm_fast+ works (but a future version of
GCC or Clang could break this).
On OS X I think only +none+ works, but I don't remember.
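
The table and the preference order above can be written down as data.
This is only a Python sketch for illustration: the function names are
invented, and the +working+ argument is a hypothetical set of base grades
that you have tested yourself on your own compiler and platform.

```python
# (GCC global registers, GCC non-local gotos, ASM labels) -> base grade,
# copied from the "Low-level C Optimisations" table.
LOW_LEVEL_BASE = {
    (False, False, False): "none",
    (True, False, False): "reg",
    (False, True, False): "jump",
    (True, True, False): "fast",
    (False, True, True): "asm_jump",
    (True, True, True): "asm_fast",
}

# The order of preference recommended in the text.
PREFERENCE = ["asm_fast", "reg", "none"]


def base_grade_for(global_regs, nonlocal_gotos, asm_labels):
    """Return the base grade for a combination of hacks, or None if
    no grade exists for it (ASM labels need non-local gotos)."""
    return LOW_LEVEL_BASE.get((global_regs, nonlocal_gotos, asm_labels))


def pick_base_grade(working):
    """Return the most preferred base grade found in `working`, or None."""
    for grade in PREFERENCE:
        if grade in working:
            return grade
    return None
```

If only +reg+ and +none+ work for you, +pick_base_grade({"reg", "none"})+
gives +"reg"+.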
== High level C
As mentioned above, +hlc+ and +asm_fast+ are (IIRC) comparable
performance-wise.
Which one you choose will depend on whether your C compiler can handle
+asm_fast+ and what other features you may need (see below).
For example, if you want to use the declarative debugger then you must use
a low-level C grade. If the only low-level C grade that works for you is
+none+, then that's the best you can do.
== More grade components
The complete grade is built by adding grade components to select different
features, separated by periods.
Garbage collection::
--
+gc+ or absent.
+gc+ is Boehm GC, the only supported GC.
Not including +gc+ means that a GC will not be built, but note that Java,
C# and Erlang backends provide a GC anyway, and for them +gc+ does not
make sense.
(+agc+ bitrotted long ago, and +hgc+ was an experiment never completed.)
You should always include +gc+ when using a C backend.
Not including this is intended only for testing.
--
Thread safety::
--
+par+ or absent
Like the +gc+ option, this only makes sense on C grades.
Grades that include +par+ are thread safe and support the functions in the
thread module of the standard library.
The Java, C# and Erlang grades support this anyway.
Low level C::
The threading model is N:M with IO that can block a whole "engine" of
workers.
The parallel conjunction operator and the 'very' experimental
https://paul.bone.id.au/pub/pbone-2012-thesis/[automatic parallelism]
work are supported.
This is the only combination of base grade and +par+ that supports these
features.
High level C::
This uses the OS's native threads and IO works properly.
Plasma doesn't use thread-safety in any of its Mercury programs.
--
Stack segmentation::
--
+stseg+ or absent
Meaningful only on low-level C grades where Mercury manages its own stack.
Use a segmented stack so that:
* The program is more tolerant of deep recursion where TCO/LCO was not
used or available.
* The memory cost of a thread in +par+ grades is much cheaper.
This is recommended when +par+ is used and can also help with debugging and
deep profiling.
Apart from having a similar name to "trseg" (a segmented trail) and sharing
a basic low-level concept, this is not functionally related to trailing.
--
Single precision float::
--
+spf+ or absent
Use +float+ for floating point numbers rather than +double+.
Much faster on 32-bit platforms where floats normally require boxing, but
your program may produce different results.
Only meaningful in C grades (I think).
--
Debugging::
--
+debug+, +decldebug+, +ssdebug+ or absent
Which type of debugging to support if any.
Note that +decldebug+ is a superset of +debug+, so you might as well use it
instead of just +debug+.
+ssdebug+ is a totally separate debugger suitable for the "MLDS" backends
(high level C, C#, Java and Erlang).
--
Profiling::
--
+prof+, +memprof+, +profdeep+ or absent
What type of profiling to support if any.
+prof+ and +memprof+ have a similar workflow.
+profdeep+ is a
https://mercurylang.org/documentation/papers.html#mu_01_24[very advanced profiler]
and worth considering.
These only make sense with low-level C grades.
We're not concerned about Plasma's compiler's performance until well after
bootstrapping, so you won't need this for Plasma.
--
Trailing::
--
+tr+, +trseg+ or absent.
Enable trailing support.
Trailing is a technique for undoing destructive update on backtracking.
If you don't know what it is then you probably don't need it.
+tr+ is generally discouraged in favour of +trseg+.
I believe this option is supported with all the C backends.
--
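
Putting the components above together, a grade string can be assembled from
a handful of choices.
This Python sketch is only an illustration: +build_grade+ and its parameters
are invented names, it leaves out +spf+, and the order of the components
simply follows the example grades used in this document rather than any rule
stated here.

```python
DEBUG_CHOICES = {"debug", "decldebug", "ssdebug"}
PROFILE_CHOICES = {"prof", "memprof", "profdeep"}
TRAIL_CHOICES = {"tr", "trseg"}


def build_grade(base, gc=False, par=False, stseg=False,
                debug=None, profile=None, trail=None):
    """Join a base grade and optional components with periods.

    `debug`, `profile` and `trail` are either None (absent) or one of
    the component names described in the sections above.
    """
    for value, allowed in ((debug, DEBUG_CHOICES),
                           (profile, PROFILE_CHOICES),
                           (trail, TRAIL_CHOICES)):
        if value is not None and value not in allowed:
            raise ValueError(f"unknown grade component: {value!r}")

    parts = [base]
    if par:
        parts.append("par")
    if gc:
        parts.append("gc")
    # Only the components that were actually requested are added.
    parts.extend(c for c in (debug, profile, trail) if c is not None)
    if stseg:
        parts.append("stseg")
    return ".".join(parts)
```

For example, +build_grade("asm_fast", gc=True, debug="decldebug",
stseg=True)+ produces +asm_fast.gc.decldebug.stseg+.
The sketch does not check that the chosen components are compatible with
each other.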
== Grade compatibility
.Grade component compatibility matrix
|====================================
| | asm_fast1 | hlc | java | csharp | erlang | gc | par | stseg | tr/trseg | debug/decldebug | ssdebug | prof/memprof | profdeep
| asm_fast | - | N | N | N | N | R | Y2 | Y |
Y | Y | y | Y | Y
| hlc | N | - | N | N | N | R | Y3 | N |
Y | N | Y | y | N
| java | N | N | - | N | N | n | n | N |
?D | N | Y | N | N
| csharp | N | N | N | - | N | n | n | N |
?D | N | Y | N | N
| erlang | N | N | N | N | - | n | n | N |
?D | N | Y | N | N
| gc | Y | Y | n | n | n | - | Y | Y |
Y | Y | y | Y | Y
| par | Y2 | Y3 | n | n | n | Y | - | R |
? | ?D | ? | ?D | N
| stseg | Y | N | N | N | N | Y | Y | - |
Y | Y | y | y | Y
| tr/trseg | Y | Y | ?D | ?D | ?D | Y | ? | Y |
- | Y | ? | y | ?
| debug/decldebug | Y | N | N | N | N | Y | ?D | R |
Y | - | D | ?D | ?D
| ssdebug | y | Y | Y | Y | y | y | y | y |
y | D | - | ? | ?
| prof/memprof | Y | ? | N | N | N | Y | ?D | y |
y | ?D | ? | - | N
| profdeep | Y | N | N | N | N | Y | N | R |
?D | ?D | ? | N | -
|=====
Y:: Compatible
y:: Probably compatible
N:: Not compatible
n:: Not compatible, but implied support by the base grade
?:: Don't know.
?D:: Don't know, but I doubt it
R:: Recommended to add the column grade component if you're using the row
grade component
1:: asm_fast could mean any of the LLDS base grades, see table 1.
2:: asm_fast.par supports parallel conjunction and the experimental
auto-parallelism. It uses green threads; however, IO will block an entire
worker thread. You may be able to avoid that with spawn.native.
3:: hlc.par does not support parallel conjunction or auto-parallelism. It
uses pthreads so works correctly with IO.
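
A few of the definite "N" entries from the matrix can be turned into a quick
sanity check.
The Python sketch below is an illustration only: it encodes just five
incompatible pairs taken from the table above, not the whole matrix, and the
names +GROUP+, +INCOMPATIBLE+ and +conflicts+ are invented.
An empty result therefore means "no known conflict among these five", not
"compatible".

```python
# Map individual components to the grouped names used in the matrix.
GROUP = {
    "tr": "tr/trseg", "trseg": "tr/trseg",
    "debug": "debug/decldebug", "decldebug": "debug/decldebug",
    "prof": "prof/memprof", "memprof": "prof/memprof",
}

# A subset of the pairs marked "N" in the matrix above.
INCOMPATIBLE = {
    frozenset(pair) for pair in [
        ("hlc", "stseg"),
        ("hlc", "debug/decldebug"),
        ("hlc", "profdeep"),
        ("par", "profdeep"),
        ("prof/memprof", "profdeep"),
    ]
}


def conflicts(grade):
    """Return the known-incompatible pairs found in a grade string."""
    names = [GROUP.get(c, c) for c in grade.split(".")]
    found = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if frozenset((first, second)) in INCOMPATIBLE:
                found.append((first, second))
    return found
```

For example, +conflicts("hlc.gc.decldebug")+ reports the pair
+("hlc", "debug/decldebug")+, while +conflicts("asm_fast.gc.decldebug.stseg")+
returns an empty list.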
== My favorite grades
I use Linux on x86_64.
Default::
+asm_fast.gc+, or maybe +hlc.gc+
Thread safety::
+asm_fast.par.gc.stseg+
Debugging::
+asm_fast.gc.decldebug.stseg+
Profiling::
+asm_fast.gc.profdeep.stseg+