When editing this file, tw=80, possibly set wrap
goal_standalone.py
runtime/goal_standalone.py is a file instructing RPython how to compile Lever.
Lever builds upon a runtime written in RPython. RPython code looks and behaves
like Python code, but has been constrained so that it can be translated into a
native executable.
RPython originates from the PyPy project. There it is used to translate a
Python interpreter, itself written in Python, into native machine code. It is
capable of generating a JIT compiler for an interpreter while it translates.
Various annotations in RPython code control how the JIT compiler operates.
The author of RPython code does not need to worry about maintaining rules for
garbage collection. The translator takes care of inserting the necessary guards
and safety handles for the collectors.
When it is translated, the Lever runtime is first imported by a Python
interpreter. The interpreter turns it into function objects, and the entry
function is passed down to RPython. RPython then annotates and specializes the
code according to the rules described in its documentation.
base.py
base is a module in Lever. Many files in the runtime participate in building
it. It forms a default environment for modules.
The base environment contains a lot of functionality.
The following is inserted into base in base.py:
dict
module
object
list
multimethod(arity:int)
float
int
bool
str
null
true
false
path
property
Uint8Array
set
load(program:dict, path=null)(module)
class(methods:object, parent:interface, name:str)
interface(object)
iter(object)
hash(object)
repr(object)
reversed(object)
getitem(object, index)
setitem(object, index, value)
listattr(object)
getattr(obj, name)
setattr(obj, name, value)
ord(string)
chr(value)
isinstance(obj, interface/interfaces)
# isinstance is the only name in the runtime
where you get "is" without an underscore. This is treated as a sort of
abbreviation.
print(values...)
and(a, b)
or(a, b)
len(obj) # small func for .length, slightly pointless.
not(a)
encode_utf8(a:str)
decode_utf8(a:Uint8Array)
time()
getcwd()
chdir(path)
exit(status:int=0)
range()
start, stop, step
start, stop
stop,
input(prompt)
print_traceback(exception)
parse_int(string, base=10)
parse_float(string)
Additionally base.py adds errors, operators and vector arithmetic objects
into base.
The base module is listed among the builtin modules, and can therefore be
imported.
bon.py
stdlib/binon.py
Originally called bon, binon is a binary object notation used by Lever. It is
simple to decode and can be extended with custom encodings specific to Lever.
Lever parsing tends to be heavy and takes its time, so it is important that the
compiled bytecode can be stored as files and reused between programs. Lever
uses binon to cache bytecode.
It is desired that binon isn't a Lever-specific format. It's meant to evolve
along Lever toward interesting use cases, but there's no desire to lock it down
to Lever after it matures.
Glancing through the runtime, there's also a JSON reader present, which is
similar to binon in many respects, so binon is a bit of an oddball. For now the
use of binon is restricted to files that aren't likely to be used by other
software.
Binon uses network byte order (big endian).
Binon doesn't define an identifying header, but the format is defined in units:
T -- 8-bit unsigned integer tag
8 -- 8-bit unsigned integer
32 -- 32-bit unsigned integer
long -- [ T] =0
Variable-length quantity integer.
The 0x40 bit in the first byte denotes the sign of the integer.
If the 0x80 bit is set, there's an additional digit.
If the 0x80 bit is unset, it is the last digit.
The value is multiplied by the sign to make encoding easy.
double -- [ T] =1
run-of-the-mill IEEE floating-point number.
8 bytes wide
string -- [ T] =2
[32] length of the string in bytes
[..] the string represented in utf-8 encoding.
list -- [ T] =3
[32] how many items in the list
items of the list encoded with tagged units.
dict -- [ T] =4
[32] how many pairs in the dict
key,value pairs encoded with tagged units.
bytes -- [ T] =5
[32] how many bytes in the array.
encodes to Uint8Array
Currently this is the only reason binon is used instead of JSON.
boolean -- [ T] =6
[ 8] If 0x00, false
If 0x01, true
Other values are interpreted as true.
null -- [ T] =7
Binon object may consist of any tagged unit.
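As an illustration, a "long" unit as specified above could be decoded roughly
like this. This is a sketch, not the runtime's decoder; in particular, the
digit order (least significant bits first) is an assumption:

```python
def read_long(data):
    """Decode one binon 'long' payload (tag byte excluded).

    First byte: 0x40 = sign bit, 0x80 = continuation, low 6 bits of value.
    Later bytes: 0x80 = continuation, low 7 bits of value.
    Digit order (low bits first) is an assumption here."""
    b = data[0]
    sign = -1 if b & 0x40 else 1
    value = b & 0x3F
    shift = 6
    i = 1
    while b & 0x80:          # continuation bit set: read another digit
        b = data[i]
        i += 1
        value |= (b & 0x7F) << shift
        shift += 7
    return sign * value      # multiply by the sign, as the spec says
```

For example, a single byte 0x45 carries the value 5 with the sign bit set,
decoding to -5.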
The format is exposed to the whole runtime as a standard library. You can
import it and get access to the following functions:
read_file(path:Object)
write_file(path:Object, data:Object)
The writer is not protected against, or designed to handle, cyclic data
structures.
continuations.py
The mechanism to implement continuations is provided by RPython. Last time I
checked, the STM version of PyPy had them disabled, so continuations prevent
testing of STM with Lever.
RPython stacklets are the easiest way to crash or break things, so there's the
continuations.py wrapper that ensures the interface is used correctly.
This abstraction also makes it simpler to implement greenlets, which are the
actual representation of continuations in Lever.
A continuation object is a bit like an evil mirror. The execution flow that
creates one gets trapped inside the continuation and a new execution takes its
place. An execution flow may switch places with another execution trapped in a
continuation. If an execution ends, and it is not the original, it must return
a continuation. The execution in that continuation takes its place and the
continuation becomes empty.
This sinister sounding control flow construct allows ordinary control flow to
wait for a value.
The advantage is that we don't need to break otherwise synchronous control flow
into callbacks, and there's no need to implement duplicate forms of control
flow with promises, or need to abuse iterators and generators for what
they weren't intended for.
Continuations add some challenges to working with resources such as file
descriptors. This kind of feature dissonance seems to be a recurring theme
across the whole of Lever.
evaluator/loader.py
The loader is one of the places in the runtime where there's a lot happening.
The name is a bit of a misnomer because the loader code also holds the
interpreter for Lever.
This file defines how closures behave and how programs are loaded.
This file also contains some code the translator requires to produce JIT
compiler.
The instruction set for the interpreter is compact because operators are
implemented as functions and methods. Only "or", "not" and "and" are
implemented via instructions.
This is not the place to mention what each instruction does, but it's worthwhile
to mention how the interpreter is constructed:
- Code is loaded as compilation units that usually span a single source file.
- The compilation unit consists of a list of functions. These functions can
  create a closure from any other function they share a unit with.
- When the unit is loaded, the whole thing is wrapped into a Program that,
  when called, creates a Frame and runs the very first function in it.
- A closure has access to the Frame of the function where it was created.
  It can access variables in that frame. When a closure is called, it creates
  a new frame that parents the frame it had.
- The execution context holds a virtual register array and a bundle of locals.
  This separation is intended to avoid lugging the registers around, which
  might happen when you pass child closures around.
- Exceptions are handled by an exceptions list. When the interpreter returns
  into a frame with an exception, it goes through the exceptions list and
  jumps into the first entry that is active at the location of the current
  program counter. The exception is stored in the register pointed out by the
  exception structure.
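The handler lookup described in the last point can be sketched as follows.
This is a minimal model, not the interpreter's actual code; whether 'stop' is
exclusive is an assumption:

```python
def find_handler(exceptions, pc):
    """Scan the exceptions list for the first entry active at pc.

    Each entry is [start, stop, label, vreg], matching the bytecode
    format described in evaluator/loader.py."""
    for start, stop, label, vreg in exceptions:
        if start <= pc < stop:
            # jump to 'label'; the exception is stored in register 'vreg'
            return label, vreg
    return None  # no handler in this frame: keep unwinding
```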
Since the loader defines what the bytecode object looks like, let's describe
the format here:
{
    version = 0
    sources = [ str ]      # list of source files this unit was compiled from.
    constants = [ object ] # list of constants for this unit.
    functions = [
        {
            flags = int    # These aren't actually described anywhere, but
                           # the 0x1 bit means that the remaining arguments
                           # are put into local[topc]
            argc = int     # minimum number of arguments
            topc = int     # maximum number of recognized arguments
            localc = int   # how many local variables there are in the frame.
            regc = int     # how many virtual registers in this frame.
            code = Uint8Array      # network-byte-order encoded array of u16
            sourcemap = Uint8Array # described at evaluator/sourcemaps.py
            exceptions = [[start, stop, label, vreg]]
        }
    ]
}
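Since 'code' is a network-byte-order array of u16 instruction words, decoding
it on the Python side could look like this (a sketch, not the runtime's
actual loader):

```python
import struct

def decode_code(data):
    """Unpack a Uint8Array of big-endian u16 instruction words."""
    count = len(data) // 2
    return list(struct.unpack(">%dH" % count, data[:count * 2]))
```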
When the object is loaded, it also gets a path -object that is given to
TraceEntry -records produced by the compilation unit. The point of this path is
to locate the source file when used in combination with 'sources' and
'sourcemap'.
Related:
exec(unit, module, path) loads an object and calls it using this system.
To help implement a REPL, the last value in the compilation unit is returned
by this function.
evaluator/optable.py
stdlib/optable.py
Some people like to type stupid magic numbers everywhere and move them across
files. I prefer that stuff works even if nobody is supporting it with a stick.
The evaluator/optable.py contains name, opcode and format for every instruction
in the bytecode that Lever uses.
The enc/dec tables are filled from the tabular specification. Both are
dictionaries that come in the following format:
enc[opname] = [opcode, has_result, [pattern], variadic]
dec[opcode] = [opname, has_result, [pattern], variadic]
Examples:
enc["call"] = [0x40, True, ['vreg', 'vreg'], 'vreg']
enc["jump"] = [0x60, False, ['block'], null]
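Since enc and dec carry the same information keyed differently, one can be
derived from the other. A sketch of that relationship (Python's None stands
in for null; the actual optable builds both tables from the tabular
specification):

```python
# Entries copied from the examples above.
enc = {
    "call": [0x40, True, ['vreg', 'vreg'], 'vreg'],
    "jump": [0x60, False, ['block'], None],
}

# dec mirrors enc, keyed by opcode instead of opname.
dec = {}
for opname, (opcode, has_result, pattern, variadic) in enc.items():
    dec[opcode] = [opname, has_result, pattern, variadic]
```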
Every compiler and evaluator refers to these tables directly.
Changing optable isn't an offense as long as you have a good reason to do so.
For example, that the opcodes are all weirdly spread is not a good reason. A
good reason would be that you need a new instruction to support some usecase.
If you introduce a new instruction, try not to choose a number that was used
by an earlier instruction, to avoid issues. The old instructions are commented
out, with a note of the version in which they were discarded, to help you
avoid choosing a recently used number. The rule can be broken when the
instructions are so old that they have been nearly forgotten.
Please keep the opcode numbers in order and try to group the instructions by
their context, prefer alphabetic order otherwise.
The intent here is to not cause many issues with old bytecode. If you change
the behavior of an instruction, please change the opcode as well.
evaluator/sourcemaps.py
I've been following how other languages map errors to source files, and I have
learnt along the way.
It saddened me that Python couldn't point out the exact position where an
error occurred, just the line. I also felt that if this system was
well-documented and forwards-compatible, it would make it easier to create
code that generates bytecode.
The evaluator/sourcemaps.py describes the sourcemaps of Lever. The
sourcemapping seen here is copied from the JavaScript world.
Every function in a compilation unit has a sourcemap -object.
The purpose of this map is to translate program counter (pc) values into a
path and a position inside a file. Sourcemaps are stored in Uint8Arrays to
make their encoding and decoding into files very fast.
Every sourcemap consists of variable-length-quantity encoded values of the
unsigned kind. There are 6 VLQ-encoded values in every record.
The record contains the following values:
count -- How many PC values are spanned by this record
file_id -- Index to the sources -list.
col0, lno0, col1, lno1 -- Range inside the file.
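Reading one record can be sketched as below. The unsigned VLQ layout (low
bits first, with 0x80 as the continuation bit) is an assumption; only the
six-value record structure comes from this file:

```python
def read_uvlq(data, pos):
    """Decode one unsigned VLQ value at pos; return (value, new pos).
    Digit order (low bits first) is an assumption."""
    value = 0
    shift = 0
    while True:
        b = data[pos]
        pos += 1
        value |= (b & 0x7F) << shift
        shift += 7
        if not b & 0x80:       # continuation bit unset: last digit
            return value, pos

def read_record(data, pos):
    """Read the six VLQ values of one sourcemap record."""
    fields = []
    for _ in range(6):
        value, pos = read_uvlq(data, pos)
        fields.append(value)
    count, file_id, col0, lno0, col1, lno1 = fields
    return {"count": count, "file_id": file_id,
            "start": (lno0, col0), "end": (lno1, col1)}, pos
```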
The compilation unit has been annotated with a 'sources' list, which points
out the paths to the source code files. The paths are relative to the file
location of the compilation unit. To retrieve a source file, you concatenate
the 'sources' entry to the location of the compilation unit.
The current Lever compiler only compiles one source file into one compilation
unit, but I foresee there could be situations where a single compilation unit
will contain several different files.
A sourcemap can hold a record with count=0 at the beginning. It should point
to the declaration of the function.
The zero-count record is meant to make CallErrors a little more readable, and
can later be used to implement a "go to the source" function.
main.py
Main is the heart of Lever and its entry point. This file also implements
greenlets.
A Lever process has a global state that holds the execution context. The
execution context holds some values it needs to function:
lever_path -- The directory where Lever is operating in.
If the LEVER_PATH variable is not set, the program assumes the
execution takes place in the current directory, or in the
prefix path given during translation.
Note that the LEVER_PATH variable can cause problems if it's inappropriately
set.
When Lever starts up, it brings up an event loop. The event loop isn't much
yet. For now you can wait with the sleep function. Also, nothing prevents you
from suspending execution yourself and scheduling it in. :)
Behavior during startup is a bit chaotic. If the program is given arguments,
the first argument is treated as a script to run. Otherwise it will run the
script "app/main.lc" inside lever_path.
Either way, the script is brought up with a new module scope.
The greenlets are wound up with the event loop to make both much more useful,
so they are in the same file. They are built on the earlier continuation
structures, but each greenlet is associated with an execution, making them
easier to reason about.
Greenlets mesh neatly with the rest of the system. You can either schedule
them or jump into them directly. There are many neat use cases where this is
useful. In particular, it makes event streams easy to wait on.
When a greenlet ends, it returns to its parent. The parent is the one that
created the greenlet, but in theory it could be switched.
It should also be possible to throw an error via the greenlet, either sync or
async, to alter its behavior, but this useful feature hasn't been introduced
yet. The likely functions will be:
greenlet.throw(exception) # sync
schedule(greenlet.throw, exception) # async
There are three dissonance situations you may face with greenlets.
The first happens when you have a continuation that holds a resource that must
be explicitly freed, such as a memory buffer or a file handle. If the greenlet
is suspended and lost before it finishes execution, it leaks these handles
even if they were seemingly properly allocated & freed.
So if you have something that can accept a greenlet, it is most likely a good
idea to send it an exception that tells it that a resource it waited on was
discarded.
The second dissonance happens when you have an action that should not pass
control, i.e. it has atomicity requirements. Such a function might
accidentally pass control by calling something that reads or otherwise may
context switch. This would violate the operation of the program and cause it
to crash or corrupt its state.
The third dissonance is the dual of the second. It is when you assume that
certain state is fixed while you are reading it, but you switch in the middle,
and another execution changes the state you were in the middle of reading.
It may be necessary to introduce a simple event loop lock mechanism to prevent
this kind of problem. You can also control access to objects by means of
visibility: an execution that does not have access to something cannot write
to it either. Third, you can introduce a lot of stability into your programs
by determining that certain objects are immutable after they've been created.
Functions provided into base -module in main.py:
schedule(function/greenlet, args...)
- schedules a function or greenlet to start/resume up inside the event
loop. Execution continues without interruption.
sleep(...)
d:float - suspends the current greenlet for d seconds.
d:float, func/greenlet - schedules a function or resumes a greenlet
after given time.
getcurrent() - gets the current greenlet.
greenlet(args...) - creates a new greenlet which runs call with the given
args when woken up.
greenlet.switch(args...) - switches to the given greenlet.
- if the greenlet is just starting up, the arguments
are concatenated with the initials
- otherwise zero arguments give null,
one argument is passed as-is, and
many arguments are turned into a list, to return
from .switch() of the resuming greenlet.
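The argument rules for .switch() above can be modeled with a small helper.
This is a hypothetical illustration of the packing rule, not runtime code
(None stands in for Lever's null):

```python
def pack_switch_args(args):
    """What the resumed side's .switch() call returns, given the
    arguments the switching side passed: zero -> null (None here),
    one -> the value itself, several -> a list."""
    if len(args) == 0:
        return None
    if len(args) == 1:
        return args[0]
    return list(args)
```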
module_resolution.py
I remember a Python module system maintainer saying that Python modules aren't
as nice as they could be because there's a notion of mutable global state in
them, and that people occasionally do nasty things to this state to reach
their ends. It turned up when I was researching module systems.
A module system in a dynamically typed language faces a few challenges. The
problems relate to the fact that modules should be cached so that they can be
accessed by multiple modules that share dependencies.
The simplest and most common way to support coding programs while they are
live is to write a script that is reloaded when it is updated. For module
systems this means you should be able to reload modules, or otherwise be able
to control which modules are reloaded and when.
The concepts related to module resolution aren't entirely honed yet in Lever,
but there's very promising development going on related to the problems
described above.
Lever comes with a concept I call scoped modules. Modules practically have
scopes, exactly like functions have scopes. You can freely choose that scope
when you create a module, or inherit from another scope!
Any Lever module scope can cache any path or resource, and they are cached by
their absolute paths to help live reloaders do their job. Very few programs
prepare for the situation that their path may change while they're running, am
I right?
Every module scope comes with a default search directory to look for modules
in. Every module scope also has a parent scope, which is searched if the
current scope doesn't resolve the resource.
Every scope may also contain a handle to the function used to compile stale
modules in the scope. The compile_file is retrieved for the lib/ scope before
the main scope is created.
Every module cache entry contains the mtime of the file that was loaded, and a
path to the file, so live reloaders know which modules are stale and should be
reloaded.
When lever boots up, it already has a builtin:/ scope. It derives the lib/ scope
from this builtin scope. Then the main scope is created that points to the
directory of the first script.
Lever supports directory hierarchies in modules. If you have a directory and
that directory contains "init.lc" or "init.lc.cb", then it's a module you can
import.
When a module is created in this system, it receives some attributes:
dir -- directory of the module, makes it easy to relative-load things.
name -- name of this module.
import -- a callable object this module uses for importing code.
Import holds a copy of the directory of the module, so that it can first
search relative to the location of the current module.
Then it searches the scopes in order until it finds the module or fails to
import it. The imported module is returned by the call.
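The scope chain walk can be modeled as below. This is a simplified
hypothetical model: the constructor mirrors ModuleScope(local, parent) from
this file, but keying the cache by plain name is a simplification, since
Lever caches by absolute path:

```python
class ModuleScope:
    """Toy model of Lever's scoped module caches: each scope caches
    resolved modules and defers to its parent on a miss."""
    def __init__(self, local, parent=None):
        self.local = local    # default search directory for this scope
        self.parent = parent  # searched when this scope misses
        self.cache = {}       # name -> module (Lever keys by absolute path)

    def resolve(self, name):
        scope = self
        while scope is not None:
            if name in scope.cache:
                return scope.cache[name]
            scope = scope.parent
        raise ImportError("no module %r in any scope" % name)
```

A child scope shadows its parent: a module cached in the main scope wins over
one with the same name cached in lib/.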
Through %"import".scope you can access the current module scope as an object.
This scope object has one useful function: reimport. It should reimport your
module, but it looks like it might be half broken.
You can iterate through the scope, as well as getitem from it.
Note that the "import" function is conveniently accessible, so you can always
import from the scope of another module. The same applies to "dir".
This is not implemented yet, but the "foo.bar" convention in Lever should
trigger an import of foo/init.lc, then getattr(init, "import")("bar").
Otherwise, if you import it directly, it should skip foo/init.lc entirely.
Introduces the following names into the 'base' -module:
ModuleScope(local, parent=null)
Import(local, scope)
pathobj.py
It is always an utter pain in the ass to work with Windows paths once you've
been on the other systems that do it right. Therefore every path is
posix-formatted within Lever.
To the OS, Lever still presents whichever wrecked path convention the OS is
following. But the user of the language isn't forced to handle
\\cupcake\boilingeggs on one system and something else on another. The user
consistently sees //cupcake/boilingeggs. The file pathobj.py takes care of the
translation.
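As a toy illustration of that translation (the real pathobj.py handles
prefixes, labels and more; this sketch only covers the separator swap):

```python
def get_os_path(posix_path, windows=False):
    """Present a posix-formatted Lever path in the OS's own convention."""
    if windows:
        return posix_path.replace("/", "\\")
    return posix_path
```

So the user-visible //cupcake/boilingeggs becomes \\cupcake\boilingeggs when
handed to Windows.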
Lever comes with a path() -object. This path can take posix and URL prefixes.
In a way I consider URLs to be posix-formatted paths, to a limit.
Path objects and prefixes are generally immutable. They can be compared and
hashed, though the comparison is strict and not canonicalizing the paths.
In a proxy path (not implemented), the user object used as prefix is allowed to
be mutable.
When you wrap a string into a path object, it is converted into a prefix and
a path sequence.
This file provides the following entries:
PosixPrefix(label="", is_absolute=false)
PosixPrefix.is_absolute -- whether the path starts with "/" or not,
i.e. whether it is an absolute path rather than relative.
PosixPrefix.label -- the thing before the colon, e.g. "c:"
URLPrefix(domain="", protocol="")
URLPrefix.protocol -- eg. "http"
URLPrefix.domain -- eg. "example.org"
path.prefix -- either PosixPrefix or URLPrefix now.
path.basename -- basename of the path, i.e. the part after the last "/"
path.dirname -- dirname of the path, i.e. the part before the last "/"
path.relpath(base=getcwd()) -- turn the path into a path relative to
base. It may still stay absolute if base has a
completely different prefix.
path.drop(count) -- drop a slash 'count' times and return the resulting path.
path.get_os_path(path) -- transmogrify the path into OS conventions.
path.to_string() - return the posix-formatted path.
hash(path)
str ++ path
path ++ str
path ++ path
path == path
path != path
path <= path
path >= path
path < path
path > path
Also implements getcwd, chdir, to_path, and so on.
space/__init__.py
Python and Lever are similar to the extent that everything is a value.
Therefore we need several objects that just "are there". I misinterpreted what
space means in PyPy and called the directory of these objects a space.
The name is appropriate here anyway. The objects here pretty much form a
common space where everything happens.
Every user-accessible value has an interface. The interface is used to
determine what is called when the user operates on the value.
The interface is retrieved directly from the class definition, unless you have
a custom object. Instantiations of user-defined classes are such custom
objects.
space/builtin.py
Builtin functions represent entries in the runtime. Some day the builtins will
have documentation associated with them.
From now on, the preferred way to create a builtin function when extending the
Lever runtime is to use the space.signature() -decorator.
There have been considerations to mark the builtin functions in the exception
flow, so that you know whether the control flow goes through a builtin. This
is a rare enough occasion that it hasn't happened yet.
space/customobject.py
Custom objects are objects that come with a custom interface. It's a nice
concept.
You can determine how the object should behave in respect to some builtin
functions:
+init
+call
+getattr(name)
+setattr(name, item)
+getitem(index)
+setitem(index, item)
+iter()
+contains(object)
+repr()
+hash()
In this file, properties are introduced, although they are such a useful
concept that they should perhaps be introduced earlier, in space/interface.py.
Property objects have .get and .set you can choose. This way you can provide
custom attributes that call functions, without hacking getattr/setattr.
Please don't use property objects to produce immutability. If you do, despite
this written notice, don't mind the batons flying like arrows around you.
It's just a side effect of doing one of the dumbest things I can imagine.
space/dictionary.py
Dictionaries. They shouldn't have anything special about them compared to
Python dictionaries, but for reference their behavior is described here:
dict(arg=null) -- if you pass an iterable here that returns pairs, you'll
get a dictionary.
+contains(key) -- works pretty much like it should.
dict.length -- how many entries in this dict.
+getitem(key)
+setitem(key, value)
+iter() -- you get a bunch of keys.
dict.get(key, default=null) -- getitem, except that default value is
returned if it fails.
dict.keys() -- iterator for keys
dict.items() -- iterator for item pairs
dict.values() -- iterator for values
dict.pop(key)
space/errors.py
Errors are inevitable, and so far an exception system has been a nice,
feasible way to handle them where the handling matters. Therefore Lever
implements an exception system too.
Since Lever exceptions are themselves just objects, we can't treat them as
system exceptions. Therefore the exception objects are plugged into an
unwinder when activated. The unwinder should be hidden from the userspace.
When Lever throws, it attempts to retrieve .traceback from the exception
object. If the exception doesn't have a .traceback, it creates a list for that
purpose.
This way you may catch multiple exceptions at one level and still retain
tracebacks for every exception you caught! They are cut from the point where
they are handled.
It's a bit dissonant too. If you abuse this thing you are able to hide stack
trace entries! So don't abuse it without a reason.
There's an inheritance hierarchy for Lever exceptions, though so far it may
still change. For now the hierarchy is flat; every exception just inherits
from Exception.
Users can introduce custom objects as exceptions. The only requirement is that
they hold a .traceback -value that is initially null or a list. Be aware that
the rest of the system can't catch an exception if it doesn't extend
Exception, though.
In the runtime you see OldErrors spread around. They are legacy. Replace them
when you update some part of the code, but don't commit just to remove this
function. I want it to drop away at its own pace.
So far there are following exceptions:
Exception
Error extends Exception
AssertionError extends Exception
SystemExit extends Exception
UncatchedStopIteration extends Exception
AttributeError extends Exception
KeyError extends Exception, as does every subsequent one:
ValueError
TypeError
FrozenError
CallError
InstructionError
More can be defined when appropriate.
space/exnihilo.py
It's extremely common to define small objects that have attributes. It's
extremely good practice when the object has no other purpose than to hold some
values.
Therefore you get exnihilo(). For now you can't fill it from an iterable, but
it's an object you can fill with values of your choice. It's very good when a
custom class and an additional new type aren't required.
exnihilo has been replaced by 'object'. It is essentially the same thing but
pretends to be of type 'object'.
space/interface.py
Defines what the objects in Lever look like. There's a very little bit of
legacy here that should likely be removed proactively some day, unlike the
harmless OldError.
Interfaces are themselves objects, so this file contains some recursive
definitions.
Interestingly, it was eventually quite easy to choose these details, even
though they are hard and complex subjects.
Interfaces and internal representations of objects aren't related. An
interface is a means for an object to determine how that object behaves under
commands.
"Interface" was chosen as the name to avoid misguided associations with Plato
and biologist ideas. The object system in Lever doesn't form a taxonomy; it's
not magical or mystical, and physical objects ("instances") definitely aren't
some shadows of their ideal forms ("classes").
It removes quite a lot of confusion when the language author isn't lying
through his teeth about the implementation!
Every object has an interface, so interfaces have an interface too. The
interface of an interface should be an interface.
Null is a bit of a special object in that it's an interface too, because null
is the interface of itself. null is also the parent of null.
When used in isinstance() or in coercion operations, it's useful to have a
hierarchy so that you can group objects with similar meaning together.
Every interface can have a parent interface. "null" is generally accepted as a
parent, but the distinction between "null" and "object" as a parent is
probably pointless.
Every interface is considered to have methods. When a method is retrieved via
getattr, it is bound. This is default behavior that can be vastly mangled by
the runtime.
Every interface defined by the runtime can get a 'cast' -function which
describes the implicit casting rules that values of it follow. Use the
'space.cast_for(ClassName)' -decorator to register the function.
It'd likely be beneficial to have a Property -system for builtins. I have yet
to consider options for how to do this.
space/listobject.py
The good old workhorse. The analogy is a bit off, but you can't really
overstate how nice lists can be in a language runtime.
They aren't very exciting, but that's a very good property in them:
+hash
+getattr
+contains
+getitem
+setitem
+iter
+repr
.append(item)
.extend(items)
.insert(index:int, item)
.remove(item)
.pop(index=.length-1)
.index(item)
.count(item)
.sort(lt=%"<")
.reverse()
space/module.py
A module presents a global context for a script. Basically, when you run a
script, it is associated with a module.
Module has been structured so that it works efficiently as a global scope when
used in combination with the JIT. Again, if you get caught lying, it's not
good for you. So the principle here is to not lie about the implementation in
the first place.
Modules cannot be cleared during reimport. You may null every field, but
that's it. You can getattr/setattr anything they contain. And you can use a
module as a base when you instantiate a new module.
All builtin modules should be frozen, unless there's a reason for the
opposite.
space/multimethod.py
Multimethods consist of a table of functions, each associated with the
interfaces the function should be used with.
For an efficient implementation, there's a dissonance between the multimethods
Lever presents and extended interfaces: the multimethods completely ignore
inheritance rules. This means Lever multimethods can be perceived to violate
the Liskov Substitution Principle.
Multimethods have fixed arity. This means that they select the function by a
fixed number of the arguments they receive, but otherwise they let all
arguments pass through.
Basically, the action taken by a Lever multimethod is:
- Select a function in the table by the interfaces associated with the n
objects.
- If that function exists, call it.
- If the function doesn't exist and the call isn't suppressed, call the
default method associated with the multimethod.
Multimethods can handle:
multimethod(int)
setitem
.default set and get
call
.call_suppressed(args...)
This behavior makes it really easy to reason about multimethods! This is
invaluable if Lever is ever translated, and it is invaluable for
documentation.
Your initial impression of this kind of multimethod might be that they cannot
handle arithmetic without being really crowded. But Lever multimethods aren't
really crowded. That's because the Lever author is clever.
Let's consider the usual situation where the arithmetic isn't on a matching
pair. You call 1.4 + 2, and what should happen is that you get 3.4. On
float + int you expect a float, i.e. a somewhat inaccurate answer.
You would think "there must be a float + int", but Lever doesn't have one.
Instead, when float + int is called, it goes to +.default, which calls coerce
on the values and attempts to call the multimethod again with the returned
values, suppressed this time so that there can't be an infinite loop.
There's a very small potential surprise if a person doesn't notice the values
were coerced for the retry. But otherwise this way is much better than trying
to be stupid and pretend that float inherits from an integer which inherits
from a boolean.
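The +.default story can be sketched in Python. Names and structure here are
illustrative assumptions, with Python types standing in for Lever interfaces;
the coercion rules follow the table in space/operators.py:

```python
class Multimethod:
    """Fixed-arity multimethod with a coercion default (a sketch)."""
    def __init__(self, arity):
        self.arity = arity
        self.table = {}   # (type, ...) -> function

    def register(self, *types):
        def decorator(fn):
            self.table[types] = fn
            return fn
        return decorator

    def call_suppressed(self, *args):
        # Plain table lookup: on a miss, do NOT fall back to default.
        fn = self.table.get(tuple(type(a) for a in args))
        if fn is None:
            raise TypeError("no method for %r" % (args,))
        return fn(*args)

    def __call__(self, *args):
        fn = self.table.get(tuple(type(a) for a in args))
        if fn is not None:
            return fn(*args)
        return self.default(*args)

    def default(self, *args):
        # Coerce and retry once, suppressed to avoid an infinite loop.
        a, b = coerce(*args)
        return self.call_suppressed(a, b)

def coerce(a, b):
    # Per the coercion table in space/operators.py: bools become ints,
    # and an int paired with a float becomes a float.
    if isinstance(a, bool): a = int(a)
    if isinstance(b, bool): b = int(b)
    if isinstance(a, int) and isinstance(b, float):
        return float(a), b
    if isinstance(a, float) and isinstance(b, int):
        return a, float(b)
    return a, b

add = Multimethod(2)

@add.register(int, int)
def add_ints(a, b):
    return a + b

@add.register(float, float)
def add_floats(a, b):
    return a + b
```

With only the (int, int) and (float, float) entries registered, add(1.4, 2)
misses the table, coerces to (1.4, 2.0) and lands in the float pair.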
space/numbers.py
Basic structures for floats, integers and booleans.
Strings cannot be directly coerced into integers or floats, by the way. You
need to convert them by parsing.
The file provides int.to_string(base=10) and float.to_string().
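A hypothetical Python rendering of int.to_string(base=10); the digit alphabet
and the handling of negative numbers here are assumptions, not taken from
Lever's source.

```python
# Illustrative sketch of an int-to-string conversion with a base argument.
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"

def int_to_string(value, base=10):
    assert 2 <= base <= 36
    if value == 0:
        return "0"
    sign = "-" if value < 0 else ""
    value = abs(value)
    out = []
    while value > 0:
        value, digit = divmod(value, base)
        out.append(DIGITS[digit])
    return sign + "".join(reversed(out))

print(int_to_string(255, 16))   # ff
print(int_to_string(-10, 2))    # -1010
```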
space/operators.py
Operators define some multimethods, and some non-multimethods!
Multimethods:
clamp(low, high, value)
coerce(a,b)
a ++ b
a != b
a == b
a < b
a > b
a <= b
a >= b
-expr # yes, you actually do %"-expr" to get this.
+expr
Every arithmetic method does the coerce as its default, as described under
multimethods. Otherwise they are implemented for int and float pairs in this
file.
There is a whole set of them: + - * | % & ^ >> << min max /
Division is a bit special in this set because it returns a float for integers
as well. If a division that returns an integer is ever added, it'll be // and
default to floor(a/b).
Coercion rules on coerce(a, b):
bool bool -> int int
int bool -> int int
bool int -> int int
int float -> float float
float int -> float float
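The table above can be sketched as a single Python function. Illustrative
only; note that in Python bool is a subclass of int, so the bool checks have
to come first.

```python
# Sketch of coerce(a, b): bool widens to int, int widens to float.
def coerce(a, b):
    if isinstance(a, bool):
        a = int(a)
    if isinstance(b, bool):
        b = int(b)
    if isinstance(a, int) and isinstance(b, float):
        a = float(a)
    elif isinstance(a, float) and isinstance(b, int):
        b = float(b)
    return a, b

print(coerce(True, 3))    # (1, 3)
print(coerce(2, 0.5))     # (2.0, 0.5)
```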
Comparison methods are completely defined on integer, float and string pairs.
Comparison does not use the coercion table by default. It may be a better idea
to just implement comparison between integers and floats explicitly; it is a
very special case when you need to compare anything else together.
The proposition for comparison is that when you define one operator, it should
make sense to define all of them.
Concat works on strings and lists.
Sets implement <= >= < > | & - ^ -methods.
space/slices.py
Slices provide iteration and indexing help. You can use them to get
substrings/sublists or iterate through ranges.
The semantics of slice(start, stop, step=1) are to provide integer intervals
that can be 'unbounded' toward negative or positive infinity by passing null
as either start or stop.
Slices have +iter implemented on them, which allows them to be used in place
of range(). If stop=null in a slice that is iterated, it becomes a
step-iterator that doesn't end.
The clamp(slice, low, high) method is implemented for slices in
space/operators.py; it can be used to normalize a slice into a desired range.
Note that this function takes the 'step' into account and binds the unbounded
sides.
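For comparison, Python's built-in slice.indices performs a similar
normalization: it binds the unbounded ends to a sequence length and takes the
step into account. This is Python's behavior, shown here only as an analogue
of what clamping a Lever slice does.

```python
# slice.indices(length) returns a normalized (start, stop, step) triple.
s = slice(None, None, 2)        # 'unbounded' on both sides
print(s.indices(10))            # (0, 10, 2)

s = slice(-3, None, 1)          # negative start is bound to the length
print(s.indices(10))            # (7, 10, 1)

s = slice(None, None, -1)       # reversed, unbounded
print(s.indices(10))            # (9, -1, -1)
```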
Strings and lists can getitem with slices.
The semantics of list.setitem should be thought out for slices. It is likely
that we want to support at least list.setitem with slices of step 1 and -1.
step=0 for strings and lists will hang. :) TODO: worth fixing later.
I plan to deprecate range() to encourage the use of slices instead.
There is a ".:" and ":." syntax for slices; this causes a bit of a precedence
corner case with floating-point numbers.
space/setobject.py
Sets follow the good convention of "let's take it from Python" that we are
already used to.
Sets implement:
+contains
+getattr
+iter
.copy()
.clear()
.update(args...)
.intersection_update(args...)
.difference_update(args...)
.symmetric_difference_update(other)
.discard(value)
.remove(value)
.pop()
.is_disjoint(other)
.is_subset(other)
.is_superset(other)
.union(others...)
.intersection(others...)
.difference(others...)
.symmetric_difference(others...)
set(iterable?)
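Since the set API is taken from Python, Python's own sets illustrate the
operator methods directly; only the spelling of a few method names differs
(Python writes isdisjoint where Lever writes is_disjoint).

```python
# Python set operators corresponding to the Lever set methods above.
a = set([1, 2, 3])
b = set([3, 4])

print(sorted(a | b))               # union: [1, 2, 3, 4]
print(sorted(a & b))               # intersection: [3]
print(sorted(a - b))               # difference: [1, 2]
print(sorted(a ^ b))               # symmetric difference: [1, 2, 4]
print(a.isdisjoint(set([9])))      # True
print(set([1, 2]) <= a)            # subset test: True
```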
space/string.py
Strings are immutable just like in Python. That turns out to be a very sane
choice.
Supported:
+repr
+hash
+getitem
+iter
.length
.join(strings)
.is_alpha()
.is_digit(base=10)
.is_space()
.startswith(str)
.endswith(str)
space/uint8array.py
Uint8Arrays aren't immutable, because that would not be very sane.
These are blobs of uint8 that manage to remove themselves when they get
forgotten.
If I remember right, you can pass these into the ffi as memory buffers. If I
don't, well..
Supported:
+repr
+getitem
+setitem
.length
Uint8Array(...)
size
iterable containing integers
stdlib/api.py
The 'api' module comes with utilities to load .json-formatted headers and
convert them into structures that let you annotate the C FFI in Lever.
The common problem with every new programming language is the lack of
libraries. When you're writing programs it's common to face recurring needs
with a lot of work involved to satisfy them. Access to C libraries solves many
of these needs very well.
Nearly every dynamic language worth using has a C FFI you can use to load shared
libraries and run the C code inside them. But these shared libraries do not come
with "headers" - details about how to call the code.
With Python there's been a tradition to write wrapper modules that load the
libraries, then export and annotate every symbol the author wants to expose.
For large libraries such modules are thousands of lines long and sometimes
automatically generated by a script.
There's a lot of tedious but trivial work involved in maintaining such
wrappers. To make it even more tedious you can attempt to "pythonify" the
wrapper and change the interface you expose to the user. That way you also
have to translate the original C documentation for your pythonified library.
And we know how eager programmers are to write documentation for their work,
especially when it involves a lot of redundant effort.
This is a tragedy in progress, and there is one precise cause for all the pain:
The fucking shared libraries do not come with all the means to call them!
C libraries come with C headers. When a C programmer uses such a library, he
retrieves the headers as well.
C programmers do not write annotations for the libraries they call. Instead
they refer to the headers they need. Those headers are themselves written in
C.
There's a complication in fetching the headers: you practically need to
preprocess everything properly and then parse it all. The amount of
preprocessing required before you obtain your headers is intense.
I did that, then I exported the important pieces to .json files that are
unambiguous and really quick to read and process. Having them
macro-preprocessed means they are platform-specific iff the library author was
harebrained.
The library for generating headers: https://github.com/cheery/cffi-gen
I may return to the subject some day, because the above generator is not built
to the standard it should be.
One simple use case for this module is using libSDL2 to write multimedia
programs. Lever holds headers/libSDL2.json, so you won't need to supply those
headers along with your program. You gain full access to the library with:
import api, ffi, platform
sdl2_api = api.open_nobind("libSDL2", {})
if platform.name == "win32"
    sdl = ffi.library("SDL2.dll", sdl2_api)
else
    sdl = ffi.library("libSDL2.so", sdl2_api)
assert sdl.Init(sdl.INIT_VIDEO) == 0,
    "SDL Init: " ++ sdl.GetError()
window = sdl.CreateWindow("Hello", 100, 100,
    640, 480, sdl.WINDOW_SHOWN)
assert window, "CreateWindow: " ++ sdl.GetError()
When preprocessing headers, it is acceptable for the author to remove the
library prefix and adjust the formatting of names.
Additionally, Lever has helpers for constructing bitmasks; their use is
optional. They were originally designed to handle Vulkan and should only be
used if they make the library easier to use.
Provided objects:
so_ext
funclibrary(api, func)
library(name, func=null, dependencies={}, decorator=null)
 - this is a shorthand for:
     ffi.library(name, api.read_file(name, dependencies, decorator))
   or, when func is given:
     api.funclibrary(api.read_file(name, dependencies, decorator), func)
read_file(path, dependencies={}, decorator=null)
read_object(obj, dependencies={}, decorator=null)
api.lookup_type(name)
- looks up a type.
api.build_type(name, decl)
 - builds a type from a declaration, assuming it has the name given here.
   Introduced along with the decorator argument.
Meaning of different arguments:
path: If absolute, it will be used as a direct path.
      If relative, it will be referenced from the headers/ directory.
      If the path doesn't end with ".json", that extension will be
      appended to the end of the path.
      The 'api' should treat paths as if they are relative to the headers/
      directory.
      To be forwards compatible, the ".json" file extension will be
      supplied for you, and you shouldn't add it yourself.
func: Meant for times when you have something like glGetProcAddr and you
      need to convert it into a library.
      You get a funclibrary that way.
dependencies: If a library depends on something, you can pass here which API
      it should use to satisfy the dependency. The object given here
      must have a getitem() method.
decorator(api, name, decl):
      Can be used to redefine how the library builds type objects. Also a
      concept that was originally introduced to handle Vulkan.
Format and function of a header json file:
{
"constants": { name: value }
"types": { name: decl }
"variables": {
name: {
name: cname
type: decl
}
}
"depends": [ name ] # optional, for now ignored and used as extra
# documentation.
}
The resolution order of getitem(api, name):
- cache
- constants dict
- variables dict
- types dict
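The resolution chain can be sketched as follows. This is a hypothetical
Python sketch: the dict-based api object and the api_getitem name are
illustrations, not the actual implementation.

```python
# Sketch of getitem(api, name): cache first, then constants, variables
# and types, caching whatever is found.
def api_getitem(api, name):
    if name in api["cache"]:
        return api["cache"][name]
    for table in ("constants", "variables", "types"):
        if name in api[table]:
            api["cache"][name] = api[table][name]
            return api["cache"][name]
    raise KeyError(name)

api = {
    "cache": {},
    "constants": {"INIT_VIDEO": 32},
    "variables": {},
    "types": {"INIT_VIDEO": "u32"},   # shadowed by the constant
}
print(api_getitem(api, "INIT_VIDEO"))  # 32: constants win over types
```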
decl can be one of several formats:
 "void" - treated as null
 "library.type" - fetched from the dependencies dict by using getitem()
     twice.
 "typename*" - treated as a pointer; the type is resolved for the prefix.
 "typename", found in the types dict -
     the type is resolved for the name and the resolved type is cached.
 "typename", not found in the types dict -
     resolved from the platform-dependent type table: ffi.systemv.types
     TODO: the table is not visible to the runtime
If a decorator is used, it may introduce additional types not present
below, or extend the behavior of these existing declarators.
{
"type": "cfunc"
"restype": decl
"argtypes": [decl]
}
{
"type": "union"
"fields": [[name, decl]]
}
{
"type": "struct"
"fields": [[name, decl]]
}
{
"type": "opaque"
}
{
"type": "array"
"length": null or int
"ctype": decl
}
{
"type": "pointer"
"to": decl
}
{
"type": "enum"
"ctype": decl
"constants": {name: int}
}
{
"type": "bitmask"
"ctype": decl
"constants": {name: int}
}
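The decl-string resolution above can be sketched in Python. Illustrative
only: SYSTEM_TYPES and the Pointer class are stand-ins for ffi.systemv.types
and the real pointer ctype, and the caching step is omitted.

```python
# Sketch of resolving a decl string: "void", "lib.type", "typename*",
# a local type, or a fall back to the system type table.
SYSTEM_TYPES = {"int": "i32", "char": "i8"}   # stand-in table

class Pointer:
    def __init__(self, to):
        self.to = to
    def __repr__(self):
        return "Pointer(%r)" % (self.to,)

def resolve_decl(decl, types, dependencies):
    if decl == "void":
        return None
    if decl.endswith("*"):
        return Pointer(resolve_decl(decl[:-1], types, dependencies))
    if "." in decl:
        library, name = decl.split(".", 1)
        return dependencies[library][name]    # getitem() twice
    if decl in types:
        return types[decl]
    return SYSTEM_TYPES[decl]

types = {"GLenum": "u32"}
print(resolve_decl("void", types, {}))      # None
print(resolve_decl("GLenum*", types, {}))   # Pointer('u32')
print(resolve_decl("char", types, {}))      # i8
```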
stdlib/ffi/__init__.py
The FFI in Lever was designed with the goal of making automatically generated
foreign function bindings so nice to work with that there would be little or
no need to write wrappers for them.
This means that, given a good enough C header parser/spec and a tool to
generate json-formatted headers, you can use a library in a manner that isn't
much worse than using it directly from C.
To keep it simple, in terms of C-visible storage the FFI only has the concept
of a memory buffer.
Supported objects:
array(ctype, count=null)
bitmask
cfunc(restype, [argtype])
handle - the concept of a library handle.
 It may make sense to make this object more like mem in the future.
library(name, api) -
 Loads a C library with the given name. The api object can be used to supply
 headers to make the library immediately usable. api must implement
 getitem().
 TODO: support path objects.
 Throws ffi.LoadError if loading fails.
mem - A memory object, can be obtained with malloc & automem
pointer(ctype)
signed(size)
struct([[name, ctype]])
union([[name, ctype]])
unsigned(size)
voidp - a void pointer.
wrap - an object that can be returned to a library object.
 - Contains a reference to a cname and a ctype. The intent is
   that the library exports it when it sees such an object.
pool(autoclear=true)
pool.alloc(ctype, count=1, clear=pool.autoclear)
- allocate from the pool.
pool.mark(obj) - keep 'obj' alive while pool is alive.
pool.free() - clean every entry in the pool.
cast(obj, ctype)
sizeof(ctype, count=1)
malloc(ctype, count=1, clear=false)
automem(ctype, count=1, clear=false)
free(mem)
memset(dst, byte, count)
memcpy(dst, src, count)
ref(membuffer)
callback(cfunctype, callback)
The result of the callback function must be stored for as long as the C
pointer to it remains in use.
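Python's ctypes has the same lifetime rule, which makes for a convenient
illustration: the object CFUNCTYPE returns must stay referenced for as long
as C may call the pointer. This sketch assumes a POSIX system where
ctypes.CDLL(None) exposes libc.

```python
# The classic ctypes qsort callback: keep cmp_callback alive while C
# holds the function pointer.
import ctypes

CMPFUNC = ctypes.CFUNCTYPE(ctypes.c_int,
                           ctypes.POINTER(ctypes.c_int),
                           ctypes.POINTER(ctypes.c_int))

def py_cmp(a, b):
    return a[0] - b[0]

cmp_callback = CMPFUNC(py_cmp)   # keep this reference alive!

arr = (ctypes.c_int * 4)(3, 1, 4, 1)
libc = ctypes.CDLL(None)         # POSIX only; loads the process's libc
libc.qsort.restype = None
libc.qsort(arr, 4, ctypes.sizeof(ctypes.c_int), cmp_callback)
print(list(arr))                 # [1, 1, 3, 4]
```

If cmp_callback were garbage collected before qsort used the pointer, the C
side would call into freed memory, which is exactly the hazard the paragraph
above warns about.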
stdlib/ffi/simple.py
ffi/simple.py: simple, platform-independent concepts of the FFI belong here.
This file establishes some concepts in the Lever FFI, so I should point them
out here.
sizeof() in Lever accepts both a ctype and a count. This makes sure that
parametric C types can be treated correctly.
size=0 and align=0 in a type mean it's an opaque type that cannot be
allocated.
It is worth mentioning that any Lever object can be treated as a C type if it
has the methods .load(value) and .store(value) and an attribute .shadow.
When that happens, the type is wrapped into a "shadow" type which internally
takes care that such user-defined ctypes behave like other type objects.
C types are like interfaces in the sense that they provide an interface, but
that interface is provided to a memory object residing in the C heap.
Types also describe the getattr/setattr behavior of C objects.
stdlib/ffi/bitmask.py
The idea of bitmasks, including enums and flags. This is not used if it
doesn't make the C FFI cleaner.
This became useful when implementing early Vulkan support in Lever. Vulkan
had so many object-orientedly named bitmasks that things became much cleaner
once there was something to tackle them with.
Internally a bitmask is represented as a value wrapped with an annotation.
There are several things you can do with bitmasks. First of all you can
convert them to integers. Second, you can getattr them to query whether a
flag or enum is set. Example:
apiType = lib.buffType
if apiType.FLOAT
float things...
elif apiType.INT
int things...
else
some error
When you have a field that accepts a bitmask, you may supply it a string,
another bitmask, an integer, or a list (only if it's not an enum).
For example:
lib.setTypes("FLOAT")
lib.setTypes(["INT", "FLOAT"])
lib.setTypes(2)
TODO: implicit conversions are present across the Lever ffi...
I've been considering whether they should be reversible for some purpose.
For example, these bitmasks: should it be possible to implicitly convert
them into the best representation you can infer?
This might make it easier for Lever to store C objects in .json.
stdlib/ffi/systemv.py
ffi/systemv.py: this code used to be partially or completely
SystemV-specific. The file has received non-SystemV-specific things as well,
so it might be revised if Lever is ever ported to different platforms.
Arrays are a pretty standard part of Lever. They may either form a parametric
field, or their size is specified and they don't.
If the size of the array's ctype is 1, the array memory object has a .str
attribute which gives you the array contents as a string. Lever assumes
utf-8 encoding in this case.
Arrays can be implicitly filled from lists when there is memory backing.
Unions and structures can be implicitly filled from a dictionary when there
is memory backing.
Pools were introduced while considering the Vulkan bindings. The intent of
this construct is to make it easy to fill large C structures.
When you allocate an object from a pool, the pool is used for implicit
allocation and for keeping references alive.
In the presence of a pool you can setattr lists and dictionaries into a
memory object and see them stored into implicitly allocated C structures
that you can pass to a C API.
If you pass a mem object to a pool-allocated record, the pool marks the
object and keeps it alive for as long as the pool itself is alive.
Calls to C functions create themselves a pool to construct the function
call. You should be able to call a function with a list as a parameter and
expect it to be converted into a memory object of appropriate size.
stdlib/fs.py
Every module that wants to read or write a file uses this module. It is a
quite rudimentary module, without beauties.
I would hope to combine the filesystem module with the event loop at some
point, and have the methods in this library work asynchronously, returning
control to the internal event loop whenever the process would otherwise have
to sleep.
'fs' exposes following objects:
exists(path)
stat(path)
getatime(path)
getmtime(path)
getctime(path)
read_file(path, mode="")
open(path, mode="")
file
file.read(count=null)
file.write(data:Uint8Array)
file.close()
file.seek(pos:int)
file.tell()
file.truncate(pos:int)
file.fileno()
file.isatty()
file.flush()
stdlib/gc.py
I would hope to have some control over the GC, but the RPython methods that
seemed relevant didn't turn out to control anything.
All this 'gc' module exposes is:
 collect()
I don't know if it triggers a collection cycle.
stdlib/__init__.py
When I have a directory already containing the names, it would feel off to
have to write a list of the modules present in Lever. A little script in
stdlib/__init__.py really brightened up my mood.
stdlib/json.py
This JSON parser has been derived from cheery/json-algorithm on GitHub.
Currently the json library is used for reading and writing the generated C
header files and for configs. They have a lot of common uses.
Loading and writing even large json files is fast, so there is not much need
to diverge from using json for the C headers.
Exposed function:
write_file(path, obj, options=null)
write_string(obj, options=null)
read_file(path)
read_string(string)
If an options dictionary is not provided, the writing functions use the
quick json encoder.
If an options dictionary is provided, the pretty printer is used for
encoding the json. The pretty printer takes the following options by
default:
 {indent=2, sort_keys=false}
How the pretty printing is activated may change in the future. In any case,
giving the 'indent' or 'sort_keys' options should always trigger it.
This library will likely receive improvements to its API, but the internal
algorithm seems very good and solid.
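Python's json module draws the same line between a quick encoder and a
pretty printer, so it serves as an illustration of the two modes (this is
Python's API, not Lever's):

```python
# Without options: quick, single-line encoding.
# With indent/sort_keys: pretty-printed, deterministic key order.
import json

obj = {"b": 1, "a": [1, 2]}
print(json.dumps(obj))                            # single line
print(json.dumps(obj, indent=2, sort_keys=True))  # pretty printed
```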
stdlib/platform.py
Sometimes it's really necessary to know what your platform looks like. Not
every interface is portable, and everyone can make mistakes.
Exposed values:
 name # Describes the name of the platform: win32 or linux.
        This is derived directly from sys.platform.
stdlib/process.py
Consider that you want to communicate with other programs. Chances are you
want to spawn processes. This library was written to get Blender to export
stale models automatically.
I'm honestly not sure how spawnv should work. The first argument often
seems unimportant, but I know how it would suck if some app depended on a
specific argv[0] and you couldn't trigger it with this code.
The module 'process' contains following objects:
spawnv(path, args:list)
waitpid(pid:int)
which(program:path)
is_exe(file:path)
util.py
A lot of things seem to fall into place in Lever, eventually. When they
don't, they end up in util.py. At 1.0.0 this file might be empty or gone.
vectormath.py
Interactive graphics, music.. VR... It seems inevitable that the runtime
needs to deal with vector arithmetic, and it will eventually have to make it
very fast.
Therefore we have some vector arithmetic, defined and pushed into the base
module! The functionality here isn't complete, but it is already sufficient.
There are some OpenGL 4 demos in the samples directory.
The main reason this code is in the base module is that I was not sure how
"from x import ..." should function exactly. It, again, presents a
dissonance case for reloadable modules: one more way to leave stale marks
from earlier modules in the runtime.
Vectormath also holds some randomizing functions and trigonometry functions.
It's steering toward the limit of containing gunk. Possibly some of these
features will have to be separated into modules later.
vec3, quat, mat4 are iterable and all have .length
vectormath.py introduces the following values into the base module:
vec3(x=0, y=0, z=0)
quat(x=0, y=0, z=0, w=1)
quat.to_mat4(position:vec3=vec3())
quat.invert()
axisangle(axis:vec3, angle:float)
mat4(values...)
mat4.transpose()
mat4.invert()
mat4.adjoint()
mat4.determinant()
mat4.rotate_vec3(vec3)
mat4.translate(vec3)
mat4.scale(vec3)
random()
random_circle()
random_sphere()
sin(float)
cos(float)
tan(float)
asin(float)
acos(float)
atan(float)
atan2(y, x)
sqrt(float)
projection_matrix(fovy_radians, aspect, znear, zfar)
left
right
up
down
forward
backward
length(value)
dot(a, b)
cross(a, b)
normalize(a)
pi
tau
Vectormath introduces the following functions as multimethods:
vec3 + vec3
vec3 - vec3
vec3 * float
float * vec3
vec3 / float
float / vec3
length(vec3)
dot(vec3, vec3)
cross(vec3, vec3)
normalize(vec3)
-vec3
-quat
+vec3
+quat
quat * quat
quat * vec3
mat4 * vec3
mat4 * mat4
clamp(float, float, float)
clamp(int, int, int)
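A minimal Python sketch of the vec3 operations listed above. Illustrative
only: the real implementation lives in vectormath.py and differs in detail
(the Vec3 class and helper names here are stand-ins).

```python
# Sketch of vec3 arithmetic, dot, cross, length and normalize.
import math

class Vec3:
    def __init__(self, x=0.0, y=0.0, z=0.0):
        self.x, self.y, self.z = x, y, z
    def __add__(self, o):
        return Vec3(self.x + o.x, self.y + o.y, self.z + o.z)
    def __mul__(self, s):
        return Vec3(self.x * s, self.y * s, self.z * s)

def dot(a, b):
    return a.x * b.x + a.y * b.y + a.z * b.z

def cross(a, b):
    return Vec3(a.y * b.z - a.z * b.y,
                a.z * b.x - a.x * b.z,
                a.x * b.y - a.y * b.x)

def length(a):
    return math.sqrt(dot(a, a))

def normalize(a):
    return a * (1.0 / length(a))

# x cross y gives the z axis, already unit length.
v = normalize(cross(Vec3(1, 0, 0), Vec3(0, 1, 0)))
print((v.x, v.y, v.z))   # (0.0, 0.0, 1.0)
```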