The Sparkling C API

The Sparkling API is bidirectional: the native host program and a Sparkling script can control each other.

Loading, compiling and running Sparkling scripts

Sparkling is a compiled language: the interpreter performs the usual transformations on the code before running it. Namely, source code is broken into tokens by the lexer, the token stream is analyzed by the parser to form an abstract syntax tree (AST), then the AST is walked by the compiler, a step where actual executable code (bytecode) is generated.

Then, the bytecode is fed into the virtual machine (VM) which executes the instructions and responds with the return value of the program.

The Sparkling API provides data types and functions for all these tasks.

typedef struct SpnParser SpnParser;

SpnParser represents the parser (no surprise here). Its API consists of memory management (creation, destruction) functions, an actual parser function and error handling.

typedef struct SpnAST SpnAST;

An abstract syntax tree object. AST nodes have at most two children, and they are connected in a right-leaning manner (the first child is called the left child, the second child is the right child).

void spn_parser_init(SpnParser *);

Initializes a parser object.

void spn_parser_free(SpnParser *);

Deallocates resources that the parser object owns.

SpnAST *spn_parser_parse(SpnParser *, const char *);

This function takes Sparkling source text and parses it into an AST. On error, sets the error message and returns NULL. The returned AST shall be freed using spn_ast_free() after use.

typedef struct SpnCompiler SpnCompiler;

A compiler takes the syntax tree of a program and outputs a bytecode image. This bytecode can be run on the Sparkling virtual machine.

SpnCompiler *spn_compiler_new(void);
void spn_compiler_free(SpnCompiler *);

Memory management functions.

SpnFunction *spn_compiler_compile(SpnCompiler *, SpnAST *, int debug);

Compiles an AST into bytecode. Returns a callable function object on success, a NULL pointer on error. (The returned function object holds an owning pointer to the generated bytecode, so unless you are using the convenience context API, you must spn_value_release() it after use.) If debug is nonzero, the returned function will contain debugging information.

The Sparkling bytecode format is platform-dependent: bytecode generated on a given platform will always run on that same platform, so it can safely be serialized to a file and later loaded and run by another host program or process there, but it is not portable across different platforms. (Specifically, the size of the long data type, the representation of floating-point numbers, the representation of negative signed integers and endianness may vary.)
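Since a serialized bytecode file is only valid on a matching platform, a host program may want to record the properties listed above next to the file and compare them before loading it. The following is a minimal sketch in standard C; PlatformTag, platform_tag() and platform_tag_equal() are hypothetical names made up for this illustration, and they are not part of the Sparkling API or of its object file format.

```c
#include <assert.h>
#include <float.h>
#include <limits.h>
#include <string.h>

/* Hypothetical tag describing the host platform; not part of Sparkling. */
typedef struct PlatformTag {
    unsigned char long_size;     /* sizeof(long) in bytes */
    unsigned char little_endian; /* 1 if the least significant byte is stored first */
    unsigned char negative_repr; /* 3: two's complement, 2: ones' complement, 1: sign-magnitude */
    unsigned char double_size;   /* sizeof(double) in bytes */
    unsigned char double_mant;   /* DBL_MANT_DIG (53 for IEEE 754 binary64) */
} PlatformTag;

/* Probes the properties of the platform this code is running on. */
PlatformTag platform_tag(void)
{
    PlatformTag tag;
    unsigned long probe = 1;
    unsigned char first_byte;

    /* Every property is stored in a single byte, so make sure it fits. */
    assert(sizeof(long) <= UCHAR_MAX && sizeof(double) <= UCHAR_MAX);
    assert(DBL_MANT_DIG <= UCHAR_MAX);

    /* The first byte of the integer 1 is 1 only on little-endian machines. */
    memcpy(&first_byte, &probe, 1);

    tag.long_size = (unsigned char)sizeof(long);
    tag.little_endian = (unsigned char)(first_byte == 1);
    tag.negative_repr = (unsigned char)(-1 & 3); /* low two bits of -1 */
    tag.double_size = (unsigned char)sizeof(double);
    tag.double_mant = (unsigned char)DBL_MANT_DIG;
    return tag;
}

/* Returns nonzero if two tags describe compatible platforms. */
int platform_tag_equal(PlatformTag a, PlatformTag b)
{
    return a.long_size == b.long_size
        && a.little_endian == b.little_endian
        && a.negative_repr == b.negative_repr
        && a.double_size == b.double_size
        && a.double_mant == b.double_mant;
}
```

A host could write the result of platform_tag() alongside the serialized bytecode and refuse to load the file if platform_tag_equal() reports a mismatch. The floating-point check is deliberately coarse: it compares only the size and mantissa width of double, not the full representation.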

const char *spn_compiler_errmsg(SpnCompiler *);

Returns the last error message. Check this if the compilation of an AST failed.

typedef struct SpnVMachine SpnVMachine;

A virtual machine is an object that manages the execution of bytecode images.

SpnVMachine *spn_vm_new(void);
void spn_vm_free(SpnVMachine *);

Memory management functions.

void spn_vm_addlib_cfuncs(SpnVMachine *vm, const char *libname,
    const SpnExtFunc fns[], size_t n);

Registers n native (C language) extension functions and makes them visible to all scripts running on the specified virtual machine instance. If libname is NULL, the functions will be available in the global namespace. Otherwise, a new global array will be created with the name libname, and the functions will be members of this global array. This is how you can create "modules" or "namespaces".

void spn_vm_addlib_values(SpnVMachine *vm, const char *libname,
    SpnExtValue fns[], size_t n);

This function is similar to spn_vm_addlib_cfuncs(), but it accepts any valid SpnValue, not just C extension functions.

SpnHashMap *spn_vm_getglobals(SpnVMachine *vm);

This function gives access to the global symbols currently registered with the virtual machine. This comes in handy when one wants to call a function with a specific name: given a name, it is possible to get an SpnValue out of the hashmap that corresponds to the function with the specified name. It is also possible to set (add or replace) a global constant using the C API. That should be done very carefully, though.

void *spn_vm_getcontext(SpnVMachine *);

Returns the context info of the virtual machine. This is set and retrieved by the user exclusively. It's important mainly when one wants to communicate with extension functions.

void spn_vm_setcontext(SpnVMachine *vm, void *ctx);

Sets the context info of vm to the user-supplied ctx.

const char *spn_vm_geterrmsg(SpnVMachine *);

If a runtime error occurred, returns the last error message.

void spn_vm_seterrmsg(SpnVMachine *, const char *, const void *[]);

This function should only be called from within a native extension function, in case of erroneous termination (before returning a non-zero status code). It sets a custom, formatted error message that will be returned by spn_vm_geterrmsg().

SpnStackFrame *spn_vm_stacktrace(SpnVMachine *vm, size_t *n);

Returns an array of stack frame descriptor structures (SpnStackFrame). The return value must be free()d after use. On return, *n contains the number of stack frames.

The SpnStackFrame structure has the following fields:

SpnFunction *function;
ptrdiff_t exc_address;
ptrdiff_t return_address;
void *sp;

The function member is an SpnFunction pointer that points to the function object corresponding to the stack frame. You can use its name member to generate a symbolicated stack trace.

The exc_address member contains an approximation of the address where the actual runtime error occurred. For the 0th frame, this is the same as the return value of spn_vm_exception_addr(); for frames #1 and above, this is computed from the return address of the caller. (The frame of the caller of the function in the ith frame is the frame at index i - 1.)

The return_address member contains the offset into the bytecode of the caller of function where control flow would have been transferred if the function had returned normally. If return_address is negative, then the caller of the function is a native C function, and not a Sparkling script function.

sp is an opaque pointer. It holds information about the registers in the given stack frame. It is used by spn_vm_get_register().
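To illustrate how these fields might be interpreted when printing a stack trace, here is a small sketch. It does not include the Sparkling headers: FrameInfo is a hypothetical stand-in that mirrors only the documented exc_address and return_address members, and its name member stands in for the name of the function object. The helper names and the "<anonymous>" placeholder are also assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the documented fields of SpnStackFrame. */
typedef struct FrameInfo {
    const char *name;          /* stands in for the function's name member */
    ptrdiff_t exc_address;     /* approximate bytecode offset of the error */
    ptrdiff_t return_address;  /* negative if the caller is a native C function */
} FrameInfo;

/* A negative return address means that the caller is native code. */
int caller_is_native(const FrameInfo *frame)
{
    assert(frame != NULL);
    return frame->return_address < 0;
}

/* Writes a one-line description of the frame into buf.
 * Returns the value of snprintf(), i.e. the length of the full line. */
int format_frame(char *buf, size_t size, size_t index, const FrameInfo *frame)
{
    assert(buf != NULL && frame != NULL);
    return snprintf(buf, size,
        "#%lu %s: error near offset %ld, called from %s code",
        (unsigned long)index,
        frame->name ? frame->name : "<anonymous>",
        (long)frame->exc_address,
        caller_is_native(frame) ? "native" : "script");
}
```

With the real API, the same logic would presumably loop over the array returned by spn_vm_stacktrace() and free() it afterwards, as described above.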

SpnValue spn_vm_get_register(SpnStackFrame *frame, size_t index);

If frame corresponds to a Sparkling function (top-level program, closure or non-closure script function), then returns the value of the indexth register in the stack frame. Along with the debug information, this can be useful for e. g. inspecting the value of certain variables or other expressions.

If frame corresponds to a native (C) extension function, or index is too high (>= the number of registers in the stack frame), then the behavior is undefined.

The returned value handle is non-owning, i. e. it is invalidated by a subsequent function call initiated on the corresponding virtual machine object or by the deallocation of the virtual machine object.

ptrdiff_t spn_vm_exception_addr(SpnVMachine *vm);

If the last runtime error was thrown by Sparkling code (a script function or top-level program), then returns the offset into the bytecode of the offending VM instruction. Otherwise (when the last error was caused by a native extension function, or when there haven't been any runtime exceptions so far), this function returns a negative value.

int spn_vm_callfunc(
    SpnVMachine *vm,
    SpnFunction *fn,
    SpnValue *retval,
    int argc,
    SpnValue *argv
);

This function makes it possible to call any Sparkling (or native) function from C code (e. g. from native extension functions or the host environment).

On return, the value pointed to by retval contains the return value of the called function. This is an owning structure: you must spn_value_release() it when you don't need it anymore. The returned value may refer to the program and/or bytecode, so you must not use the value after the main program itself has been deallocated. Usually this is not a problem, since normally, when using the context API, the lifetime of the programs is tied to the lifetime of the corresponding context object, and one should only free the context object when one doesn't use the Sparkling engine anymore.

Throws a runtime error if:

  1. its fn argument does not contain a value of function type, or
  2. fn is a native function and it returns a non-zero status code.

Script functions are tied to their top-level program, so if fn is not a native function, then it should be implemented in a translation unit that has already been executed at least once.

Returns the status code of fn.

This is the function that you can use to run compiled Sparkling programs. It is used by the context API too, for the same purpose.

Important: if spn_vm_callfunc() returns nonzero (indicating that the called function failed in some way or another), then your native extension function (the caller) is obliged to return an error code as well; there's no possibility for graceful error recovery. This is required because of an implementation detail: each native function has its own "pseudo-frame" on the call stack so that it can be shown in the stack trace if it returns an error status code. If it terminates normally, then this pseudo-frame is, of course, popped off of the call stack. If, however, the function reports an error, the dummy frame isn't removed (so that the stack trace is complete).

Now, if a non-native caller function tries to access its frame, this will lead to an off-by-one (or two, or however many functions do not follow this convention) error, and the caller will operate on the frame of another function, corrupting the stack.

For the same reason, you must not attempt to call another function using spn_vm_callfunc() after a runtime error has occurred.

Usually, this constraint should not be a problem, though -- only fatal errors should lead a function to return an error status code anyway. Non-critical errors (from which recovery is considered possible) should be reported by means of some other mechanism (e. g. by returning nil or a Boolean status flag to the caller).
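The pseudo-frame rule can be made more tangible with a toy model. The sketch below is not the Sparkling VM and makes no claim about its internals beyond what is described above: ToyVM, toy_callfunc() and the other names are hypothetical, and the "call stack" is just an array of function names. It shows why a native caller that swallows an error leaves the stack one frame off.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_FRAMES 16

/* Toy call stack: only the names of the functions that have a live frame. */
typedef struct ToyVM {
    const char *frames[TOY_MAX_FRAMES];
    size_t depth;
} ToyVM;

typedef int (*ToyNative)(ToyVM *vm);

/* Pushes a pseudo-frame, runs fn, and pops the topmost frame only on success. */
int toy_callfunc(ToyVM *vm, const char *name, ToyNative fn)
{
    int status;

    assert(vm->depth < TOY_MAX_FRAMES);
    vm->frames[vm->depth++] = name;

    status = fn(vm);
    if (status == 0)
        vm->depth--; /* normal return: pop whatever frame is on top */

    /* On error, the frame is kept so that the stack trace stays complete. */
    return status;
}

/* A native function that always fails. */
int toy_failing(ToyVM *vm)
{
    (void)vm;
    return -1;
}

/* Correct: passes the error status of the callee on to its own caller. */
int toy_propagating_caller(ToyVM *vm)
{
    int status = toy_callfunc(vm, "failing", toy_failing);
    if (status != 0)
        return status;
    return 0;
}

/* Wrong: ignores the error, so the pop removes the callee's frame, not its own. */
int toy_swallowing_caller(ToyVM *vm)
{
    toy_callfunc(vm, "failing", toy_failing);
    return 0;
}
```

Calling toy_callfunc() with toy_propagating_caller on an empty ToyVM returns nonzero and leaves two frames for the stack trace. With toy_swallowing_caller it returns zero but leaves depth at 1 instead of 0, which is the off-by-one situation described above.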

Using the convenience context API

The Sparkling API also provides an even easier interface, called the context API. This API is the preferred way of accessing the Sparkling engine. A context object encapsulates a parser, a compiler, a virtual machine, and each bytecode file that has been loaded into that context.

void spn_ctx_init(SpnContext *ctx);
void spn_ctx_free(SpnContext *ctx);

These functions initialize and destroy a context object. spn_ctx_init() sets the context object itself as the context info of the contained virtual machine. If you need user info (e. g. from within an extension function), use the spn_ctx_getuserinfo() and spn_ctx_setuserinfo() functions.

enum spn_error_type spn_ctx_geterrtype(SpnContext *ctx);
const char *spn_ctx_geterrmsg(SpnContext *ctx);
SpnSourceLocation spn_ctx_geterrloc(SpnContext *ctx);

These functions return the type, description and location in the source code of the last error that occurred in the context. If no error occurred, the error type is SPN_ERROR_OK, and in this case, the description is a NULL pointer, and the location is { line: 0, column: 0 }.

SpnFunction *spn_ctx_compile_string(SpnContext *ctx, const char *str, int debug);
SpnFunction *spn_ctx_compile_srcfile(SpnContext *ctx, const char *fname, int debug);
SpnFunction *spn_ctx_compile_expr(SpnContext *ctx, const char *expr, int debug);

These functions attempt to parse and compile a source string. If an error is encountered during either phase, they set an error message and they return NULL. Else they return a function pointing to the compiled bytecode. This value is non-owning -- it should not be released explicitly, because it is deallocated when you free the context object.

The first two functions expect the source string or the contents of the source file to be a full top-level program (i. e. zero or more statements), while spn_ctx_compile_expr() expects a single expression.

The bytecode objects are accumulated inside the context object, in the form of an SpnArray. This array can be accessed through spn_ctx_getprograms(). (See the warning about the returned array being read-only at the documentation of the spn_ctx_getprograms() function!)

SpnFunction *spn_ctx_loadobjfile(SpnContext *ctx, const char *fname);

Reads a compiled object file and returns a non-owning function representing its contents. Adds the function to the program list, as described above. On error, it returns a null pointer.

int spn_ctx_execstring(SpnContext *ctx, const char *str, SpnValue *ret);
int spn_ctx_execsrcfile(SpnContext *ctx, const char *fname, SpnValue *ret);
int spn_ctx_execobjfile(SpnContext *ctx, const char *fname, SpnValue *ret);

These wrapper functions call the corresponding spn_ctx_compile_* or spn_ctx_loadobjfile() function, but they also attempt to execute the resulting compiled bytecode. On success, they copy the result of the executed program to ret and return zero. On error, they set an appropriate error type and error message.

int spn_ctx_callfunc(
    SpnContext *ctx,
    SpnFunction *func,
    SpnValue *ret,
    int argc,
    SpnValue argv[]
);

void spn_ctx_runtime_error(SpnContext *ctx, const char *fmt, const void *args[]);

SpnStackFrame *spn_ctx_stacktrace(SpnContext *ctx, size_t *size);

ptrdiff_t spn_ctx_exception_addr(SpnContext *ctx);

void spn_ctx_addlib_cfuncs(SpnContext *ctx, const char *libname,
    const SpnExtFunc fns[], size_t n);

void spn_ctx_addlib_values(SpnContext *ctx, const char *libname,
    SpnExtValue vals[], size_t n);

SpnHashMap *spn_ctx_getglobals(SpnContext *ctx);

These are equivalent to calling spn_vm_callfunc(), spn_vm_seterrmsg(), spn_vm_stacktrace(), spn_vm_exception_addr(), spn_vm_addlib_cfuncs(), spn_vm_addlib_values() and spn_vm_getglobals(), respectively, on ctx->vm.

SpnArray *spn_ctx_getprograms(SpnContext *ctx);

Returns an array of SpnValue objects, which are functions representing the top-level programs that have been added to the context. You must not modify the returned array in any way.

Writing native extension functions

Native extension functions must have the following signature:

int my_extfunc(SpnValue *ret, int argc, SpnValue *argv, void *ctx);

Whenever a registered native function is called from within a Sparkling script, the corresponding C function receives a pointer where the return value must be written (this is initialized to nil by default, so if a function doesn't set this argument, it will return nil into Sparkling-land), the number of arguments it was called with in argc, and a pointer to the first element of the argument vector, which is an array of argc value structures.

The user-controlled context info field of the virtual machine is also passed to extension functions in the ctx parameter.

Native extension functions must return 0 on success or nonzero on error (returning a non-zero value will generate a runtime error in the VM). The argv array and its members are never to be modified; the only exception to this rule is that the user can retain the members (along with copying them) to store them for later use. In this case, they are to be released as well. Memory management functions spn_value_retain() and spn_value_release() are provided for this purpose.

One word about the return value. Values are represented using the SpnValue struct, which is essentially a tagged union (with some additional flags). Values can be of type nil, Boolean, number (integer or floating-point), function, user info, string, array and hashmap. Strings, arrays, hashmaps, functions and user info values marked as such are object types (i. e. they have their SPN_FLAG_OBJECT flag set in the structure). It means that they are reference counted. As an implementation detail, it is required that if a native extension function returns a value of such an object type, then it shall own a reference to it (because internally, it will be released by the virtual machine when it's not needed anymore). So, if, for example, one of the arguments is returned from a function (not impossible), then it should be retained.

In other words: the following is wrong:

int memory_corruption(SpnValue *ret, int argc, SpnValue *argv, void *ctx)
{
    if (argc > 0)
        *ret = argv[0];

    return 0;
}

This is how it should have been done:

int well_done(SpnValue *ret, int argc, SpnValue *argv, void *ctx)
{
    if (argc > 0) {
        spn_value_retain(&argv[0]);
        *ret = argv[0];
    }

    return 0;
}
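Why the missing retain matters can be shown with a toy reference-counting model. This is a sketch under stated assumptions, not Sparkling code: ToyObject, toy_retain(), toy_release() and toy_call() are hypothetical names, and instead of actually freeing memory, the object only records that it would have been deallocated.

```c
#include <assert.h>

/* Toy reference-counted object; `alive` is cleared instead of calling free(). */
typedef struct ToyObject {
    unsigned refcount;
    int alive;
} ToyObject;

void toy_retain(ToyObject *obj)
{
    assert(obj->alive);
    obj->refcount++;
}

void toy_release(ToyObject *obj)
{
    assert(obj->alive && obj->refcount > 0);
    if (--obj->refcount == 0)
        obj->alive = 0; /* a real implementation would deallocate here */
}

/* Mirrors memory_corruption(): returns the argument without owning a reference. */
ToyObject *return_arg_unretained(ToyObject *arg)
{
    return arg;
}

/* Mirrors well_done(): takes a reference before returning the argument. */
ToyObject *return_arg_retained(ToyObject *arg)
{
    toy_retain(arg);
    return arg;
}

/* Toy "VM": calls fn, then drops the reference that the return value is
 * expected to own. Returns whether the argument, to which the script still
 * holds its own reference, has survived the call. */
int toy_call(ToyObject *(*fn)(ToyObject *), ToyObject *arg)
{
    ToyObject *result = fn(arg);
    toy_release(result);
    return arg->alive;
}
```

Starting from an object with a reference count of 1, toy_call() with return_arg_retained leaves the object alive with its count back at 1, whereas with return_arg_unretained the count drops to zero while the script is still using the object.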

Use the convenience value constructor functions in api.h, str.h, array.h, hashmap.h and func.h in order to create value structs of any type.

Sparkling API functions typically copy and retain input values, and return non-owning pointers when giving output to the caller. Thus, if you want to use a value longer than an immediate operation, you typically retain and copy the structure too. In the other direction, when you supply a value to a function, you can pass the address of an automatic local (block-scope) variable -- it will be safely copied, and the value inside will also be retained if it's an object.

Accessing debug information

A Sparkling compiler instance can be asked to generate debugging information. This information is then available in the debug_info member of the resulting function (the one which contains the bytecode for the top-level program).

The debug info contains implementation-defined data which maps low-level bytecode addresses, register numbers, etc. to source lines, columns and variable names, etc.

void spn_dbg_set_filename(SpnHashMap *debug_info, const char *fname);

By default, the compiler doesn't ask for a filename, even if it is generating debug information. (The rationale behind this design decision is that not all Sparkling programs may have been created from a file, and I didn't want to make the compiler API a complete pain to use.)

So, if you know that a particular program has been created from a file, you can use this function to register the file name in the debug info.

const char *spn_dbg_get_filename(SpnHashMap *debug_info);

Returns the filename set using spn_dbg_set_filename() or "???" if no file name was set and/or debug_info was NULL.

SpnSourceLocation spn_dbg_get_frame_source_location(SpnStackFrame frame);

Returns a source location object corresponding to the error location in frame. frame must have been obtained via a call to spn_vm_stacktrace() or spn_ctx_stacktrace(). If no corresponding location info is found, returns (0, 0).

SpnSourceLocation spn_dbg_get_raw_source_location(
    SpnHashMap *debug_info,
    ptrdiff_t address
);

Returns the source location for an arbitrary address pointing inside a bytecode array described by debug_info.

If debug_info is NULL or the location couldn't be determined, returns (0, 0).