
[Proposal] Utilities for argument validation in external functions #1716

@martijnthe

When implementing external functions, you might have written something like this:

// Native implementation of Xyz.prototype.doSomething(a, b)
// First arg is expected to be a string.
// Second (optional) arg is expected to be a boolean.
// `this` is expected to be bound to an instance of Xyz.
static jerry_value_t native_method_impl(const jerry_value_t func_val __attribute__((unused)),
                                        const jerry_value_t this_val __attribute__((unused)),
                                        const jerry_value_t *args_p __attribute__((unused)),
                                        const jerry_length_t args_cnt __attribute__((unused))) {
    // Only the first argument is mandatory; the second one is optional.
    if (args_cnt < 1) {
        return jerry_create_error(...);
    }

    // Validate and "transform" (copy to C string) argument 0:
    if (!jerry_value_is_string(args_p[0])) {
        return jerry_create_error(...);
    }
    char arg0[32];
    if (0 == jerry_string_to_utf8_char_buffer(args_p[0], (jerry_char_t *)arg0, sizeof(arg0))) {
        // arg0 buffer too small
        return jerry_create_error(...);
    }

    // Validate and "transform" (copy to C bool) argument 1:
    bool arg1 = true; /* default value is true */
    if (args_cnt >= 2) {
        // Optional 2nd arg of type boolean
        if (!jerry_value_is_boolean(args_p[1])) {
            return jerry_create_error(...);
        }
        arg1 = jerry_get_boolean_value(args_p[1]);
    }

    // Validate `this`:
    struct native_obj_t *this_native_obj;
    if (!jerry_get_object_native_handle(this_val, (uintptr_t *)&this_native_obj)) {
        // Whoops, no native handle at all! Caller probably re-bound the method.
        return jerry_create_error(...);
    }
    if (!check_is_native_obj(this_native_obj)) {
        // Whoops, some other kind of struct! Caller probably re-bound the method.
        return jerry_create_error(...);
    }

    // yay, arguments validated (finally!)
    // Now, let's use this_native_obj, arg0 and arg1 to do something interesting!
    // ... method implementation here ...

    return jerry_create_undefined();
}

void xyz_init(void)
{
    // ...
    jerry_value_t do_something = jerry_create_external_function(native_method_impl);
    jerry_value_t rv = jerry_set_property (xyz_prototype, do_something_name, do_something);
    // ...
}

In short, when implementing a native external function, one may have to:

  1. Check that mandatory arguments...
     • are supplied at all.
     • are of the expected type.
     • are within the expected range of values.
  2. Check optional arguments in the same manner as mandatory ones, assuming default values when they are omitted.
  3. Check `this` in the same manner as mandatory arguments.
  4. If a check does not pass, return an appropriate JS Error (Error, TypeError, ...).
  5. "Transform" the JS values to C values (i.e. copy to a C string, assign a C bool, convert a string to a C enum, ...).

All projects I've looked at use either macros or helper functions to condense the code a bit, but they still
involve quite a bit of manually written validation code.
Writing correct validation code is involved, tedious and error-prone.
Therefore, I'd like to propose adding a utility to JerryScript to help with this task.

Proposal

Design Goals for the initial version of this utility:

  • Simple validation of arguments (incl. this) and "transformation" to C types.
  • The solution should target "most common cases", including:
    • Checks & transforms for values of primitive/common types (string => char *, boolean => C bool, etc.).
    • Optional arguments.
  • Extensibility: it should be possible to add additional check/transformer functions without having to change JerryScript source code files.
  • Smaller binary code size compared to manual validation code.

Non-goals:

  • "Function overloading" / allowing an argument to be of various types and changing the behavior of the function based on the argument type. In my opinion this is not a "common case". Addressing this case by manually writing validation code is probably fine.
  • "Rest/... parameters". I think it's OK to check/transform rest parameters manually for now.
  • "options" object arguments (passing an object containing additional/optional arguments). Support for this can be added later.
  • string "symbols" to C enums: can be added later. However, it should be trivial to add such a check+transform yourself.

@jiangzidong and I have come up with the following solution direction.
Here is how the utility would be used in the example use case:

static const jerry_native_handle_info_t native_obj_info = ...; // depends on PR #1711

// Native implementation of Xyz.prototype.doSomething(a, b)
// First arg is expected to be a string.
// Second (optional) arg is expected to be a boolean.
// `this` is expected to be bound to an instance of Xyz.
static jerry_value_t native_method_impl(const jerry_value_t func_val __attribute__((unused)),
                                        const jerry_value_t this_val __attribute__((unused)),
                                        const jerry_value_t *args_p __attribute__((unused)),
                                        const jerry_length_t args_cnt __attribute__((unused))) {
    char arg0[32];
    bool arg1 = true; /* default value is true */
    struct native_obj_t *this_native_obj;

    const jerry_arg_t mapping[] = {
         // First element in the array is the mapping for `this`.
         // If no checking is needed for `this`, use JERRY_ARG_IGNORE.
        {JERRY_ARG_TYPED_NATIVE_HANDLE, &this_native_obj, native_obj_info}, // depends on PR #1711

        // Further elements map to args_p[0], args_p[1], etc.
        {JERRY_ARG_UTF8_STRING, arg0, sizeof(arg0)},
        {JERRY_ARG_OPTIONAL | JERRY_ARG_BOOL, &arg1},
    };

    // This function is the "work horse" which does the validation and transformations:
    jerry_value_t err;
    if (!jerry_validate_and_assign_args(&err, this_val, args_p, args_cnt, mapping, ARRAY_LENGTH(mapping))) {
        return err;
    }

    // yay, arguments validated (that was easy!)
    // Now, let's use this_native_obj, arg0 and arg1 to do something interesting!
    // ... method implementation here ...

    return jerry_create_undefined();
}

The basic idea is to create an array that describes which types of arguments are expected in the args_p array, how to transform them, and where to assign the results.
A glimpse behind the scenes:

typedef enum
{
    // Ignore the argument (useful to skip validating `this` or for args that need to be checked manually)
    JERRY_ARG_IGNORE = 0,

    // Checks for boolean and assigns to C bool
    JERRY_ARG_BOOL,

    // Checks for number within uint8_t range and assigns to C uint8_t
    JERRY_ARG_UINT8,

    // ... repeats for each [u]intXX_t

    // Checks for string, copies C string to existing buffer.
    JERRY_ARG_UTF8_STRING,

    // Checks for string, creates buffer using jerry_port_malloc() and copies C string to it.
    JERRY_ARG_UTF8_STRING_MALLOC,

    // Checks that native handle matches supplied jerry_native_handle_info_t * // depends on PR #1711
    JERRY_ARG_TYPED_NATIVE_HANDLE,

    // Checks for function and assigns to jerry_value_t
    JERRY_ARG_FUNCTION,

    ...

    JERRY_ARG__COUNT,

    // Flag to indicate the argument is optional (`undefined` or of the expected type & value)
    JERRY_ARG_OPTIONAL = 0x80,
} jerry_arg_type_t;

_Static_assert(JERRY_ARG__COUNT <= JERRY_ARG_OPTIONAL, "Too many JERRY_ARG_... enums!");

typedef struct
{
    // Type of check + transform:
    jerry_arg_type_t arg_type;

    // "Destination" pointer: (this is where the transformed value needs to be assigned)
    union {
        void *ptr;
        char *ptr_char;
        uint8_t *ptr_uint8;
        // ... etc ...
    };
} jerry_arg_t;

Using anonymous unions, defining the array can be quite clean looking.
I imagine adding (inline) C functions to create each of the jerry_arg_t elements in the array and get better type safety etc.
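
A minimal sketch of what such inline constructor functions could look like. Everything here is an assumption made for illustration: the `ctor_` names, the simplified struct (a plain `void *` plus a size instead of the union), and the way the optional flag is passed are hypothetical and not part of JerryScript.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the proposed types; none of these names are JerryScript API. */
typedef enum
{
  CTOR_ARG_IGNORE = 0,
  CTOR_ARG_BOOL,
  CTOR_ARG_UINT8,
  CTOR_ARG_UTF8_STRING,
  CTOR_ARG_OPTIONAL = 0x80 /* flag, OR-ed onto one of the types above */
} ctor_arg_type_t;

typedef struct
{
  int arg_type;     /* a ctor_arg_type_t value, possibly with CTOR_ARG_OPTIONAL set */
  void *dest;       /* where the transformed value would be written */
  size_t dest_size; /* size of the destination in bytes */
} ctor_arg_t;

/* Skip validation for this position (e.g. an unchecked `this`). */
static inline ctor_arg_t ctor_arg_ignore (void)
{
  ctor_arg_t arg = { CTOR_ARG_IGNORE, NULL, 0 };
  return arg;
}

/* The parameter type forces the caller to pass a bool destination. */
static inline ctor_arg_t ctor_arg_bool (bool *dest, bool optional)
{
  assert (dest != NULL);
  ctor_arg_t arg = { CTOR_ARG_BOOL | (optional ? CTOR_ARG_OPTIONAL : 0), dest, sizeof (*dest) };
  return arg;
}

static inline ctor_arg_t ctor_arg_uint8 (uint8_t *dest, bool optional)
{
  assert (dest != NULL);
  ctor_arg_t arg = { CTOR_ARG_UINT8 | (optional ? CTOR_ARG_OPTIONAL : 0), dest, sizeof (*dest) };
  return arg;
}

/* The buffer size travels with the pointer, so the two cannot get out of sync. */
static inline ctor_arg_t ctor_arg_utf8_string (char *dest, size_t dest_size)
{
  assert (dest != NULL && dest_size > 0);
  ctor_arg_t arg = { CTOR_ARG_UTF8_STRING, dest, dest_size };
  return arg;
}
```

With helpers like these, a mapping array inside a function body could be written as `{ ctor_arg_ignore (), ctor_arg_utf8_string (arg0, sizeof (arg0)), ctor_arg_bool (&arg1, true) }`, and passing a `char *` where a `bool *` is expected would produce a compiler diagnostic instead of silently going through a `void *`.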

jerry_validate_and_assign_args() would take the array of jerry_arg_t elements and iterate over them, checking for each element whether the corresponding argument from args_p matches. If so, it would transform the value and assign it to the "destination".
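
The iteration described above can be sketched roughly as follows. To keep the example compilable without the engine, it uses a mock value type (`mock_value_t`) in place of jerry_value_t and reports the index of the failing argument instead of building a JS Error; all `sketch_` and `mock_` names are hypothetical, and handling of `this` is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mock of a JS value, standing in for jerry_value_t (hypothetical). */
typedef enum { MOCK_UNDEFINED, MOCK_BOOLEAN, MOCK_NUMBER, MOCK_STRING } mock_kind_t;

typedef struct
{
  mock_kind_t kind;
  bool boolean;       /* valid when kind == MOCK_BOOLEAN */
  double number;      /* valid when kind == MOCK_NUMBER */
  const char *string; /* valid when kind == MOCK_STRING, NUL-terminated */
} mock_value_t;

enum
{
  SKETCH_ARG_IGNORE = 0,
  SKETCH_ARG_BOOL,
  SKETCH_ARG_UINT8,
  SKETCH_ARG_UTF8_STRING,
  SKETCH_ARG_OPTIONAL = 0x80 /* flag bit, OR-ed onto a type */
};

typedef struct
{
  int arg_type;     /* one of the types above, optionally with the flag */
  void *dest;       /* destination for the transformed value */
  size_t dest_size; /* destination size in bytes */
} sketch_arg_t;

/* Check one present value against one mapping entry and assign on success. */
static bool sketch_assign_one (const sketch_arg_t *arg, const mock_value_t *val)
{
  switch (arg->arg_type & ~SKETCH_ARG_OPTIONAL)
  {
    case SKETCH_ARG_IGNORE:
      return true;
    case SKETCH_ARG_BOOL:
      if (val->kind != MOCK_BOOLEAN) return false;
      *(bool *) arg->dest = val->boolean;
      return true;
    case SKETCH_ARG_UINT8:
      /* Written so that NaN fails the range check before any cast happens. */
      if (val->kind != MOCK_NUMBER || !(val->number >= 0 && val->number <= UINT8_MAX)) return false;
      if ((double) (uint8_t) val->number != val->number) return false; /* not an integer */
      *(uint8_t *) arg->dest = (uint8_t) val->number;
      return true;
    case SKETCH_ARG_UTF8_STRING:
    {
      size_t len;
      if (val->kind != MOCK_STRING) return false;
      len = strlen (val->string);
      if (len + 1 > arg->dest_size) return false; /* destination too small */
      memcpy (arg->dest, val->string, len + 1);
      return true;
    }
    default:
      return false; /* unknown type */
  }
}

/* Returns true if all arguments match; otherwise stores the failing index. */
static bool sketch_validate_and_assign_args (const mock_value_t *args_p, size_t args_cnt,
                                             const sketch_arg_t *mapping, size_t mapping_cnt,
                                             size_t *failed_index)
{
  assert (mapping != NULL && failed_index != NULL);
  for (size_t i = 0; i < mapping_cnt; i++)
  {
    int base_type = mapping[i].arg_type & ~SKETCH_ARG_OPTIONAL;
    bool optional = (mapping[i].arg_type & SKETCH_ARG_OPTIONAL) != 0;
    bool missing = (i >= args_cnt) || (args_p[i].kind == MOCK_UNDEFINED);

    if (missing)
    {
      if (optional || base_type == SKETCH_ARG_IGNORE) continue; /* destination keeps its default */
      *failed_index = i;
      return false;
    }
    if (!sketch_assign_one (&mapping[i], &args_p[i]))
    {
      *failed_index = i;
      return false;
    }
  }
  return true;
}
```

One property of this sketch worth noting: destinations are written as soon as their own check passes, so after a failure some of them may already hold new values. Whether the real utility should validate everything first and assign afterwards is a design choice this example does not settle.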

Open Questions

  • Extensibility: our current thinking is to add a JERRY_ARG_CUSTOM enum value and to provide a validation+transformation callback as part of the jerry_arg_t element. Note that we would need to add another field for the callback to jerry_arg_t (thus increasing the size of every element in the array...). Alternatively, we could have a "registration" API to add custom validations/transforms at runtime. Any other ideas are welcome.
  • Errors: would it be OK if the utility defined the Error types and messages? Or do users of this utility need control over this (i.e. through a port function)?
  • Smaller binary code size: any thoughts on the current design and its implications for code size? We tried to limit the number of function calls to one (although one with 6 arguments). In our experience, function calls take up quite a lot of code space.
  • Alternative solution direction: using a "format string" to specify the transformations in a single C string (kind of like Python's struct.unpack). I think this would be harder to use (you have to learn the "format string" format), but it may be better from a code size and extensibility perspective compared to the proposal.
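
To make the "format string" alternative from the last bullet more concrete, here is a rough sketch. The format characters are invented for this example ('b' for a boolean written to a `bool *`, 's' for a string copied to a `char *` followed by a `size_t` buffer size, and a '?' prefix for optional), and the mock value type again stands in for jerry_value_t; none of the `fmt_` names exist in JerryScript.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Mock of a JS value, standing in for jerry_value_t (hypothetical). */
typedef enum { FMT_UNDEFINED, FMT_BOOLEAN, FMT_STRING } fmt_kind_t;

typedef struct
{
  fmt_kind_t kind;
  bool boolean;       /* valid when kind == FMT_BOOLEAN */
  const char *string; /* valid when kind == FMT_STRING, NUL-terminated */
} fmt_value_t;

/*
 * Hypothetical format characters:
 *   'b'  boolean, consumes one vararg: bool *
 *   's'  string, consumes two varargs: char *buffer, size_t buffer_size
 *   '?'  prefix marking the next argument as optional
 */
static bool fmt_unpack_args (const fmt_value_t *args_p, size_t args_cnt, const char *format, ...)
{
  va_list ap;
  bool ok = true;
  size_t i = 0;

  assert (format != NULL);
  va_start (ap, format);
  /* `ok` is tested first so that `f` is never read after a failure. */
  for (const char *f = format; ok && *f != '\0'; f++, i++)
  {
    bool optional = (*f == '?');
    if (optional) f++;
    bool missing = (i >= args_cnt) || (args_p[i].kind == FMT_UNDEFINED);

    switch (*f)
    {
      case 'b':
      {
        /* The destination is consumed even when the argument is missing. */
        bool *dest = va_arg (ap, bool *);
        if (missing) ok = optional;
        else if (args_p[i].kind != FMT_BOOLEAN) ok = false;
        else *dest = args_p[i].boolean;
        break;
      }
      case 's':
      {
        char *dest = va_arg (ap, char *);
        size_t dest_size = va_arg (ap, size_t);
        if (missing) ok = optional;
        else if (args_p[i].kind != FMT_STRING) ok = false;
        else
        {
          size_t len = strlen (args_p[i].string);
          if (len + 1 > dest_size) ok = false; /* buffer too small */
          else memcpy (dest, args_p[i].string, len + 1);
        }
        break;
      }
      default:
        ok = false; /* unknown character or format ending after '?' */
        break;
    }
  }
  va_end (ap);
  return ok;
}
```

A call for the example method would look like `fmt_unpack_args (args_p, args_cnt, "s?b", arg0, sizeof (arg0), &arg1)`. The call site is compact, but the compiler cannot check that the varargs match the format string, and the buffer size has to be passed as a `size_t` (as `sizeof` does), which illustrates the usability concern raised above.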
