Skip to content

Generic functions

David Tarditi edited this page Dec 13, 2018 · 5 revisions

Overview

Motivation

C has functions that operate over any pointer type, such as memcpy. These functions have parameters with the void pointer type. Static type checking is lost when the void pointer type is used because any pointer type can be converted implicitly to or from the void pointer type. A conversion from the void pointer type could be incorrect, which could lead to memory corruption or an invalid memory read.

For example, a pointer to an integer on the stack could be converted to a void pointer, which could then be converted to a pointer to a struct with 4 pointer members. A subsequent use of the struct pointer could incorrectly write or read memory. Functions that take or return void pointers can have the same problems, with incorrect conversions at calls to them or in their inner workings.

Type variables

Generic functions can also operate over any pointer type. However, instead of using the void pointer type, they use type variables to represent unknown types.

A type variable can be used anywhere a type may be used when building types or in declarations. Type variables are incomplete types, like the void pointer type. They cannot be used as the type of a member or a variable. However, they can be used in pointer types.

Type variables can be used to express constraints: if two arguments must point to values of the same type, the same type variable can be used.

Parameterizing functions by type variables

Generic functions have type variables as additional parameters. When a generic function is called, types must be passed as additional arguments to the function. The type arguments determine the type at which the function is being used. Type parameters and arguments are purely compile-time constructs. They are not represented at run time.

Syntax and examples

We start with some simple examples of generic functions. We might want a function that allocates an array of data. In C, this function would usually have a void * type:

void *alloc(unsigned int s);

In Checked C, we can declare a generic function instead, using the keyword _For_any to introduce one or more type variables:

_For_any(T) _Array_ptr<T> alloc(unsigned int s) : byte_count(s);

In this example, T is a type variable. For_any(T) means that alloc takes an additional type argument. The scope of _For_any extends to the end of the function declaration or definition. The : byte_count(s) declares the bounds for the return value of alloc. In this case alloc returns a pointer to s bytes of storage, or NULL.

To use alloc, we need to pass an additional type argument:

_Array_ptr<int> arr : count(5) = alloc<int>(sizeof(int) * 5);

The type arguments are passed before regular arguments, by following the generic function with a list of one or more types enclosed in < and >. This is a syntactic extension to existing C: the < operator cannot be followed currently by a type.

Generic bounds-safe interfaces

C already has standard library functions for allocating and copying data. Generic bounds-safe interfaces allow existing functions to be redeclared so that are optionally treated as being generic functions when used with checked pointer types. This is done by using _Itype_for_any in place of _For_any and specifying bounds-safe interfaces for parameters or the return value.

For example, malloc is given the generic bounds-safe interface:

_Itype_for_any<T> void * _Array_ptr<T>(size_t s) : itype(_Array_ptr<T>) byte_count(s);

The itype(_Array_ptr<T>) provides an alternate return type to use in place of the return void *. (Note that this example is slightly simplified: we have omitted the restrict keyword that appears in the official C11 declaration of malloc.)

In a checked scope, _Itype_for_any turns into _For_any and the bounds-safe interface types for parameters and return types are used in place of the original types:

_For_any<T> _Array_ptr<T> _Array_ptr<T>(size_t s) : byte_count(s);

In a checked scope, programmers must supply type arguments for calls to malloc:

_Array_ptr<int> arr : count(5) = malloc<int>(sizeof(int) * 5);

In an unchecked scope, programmers can omit type arguments and use the function as before:

int *arr = malloc(sizeof(int) * 5);

Programmers can also supply type arguments when using checked pointers.

_Array_ptr<int> arr : count(5) = malloc<int>(sizeof(int) * 5);

Future work

We plan to add the following over time:

  • Constraints on existing variable that declare they hold the size of a type at runtime.
  • Generic structure types, which would allow data structures such as generic lists and stacks to be created.
  • where clauses, which allow pre-conditions and post-conditions about functions and values of variables. This will let us express constraints that size of the data being copied must be a multiple of a size of a type, so that entire values of the type are copied.