Bounds safe interfaces
The Checked C extension is designed so that programs can be changed to use checked pointers and arrays in an incremental fashion. A programmer can change just a few lines at a time and still have a working program. This matches the way that software is developed and maintained. However, incremental conversion is problematic for libraries. Changing argument types and return types to be checked types could break existing unconverted code that uses a library. In addition, some libraries may be external and impossible to modify.
We solve this problem by introducing bounds-safe interfaces. A bounds-safe interface describes the expected behavior and requirements with respect to checked types and bounds of existing code. A programmer provides an alternate view of existing functions, members, and global variables that gives checked types and bounds to use in place of existing unchecked types. Code with a bounds-safe interface is simultaneously checked and unchecked.
When an entity with a bounds-safe interface is used in a checked scope, the type and bounds given by the bounds-safe interface are used. In an unchecked scope, its type and bounds depend on what is expected in the context where it is used. Informally, if an unchecked type is expected, the type is the original (unchecked) type. If a checked type is expected, the type in the bounds-safe interface is used. The precise rules for which type is used are described later in this section.
To add a bounds-safe interface to an existing entity, a programmer declares (or redeclares) the entity with additional information. The most common annotation is the itype annotation, which gives an alternate checked type for a declaration. For example, strcmp is redeclared as:
int strcmp(const char *src1 : itype(_Nt_array_ptr<const char>),
const char *src2 : itype(_Nt_array_ptr<const char>));
This allows strcmp to be called with checked pointers to null-terminated arrays of characters. However, strcmp cannot be called with checked pointers to arrays of characters. These might not be null-terminated, and passing such an argument could cause a buffer overrun:
void f(_Nt_array_ptr<const char> arg1, _Nt_array_ptr<const char> arg2) {
    if (strcmp(arg1, arg2)) // OK.
        ...
}
void g(_Array_ptr<const char> arg1, _Array_ptr<const char> arg2) {
    if (strcmp(arg1, arg2)) // Error.
        ...
}
Here are examples of other standard C library functions annotated with itype declarations:
size_t strlen(const char *s : itype(_Nt_array_ptr<const char>));
int atoi(const char *s : itype(_Nt_array_ptr<const char>));
double modf(double value, double *iptr : itype(_Ptr<double>));
int fclose(FILE *stream : itype(_Ptr<FILE>));
FILE *tmpfile(void) : itype(_Ptr<FILE>);
The itype annotation can also be used with structure members. Suppose a structure has a buffer of integers and a pointer to an array of characters:
struct S {
int buf[50];
char *name;
};
It can be modified to have the following bounds-safe interface:
struct S {
int buf[50] : itype(int _Checked[50]);
char *name : itype(_Nt_array_ptr<char>);
};
On Linux, stdin can be given this bounds-safe interface:
FILE *stdin : itype(_Ptr<FILE>);
If an entity with an unchecked pointer type is actually a pointer to an array, it can be given a bounds declaration. Consider strncpy, for example. Its original type is:
char *strncpy(char * restrict dest, const char * restrict src, size_t n);
It can be given the bounds-safe interface:
char *strncpy(char * restrict dest : count(n),
const char * restrict src : count(n),
size_t n) : bounds(dest, (_Array_ptr<char>)dest + n);
When strncpy is called with checked pointers, the source and destination pointers must have at least n characters available.
The bounds declaration bounds(dest, (_Array_ptr<char>)dest + n) declares the return bounds for strncpy. The return bounds follow the parameter list so that they can refer to parameters. strncpy returns the destination pointer, so the bounds for its return value are the bounds for the destination pointer.
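The reasoning behind these bounds can be observed with the standard library alone. The following is a minimal sketch in plain C (no Checked C annotations); the helper names returns_dest and count_zero_bytes are hypothetical and exist only for illustration. It relies on two facts from the C standard: strncpy returns its first argument, and it writes exactly n characters to the destination, padding with null characters when the source is shorter than n.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dest with strncpy and reports
   whether the returned pointer is dest itself. Because the result is
   dest, the bounds (dest, dest + n) describe the returned pointer. */
int returns_dest(char *dest, const char *src, size_t n) {
    char *result = strncpy(dest, src, n);
    return result == dest;
}

/* Hypothetical helper: counts the null characters among the first n
   bytes of buf. After strncpy with a short source, every byte past the
   copied characters is a null written by strncpy. */
size_t count_zero_bytes(const char *buf, size_t n) {
    size_t zeros = 0;
    for (size_t i = 0; i < n; i++) {
        if (buf[i] == '\0') {
            zeros++;
        }
    }
    return zeros;
}
```

Calling returns_dest on an 8-byte buffer with the source "ab" and n equal to 8 touches all 8 destination bytes, which is why the destination needs count(n) even when the source string is short.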
For brevity, bounds declarations by themselves imply interface types. If the original type is an unchecked pointer to T, the interface type is _Array_ptr<T>. If it is an unchecked array T[len], the interface type is T _Checked[len]. The implied interface type for dest in strncpy is _Array_ptr<char>.
A bounds declaration can be combined with an interface type in the case where the implied type is not the right checked type. Combined declarations are needed for nested pointers: int ** might need to have the interface type _Array_ptr<_Ptr<int>>. They are also needed for _Nt_array_ptr interface types that have bounds other than count(0).
The function fread can be given the bounds-safe interface:
size_t fread(void * restrict p : byte_count(size * nmemb),
size_t size, size_t nmemb,
FILE * restrict stream : itype(restrict _Ptr<FILE>));
The function memcpy can be given the bounds-safe interface:
void *memcpy(void * restrict dest : byte_count(n),
const void * restrict src : byte_count(n),
size_t n) : bounds(dest, (_Array_ptr<char>) dest + n);
Note that with this bounds-safe interface, dest and src both have the interface type _Array_ptr<void>. This means that calls to memcpy may not preserve type safety. We can provide a bounds-safe interface that does better than this, which we describe later.
A programmer can also provide bounds-safe interfaces for function types. This is done the same way as for function declarations, by providing bounds-safe interfaces for parameters and the return value. For example, qsort takes a comparison function and uses it to sort an array of values:
void qsort(void *base, size_t nmemb, size_t size,
int ((*compar)(const void *, const void *)));
compar is a pointer to a function that takes two const void * pointers and returns an integer. The pointers point to elements of the array. The function type should be read from the inside (closest to the identifier) outward: (*compar) means that compar is a pointer to ..., where ... is the function type int (const void *, const void *).
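As a concrete illustration of that function type, here is a minimal sketch in plain C (no Checked C annotations) of a comparison function for an array of int. The names compare_ints and sort_ints are hypothetical; only qsort itself comes from the standard library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Comparison callback with the type int (const void *, const void *).
   qsort passes pointers to two elements of the array, so each argument
   is converted back to a pointer to the element type before use. */
int compare_ints(const void *lhs, const void *rhs) {
    const int *a = lhs;
    const int *b = rhs;
    /* Yields -1, 0, or 1 without the overflow risk of *a - *b. */
    return (*a > *b) - (*a < *b);
}

/* Hypothetical wrapper: sorts count ints in ascending order. The
   function name compare_ints decays to a pointer to the function,
   which matches the declared type of the compar parameter. */
void sort_ints(int *values, size_t count) {
    qsort(values, count, sizeof values[0], compare_ints);
}
```

Nothing in the unchecked signature ties the void pointers to the element type or to the array, which is the gap that a bounds-safe interface for compar is meant to narrow.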
qsort can be given the following bounds-safe interface:
void qsort(void *base : byte_count(nmemb * size),
size_t nmemb, size_t size,
int ((*compar)(const void *, const void *)) :
itype(_Ptr<int (_Ptr<const void>, _Ptr<const void>)>));
In this case, compar has a bounds-safe interface type that is a _Ptr to a function type that takes two checked void pointer arguments and returns an integer. Note that we can do even better and replace the _Ptr<const void> types with type-safe constructs.
In the Checked C extension, implicit conversions between different kinds of pointer types are allowed at assignments, function calls, and return statements.
When an expression with unchecked pointer type is converted implicitly to a checked pointer type, the expression must meet any target bounds requirements for the checked pointer. Bounds-safe interfaces are used during inference of bounds for the expression. If a variable with unchecked pointer type occurs in the expression and the variable has a bounds-safe interface that declares bounds, those bounds are trusted and assumed to be true.
This code is allowed:
void f(char *buf : count(n), size_t n) {
    _Array_ptr<char> c : count(n) = buf;
}
At the assignment to c, buf is cast implicitly to an _Array_ptr. It has the bounds count(n), which is the same as the bounds declared for c.
This code is not allowed:
void g(char *buf, size_t n) {
    _Array_ptr<char> d : count(n) = buf; // error.
}
At the assignment to d, buf is cast implicitly to an _Array_ptr. However, it has no bounds declared, so it fails to meet the bounds requirement for d of count(n). The code is rejected by the compiler.
A checked pointer can be converted implicitly to an unchecked pointer only when a bounds-safe interface is present. Here are the rules:
- If the left-hand side of an assignment is a variable with a bounds-safe interface or a member reference with a bounds-safe interface, and the right-hand side expression has a checked pointer type, the right-hand side expression is converted implicitly to the unchecked pointer type.
- Similarly, at calls, the function type of the function being called is determined. If a parameter has a bounds-safe interface, and the corresponding argument has a checked type, the argument is converted implicitly to the unchecked type of the parameter.
- At return statements, if the return value for the enclosing function has a bounds-safe interface, and the expression in the return statement has a checked type, the expression is converted implicitly to the unchecked return type.
In all these cases, if bounds are declared by the bounds-safe interface, the converted expression must meet them. This enforces that checked pointers meet the bounds requirements of functions, variables, or members.
For example, this prevents memcpy from being called with _Array_ptr values that do not point to enough data. The following code is correct:
int a _Checked[3] = {0, 1, 2};
int b _Checked[3] = {3, 4, 5};
memcpy(a, b, sizeof(int) * 3); // correct.
while the following code is rejected:
int c _Checked[2] = { 6, 7 };
memcpy(a, c, sizeof(int) * 3); // error: c holds only 2 ints.
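The arithmetic behind these two outcomes can be sketched in plain C. This is only a model of the byte_count(n) requirement, not a description of how the Checked C compiler implements it: the compiler reasons about bounds statically, while the hypothetical helpers meets_byte_count and memcpy_call_accepted below compare sizes at run time purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of byte_count(requested): an argument that points
   to `available` bytes satisfies the requirement only when the
   requested byte count does not exceed what is available. */
int meets_byte_count(size_t available, size_t requested) {
    return requested <= available;
}

/* Hypothetical model of a memcpy call under the interface shown
   earlier: both dest and src are declared with byte_count(n), so both
   must supply at least n bytes. */
int memcpy_call_accepted(size_t dest_bytes, size_t src_bytes, size_t n) {
    return meets_byte_count(dest_bytes, n) && meets_byte_count(src_bytes, n);
}
```

With two arrays of three ints, a request for sizeof(int) * 3 bytes is satisfied on both sides. A source array of two ints supplies only sizeof(int) * 2 bytes, so the same request fails the source requirement and the call is rejected.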