Avoiding generic programming pitfalls
Excessive generic function specialization can lead to large executables, and the added instruction cache pressure can cancel out any performance benefit. By default, a Clay function with no type annotations is specialized for every combination of argument types it is called with:
muladd(a, b, c) = a * b + c; // A separate muladd function will be compiled for every set of input types
For small functions that will likely be inlined, the cost is minimal, but for a large function every extra instance duplicates the whole body, which is problematic.
An obvious way to curtail the number of instances is to explicitly limit the set of allowed types:
// Only allow Int inputs
muladd(a: Int, b: Int, c: Int) = a * b + c;
// Only allow Int or Float inputs
[A, B, C | allValues?(T => inValues?(T, Int, Float), A, B, C)]
muladd(a: A, b: B, c: C) = a * b + c;
Limiting to a small set of types may be undesirable, especially when dealing with various different-sized integer or floating-point types. If a function operates on a related group of types, it can convert those types to a canonical "funnel" type:
// Generic wrapper accepts any set of Integer? types and funnels them down to Int64
[A, B, C | allValues?(Integer?, A, B, C)]
muladd(a: A, b: B, c: C) = muladd(Int64(a), Int64(b), Int64(c));
// Principal overload implements the actual operation
overload muladd(a: Int64, b: Int64, c: Int64) = a * b + c;
The generic stub then stays small (only the conversions and the call to the principal overload), and the actual body of the function needs to be generated only once.
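As a usage sketch, assuming the two muladd definitions above are in scope, the hypothetical function below (the name example is not part of the original) calls muladd with mixed integer widths. Both calls should end up in the single Int64 body; only the thin conversion stub is instantiated per combination of argument types.

```clay
// Hypothetical caller, for illustration only.
example() {
    // Int8, Int16, Int32 arguments: handled by the generic stub, which
    // converts each argument to Int64 and forwards to the principal overload.
    var x = muladd(Int8(2), Int16(3), Int32(4));

    // All-Int64 arguments: expected to select the Int64 overload itself,
    // since it is the more recently defined overload that matches.
    var y = muladd(Int64(2), Int64(3), Int64(4));

    return x + y;
}
```

Note that the result is always Int64 with this arrangement, so callers that need a narrower type have to convert it back themselves.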
Often, a function contains a large body of generic code with only a small amount of behavior that needs to be specialized on the types of its arguments. For example, the following function doesn't need to care about the types of its lastname, firstname, and address arguments beyond whether bindStatement accepts them:
saveClient(db: DB, lastname, firstname, address) {
    var stmt = DBStatement(db, "insert into clients (lastname, firstname, address) values (?, ?, ?)");
    bindStatement(stmt, 0, lastname);
    bindStatement(stmt, 1, firstname);
    bindStatement(stmt, 2, address);
    execStatement(stmt);
}
Nonetheless, saveClient will be instantiated for every combination of argument types. To avoid this, we can funnel the desired input types into a variant type, and use the * dynamic dispatch operator to dispatch to specialized bindStatement calls:
variant ClientField = String | StringConstant | UTF8String;
[L, F, A | allValues?(T => VariantMember?(ClientField, T), L, F, A)]
saveClient(db: DB, lastname: L, firstname: F, address: A) {
    // Pass db through so the call matches the ClientField overload below
    saveClient(db, ClientField(lastname), ClientField(firstname), ClientField(address));
}
overload saveClient(db: DB, lastname: ClientField, firstname: ClientField, address: ClientField) {
    var stmt = DBStatement(db, "insert into clients (lastname, firstname, address) values (?, ?, ?)");
    bindStatement(stmt, 0, *lastname);
    bindStatement(stmt, 1, *firstname);
    bindStatement(stmt, 2, *address);
    execStatement(stmt);
}
With this code, only the conversion stub and bindStatement need to be specialized; the main body of saveClient requires only one instance for all of the member types of ClientField.
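A short usage sketch follows, assuming the saveClient definitions above are in scope. The function name addSampleClients is hypothetical. The sketch also assumes that string literals have type StringConstant and that String can be constructed from a literal, which is consistent with the ClientField definition but is not stated in the text above.

```clay
// Hypothetical caller, for illustration only.
addSampleClients(db: DB) {
    // Three StringConstant arguments (assumed type of string literals).
    saveClient(db, "Smith", "Alice", "1 Main St");

    // Mixed String and StringConstant arguments: a different instance of the
    // small conversion stub, but the same ClientField body does the work.
    saveClient(db, String("Jones"), "Bob", String("2 Side St"));
}
```

Each call wraps its arguments in ClientField values, so the statement-building code is compiled once regardless of how many argument type combinations callers use.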