Skip to content

Bit Vector Cookbook

David Cok edited this page Jul 3, 2022 · 5 revisions

Dafny supports both integers and bit vectors. While it can successfully prove many complex properties in either domain in isolation, properties that involve the interaction between integers and bit vectors are typically more difficult to prove. In the following, I'll use the term "integer type" to refer to int, nat, and any subtype or newtype based on either. And I'll use the term "bit vector type" to refer to bv<n> for any natural number <n>.

To be concrete, conversion between integers and bit vectors shows up in Dafny in the form x as bv<n>, when x is of integer type, or in the form x as T where x is of bit vector type and T is of integer type. It also occasionally shows up implicitly, such as with shifting operations: v << i, where v is a bit vector and i is an integer, implictly casts i to the type of v.

This page describes a set of techniques to help make proofs involving both bit vectors an integers more likely to be successful.

General principle: clear separation between domains

The key criteria to keep in mind when working with both integers and bit vectors is that the mathematical representation of conversion between the two data types is computationally expensive to reason about. So, to avoid performance issues (or, effectively, failure to complete at all), it can be helpful to reduce the number of instances of conversion as much as possible, and to reduce the sizes of bit vectors involved as much as possible.

The following sections detail specific approaches to help achieve this goal.

Using the smallest bit vector sizes possible

The efficiency with which Dafny can verify bit vector formulas is related (though in a complex way) to the number of possible values each bit vector may have, which is exponential in the number of bits in the vector. Therefore, using the smallest bit vector size that can represent all of the values a variable may take on can make verification more efficient. For example, Dafny is currently unable to automatically verify the following lemma in a reasonable amount of time:

lemma ShiftVsCast(val: bv16, idx: nat)
  requires 0 <= idx <= 8
  requires 0 <= val < (1 << 8)
  ensures val as bv32 << idx == (val << idx) as bv32
{}

However, Dafny can quickly verify the following lemma the uses smaller bit vectors:

lemma ShiftVsCast(val: bv8, idx: nat)
  requires 0 <= idx <= 8
  requires 0 <= val < (1 << 8)
  ensures val as bv16 << idx == (val << idx) as bv16
{}

This technique is not always feasible, but can be very effective when it is. As a last resort, if you need to turn a lemma about bit vectors into an axiom, it may be possible to prove a similar lemma using smaller sizes that, while perhaps not directly usable in the larger proof, can help improve confidence in the correctness of the axiom that you do directly use.

Casting shift amounts to small bit vectors

This technique is related to the previous one. When shifting bit vectors, as in the expression v << i, it is typically convenient to use type nat for the shift amount (i in this case), and Dafny allows this. If v has type bv64, Dafny's treatment of the nat value i is to cast it to the same bv64 type. However, because shift amounts greater than 64 aren't very useful for a 64-bit value, it would be possible to replace v << i with v << (i as bv7), an expression Dafny will have an easier time reasoning about.

Putting this together into a complete example, Dafny has trouble verifying the following lemma:

const Half: bv128 := 0xffff_ffff_ffff_ffff

lemma MaskedCommLShiftUpCastNat(v: bv64, idx: nat)
  requires idx <= 64
  ensures (v as bv128 << idx) & Half == (v << idx) as bv128
{}

But Dafny can easily verify the version where the shift amount is a cast to a small bit vector:

lemma MaskedCommLShiftUpCastNatToBV(v: bv64, idx: nat)
  requires idx <= 64
  ensures (v as bv128 << (idx as bv7)) & Half == (v << (idx as bv7)) as bv128
{}

Similarly, it would be easy if the idx parameter began as a small bit vector:

lemma MaskedCommLShiftUpCastBV(v: bv64, idx: bv7)
  requires idx <= 64
  ensures (v as bv128 << idx) & Half == (v << idx) as bv128
{}

Grouping operations judiciously

When casting integers to small bit vectors as described in the previous section, it can sometimes be possible to group operations in various different ways, by putting parentheses in different places. In some cases, proof difficulty may depend strongly on what grouping you choose.

Consider the following example, which assumes one axiom (because the bit width of the argument is too large for Dafny to prove it) and then uses that axiom to prove a more specific instance.

lemma {:axiom} LeftShiftBV64(v: bv64, idx: nat)
  requires idx <= 64
  ensures v << idx == v << idx as bv8

lemma LeftShiftMinusBv64Hard(v: bv64, i: nat)
  requires i <= 64
  ensures v << 64 - i == v << 64 - i as bv8
{
  LeftShiftBV64(v, 64 - i);
}

Dafny can't prove that LeftShiftMinusBv64Hard is correct because its ensures clause is interpreted as if it were written v << (64 - i) == v << (64 - (i as bv8)). This is equivalent to the following more complex expression:

v << ((64 - i) as bv64) == v << ((64 - i as bv8) as bv64)

This, in turn, amounts to proving that the right hand sides of the shifts are equivalent:

(64 - i) as bv64 == (64 - i as bv8) as bv64

This involves casting i to bv8 and to bv64 and reasoning about the relationship between them. Due to the computationally intensive representation of the integer value of a bit vector, this is a difficult problem. If bv64 were, instead, bv16, then it would be tractable. But that restriction is not always feasible.

Fortunately, such a restriction is not necessary in this case. If, instead, we change the grouping as shown below, Dafny can easily prove it.

lemma LeftShiftMinusBv64Easy(v: bv64, i: nat)
  requires i <= 64
  ensures v << 64 - i == v << (64 - i) as bv8
{
  LeftShiftBV64(v, 64 - i);
}

In this case, the parentheses group the expression so that it exactly matches the syntax of the ensures clause of LeftShiftBV64, so the result follows immediately.

Interacting with external code

When describing the interface between Dafny code and external code, machine words in the target language can be represented in two ways: as integers or as bit vectors. For example, consider the following C# method:

static int Inc(int x) {
  return x + 1;
}

This can be declared as an external function (or method) in Dafny that works over integers.

newtype {:nativeType "int"} int32 = x | -0x8000_0000 <= x < 0x8000_0000
function {:extern "Inc"} IncInt(x: int32): int32

Or it can be declared to work over bit vectors.

function {:extern "Inc"} IncBV(x: bv32): bv32

Since a C# int is really a 32-bit machine word, either declaration is semantically reasonable. Which to choose depends on how convenient it will be to work with the arguments and return values. If the surrounding code works primarily with integers, the former will probably be preferable. If it works primarily with bit vectors, the latter will be preferable. Both declarations can even coincide in the same Dafny program if the Dafny names are different (as they are in this case), in which case a single caller could call both IncInt with an integer argument and IncBV with a bit vector argument.

One factor to keep in mind when doing this is that some operations on bit vectors, such as comparison, always treat their arguments as unsigned. For a C# int this is not usually the desired behavior, so it may be valuable to implement a signed comparison function over bit vectors and use that instead of the built-in < operator, for instance.