# Integers

The integer types in C are similar to the integer types in Java but there are some important differences:

1. before C23, the C standard did not specify how signed integer values are represented (C23 requires two's complement)
    * a consequence of this is that integer overflow may behave differently in C compared to Java
2. the C standard does not specify precisely how many bits are used for each type
3. there are more integer types in C
4. C is more permissive than Java when converting between different types


## The integer types

The following table lists the standard integer types and the minimum width of each.

| Recommended name | Width in bits |
| :--- | :--- |
| `bool` | at least 8 |
| `signed char` | at least 8 |
| `unsigned char` | at least 8 |
| `char` | at least 8 |
| `short int` | at least 16 |
| `unsigned short int` | at least 16 |
| `int` | at least 16 |
| `unsigned int` | at least 16 |
| `long int` | at least 32 |
| `unsigned long int` | at least 32 |
| `long long int` | at least 64 |
| `unsigned long long int` | at least 64 |

As in Java, the integer types occupy a finite and fixed amount of memory, which implies that there is
a minimum and maximum value for each type. The minimum and maximum values of each type for
your compiler and target architecture are defined in the header `<limits.h>`, along with `CHAR_BIT`,
the number of bits in a `char`. The size in bytes of a type can be obtained with the `sizeof` operator.

The unsigned types cannot represent negative values and the minimum value for each unsigned type is zero.

The standard mandates an ordering on the sizes of the types:

`1 == sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long) <= sizeof(long long)`

Furthermore, the standard mandates that `1 == sizeof(char) == sizeof(signed char) == sizeof(unsigned char)`.

Although the standard does not mandate the exact size of each type, there are four widely used *data models*
that specify the widths of the various types (see https://en.cppreference.com/w/c/language/arithmetic_types#Data_models).

If you need types of specific sizes, you may use the types defined in the header `<stdint.h>` (since C99).

### `bool`

C99 introduced the Boolean type `_Bool` that stores only the value `0` or `1` (false and true).
Assigning any non-zero value to a `_Bool` causes the value to become `1`. 
If you include the header `<stdbool.h>` then you can use the type name `bool` and the values `false` and `true`. 

```c
#include <stdbool.h>
#include <stdio.h>

int main(void) {
    bool flag = true;          // or any non-zero value
    if (flag) {
        puts("true");
    }
    else {
        puts("false");
    }
    return 0;
}
```

```
true
```


### `char`

The types `char`, `unsigned char`, and `signed char` are three distinct types in the C standard. The standard
calls these *Character types* but they do in fact represent integer values.

The standard says that `char` has the same range, representation, and behavior as one of `unsigned char`
or `signed char`; it is up to the compiler implementer to decide which of the two it is.

## Unsigned integers



The unsigned integer types have ranges that start at `0` and go up to some maximum positive value. They
have the simplest binary representation: each bit of the binary number is multiplied by the corresponding
power of $2$ and the products are summed. For example, consider the 8-bit unsigned integer `01101011`:

$$
\begin{array}{cccccccc|l}
2^7 & 2^6 & 2^5 & 2^4 & 2^3 & 2^2 & 2^1 & 2^0 & \text{multiplier} \\
0 & 1 & 1 & 0 & 1 & 0 & 1 & 1 & \text{bit} \\
0 \times 2^7 & 
1 \times 2^6 & 
1 \times 2^5 & 
0 \times 2^4 & 
1 \times 2^3 & 
0 \times 2^2 & 
1 \times 2^1 & 
1 \times 2^0 & \text{bit}\ \times\text{multiplier} \\
0 &
64 &
32 &
0 &
8 &
0 & 
2 & 
1 & \text{sum} = 107
\end{array}
$$

The bit having the multiplier with the highest exponent is called the *most significant* bit because
it contributes the greatest amount to the overall value when its value is $1$.

The bit having the multiplier exponent equal to $0$ is called the *least significant* bit because
it contributes the least amount to the overall value.

Binary numbers are commonly written from left to right starting with the most significant bit.

The minimum value of an 8-bit unsigned integer is `00000000` which is equal to $0$.

The maximum value of an 8-bit unsigned integer is `11111111` which is equal to $255$.

You can prove via induction that the maximum value of an $n$ bit unsigned binary number is equal to $2^n - 1$.

## Signed integers and two's complement

Signed integers have ranges that have a most negative value and a most positive value.
If zero is counted as neither negative nor positive, then the number of negative values is not equal to the number of positive values (can you see why?).

Before C23, the C standard did not specify how such numbers should be represented; C23
requires *two's-complement representation*.
All significant current computer architectures use two's-complement representation of signed integers.

In two's complement representation, the most significant bit has a multiplier equal to $-(2^{n-1})$ for
an $n$-bit binary number. 
If the most significant bit is equal to $0$ then the two's complement representation has the same value
as an unsigned integer.
For example, consider the 8-bit signed integer `01101011`:

$$
\begin{array}{cccccccc|l}
-(2^7) & 2^6 & 2^5 & 2^4 & 2^3 & 2^2 & 2^1 & 2^0 & \text{multiplier} \\
0 & 1 & 1 & 0 & 1 & 0 & 1 & 1 & \text{bit} \\
0 \times -(2^7) & 
1 \times 2^6 & 
1 \times 2^5 & 
0 \times 2^4 & 
1 \times 2^3 & 
0 \times 2^2 & 
1 \times 2^1 & 
1 \times 2^0 & \text{bit}\ \times\text{multiplier} \\
0 &
64 &
32 &
0 &
8 &
0 & 
2 & 
1 & \text{sum} = 107
\end{array}
$$

If the most significant bit is equal to $1$ then the two's complement representation leads to a negative
value. For example, consider the 8-bit signed integer `11101011`:

$$
\begin{array}{cccccccc|l}
-(2^7) & 2^6 & 2^5 & 2^4 & 2^3 & 2^2 & 2^1 & 2^0 & \text{multiplier} \\
1 & 1 & 1 & 0 & 1 & 0 & 1 & 1 & \text{bit} \\
1 \times -(2^7) & 
1 \times 2^6 & 
1 \times 2^5 & 
0 \times 2^4 & 
1 \times 2^3 & 
0 \times 2^2 & 
1 \times 2^1 & 
1 \times 2^0 & \text{bit}\ \times\text{multiplier} \\
-128 &
64 &
32 &
0 &
8 &
0 & 
2 & 
1 & \text{sum} = -21
\end{array}
$$

The most positive two's complement binary number has all bits equal to $1$ except for the most significant bit:

$$
\begin{array}{cccccccc|l}
-(2^7) & 2^6 & 2^5 & 2^4 & 2^3 & 2^2 & 2^1 & 2^0 & \text{multiplier} \\
0 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & \text{bit} \\
0 \times -(2^7) & 
1 \times 2^6 & 
1 \times 2^5 & 
1 \times 2^4 & 
1 \times 2^3 & 
1 \times 2^2 & 
1 \times 2^1 & 
1 \times 2^0 & \text{bit}\ \times\text{multiplier} \\
0 &
64 &
32 &
16 &
8 &
4 & 
2 & 
1 & \text{sum} = 127
\end{array}
$$

The most negative two's complement binary number has all bits equal to $0$ except for the most significant bit:

$$
\begin{array}{cccccccc|l}
-(2^7) & 2^6 & 2^5 & 2^4 & 2^3 & 2^2 & 2^1 & 2^0 & \text{multiplier} \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \text{bit} \\
1 \times -(2^7) & 
0 \times 2^6 & 
0 \times 2^5 & 
0 \times 2^4 & 
0 \times 2^3 & 
0 \times 2^2 & 
0 \times 2^1 & 
0 \times 2^0 & \text{bit}\ \times\text{multiplier} \\
-128 &
0 &
0 &
0 &
0 &
0 & 
0 & 
0 & \text{sum} = -128
\end{array}
$$

The above examples illustrate that the range of an $n$-bit two's complement number is $-(2^{n-1})$ to $2^{n-1} - 1$.
Observe that the range is asymmetric about $0$: the magnitude of the most negative value is greater than that
of the most positive value. The programming implication of this is that signed integers are susceptible to
difficult-to-detect errors when `x` is the most negative value of its type:

* `-x` may not exist
* `abs(x)` may not exist
* `-1 * x` may not exist
* `x / -1` may not exist

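In fact, the four cases above result in *undefined behavior* in C when `x` is the most negative value of type
`int` (or of a wider signed type). Undefined behavior means that anything
might occur and that whatever occurs is not necessarily repeatable or consistent between compilers or
architectures.

The following program prints the value of the four cases above using `signed char`. Because of integer
promotion, each expression is evaluated as an `int` and yields `128`; converting that out-of-range value back to
`signed char` is implementation-defined rather than undefined, and on typical two's-complement implementations
it wraps around to `-128`.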

```c
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    signed char c = SCHAR_MIN;    // requires limits.h
    printf("%d\n", c);
    
    signed char x = -c;           // -c is 128 as an int; does not fit in signed char
    printf("%d\n", x);
    
    x = abs(c);                   // requires stdlib.h
    printf("%d\n", x);
    
    x = -1 * c;
    printf("%d\n", x);
    
    x = c / -1;                   // divide the original value c, not the previous result
    printf("%d\n", x);
}
```

```
-128
-128
-128
-128
-128
```


### The relationship between $x$ and $-x$ in two's complement

For almost any integer value $x$, the two's-complement representation of $-x$ can be found by flipping the bits
of $x$ and adding $1$. The exception is the most negative value, whose negation is not representable.
For example, if $x=00000001$ then its decimal value is $1$; to compute $-x$, we
take the following steps:

$$
\begin{array}{cccccccc|l}
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & = 1\\
\hline
1 & 1 & 1 & 1 & 1 & 1 & 1 & 0 & \text{flip bits} \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & \text{add}\ 1 \\
\hline
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & \text{bit} \\
1 \times -(2^7) & 
1 \times 2^6 & 
1 \times 2^5 & 
1 \times 2^4 & 
1 \times 2^3 & 
1 \times 2^2 & 
1 \times 2^1 & 
1 \times 2^0 & \text{bit}\ \times\text{multiplier} \\
-128 &
64 &
32 &
16 &
8 &
4 & 
2 & 
1 & \text{sum} = -1
\end{array}
$$

Summing two binary numbers is similar to summing two decimal numbers. Summing two bits equal to $1$ results
in the bit $0$ with a carry bit of $1$ that propagates to the next more significant bit.
For example, if $x=00111000$ then its decimal value is $56$; to compute $-x$, we
take the following steps:

$$
\begin{array}{cccccccc|l}
0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 & = 56\\
\hline
1 & 1 & 0 & 0 & 0 & 1 & 1 & 1 & \text{flip bits} \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & \text{add}\ 1 \\
\hline
1 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & \text{bit} \\
1 \times -(2^7) & 
1 \times 2^6 & 
0 \times 2^5 & 
0 \times 2^4 & 
1 \times 2^3 & 
0 \times 2^2 & 
0 \times 2^1 & 
0 \times 2^0 & \text{bit}\ \times\text{multiplier} \\
-128 &
64 &
0 &
0 &
8 &
0 & 
0 & 
0 & \text{sum} = -56
\end{array}
$$
