# Types

<div class="alert alert-block alert-info">
    You can find all of the C programs in this notebook in the subdirectory containing this notebook:
    <code>./src/types</code>
</div>

This notebook provides a quick overview of some of the main types in C. More detailed information about
particular types can be found in other notebooks.

As in Java, all variables in C have a type. The type of a variable determines what values the variable
can store and what operations can be performed using the variable.

Creating a new variable is performed by *declaring* the variable. In its simplest form, a variable
declaration has the form

*type* *name*

where *type* is the variable type and *name* is the variable name (more formally called an identifier).
An identifier is made up of an arbitrary length sequence of lowercase and uppercase Latin letters, digits,
and underscores (and Unicode characters since C99) that begins with a non-digit character. For example,

```c
int x;
```

declares a variable of type `int` named `x`. If `x` is declared inside of a C function without an
initializer, then its value is indeterminate (it could be any value), so `x` should be given a value before it is read.

A variable may be given a value at its point of declaration; doing so is called *initialization*. For
example,

```c
int y = 1;
```

declares a variable of type `int` named `y` and initializes its value to `1`.

Giving a previously declared variable a new value is called *assignment*. For example,

```c
int z = 0;
z = 100;
```

assigns the value `100` to the previously declared variable `z`.

Note that initialization and assignment are two different operations. All variables may be initialized,
but not all variables are assignable.

## The integer types

The following table lists the integer types and their minimum width in bits:

| Recommended name | Width in bits |
| :--- | :--- |
| `bool` (requires `<stdbool.h>`) | at least 8 |
| `signed char` | at least 8 |
| `unsigned char` | at least 8 |
| `char` | at least 8 |
| `short int` | at least 16 |
| `unsigned short int` | at least 16 |
| `int` | at least 16 |
| `unsigned int` | at least 16 |
| `long int` | at least 32 |
| `unsigned long int` | at least 32 |
| `long long int` | at least 64 |
| `unsigned long long int` | at least 64 |

Notice that the exact size of the integer types is not specified by the C standard. The standard states that
the size of a `char` is equal to the smallest addressable unit of the machine, which is usually 8 bits.
The smallest addressable unit of the machine is called a *byte*.

The sizes of all other types (including non-integer types) are measured in integer multiples of the size
of a `char`. The `sizeof` operator returns the size of an object or type:

In [None]:
// sizeof.c

#include <stdio.h>
#include <stdbool.h>

int main(void) {
    // sizeof yields a size_t, which is printed with the %zu conversion
    printf("char      : %zu\n", sizeof(char));
    printf("bool      : %zu\n", sizeof(bool));
    printf("short int : %zu\n", sizeof(short int));
    printf("int       : %zu\n", sizeof(int));
    printf("long int  : %zu\n", sizeof(long int));
    printf("float     : %zu\n", sizeof(float));
    printf("double    : %zu\n", sizeof(double));
    printf("char *    : %zu\n", sizeof(char *));
    printf("double *  : %zu\n", sizeof(double *));
    return 0;
}

`sizeof` returns an unsigned integer value of type `size_t`. The type `size_t` is defined in the header
file `<stddef.h>` (and also becomes available by including `<stdlib.h>`). `size_t` can store the maximum 
size of a theoretically possible object of any type; this implies that a variable of type `size_t`
can store the largest theoretically usable array index. On the author's computer, the actual type
of `size_t` is `unsigned long int`.

#### `bool`

C99 introduced the Boolean type `_Bool` that stores only the value `0` or `1` (false and true).
Assigning any non-zero value to a `_Bool` causes the value to become `1`. 
If you include the header `<stdbool.h>` then you can use the type name `bool` and the values `false` and `true`. 

In [None]:
// boo.c

#include <stdbool.h>
#include <stdio.h>

int main(void) {
    bool flag = true;          // or any non-zero value
    if (flag) {
        puts("true");
    }
    else {
        puts("false");
    }
    return 0;
}

The `printf` function has no conversion for Boolean values. You can use any of the integer conversions
`%d`, `%i`, or `%u`:

In [None]:
// print_boo.c

#include <stdbool.h>
#include <stdio.h>

int main(void) {
    bool flag = false;
    printf("%u\n", flag);
    return 0;
}

Alternatively, you can convert the Boolean value to a string and then print the string:

In [None]:
// print_boo2.c

#include <stdbool.h>
#include <stdio.h>

int main(void) {
    bool flag = false;
    printf("%s\n", flag ? "true" : "false");
    return 0;
}

#### `char`

As in Java, `char` is usually used to represent character data. A `char` literal is any single character
inside of single quotes. The `printf` conversion `%c` will print a single `char`:

In [None]:
// char.c 

#include <stdio.h>

int main(void) {
    char c = 'x';
    printf("%c\n", c);
    return 0;
}

A `char` literal may also be an escape sequence (similar to Java). A table of most of the escape sequences
is shown below:

| Escape sequence | Description |
| :--- | :--- |
| `\'` | single quote |
| `\"` | double quote |
| `\?` | question mark (needed to suppress trigraphs) |
| `\\` | backslash |
| `\a` | audible bell |
| `\b` | backspace |
| `\f` | form feed - new page |
| `\n` | line feed - new line |
| `\r` | carriage return |
| `\t` | horizontal tab |
| `\v` | vertical tab |

#### The various `int` types

Unlike Java, the various integer types come in both signed and unsigned varieties. Unsigned integers are
always non-negative.

Also unlike Java, the precise number of bits in an integer type is not specified by the C standard. For
example, in Java, an `int` is always a 32-bit two's-complement value, but the C standard simply states that
an `int` must have at least 16 bits. Before C23, the standard also did not require a two's-complement
representation for signed integers; C23 makes two's complement mandatory.

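Otherwise, the integer types behave similarly to their Java counterparts. In particular, integers in
C are subject to overflow. Unsigned arithmetic wraps around, but unlike in Java, overflow of a signed
integer is undefined behavior.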

The `printf` conversions `%d` or `%i` will print an `int` value. An additional *length modifier* character
should be used when printing a signed integer that is not `int`:

| To print a ... | Conversion |
| :--- | :--- |
| `short int` | `%hi` |
| `long int` | `%li` |
| `long long int` | `%lli` |

In [25]:
// signed_ints.c 

#include <stdio.h>

int main(void) {
    char c = 'a';
    short int s = 1;
    int i = 2;
    long int l = 3;
    long long int ll = 4000000000000000000;
    
    printf("%d\n", c);
    printf("%hd\n", s);
    printf("%d\n", i);
    printf("%ld\n", l);
    printf("%lld\n", ll);
    
    return 0;
}

97
1
2
3
4000000000000000000


Notice that printing a `char` using the conversion `%d` will print the numeric value of the character
instead of printing the character itself.

#### enum types

C has integer enumerations that allow a programmer to assign names to integer values for representing a set of constant values. C enumerations have no type safety, unlike Java enumerations.

In [None]:
// enum.c

#include <stdio.h>

int main(void) {
    enum day { sun, mon, tue, wed, thu, fri, sat };
    enum compass { north = 0, east = 90,
                   south = 180, west = 270 };
    enum month { jan = 1, feb, mar, apr,
                 may, jun, jul, aug,
                 sep, oct, nov, dec };

    enum day d = 10;             // compiles even though no day has the value 10
    enum compass dir = south;
    enum month m = oct;

    printf("d   : %d\n", d);
    printf("dir : %d\n", dir);
    printf("m   : %d\n", m);

    return 0;
}

## Floating-point types

The floating-point types are `float`, `double`, and `long double`. The first two types are identical
to their corresponding types in Java on most computers, and follow the IEEE-754 standard for 32-bit and
64-bit binary floating-point numbers. The format of `long double` depends on the platform: it has at least
the range and precision of `double`, and is commonly the 80-bit x87 extended format, the IEEE-754 128-bit
binary format, or simply the same format as `double`.

The `printf` conversion `%f` will print a `float` or `double` value as a decimal value, `%e` will
print using scientific notation, and `%g` will print using a conversion similar to `%f` or `%e`
depending on the magnitude of the value:

In [13]:
// float.c 

#include <stdio.h>

int main(void) {
    double pi = 3.14159265358979323846;
    printf("%f\n", pi);
    printf("%e\n", pi);
    printf("%g\n", pi);
    
    return 0;
}

3.141593
3.141593e+00
3.14159


Notice that only six digits after the decimal point are printed when using `%f` and `%e`, and a total of
six significant digits are printed when using `%g`. The number of digits after the decimal point (called the
*precision*) can be specified using `.n` where `n` is the desired precision:

In [15]:
// float2.c 

#include <stdio.h>

int main(void) {
    double pi = 3.14159265358979323846;
    printf("%.1f\n", pi);                     // precision 1
    printf("%.10f\n", pi);                    // precision 10
    printf("%.100f\n", pi);                   // precision 100?
    
    return 0;
}

3.1
3.1415926536
3.1415926535897931159979634685441851615905761718750000000000000000000000000000000000000000000000000000


## `const`

A variable may be qualified as being `const`, which marks the variable as being *unassignable* (read only).
Such a variable can be initialized (and you almost certainly will want to do so) but is not assignable.
`const` arithmetic values normally represent constant numeric values such as the mathematical constant
$\pi$:

In [None]:
// const.c

#include <math.h>
#include <stdio.h>

int main(void) {
    // computes and prints the coordinates of a point located at 30 degrees
    // on the unit circle
    
    const double PI = 3.14159265358979323846;
    
    double deg = 30.0;
    double x = cos(deg * PI / 180.0);
    double y = sin(deg * PI / 180.0);
    printf("(%f, %f)\n", x, y);
    
    // UNCOMMENT NEXT LINE TO ATTEMPT TO RE-ASSIGN PI
    // PI = 3.1416;
    
    return 0;
}

## `void`

On its own, `void` means *cannot hold any value*. It
can be used as a function return type to indicate that the function returns no value.
It can be used as the sole parameter of a function to indicate that the function takes no arguments.
You should always use a `void` parameter when you are declaring a function that accepts no arguments.

A `void *` pointer means that the pointer can point to any object. See the *Pointers* notebook for details.

`sizeof(void)` is an error.

## Arrays

Arrays in C have many similarities to arrays in Java:

* contiguous sequence of objects all having the same type
* the capacity (maximum number of elements) never changes during the array lifetime
* characterized by their element type
* use square brackets to access individual elements

and some differences:

* variables of array type cannot be assigned to (but elements of the array can be assigned to)
* the capacity of the array (if known) can be part of the declaration
* **there is no index bounds checking**
* an array decays to a pointer to its first element when the array is passed to a function
* it is impossible to return an array from a function

An array of constant known size can be created as shown in the example below:

In [None]:
// arr1.c

#include <stdio.h>

int main(void) {
    int x[3];       // an array of 3 ints, elements are not initialized
    x[0] = 1;
    x[1] = 2;
    x[2] = 3;

    for (int i = 0; i < 3; i++) {
        printf("%d\n", x[i]);
    }
    
    return 0;
}

`sizeof(arr)` returns the amount of memory (in bytes) that is used by the entire array `arr` which is equal to

$$ \text{array capacity} \times \text{sizeof the element type} $$

In [None]:
// arr2.c

#include <stdio.h>

int main(void) {
    int x[3];       // an array of 3 ints
    x[0] = 1;
    x[1] = 2;
    x[2] = 3;

    printf("sizeof(x) : %zu\n", sizeof(x));
    printf("capacity  : %zu\n", sizeof(x) / sizeof(int));
    
    return 0;
}

<div class="alert alert-block alert-warning">
    Now that you know what <tt>sizeof</tt> does when given a local array variable, you should remember
    to never use <tt>sizeof</tt> to determine the capacity of an array! <tt>sizeof(arr)</tt> returns
    the number of bytes used by <tt>arr</tt> only when <tt>arr</tt> is a locally declared array having
    automatic storage duration (see the <i>Scope and Lifetime</i> notebook).
    If you have a locally declared array then you already know its size and capacity. If you are
    inside a function that is passed an array as an argument, then the function actually receives a
    pointer to the first element of the array. Using <tt>sizeof</tt> on the pointer returns the
    size of the pointer, not the size of the array.
</div>

There is no array index bounds checking in C (you are really going to miss the exceptions you get in Java).
Using an invalid index causes undefined behavior. If you are lucky, your program will crash. If you are
unlucky, then some object gets overwritten and your program continues running.

In [None]:
// arr3.c

#include <stdio.h>

int main(void) {
    int arr1[5] = { -9, -9, -9, -9, -9 };            // array initialization
    printf("arr1[0]   : %d\n", arr1[0]);             // -9
    
    int x[3];       // an array of 3 ints
    
    x[0] = 1;
    x[1] = 2;
    x[2] = 3;
    x[3] = 4;       // no error?
 
    printf("sizeof(x) : %zu\n", sizeof(x));
    printf("capacity  : %zu\n", sizeof(x) / sizeof(int));
    
    printf("arr1[0]   : %d\n", arr1[0]);             // not -9 on the author's computer
    
    return 0;
}

Arrays are discussed in greater detail in the *Arrays* notebook.

## Strings

There is no proper string type in C. Strings are represented as arrays of `char` where the end of the string
is marked with the null character constant `'\0'`. String literals are similar to string literals in Java
(a sequence of characters enclosed by double quotes).

In [None]:
// str1.c

#include <stdio.h>

int main(void) {
    char str[] = "CISC220";
    
    puts(str);
    
    return 0;
}

There is a difference between the length of a string and the capacity of the underlying array. The length of a string
is equal to the number of characters in the string before the *first* `'\0'` in the array. The capacity of the array
is the maximum number of elements that can be stored in the array.

In [None]:
// str2.c

#include <stdio.h>
#include <string.h>     // needed for strlen

int main(void) {
    char str[100] = "CISC220";
    
    printf("capacity      : %zu\n", sizeof(str) / sizeof(char));
    printf("string length : %zu\n", strlen(str));
    
    return 0;
}

Notice that the length of a string does not include the terminating `'\0'`.

Because strings are simply arrays, it is possible to incorrectly create a string omitting the terminating
`'\0'`. For example:

In [None]:
// str3.c

#include <stdlib.h>     // needed for malloc
#include <stdio.h>
#include <string.h>     // needed for strlen

int main(void) {
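    char *str = malloc(1);    // dynamically allocate memory for 1 char (no room for '\0')
    str[0] = 'C';
    
    // undefined behavior: strlen reads past the allocation looking for '\0'
    printf("string length : %zu\n", strlen(str));
    
    free(str);
    return 0;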
}

## Structs

A struct is a composite data type that groups one or more variables under a single name in a block of memory.
The variables, called *members*, can be accessed via the struct object. A struct somewhat resembles a
Java top-level class where all of the fields are public (and there are no methods or constructors). Like
a Java class, a struct allows the programmer to create a new type.

The syntax for declaring a struct is:

```c
struct tag_name {
    type1 member1;
    type2 member2;
    // and so on
};
```

An example of a simple struct is a struct that represents a two-dimensional point having real
coordinates `x` and `y`:

```c
struct point2 {
    double x;
    double y;
};
```

The typename of a struct includes the `struct` keyword. For example, the typename of the 
`point2` struct is `struct point2`.

In an uninitialized struct object, the members are all uninitialized (can have any value).
A struct object can be initialized in a manner similar to initializing an array. For example, we
can create a point having coordinates $x=0.1$ and $y=0.2$ by writing:

```c
struct point2 p = { 0.1, 0.2 };
```

To access the members, we use the `.` notation familiar to Java programmers:

In [7]:
#include <stdio.h>

struct point2 {
    double x;
    double y;
};

int main(void) {
    struct point2 p = { 0.1, 0.2 };
    
    double xval = p.x;
    double yval = p.y;
    
    printf("p = (%f, %f)\n", xval, yval);
    
    p.x = 100.0;
    p.y = 200.0;
    
    printf("p = (%f, %f)\n", p.x, p.y);
    
    return 0;
}

p = (0.100000, 0.200000)
p = (100.000000, 200.000000)


A struct object can be assigned to another struct object of the same type. A bitwise copy of the struct
is performed (the contents of the memory used by the struct is copied bit-by-bit into the assigned object).

In [9]:
#include <stdio.h>

struct point2 {
    double x;
    double y;
};

int main(void) {
    struct point2 p = { 0.1, 0.2 };
    struct point2 q;
    
    // assignment
    q = p;
    
    printf("p = (%f, %f)\n", p.x, p.y);
    printf("q = (%f, %f)\n", q.x, q.y);
    
    // changing one struct does not change the other
    p.x = 100.0;
    q.y = 200.0;
    
    printf("p = (%f, %f)\n", p.x, p.y);
    printf("q = (%f, %f)\n", q.x, q.y);
    
    return 0;
}

p = (0.100000, 0.200000)
q = (0.100000, 0.200000)
p = (100.000000, 0.200000)
q = (0.100000, 200.000000)
