# A brief summary of C (89, 99, 11)
<br>
<div style="opacity: 0.8; font-family: Consolas, Monaco, Lucida Console, Liberation Mono, DejaVu Sans Mono, Bitstream Vera Sans Mono, Courier New; font-size: 12px; font-style: italic;">
    ────────
    for more from the author, visit
    <a href="https://github.com/hazemanwer2000">github.com/hazemanwer2000</a>.
    ────────
</div>

## Table of Contents
* [The Compilation Process](#the-compilation-process)
* [Data Types](#data-types)
* [Operators](#operators)
* [The Fundamentals](#the-fundamentals)
    * [Functions](#functions)
    * [Variables](#variables)
        * [Local Variables](#local-variables)
        * [Global Variables](#global-variables)
    * [Casting](#casting)
    * [Literals](#literals)
        * [Integer Literals](#integer-literals)
        * [Floating-point Literals](#floating-point-literals)
        * [Character Literals](#character-literals)
    * [Conditional Execution](#conditional-execution)
* [Pointers](#pointers)
    * [`void *`](#void)
    * [Function Pointers](#function-pointers)
    * [The `restrict` keyword (C99+)](#the-restrict-keyword)
* [Arrays](#arrays)
* [Structures and Unions](#structures-and-unions)
* [Enumerations](#enumerations)
* [Additional Core Features](#additional-core-features)
    * [The `volatile` keyword](#the-volatile-keyword)
    * [The `const` keyword](#the-const-keyword)
    * [String Literals](#string-literals)
    * [The comma operator](#the-comma-operator)
    * [The `sizeof` operator](#the-sizeof-operator)
<hr>

C is a *case-sensitive*, procedural, low-level language.

Every C program must contain a single `main` function across all files. It is the entry point of the program.

When running as bare-metal firmware, the function returns nothing. When running on an OS, the function returns a value to the OS, indicating its exit status.

In [23]:
int main() {
    return 0;
}

## The Compilation Process <a class="anchor" id="the-compilation-process"></a>

Files in *C* are either *source* files, `*.c`, or *header* files, `*.h`.

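*Note:* Header files are not compiled on their own. Their contents are pasted into source files by the preprocessor, through the `#include` directive, and are compiled as part of those source files.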

*Note:* The compiler used throughout this notebook is *GCC* (GNU Compiler Collection). The term *compiler* used here is an umbrella term for all tools used throughout the compilation process, usually lumped together under a single executable.

The first stage in the compilation process is the *preprocessing* stage. The **preprocessor** reads and executes *directives* in `*.c` files, outputting a `*.i` file for each `*.c` file.

`gcc -E *.c > *.i`

*Note:* Only a single file may be preprocessed at a time.

The second stage is the *compilation* stage. The **compiler** (proper) *compiles* each `*.i` file into a `*.s` file, translating C code into assembly instructions of the *target architecture*. Hence, compilers are architecture-dependent, but platform-independent.

*Note:* An *architecture* is a family of *platforms* that share the same *ISA* (Instruction Set Architecture).

`gcc -S *.i`

*Note:* If a `*.c` file is passed instead, it is preprocessed in the background before it is compiled.

The third stage is the *assembly* stage. The **assembler** assembles each `*.s` file into a `*.o` object file, translating assembly instructions into machine code.

`gcc -c *.s`

*Note:* If a `*.c` or `*.i` file is passed instead, it is preprocessed and compiled in the background before it is assembled.

While compiling and assembling, files are handled separately. Declared but undefined symbols within a single file are left symbolic, for the **linker** to resolve.

The fourth stage is the *linking* stage. The linker *links* all input `*.o` files into a single *relocatable file*, by resolving symbolic associations between files. Also, it implicitly links to the C standard library.

*Note:* A *declaration* acknowledges the existence of a *definition*, in the same or in a different source file. It does not allocate memory.

In the fifth and final stage, the **locator** uses the *linker map file* of the *target platform* to map the memory regions within the relocatable file according to the target's memory map. The final output is a single *executable*, able to run on the target platform.

Usually, there is no intermediary output between the two stages: both *symbolic resolution* and *relocation* are performed by the linker, which implicitly calls or implements the locator. The linker is, hence, platform-dependent.

`gcc *.o -o run.exe`

*Note:* If a `*.c`, `*.i` or `*.s` file is passed instead, it is preprocessed, compiled and assembled in the background, as needed.

## Data Types <a class="anchor" id="data-types"></a>

* Arithmetic Types
    * Integer Types
        * Size: *1 byte*
            * `unsigned char`
            * `signed char`
        * Size: *2 bytes*
            * `unsigned short int`
            * `signed short int`
        * Size: *2 or 4 bytes*
            * `unsigned int`
            * `signed int`
        * Size: *4 or 8 bytes*
            * `unsigned long int`
            * `signed long int`
        * Size: *8 bytes*
            * `unsigned long long int`
            * `signed long long int`
    * Floating-point Types
        * Size: *4 bytes*
            * `float`
        * Size: *8 bytes*
            * `double`
        * Size: *(compiler-dependent)*
            * `long double`
* Derived Types
    * Functions
    * Pointers
    * Arrays
    * Structures
    * Unions
* Enumerations
* `void`

*Note:* If the `unsigned` keyword is omitted, the integer data type defaults to `signed`. The exception is plain `char`, whose signedness is implementation-defined.

*Note:* For non-`char` integer data types, the `int` keyword may be omitted.

## Operators <a class="anchor" id="operators"></a>

| *Operator* | *Associativity* | *Precedence* |
| --- | --- | --- |
| `()` `[]` `->` `.` | *left-to-right* | ↑ (highest) |
| `++` `--` `+` `-` `!` `~` *`(type)`* `*` `&` `sizeof` | *right-to-left* | |
| `*` `/` `%` | *left-to-right* | |
| `+` `-` | *left-to-right* | |
| `<<` `>>` | *left-to-right* | |
| `<` `<=` `>` `>=` | *left-to-right* | |
| `==` `!=` | *left-to-right* | |
| `&` | *left-to-right* | |
| `^` | *left-to-right* | |
| `\|` | *left-to-right* | |
| `&&` | *left-to-right* | |
| `\|\|` | *left-to-right* | |
| `?:` | *right-to-left* | |
| `=` `+=` `-=` `*=` `/=` `%=` `&=` `^=` `\|=` `<<=` `>>=` | *right-to-left* | |
| `,` | *left-to-right* | ↓ (lowest) |

## The Fundamentals <a class="anchor" id="the-fundamentals"></a>

### Functions <a class="anchor" id="functions"></a>

The C compiler (proper) is a single-pass compiler. Hence, to perform type-checking on passed function parameters, a function declaration (also called *prototype*) should be present before a function call. 

In [41]:
#include <stdio.h>

int add(int x, int y) {   /* function definition (and declaration) */
    return x + y;
}

int main() {
    int x = 5;
    int y = 6;
    int sum = add(x, y);
    printf("%d", sum);
    return 0;
}

11

In [37]:
#include <stdio.h>

int add(int x, int y);   /* function declaration */

int main() {
    int x = 5;
    int y = 6;
    int sum = add(x, y);
    printf("%d", sum);
    return 0;
}

int add(int x, int y) {   /* function definition (and declaration) */
    return x + y;
}

11

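If a function declaration is missing before a function call, C89 implicitly declares the function as returning `int`. C99 removed implicit declarations, but GCC has traditionally only issued a warning (GCC 14 and later report an error by default).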

In [39]:
#include <stdio.h>

int main() {
    int x = 5;
    int y = 6;
    int sum = add(x, y);
    printf("%d", sum);
    return 0;
}

int add(int x, int y) {   /* function definition (and declaration) */
    return x + y;
}

/tmp/tmpe6jm12l2.c: In function ‘main’:
    6 |     int sum = add(x, y);
      |               ^~~


11

Usually, a `*.h` file contains function declarations of functions in the corresponding `*.c` file. The `#include` directive is used to include the content of another file, usually a `*.h` file, into a `*.c` file.

In [47]:
#include <stdio.h>       /* Standard Library IO functions prototypes. */ 
                         /* Standard Library function definitions are linked to, automatically. */

int main() {
    printf("Hi!");       /* 'printf' prototype defined in 'stdio.h' */
    return 0;
}

Hi!

*Note:* The `#include` directive is recursive. Any directives in the included file are further executed.

A *static* function may not be referenced outside the file it is defined in.

In [50]:
#include <stdio.h>

static int add(int x, int y);      /* static function declaration */

int main() {
    int x = 5;
    int y = 6;
    int sum = add(x, y);
    printf("%d", sum);
    return 0;
}

static int add(int x, int y) {       /* static function definition */
    return x + y;
}

11

If a static function is referenced outside the file it is defined in, the linker throws an error.

In [65]:
//%cflags: .jupyter/add.c

#include <stdio.h>

int add(int x, int y);           /* 'add' is a global (non-static) function declaration */
                                 /* to a global function definition in another file */

int main() {
    printf("%d", add(1, 2));
    return 0;
}

3

In [66]:
//%cflags: .jupyter/add_static.c

#include <stdio.h>

int add(int x, int y);           /* 'add' is a global function declaration */
                                 /* to a static function definition in another file */
                                 
int main() {
    printf("%d", add(1, 2));
    return 0;
}

/tmp/tmpy32z25fi.out: symbol lookup error: /tmp/tmpcjdsntr8.out: undefined symbol: add
[C kernel] Executable exited with code 127

If a static function declaration is present before a function call, while the function definition is missing, the compiler issues a warning only.

In [72]:
#include <stdio.h>

static int add(int x, int y);

int main() {
    printf("%d", add(1, 2));
    return 0;
}

    3 | static int add(int x, int y);
      |            ^~~
/tmp/tmpy32z25fi.out: symbol lookup error: /tmp/tmpv305fucn.out: undefined symbol: add
[C kernel] Executable exited with code 127

In C, function parameters are passed by value.

In [48]:
#include <stdio.h>

void increment(int x) {
    x++; 
}

int main() {
    int x = 5;
    increment(x);
    printf("%d", x);
}

5

### Variables <a class="anchor" id="variables"></a>

#### Local Variables <a class="anchor" id="local-variables"></a>

Variables defined within a function have *local* scope. They can only be accessed within that function.

In [98]:
#include <stdio.h>

int main() {
    int x = 10;
    printf("%d", x);
    return 0;
}

10

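Uninitialized local variables hold *indeterminate* (garbage) values. Reading one before assigning to it is undefined behavior, so the output below may differ between runs and compilers.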

In [100]:
#include <stdio.h>

int main() {
    int x;
    printf("%d", x);
    return 0;
}

32767

Local variables may be `auto` or `static`. `auto` local variables are allocated and deallocated with each function call. `static` local variables persist throughout the lifetime of the program.

In [90]:
#include <stdio.h>

int x_plus_one();

int main() {
    printf("%d\n", x_plus_one());
    printf("%d\n", x_plus_one());
    printf("%d\n", x_plus_one());
    return 0;
}

int x_plus_one() {
    auto int x = 10;         /* auto local variable definition executed */
                             /* with each function call */
    return ++x;
}

11
11
11


In [91]:
#include <stdio.h>

int x_plus_one();

int main() {
    printf("%d\n", x_plus_one());
    printf("%d\n", x_plus_one());
    printf("%d\n", x_plus_one());
    return 0;
}

int x_plus_one() {
    static int x = 10;         /* static local variable definition executed */
                               /* once only at start of program, and not with */
                               /* each function call */
    return ++x;
}

11
12
13


*Note:* Local variables are `auto` by default.

*Note:* Storage class specifiers may come before or after the data type.

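Local variables may also be `register`. The `register` *storage class specifier* behaves like `auto` (the variable is allocated and deallocated with each function call), but hints to the compiler that the variable should be kept in one of the CPU's registers, for faster access. The address of a `register` variable may not be taken.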

In [537]:
#include <stdio.h>

int x_plus_one();

int main() {
    printf("%d\n", x_plus_one());
    printf("%d\n", x_plus_one());
    printf("%d\n", x_plus_one());
    return 0;
}

int x_plus_one() {
    register int x = 10;     /* register (auto) local variable definition executed */
                             /* with each function call */
    return ++x;
}

11
11
11


*Note:* The `register` storage class specifier may be ignored by the compiler, defaulting to `auto`.

If multiple storage class specifiers are used on a single local variable, the compiler throws an error.

In [116]:
#include <stdio.h>

int x_plus_one();

int main() {
    printf("%d\n", x_plus_one());
    printf("%d\n", x_plus_one());
    printf("%d\n", x_plus_one());
    return 0;
}

int x_plus_one() {
    auto register int x = 10;         /* register (auto) local variable definition executed */
                             /* with each function call */
    return ++x;
}

/tmp/tmpze_ttdz4.c: In function ‘x_plus_one’:
/tmp/tmpze_ttdz4.c:13:5: error: multiple storage classes in declaration specifiers
   13 |     auto register int x = 10;         /* register (auto) local variable definition executed */
      |     ^~~~
[C kernel] GCC exited with code 1, the executable will not be executed

#### Global Variables <a class="anchor" id="global-variables"></a>

Variables defined outside a function have *global* scope. They can be accessed within any function in any file, as long as a declaration of the variable precedes the access.

In [77]:
#include <stdio.h>

int x = 10;      /* variable definition (and declaration) */

int main() {
    printf("%d", x);
    return 0;
}

10

Global variables without an explicit initializer are implicitly initialized to zero.

In [101]:
#include <stdio.h>

int x;        /* variable definition (and declaration) */

int main() {
    printf("%d", x);
    return 0;
}

0

The keyword `extern` is used to declare a variable that is defined later in the file, or within another file.

In [539]:
//%cflags: .jupyter/x_def.c

#include <stdio.h>

extern int x;      /* Variable declaration */
                   /* Variable definition resides in '.jupyter/x_def.c' */

int main() {
    printf("%d", x);
    return 0;
}

23

`static` global variables may not be accessed within another file.

In [108]:
//%cflags: .jupyter/x_def_static.c

#include <stdio.h>

extern int x;      /* Variable declaration */
                   /* Static variable definition resides in '.jupyter/x_def_static.c' */

int main() {
    printf("%d", x);
    return 0;
}

/tmp/tmpy32z25fi.out: /tmp/tmpqrzuawqj.out: undefined symbol: x
[C kernel] Executable exited with code 1

#### Scope

Within a *scope*, identifiers must be unique. Within a *nested scope*, an identifier *shadows* any identifier of the same name in the parent scope.

A *local scope* within a function can be considered nested within the *global scope* of the containing file (or across files).

In [446]:
#include <stdio.h>

int x = 1;

int main() {
    int x = 2;
    printf("%d", x);
}

2

Braces define scope. A nested scope can be defined within a function by using braces.

In [260]:
#include <stdio.h>

int x = 1;

int main() {
    printf("Global scope: %d\n", x);
    int x = 2;
    printf("Local scope: %d\n", x);
    {
        int x = 3;
        printf("Nested Scope: %d\n", x);
    }
    printf("Again, local scope: %d\n", x);
}

Global scope: 1
Local scope: 2
Nested Scope: 3
Again, local scope: 2


*Note:* Local variables defined within a nested scope are deallocated once the scope ends.

### Casting <a class="anchor" id="casting"></a>

In C, converting one arithmetic data type into another, whether implicitly or through an explicit *cast*, never throws an error, although information may be lost.

In [135]:
#include <stdio.h>

int main() {
    float x = 13.0f;             /* float literal, discussed later */
    printf("%d", (int) x);
    return 0;
}

13

In arithmetic, bitwise, comparison and equivalence operations, the C compiler follows a number of implicit conversion rules:
* `char` and `short` types are promoted to `int` by default (*integer promotion*).
* If different integer data types of the same signedness are present, implicit conversion occurs towards the larger data type.
* If `signed` and `unsigned` data types are present, implicit conversion occurs towards the `unsigned` data type, unless the `signed` type is larger and can represent every value of the `unsigned` type.
* If integer and floating-point types are present, implicit conversion occurs towards the floating-point data type.
* If different floating-point types are present, implicit conversion occurs towards the largest floating-point data type.

In [192]:
#include <stdio.h>

                    /* signed to unsigned */
int main() {
    signed int x = -1; 
    printf("%u", x);         /* print as unsigned, discussed later */
    return 0;
}

4294967295

In [193]:
#include <stdio.h>

                    /* unsigned to signed */
int main() {
    unsigned int x = 4294967295; 
    printf("%d", x);                  /* print as signed, discussed later */
    return 0;
}

-1

*Note:* On two's-complement machines (virtually all modern hardware), the bits remain unchanged when converting between `signed` and `unsigned` types of the same size.

In [163]:
#include <stdio.h>

                    /* unsigned int to unsigned long long */
int main() {
    unsigned int x = 1; 
    printf("%llu", (unsigned long long) x);         /* print as unsigned long long, discussed later */
    return 0;
}

1

In [191]:
#include <stdio.h>

                    /* signed int to signed long long */
int main() {
    signed int x = -1; 
    printf("%lld\n", (signed long long) x);         /* print as signed long long, discussed later */
    
    x = 1; 
    printf("%lld\n", (signed long long) x);
    return 0;
}

-1
1


*Note:* When an integer data type is converted into a larger data type, the bits are *zero-extended* if it is `unsigned`, and *sign-extended* if it is `signed` (padded with zeros if non-negative, and with ones if negative).

In [461]:
#include <stdio.h>
#include <limits.h>

                    /* unsigned int to unsigned char */
int main() {
    unsigned int x = UINT_MAX;
    printf("%u", (unsigned char) x);
}

255

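*Note:* When an integer data type is converted into a smaller `unsigned` data type, the higher-order bits are simply discarded. Converting an out-of-range value into a smaller `signed` data type is implementation-defined.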

In [515]:
#include <stdio.h>
#include <limits.h>

                    /* integer promotion of 'unsigned char' and 'signed char' operands */
int main() {
    unsigned char x = 2;
    signed char y = -3;     /* 253 as 'unsigned char'*/
    
    printf("%u\n", x * y);
    printf("%u\n", x * (unsigned char) y);
}

4294967290
506


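*Note:* Both operands are first promoted to `int`, preserving their values, and the multiplication is carried out as `int`. Hence, `x * y` is `-6` (printed as `4294967290` by `%u`), while `x * (unsigned char) y` is `2 * 253 = 506`, because the explicit narrowing cast is applied before the promotion.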

### Literals <a class="anchor" id="literals"></a>

#### Integer Literals <a class="anchor" id="integer-literals"></a>

| *Prefix* | *Type* |
| --- | --- |
| *(none)* | Decimal |
| `0x`, `0X` | Hexadecimal |
| `0` | Octal |
| `0b`, `0B` | Binary (GCC extension, standardized only in C23) |

| *Suffix* | *Type* (the first that can represent the value) |
| --- | --- |
| *(none)* | `int` <br> `long int` <br> `long long int` <br> (non-decimal literals may also take the `unsigned` variants) |
| `u`, `U` | `unsigned int` <br> `unsigned long int` <br> `unsigned long long int` |
| `l`, `L` | `long int` |
| `ll`, `LL` | `long long int` |
| `ul`, `uL`, `Ul`, `UL` | `unsigned long int` |
| `ull`, `uLL`, `Ull`, `ULL` | `unsigned long long int` |

*Note:* If a value is too large for its data type, the compiler issues a warning.

#### Floating-point Literals <a class="anchor" id="floating-point-literals"></a>

| *Valid Format* | *Example* |
| --- | --- |
| *int.frac* | `1.1` |
| *int.* | `1.` |
| *.frac* | `.1` |
| *int-e-exp* | `1e-1` |

| *Suffix* | *Type* |
| --- | --- |
| *(none)* | `double` |
| `f`, `F` | `float` |
| `l`, `L` | `long double` |

#### Character Literals <a class="anchor" id="character-literals"></a>

A *character literal* has type `int` in C. Its value is the code of the corresponding character in the execution character set, which is typically *ASCII*.

| *Example* | *ASCII code* |
| --- | --- |
| `'0'` | 48 |
| `'A'` | 65 |
| `'a'` | 97 |
| `'\n'` | 10 |
| `'\0'` | 0 |
| `'"'` | 34 |
| `'\''` | 39 |

### Conditional Execution <a class="anchor" id="conditional-execution"></a>

In C, in any decision (conditional branching), any non-zero value evaluates to true, and otherwise false.

In [475]:
#include <stdio.h>

int main() {                /* if statement */
    if (0.01) {
        printf("True!");
    } else {
        printf("False!");
    }
    return 0;
}

True!

In [564]:
#include <stdio.h>

int main() {                /* ternary operator */
    printf("%c", 0.01 ? 'T' : 'F');
    return 0;
}

T

In C, loops consist of *while*, *do-while* and *for* loops.

In [426]:
#include <stdio.h>

int main() {
    register int count = 3;
    while (count) {
        printf("%d ", count--);
    }
}

3 2 1 

In [29]:
#include <stdio.h>

int main() {
    do {
        printf("Hi!");
    } while (0);
}

Hi!

In [428]:
#include <stdio.h>

int main() {
    register int count;
    for (count = 3; count; count--) {
        printf("%d ", count);
    }
}

3 2 1 

*Note:* Use `break` and `continue` statements to break from a loop, or skip the current iteration, respectively.

A *switch* is used to test for a specific value of an integer data type.

In [477]:
#include <stdio.h>

int main() {
    int x = 2;
    switch (x) {
        case 1:
            printf("One.");
            break;
        case 2:
            printf("Two.");
            break;
        default:
            printf("Default.");
            break;
    }
}

Two.

In [481]:
#include <stdio.h>

int main() {
    float x = 2.0f;
    switch (x) {
        case 1.0f:
            printf("One.");
            break;
        case 2.0f:
            printf("Two.");
            break;
        default:
            printf("Default.");
            break;
    }
}

/tmp/tmp4fn7rtds.c: In function ‘main’:
/tmp/tmp4fn7rtds.c:5:13: error: switch quantity not an integer
    5 |     switch (x) {
      |             ^
/tmp/tmp4fn7rtds.c:6:9: error: case label does not reduce to an integer constant
    6 |         case 1.0f:
      |         ^~~~
/tmp/tmp4fn7rtds.c:9:9: error: case label does not reduce to an integer constant
    9 |         case 2.0f:
      |         ^~~~
[C kernel] GCC exited with code 1, the executable will not be executed

Once a case matches in a switch, all statements further down are executed, regardless, until a `break` statement is met.

In [431]:
#include <stdio.h>

int main() {
    int x = 1;
    switch (x) {
        case 1:
            printf("One.");
        case 2:
            printf("Two.");
        default:
            printf("Default.");
    }
}

One.Two.Default.

If no case matches, the `default` (optional) case is executed.

In [772]:
#include <stdio.h>

int main() {
    int x = 3;
    switch (x) {
        default:
            printf("Default.");
        case 1:
            printf("One.");
        case 2:
            printf("Two.");
    }
}

Default.One.Two.

*Note:* It is recommended to add a `break` statement at the end of each case.

A `goto` statement jumps to a label in code, and can be used to implement conditional branching and looping structures from scratch.

In [444]:
#include <stdio.h>

int main() {
    goto skip;
    printf("Hi!");
    skip:
    printf("Bye!");
    return 0;
}

Bye!

*Note:* A label must be defined within the same function in which it is used.

When working with the `if`, `else`, `while`, `do`, and `for` keywords, defining a new scope with braces is optional. Without braces, exactly one statement follows, terminated by a semicolon. That statement may be empty.

In [2]:
#include <stdio.h>

int main() {
    int x = 3;
    while(x)
        printf("%d\n", x--);
    return 0;
}

3
2
1


In [22]:
#include <stdio.h>

int main() {
    int x = 3;
    while(x--)
        ;
    return 0;
}

*Note:* A semicolon with no statement preceding it is called an *empty statement*.

Additionally, another of these keywords may follow directly in place of that statement, as with `else if`.

In [24]:
#include <stdio.h>

int main() {
    if (0)
        printf("If.");
    else if (1)              /* 'else' followed by 'if' */
        printf("Else-If.");
    else
        ;
    return 0;
}

Else-If.

In some examples, the path of execution may be misleading. An `else` always associates with the nearest preceding `if` that lacks one, regardless of indentation.

In [17]:
#include <stdio.h>

int main() {
    if (1)
        if (0)
            ;
    else                        /* 'else' associated with nested-'if' and not 'if' */
        printf("Else.");
    return 0;
}

Else.

*Note:* It is preferred to always define a new scope after `if`, `else`, `while`, `do`, or `for`.

## Pointers <a class="anchor" id="pointers"></a>

A *pointer* is a variable that points to some location in memory, typically another variable's memory location. A pointer is typically associated with a specific data type in definition and declaration.

*Note:* The size of a pointer variable is architecture-dependent, since it depends on the address bus size.

In [247]:
#include <stdio.h>

int main() {
    int x = 5;
    int *ptr;            /* pointer variable definition and declaration (using the dereference operator '*') */
    ptr = &x;            /* '&' reference operator, yields address location of 'x' */
    *ptr = 10;           /* '*' dereference operator, accesses memory at address location stored in 'ptr' */
    printf("%d", x);
    return 0;
}

10

If an integer literal, usually a hexadecimal value, is implicitly converted to a pointer, the compiler issues a warning.

In [509]:
int main() {
    int *ptr = 0x12345678;
    return 0;
}

/tmp/tmpnoshsqds.c: In function ‘main’:
    2 |     int *ptr = 0x12345678;
      |                ^~~~~~~~~~


If a pointer is implicitly converted to an integer data type, the compiler also issues a warning.

In [512]:
int main() {
    int x = 5;
    unsigned long long int address = &x;
    return 0;
}

/tmp/tmp633j5827.c: In function ‘main’:
    3 |     unsigned long long int address = &x;
      |                                      ^


A *double pointer* points to a pointer.

In [530]:
#include <stdio.h>

int main() {
    int x = 5;
    int *ptr = &x;
    int **dbPtr = &ptr;
    printf("%d", **dbPtr);
}

5

*Note:* A triple pointer points to a double pointer, and so on.

If a pointer is implicitly converted to a double pointer, or vice versa, the compiler issues a warning.

In [531]:
#include <stdio.h>

int main() {
    int x = 5;
    int *ptr = &x;
    int **dbPtr = ptr;
}

/tmp/tmpfkcq_uhc.c: In function ‘main’:
    6 |     int **dbPtr = ptr;
      |                   ^~~


Pointers allow a function to modify variables outside of its scope.

In [506]:
#include <stdio.h>

void increment(int *x);

int main() {
    int x = -4;
    increment(&x);
    printf("%d", x);
}

void increment(int *x) {
    (*x)++;
}

-3

A *wild pointer* is an uninitialized local pointer. Since uninitialized local variables contain garbage values, a wild pointer is dangerous: dereferencing it accesses an arbitrary memory location, which is undefined behavior. Typically, data is silently corrupted or, if the location is unmapped or read-only, the program crashes with a run-time segmentation fault.

*Note:* Generally, compilers warn about wild pointers.

To avoid wild pointers, initialize local pointer variables with `NULL`, which is a *macro definition* (discussed later) defined in several standard library headers, including `stddef.h`, `stdlib.h` and `stdio.h`. `NULL` is typically defined as `0` or `((void *) 0)`. Dereferencing a `NULL` pointer is still undefined behavior according to the C Standard, but on most hosted platforms the operating system terminates the program immediately with a segmentation fault. This predictable failure makes defective code easier to debug.

In [258]:
#include <stdio.h>

int main() {
    int *ptr = NULL;
    *ptr = 50;
    return 0;
}

[C kernel] Executable exited with code -11

Adding or subtracting an integer data type from a pointer is permitted, and results in a pointer.

In [324]:
#include <stdio.h>

int main() {
    int vals[3] = {5, 10, 20};   /* array (discussed later): elements are adjacent in memory */
    
    int *ptr = &vals[0];
    
    printf("%d\n", *ptr);
    printf("%d\n", *(ptr+1));
    printf("%d\n", *(ptr+2));
    return 0;
}

5
10
20


*Note:* Since each address in memory points to a specific byte, incrementing a pointer increases the stored address by the number of bytes needed to store the data type associated with the pointer. The result is only well-defined while the pointer stays within a single array (or one element past its end).

Subtraction between two pointers of the same data type is permitted, and results in a signed integer of type `ptrdiff_t` (defined in `stddef.h`), whose size is architecture-dependent.

In [488]:
#include <stdio.h>

int main() {
    int vals[2] = {5, 10};       /* array (discussed later) */
    
    printf("%td\n", &vals[1] - &vals[0]);   /* '%td' (C99+) prints a 'ptrdiff_t' */
    return 0;
}

1


*Note:* The difference between the two addresses is divided by the number of bytes needed to store the associated data type, so the result counts elements, not bytes.

*Note:* Subtracting two pointers of different associated data types, the compiler throws an error. 

*Note:* Subtracting two pointers makes more sense when working with arrays, discussed later.

Comparison and equivalence operators may be used between pointers of the same data type.

In [362]:
#include <stdio.h>

int main() {
    int vals[2] = {5, 10};       /* array (discussed later) */
    
    printf("%d\n", &vals[1] > &vals[0]);
    return 0;
}

1


*Note:* If comparison or equivalence operators are used between pointers of different data types, or between a pointer and an integer data type, the compiler issues a warning.

### `void *` <a class="anchor" id="void"></a>

A *void pointer* is a pointer with no associated data type. If dereferenced and read from or written to, the compiler throws an error. Assignment and comparison are valid. Arithmetic on void pointers is not part of standard C, but GCC accepts it as an extension.

In [400]:
#include <stdio.h>

int main() {
    int x = 5;
    void *ptr = &x;
    *ptr = 6;
    return 0;
}

/tmp/tmpb5of1xpr.c: In function ‘main’:
    6 |     *ptr = 6;
      |     ^~~~
/tmp/tmpb5of1xpr.c:6:10: error: invalid use of void expression
    6 |     *ptr = 6;
      |          ^
[C kernel] GCC exited with code 1, the executable will not be executed

In [402]:
#include <stdio.h>

int main() {
    int x = 5;
    int y = 10;
    void *ptr = &x;
    void *ptr2 = &y;
    
    printf("%lx\n", (unsigned long int) (ptr2));
    printf("%lx\n", (unsigned long int) (ptr2+1));
    printf("%lx\n", ptr2 - ptr);
    printf("%d\n", ptr < ptr2);
    
    return 0;
}

7ffd0a7a4484
7ffd0a7a4485
4
1


*Note:* As a GCC extension, arithmetic on void pointers assumes a byte-sized data type (e.g. `char`), since there is no associated data type. Standard C does not permit it.

### Function Pointers <a class="anchor" id="function-pointers"></a>

A *function pointer* points to a function's memory location. It must define the number of parameters and corresponding types, as well as the return type.

In [None]:
int * add(int x, int y);            /* This is a function, that accepts */
                                    /* two 'int' parameters, and returns an 'int' pointer */

In [None]:
int (*add)(int x, int y);           /* This is a pointer, to a function, that accepts */
                                    /* two 'int' parameters, and returns an 'int' */

Function pointers may be initialized.

In [528]:
#include <stdio.h>

int add(int x, int y);
int (*ptr)(int x, int y) = add;

int main() {
    printf("%d\n", ptr(5, 4));       /* call through the function pointer */
    return 0;
}

int add(int x, int y) {
    return x + y;
}

9


*Note:* A function pointer is an example of a complex definition. Complex definitions should be understood according to the precedence and associativity of the operators present.

Dereferencing a function pointer evaluates to the same function pointer.

In [35]:
#include <stdio.h>

int add(int x, int y);
int (*ptr)(int x, int y) = add;

int main() {
    printf("%d\n", (*ptr)(5, 4));       /* dereferencing once */
    printf("%d\n", (**ptr)(5, 4));      /* dereferencing twice */
    return 0;
}

int add(int x, int y) {
    return x + y;
}

9
9


*Note:* A function name (identifier) is a *fixed* function pointer, with a special property. Both referencing (`&add`) and dereferencing (`*add`) the name of a function yield the same function address.

A function pointer may be a parameter to another function.

In [407]:
#include <stdio.h>

int add(int x, int y);
int intermediate(int (*ptr)(int x, int y), int x, int y);

int main() {
    printf("%d", intermediate(add, 5, 4));
    return 0;
}

int add(int x, int y) {
    return x + y;
}

int intermediate(int (*ptr)(int x, int y), int x, int y) {
    return ptr(x, y);
}

9

If a function pointer does not match the function assigned to it, the compiler issues a warning.

In [417]:
#include <stdio.h>

int add(int x, int y);
int (*ptr)(int x) = add;

int main() {
    printf("%d", ptr(5));     /* second argument contains garbage */
    return 0;
}

int add(int x, int y) {
    return x + y;
}

    4 | int (*ptr)(int x) = add;
      |                     ^~~


1572403669

### The `restrict` keyword (C99+) <a class="anchor" id="the-restrict-keyword"></a>

The `restrict` specifier may be used in pointer declarations and definitions. It is a promise to the compiler that, for the lifetime of the pointer, no other pointer (technically, no other identifier) will be used to access the memory location to which it points. This allows the compiler to make optimizations that would not otherwise have been possible. If the promise is broken, the behavior is undefined. It is usually used with function parameters in definitions and declarations.

In [None]:
            /* adding to '*ptrA' may have altered '*val', must be re-read from memory */
            
void updatePtrs(unsigned char *ptrA, unsigned char *ptrB, unsigned char *val) {
  *ptrA += *val;
  *ptrB += *val;
}

In [None]:
            /* '*val' cannot be altered by changing '*ptrA', will not be re-read from memory */
            
void updatePtrs(unsigned char *ptrA, unsigned char *ptrB, unsigned char * restrict val) {
  *ptrA += *val;
  *ptrB += *val;
}

## Arrays <a class="anchor" id="arrays"></a>

An *array* is a contiguous sequence of elements in memory, all of the same data type and therefore of the same size.

In [572]:
#include <stdio.h>

int arr[5];        /* {0, 0, 0, 0, 0} */

int main() {
    int arr2[5];                         /* garbage values  */
    int arr3[] = {1, 2, 3, 4, 5};        /* {1, 2, 3, 4, 5} */
    int arr4[5] = {1, 2, 3};             /* {1, 2, 3, 0, 0} */
    int arr5[5] = {0};                   /* {0, 0, 0, 0, 0} */
    
                                         /* C99+ */
    int arr6[] = {1, [2]=3, [4]=5};      /* {1, 0, 3, 0, 5} */
    int arr7[5] = {1, [1]=2, [3]=4};     /* {1, 2, 0, 4, 0} */
}

In [703]:
#include <stdio.h>

int arr[];         /* no size and no initializer: assumed to have one element (compiler warns) */

int main() {
    return 0;
}

    3 | int arr[];
      |     ^~~


An *array element* is accessed using brackets.

In [575]:
#include <stdio.h>

int arr[5] = {1, 2, 3, 4, 5};

int main() {
    register unsigned char i;
    for (i = 0; i < 5; i++) {
        printf("%d ", arr[i]);
    }
}

1 2 3 4 5 

In most expressions, the name of an array converts to a *fixed* pointer to the first element of the array. The exceptions are when it is the operand of `sizeof` or of the unary `&` operator.

In [577]:
#include <stdio.h>

int arr[5] = {1, 2, 3, 4, 5};

int main() {
    register unsigned char i;
    for (i = 0; i < 5; i++) {
        printf("%d ", *(arr+i));      /* equivalent to arr[i] */
    }
}

1 2 3 4 5 

An array can be passed to a function in various ways.

In [110]:
void increment(int *arr, unsigned long len);  /* most (syntactically) consistent way */
                                              /* just a pointer */

int main() {
    int arr[5] = {0};
    increment(arr, 5);
    return 0;
}

void increment(int *arr, unsigned long len) {  /* nothing to return, hence 'void' */
    register unsigned long i;
    for(i = 0; i < len; i++) {
        *(arr+i) += 1;
    }
}

In [717]:
void increment(int arr[], unsigned long len);      /* most preferred */

int main() {
    int arr[5] = {0};
    increment(arr, 5);
    return 0;
}

void increment(int arr[], unsigned long len) {     /* nothing to return, hence 'void' */
    register unsigned long i;
    for(i = 0; i < len; i++) {
        arr[i] += 1;
    }
}

In [745]:
void increment(int arr[5]);      /* least preferred */
                                 /* but allows compiler to issue some warnings */

int main() {
    int arr[4] = {0};
    increment(arr);
    return 0;
}

void increment(int arr[5]) {
    register unsigned long i;
    for(i = 0; i < 5; i++) {
        arr[i] += 1;
    }
}

/tmp/tmpy2ajkrpj.c: In function ‘main’:
    6 |     increment(arr);
      |     ^~~~~~~~~~~~~~
/tmp/tmpy2ajkrpj.c:6:5: note: referencing argument 1 of type ‘int *’
/tmp/tmpy2ajkrpj.c:10:6: note: in a call to function ‘increment’
   10 | void increment(int arr[5]) {
      |      ^~~~~~~~~


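*Note:* When passing an array to a function, the array is never copied in memory (that is, it is never passed by value). Only a pointer to its first element is passed, regardless of which of the above forms is used.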

When declaring a global array that is defined in another file, it is preferred to use empty brackets.

In [769]:
//%cflags: .jupyter/global_arr.c

#include <stdio.h>

extern int arr[];

int main() {
    printf("%d", arr[4]);
    return 0;
}

5

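*Note:* Declaring the array as a pointer instead (`extern int *arr;`) does not match the definition. The behavior is undefined, and in practice it usually results in a runtime error, because the array's contents are then interpreted as an address.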

A *2-dimensional* array, the simplest multi-dimensional array, is an array of arrays.

In [50]:
#include <stdio.h>

int arrA[2][3] = {1, 2, 3, 4, 5};         /* arrA[1][2] initialized to zero */
int arrB[2][3] = {{1, 2}, {4, 5}};        /* arrB[0][2] and arrB[1][2] initialized to zero */

int arrC[][3] = {1, 2, 3, 4, 5};                 /* {{1, 2, 3}, {4, 5, 0}} */
int arrD[][3] = {{1, 2}, {4, 5}, {7, 8}};        /* {{1, 2, 0}, {4, 5, 0}, {7, 8, 0}} */

int arrE[2][3] = {[0][0] = 1};                     /* {{1, 0, 0}, {0, 0, 0}} */
int arrF[2][3] = {{[0] = 1}};                      /* {{1, 0, 0}, {0, 0, 0}} */

int main() {
    printf("%d\n", arrD[2][0]);
    printf("%d\n", *(*(arrD+2)+0));        /* equivalent-syntax */
    return 0;
}

7
7


*Note:* Declarations and definitions of multi-dimensional arrays must have bounds for all dimensions except the first.

In [692]:
#include <stdio.h>

int arr[2][3] = {1, 2, 3, 4, 5, 6};

int main() {
    printf("%lx\n", (unsigned long int) arr);
    printf("%lx\n", (unsigned long int) arr[0]);
    
    printf("%lx\n", (unsigned long int) (arr+1));
    printf("%lx\n", (unsigned long int) (arr[0]+1));
    
    return 0;
}

7f94dc331030
7f94dc331030
7f94dc33103c
7f94dc331034


*Note:* Since the element type of a multi-dimensional array is itself an array of a fixed size, adding one to the array name advances the address by the size of one whole inner array (above, 12 bytes for three `int` elements), whereas adding one to `arr[0]` advances it by a single element (above, 4 bytes).

A 2D array can be passed to a function in various ways.

In [746]:
void increment(int (*arr)[3], unsigned long len);    /* most (syntactically) consistent way */
                                                     /* pointer to 3-sized array */

int main() {
    int arr[2][3] = {0};
    increment(arr, 2);
    return 0;
}

void increment(int (*arr)[3], unsigned long len) {   /* nothing to return, hence 'void' */
    register unsigned long i;
    register unsigned long j;
    for(i = 0; i < len; i++) {
        for(j = 0; j < 3; j++) {
            (*(arr+i))[j] += 1;
        }
    }
}

In [729]:
void increment(int arr[][3], unsigned long len);     /* most preferred way */

int main() {
    int arr[2][3] = {0};
    increment(arr, 2);               /* only the first bound is passed */
    return 0;
}

void increment(int arr[][3], unsigned long len) {
    register unsigned long i;
    register unsigned long j;
    for(i = 0; i < len; i++) {
        for(j = 0; j < 3; j++) {     /* second bound is fixed by the parameter type */
            arr[i][j] += 1;
        }
    }
}

In [109]:
void increment(int arr[2][3]);     /* least preferred way */
                                   /* but allows compiler to issue some warnings */

int main() {
    int arr[2][3] = {0};
    increment(arr);
    return 0;
}

void increment(int arr[2][3]) {    /* nothing to return, hence 'void' */
    register unsigned long i;
    register unsigned long j;
    for(i = 0; i < 2; i++) {
        for(j = 0; j < 3; j++) {
            arr[i][j] += 1;
        }
    }
}

To get the first bound of an array, a playful trick may be employed.

In [102]:
#include <stdio.h>

int arr[] = {1, 2, 3, 4, 5, 6, 7, 8, 9};

            /* '&arr' yields an 'int (*)[9]' pointer, with the same address as 'arr' */
            /* adding one to it advances by the size of the whole 9-element array */
            /* dereferencing then yields an 'int[9]', which converts to an 'int *' */
            /* subtracting 'arr', also an 'int *', yields the number of elements */

int main() {
    printf("Length: %lu", (unsigned long) (*(&arr+1) - arr));   /* cast to match '%lu' */
}

Length: 9

In [103]:
#include <stdio.h>

int arr[][3] = {1, 2, 3, 4, 5, 6, 7, 8, 9};

int main() {
    printf("Length: %lu", (unsigned long) (*(&arr+1) - arr));   /* number of rows */
}

Length: 3

*Note:* An array itself contains no padding: its size is exactly the number of elements multiplied by the element size. A compiler may, however, leave unused bytes between separate objects to satisfy alignment requirements, trading memory usage for faster access, so an array with an odd size in bytes may cost more memory than its size suggests.

## Structures and Unions <a class="anchor" id="structures-and-unions"></a>

### Structures <a class="anchor" id="structures"></a>

The `struct` keyword is used to define a *structure* template. It may be defined in global or local scope. It does not allocate any memory.

In [None]:
struct student {
    unsigned long long id;
    unsigned char name[30];
    float gpa;
};

You can define a variable with a structure data type by using the `struct` keyword. A structure data type may be initialized in the same way as an array. The `.` operator is used to access structure members.

In [115]:
#include <stdio.h>

struct student {
    unsigned long long id;
    unsigned char name[30];
    float gpa;
};
                   /* structure members initialized  */
int main() {
    struct student hazem = {12345678u, {'H', 'a', 'z', 'e', 'm', '\0'}, 0.98f};
    printf("%s", hazem.name);
}

Hazem

In [127]:
#include <stdio.h>

struct student {
    unsigned long long id;
    unsigned char name[30];
    float gpa;
};
                /* some structure members initialized, rest zero-ed */
int main() {
    struct student hazem = {12345678u};
    printf("%llu\n", hazem.id);
    printf("%f", hazem.gpa);
}

12345678
0.000000

In [128]:
#include <stdio.h>

struct student {
    unsigned long long id;
    unsigned char name[30];
    float gpa;
};
                /* zero initialization of local structure */
int main() {
    struct student hazem = {0};     /* '{}' is standard only from C23 (a GNU extension before) */
    printf("%llu\n", hazem.id);
    printf("%f", hazem.gpa);
}

0
0.000000

In [131]:
#include <stdio.h>

struct student {
    unsigned long long id;
    unsigned char name[30];
    float gpa;
};
                              /* feature in C99+ */
int main() {
    struct student hazem = {.gpa = 0.98f};
    printf("%llu\n", hazem.id);
    printf("%f", hazem.gpa);
}

0
0.980000

In C99+, it is permitted to define a *flexible* array member at the end of a structure.

In [None]:
struct student {
    unsigned long long id;
    float gpa;
    unsigned char name[];
};

*Note:* When a structure with a flexible array member is allocated dynamically, the space for the array must be added to the allocation size explicitly (dynamic memory allocation is discussed later).

You can define a variable with a structure data type while defining the structure template. If the structure template is of one-time use, then it need not be given a name; this is permitted in all versions of C.

In [162]:
#include <stdio.h>

struct {
    unsigned long long id;
    unsigned char name[30];
    float gpa;
} hazem = {123u};

int main() {
    printf("%llu\n", hazem.id);
    printf("%f\n", hazem.gpa);
    return 0;
}

123
0.000000


A pointer variable with an associated structure data type may use the `->` operator to directly access members, without explicit dereferencing: `ptr->member` is shorthand for `(*ptr).member`.

In [166]:
#include <stdio.h>

struct student {
    unsigned long long id;
    unsigned char name[30];
    float gpa;
} hazem = {12345678u, .gpa=0.98};

int main() {
    struct student *ptr = &hazem;
    printf("%llu\n", ptr->id);
    printf("%f\n", (*ptr).gpa);     /* equivalent */
    return 0;
}

12345678
0.980000


A *bit field* member of a structure occupies a specified number of bits. Consecutive bit field members are packed into as few bytes as the layout rules of the compiler allow.

In [176]:
#include <stdio.h>

struct REG {               /* uses 8 bits; 'sizeof' is typically that of 'unsigned int' */
    unsigned int Pin0: 1;
    unsigned int Pin1: 1;
    unsigned int Pin2: 1;
    unsigned int Pin3: 1;
    unsigned int Pin4: 1;
    unsigned int Pin5: 1;
    unsigned int Pin6: 1;
    unsigned int Pin7: 1;
};

int main() {
    struct REG reg = {0};
    unsigned char *ptr = (unsigned char *) &reg;     /* view the first byte of the structure */
    reg.Pin1 = 1;
    printf("%u\n", *ptr);                            /* bit order is compiler-dependent */
    return 0;
}

2


*Note:* Because the layout of bit fields is compiler-dependent, bit fields are rarely used where the exact layout matters.

Because most compilers insert padding between consecutive structure members to keep each member properly aligned, enabling faster access at the expense of increased memory usage, it is recommended to write structure members in the order of decreasing data type size.

In [None]:
struct student {
    unsigned char name[8];
    unsigned long long id;
    float gpa;
};

### Unions <a class="anchor" id="unions"></a>

A *union* is similar to a structure in syntax and use, except members overlap each other in memory. The size of a union is at least the size of its largest member (padding may be added for alignment).

In [188]:
#include <stdio.h>

union number {
    int integer;
    float floating;
} num = {8};

int main() {
    printf("%d", num.integer);
    return 0;
}

8

In [189]:
#include <stdio.h>

union number {
    int integer;
    float floating;
} num = {.floating = 0.879};           /* C99+ */

int main() {
    printf("%f", num.floating);
    return 0;
}

0.879000

## Enumerations <a class="anchor" id="enumerations"></a>

An *enumeration* is defined using the `enum` keyword. It gives identifiers to integer constants, for ease of use by programmers. It may be defined in local or global scope.

In [None]:
enum Days {Sun, Mon, Tues, Wed, Thurs, Fri, Sat};     /* 0, 1, 2, 3, 4, 5, 6 */

In [None]:
enum Days {Sun=-2, Mon, Tues, Wed, Thurs=5, Fri, Sat};     /* -2, -1, 0, 1, 5, 6, 7 */

You can define a variable with an enumeration data type, also using the `enum` keyword. As with structures, the enumeration need not be given a name; this is permitted in all versions of C.

In [None]:
enum {Sun=-2, Mon, Tues, Wed, Thurs=5, Fri, Sat} day;

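*Note:* Enumeration constants have the type `int`. The enumeration data type itself is represented by an integer type chosen by the compiler, commonly `int` or `unsigned int`.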

Enumeration constants are used directly, through their identifiers. Hence, two enumeration constants within the same scope may not share an identifier, even if they belong to different enumerations.

In [210]:
enum Single_Day {Sun};
enum Just_a_Day {Sun};

int main() {
    return 0;
}

/tmp/tmppmky3pj6.c:2:18: error: redeclaration of enumerator ‘Sun’
    2 | enum Just_a_Day {Sun};
      |                  ^~~
/tmp/tmppmky3pj6.c:1:18: note: previous definition of ‘Sun’ with type ‘enum Single_Day’
    1 | enum Single_Day {Sun};
      |                  ^~~
[C kernel] GCC exited with code 1, the executable will not be executed

## Additional Core Features <a class="anchor" id="additional-core-features"></a>

### The `volatile` keyword <a class="anchor" id="the-volatile-keyword"></a>

The `volatile` specifier may be used in variable declarations and definitions, and in casts. It tells the compiler that the value at a particular memory location may change without any action being taken by the code the compiler finds nearby. The compiler will then not optimize away, or cache in a register, reads from and writes to this memory location. Practical examples of such memory locations include:

* Global variables shared between threads (although `volatile` alone provides neither atomicity nor synchronization).
* Global variables modified by *Interrupt Service Routines* (ISRs).
* Memory-mapped peripheral registers.

In [None]:
volatile int x;         /* volatile variable 'x' */
int volatile x;         /* equivalent */

In [None]:
volatile int *x;                 /* non-volatile pointer, volatile memory */
int * volatile x;                /* volatile pointer, non-volatile memory */
volatile int * volatile x;       /* volatile pointer, volatile memory */

In [None]:
*((volatile unsigned char *) 0x12345678) |= 0b00000001;     /* bit mask applied to a memory-mapped */ 
                                                            /* register with a '0x12345678' address */