# Systems Programming

## C Basics

In [1]:
// Hello World in C
#include <stdio.h>
int main() {
    printf("Hello, World!\n");
    return 0;
}

Hello, World!


### Pre-processor commands
Lines that begin with `#` are commands for the C pre-processor. The line `#include <stdio.h>` finds the header file `stdio.h` and includes its contents before compilation. This header declares the standard input and output library functions, such as `printf`.

### The `main()` function
All C programs have an entry function called `main()`. This is called by the runtime system to start the program. Every C program must have exactly one `main()`, which must return an integer. Only code reachable from `main()`, directly or through the functions it calls, will be executed.

### The `printf()` function
The `printf()` function is used to print formatted text to the console. 

In [2]:
#include <stdio.h>
int main() {
    printf("We've got rectal bleeding.\nWhat, all of you?");
    return 0;
}

We've got rectal bleeding.
What, all of you?

The `printf()` function  does not automatically add a new line - the newline character `\n` must be used to move to the next line. 

In [3]:
#include <stdio.h>
int main() {
    printf("We've got rectal bleeding.");
    printf("What, all of you?");
}

We've got rectal bleeding.What, all of you?

In [4]:
#include <stdio.h>
int main() {
    printf("We've got rectal bleeding.\n");
    printf("What, all of you?");
}

We've got rectal bleeding.
What, all of you?

The `printf()` function uses a number of format specifiers to control the format of the output. These are used in the first parameter, which describes how the remaining parameters are to be formatted:
- `%d` - signed decimal (`int`)
- `%u` - unsigned decimal
- `%o`, `%x` - octal, hexadecimal
- `l` - length modifier for `long` integers (on 64-bit Linux a `long` is 8 bytes, compared with 4 bytes for `int`). It is not a specifier on its own and must be combined with one of the specifiers above, e.g. `%ld`, `%lx`
- `%f` - floating point
- `%.nf` - floating point with `n` decimals
- `%e` - floating point in exponent form
- `%c`, `%s` - character, string

In [5]:
#include <stdio.h>
int main() {
    int a = -15;
    long b = 999999999999999999;
    float c = 3.14159;
    char d = 'f';
    char e[] = "lupus";
    
    printf("Signed int: %d\n", a);
    printf("Unsigned int: %u\n", a);
    printf("Octal: %o\n", a);
    printf("Long: %ld\n", b);
    printf("Float: %f\n", c);
    printf("Float (2d.p): %.2f\n", c);
    printf("Exponent: %e\n", c);
    printf("Char: %c\n", d);
    printf("String: %s\n", e);
    
    return 0;
}

Signed int: -15
Unsigned int: 4294967281
Octal: 37777777761
Long: 999999999999999999
Float: 3.141590
Float (2d.p): 3.14
Exponent: 3.141590e+00
Char: f
String: lupus


### The function `return` statement
The `return` statement is used to immediately exit a function, optionally sending a value back to the caller.

The return value from the `main()` function is special, with programs usually returning a zero value to indicate they have exited normally. 
If there is no `return` statement in the `main()` function, this will not cause a problem at compile-time: since C99, reaching the end of `main()` is equivalent to `return 0;`.

If the return value is of the wrong type, this may cause a warning at compile-time, or an error at run-time. 

Separately, pre-processor directives such as `#ifdef` and `#ifndef` allow the compiler to include or skip parts of the code depending on whether certain macros are defined. This can be very useful for debugging.

### Variables
Variables and constants are the basic data objects manipulated by a program.

Declarations declare the variables used, their type and possibly an initial value, e.g. `int x;` or `float y = 0.67;`

Expressions combine variables and constants to form new values, e.g. `3*5+a` in `int b = 3*5+a;`. Expressions are evaluated according to operator precedence (for arithmetic, essentially BIDMAS). 

### Data types
C is a statically typed language, meaning every variable must be declared with a type, usually one of:
- `char`: a single byte, often used to store a character
- `short`: an integer type, represents small whole numbers
- `int`: an integer type, represents whole numbers
- `long int`, `long long int`: an integer type, represents large or very large whole numbers
- `float`: single precision floating point number
- `double`, `long double`: double precision and extended precision floating point numbers

The size of each type can vary depending on the system/compiler.

On 64-bit Linux:
- `char` is 1 byte
- `short int` is 2 bytes
- `int` is 4 bytes
- `long int` is 8 bytes

The `sizeof` operator can be used to check the size in bytes of a data type or variable. Its result has type `size_t`, which is printed using `%zu`.

In [6]:
#include <stdio.h>
int main() {
    printf("Char: %zu\n", sizeof(char));
    printf("Short: %zu\n", sizeof(short));
    printf("Int: %zu\n", sizeof(int));
    printf("Long: %zu\n", sizeof(long));
    return 0;
}

Char: 1
Short: 2
Int: 4
Long: 8


#### `signed` and `unsigned`
`signed` and `unsigned` apply to `char` or integer types.

`unsigned` integers are 0 or positive, whereas half of the range of `signed` integers is negative:
- a 1 byte `signed char` stores integers in the range [-128,127]
- a 1 byte `unsigned char` stores integers in the range [0,255]

`<limits.h>` and `<float.h>` specify what limits apply on a given system; they are system and architecture dependent. 

Unsigned arithmetic is performed modulo 2<sup>n</sup>, so overflows 'wrap around' automatically.

In [7]:
#include <stdio.h>
int main() {
    unsigned char x = 255;
    x++; // x = (255 + 1) % 2^8 = 256 % 256 = 0
    printf("%u", x);
    return 0;
}

0

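Signed integer overflow is undefined behaviour: what happens is up to the compiler (most produce two's complement wrapping in practice, but this cannot be relied upon). In the example below, `x++` is evaluated as an `int` and the result 128 is then converted back to `signed char`; the result of that out-of-range conversion is implementation-defined. 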

In [8]:
#include <stdio.h>
int main() {
    signed char x = 127;
    x++;
    printf("%d", x);
}

-128

#### Character constants

A character constant is a single character (including escape characters) enclosed in single quotes, e.g. `'A'`, `'3'`, `'\n'`. 

These are stored as integer values - specifically, the integer code for that character in ASCII (or another character set), e.g. `'A'` = 65 in ASCII. 

In [9]:
#include <stdio.h>
int main() {
    printf("Printing using %%c: %c\n", 'A');
    printf("Printing using %%d: %d", 'A');
}

Printing using %c: A
Printing using %d: 65

#### String constants

String constants are zero or more characters enclosed in double quotes, e.g. `"Hello"`.

They are stored as an array of `char`s with a null character `\0` at the end (not to be confused with `NULL`, which is a pointer value). 

For example,  `char a[]="Hello";` is the same as `char a[]={'H','e','l','l','o','\0'};`

#### Enumeration

An enumeration (`enum`) creates a new type that can only take one of several named constant values. 

For example, `enum Suit {CLUBS, DIAMONDS, HEARTS, SPADES};` defines a new type `enum Suit`. Any variable of this type can have one of four possible values: `CLUBS`, `DIAMONDS`, `HEARTS` and `SPADES`. A variable `s` of this type can be declared using `enum Suit s;`.

Each `enum` named constant value is replaced with an integer value by C during compilation. By default, the first value is 0 and each subsequent value is incremented by 1. 
In the previous example, `CLUBS`, `DIAMONDS`, `HEARTS` and `SPADES` represent `0`, `1`, `2` and `3` respectively. 

In [10]:
#include <stdio.h>
int main() {
    enum Suit {CLUBS, DIAMONDS, HEARTS, SPADES};
    printf("Value of CLUBS: %d\n", CLUBS);
    printf("Value of HEARTS: %d\n", HEARTS);
    enum Suit s = DIAMONDS;
    printf("Value of s: %d", s);
}

Value of CLUBS: 0
Value of HEARTS: 2
Value of s: 1

Values for the enumeration constants can be set manually, and can be arbitrary integers in no particular order. 
Multiple constants are allowed to have the same value.
When no value is specified for an enumeration constant, its value is one greater than the value of the previous constant. 

In [11]:
#include <stdio.h>
int main() {
    enum names {ANTIONE=24, ADAM=15, MARCUS, ADRIEN=3};
    printf("Value of ANTIONE: %d\n", ANTIONE);
    printf("Value of ADAM: %d\n", ADAM);
    printf("Value of MARCUS: %d\n", MARCUS);
    printf("Value of ADRIEN: %d\n", ADRIEN);
}

Value of ANTIONE: 24
Value of ADAM: 15
Value of MARCUS: 16
Value of ADRIEN: 3


### True/False and comparison

Traditionally, C did not have a boolean type, instead just using `int`:
- 0 is false
- Any other `int` is true

Comparisons `<`, `<=`, `==`, `>=`, `>`, `!=` will evaluate to:
- 1 if they hold
- 0 if they don't

A boolean type (`_Bool`) was introduced in C99. The header `<stdbool.h>` defines the more convenient names `bool`, `true` and `false`, but the integer convention can still be used if preferred. 

### Statements and compound statements

A statement in C is a single instruction terminated with a semicolon. 

```c
printf("Hello World");
```

A compound statement is a set of statements enclosed in curly braces `{}`. A statement can always be replaced by a compound statement.

```c
{
printf("Hello ");
printf("World\n");
}
```

Note that in C, formatting does not matter, but it is useful for making code readable. 

### Iteration

There are three iteration statements in C:
- the `while` statement is used for loops whose controlling expression is tested before the loop body is executed
- the `do` statement is used if the expression is tested after the loop body is executed
- the `for` statement is convenient for loops that increment or decrement a counting variable or iterator

#### `while`
The `while` statement has the following form:
```c
while ( expression ) {
    statement
}
```
- `expression` is the controlling expression
- `statement` is the loop body
- the expression is evaluated and if it is non-zero (true), the body is executed
- the expression is tested before the loop body begins
- if there is only one `statement` in the loop body, the enclosing `{}` can be omitted

In [12]:
#include <stdio.h>

int main(){
    int i = 1;
    while(i<5){
        printf("Verstappen has %d championships\n", i);
        i = i+1;
    }
    
    printf("Verstappen will soon have %d championships", i);
    return 0;
}

Verstappen has 1 championships
Verstappen has 2 championships
Verstappen has 3 championships
Verstappen has 4 championships
Verstappen will soon have 5 championships

#### `do`
The `do` statement has the following form:
```c
do {
    statement
} while ( expression );
```
- `expression` is the controlling expression
- `statement` is the loop body
- the expression is evaluated and if it is non-zero (true), the body is executed again
- the expression is tested after the loop body ends

In [13]:
#include <stdio.h>

int main(){
    int i = 1;
    do{
        printf("Verstappen has %d championships\n", i);
        i = i+1;
    } while(i<5);
    
    printf("Verstappen will soon have %d championships", i);
    return 0;
}

Verstappen has 1 championships
Verstappen has 2 championships
Verstappen has 3 championships
Verstappen has 4 championships
Verstappen will soon have 5 championships

#### `for`
The `for` statement has the following form:
```c
for ( expr1 ; expr2 ; expr3 ) {
    statement
}
```
- `expr1` - initialisation
- `expr2` - conditional
- `expr3` - increment

In [14]:
#include <stdio.h>

int main() {
    int i;
    for (i=1; i<5; i++) {
        printf("Verstappen has %d championships\n", i);
    }
    printf("Verstappen will soon have %d championships", i);
    return 0;
}

Verstappen has 1 championships
Verstappen has 2 championships
Verstappen has 3 championships
Verstappen has 4 championships
Verstappen will soon have 5 championships

#### `break`
The `break` statement causes the innermost enclosing loop to be exited immediately. 

In [15]:
#include <stdio.h>

int main() {
    int i;
    for (i=1; i<5; i++) {
        break;
        printf("Verstappen has %d championships\n", i);
    }
    printf("Verstappen will soon have %d championships", i);
    return 0;
}

Verstappen will soon have 1 championships

#### `continue`
The `continue` statement skips the rest of the loop body and causes the next iteration of the loop to begin. 
- in the case of a `while` or `do` loop, the test part is executed immediately
- in the case of a `for` loop, control first passes to the increment step

In [16]:
#include <stdio.h>

int main() {
    int i;
    for (i=1; i<5; i++) {
        continue;
        printf("Verstappen has %d championships\n", i);
    }
    printf("Verstappen will soon have %d championships", i);
    return 0;
}

Verstappen will soon have 5 championships

### Conditionals 

#### `if-else`
`if` allows a choice between two alternatives by testing an expression. An `if` statement can have an `else` clause:
```c
if ( expr1 ) {
    statement1
}
else {
    statement2
}
```
When executed, `expr1` is evaluated;
- if `expr1` is non-zero, `statement1` is executed 
- otherwise `statement2` (if present) is executed

In [17]:
#include <stdio.h>

int main() {
    for (int i=1; i < 5; i++) {
        if (i % 2 == 0) {
            printf("%d is even\n", i);
        }
        else {
            printf("%d is odd\n", i);
        }
    }
    return 0;
}

1 is odd
2 is even
3 is odd
4 is even


#### `switch`
A `switch` statement allows a choice between different blocks of code based on the value of an expression. 

It has the following form:
```c
switch (expression) {
    case value1:
        // statements
        break;

    case value2:
        // statements
        break;

    default:
        // statements
        break;
}
```
If there is no `break` statement in a block, then execution 'falls through' - the code in subsequent case blocks is executed even if a matching case has already been found. 

The `default` block is executed if the expression does not match any of the cases.



### `x++` vs `++x`

Both `x++` and `++x` can be used to increment a variable `x`, i.e. they both mean `x=x+1`. However, there is a subtle difference between the two:
- `x++` returns the value of `x` first, then increments
- `++x` increments first, then returns the value of `x`

In [18]:
#include <stdio.h>

int main() {
    int x = 5;
    int y = x++;  // y = 5, x incremented to 6
    int z = ++x;  // x incremented to 7, z = 7
    printf("x=%d, y=%d, z=%d\n", x, y, z);
    return 0;
}

x=7, y=5, z=7


### Functions
Functions encapsulate code in a convenient way so that it can be reused, organised and understood more easily. 

A function definition can appear anywhere in the file, as long as a declaration of the function appears before the function is first used. A definition also acts as a declaration. 

#### Function declarations
Functions can be declared before they are defined, using a function declaration:
```
    [return-type] [function-name] ( [parameters] );     
```
For example,
```c
    int add (int a, int b);
```
Note that the parameters do not need to be named in the declaration - only their types must be included. Parameter names are only required in the function definition.

Therefore, the example above could also be written as:
```c
    int add (int, int);
```

Function declarations are often placed in a `.h` header file.

#### Call-by-value
Function parameters in C are passed using call-by-value semantics; the values of the arguments are copied into the parameter variables of the function. A function cannot affect the value of its arguments. 

In [19]:
#include <stdio.h>

int timesFive(int a) {
    a *= 5;
    return a;
}

int main() {
    int x = 3;
    int y = timesFive(x);
    printf("Value returned: %d\n", y);
    printf("Value of x: %d", x);
    return 0;
}

Value returned: 15
Value of x: 3

#### Organisation
Functions can be organised by using `.h` header files and `.c` source files. An example setup is given below:

`timesFive.h`
```c
    int timesFive(int);
```

`timesFive.c`
```c
    #include "timesFive.h"

    int timesFive(int a) {
        a *= 5;
        return a;
    }
```

`main.c`
```c
    #include <stdio.h>
    #include "timesFive.h"

    int main() {
        int x = 5;
        int y = timesFive(x);
        printf("%d times 5 is %d\n", x, y);
        return 0;
    }
```

If the same header is included more than once in the same compilation unit, any definitions it contains (such as `struct` types) are repeated, which can cause redefinition errors. To prevent this, header guards can be used in header files to conditionally include the contents of the header only if it hasn't been included already.

For example,

`timesFive.h`
```c
    #ifndef TIMESFIVE_H
    #define TIMESFIVE_H
    int timesFive(int);
    #endif
```

### Data structures

#### `struct`
A `struct` is a way to group different variables together into one unit. It is the closest thing C has to a class from an object-oriented language. 

Each variable inside the struct is called a member, and members can all have different types. A member can be accessed by using the name of the struct variable followed by a dot and the member name.

In [20]:
#include <stdio.h>

struct point {
    int x;
    int y;
};

int main() {
    struct point a_point = {5, 6};
    printf("Struct initialised to: %d and %d\n", a_point.x, a_point.y);
    
    a_point.x = 4;
    a_point.y = 3;
    printf("Struct values after assignment: %d and %d\n", a_point.x, a_point.y);

    return 0;
}

Struct initialised to: 5 and 6
Struct values after assignment: 4 and 3


Each structure (`struct`) represents a new scope; 
any names declared in that scope won't conflict with other names in a program.

The `.` operator used to access members has the highest precedence level in C, which it shares with the other postfix operators such as `[]`, `()` and `->`. 

If two structures have the same type, one can be assigned to the other, e.g. `point2 = point1;`.
This essentially copies the entire contents of `point1` into `point2` member-by-member, e.g. `point1.x` is copied into `point2.x`, `point1.y` into `point2.y` and so on. 

Structures can be nested within other structures. 

#### `union`

A `union`, like a `struct`, consists of one or more members, possibly of different types.

However, it differs in that the compiler allocates only enough space for the largest of the members, which overlay each other within this space:
- If in a `struct` there is an `int` and a `double`, the memory for that `struct` is at least the space for an `int` plus a `double` (the compiler may add padding). 
- If in a `union` there is an `int` and a `double`, the memory for that `union` is the space for only a `double`.

Therefore, in a `union`, assigning a new value to one member alters the values of the other members as well. 

A structure `s` and a union `u` with the same members differ in just one way:
- The members of `s` are stored at different addresses in memory.
- The members of `u` are stored at the same address.

Members of a union are accessed in the same way as members of a structure.

In [21]:
#include <stdio.h>

union {
  int i;
  double d;
} u; //declares variable u of anonymous union type

int main() {
    u.i = 82;
    printf("u.i = %d, u.d = %f\n", u.i, u.d );
    u.d = 74.8;  //changing u.d corrupts u.i
    printf("u.i = %d, u.d = %f\n", u.i, u.d );
    return 0;
}


u.i = 82, u.d = 0.000000
u.i = 858993459, u.d = 74.800000


The properties of unions are almost identical to the properties of structures.

By default, only the first member of a union can be given an initial value:
```c
union {
  int i;
  double d;
} u = {0};
```
Here, the member `i` is initialised to 0.

Designated initialisers can also be used with unions, so that the member to be initialised is named explicitly. 
Only one member can be initialised, but it doesn't have to be the first one.
```c
union {
  int i;
  double d;
} u = {.d = 10.0};
```
Unions can be used to save space in structures. 

### `typedef`

`typedef` can be used to assign names to types:
```c
typedef unsigned char byte;
byte b1 = 12;
```

This can also be used with `struct`s and `union`s:

In [22]:
typedef struct coords {
  int x;
  int y;
} point;


typedef union id_thing {
  int i;
  double d;
} number;

int main() {
    point p1 = {5, 4}; // instead of struct coords p1 = {5, 4};
    number n = {.d = 10.0}; // instead of union id_thing n = {.d = 10.0};
    return 0;
}

### Input and output

There are several methods of I/O (input/output) in C, e.g.:
- terminal
- files
- arguments

#### Reading from a file
- `fopen` opens a file
- `fclose` closes a file
- `fread` reads from a file
- `fwrite` writes to a file

#### Passing arguments to `main`
In order to pass arguments to `main` from the command line, the `main` declaration can be modified to:
```c
int main(int argc, char **argv);
```
- `argc` is the number of arguments passed to the program (including the program name itself)
- `argv` is an array of C-strings (`char *`) containing the actual arguments
  - `argv[0]` is always the program name or path
  - `argv[1]` is the first argument passed by the user

### Pointers
Variables are a logical name for an allocated area of memory which has been assigned to store a value of a certain type. 

The memory address of a variable can be accessed using `&`. 

For example, suppose we have executed `int i = 10;`. In memory, this could be stored as 

![memory example](mem_example.png)

Then, the memory address of `i` is `&i` and has a value of 4. Note that real addresses are usually given in hexadecimal and look something like 0x7211ac3f7028.

The symbol `*` can be used to declare that a variable is a pointer to a value of the type on the left. 

For example, `int *p` would declare variable `p` as a pointer to a memory location containing an integer. 

`int *p = &i` would store the memory address of `i` in the variable `p`, so for the example above `p = 4`.

`*` can also be used to access and modify the value that the pointer points to, e.g. `*p = 7`, `*p = *p + 1`. It is known as the indirection operator when used in this way.

In [23]:
#include <stdio.h>

int main() {
    int x = 10;
    int *p = &x;
    printf("Value of p (memory address): %p\n", p);
    printf("Value at p (in memory address): %d\n", *p);
    *p = *p + 1;
    printf("New value at p: %d\n", *p);
    printf("Value of x: %d", x);
    return 0;
}

Value of p (memory address): 0x7fffd0aae20c
Value at p (in memory address): 10
New value at p: 11
Value of x: 11

It is important not to apply the indirection operator to an uninitialised pointer variable, as this causes undefined behaviour.
It can be especially dangerous to assign a value through an uninitialised pointer, as this may overwrite arbitrary memory, which can cause unpredictable behaviour and segmentation faults.

Assignment can be used to copy pointers of the same type, e.g.
```c
int i, *p, *q;
p = &i;
q = p;
```

Here, both `p` and `q` are pointers to the memory address of `i`, and both can be used to modify the value using `*p` or `*q`. Any number of pointers can point to the same object. 

It's also possible to have pointers to pointers.

#### Pointers as arguments
Since C uses [call-by-value](#Call-by-value) by default, pointers need to be used to modify variables outside of the function's scope. 

The swap function below does not work, since the arguments are passed by value:

In [24]:
#include <stdio.h>

void swap(int a, int b) {
    int temp = a;
    a = b;
    b = temp;
}

int main() {
    int x = 4;
    int y = 81;
    printf("Original values: x=%d, y=%d\n", x, y);
    swap(x,y);
    printf("After calling swap(x,y): x=%d, y=%d", x, y);
    return 0;
}

Original values: x=4, y=81
After calling swap(x,y): x=4, y=81

Using pointers instead to pass the addresses of the variables as the arguments fixes this:

In [25]:
#include <stdio.h>

void swap(int *a, int *b) {
    int temp = *a;
    *a = *b;
    *b = temp;
}

int main() {
    int x = 4;
    int y = 81;
    printf("Original values: x=%d, y=%d\n", x, y);
    swap(&x,&y);
    printf("After calling swap(x,y): x=%d, y=%d", x, y);
    return 0;
}

Original values: x=4, y=81
After calling swap(x,y): x=81, y=4

##### `scanf`
The `scanf` function is used to read data from `stdin` and store it in variables. 
The arguments in calls of `scanf` are pointers - the pointer to the variables must be given as an argument, not the variables themselves.

For example, the following code is incorrect:
```c
int i;
scanf("%d", i);
```

The following code is correct:
```c
int i;
scanf("%d", &i);
```

If a pointer to the variable has already been defined, it is passed directly, without the `&` operator:
```c
int i;
int *p = &i;
scanf("%d", p);
```

#### Arrays
An array in C is declared using the format:
```
data_type array_name[array_size];
```

For example, `int a[10];` declares a fixed-size array `a` holding ten `int` values.

- Elements are indexed from 0: the element at index `i` of an array `a` is accessed using `a[i]`, so `int a[10];` has elements `a[0]` to `a[9]`
- The size of the array is equal to the number of elements in the array times the size of the type of elements in the array
  - For the array declared by `int a[10];`, `sizeof(a)` = 10 * `sizeof(int)` = 40 bytes (assuming a 4-byte `int`)
- Each array is stored in memory as a single contiguous block
- Assignment of an array to an array is not supported
- Arrays can be multi-dimensional  e.g. `int matrix[2][3] = {{1,2,3},{4,5,6}};`

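An array can't be passed to a function by value. When an array is used as an argument, it is converted to a pointer to its first element, so the function receives a pointer rather than a copy of the array. 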

#### Strings
In C, strings are represented as an array of characters.

Since the assignment of an array to an array is not possible, strings can be copied using `strcpy()` from the `string.h` library (there are also other, more memory-safe functions that can be used to copy strings). 

Strings are null-terminated, which is important to note when allocating space to them:

In [26]:
#include <stdio.h>
#include <string.h>
int main() {
    char a[] = "Lupus";
    printf("strlen(a)=%zu\n", strlen(a));
    printf("sizeof(a)=%zu", sizeof(a));
    return 0;
}

strlen(a)=5
sizeof(a)=6

#### Pointer safety
Pointers can cause hard-to-diagnose problems in programs. 

This can be protected against by:
- initialising pointers to `NULL`, and setting them to `NULL` when no longer required
- using a guard before dereferencing pointers, e.g. `assert(ptr != NULL);` from `<assert.h>`

### The `->` operator
The `->` operator provides a shorthand method of accessing members of structures using a pointer.

Suppose we have the following:
```c
struct point {
 int x;
 int y;
} pt, *ptr;

ptr=&pt;
```
Then the following three operations are all functionally identical:
```c
pt.x=3; // Access directly
(*ptr).x=3; // Access by dereferencing a pointer
ptr->x=3;   // Access using the -> operator
```

### Memory

The memory layout for a program is generally of the following form:

![memory layout](memory-layout.png)



Each section has a different use:
- the stack stores local variables, function parameters, and return addresses for each function call
- the static data section stores data that stays in memory for the duration of the program, such as global variables and read-only data
- the heap is used for storing more long-term data, and the programmer controls what is in the heap and when it is released. The heap is typically much bigger than the stack
  - both the stack and heap grow into free memory (from opposite ends) - if they collide, the program may crash due to memory overflow

#### `malloc`
The `malloc()` function from `<stdlib.h>` is used to dynamically allocate memory on the heap during a program’s execution.

The function prototype for `malloc()` is: 
```c
void *malloc(size_t size);
```
- this allocates a contiguous block of memory `size` bytes long
- the return type is `void *`, which is a generic pointer type that can be used with all types
  - the result is implicitly converted when assigned to a typed pointer, so no cast is needed, e.g. when storing the result of `malloc()` in an `int *`
- `malloc()` returns a `NULL` pointer if it fails to allocate the requested memory

For example, to allocate a 10 `int` array, use:
```c
int *a =  malloc(10 * sizeof(int));
```
- the initial values of the array are indeterminate; they happen to be 0 in the output below, but this must not be relied upon

In [27]:
#include <stdio.h>
#include <stdlib.h>
int main(){
    int *a =  malloc(10 * sizeof(int));
    if (a == NULL) exit(1);
    
    for(int i = 0; i < 10; i++){
        printf("%d\n",a[i]);
    }
    return 0;
}

0
0
0
0
0
0
0
0
0
0


#### `free`
The `free()` function (also from `<stdlib.h>`) is used to release memory that was previously allocated on the heap. 

The function prototype is:
```c
void free(void *ptr);
```
- it takes a generic pointer to a block of memory and returns the memory for reuse

If memory on the heap allocated using `malloc()` is 'forgotten' about and not released using `free()`, this is called a memory leak. Memory leaks can be very dangerous and difficult to trace, and can eventually use up all memory.

`free()` has no return value to indicate whether it succeeded or failed, so it may appear to succeed even if it hasn't, such as if it is passed a pointer that was not allocated using `malloc()` (in which case the behaviour is undefined).

An example of its use is given below:

In [28]:
#include <stdio.h>
#include <stdlib.h>
int main(){
    int *a =  malloc(10 * sizeof(int));
    if (a == NULL) exit(1);
    
    free(a);
    a = NULL;
    return 0;
}

#### `calloc`
The `calloc()` function (again from `<stdlib.h>`) is similar to `malloc()`, except that it takes the number of elements and the element size as separate arguments and initialises all of the bits to 0 (which `malloc()` does not guarantee, even though the `malloc()` example above happened to print zeros).

The function prototype is:
```c
void *calloc(size_t n, size_t size);
```

- allocates a contiguous block of memory of `n` elements each of `size` bytes long, initialises all bits to 0
  - this is useful to ensure old data is not reused inappropriately  (it prevents leakage of stale data, which is important for secure systems programming)
- just like `malloc()`, the return type of `calloc()` is `void*` (the generic pointer type used for all types) and it returns a `NULL` pointer if it fails to allocate memory

#### `realloc`
The `realloc()` function (again from `<stdlib.h>`) is used to resize a previously allocated memory block on the heap.

The function prototype is:
```c
void *realloc(void *ptr, size_t size);
```
- allows a dynamic change in size of a previously allocated block (to `size`) of memory pointed to by `ptr`
  - `ptr` must point to memory previously allocated by `malloc()`, `calloc()` or `realloc()`
- preserves existing data up to the smaller of the old and new sizes
- if there is not space to resize the block in-place, it moves and copies the contents to a new location
  - the old block is automatically freed
- returns a pointer to the resized memory (it will be the same as `ptr` if it was able to resize in-place, and different if it wasn't)
- returns a `NULL` pointer if it fails, in which case the original block is left unchanged (and must still be freed)

### Scope
An identifier in a C program is visible (meaning that it can be used) only in some, possibly discontiguous, portion of the source code called its scope.

```c
#include <stdio.h>

int x = 5; // x has global scope. It can be accessed anywhere in the program

void func() {
    int y = 10; // y has block scope. It can only be accessed within func()
    printf("%d\n", y);
}

int main() {
    printf("%d\n", x);
    return 0;
}
```

Note that block scope refers to any code block, and not just functions. For example, a variable defined in an `if` statement has block scope.

```c
if (a > b) {
    int temp = a; // temp has block scope
    a = b;
    b = temp;
}
```

#### Variable lifetime
Lifetime refers to how long a variable exists in memory, regardless of where its scope is. There are three possible lifetimes:
- static: lives for the entire duration the program is running (e.g. global variables, like `x` in the example above)
- automatic: lives while the block it is in is executing (e.g. normal local variables, like `y` in the example)
- dynamic: its life is controlled by the programmer (e.g. using `malloc()` and `free()`)

#### Storage classes
A storage class defines the scope and lifetime of variables and/or functions.

Each variable in C has one of the following four storage classes (these are also keywords):
- extern
- static
- auto
- register

##### `extern`
`extern` can be used to declare a variable without defining it (so it will be defined elsewhere but should be accessible here). 

Variables with the extern storage type have lifetime and scope of the whole program.

##### `static`
`static` and `extern` are mutually exclusive as keywords. 

Static variables have the same lifetime as the program. 

- static global variables have file scope (they can be accessed anywhere in the file)
- static local variables have block scope
  - a static local variable keeps its value between function invocations
 
##### `auto`
`auto` variables have the same lifetime as the block where they are defined, and have block scope. 

Storage is automatically allocated when the block is entered and de-allocated when it exits. 

Local variables are automatic by default, so `auto` is never explicitly used in practice.

##### `register`
`register` suggests that a variable should be stored in a register rather than main memory if possible. 

It is much quicker to retrieve a variable from a register than from main memory, but space is more limited, so only highly frequently used variables should be stored in registers. 

It is not possible to use the address-of (`&`) operator on register variables. 

Not all `register` variables are necessarily stored in registers (such as if there are too many `register` variables), and some variables may be stored in registers without being declared with `register`, due to compiler optimisations. 

Modern compilers are good at working out which variables are best made into register variables and will do this automatically, so explicitly using `register` is quite rare. 

#### Global variables
Global variables are convenient when many functions must share a variable or when a few functions share a large number of variables.

In most cases, it is better for functions to communicate through parameters rather than shared variables: 
- if a global variable is changed during program maintenance (e.g. altering its type), every function needs to be checked to see how the change affects it
- if a global variable is assigned an incorrect value, it may be difficult to identify the guilty function
- functions that rely on global variables are hard to reuse elsewhere

### `const`
`const` is a keyword in C used to define variables whose values should not be altered after initialisation.

It ensures code safety by preventing accidental changes to these values, making code more predictable and maintainable. It also helps the compiler to optimise the code since it knows the value is immutable. 

The `const` keyword is used differently when pointers are involved.
The two statements below are not equivalent:
```c
int const *ptr_a;
int *const ptr_b;
```
- in the first one, the `int` (i.e. `*ptr_a`) is `const`, so the pointer can be changed but the value it points to cannot
- in the second one, the pointer itself (`ptr_b`) is `const`, so the pointer cannot be changed, but the value it points to (`*ptr_b`) can be changed

### `memset`
`memset` is a function from the standard header `<string.h>`.

Its function prototype is:
```c
void *memset(void *ptr, int value, size_t n);
```

It fills the first `n` bytes of memory at `ptr` with the byte value `value`.

This can be useful for zero-initialising arrays, structs or buffers, clearing sensitive data before freeing memory and resetting large regions of contiguous memory quickly. 

Note that `memset` sets the values of bytes, so it may yield unexpected integer values when setting non-zero patterns (since an `int` is typically 4 bytes).

In [7]:
#include <stdio.h>
#include <string.h>

int main() {
    int a[2];
    memset(a, 1, sizeof(a)); // sets each byte of a to 1
    printf("%d", a[0]);      // an integer is 4 bytes, 0x01010101 = 16843009
    return 0;
}

16843009

### `memcpy`
`memcpy` is another function from the standard header `<string.h>`.

Its function prototype is:
```c
void *memcpy(void *dest, const void *src, size_t n);
```
It copies `n` bytes from `src` to `dest`. Note that it operates on raw memory, not types or array dimensions. The two regions must not overlap; `memmove` should be used instead if they might.

In [10]:
#include <stdio.h>
#include <string.h>

int main() {
    int a[2][2] = {{1,2},{3,4}};
    int b[2][2] = {{5,6},{7,8}};
    printf("Before memcpy: b = {{%d,%d},{%d,%d}}\n", b[0][0], b[0][1], b[1][0], b[1][1]);
    memcpy(b, a, sizeof(a));
    printf("After memcpy: b = {{%d,%d},{%d,%d}}", b[0][0], b[0][1], b[1][0], b[1][1]);
    return 0;
}

Before memcpy: b = {{5,6},{7,8}}
After memcpy: b = {{1,2},{3,4}}

### Other memory functions
Some other potentially useful memory functions are given below.

#### `memcmp`
Header: `<string.h>`

Prototype:
```c
int memcmp(const void *a, const void *b, size_t n);
```
Compares the first `n` bytes of two memory blocks. Returns <0, 0, or >0 depending on lexical byte order.


#### `memchr`
Header: `<string.h>`

Prototype:
```c
void *memchr(const void *ptr, int value, size_t n);
```
Scans the first `n` bytes of memory for the first occurrence of the byte `value`, returning a pointer to it (or `NULL` if it is not found). Useful for searching within raw binary buffers.

#### `aligned_alloc`
Header: `<stdlib.h>`

Prototype:
```c
void *aligned_alloc(size_t alignment, size_t size);
```
Allocates memory aligned to a given power-of-two boundary. 

### Function pointers
It is possible to take the address of a function (using `&`, just like with a variable).

For example, say we wanted a function pointer to the `strcpy` function, which has declaration 
```c
char *strcpy(char *dst, const char *src);
```

The function pointer can then be declared as:
```c
char *(*strcpy_ptr)(char *dst, const char *src);
```
and assigned using either of:
```c
strcpy_ptr = strcpy;
strcpy_ptr = &strcpy;
```

The function can then be called through the function pointer, e.g.
```c
strcpy_ptr(a, b); // a and b are char arrays (or char * pointers); copies b into a
```

## Compilation

![Compilation Stages](compiling.png)

### The C Pre-processor

The C pre-processor is a program that runs before compilation, modifying the source code according to the pre-processor directives. Such directives include `#define` e.g. `#define PI 3.14159` and `#include` e.g. `#include <stdio.h>`.

`#define` is used to define a macro, which is essentially a name for a value or a code snippet. The pre-processor replaces every occurrence of the macro with its replacement text before the code is compiled.

When using `#include`:
- if `< >` are used, the system include directories (e.g. `/usr/include`) are searched
- if `" "` are used, the directory containing the current source file is searched first, followed by the system include directories
  - the appropriate delimiters should be used depending on the type of header file e.g. system or user-defined

#### Conditional compilation
Conditional compilation allows the pre-processor to include or skip parts of the code depending on whether certain macros are defined, so the skipped parts never reach the compiler. This can be very useful for debugging.

In [29]:
#include <stdio.h>

// Uncomment to enable debug mode
//#define DEBUG

int main() {
    printf("Program started\n");

#ifdef DEBUG
    printf("Debug mode is ON\n");
#else
    printf("Debug mode is OFF\n");
#endif
    return 0;
}

Program started
Debug mode is OFF


In [30]:
#include <stdio.h>

// Comment out to disable debug mode
#define DEBUG

int main() {
    printf("Program started\n");

#ifdef DEBUG
    printf("Debug mode is ON\n");
#else
    printf("Debug mode is OFF\n");
#endif
    return 0;
}

Program started
Debug mode is ON


#### Parameterised macros 

A parameterised (function-like) macro accepts parameters and uses them in its replacement text. They act like inline functions, but the replacement is done textually by the pre-processor before compilation, avoiding the need for actual function calls. 

The parameters may appear as many times as desired in the replacement text. 

In [31]:
#include <stdio.h>

#define ADD(a, b) ((a) + (b))  // Parameterized macro

int main() {
    int x = 5, y = 3;

    printf("Sum: %d\n", ADD(x, y));      // replaced by ((x) + (y))
    printf("Sum: %d\n", ADD(2+3, 4+1));  // replaced by ((2+3) + (4+1)) = 10

    return 0;
}

Sum: 8
Sum: 10


Using parameterised macros may make a program slightly faster, since a function call usually requires some overhead during program execution, but a macro invocation does not. Furthermore, macros are 'generic' - they can accept arguments of any type, provided that the resulting program is valid. 

However, this can also be a disadvantage, as arguments aren't checked or converted to the correct type by the pre-processor, whereas in a function, the compiler checks each argument to see if it has the appropriate type. Since macros work as direct substitutions in code, it is important to always use brackets to the fullest extent possible to prevent any unexpected results.  

### Compiling with GCC

GCC is a common compiler for C, which takes C source code and turns it into a machine-executable binary that can be run. 

A program can be compiled from the shell using `gcc -o outfile file.c`:
- the option `-o` is used to name the output file
- the option `-E` is used to do pre-processing only (this can also be done using `cpp` i.e. `cpp file.c`)
- the option `-S` is used to go as far as compilation only (no assembling/linking)
- the option `-c` is used to go as far as assembly only (no linking)
- the option `-l` is used to link external libraries
- the option `-I` is used to add a directory to the search path for `.h` header files
- multiple files can be compiled and linked together by listing them (e.g. `gcc part1.c part2.c -o outfile`)

### Makefiles

When a program consists of multiple source files, compiling each one can be tedious and error-prone. A Makefile is used to automate and manage the build process, using the `make` command. 

A Makefile is a rule-based (declarative) configuration file that tells the make utility how to build the main program. 

The format of each rule is:
```
target [target...]: [component ...]
    [command 1]
    ...
    [command n]
```
`target` is the file to be created and each `component` is a file that the target depends on, which must exist or be created by another rule. Note that the indentation before each `command` must be a Tab character, not spaces. 

Let's say for example we have the files `main.c`, `counter.h`, `counter.c`, `sales.h` and `sales.c`.

Then we could have a Makefile that looks like:

```makefile
all: counter.o sales.o main.c
        gcc -o program main.c counter.o sales.o

counter.o: counter.c counter.h
        gcc -c counter.c

sales.o: sales.c sales.h
        gcc -c sales.c

clean:
        rm -rf program counter.o sales.o
```

Running `make` with no arguments builds the first target in the Makefile, which here is `all`. 

#### Macros

Macros in Makefiles can be used to store definitions e.g. `CC=gcc`. Macros can also be defined using the output of shell commands e.g. `DATE = $(shell date)`. A macro is then used in the Makefile by writing `$(NAME)`, e.g. `$(CC)`, similar to variable expansion in the shell. 

#### Pattern rules

Pattern rules can be used to match multiple files, so that each dependency does not need to be manually listed, e.g.
```
DEPS = counter.h sales.h
%.o: %.c $(DEPS)
        gcc -c $< -o $@
```
This example rule means that to build any `.o` object file, you need a file with the same name ending in `.c`, as well as the header files listed in `DEPS`. `$<` means the first prerequisite, i.e. the first dependency listed after the colon, and `$@` means the target (in this case the `.o` file being built).

`%` is the wildcard symbol, used to match any non-empty substring. The substring that % matches in the target is called the stem. It can only be used once per pattern (it cannot be used multiple times in the target), e.g.:
- `%.c` as a pattern matches any file name that ends in `.c`. 
- `s.%.c` as a pattern matches any file name that starts with `s.`, ends in `.c` and is at least five characters long (there must be at least one character to match the `%`).

Automatic variables have values that are computed for each rule that is executed, based on the current target being built. These can be used in commands. 

| Variable | Meaning |
|----------|---------|
| `$@`       | The target filename |
| `$<`       | The first prerequisite |
| `$^`       | All prerequisites, with duplicates removed |
| `$+`       | All prerequisites, with duplicates kept |
| `$?`      | Prerequisites that are newer than the target |
| `$*`       | The stem (the part that % matched) |

#### Additional information
- comments can be included by starting the line with `#`
- lazy evaluation is used - an expression is not evaluated or computed until its value is actually needed
- if a target exists and has a later timestamp than all of its components, `make` will assume it is up to date and will not rebuild it
- Makefiles are not linked with C; they can be used with any code/work
- any specific rule can be run by invoking its target e.g. `make sales.o`

## The Shell

A shell is a powerful command-line interface (CLI) that allows the user to interact with the operating system (OS) by typing commands. This includes the ability to:
- run programs
- control how programs work
- move around between different directories/folders
- perform sequences of commands to achieve more complex work

There are a number of different shells, such as bash and PowerShell. 

### Basic Commands

Some basic commands are given below. 

Note: the `!` before each command is not needed when using an actual shell (it is only necessary since this is a Jupyter Notebook).

These commands seem to not be working with jupyter-c-kernel; they have been run using the Python 3 kernel instead.

`pwd` - *Print working directory*

In [1]:
!pwd

/mnt/d/Notebooks/COMP2221 Programming Paradigms


`ls` - *List*

In [2]:
!ls 

 Lectures		      gradescope-submission	  myscript.sh
 Practicals		      gradescope-submission.zip   myscript2.sh
'Systems Programming.ipynb'   mem_example.png		  permission_string.png
 compiling.png		      memory-layout.png


`man` - *Manual*

In [3]:
!man ls

LS(1)                                               User Commands                                               LS(1)

NAME
       ls - list directory contents

SYNOPSIS
       ls [OPTION]... [FILE]...

DESCRIPTION
       List  information  about the FILEs (the current directory by default).  Sort entries alphabetically if none of
       -cftuvSUX nor --sort is specified.

       Mandatory arguments to long options are mandatory for short options too.

       -a, --all
              do not ignore entries starting with .

       -A, --almost-all
              do not list implied . and ..

       --author
              with -l, print the author of each file

       -b, --escape
              print C-style escapes for nongraphic characters

       --block-size=SIZE
              with -l, scale sizes by SIZE whe

`cd` - *Change directory*
- `.` *= current directory*
- `~` *= home folder*
- `..` *= one folder up*

In [4]:
!pwd

/mnt/d/Notebooks/COMP2221 Programming Paradigms


In [5]:
!cd ~ && pwd

/home/francis


In [6]:
!cd .. && pwd

/mnt/d/Notebooks


### `stdin`, `stdout` and `stderr`

`stdin`, `stdout` and `stderr` are the three built-in communication channels that each program receives from the OS when it starts running. They remove the need to worry about I/O devices.
- `stdin` (Standard Input) is where programs read data from
- `stdout` (Standard Output) is where programs send normal output
- `stderr` (Standard Error) is where programs send error messages

### Pipes 

The shell provides many small tools (commands) - the power comes from composing them together. Pipes provide a means to do this. 

By default, each command takes an input (from the keyboard) and produces an output (to the screen). The input and output of a command can be redirected:
- `<` - take input from a file
- `>` - write output to a file
  - a single `>` overwrites the file; `>>` appends to the file
- `|` - take the output of one command and use it as input to the next

### `grep`
`grep` is a search tool that can be used to search through files or the output of other commands (via pipes). It can search through specific file(s) by providing the filename(s), or it can search through all files in the current directory by using the `-r` recursive flag.

In [7]:
!grep "shell" "systems programming.ipynb"

    "A program can be compiled from the shell using `gcc -o outfile file.c`:\n",
    "Macros in Makefiles can be used to store definitions e.g. `CC=gcc`. Macros can also be defined using the output of shell commands e.g. `DATE = $(shell date)`. The Macros can be used in the Makefile, using `$` just as in the shell normally. \n",
    "A shell is a powerful command-line interface (CLI) thats allows the user to interact with the operating system (OS) by typing commands. This includes the ability to:\n",
    "There are a number of different shells, such as bash and PowerShell. \n",
    "Note: the `!` before each command is not needed when using an actual shell (it is only necessary since this is a Jupyter Notebook).\n",
      "              do not list implied entries matching shell PATTERN (overridden by \u001b[1m-a \u001b[22mor \u001b[1m-A\u001b[22m)\n",
      "              do not list implied entries matching shell PATTERN\n",
      "              use quoting style WORD for entry names

In [8]:
!grep -r "pipes" 

.ipynb_checkpoints/Systems Programming-checkpoint.ipynb:    "`grep` is a search tool that can be used to search through files or the output of other commands (via pipes). It can search through specific file(s) by providing the filename(s), or it can search through all files in the current directory by using the `-r` recursive flag."
.ipynb_checkpoints/Systems Programming-checkpoint.ipynb:      ".ipynb_checkpoints/Systems Programming-checkpoint.ipynb:    \"`grep` is a search tool that can be used to search through files or the output of other commands (via pipes). It can search through specific file(s) by providing the filename(s), or it can search through all files in the current directory by using the `-r` recursive flag.\"\n",
.ipynb_checkpoints/Systems Programming-checkpoint.ipynb:      ".ipynb_checkpoints/Systems Programming-checkpoint.ipynb:      \".ipynb_checkpoints/Systems Programming-checkpoint.ipynb:    \\\"`grep` is a search tool that can be used to search through files or th

`grep` uses **regular expressions** for matching text. 

## Regular Expressions
Regular expressions provide a concise way to match different strings. They use a specific syntax:
- `.` - matches any single character (except a newline character)
- `*` - matches zero or more of the preceding character
- `?` - matches zero or one of the preceding character
- `+` - matches one or more of the preceding character
- `[ABC]` - matches one character that is `A`, `B` or `C`
- `[A-Z]` - matches any upper case character `A` to `Z`
- `[0-9]` - matches any digit

For example, the regular expression `[A-Za-z]*[0-9]\.txt` matches zero or more letters (uppercase or lowercase), followed by exactly one digit and the literal suffix `.txt`. The `.` is escaped as `\.` because an unescaped `.` would match any character.
Examples of strings that this expression would match include `MyFile5.txt`, `abc0.txt` and `1.txt`.

## File Permissions
Every file and directory in UNIX has an access mode controlling who can read, write, or execute it.

| Permission | Symbol | Description |
|------------|--------|-------------|
| Read       | r      | View or copy the file contents |
| Write      | w      | Modify or delete the file |
| Execute    | x      | Run as a program (for files) or enter (for directories) |


There are three permission groups which can each be granted specific permissions:
- Owner (user) - the person who created the file
- Group - a named collection of users who share the same permissions
- Others - everyone else

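The permission string is a 10-character string that specifies the permissions of the different groups. The first character gives the file type (e.g. `-` for a regular file, `d` for a directory), and the remaining nine characters are three `rwx` triples for the owner, group and others respectively, e.g. `-rwxr-xr--`.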

<img src="permission_string.png" alt="Permission String" width="300px">

File permissions can be changed using `chmod`. 
The syntax for `chmod` is `chmod [permissions] [file]`.
For example, `chmod u+x file.sh` adds execute permission for the owner (for the file file.sh).

## Text Operations

### `sort`

`sort` takes in a file, if specified, or reads from `stdin` if no file is specified. It sorts the input (alphabetically/numerically) and outputs it to `stdout`, or a file if specified with `-o filename`. 

In [9]:
!printf "C\nA\nD\nB\n"

C
A
D
B


In [10]:
!printf "C\nA\nD\nB\n" | sort

A
B
C
D


### `tr` (translate)

Usage: `tr SET1 SET2`
- translates characters in SET1 to the corresponding characters in SET2, or deletes them
- e.g. `tr 'A-Z' 'a-z'` produces a lower case version of `stdin`
- option `-c` takes the complement of SET1
  - `tr -c 'a-zA-Z' '\n'` replaces all non-letter characters with newlines
- option `-s` squeezes repeats to a single character
  - `tr -s ' '` converts multiple spaces into one
- option `-d` deletes all characters in SET1

In [11]:
!echo "abc123" | tr 'a-z' 'A-Z'

ABC123


In [12]:
!echo "abc123" | tr -d 'a-z'

123


In [13]:
!echo "abc123" | tr -dc 'a-z'

abc

In [14]:
!echo "aaabccc12223" | tr -s 'ac2' 

abc123


### `uniq`

`uniq` is used to remove or report repeated lines. It only removes consecutive repeated lines, so it is often used with `sort` to find/remove repeated lines throughout the document (i.e. `sort | uniq`). The option `-c` can be used to count the number of repetitions. 

In [15]:
!printf "a\na\nb\na\nc\na\n" | uniq -c

      2 a
      1 b
      1 a
      1 c
      1 a


In [16]:
!printf "a\na\nb\na\nc\na\n" | sort | uniq -c

      4 a
      1 b
      1 c


In [17]:
!printf "a\na\nb\na\nc\na\n" | uniq

a
b
a
c
a


In [18]:
!printf "a\na\nb\na\nc\na\n" | sort | uniq

a
b
c


## File handling
Files are stored in a hierarchical structure (a tree) - the top level is the root directory `/`. Each directory (folder) can contain files or subdirectories, which allows grouping and organisation.

There are a number of commands for navigating around the file system. `ls` and `cd` have been covered [previously](#Basic-Commands), but additional commands include:
- `mkdir` - make a new folder
- `mv` - move a file/folder (also used to rename)
- `cp` - copy a file/folder
- `rm` - delete a file, or a folder using `-r`
- `du` - show disk usage of a file/folder
- `find` - search for files/folders in a directory tree

## Shell scripts

A shell script is a collection of commands stored in a file. 

It allows tasks to be automated by running each command in order automatically, rather than having to type out each command manually. 

When writing a shell script:
- the script can be written in any chosen text editor
- the script should be saved with a `.sh` extension
- it should begin with the line `#!/bin/bash` (when writing a script for the bash shell)
  - `#!` tells the OS that the file is a script to be run by an interpreter
  - `/bin/bash` is the path of the program used to run the script



In [19]:
!bash myscript.sh

Hello from myscript.sh


Parameters can be passed in to a script when it is run. The parameters are referred to using the `$` sign in scripts i.e. the first parameter is `$1`, the second is `$2`. 

In [20]:
!bash myscript2.sh "foo" "bar"

Input 1: foo, Input 2: bar


### `For` loops

For loops are useful for performing the same operation on lots of files. The basic syntax is 
```
#!/bin/bash
for f in *;
do
 #something in here
 echo "$f"
done
```


### `If` statements

An example of an if statement in a bash shell script is:
```
#!/bin/bash
if [ "$1" -lt "$2" ]
then
 echo "yes" $1 "is less than" $2
else
 echo "no it isn't"
fi
```
The `else` clause is optional. Inside `[ ]`, integers are compared using `-eq`, `-ne`, `-gt`, `-lt`, `-le` and `-ge` (equal, not equal, greater than, less than, less than or equal to and greater than or equal to respectively), while `==` and `!=` compare strings. 

### Shell variables

A shell variable is a name that stores a temporary value in the shell session. Values can be strings, numbers, filenames, or any text. Like in shell scripts, they are accessed using `$`.

In [21]:
!name="Eric" && echo "Name is $name"

Name is Eric


### Environmental variables
Environment variables store information about the user session and are shared with programs started from the shell.

| Variable | Meaning                               |
|------------|---------------------------------------|
| `$USER`    | Current username                       |
| `$HOME`    | User’s home directory                  |
| `$PWD`     | Present working directory              |
| `$PATH`    | List of directories searched for commands |
| `$SHELL`   | Path to login shell               |

The `export` command can be used to change the value of an existing environment variable, or create a new one, e.g. `export MYVAR="HELLO"`.

## Git

Git is software for tracking changes in files, keeping a history of modifications and enabling you to revert to previous versions, compare changes and see who made what changes. It is used for coordinating work among collaborators and has support for continuous integration (CI) tools. 

Common git commands include:
- `git clone` - creates a local copy of a given repository
- `git add` - stage new/modified files for the next commit
- `git rm` - removes files from git tracking
  - using the `--cached` option keeps a local copy
- `git commit` - commits the current staged changes
- `git push` - uploads local commits to the remote repository
- `git pull` - fetches changes from the remote repository and merges them into the local copy

