# Systems Programming

## Lecture 11: Large Programs

### Amir Atapour-Abarghouei

amir.atapour-abarghouei@durham.ac.uk


# Recap

In the previous lecture, we learned about **Debugging**

### `gdb`

- Running the program

- Inspecting variables

- Breakpoints

- Watchpoints

- Attaching to running processes

- Command-line options


# Recap

We also learned about **Recursion**, where a function calls itself. A recursive function must have both a:
- base case
- recursive case

To terminate, every chain of calls must eventually reach the base case.
- Recursion relies on the run-time system to:
    - create a new scope for each call
    - keep track of the local variables for each call
- This bookkeeping is a performance overhead compared to iterative loops.

## Today

# Large Programs

# Scope - again

- Scope in a single file has two specific uses:
    - Local identifiers visible only in code blocks.
    - Global identifiers visible to all functions in one file

- Larger programs need multiple source files.

- Scope therefore has to be managed across files, both within a large program and when using external libraries such as OpenGL.

- In C, functions and variables must be **declared** before they are used, but can be *defined later*.

# Multiple Source Files

- A C program may be divided among any number of source files.

- By convention, source files have the extension `.c`

- Each source file contains part of the program:
    - primarily definitions of functions and variables
    
- Exactly one source file must contain a function named `main()`, which is the **<ins>entry point</ins>** for the program.

# Header Files

Problems that arise when a program is divided into several source files:

- How can a function in one file call a function that’s defined in another file?
- How can a function access an external variable in another file?
- How can two files share the same macro definition or type definition?

The answer lies with the `#include` directive, which makes it possible to share information among any number of source files.

# Header and Multiple Source Files

- `func.h`
    - header: contains the **declarations** needed to use functions in `func.c`
    
- `func.c`

    - `#include "func.h"`
    - source: contains the **definitions** of global and private functions and variables
    
- `main.c`

    -  `#include "func.h"`
    - this file should contain at least the `main()` function

# Sharing Identifier Declarations

- When variables and functions are shared between files, their declarations need to be separated from their definitions.

    - We can then declare identifiers so that they can be used in any file, while keeping the definition in a single place in one file.
    
- The solution to this is the `extern` modifier.

# `extern` with Variables

Use the header file to contain the declarations of variables that are shared with other files:

```c
// func.h
extern int cost; // declaration

// func.c
int cost = 1;    // definition
```

# Abstract Data Types

- How should we divide functions into files?

- Abstract data types pre-date O-O concepts.

- Abstraction is the idea of separating what something is from how it works, by separating interface from implementation.

- Identify key data types and encapsulate them in separate files.

- Access the instances using the public interface, functions and variables.

- Hide other implementation details from the users.

# Wall of Abstraction

<center><img src="images/wall.png" alt="Wall of Abstraction" width="1000"></center>

# ADT Benefits

- Abstraction
    - from the implementation details
- Encapsulation
    - user cannot access internals
- Independence
    - reduces number of interactions
- Flexibility
    - implementation change transparent
- Protection from our brain’s limited powers to manage complexity in systems.

# ADT Implementation

- C usually implements complex types with a `struct` definition.

- In part to hide the details of the struct, ADTs are sometimes implemented with only a pointer type visible to the user; the struct itself remains private to the ADT source file.

- More modern languages than C have clearer ways to handle this through class definitions.


# ADT implementation for `POINT_T`

- Publicly in the header file `point.h`, define a new type:

    ```c
    typedef struct PointStructType *POINT_T;
    ```

- Privately in the source file `point.c`, define the underlying structure:

    ```c
    struct PointStructType {
        double array[NUM_DIMS];
    };
    ```

# `typedef` -  Simple Example

```c
#include <stdio.h>
#include <string.h>

typedef struct Humans {
    char name[50];
    int age;
    char status[50];
} Human;

int main(void) {

    Human human;

    strcpy(human.name, "Amir");
    human.age = 1135296000;
    strcpy(human.status, "Good-Looking");

    printf("human name : %s\n", human.name);
    printf("human age : %d seconds\n", human.age);
    printf("human status : %s\n", human.status);
    return 0;
}
```

# `typedef` - Example

```c
#include <stdio.h>
#include <string.h>

// A triangle ADT
// Defining the sides of the triangle
typedef struct Triangle {
  int a;
  int b;
  int c;
} Tri;

// Takes a pointer so the struct is not copied; const because it is only read.
int TrianglePerimeter(const Tri *tri) {
  return tri->a + tri->b + tri->c;
}

int main(void) {
  Tri t1 = { 3, 4, 5 };
  printf("perimeter is %d\n", TrianglePerimeter(&t1));
  return 0;
}
```

# Large Programs

- Split large software projects into separate files to manage complexity.

- `extern` allows variables and functions to be declared and shared in header files.

- `#include` allows header (`.h`) files to be included wherever needed.

- `typedef` gives a new name to a type; combined with a `struct` that stays private to one source file, it lets us build abstract data types that hide their implementation.

## We will learn about creating libraries next lecture...!

# Compilation Model

<center><img src="images/compile.png" alt="Compilation Model" width="750"></center>

# The C Preprocessor

<center><img src="images/pre-processor.png" alt="pre-processor" width="320"></center>

Directives such as `#define` and `#include` are handled by the preprocessor, a piece of software that edits C programs just prior to compilation.

Its reliance on a preprocessor makes C (and C++) unique among major programming languages.

# Macros vs Functions

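```c
#include <stdio.h>

#define PI 3.1415
// Parenthesise each use of the argument so that expressions such as r + 1 expand correctly.
#define circleAreaMacro(r) (PI * (r) * (r))

// The parameter needs an explicit type; a bare `r` is not valid in modern C.
float circleAreaFunc(float r) {
    return PI * r * r;
}

int main(void) {
    float radius = 2, areaM, areaF;
    areaM = circleAreaMacro(radius);
    areaF = circleAreaFunc(radius);
    printf("Area    [Macro] = %.2f\n", areaM);
    printf("Area [Function] = %.2f\n", areaF);

    return 0;
}
```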

# Macros vs Functions

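- Macros are expanded by the preprocessor before compilation; functions are compiled.
- No type checking takes place on macro arguments.
- A macro body is copied into every place it is used, which increases code size, and macros are hard to debug.
- Macros avoid function-call overhead, so they can run faster, although modern compilers often inline small functions anyway.
- Macros are useful when a small piece of code is repeated many times.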


# Conditional `include`

```c
#ifndef CODE_H
    #define CODE_H     // define the identifier
    extern void setCount( int val );
#endif
```

This allows the header file to be `#include`d many times:
- If the header file has not been seen before, process its declarations.
    - Also record that we have now seen it: `#define CODE_H`
- Otherwise, skip its contents.

# The Link Editor (Linker)

<center><img src="images/linker.png" alt="linker" width="520"></center>

The linker’s job is to combine all the files needed to form the executable.

- It has to resolve every symbol (function or variable) that the object files refer to.
- It most often fails when it cannot find the required object code, for example because the file is in the wrong folder, or because you have forgotten to specify which external library to link with, e.g. the maths library with `-lm`.

# `make`

- You learned about `make` in Lecture 4.

- You can easily find it on almost any system today:
    - macOS
    - Any Linux System
    - Easy to get on Windows
    
- Helps create object files from the source and then links the object files into the executable, following a set of rules.

- Very helpful in large projects.

- You learned about alternatives earlier as well.

# Recap - Makefiles


- The `Makefile` is a rule file that enables `make` to compile and link files into the final executable.

- It is a declarative set of rules for building the program:
```
    target [target ...]: [component ...]
           [command 1]
               ...
           [command n]
```
- `target` - what you want to make
- `component` - something which needs to exist (might need another rule)
- `command` - a shell command to run; each command line must begin with a TAB character

# Recap - Makefiles: Example

- Files: `main.c`, `counter.h`, `counter.c`, `sales.h`, `sales.c`

```
all: counter.o sales.o main.c
	gcc -o program main.c counter.o sales.o

counter.o: counter.c counter.h
	gcc -c counter.c

sales.o: sales.c sales.h
	gcc -c sales.c

clean:
	rm -rf program counter.o sales.o
```

# Recap - Makefiles
    
- `make` itself has nothing to do with C; a rule can run any shell command:

```
all: say_hello generate

say_hello:
	@echo "Hello World"

generate:
	@echo "Creating empty text files..."
	touch file-{1..10}.txt

clean:
	@echo "Cleaning up..."
	rm *.txt
```

# Summary

- Source files and header files

- Sharing declarations

- `extern`

- Abstract data types

- `typedef`

- Preprocessing