<hr style="border:3px coral solid"></hr>

# A C Tutorial

<hr style="border:3px coral solid"></hr>


The following series of simple functions should give you an idea as to how to use pointer variables, statically allocated arrays and dynamically allocated arrays. 

## Hello, Jupyter!

<hr style="border:2px coral solid"></hr>

We can use Jupyter notebooks to compile and run simple C programs. 

In what follows, we will use the following Jupyter "cell magic" commands

* `%%file <filename>`  Write the contents of a Jupyter notebook cell to a file `filename`, e.g. `%%file prog.c`.   

* `%%bash` Run bash commands from a notebook cell.  These commands can also be run directly from a command line. 

Magic commands typically appear as the first line in a notebook cell. 

Below is a sample C program.  

In [None]:
%%file c_demo_01.c

#include <stdio.h>

int main(int argc, char** argv)
{
    printf("Hello, Jupyter!\n");
    
    return 0;
}

### Components of the C program

This demo program has several components

<hr style="border:1px solid black"></hr>

    %%file c_demo_01.c
    
This Jupyter magic command will save the contents of the Jupyter notebook cell to a file called `c_demo_01.c`.  This is a handy way to create text files that we can later read using a text editor, VSCode, or even a Jupyter notebook, when opened as "plain text". 

If the file was written correctly, you should see `Writing c_demo_01.c` in an output cell.  

We can confirm that the file was created by displaying its contents with the line magic `%cat`. 

In [None]:
%cat c_demo_01.c

You can also open this file in any text editor. 

*We use `%%file` here for  demonstration purposes. However, it is suggested that you use a real text editor, designed for C, to actually write programs.  Some examples include Emacs, VI, VIM, VSCode, Sublime, Atom.*

<hr style="border:1px solid black"></hr>

    #include <stdio.h>
    
An `#include` statement acts much like the `import` statement in Python.  We include `stdio.h` (read "standard I/O") here so we can call the `printf` function from the body of our code. 

Files with a `.h` extension are called *header files*.  They declare each function's name, argument types and return type, so that the compiler knows how the function may be called.  

<hr style="border:1px solid black"></hr>

    int main(int argc, char** argv)
    
Every C program has to have a `main` function. This function contains the entry point for the executable.  The arguments to this function are 

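  * **argc**   The number of command line arguments, counting the name of the program itself
    
  * **argv**   The arguments themselves (an array of strings); by convention, `argv[0]` is the program name

The `int` keyword indicates that the `main` function returns an integer exit code.  By convention, 0 signals success.  

*The C standard specifies an `int` return type for `main`.  Some compilers accept other forms, but these are not portable.*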

<hr style="border:1px solid black"></hr>
    
The body of the code is delineated using braces `{}` (i.e. "curly braces"). 

    int main(int argc, char** argv)
    {
        printf("Hello, Jupyter!\n");

        return 0;
    }

The code contains a single print statement using the `printf` function (note the **f** at the end), and a return code (0 in this case). 

*The `printf` statement in C does not automatically include a new line character.  It will generally be necessary to include `\n` at the end of any string to be printed to the console using `printf`.*

<hr style="border:1px solid black"></hr>

Some differences between Python and C : 

* Every statement in a C program must end with a semicolon (";").  Preprocessor lines such as `#include` and the closing brace of a function do not take one. 

* Curly braces `{}` (or just "braces", as opposed to brackets `[]` or parentheses `()`) are used to delineate "scope".  In Python, scope is delineated using indentation.  In C, indentation adds to readability, but does not have any particular meaning to the compiler. 

* Unlike Python, all C strings must be delineated using double quotes, e.g. `"a string"`.   Single quotes are reserved for single characters only.

### Compiling the code

We compile the code at a `bash` command line using the compiler `gcc`, the standard GNU C compiler.  Depending on your machine, the `gcc` command may actually invoke the Clang compiler. 

To test if we have a C compiler, try the following

In [None]:
%%bash 

which gcc

This is the C compiler that is found when typing `gcc` on the command line.  We can see what version this is by passing in a `--version` flag. 

In [None]:
%%bash

gcc --version

On Apple Macs, it will be necessary to install the Command Line Tools (part of Xcode).  The compiler that comes with command line tools is a `Clang` compiler. 

In [None]:
%%bash

/Library/Developer/CommandLineTools/usr/bin/gcc --version

To compile our sample code, we use the command

    gcc -o c_demo_01 c_demo_01.c

The `-o` flag tells the compiler to name the executable `c_demo_01`.  The name of the source file follows.  

If the compiler successfully compiles our code, we will not see any output.  

In [None]:
%%bash

gcc -o c_demo_01 c_demo_01.c

The `ls` command will show us that an executable, indicated by multiple `x`s in the file permissions, was successfully created. 

In [None]:
%%bash 

# List the executable just created. 
ls -lh c_demo_01

### Running the code

To run the code, we type the name of the executable on the command line.  The leading `./` tells the shell to look for the executable in the current directory. 

In [None]:
%%bash 

./c_demo_01

<hr style="border:2px coral solid"></hr>

## Variables and data types in C

<hr style="border:2px coral solid"></hr>

All variables in C are *statically typed*.  This means that not only must we assign a value to a variable before using it (as in Python), we must also *declare its type*.  For example

     double x;
     x = 1.2; 
     
The above declares the variable `x` as type `double` (i.e. double precision, typically 64 bits).  Using shorthand notation, we can both declare the type and assign a value in a single statement. 

     double x = 1.2;  
     
Other data types are `int` (integers, typically 32 bits) and `char` (single characters, which are also the building blocks of strings).  Examples include

     int m = 4;      
     char c = 'c';
     
A string is an array of characters.  A string literal is commonly referenced through a pointer variable of type `char*`.    
     
     char* msg = "Hello, World!";  
     
We discuss pointers and more below.      

In the following sample program, we demonstrate each of these data types, and show how to assign them and print them out. 


In [None]:
%%file c_demo_02.c

#include <stdio.h>

int main(int argc, char** argv)
{
    double x = 1.2;   
    printf("x = %f\n",x);
        
    int m = 10;
    printf("m = %d\n",m);
    
    char* msg = "Hello, World!";
    printf("Greeting : %s\n",msg);

    return 0;
}


The `printf` statement uses format specifiers such as `%f`, `%d` and `%s` to format the values it prints. 

   Other examples of format specifiers include 


   * `%12.4f` Format a floating point value using a fixed point format in a field of width 12.  Show 4 digits after the decimal place. 

   * `%16.8e` Format a floating point number using exponential notation (e.g. `1.20000000e+01`) in a field of width 16, with 8 digits after the decimal point. 

   * `%10d` Format an integer in a field of width 10. 


### Compile and run example

In [None]:
%%bash 

# Remove old executable
rm -rf c_demo_02

# Compile executable to program `c_demo_02`
gcc -o c_demo_02 c_demo_02.c

# Execute code
./c_demo_02

### Referencing by value

<hr style="border: 2px solid black"></hr>

A significant difference between Python and C is that in C, we can reference variables either by *value* or by *reference*.  

We are already familiar with what it means to reference a variable by *value*. 

      double x = 1.2;
      double y = x;
      
Here, the *value* stored in the variable `x` is copied to the variable `y`.  If we later decide to change the value of `x`, the value of `y` will not be affected.  



In [None]:
%%file c_demo_03.c

#include <stdio.h>

int main(int argc, char** argv)
{
    double x = 1.2;
    
    double y = x;    
    printf("y         = %g\n",y);
    
    x  = 5.4;
    printf("y (again) = %g\n",y);

    return 0;
}


In [None]:
%%bash

rm -rf c_demo_03

gcc -o c_demo_03 c_demo_03.c

./c_demo_03

### Referencing by memory address

<hr style="border: 2px solid black"></hr>

This situation changes if we refer to `x` by reference.  To refer to the address of `x` rather than its value, we use the address-of operator `&`. 

     double x = 1.2;
     printf("Address of x : %p\n",&x);
     

In [None]:
%%file c_demo_04.c

#include <stdio.h>

int main(int argc, char** argv)
{
    double x = 1.2; 
    printf("Address of x : %p\n",&x);
    return 0;
}


In [None]:
%%bash

rm -rf c_demo_04

gcc -o c_demo_04 c_demo_04.c

./c_demo_04

This address is a hexadecimal value that refers directly to a location in memory.  

### Pointer variables 

<hr style="border: 2px solid black"></hr>

We can directly manipulate the value stored in a memory location through the use of a *pointer* variable. 

The value of a pointer variable is a memory address (not the value stored in the address).

A pointer variable is designated using a `*`.  A pointer variable must also know what data type is stored in the given memory location.  

Some examples of pointer variables declarations are : 

      double* y;   // A pointer to a type double
      
      int* p;      // A pointer to a type int
      
      char* str;     // A pointer to a type char (used for string types)


#### Assigning values to pointer variables

Pointer variables can take *addresses* as values.  In what follows, the pointer variable (or just "pointer") is assigned the memory location labeled `x`. 

     double x = 1.2;
     
     double *y;  // Declare pointer variable
     
     y  = &x;    // Assign address of x to y
     
This will assign the address of `x` to the pointer variable `y`. 

In [None]:
%%file c_demo_05.c

#include <stdio.h>

int main(int argc, char** argv)
{
    double x = 1.2; 
    double *y;
    
    y = &x;    /* Assign address to y */
    
    printf("Address of x : &x = %p\n",&x);
    printf("Value of y   :  y = %p\n",y);    
    return 0;
}


In [None]:
%%bash

rm -rf c_demo_05

gcc -o c_demo_05 c_demo_05.c

./c_demo_05

### De-referencing a memory address

<hr style="border: 2px solid black"></hr>

We can obtain the value stored in `x` indirectly by *de-referencing* the pointer variable `y`.  To get the value stored at the memory address held in `y`, we use the `*` operator. 

     double z = *y;   // Copy value stored in location y to z. 
     
#### Note

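* Do not confuse the *declaration* of a pointer variable using `double *y` with the *de-referencing* of a pointer variable using `*y`.  In a declaration, the `*` is part of the type; in an expression, it reads or writes the value stored at the address.  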

In [None]:
%%file c_demo_06.c

#include <stdio.h>

int main(int argc, char** argv)
{
    double x = 1.2; 
    
    printf("Value of x         (x) : %g\n",x);
    printf("Address of x      (&x) : %p\n",&x);
    printf("\n");
    
    double *y;
    y = &x;
    
    double z = *y;   // dereference y and copy value to z
    
    printf("Value of y         (y) : %p\n",y);
    printf("Value of z    (z = *y) : %g\n",z);
    
    return 0;
}


In [None]:
%%bash

rm -rf c_demo_06

gcc -o c_demo_06 c_demo_06.c

./c_demo_06

One of the two main uses of pointer variables is to directly modify the value stored in a particular memory location.   

For example, the following will change the value of `x` by de-referencing a pointer variable `y`. 

In [None]:
%%file c_demo_07.c

#include <stdio.h>

int main(int argc, char** argv)
{
    double x = 1.2;
    printf("x = %g\n",x);
      
    double *y;
    y = &x;  
    
    *y = 3.5; 
    printf("x = %g\n",x);
    
    return 0;
}


In [None]:
%%bash

rm -rf c_demo_07

gcc -o c_demo_07 c_demo_07.c
./c_demo_07

The fact that C gives us such low level access to memory locations can be used for both good and evil!

<hr style="border:3px solid coral"></hr>

## Arrays

<hr style="border:3px solid coral"></hr>

A static array is an array whose size cannot change once it has been declared.  Furthermore, the space used by a static array cannot be "freed" at run time. 

Some array similarities with Python

* Arrays are indexed starting with 0. 

* We access elements of an array using brackets `[]`.  

Unlike Python, if we try to access an element that is outside the bounds of the array, we don't generally get an error.  The behavior is undefined: we will typically read whatever value happens to be stored in the neighboring memory, and the program may even crash. 

In [None]:
%%file c_demo_08.c

#include <stdio.h>

int main(int argc, char** argv)
{
    double x[3]; 

    for(int i = 0; i < 3; i++)
    {
        x[i] = i*3.14159;
    }
    
    printf("x[0] = %g\n",x[0]);
    printf("x[1] = %g\n",x[1]);
    printf("x[2] = %g\n",x[2]);
    printf("\n");
    printf("x[3] (out of bounds!) %g\n",x[3]);
    
    return 0;
}

In [None]:
%%bash 

rm -rf c_demo_08

gcc -o c_demo_08 c_demo_08.c

./c_demo_08

### Connection between arrays and pointers

Array variables behave much like *pointer* variables.   When used in an expression, the name of an array evaluates to the address of its first element.  That is, the value of `x` in the following is an address. 

     double x[3];
     printf("Value of x : %p\n",x);

Moreover, we can assign `x` to another pointer variable. 

     double *y = x;
     
We can then use `y` as a replacement for `x`.  Elements of `x` can be accessed and modified using `y[...]`. 

In [None]:
%%file c_demo_09.c

#include <stdio.h>

int main(int argc, char** argv)
{
    double x[3] = {1.,2.,3.}; 
    
    
    printf("x[0] = %f\n",x[0]);
    printf("x[1] = %f\n",x[1]);
    printf("x[2] = %f\n",x[2]);
    printf("\n");
    
    printf("Value of x %p\n",x);
    
    
    double *y = x;
    printf("Value of y %p\n",y);
    printf("\n");
    
    y[1] = 15;
    
    printf("x[0] = %f\n",x[0]);
    printf("x[1] = %f\n",x[1]);
    printf("x[2] = %f\n",x[2]);
    
    return 0;
}

In [None]:
%%bash 

rm -rf c_demo_09

gcc -o c_demo_09 c_demo_09.c

./c_demo_09

However, `x` is not a pointer variable in the sense that we can set its address to a new location.   Attempting 

    double z = 1.0;
    x = &z;     
    
will result in an error, since an array name cannot be assigned to; `x` stays bound to the storage that was set aside for the array when it was declared. 

    c_demo_09.c: In function 'main':
    c_demo_09.c:22:7: error: assignment to expression with array type
       22 |     x = &z;
          |       ^


### Variable length arrays

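The C99 standard introduced *variable length arrays*, which allow us to size an array using a value that is known only at run time.  As with static arrays, the size cannot change once the array has been declared.  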

In the following, we set the length of the array to a value input by the user on the command line.  We can read an integer from the command line as follows. 

    int n;
    if (argc == 2)
        n = atoi(argv[1]);
    else
        n = 5;    

The function `atoi` ("ASCII-to-integer") converts a string of digits to an integer.  Here, we use it to convert an input argument such as `"8"` or `"21"` to an integer value.  The `atoi` function is declared in the header `stdlib.h`.  

We can then call the executable with a *command line argument*. 

     ./c_demo_10 8
     
We will use this integer value to create an array of length 8. 

     double x[n];
          
          

In [1]:
%%file c_demo_10.c

#include <stdio.h>
#include <stdlib.h>   // contains header for atoi

int main(int argc, char** argv)
{
    int n;
    if (argc == 2)
        n = atoi(argv[1]);
    else
        n = 5;    
    
    double x[n];
    
    for(int i = 0; i < n; i++)
        x[i] = i*i;
    
    for(int i = 0; i < n; i++)
        printf("x[%d]  %6.1f\n",i,x[i]);
        
    return 0;
}

Writing c_demo_10.c


In [4]:
%%bash 

rm -rf c_demo_10

gcc -o c_demo_10 c_demo_10.c

./c_demo_10 12

x[0]     0.0
x[1]     1.0
x[2]     4.0
x[3]     9.0
x[4]    16.0
x[5]    25.0
x[6]    36.0
x[7]    49.0
x[8]    64.0
x[9]    81.0
x[10]   100.0
x[11]   121.0


### Dynamically allocated arrays

<hr style="border:2px black solid"></hr>

We can also allocate memory *dynamically*.  This is more flexible than the statically defined arrays, since we can allocate memory at run time, and free it when we no longer need the memory. 

To dynamically allocate memory, we use the library routine `malloc`.  We specify the size of the dynamically allocated array in *bytes*.  To determine the number of bytes needed, we use the `sizeof` operator.  

      int n = 5;
      size_t bytes = n*sizeof(double);

To allocate the array, we make a call to `malloc` ("memory allocate").  In C, the `void*` returned by `malloc` converts to any pointer type automatically, but a C++ compiler such as `g++` requires an explicit "cast" to the proper pointer type, so it is a good idea to include the cast if the code may be compiled both ways. 

       double *x = (double*) malloc(bytes);

We use the dynamically allocated array in exactly the same way as the statically allocated array. 

#### Comments 

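* The values stored in the memory allocated by `malloc` are uninitialized. 

* To avoid memory leaks, we must remember to `free` the memory we allocated using `malloc`. 

* The header for `malloc` and `free` is `stdlib.h`.  `sizeof` is an operator built into the language and needs no header; the type `size_t` that it produces is defined in `stddef.h` and is also made available by `stdio.h` and `stdlib.h`. 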


In [5]:
%%file c_demo_10.c

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char** argv)
{
    int n = 5;
    
    size_t bytes = n*sizeof(double);
    
    double *x = (double*) malloc(bytes);  // cast when using with g++
    
    for(int i = 0; i < n; i++)
        x[i] = i*i;
    
    for(int i = 0; i < n; i++)
        printf("x[%d]  %6.0f\n",i,x[i]);
        
    free(x);
    return 0;
}

Overwriting c_demo_10.c


In [6]:
%%bash

rm -rf c_demo_10

gcc -o c_demo_10 c_demo_10.c

./c_demo_10

x[0]       0
x[1]       1
x[2]       4
x[3]       9
x[4]      16


<hr style="border: 3px solid coral"></hr>

##  Scope

<hr style="border: 3px solid coral"></hr>

In C, variables, functions and so on can go in and out of "scope".  We have access to variables that are "in scope", but do not have access to variables that are "out of scope". 

Scope is typically delineated by braces `{}`.  As a simple example, consider the following code. 

    #include <stdio.h>
    int main(int argc, char** argv)
    {
        {
            double x = 3.5;
        }
        printf("x = %f\n",x);
    }
    
If we were to try to compile this code, we would get an error :     

    c_demo_11.c: In function 'main':
    c_demo_11.c:9:23: error: 'x' undeclared (first use in this function)
        9 |     printf("x = %f\n",x);
          |                       ^
    c_demo_11.c:9:23: note: each undeclared identifier is reported only once for each function it appears in
    
The code fails to compile since the variable `x` is no longer "in scope" when we refer to it in the print statement.   We also say that "x has gone out of scope".  

### Example

<hr style="border: 2px solid black"></hr>


What does the following example do? 

In [7]:
%%file c_demo_12.c

#include <stdio.h>

int main(int argc, char** argv)
{
    double x = 1.2;
    {
        double x = 3.5;
    }
    printf("x = %f\n",x);
}

Writing c_demo_12.c


What value for `x` gets printed out? 

In [8]:
%%bash

rm -rf c_demo_12

gcc -o c_demo_12 c_demo_12.c

./c_demo_12

x = 1.200000


The variable `x` declared in the inner `{}` is only "local" in scope.  Even though it has the same name as the previous `x`, it is a different variable.  Assigning it a value does not affect the variable `x` in the outer scope. 

### Example

<hr style="border: 2px solid black"></hr>

Here is an example using pointer variables to try to keep track of the variable `x` with local scope. 

In [9]:
%%file c_demo_13.c

#include <stdio.h>

int main(int argc, char** argv)
{
    double *y;
    {
        double x;
        y = &x;
        
        x = 1.2;
    }    
    printf("*y = %24.16f\n",*y);        
}

Writing c_demo_13.c


In [10]:
%%bash

rm -rf c_demo_13

gcc -o c_demo_13 c_demo_13.c

./c_demo_13

*y =       1.2000000000000000


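Is this what we expect to happen?  The value happens to survive here, but `x` is already out of scope when `*y` is read, so the C standard leaves the result undefined and we should not rely on it. 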

### Scope and arrays

<hr style="border: 2px solid coral"></hr>



For statically defined arrays, memory is automatically released as soon as the variable goes out of scope, so a pointer to that memory should no longer be used.  Consider the following code.

In [11]:
%%file c_demo_14.c

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char** argv)
{
    int n = 5;
    double *y;
    {
        double x[n];
        y = x;
    
        for(int i = 0; i < n; i++)
            x[i] = i;      
    }
    
    for(int i = 0; i < n; i++)
        printf("x[%d]  %6.1f\n",i,y[i]);
        
    return 0;
}

Writing c_demo_14.c


In [12]:
%%bash 

rm -rf c_demo_14

gcc -o c_demo_14 c_demo_14.c

./c_demo_14

x[0]     0.0
x[1]     0.0
x[2]     0.0
x[3]  -5870777135244153110549465978205803155262591176789432355731265330666571538705630875655733493772579609446670204235844179788559322596826677768093696.0
x[4]     0.0


### Example

<hr style="border: 2px solid black"></hr>


The following example, on the other hand, will work. (Why?)

In [13]:
%%file c_demo_15.c

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char** argv)
{
    int n = 5;
    double *y;
    {
        double *x = (double*) malloc(n*sizeof(double));
        y = x;
    
        for(int i = 0; i < n; i++)
            x[i] = i;      
    }
    
    for(int i = 0; i < n; i++)
        printf("y[%d]  %6.1f\n",i,y[i]);
        
    free(y);
    return 0;
}

Writing c_demo_15.c


In [14]:
%%bash 

rm -rf c_demo_15

gcc -o c_demo_15 c_demo_15.c

./c_demo_15

y[0]     0.0
y[1]     1.0
y[2]     2.0
y[3]     3.0
y[4]     4.0


The difference is that memory allocated by `malloc` will not be automatically deleted, and so is accessible even if the original pointer used to reference the memory location goes out of scope.  


#### Question

* Can you see why the above might lead to memory leaks? 