# Lecture 3 : Introduction to C Programming

## Here are some major reasons for learning C (and why we use C in CMDA 3634):

* ### C code runs much faster than code written in high level languages such as Python and R.

* ### C code runs faster than Java code although the speedups are not as extreme as when comparing C to Python.

* ### Java and C share a large amount of syntax (Java's syntax was largely derived from C and C++).

* ### It is good to start with C if you want to learn more advanced programming languages such as C++ and C#.

* ### Python and C can be combined to get the performance of C along with the high level programming of Python.  

* ### Supercomputers are usually programmed in C++, C, or Fortran (with extensions to handle parallel execution).

* ### Most popular parallel computing libraries such as OpenMP, MPI, and CUDA work best with C++, C, or Fortran.

* ### TIOBE rankings of the most popular programming languages https://www.tiobe.com/tiobe-index/

# Part 1 : Our First C Program

## We start by creating a C program to print *Hello World!*.

In [1]:
%%writefile hello.c
#include <stdio.h>

int main () {
    printf ("Hello World!\n");

    /* program completed successfully */
    // this return statement is optional in C99
    return 0;
}

Overwriting hello.c


## Notes:
* ### Line 1 is the magic command that creates the source file *hello.c*.  This line is only needed if you want to write C source code within a Jupyter notebook.  
* ### Line 2 instructs the C preprocessor to include the **header file** that includes the interfaces to the standard input/output functions such as *printf*.  
* ### Lines 4-10 are the main function that is run when the compiled program is executed.  Every C program **must** have a main function.  
* ### Line 5 prints the message using the *printf* function which stands for **print formatted**.  The \n at the end of the "Hello World!\n" **string literal** (a sequence of characters or escape sequences enclosed in double quotation mark symbols) stands for **new line**.
* ### Lines 7-8 are C comments.  The C++/Java style comment syntax used in line 8 is valid in C99 and later.
* ### Line 9 returns 0 which indicates that the program completed successfully.  We can use, for example, *return 1*, to indicate that the program encountered an error and did not complete successfully.  In C99 and later, this statement is optional: if execution reaches the end of the main function without a return statement, the return value is automatically 0.  
* ### **Going forward we will omit the *return 0* statement at the end of main functions for brevity.**

## The file *hello.c* is called a C **source file**.  It must be compiled into a C **program** using a C **compiler**.

In [2]:
!gcc -o hello hello.c

## Notes:

* ### We use the **gcc** (GNU Compiler Collection) compiler.  

* ### Linux commands such as *gcc* given inside of a Jupyter notebook have to be preceded by the ! symbol.

* ### The *-o hello* part of the compilation command names the program.  If this part is not included, the program will be called *a.out*.


## Finally, we can run the *hello* program created by the compiler from the C source code hello.c using the command *!./hello*

In [3]:
!./hello

Hello World!


# Part 2 : Determining if a number is prime in C


## One way to determine if an integer $n$ is prime is to check all integers $d$ between 2 and $\sqrt{n}$ to see if any are a factor of $n$.

## Why do we not have to look for factors $d$ larger than $\sqrt{n}$?

## Fortunately, we can implement this algorithm without explicitly calculating $\sqrt{n}$.



## Here is our first attempt at a primality test in C.

In [4]:
%%writefile prime_v1.c
#include <stdio.h>

int main () {
    int n = 1234567;
    for (int d = 2; d*d <= n; d++) {
        if (n % d == 0) {
            printf ("The number %d is not prime since %d divides it.\n",n,d);
            return 0;
        }
    }
    printf ("The number %d is prime.\n",n);
}

Overwriting prime_v1.c


## Notes:

* ### Lines 6-11 contain a C for loop.  The variable *d* is called a loop counter.  For loops in C have the same syntax and behavior as for loops in Java.  Note that by ending the loop when $d^2 > n$ we avoid computing $\sqrt{n}$.
* ### Lines 7-10 contain a C if statement.  If statements have the same syntax and behavior as if statements in Java.  
* ### In line 7 we check to see if d divides n by using the C remainder (mod) operator *%*.  
* ### **Note that we use == to check for equality rather than =**
* ### Line 8 uses the *printf* function to print that n is not prime.  Note that *%d* is the C format specifier for **int**.  Also note that we can use printf with multiple format specifiers and arguments.
* ### In line 9 we use *return 0* to exit the main function with a successful termination.  
* ### If we make it to line 12, then $n$ is prime.

In [5]:
!gcc -o prime_v1 prime_v1.c

In [6]:
!./prime_v1

The number 1234567 is not prime since 127 divides it.


## Exercise 1

* ### Recompile and run the program with $n=161218349$.  What do you observe?

* ### It is known that the number $n=5261656080911617$ is prime.  Recompile and run the program with this value of $n$.  What do you observe?

# Part 3 : Handling Large Integers

## On most modern platforms a C int has 32 bits of storage (a Java int always has 32 bits).

* ### One of the 32 bits is a sign bit.
* ### A 32-bit int has a range of $-2^{31}$ to $2^{31}-1$ or $-2147483648$ to $2147483647$.

## A C long long has at least 64 bits of storage (a Java long always has 64 bits).  

* ### One of the 64 bits is a sign bit.  
* ### A 64-bit long long has a range of $-2^{63}$ to $2^{63}-1$ or $-9223372036854775808$ to $9223372036854775807$.

## Here is a modification of our primality tester that handles larger $n$.

In [7]:
%%writefile bigprime.c
#include <stdio.h>

int main () {
    long long n = 5261656080911617;
    for (long long d = 2; d*d <= n; d++) {
        if (n % d == 0) {
            printf ("The number %lld is not prime since %lld divides it.\n",n,d);
            return 0;
        }
    }
    printf ("The number %lld is prime.\n",n);
}

Overwriting bigprime.c


## Notes:
* ### On lines 8 and 12 we use the format specifier *%lld* for variables of type *long long*.

In [8]:
!gcc -o bigprime bigprime.c

In [9]:
!./bigprime

The number 5261656080911617 is prime.


# Part 4 : Command Line Arguments

## Command line arguments allow us to alter the behavior of our program at runtime.  

## Here is a C program that prints out its command line arguments (one per line).  

In [10]:
%%writefile args.c
#include <stdio.h>

int main (int argc, char* argv[]) {
    for (int i=0;i<argc;i++) {
        printf ("%s\n",argv[i]);
    }
}

Overwriting args.c


## Notes:
* ### Line 4 includes the optional parameters *argc* and *argv*.  The variable *argc* tells us the number of command line arguments (counting the command name itself) and *argv* is an array of pointers to the command line arguments.  We will discuss arrays and pointers in detail later.
* ### Lines 5-7 contain a C for loop.  Note that C (like Java and Python) uses zero-based indexing, which is why we start the loop counter at 0 and go up to argc-1.  
* ### Line 6 uses the *printf* function to print one command line argument per line.  Note that *%s* is the C format specifier for **string**.  Unlike Java, C does not have a built in String datatype.  In C, strings are null-terminated arrays of characters.  We will discuss strings in detail later.


In [11]:
!gcc -o args args.c
!./args abc 123 hello world!

./args
abc
123
hello
world!


## Note that argv[0] is just the name of the C command *./args*.
## Thus the actual command line arguments are *argv[1]*, *argv[2]*, etc.

## Next let's look at a C program to print a personalized Hello message.

In [12]:
%%writefile greet.c
#include <stdio.h>

int main (int argc, char* argv[]) {
    printf ("Hello %s!  How are you?\n",argv[1]);
}

Overwriting greet.c


In [13]:
!gcc -o greet greet.c
!./greet Jason

Hello Jason!  How are you?


In [14]:
!./greet

Hello (null)!  How are you?


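## Note that running the command without a command line argument gives a strange result.  The C standard guarantees that *argv[argc]* is a null pointer, so here *argv[1]* is null, and passing a null pointer to *%s* is undefined behavior (this C library happens to print *(null)*).  No runtime error was given!  

## More generally, C does **not** do array bounds checking.  Had we read *argv[2]* instead, we would have gone off the end of the *argv* array, again with no runtime error.  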

## **Reading or writing past the end (or beginning) of an array in C will not produce a runtime error but will likely produce unexpected results.**

## It is important to provide error checking in your code where reading/writing past the end of an array is possible.  One simple way of handling an error is to *return 1* from main which will terminate the program with an abnormal execution status.  

## Here is a version of the code with error checking.  Note that if an error is encountered we provide instructions on how to correctly use the command and abnormally terminate the program.

In [15]:
%%writefile greet.c
#include <stdio.h>

int main (int argc, char* argv[]) {
    if (argc < 2) {
        printf ("command usage: %s %s\n",argv[0],"name");
        return 1; // abnormal exit
    }
    printf ("Hello %s!  How are you?\n",argv[1]);
}

Overwriting greet.c


In [16]:
!gcc -o greet greet.c
!./greet Jason

Hello Jason!  How are you?

In [17]:
!./greet

command usage: ./greet name


## If no command line arguments are provided, the program terminates with instructions on how to use the program rather than attempt to print a greeting.

In [18]:
!./greet Jason Wilson

Hello Jason!  How are you?

## Extra command line arguments are ignored by the program.

# Part 5 : Primality Test Revisited

## Let's revise our primality test to specify $n$ using a command line argument.

In [19]:
%%writefile prime.c
#include <stdio.h>
#include <stdlib.h>

int main (int argc, char* argv[]) {
    if (argc < 2) {
        printf ("command usage: %s %s\n",argv[0],"n");
        return 1; // abnormal exit
    }
    long long n = atoll(argv[1]);
    for (long long d = 2; d*d <= n; d++) {
        if (n % d == 0) {
            printf ("The number %lld is not prime since %lld divides it.\n",n,d);
            return 0;
        }
    }
    printf ("The number %lld is prime.\n",n);
}

Overwriting prime.c


## Notes:

* ### On line 3 we include *stdlib.h*, which declares a number of C standard library functions including the function *atoll* that we are using on line 10.
* ### On line 10 we use the function *atoll* to convert the first command line argument string into a C *long long*.  Other useful conversion functions are *atoi* which converts a string into a C *int* and *atof* which converts a string into a C *double* (a C *double* is a 64-bit double precision floating point number).

In [20]:
!gcc -o prime prime.c

In [21]:
!./prime 5261656080911617

The number 5261656080911617 is prime.


In [22]:
!./prime 729476671297368179

The number 729476671297368179 is prime.


## It is known that 10918483718784063109 is prime.  

## Let's run our primality tester with this very large input.

In [23]:
!./prime 10918483718784063109

The number 9223372036854775807 is not prime since 7 divides it.


## What do you observe? Explain why this happened.  What does this tell us about our program?

## Exercise 2

* ### How long does it take our primality tester to check if 729476671297368179 is prime?
* ### Do we have to check to see if all even numbers below $\sqrt{n}$ divide $n$?  Explain.
* ### Suggest an improvement to our primality tester.  
* ### Write a primality tester with source code fastprime.c that incorporates your improvement.  
* ### How does your fast primality tester handle even numbers?
* ### How long does it take your fast primality tester to check if 729476671297368179 is prime?
* ### What is the speedup when using your fast primality tester for this example?