# The Basics of Programming in C
## *A Living Tutorial With An Eye Towards Scientific Computing*

Written by Adam George Morgan

*Last Updated May 9, 2025*

The C programming language continues to be an important part of modern consumer and scientific software. The purpose of this notebook is to provide a "one-stop-shop" for the essential information needed to write and understand C code. I'm preparing it mostly for self-study, but I hope the content below can help others as well. 

This tutorial unashamedly leans into my own biases: 
1) scientific computing is my primary interest, so I may include or exclude certain features of C that someone interested in commercial software would consider "essential";
2) Python is my mother tongue when it comes to coding, so much of my discussion focuses on the differences between Python and C. If you're unfamiliar with Python, this notebook may not be the right resource for you (admittedly, if you've successfully launched this tutorial notebook and are reading this message, then you probably know enough Python to appreciate the material).

Consider yourself warned!

As my understanding of C in particular and coding in general becomes more refined, I'll return to this document and provide updates. Please get in touch with me (say, by opening up a GitHub issue) if you have any concerns with the material here, or suggestions for major improvements.

This notebook would not even be possible without Brendan Rius' creation of a C kernel in Jupyter: https://github.com/brendan-rius/jupyter-c-kernel. Thanks Brendan for the excellent work! 

### Why Learn C? 

I genuinely love programming in Python, but you can't go your whole life using the same tool for every job. So, what can C do that Python can't? Here are some answers relevant to my own interests. 

1) *C provides a pathway to writing fast code*. Python is not optimally performant: for instance, Python loops are notoriously slow. While NumPy and SciPy provide many wrappers of battle-tested Fortran codes to mitigate some of these performance issues, the time will probably come when you need to write your own low-level code to achieve good performance. In my experience, the two most common languages used in performant scientific computing codes are Fortran and C++. Since C++ arose as an expansion of C, knowing C can only help you get a better grasp on great codes.
2) *C provides a pathway for writing code that runs on faster machines*. Thanks to the power of modern GPUs, high-performance computing (HPC) is no longer the exclusive province of supercomputing clusters. NVIDIA's GPUs can be programmed with the CUDA toolkit which is essentially a C/C++-based interface between CPU and GPU. So, C is helpful for understanding modern HPC (admittedly, Python- and Julia-based versions of CUDA are also supported nowadays).
3) *C appears to be everywhere*. It seems like every big project I work with uses some C here and there, and I would like to know what the hell is going on insofar as is possible.
4) Finally, *the more languages you know, the better you understand coding in general*. 

### Other Helpful Resources

- *Guide to Scientific Computing in C++*, 2nd ed., by Joe Pitt-Francis and Jonathan Whiteley. Springer, 2017.

- *Solving PDEs in C++* by Yair Shapira. SIAM, 2006. 

### Our First C Program

Let's start with a "Hello, World!" program. This will illuminate the basic structure of all C scripts. 

In [1]:
#include <stdio.h>

int main() {
    printf("Hello World!");
    return 0; 
}

Hello World!

Let's dissect this code snippet. 

1) The first line (`#include <stdio.h>`) pulls in a header file (.h file) that declares the standard functions for handling inputs and outputs, including printing. The `io` in the filename is short for "input/output". 

2) You can probably figure out what `printf` does from context (note: it's declared in `stdio.h`), but what is `int main()` doing? The parentheses suggest it's a *function*, but why does it return `0` if the point of the script is just to print "Hello World!"? The answer is that every C script *must* include a `main` function: it is where execution starts. Its integer return value is handed back to the operating system as the program's exit status, and by convention `0` (and **NOT** `0.`! more on this later) signals success. In the same way that `__init__.py` files are integral to Python's directory structure, `int main()` is integral to C's script structure.

3) To be clear: `main()` is indeed a function. Note how the `return` statement is placed inside {braces}. 

4) `int` is short for "integer", and specifies the data type of `main()`'s `return` value. In C, you must declare your variables and their data types.

5) Note how each statement in the code terminates with a semicolon `;`. Deletion of any one of these semicolons will give a compile error. 

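Strictly speaking, `main()` is a little flexible. You can write the keyword `void` in the parameter list to state explicitly that `main` takes no arguments. Separately, since the C99 standard you can leave out the `return` statement in `main`, in which case the program behaves as if it ended with `return 0;`: 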

In [2]:
#include <stdio.h>

int main(void) {
    printf("Hello World!");
}

Hello World!

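Note that `void` in the parentheses means "no arguments"; it is not what lets us skip the `return`. Omitting the `return` is a special allowance for `main` (since C99), which then behaves as if it returned `0`. Below, I'll prefer this approach since it saves us one line of code. 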

Before moving on, I'll discuss a limitation of the Jupyter C notebook. For a typical Python notebook, the notebook is a "workspace" and variables from one block can be called in another. However, in a C notebook each block is treated as an individual script: in particular, it requires a `main()`. 

### Declaring Variables. Data Types. 

In Python, I can instantiate a whole bunch of different sorts of variables without being very careful: I can just type `x = 1` to store the integer `1` in memory, I can type `x = ["32", 32.]` to have a list whose elements contain both string and float data types, and so forth. 

As already mentioned, however, in C you must be more careful. You need to be explicit about data types when you *declare* your variables. Here is an example where we allocate memory for three integers (by declaring them), assign two of them values, and set the third equal to the sum of the first two: 

In [3]:
#include <stdio.h>

int main(void) {
    int a, b, c; /* remember, int = integer */
    a = 1;
    b = 2; 
    c = a + b;
    printf("%d", c); /* %d is a stand-in for an int that comes in the second arg. of printf */
}

3

Here is a similar code where `a`, `b`, and `c` are double-precision floating-point reals: 

In [4]:
#include <stdio.h>

int main(void) {
    double a, b, c; /* double = double-precision floating-point real number */
    a = 1.1;
    b = 2.2; 
    c = a + b;
    printf("%f", c); /* use %f for printing doubles instead of ints */
}

3.300000

Often, we'll also have to work with the type `char`: C has no "basic" `string` type, but `char` is as close as it gets. 

In [42]:
#include <stdio.h>

int main(void) {
    char a, b;
    a = 'a'; /* chars are enclosed by SINGLE quotes ONLY */
    b = 'b'; 
    printf("%c \n", a); /* use %c for printing chars. Also, \n means "new line" */
    printf("%c", b);
}

a 
b

In each of the above examples, values were assigned to `a,b,c` after they were declared. Below, we'll also see that you can assign a value during declaration. 

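In addition to C having fewer built-in data types than Python, it has no option to write custom classes! The closest built-in tool is the `struct`, which bundles several variables into one type but has no methods or inheritance. This is, to my knowledge, a big reason why C++ is more widely used than "base" C in the scientific computing world. 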

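A comment on style: C itself doesn't enforce a naming convention, and both camelCase and snake_case are common in real C code (the standard library, for one, sticks to lowercase names). In this tutorial I'll use camelCase for names of variables, functions, etc. That is, I'll do this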

In [36]:
#include <stdio.h>

int main(void) {
    char myFirstVar, mySecondVar;
    myFirstVar = 'a';
    mySecondVar = 'b'; 
    printf("%c \n", myFirstVar);
    printf("%c", mySecondVar);
}

a 
b

and not this 

In [37]:
#include <stdio.h>

int main(void) {
    char my_first_var, my_second_var;
    my_first_var = 'a';
    my_second_var = 'b'; 
    printf("%c \n", my_first_var);
    printf("%c", my_second_var);
}

a 
b

even though both scripts work and have the same output (I shouldn't have to say that names like `myfirstvar` or `my_First_Var` constitute blasphemy instead of mere style violations). 

### Simple Logical Statements

Here is an example of an `if` statement in C: 

In [18]:
#include <stdio.h>

int main(void) {
    int a = 1, b = 2;
    if (a > 1) {
        printf("I am printing a = %d", a);
    } else {
        printf("I am printing b = %d", b);
    }
}

I am printing b = 2

Remember to close any braces you open! 

We can form more complex logical statements using the "and" operator `&&` and the "or" operator `||`. 

In [22]:
#include <stdio.h>

int main(void) {
    int a = 1, b = 2, c = 3;
    if (a > 1) {
        printf("I am printing a = %d", a);
    } else if (b > 1 && c < 3) {
        printf("I am printing b = %d", b);
    } else if (a > 1 || b == 32 || c > 1) {
        printf("I am printing c = %d", c);
    } else {
        printf("I'm all out of ideas!");
    }
}

I am printing c = 3

C also includes a very handy "ternary operator" `statement ? a : b` which returns `a` if `statement` is true and `b` otherwise.  

In [25]:
#include <stdio.h>

int main(void) {
    int a = 1, b = 2, c;
    c = (a > 1) ? a : b;
    printf("%d", c);
}

2

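**Warning**: C treats `0` as "false" and any nonzero value as "true" (a comparison such as `a > 1` evaluates to the `int` `1` or `0`). C was not originally designed with a boolean data type. If you really want to use boolean data types, put `#include <stdbool.h>` (available since C99) into your script to get `bool`, `true`, and `false`. 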

### Loops 

The only thing to worry about with `for` loops in C is setting up the index properly. Defining a loop index consists of three statements:

1) the index declaration and initialization,
2) the range of values the index can take, and
3) the rule for incrementing the index between passes through the loop.

Here's an example that prints the sum of the first `N = 10` natural numbers (endpoint-inclusive). 

In [32]:
#include <stdio.h>

int main(void) {
    int N = 10, sum = 0;
    for (int k = 1; k <= N; k++) { /* "k++" = shortcut for "k = k + 1" */
        sum += k; /* shortcut for "sum = sum + k" */
    }
    printf("%d", sum);
}

55

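The same result can be obtained with a `while` loop, at the cost of widening the index's scope: `k` must now be declared outside the loop, so it stays visible throughout the rest of `main()` instead of only inside the loop: 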

In [33]:
#include <stdio.h>

int main(void) {
    int N = 10, sum = 0, k = 1;
    while (k <= N) {
        sum += k;
        k++; 
    }
    printf("%d", sum);
}

55

### Functions with Arguments 

So far, we've seen that every C script must include a function `main()`. Defining your own C functions is pretty straightforward for us now that we understand that declarations are imperative: naturally, we must specify the data type of the *inputs* and *outputs* of any function we want to define. Here is an example: 

In [35]:
#include <stdio.h>

int myFunc(int n) {
    /* Takes in an integer "n" and returns "32 x n" */
    return 32 * n;
}

int main(void) {
    printf("%d", myFunc(2));
}

64

I emphasize that the custom function `myFunc` is *global* (it is defined outside of any other function) and appears above `main()`, hence it can be called within `main()`. 

### Worked Example: Checkerboard

*Based on Chapter 1, Problem 5 in Shapira's book* 

Let's put together everything we've learned about logic, loops, and functions to perform a classical coding exercise. The goal is to write a script that prints an 8$\times$8 "checkerboard" where red tiles are represented by the symbol "+" and black tiles are represented by the symbol "-". We assume that the top-left tile on the board is red. 

I've chosen to solve this problem by defining two helper functions that tell us what symbol to place in the i,j-th tile of the board. From there, `main()` loops through each tile and determines its symbol. 

In [41]:
#include <stdio.h>

int evenAndOdd(int i, int j) {
    /* True if first arg is Even and second arg is Odd */
    return i % 2 == 0 && j % 2 == 1; /* a % b = a mod b, as in Python */
}

char boardTile(int i, int j) {
    /* Get the i,j ^th tile in the checkerboard */
    int evenRowOddColumn = evenAndOdd(i, j);
    int oddRowEvenColumn = evenAndOdd(j, i);
    return evenRowOddColumn || oddRowEvenColumn ? '-' : '+'; /* Use single quotes for chars! */
}

int main(void) {
    /* Define number of tiles per axis in the checkerboard */
    const int N = 8; 

    /* Fill up the checkerboard one tile at a time*/
    for (int i = 0; i < N; i++) {
      for (int j = 0; j < N; j++) {
        printf("%c ", boardTile(i, j)); /* note the white space */

        /* Do a linebreak at the end of each row of the board */
        if (j == N - 1) {
           printf("\n");
        }
      }
    }
}

+ - + - + - + - 
- + - + - + - + 
+ - + - + - + - 
- + - + - + - + 
+ - + - + - + - 
- + - + - + - + 
+ - + - + - + - 
- + - + - + - + 


**WARNING: the stuff below is not polished or even carefully explained, and will be updated in the future!**

### Pointers

In [7]:
#include <stdio.h>

int main()
{
    /* Initialize two integer vars */
    int j = -1;
    int k = 32;

    const int j_clone = j;

    /* Swap their values using only pointers */
    int *p_j, *p_k;

    p_j = &j;
    p_k = &k;

    /* Stage 1: Replace j with k */
    *p_j = *p_k;

    /* Stage 2: Replace k with original j-value via j_clone */
    *p_k = j_clone;

    // Check all works as expected.
    printf(" Actual result: j = %d, k = %d \n ", j, k);
    printf("Expected result: j = 32, k = -1");

    return 0;
}

 Actual result: j = 32, k = -1 
 Expected result: j = 32, k = -1

Fun fact: thanks to the `new` keyword, in C++ the above exercise can be done by introducing an auxiliary `int*` instead of the auxiliary `const int` we called `j_clone`. 

### Arrays

In [8]:
#include <stdio.h>

int main() {
    int a[3] = {0, 1, 2};

    for (int k = 0; k < 3; k++) {
        printf("%d \n", a[k]);
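        printf("%p \n", (void *) &a[k]);   /* %p (with a void* cast) is the format for addresses; */
        printf("%p \n", (void *) (a + k)); /* the output below came from %d, hence the decimal form */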
    }

    return 0;
}

0 
2140521264 
2140521264 
1 
2140521268 
2140521268 
2 
2140521272 
2140521272 


In [9]:
#include <stdio.h>

int main() {
    int a[3][3] = {{0, 1, 2}, {3, 4, 5}, {6, 7, 8}};

    for (int j = 0; j < 3; j++) {
        for (int k = 0; k < 3; k++) {
            printf("%d \n", a[j][k]);
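            printf("%p \n", (void *) &a[j][k]);   /* %p (with a void* cast) is the format for addresses; */
            printf("%p \n", (void *) (a[j] + k)); /* the output below came from %d, hence the decimal form */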
        }
    }

    return 0;
}

0 
117417024 
117417024 
1 
117417028 
117417028 
2 
117417032 
117417032 
3 
117417036 
117417036 
4 
117417040 
117417040 
5 
117417044 
117417044 
6 
117417048 
117417048 
7 
117417052 
117417052 
8 
117417056 
117417056 


### Worked Example: Pascal's Triangle (adapted from $\S$ 1.19 in Shapira)

The code below spits out the bottom $n\times n$ slice of Pascal's triangle.  

In [12]:
#include <stdio.h>

int main() {

    const int n = 3;

    int a[n][n];

    /*Fill the left col and bottom row with 1's */
    for (int i=0; i<n; i++) {
      a[0][i] = a[i][0] = 1;
    }

    for (int i=1; i<n; i++) {
      for (int j=1; j<n; j++) {
        a[i][j] = a[i][j-1] + a[i-1][j];
      }
    }

    for (int i=n-1; i>=0; i--) {
      for (int j=0; j<n; j++) {
        printf("%d ", a[i][j]);
        if (j == n-1) {
           printf("\n");
        }
      }
    }
    return 0;
}

1 3 6 
1 2 3 
1 1 1 


### Recursion

In [10]:
#include <stdio.h>

int power(int base, unsigned exp) {
    return exp ? base * power(base, exp - 1) : 1;  
}

int main() {
    printf("2^9 = %d", power(2, 9));
    return 0;
}    

2^9 = 512

Example: Binary Representation

In [11]:
#include <stdio.h>

void printBinary(int n) {
    if (n > 1) { printBinary(n/2); }
    printf("%d", n%2);
}

int main() {
    // printBinary(0);
    // printBinary(1);
    // printBinary(2); 
    printBinary(4);
    //printBinary(32);
    return 0;
}  

100