# Phase 1. Foundations: Concurrency, The GIL, and Threads

In [2]:
import sys

sys.version

'3.14.0 free-threading build (main, Nov  1 2025, 00:37:47) [GCC 14.2.0]'

# <b>1.1 Comparing Sequential vs Concurrent</b>

## pthreads - C example

### `sleep` example

Let's write a C program that creates two threads. One will print a series of letters, and the other will print a series of numbers. Because of concurrent execution, the output will be interleaved in a non-deterministic way, demonstrating the core concept.

**To Compile and Run:**

You must link the `pthread` library when compiling.

```bash
gcc -o concurrent concurrent.c -lpthread
./concurrent
```

When you run this, you will *not* see all letters then all numbers, or vice versa. You will see a mixed output like this:

```
Main: Starting threads...
Main: Threads started. Waiting for them to finish...
Number: 0
Letter: a
Number: 1
Letter: b
...
```

The exact order can change from run to run (for example, where the main thread's second message lands relative to the first `Number` line). This is the non-deterministic scheduling of the operating system in action.

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h> // for sleep()

// A simple function that will be run in a thread.
// It takes a void* argument and returns a void*.
// This is the required signature for a thread function.
void* print_numbers(void* arg);
void* print_letters(void* arg);

int main(void)
{
    pthread_t thread1, thread2; // These are handles for our threads, like file descriptors.

    printf("Main: Starting threads...\n");

    // Create the first thread. It will run the print_numbers function.
    if (pthread_create(&thread1, NULL, print_numbers, NULL) != 0) {
        perror("Failed to create thread1");
        return 1;
    }

    // Create the second thread. It will run the print_letters function.
    if (pthread_create(&thread2, NULL, print_letters, NULL) != 0) {
        perror("Failed to create thread2");
        return 1;
    }
    
    printf("Main: Threads started. Waiting for them to finish...\n");

    // pthread_join is crucial. It makes the main thread WAIT for the other threads to finish.
    // If we didn't do this, main might exit immediately, killing the child threads.
    pthread_join(thread1, NULL);
    pthread_join(thread2, NULL);

    printf("Main: Both threads have finished.\n");
    return 0;
}

void* print_numbers(void* arg)
{
    for (int i = 0; i < 5; i++) {
        printf("Number: %d\n", i);
        sleep(1); // 1 sec
    }
    return NULL;
}
void* print_letters(void* arg)
{
    for (char c = 'a'; c <= 'e'; c++) {
        printf("Letter: %c\n", c);
        sleep(1);
    }
    return NULL;
}
```

In [22]:
!gcc -o concurrent_c concurrent.c -lpthread
!./concurrent_c

Main: Starting threads...
Main: Threads started. Waiting for them to finish...
Number: 0
Letter: a
Number: 1
Letter: b
Number: 2
Letter: c
Number: 3
Letter: d
Number: 4
Letter: e
Main: Both threads have finished.


In [23]:
!./concurrent_c

Main: Starting threads...
Number: 0
Main: Threads started. Waiting for them to finish...
Letter: a
Number: 1
Letter: b
Number: 2
Letter: c
Number: 3
Letter: d
Number: 4
Letter: e
Main: Both threads have finished.


**Key Concepts Illustrated:**

1.  **`pthread_t`**: A type for a thread "handle". It's how we reference the thread.
2.  **`pthread_create`**: The POSIX library function that creates a new thread (on Linux it is built on top of the `clone` system call). It takes:
    *   A pointer to a `pthread_t` to store the handle.
    *   Thread attributes (we use `NULL` for defaults).
    *   The *function pointer* to the routine the thread will execute.
    *   A *single argument* to pass to that function (we use `NULL` for now).
3.  **Thread Function**: Must be of the form `void* function_name(void* arg)`. This is the independent path of execution.
4.  **`pthread_join`**: This is how one thread waits for another to terminate. The `main` function blocks here until `thread1` and `thread2` are done. This is vital for coordination. The version below comments out the two join calls to show what happens without them:

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h> // for sleep()

// A simple function that will be run in a thread.
// It takes a void* argument and returns a void*.
// This is the required signature for a thread function.
void* print_numbers(void* arg);
void* print_letters(void* arg);

int main(void)
{
    pthread_t thread1, thread2; // These are handles for our threads, like file descriptors.

    printf("Main: Starting threads...\n");

    // Create the first thread. It will run the print_numbers function.
    if (pthread_create(&thread1, NULL, print_numbers, NULL) != 0) {
        perror("Failed to create thread1");
        return 1;
    }

    // Create the second thread. It will run the print_letters function.
    if (pthread_create(&thread2, NULL, print_letters, NULL) != 0) {
        perror("Failed to create thread2");
        return 1;
    }

    printf("Main: Threads started. Waiting for them to finish...\n");

    // pthread_join is crucial. It makes the main thread WAIT for the other threads to finish.
    // If we didn't do this, main might exit immediately, killing the child threads.
    // pthread_join(thread1, NULL);
    // pthread_join(thread2, NULL);

    printf("Main: Both threads have finished.\n");
    return 0;
}

void* print_numbers(void* arg)
{
    for (int i = 0; i < 5; i++) {
        printf("Number: %d\n", i);
        sleep(1); // 1 sec
    }
    return NULL;
}
void* print_letters(void* arg)
{
    for (char c = 'a'; c <= 'e'; c++) {
        printf("Letter: %c\n", c);
        sleep(1);
    }
    return NULL;
}
```

In [24]:
!gcc -o concurrent_no_join concurrent_no_join.c -lpthread
!./concurrent_no_join

Main: Starting threads...
Number: 0
Main: Threads started. Waiting for them to finish...
Main: Both threads have finished.


In [27]:
!./concurrent_no_join

Main: Starting threads...
Number: 0
Main: Threads started. Waiting for them to finish...
Letter: a
Main: Both threads have finished.


### A Production-Level Test

We need to introduce more chaos. We'll remove the artificial synchronization of `sleep` and make the threads do *real, variable-length work* that the scheduler can't predict.

Here is a modified version of the C program. Let's call it `concurrent_chaos.c`.

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

// A function to simulate some "work" (a CPU-bound calculation)
// The amount of work is variable to upset any rhythm.
void do_work(int iterations);
void* print_numbers(void* arg);
void* print_letters(void* arg);

int main(void)
{
    pthread_t thread1, thread2;

    printf("Main: Starting threads...\n");

    if (pthread_create(&thread1, NULL, print_numbers, (void*)1) != 0) {
        perror("Failed to create thread1");
        return 1;
    }

    if (pthread_create(&thread2, NULL, print_letters, (void*)2) != 0) {
        perror("Failed to create thread2");
        return 1;
    }

    printf("Main: Threads started. Waiting...\n");
    pthread_join(thread1, NULL);
    pthread_join(thread2, NULL);

    printf("Main: Both threads have finished.\n");
    return 0;
}

void do_work(int iterations)
{
    volatile double x = 1.0; // 'volatile' prevents compiler from optimizing the loop away

    for (int i = 0; i < iterations; i++)
        x = x * 1.01;
}

void* print_numbers(void* arg)
{
    int work_base = 1000000;
    // Per-thread seed. rand_r() keeps its state in this local variable,
    // whereas rand()/srand() share one hidden state across all threads.
    unsigned int seed = (unsigned int)time(NULL) ^ (unsigned int)(long)arg;

    for (int i = 0; i < 5; i++) {
        printf("Number: %d\n", i);
        int iterations = work_base + (rand_r(&seed) % 1000000);
        do_work(iterations);
    }
    return NULL;
}

void* print_letters(void* arg)
{
    // Let's use a variable amount of work for each step
    int work_base = 1000000;
    // arg differs per thread (1 vs 2), so this seed differs from print_numbers'.
    unsigned int seed = (unsigned int)time(NULL) ^ (unsigned int)(long)arg;

    for (char c = 'a'; c <= 'e'; c++) {
        printf("Letter: %c\n", c);
        // Do a random amount of work between steps
        int iterations = work_base + (rand_r(&seed) % 1000000);
        do_work(iterations);
    }
    return NULL;
}
```

In [28]:
!gcc -o concurrent_chaos concurrent_chaos.c -lpthread
!./concurrent_chaos

Main: Starting threads...
Main: Threads started. Waiting...
Number: 0
Letter: a
Number: 1
Letter: b
Letter: c
Number: 2
Letter: d
Number: 3
Letter: e
Number: 4
Main: Both threads have finished.


In [29]:
!./concurrent_chaos

Main: Starting threads...
Number: 0
Letter: a
Main: Threads started. Waiting...
Number: 1
Letter: b
Number: 2
Letter: c
Number: 3
Letter: d
Number: 4
Letter: e
Main: Both threads have finished.


**Key Changes:**
*   **Removed `sleep`:** We replaced it with a CPU-bound calculation loop (`do_work`).
*   **Introduced Randomness:** Each thread now does a variable, unpredictable amount of work between print statements. This destroys the "turn-taking" rhythm.
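*   **Unique Random Seeds:** Each thread receives a distinct argument that is mixed into its seed, so that the two threads draw different random sequences. This only holds if each thread has its own generator state (for example POSIX `rand_r` with a thread-local seed); plain `rand()` keeps a single state shared by the whole process.

With the rhythm gone, the interleaving depends entirely on how the OS scheduler happens to run the two threads, so it differs from run to run. This is the true nature of concurrent scheduling.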

## Python example

Let's now bridge this directly to Python. On POSIX systems, the `threading` module provides an object-oriented interface to these same OS-level pthreads.

Here is the Python equivalent of the first, more predictable C program.

Run this Python code. You will see the same kind of interleaved output. The structure is a direct mapping:
*   `pthread_create` -> `threading.Thread()` + `.start()`
*   `pthread_join` -> `.join()`

In [10]:
import time


def print_numbers(n):
    """Print numbers."""
    for i in range(n):
        time.sleep(1)
        print(f"Number: {i}")


def print_letters(string):
    """Print letters."""
    for letter in string:
        time.sleep(1)
        print(f"Letter: {letter}")


### Sequential

In [12]:
import time

import utils

def main():
    """Run code."""
    start = time.perf_counter()

    utils.print_numbers(5)
    utils.print_letters("abcde")

    end = time.perf_counter()

    print(f"Time: {round(end - start, 2)}")


if __name__ == "__main__":
    main()


Number: 0
Number: 1
Number: 2
Number: 3
Number: 4
Letter: a
Letter: b
Letter: c
Letter: d
Letter: e
Time: 10.0


### Concurrent

In [18]:
import threading
import time

import utils


def main():
    """Run code."""
    print("Main: Starting threads...")

    start = time.perf_counter()

    t1 = threading.Thread(target=utils.print_numbers, args=(5,))
    t2 = threading.Thread(target=utils.print_letters, args=("abcde",))

    t1.start()
    t2.start()

    print("Main: Threads started. Waiting for them to finish...")

    t1.join()
    t2.join()

    end = time.perf_counter()

    print("Main: Both threads have finished.")
    print(f"Time: {round(end - start, 2)}")


if __name__ == "__main__":
    main()


Main: Starting threads...
Main: Threads started. Waiting for them to finish...
Number: 0Letter: a

Letter: bNumber: 1

Number: 2
Letter: c
Number: 3
Letter: d
Number: 4Letter: e

Main: Both threads have finished.
Time: 5.01


In [20]:
!python3.14 concurrent.py

Main: Starting threads...
Main: Threads started. Waiting for them to finish...
Number: 0
Letter: a
Number: 1
Letter: b
Number: 2
Letter: c
Number: 3
Letter: d
Number: 4
Letter: e
Main: Both threads have finished.
Time: 5.01
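
The mapping from `pthread_join` to `.join()` is not exact in one respect. In the C program without joins, returning from `main` ended the process and cut the threads off. In CPython, the interpreter waits for ordinary (non-daemon) threads at exit even if you never call `.join()`; only threads created with `daemon=True` are abandoned the way the C threads were. The following is a minimal sketch of that difference, not part of the original notebook. It runs a small child script in a fresh interpreter, and the script text, the `run_child` helper, and the `"daemon"` command-line flag are all illustrative assumptions.

```python
import subprocess
import sys
import textwrap

# Hypothetical child script: starts one thread and lets the main thread
# finish WITHOUT calling join() on it.
CHILD = textwrap.dedent("""
    import sys
    import threading
    import time

    def worker():
        time.sleep(0.3)
        print("worker: finished", flush=True)

    is_daemon = sys.argv[1] == "daemon"
    threading.Thread(target=worker, daemon=is_daemon).start()
    print("main: returning without join", flush=True)
""")


def run_child(mode):
    """Run the child script in a fresh interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD, mode],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Non-daemon thread: the interpreter waits for it at exit.
non_daemon_out = run_child("non-daemon")
# Daemon thread: the process exits while the thread is still sleeping.
daemon_out = run_child("daemon")

print("non-daemon run:\n" + non_daemon_out)
print("daemon run:\n" + daemon_out)
```

Calling `.join()` explicitly is still the clearer choice, because it states where the main thread needs the workers' results rather than relying on interpreter shutdown.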


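Now, for the crucial production-level insight. In Python 3.14t with free-threading, these `print_letters` and `print_numbers` threads can run on *different CPU cores simultaneously*. The `print` function, however, involves a shared resource: the standard output (`stdout`). In practice each individual write to `stdout` arrives intact, so you don't get character-level garbling like "LNumettbeer: r A: 0". That protection covers a single write, not a whole `print` call: `print` sends the message and the trailing newline as separate writes, which is why two lines occasionally run together in the notebook output above (`Number: 0Letter: a` followed by an empty line). This is an example of the interpreter handling one class of thread-safety for you, but not all of it.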
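
If whole lines must never merge, one common approach is to make each line a single write performed under a lock. The sketch below is an illustration under stated assumptions rather than code from the notebook: `print_lock`, `safe_print`, and `worker` are hypothetical names, and an `io.StringIO` buffer stands in for `sys.stdout` so the result can be inspected afterwards.

```python
import io
import threading

# Hypothetical module-level lock shared by every thread that prints.
print_lock = threading.Lock()


def safe_print(message, stream):
    """Write the message and its newline as one unit while holding the lock."""
    with print_lock:
        stream.write(message + "\n")


def worker(label, count, stream):
    """Emit `count` labelled lines through safe_print."""
    for i in range(count):
        safe_print(f"{label}: {i}", stream)


# StringIO stands in for sys.stdout so the lines can be checked afterwards.
buffer = io.StringIO()

threads = [
    threading.Thread(target=worker, args=(label, 200, buffer))
    for label in ("Number", "Letter")
]
for t in threads:
    t.start()
for t in threads:
    t.join()

lines = buffer.getvalue().splitlines()
print(f"{len(lines)} lines written, none merged")
```

Passing `sys.stdout` instead of `buffer` gives the same guarantee on the terminal, as long as every thread goes through `safe_print`.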

## My turn

Your task is to write the Python equivalent of our chaotic C program. Take the Python code and modify it. Replace `time.sleep` with a CPU-bound calculation (like calculating a Fibonacci number or doing many math operations) to see the true concurrency.

Your goal is to create a file, `concurrent_chaos.py`, where:

1.  Two threads run concurrently: one prints letters (A-Z), the other prints numbers (0-25).
2.  You replace `time.sleep` with a CPU-intensive task (e.g., a loop that calculates a sum, a factorial, or simulates work like we did with the `volatile double` in C).
3.  You introduce **variability** in the amount of work done between prints, so the thread scheduling becomes truly non-deterministic across multiple runs.
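
As a possible starting point for step 2 and step 3 only, here is a hedged sketch of the two building blocks, leaving the threading itself to the exercise. The names `do_work` and `random_iterations`, the base and spread of 100,000 iterations, and the multiplier are illustrative assumptions chosen to mirror the C version, not a prescribed solution. Each thread would own its own `random.Random` instance so the threads do not share generator state.

```python
import random
import time


def do_work(iterations):
    """Burn CPU with a floating-point loop and return the result."""
    x = 1.0
    for _ in range(iterations):
        x *= 1.0000001  # small factor so x stays finite
    return x


def random_iterations(rng, base=100_000, spread=100_000):
    """Pick a variable amount of work: base plus a random extra below spread."""
    return base + rng.randrange(spread)


# One generator per thread (seeded differently) avoids shared random state.
rng = random.Random(1)

start = time.perf_counter()
for step in range(3):
    n = random_iterations(rng)
    do_work(n)
    print(f"step {step}: {n} iterations")
elapsed = time.perf_counter() - start

print(f"Time: {round(elapsed, 2)}")
```

In the exercise, each thread's loop would call `do_work(random_iterations(rng))` between prints in place of `time.sleep(1)`.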