# Process Creation

Recall: A process is an instance of a running program

Note: a process holds more than the program itself, for example its global variables and open file descriptors

![Alt text](images/image5.png)

Each process has a <mark>Process Control Block (PCB)</mark> (`task_struct` in Linux), which contains all the information about the process:

- Process state
- CPU registers
- Scheduling information
- Memory management information
- I/O status information
- Any other type of accounting information

Each process gets a unique **process ID (pid)** to keep track of it

## Process State Diagram

![Alt text](images/image6.png)

- "running": the process is actually executing on the CPU
- "waiting/ready": the process could run, but at some point the kernel decided to run another process and put this one in "waiting" (*scheduling*)
- "exit": the process terminates
- "block": the process will not be executed even if the CPU is not running any other process.  
Running -> Block happens when a process is waiting for a request to complete (network, or reading/writing a file)
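The transitions described above can be written down as a small lookup function. This is only a sketch: the enum names are hypothetical, and the `CREATED` state and the Block -> Waiting edge are assumed from the usual textbook five-state model rather than taken from these notes.

```c
#include <stdbool.h>

/* Hypothetical names for the states in the diagram. */
enum proc_state { CREATED, WAITING, RUNNING, BLOCKED, TERMINATED };

/* Returns true if a process may move directly from `from` to `to`. */
bool can_transition(enum proc_state from, enum proc_state to) {
    switch (from) {
    case CREATED:    return to == WAITING;   /* assumed: new processes start ready */
    case WAITING:    return to == RUNNING;   /* the scheduler picks this process */
    case RUNNING:    return to == WAITING    /* the scheduler picks another process */
                         || to == BLOCKED    /* waiting for a request (I/O) */
                         || to == TERMINATED; /* exit */
    case BLOCKED:    return to == WAITING;   /* assumed: request done, ready again */
    case TERMINATED: return false;           /* nothing follows exit */
    }
    return false;
}
```

Under these assumptions a blocked process never goes straight back to running: it first becomes ready and waits for the scheduler like everyone else.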

You can read a process's state using the `proc` filesystem

> We will use this in Lab 1, where you read information about the current processes and display it (like the Task Manager app)

- There’s a standard `/proc` directory (on Linux) that represents the kernel’s state
- Every directory that’s a number (process ID) in `/proc` represents a process
- For each process, there’s a file called `status` that contains the state (used for Lab 1)
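As a rough illustration of the points above, the sketch below opens `/proc/<pid>/status` and copies out the text after `State:`. The function name `read_proc_state` and the buffer sizes are made up for this example, and it assumes a Linux system with `/proc` mounted.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Copies the text after "State:" in /proc/<pid>/status into out.
   Returns 0 on success, -1 if the file or the line is missing. */
int read_proc_state(pid_t pid, char *out, size_t out_size) {
    if (out_size == 0) return -1;

    char path[64];
    snprintf(path, sizeof path, "/proc/%d/status", (int)pid);

    FILE *f = fopen(path, "r");
    if (f == NULL) return -1;  /* no such process, or no /proc */

    char line[256];
    int result = -1;
    while (fgets(line, sizeof line, f) != NULL) {
        if (strncmp(line, "State:", 6) == 0) {
            const char *value = line + 6;
            while (*value == ' ' || *value == '\t') value++;  /* skip padding */
            snprintf(out, out_size, "%s", value);
            out[strcspn(out, "\n")] = '\0';                   /* drop newline */
            result = 0;
            break;
        }
    }
    fclose(f);
    return result;
}
```

Calling `read_proc_state(getpid(), buf, sizeof buf)` should give a string starting with `R`, since a process is running while it reads its own status.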

## Creating a process

There are different ways of creating a process:

1. Load the program into memory and create the process control block in one step (Windows)
2. Decompose process creation into more flexible abstractions: clone an existing process, then optionally replace its program (Unix)

## Cloning a process

- Pause the currently running process, and copy its PCB into a new one.  
This reuses all of the information from the process <mark>(all instructions, variables)</mark>

- Distinguish between the two processes with a parent and child relationship.  
They can both execute different parts of the program at the same time

- Either process can then load a new program and set up a new PCB

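    pid_t fork(void)

`fork` returns once in each process, and the return value tells you which one you are in (see example):
- -1: on failure (no child is created)
- 0: in the child process
- \>0: in the parent process (the value is the process ID of the newly created child)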

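After a successful `fork` there are 2 processes running

Note: both see the same variables at the same virtual addresses, but each process has its own separate copy (virtual memory!), so a write in one is not visible in the other

The operating system uses “copy on write” to maximize sharing: parent and child share physical memory until one of them writes to it, and only then is that memory copied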
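To make the "same variable, separate copies" point concrete, here is a minimal sketch in which the child overwrites `x` and the parent checks that its own `x` is untouched. The function name `fork_copies_demo` is hypothetical, and the sketch relies on `waitpid` to collect the child's exit status, which these notes have not introduced.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks, lets the child overwrite its copy of x, and reports both values.
   Returns 0 on success, -1 if fork or waitpid fails. */
int fork_copies_demo(int *parent_x, int *child_x) {
    int x = 42;
    pid_t pid = fork();
    if (pid == -1) return -1;

    if (pid == 0) {
        x = 7;     /* changes only the child's copy */
        _exit(x);  /* hand the child's value back as its exit status */
    }

    /* Parent: wait for the child, then compare the two values. */
    int status;
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status)) return -1;
    *parent_x = x;                   /* the parent's copy was never written */
    *child_x = WEXITSTATUS(status);  /* the value the child stored */
    return 0;
}
```

The exit status is used here only as a convenient way to get one small integer back from the child; it is not a general way to share data between processes.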

#### EXAMPLE:

Save the following as `run.c`:

    #include <errno.h>
    #include <stdio.h>
    #include <sys/types.h>
    #include <unistd.h>

    int main(void) {
      int x = 42;
      pid_t returned_pid = fork();
      // Both processes continue from here. The address is printed with %p:
      // passing &x to %d draws a format warning from the compiler.
      printf("pid returned: %d, Address of x: %p\n", (int)returned_pid, (void *)&x);
      if (returned_pid > 0) {
        // Parent: fork returned the child's pid.
        printf("Parent returned pid: %d\n", (int)returned_pid);
        printf("Parent pid: %d\n", (int)getpid());
        printf("Parent parent pid: %d\n", (int)getppid());
        usleep(1000);
      }
      else if (returned_pid == 0) {
        // Child: fork returned 0.
        printf("Child returned pid: %d\n", (int)returned_pid);
        printf("Child pid: %d\n", (int)getpid());
        printf("Child parent pid: %d\n", (int)getppid());
      }
      else {
        // fork failed: no child was created.
        int err = errno;
        perror("fork failed");
        return err;
      }
      return 0;
    }

Compile and run:

    gcc run.c -g -o run.o
    ./run.o

Sample output (captured from a build that printed the address with `%d`, which is why it appears as a decimal number; with `%p` it is shown in hexadecimal):

    pid returned: 2931, Address of x: 1809706808
    Parent returned pid: 2931
    Parent pid: 2930
    Parent parent pid: 2241
    pid returned: 0, Address of x: 1809706808
    Child returned pid: 0
    Child pid: 2931
    Child parent pid: 2930


Example walkthrough:

Line | Parent process (id = 2930)           | Child process |
:--- | :---                                 | :--- |
0    | int x = 42                           |      |
1    | calls `fork`, creating child process | Gets created (id = 2931) as a copy of the parent's instructions and memory |
1    | returned_pid = 2931                  | returned_pid = 0       |
2    | prints 1809706808                    | prints 1809706808 (the exact same address, but it is a virtual address $\implies$ different physical address) |

<mark>**NOTE:** after line 1, which process runs first? We don't know! The scheduler decides, and it is not always the parent.</mark>

## Replacing a process

`execve` replaces the program running in the calling process (ITSELF, the one that calls `execve`) with another program and resets its data. The process keeps the same pid

API:

    int execve(const char *pathname, char *const argv[], char *const envp[])

- pathname: Full path of the program to load (`execve` does not search `PATH`)

- argv: Array of strings (each a `char *`), terminated by a null pointer  
Represents arguments to the process

- envp: Same format as argv, with each string of the form `NAME=value`  
Represents the environment of the process

- Returns -1 and sets `errno` on failure, **does not return if successful** (the calling program has already been replaced)

Save the following as `run.c`:

    #include <errno.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(int argc, char *argv[]) {
        printf("I'm going to become another process\n");
        char *exec_argv[] = {"ls", NULL};  // argv[0] is the program name
        char *exec_envp[] = {NULL};        // empty environment
        // execve needs a full path; this assumes ls is installed at /bin/ls.
        int exec_return = execve("/bin/ls", exec_argv, exec_envp);
        if (exec_return == -1) {
            // Only reached on failure: save errno before perror can change it.
            exec_return = errno;
            perror("execve failed");
            return exec_return;
        }
        printf("If execve worked, this will never print\n");
        return 0;
    }


### The OS Creates Processes

The operating system has to:
- Maintain process control blocks, including state
- Create new processes
- Load a program, and re-initialize a process with context
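Putting the two system calls together, creating a process that runs a different program is usually a `fork` followed by an `execve` in the child. The sketch below is one way to write that, not a definitive implementation: `run_program` is a hypothetical helper, `waitpid` is not covered in these notes, and the exit code 127 for a failed `execve` is borrowed from shell convention.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs the program at `path` in a child process and returns its exit
   status, or -1 if the child could not be created or did not exit normally. */
int run_program(const char *path, char *const argv[]) {
    char *const envp[] = {NULL};  /* empty environment, as in the example above */

    pid_t pid = fork();           /* clone the current process */
    if (pid == -1) return -1;

    if (pid == 0) {
        execve(path, argv, envp); /* child: replace itself with the program */
        _exit(127);               /* only reached if execve failed */
    }

    /* Parent: wait for the child to finish and report how it exited. */
    int status;
    if (waitpid(pid, &status, 0) == -1) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, with `char *args[] = {"sh", "-c", "exit 3", NULL};`, calling `run_program("/bin/sh", args)` should return 3 on a system where `/bin/sh` exists. The parent keeps running its own program throughout; only the child is replaced.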