### Processes
- Process
    - a program in execution
    - an instance of a program running on a computer 
    - also called a job
    - process is active, program is passive
    - program becomes a process when it is loaded into memory
        - each new execution creates a new process
- process in memory
    - <img src="images/process_memory.png" alt="drawing" width="400"/> 
    - stack
        - local variables and function parameters
        - calling a function allocates a frame on the stack; returning frees it
    - heap
        - dynamically allocated memory (e.g. from malloc())
        - can grow and shrink during execution
    - data
        - global variables
        - static variables
    - text
        - the program's executable code (its functions)
    - <img src="images/process_memory2.png" alt="drawing" width="1000"/> 
        - local variables are only available in the function
            - because they are deallocated from the stack when the function returns
        - global variables are available to all functions
            - because they live in the data section, which exists for the whole lifetime of the process (not on the heap or the stack)
        - garbage collection is performed on the heap
            - not all languages have garbage collection
- Process States 
    - new - process is being created
    - running - instructions are being executed
    - waiting - process is waiting for some event to occur
    - ready - process is waiting to be assigned to a processor
    - terminated - process has finished execution
    - every process must be in exactly one of the above states
    - each processor can only execute one process at a time
- process transitions
    - **looks like an exam question** 
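
The notes list the five states but not the transitions between them. As a minimal sketch, assuming the usual textbook five-state diagram (the edge set below is that assumption, not something stated in these notes), the legal transitions can be written as a lookup function. The names `proc_state`, `can_transition`, and `transition` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* The five states listed above. */
enum proc_state { NEW, READY, RUNNING, WAITING, TERMINATED };

/* True if the usual five-state diagram has an edge from `from` to `to`. */
bool can_transition(enum proc_state from, enum proc_state to)
{
    switch (from) {
    case NEW:
        return to == READY;            /* admitted */
    case READY:
        return to == RUNNING;          /* scheduler dispatch */
    case RUNNING:
        return to == READY             /* interrupt, e.g. time slice expired */
            || to == WAITING           /* waits for I/O or an event */
            || to == TERMINATED;       /* exit */
    case WAITING:
        return to == READY;            /* I/O or event completed */
    default:
        return false;                  /* TERMINATED has no outgoing edges */
    }
}

/* Move a process to a new state, rejecting edges the diagram lacks. */
void transition(enum proc_state *state, enum proc_state to)
{
    assert(can_transition(*state, to));
    *state = to;
}
```

Note that a waiting process never goes straight back to running: it must pass through ready and be selected by the scheduler again.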

### Process Control Block
- represents a process in the operating system
    - maintains information about the process
    - necessary for scheduling
- contains
    - process state
    - program counter
        - address of the next instruction to be executed
    - CPU registers
        - contents of all process registers
            - a CPU core holds the registers of only one process at a time
            - registers must be saved somewhere when a process is interrupted so they can be restored later
    - CPU scheduling information
        - priority
        - scheduling queue pointers
    - memory management information
        - page table
        - segment table
    - accounting information
        - CPU used
        - clock time elapsed
    - I/O status information
        - list of I/O devices allocated to the process
        - list of open files
- process representation in Linux
    - contained in a C struct
- context switching
    - the process of saving the CPU state (the context) of one process and restoring that of another
    - used in multiprogramming or time-shared systems
    - involves time overhead
        - time to stop and save one process
        - time to restore and start another
        - amount of overhead depends on the hardware
            - e.g. register speed, number of registers, etc.
        - OS will try to minimize the overhead
- mode switching
    - switching from user to kernel mode or vice versa
    - all context switches require a mode switch into kernel mode
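
To make the save-and-restore step concrete, here is a toy model of a PCB and a context switch. It is only a sketch: the struct layout, `NUM_REGS`, and `context_switch` are hypothetical, a real kernel does this in architecture-specific code, and this is not the layout of the Linux struct mentioned above.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_REGS 8 /* hypothetical register count */

enum proc_state { NEW, READY, RUNNING, WAITING, TERMINATED };

/* CPU state that must survive while the process is not running. */
struct cpu_context {
    uint64_t pc;             /* program counter: address of the next instruction */
    uint64_t regs[NUM_REGS]; /* general-purpose registers */
};

/* A toy PCB holding a few of the fields listed above. */
struct pcb {
    int pid;
    enum proc_state state;
    struct cpu_context ctx;  /* saved registers, valid while not running */
    int priority;            /* CPU scheduling information */
    struct pcb *next;        /* scheduling queue pointer */
};

/* Save the running process's CPU state into its PCB, then load the next one's. */
void context_switch(struct cpu_context *cpu, struct pcb *prev, struct pcb *next)
{
    assert(prev->state == RUNNING && next->state == READY);
    prev->ctx = *cpu;        /* save */
    prev->state = READY;
    *cpu = next->ctx;        /* restore */
    next->state = RUNNING;
}
```

The two struct copies are the overhead the notes refer to: the more register state there is, the more work each switch costs.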

### Scheduling
- goal
    - maximize CPU utilization in a multiprogramming environment
    - provide the illusion of multiple processes running simultaneously on one CPU
- scheduling queues
    - job queue
        - set of all processes in the system
    - ready queue
        - set of all processes residing in main memory, ready and waiting to execute
    - device queue
        - set of processes waiting for an I/O device
    - processes migrate between the various queues
- <img src="images/process_scheduling.png">
- schedulers
    - long-term (job scheduler)
        - **not used in many devices and not discussed in depth in this course**
        - selects which processes should be brought into the ready queue
        - controls the degree of multiprogramming
        - controls the mix of I/O bound and CPU bound processes
            - has a target ratio
        - invoked very infrequently
        - can afford more time to select the best processes
    - short-term (CPU scheduler)
        - **primary scheduler discussed in this course** 
        - selects the process to be executed next
        - invoked very frequently
        - must be fast, to limit scheduling overhead
    - key difference
        - long-term scheduler is more selective
            - selects from the job queue 
        - short-term scheduler is more frequent
            - selects from the ready queue
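
A ready queue is commonly a linked list of PCBs chained through their scheduling queue pointers. The sketch below assumes a first-in, first-out policy, which is only one possible choice for the short-term scheduler. The names `queue`, `enqueue`, and `select_next` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct pcb {
    int pid;
    struct pcb *next; /* scheduling queue pointer */
};

struct queue {
    struct pcb *head; /* next process to be selected */
    struct pcb *tail; /* most recently added process */
};

/* Add a process to the back of a queue, e.g. when it becomes ready. */
void enqueue(struct queue *q, struct pcb *p)
{
    assert(q != NULL && p != NULL);
    p->next = NULL;
    if (q->tail != NULL)
        q->tail->next = p;
    else
        q->head = p;
    q->tail = p;
}

/* Short-term scheduler under a FIFO policy: pick the process at the front. */
struct pcb *select_next(struct queue *q)
{
    struct pcb *p = q->head;
    if (p != NULL) {
        q->head = p->next;
        if (q->head == NULL)
            q->tail = NULL;
        p->next = NULL;
    }
    return p;
}
```

Both operations take constant time, which matters because the short-term scheduler runs so often. A device queue could reuse the same structure, with processes moving between queues as they block and wake.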

### Process Creation
- processes are created and destroyed dynamically
- any process can create a new process 
    - starts with a *primordial process*
        - the first process created by the OS
            - pid = 0
        - all other processes are descendants of this process
    - parent process
        - the process that created the new process
    - child process
        - the process that was created
    - managed by pid (process identifier)
        - pid of parent is stored in the child process
        - pid of child is returned to the parent
- resource sharing options
    - parent and child share all resources
    - children share a subset of the parent's resources
    - parent and child share no resources
        - e.g. UNIX pipes
- execution options
    - parent and child execute concurrently
        - parent, or child, or an entirely different process can execute first
        - e.g. UNIX shell
    - parent waits until child terminates
        - e.g. on UNIX, parent can explicitly call wait() to wait for child to finish
- address space options
    - child is a duplicate of the parent
        - e.g. UNIX fork() system call
    - child has a new program loaded into it
        - e.g. UNIX exec() system call
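
The "parent waits until child terminates" option can be sketched with `fork()` and `waitpid()`. The helper name `run_child_and_wait` is hypothetical; the child here simply exits with a chosen status instead of loading a new program.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits with `code`; return the status the parent
   collects, or -1 on error. */
int run_child_and_wait(int code)
{
    assert(code >= 0 && code <= 255); /* an exit status is a single byte */

    pid_t pid = fork();
    if (pid < 0)
        return -1;       /* fork failed, no child was created */
    if (pid == 0)
        _exit(code);     /* child: fork() returned 0 */

    /* parent: fork() returned the child's pid, which identifies whom to wait for */
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling `run_child_and_wait(7)` in the parent should return 7 once the child has terminated. `waitpid()` is used rather than `wait()` so that the parent waits for this specific child.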

### fork example

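~~~ C
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t ret;
    /* fork another process */
    ret = fork();
    if (ret < 0) { /* error occurred */
        fprintf(stderr, "Fork Failed\n");
        exit(EXIT_FAILURE);
    }
    else if (ret == 0) { /* child process */
        /* replace the child's program with ls; returns only on failure */
        execlp("/bin/ls", "ls", (char *)NULL);
        perror("execlp");
        exit(EXIT_FAILURE);
    }
    else { /* parent process */
        /* parent will wait for the child to complete */
        wait(NULL);
        printf("Child Complete\n");
        exit(EXIT_SUCCESS);
    }
}
~~~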
- right after fork(), parent and child run the same code on copies of the same data; they tell themselves apart by the return value of fork()
    - the child gets a return value of 0
    - the parent gets the pid of the child

~~~ C
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t ret;
    /* fork another process */
    ret = fork();

    printf("0: Value %d\n", (int)ret); /* runs in both parent and child */
    if (ret == 0) { /* child process */
        execlp("/bin/ls", "ls", (char *)NULL);
        /* reached only if execlp fails: on success the child is replaced by ls */
        printf("1: Process %d\n", (int)getpid());
    }
    else { /* parent process */
        wait(NULL);
        printf("2: Process %d\n", (int)getpid());
    }
}
~~~
Assume:
- parent process has pid 100
- child process has pid 200

Output (the two `0:` lines may appear in either order):
- `0: Value 200`
- `0: Value 0`
- *some output of the ls command*
- `2: Process 100`

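~~~ C
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t ret;
    /* fork another process */
    ret = fork();

    if (ret == 0) { /* child process */
        execlp("/bin/ls", "ls", (char *)NULL);
        /* reached only if execlp fails */
        printf("1: Process %d\n", (int)getpid());
    }
    else { /* parent process */
        wait(NULL);
        printf("2: Process %d\n", (int)getpid());
    }
    /* the child never gets here after a successful exec */
    printf("3: Process %d\n", (int)getpid());
}
~~~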
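Assume:
- parent process has pid 100
- child process has pid 200

Output:
- *some ls output*
- `2: Process 100`
- `3: Process 100`
    - the child prints nothing itself: after a successful exec it is running ls, not this program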

~~~ C
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    pid_t ret;
    /* fork another process */
    ret = fork();

    if (ret == 0) { /* child process */
        sleep(5); /* outlive the parent, which does not wait */
        printf("1: Process %d, parent: %d\n", (int)getpid(), (int)getppid());
    }
    else { /* parent process */
        printf("2: Process %d\n", (int)getpid());
    }
    printf("3: Process %d\n", (int)getpid());
}
~~~
Assume:
- parent process has pid 100
- child process has pid 200

Output:
- `2: Process 100`
- `3: Process 100`
    - the parent exits here; the child is still sleeping
- `1: Process 200, parent: 1`
    - the parent has finished executing, so a new parent pid is assigned to the orphaned child
- `3: Process 200`


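~~~ C
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    pid_t ret; /* pid_t is an integer type */
    /* the second fork() runs in both the original process and the first child */
    ret = fork();
    ret = fork();

    if (ret == 0) { /* child of the second fork() */
        printf("1: Process %d\n", (int)getpid());
    }
    else { /* parent of the second fork() */
        printf("2: Process %d\n", (int)getpid());
    }
}
~~~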
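the total number of processes is $2^n$, where $n$ is the number of fork() calls that every process goes on to execute
- 1 fork() call results in 2 processes
- the 2 consecutive fork() calls above result in 4 processes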
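
The $2^n$ count can be checked empirically. In this sketch every process that survives `n` consecutive `fork()` calls writes one byte into a pipe, and the caller counts the bytes. The helper `count_processes_after_forks` is hypothetical, and the extra "root" child exists only so that the caller itself is not duplicated.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns how many processes exist after n consecutive fork() calls,
   or -1 if the pipe or the first fork cannot be created. */
int count_processes_after_forks(int n)
{
    assert(n >= 0 && n <= 10); /* keep the process count small */

    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t root = fork();
    if (root < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (root == 0) {
        close(fds[0]);
        for (int i = 0; i < n; i++)
            if (fork() < 0)          /* each call doubles the process count */
                _exit(1);
        if (write(fds[1], "x", 1) != 1) /* every process reports itself once */
            _exit(1);
        close(fds[1]);
        while (wait(NULL) > 0)       /* reap this process's own children */
            ;
        _exit(0);
    }

    close(fds[1]); /* so read() sees EOF once all descendants are done */
    int count = 0;
    char c;
    while (read(fds[0], &c, 1) == 1)
        count++;
    close(fds[0]);
    waitpid(root, NULL, 0);
    return count;
}
```

With `n = 3` the function should report 8 processes, matching $2^3$.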



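~~~ C
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int global = 100;

int main(int argc, char *argv[]) {
  pid_t ret;
  int local = 100;

  ret = fork();
  if(ret == 0){ // child
    local = 20;  // changes only the child's copy
    global = 20; // changes only the child's copy
  }
  else { // parent
    wait(NULL); // wait for child process to finish
    printf("Global: %d; Local: %d\n", global, local);
  }
  exit(0);
}
~~~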
Output:
- `Global: 100; Local: 100`
    - the child process has its own copy of the parent's memory, so its assignments do not affect the parent's variables

### Multiprocess Architecture

### Models of IPC
- shared memory
    - two or more processes share a region of memory
    - requires synchronization
        - to ensure that processes do not overwrite each other's data
    - fast, convenient communication
- message passing
    - processes communicate with each other without sharing memory
    - requires a mechanism for processes to exchange messages
    - slower than shared memory
        - each message typically passes through the kernel via system calls
    - typically used for smaller amounts of data
    - can work for inter-computer communication
    - easier to implement
    - typically messages do not overwrite each other
        - no need for conflict resolution
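
For contrast with message passing, here is a minimal shared-memory sketch. It assumes a system that supports anonymous shared mappings through `mmap()` with `MAP_ANONYMOUS`, which is widely available but not the only way to share memory. The helper name `value_after_child_write` is hypothetical. The only synchronization is that the parent waits for the child to exit before reading.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc in strict modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the value the parent reads after the child has written 20,
   or -1 on error. */
int value_after_child_write(void)
{
    /* one int that parent and child will both see after fork() */
    int *shared = mmap(NULL, sizeof *shared, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        return -1;
    *shared = 100;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(shared, sizeof *shared);
        return -1;
    }
    if (pid == 0) {
        *shared = 20; /* written by the child, visible to the parent */
        _exit(0);
    }

    /* waiting for the child is the synchronization step here */
    pid_t reaped = waitpid(pid, NULL, 0);
    int result = (reaped == pid) ? *shared : -1;
    munmap(shared, sizeof *shared);
    return result;
}
```

The parent should read 20 here, because the mapping is shared rather than copied at `fork()`. Ordinary variables are copied instead, so a child's writes to them stay invisible to the parent.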

### Message Passing
- can be employed for client-server communication
- provides at least a send and receive function
- if P and Q wish to communicate, they need to:
    - establish a connection between them
    - exchange messages
    - close the connection when done
- implementation issues
    - how are links established?
    - can a link be associated with more than two processes?
    - how many links can there be between every pair of communicating processes?
    - what is the capacity of a link?
    - is the size of a message that the link can accommodate fixed or variable?
    - is a link unidirectional or bi-directional?
- direct communication
    - processes name each other explicitly
    - disadvantage
        - processes need to know each other's identity
            - must be hard coded values
- indirect communication
    - messages are sent to and received from mailboxes (ports)
    - each mailbox has a unique id
    - processes can communicate without knowing each other's identity
        - they must share a mailbox
    - mailbox exists until explicitly deleted
        - even if process ends (assuming mailbox is not deleted by the process)
    - can be held in a process address space or the kernel
    - communication link
        - link may be associated with multiple processes
        - each pair of processes may have multiple links
        - links may be unidirectional or bi-directional
        - multiple receivers may need synchronization
    - advantages
        - processes do not need to know each other's identity
        - processes can communicate even if they are not executing at the same time
    - disadvantages
        - system call overhead
        - kernel involvement
- synchronization
    - blocking send
        - sender blocks until message is received
    - blocking receive
        - receiver blocks until message is available
    - non-blocking send
        - sender sends the message and continues
    - non-blocking receive
        - receiver receives a valid message or null
- buffering
    - zero capacity
        - 0 messages
        - sender blocks (waits) until receiver receives the message
    - bounded capacity
        - finite length of n messages
        - sender blocks until space is available
    - unbounded capacity
        - infinite length
        - sender never waits
        - receiver may block if no messages are available
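
The bounded-capacity and non-blocking cases can be sketched with a small ring buffer used as a mailbox. This is a single-process model under stated assumptions: `mailbox`, `mailbox_try_send`, and `mailbox_try_receive` are hypothetical names, messages are plain integers, and there is no locking. Where a blocking primitive would wait, these return `false`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAILBOX_CAPACITY 2 /* bounded capacity: n = 2 messages */

struct mailbox {
    int messages[MAILBOX_CAPACITY];
    size_t head;  /* index of the oldest message */
    size_t count; /* number of queued messages */
};

/* Non-blocking send: fails instead of waiting when the mailbox is full. */
bool mailbox_try_send(struct mailbox *mb, int msg)
{
    assert(mb != NULL);
    if (mb->count == MAILBOX_CAPACITY)
        return false; /* a blocking send would wait here for space */
    mb->messages[(mb->head + mb->count) % MAILBOX_CAPACITY] = msg;
    mb->count++;
    return true;
}

/* Non-blocking receive: yields a valid message, or reports that none exists. */
bool mailbox_try_receive(struct mailbox *mb, int *out)
{
    assert(mb != NULL && out != NULL);
    if (mb->count == 0)
        return false; /* a blocking receive would wait here for a message */
    *out = mb->messages[mb->head];
    mb->head = (mb->head + 1) % MAILBOX_CAPACITY;
    mb->count--;
    return true;
}
```

Messages come out in the order they were sent. A third send into a full mailbox fails until a receive frees a slot.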

#### Examples of IPC
- pipes
    - most basic form of IPC on UNIX
        - powerful CLI tool
    - ordinarily require parent-child relationships between processes
    - generally unidirectional
    - issues
        - one-way communication
    - anonymous pipes can only be used between related processes
        - e.g. parent-child
    - processes must be controlled by the same OS
    - process exit closes the pipe
        - may cause data loss
    - FIFO only
##### pipe example
~~~ C
#include <unistd.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
  char *s, buf[1024];
  int fds[2];
  s = "EECS 678\n";

  /* open a pipe. fds[0] is opened for reading,
     and fds[1] for writing. */
  if (pipe(fds) == -1) {
    perror("pipe");
    return 1;
  }

  /* write to the write-end of the pipe */
  write(fds[1], s, strlen(s));

  /* This can be read from the other end of the pipe */
  read(fds[0], buf, strlen(s));

  printf("fds[0]=%d, fds[1]=%d\n", fds[0], fds[1]);
  write(1, buf, strlen(s)); /* file descriptor 1 is stdout */
  return 0;
}
~~~
- the headers declare the functions that are used in the program
    - unistd.h
        - declares pipe(), read(), and write()
    - stdio.h
        - declares printf()
    - string.h
        - declares strlen()
    - they are not libraries
        - they are header files
            - they contain the function prototypes
            - a prototype tells the compiler the function's return type and parameter types
                - so it can type-check each call; the function bodies come from the library at link time
- fds[0] is the read end of the pipe
- fds[1] is the write end of the pipe
- output:
```
fds[0]=3, fds[1]=4
EECS 678
``` 

### Aside: file descriptor tables
- in memory, a file descriptor table is created
    - each entry in the table points to a file, pipe, or socket
    - each process has its own file descriptor table
        - the file descriptor table is copied when a process is forked
        - the child process has a copy of the parent's file descriptor table
    - may point to the standard streams
        - e.g. stdin, stdout, stderr (already open when the program starts)
    - may point to files
        - e.g. a file opened by the process
    - may point to pipes
        - e.g. a pipe created by the process

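~~~ C
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
  char *s, buf[1024];
  int fds[2];
  s = "EECS 678. Pipe program 3\n";

  /* create a pipe */
  pipe(fds);

  if (fork() == 0) {

    /* child process. */
    printf("Child line 1\n");
    /* blocks until the parent writes; read into buf, because s points
       to a read-only string literal */
    read(fds[0], buf, strlen(s));
    printf("Child line 2\n");
  } else {

    /* parent process */
    printf("Parent line 1\n");
    /* send the message s (not the uninitialized buf) */
    write(fds[1], s, strlen(s));
    printf("Parent line 2\n");
  }
  return 0;
}
~~~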
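- **the read() function blocks until there is something to read**
    - the parent process writes to the pipe
    - the child process reads from the pipe
- output (assuming line-buffered terminal output):
    - `Child line 2` is always printed after `Parent line 1`, because the child's read() cannot return until the parent has called write()
    - the order of the other lines depends on the scheduler, e.g. `Child line 1` may appear before or after `Parent line 1`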
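
The blocking behaviour of `read()` and the role of `close()` can be sketched as follows. The parent writes a message and closes its write end; the child reads until end-of-file and reports how many bytes arrived through its exit status. The helper name `bytes_child_received` is hypothetical, and the exit-status trick limits the message to 255 bytes.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the number of bytes the child read from the pipe, or -1 on error. */
int bytes_child_received(const char *msg)
{
    size_t len = strlen(msg);
    assert(len <= 255); /* the count travels back in a one-byte exit status */

    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork(); /* the child gets a copy of the descriptor table */
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[1]); /* child only reads; an open write end would prevent EOF */
        char buf[64];
        int total = 0;
        ssize_t n;
        while ((n = read(fds[0], buf, sizeof buf)) > 0) /* blocks for data or EOF */
            total += (int)n;
        _exit(total);
    }

    close(fds[0]); /* parent only writes */
    ssize_t written = write(fds[1], msg, len);
    close(fds[1]); /* last write end closed: the child's read() returns 0 */

    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return written == (ssize_t)len ? WEXITSTATUS(status) : -1;
}
```

For the 9-byte message `"EECS 678\n"` the function should return 9. If the child kept its copy of `fds[1]` open, its read loop would never see end-of-file and the parent's `waitpid()` would hang.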

### Pipes in UNIX
- commonly used in UNIX shells
    - output from one command is piped to the input of another command
        - e.g. `ls | wc -l`
    - `dup` system call
        - duplicates a file descriptor to the smallest available file descriptor
        - e.g. `dup(fd[1])` duplicates the write end of the pipe
            - `fd[1]` is the write end of the pipe
    - `dup2` system call
        - duplicates a file descriptor to a specific file descriptor
        - e.g. `dup2(fd[1], 1)` duplicates the write end of the pipe to stdout
            - `fd[1]` is the write end of the pipe
            - `1` is the file descriptor for stdout
    - stdin, stdout, and stderr are file descriptors
        - stdin is 0
        - stdout is 1
        - stderr is 2
        - **for the lab we hijack the stdout file descriptor**
            - in the process that writes, we redirect stdout to the write end of the pipe
                - close(1) - this closes stdout, so 1 becomes the smallest available file descriptor
                - dup(fd[1]) - this duplicates the write end of the pipe onto file descriptor 1 (stdout)
            - in the process that reads, we redirect stdin to the read end of the pipe
                - close(0) - this closes stdin, so 0 becomes the smallest available file descriptor
                - dup(fd[0]) - this duplicates the read end of the pipe onto file descriptor 0 (stdin)
            - afterwards, in both processes
                - close(fd[0]) - this closes the original read end of the pipe
                - close(fd[1]) - this closes the original write end of the pipe
                    - the pipe is now reached through 0 or 1, so the original descriptors are no longer needed
            - it is good practice to close the file descriptors that you are not using
                - e.g. close(fd[0]) and close(fd[1])
                - otherwise a reader can block forever
                    - read() only reports end-of-file once every copy of the write end has been closed
            - this allows us to write to the pipe using printf()
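
The redirection steps can be sketched with `dup2()`, which combines the close-then-dup pair into one call. Here a child redirects its stdout into a pipe and calls `printf()`, and the parent collects what was printed. The helper name `capture_child_stdout` and the message text are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run a child whose printf() output goes into a pipe; copy it into `out`.
   Returns the number of bytes captured, or -1 on error. */
ssize_t capture_child_stdout(char *out, size_t cap)
{
    assert(out != NULL && cap > 0);

    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    fflush(stdout); /* do not let the child inherit unflushed output */
    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        dup2(fds[1], STDOUT_FILENO); /* stdout now refers to the write end */
        close(fds[0]);               /* the child never reads */
        close(fds[1]);               /* the original descriptor is no longer needed */
        printf("hello from child\n");
        fflush(stdout);              /* _exit() does not flush stdio buffers */
        _exit(0);
    }

    close(fds[1]); /* the parent must close its write end to see EOF */
    size_t total = 0;
    ssize_t n;
    while (total < cap - 1 &&
           (n = read(fds[0], out + total, cap - 1 - total)) > 0)
        total += (size_t)n;
    out[total] = '\0';
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return (ssize_t)total;
}
```

A shell builds `ls | wc -l` the same way, except that each child calls an exec function after the redirection instead of printing directly.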

- named pipes (FIFOs)
    - any process that knows the name of the pipe can access it
        - not just related processes
    - allow bidirectional communication, but only half duplex
        - data travels in one direction at a time: one process writes while another reads
        - if data must flow in both directions at once, two FIFOs are typically used
    - exist even after the process that created them has terminated
        - until they are explicitly deleted from the file system
    - appear as ordinary files in the file system
    - communicating processes must be on the same machine 


- producer code (writes to the pipe):
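``` C
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define FIFO_NAME "/tmp/fifo_demo" /* hypothetical path; the notes do not give one */
#define MAX_LENGTH 1000            /* hypothetical buffer size */

int main(void)
{
  char str[MAX_LENGTH];
  int num, fd;

  mkfifo(FIFO_NAME, 0666); // create FIFO file

  printf("waiting for readers...");
  fflush(stdout); // show the message before open() blocks
  fd = open(FIFO_NAME, O_WRONLY); // open FIFO for writing; blocks until a reader opens it
  printf("got a reader !\n");

  printf("Enter text to write in the FIFO file: ");
  while (fgets(str, MAX_LENGTH, stdin) != NULL) { // loop until end of input
    if ((num = write(fd, str, strlen(str))) == -1)
      perror("write");
    else
      printf("producer: wrote %d bytes\n", num);
  }
  close(fd);
  return 0;
}
```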
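
A producer needs a reader on the other end of the FIFO. As a sketch of both ends in one place, the helper below creates a FIFO, lets a child act as the producer, and has the caller act as the consumer. The name `fifo_roundtrip` and the path passed to it are hypothetical. A child is used only to keep the example self-contained; unrelated processes work the same way, provided they agree on the path.

```c
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Send `msg` through the FIFO at `path` from a child (producer) to the
   caller (consumer). Returns bytes read into `out`, or -1 on error. */
ssize_t fifo_roundtrip(const char *path, const char *msg, char *out, size_t cap)
{
    assert(out != NULL && cap > 0);
    out[0] = '\0';
    unlink(path);                       /* remove a stale FIFO, if any */
    if (mkfifo(path, 0600) == -1)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        unlink(path);
        return -1;
    }
    if (pid == 0) {                     /* producer */
        int wfd = open(path, O_WRONLY); /* blocks until a reader opens */
        if (wfd == -1)
            _exit(1);
        ssize_t w = write(wfd, msg, strlen(msg));
        close(wfd);
        _exit(w == (ssize_t)strlen(msg) ? 0 : 1);
    }

    int rfd = open(path, O_RDONLY);     /* consumer; blocks until a writer opens */
    size_t total = 0;
    if (rfd == -1) {
        kill(pid, SIGKILL);             /* do not leave the producer blocked */
    } else {
        ssize_t n;
        while (total < cap - 1 &&
               (n = read(rfd, out + total, cap - 1 - total)) > 0)
            total += (size_t)n;
        out[total] = '\0';
        close(rfd);
    }
    waitpid(pid, NULL, 0);
    unlink(path);                       /* a FIFO persists until deleted */
    return rfd == -1 ? -1 : (ssize_t)total;
}
```

Both `open()` calls block until the other side has opened the FIFO, which is why the producer code prints "waiting for readers..." before its own `open()`.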

- message queues
- shared memory
- sockets

