In [2]:
#load "sig.fsx"
open CSCI374.ExtraReflection

# Subprograms

## Outline

- Fundamentals of Subprograms
- Local Referencing Environments
- Parameter-Passing Methods
- Parameters That Are Subprograms
- Calling Subprograms Indirectly
- Overloaded Subprograms
- Generic Subprograms
- Design Issues for Functions
- User-Defined Overloaded Operators
- Closures
- Coroutines
- Implementing Subprograms


## Abstractions

Two fundamental abstraction facilities:

- Process abstraction
    - Emphasized from early days
    - Discussed in this chapter

- Data abstraction
    - Emphasized in the 1980s

## Fundamentals of Subprograms

- Each subprogram has a single entry point

- The calling program is suspended during execution of the called subprogram

- Control always returns to the caller when the called subprogram's execution terminates


## Basic Definitions

- A **subprogram definition** describes the interface to and the actions of the subprogram abstraction

- A **subprogram call** is an explicit request that the subprogram be executed

- A **subprogram header** is the first part of the definition, including the name, the kind of subprogram, and the formal parameters

- The **parameter profile (signature)** of a subprogram is the number, order, and types of its parameters

- The **protocol** is a subprogram's parameter profile and, if it is a function, its return type

## Basic Definitions (cont.)

- Function declarations in C and C++ are often called **prototypes**

- A subprogram declaration provides the protocol, but not the body, of the subprogram

- A **formal parameter** is a dummy variable listed in the subprogram header and used in the subprogram

- An **actual parameter** represents a value or address used in the subprogram call statement
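
The terms above can be illustrated with a small C++ sketch. The names `average` and `midpoint` are hypothetical and chosen only for illustration.

```cpp
#include <cassert>

// Declaration (prototype): states the protocol but has no body.
double average(double first, double second);

// This caller compiles against the prototype alone.
double midpoint(double low, double high) {
    assert(low <= high);          // precondition of this sketch
    return average(low, high);    // `low` and `high` are the actual parameters
}

// Definition: header plus body; `first` and `second` are the formal parameters.
double average(double first, double second) {
    return (first + second) / 2.0;
}
```

The parameter profile of `average` is two `double` parameters in that order; its protocol adds the `double` return type.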

## Actual/Formal Parameter Correspondence

- Positional
    - The binding of actual parameters to formal parameters is by position: the first actual parameter is bound to the first formal parameter and so forth
    - Safe and effective

- Keyword
    - The name of the formal parameter to which an actual parameter is to be bound is specified with the actual parameter
    - *Advantage:* Parameters can appear in any order, thereby avoiding parameter correspondence errors
    - *Disadvantage:* User must know the formal parameters' names

## Procedures and Functions

- There are two categories of subprograms
    - **Procedures** are collections of statements that define parameterized computations

    - **Functions** structurally resemble procedures but are semantically modeled on mathematical functions
        - They are expected to produce no side effects
        - In practice, program functions have side effects
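
A minimal C++ sketch of the distinction, assuming a `void` function stands in for a procedure. All names (`clamp_to_range`, `square`, `logged_square`, `log_count`) are hypothetical.

```cpp
#include <cassert>

int log_count = 0;   // nonlocal state, used only to demonstrate a side effect

// Procedure: returns no value; its result is delivered through `value`.
void clamp_to_range(int& value, int low, int high) {
    assert(low <= high);
    if (value < low) value = low;
    if (value > high) value = high;
}

// Function modeled on a mathematical function: no side effects.
int square(int x) {
    return x * x;
}

// Function with a side effect: every call also changes `log_count`.
int logged_square(int x) {
    ++log_count;
    return x * x;
}
```

Both `square` and `logged_square` return the same value for the same argument, but only the first can be called any number of times without changing program state.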

## Design Issues for Subprograms

- Are local variables static or dynamic?
- Can subprogram definitions appear in other subprogram definitions?
- What parameter passing methods are provided?
- Are parameter types checked?
- If subprograms can be passed as parameters and subprograms can be nested, what is the referencing environment of a passed subprogram?
- Can subprograms be overloaded?
- Can subprograms be generic?
- If the language allows nested subprograms, are closures supported?

## Local Referencing Environments

- The **referencing environment** of a statement is the collection of all variables that are visible in the statement.
- Local variables can be stack-dynamic
    - Advantages
        - Support for recursion
        - Storage for locals is shared among some subprograms

    - Disadvantages
        - Allocation/deallocation, initialization time
        - Indirect addressing
        - Subprograms cannot be history sensitive

- Local variables can be static
    - Advantages and disadvantages are the opposite of those for stack-dynamic local variables
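
The trade-off can be sketched in C++, where `static` locals and ordinary (stack-dynamic) locals coexist. The function names are hypothetical.

```cpp
#include <cassert>

// Static local: one storage location for the whole run, so the
// subprogram is history sensitive (it remembers earlier calls).
int next_ticket() {
    static int issued = 0;
    return ++issued;
}

// Stack-dynamic local: each activation gets its own `depth_here`,
// which is what makes recursion possible.
int count_activations(int n) {
    assert(n >= 0);
    int depth_here = 1;                       // fresh instance per activation
    if (n == 0) return depth_here;
    return depth_here + count_activations(n - 1);
}
```

If `depth_here` were static, all activations of `count_activations` would share one location, which is why static locals do not support recursion.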

## Semantic Models of Parameter Passing

![](img/param-pass.png)

## Conceptual Models of Transfer

- Physically move a value (Pass-by-Value/Result)
- Move an access path to a value (Pass-by-Reference/Name)

## Pass-by-Value (In Mode)

- The value of the actual parameter is used to initialize the corresponding formal parameter

    - Normally implemented by copying

    - Can be implemented by transmitting an access path, but this is not recommended (enforcing write protection is not easy)

- Disadvantages
    - *Physical move:* additional storage is required (stored twice) and the actual move can be costly (for large parameters)
    - *Access path method:* must write-protect in the called subprogram and accesses cost more (indirect addressing)
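
Both implementations can be sketched in C++. Passing a `std::vector<int>` by value copies it; passing a `const` reference is one way to get an access path that the compiler write-protects. The function names are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Physical move: `values` is a copy of the actual parameter,
// so the assignment below never reaches the caller's vector.
int sum_after_zeroing_first(std::vector<int> values) {
    assert(!values.empty());
    values[0] = 0;                  // modifies the copy only
    int total = 0;
    for (int v : values) total += v;
    return total;
}

// Access path: no copy is made, and `const` enforces write protection.
int sum_read_only(const std::vector<int>& values) {
    int total = 0;
    for (int v : values) total += v;
    return total;
}
```

After either call the caller's vector is unchanged; the difference is the cost of the copy versus the cost of indirect access.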

## Pass-by-Result (Out Mode)

- When a parameter is passed by result, no value is transmitted to the subprogram. The corresponding formal parameter acts as a local variable, and its value is transmitted to the caller's actual parameter by physical move when control returns to the caller
    - Requires an extra storage location and a copy operation

## Pass-by-Value-Result (InOut Mode)

- A combination of pass-by-value and pass-by-result

- Sometimes called pass-by-copy

- Formal parameters have local storage

- Disadvantages: Those of pass-by-result and pass-by-value
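
C++ has no value-result or result mode, so the sketch below performs the copy-in and copy-out steps by hand. The reference parameters only stand in for "the location of the actual parameter"; the name `call_accumulate` and its behavior are hypothetical.

```cpp
#include <cassert>

// `total` plays the role of an inout-mode (value-result) formal,
// `doubled` the role of an out-mode (result) formal.
void call_accumulate(int& actual_total, int& actual_doubled, int amount) {
    assert(&actual_total != &actual_doubled);  // the sketch assumes distinct actuals
    int total = actual_total;   // copy in: the "value" half of value-result
    int doubled;                // out mode: no value is transmitted in

    total += amount;            // the body works on local storage only
    doubled = 2 * total;

    actual_total = total;       // copy out when control returns
    actual_doubled = doubled;   // copy out when control returns
}
```

The extra locals and the two copy steps are the storage and time costs listed above.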


## Pass-by-Reference (InOut Mode)

- Pass an access path

- Also called pass-by-sharing

- *Advantage:* 
    - Passing process is efficient (no copying and no duplicated storage)

- *Disadvantages:*
    - Slower accesses (compared to pass-by-value) to formal parameters
    - Potential for unwanted side effects (collisions)
    - Unwanted aliases (access broadened)
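
A short C++ sketch of the aliasing problem, using reference parameters. The names `add_twice` and `aliased_result` are hypothetical.

```cpp
#include <cassert>

// Both formals are access paths to the caller's variables.
void add_twice(int& target, const int& amount) {
    target += amount;
    target += amount;   // rereads `amount` through its access path
}

// Passing the same variable for both formals makes them aliases.
int aliased_result() {
    int x = 1;
    add_twice(x, x);
    assert(x == 4);     // 1 + 1 = 2, then 2 + 2 = 4
    return x;
}
```

With distinct variables `x = 1` and `a = 1`, the call `add_twice(x, a)` leaves `x` at 3; the aliased call yields 4 because the first assignment also changes `amount`.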

## Pass-by-Name (InOut Mode)

- By textual substitution

- Formals are bound to an access method at the time of the call, but actual binding to a value or address takes place at the time of a reference or assignment

- Allows flexibility in late binding

- Implementation requires that the referencing environment of the caller is passed with the parameter, so the actual parameter address can be calculated
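
C++ does not offer pass-by-name, but its late binding can be approximated with a parameterless lambda (a thunk) that captures the caller's variables by reference. The sketch below is in the spirit of Jensen's device; `sum_by_name` and `sum_of_squares_up_to` are hypothetical names.

```cpp
#include <cassert>
#include <functional>

// `term` is re-evaluated at every reference, in the caller's environment.
double sum_by_name(int& index, int low, int high,
                   const std::function<double()>& term) {
    double total = 0.0;
    for (index = low; index <= high; ++index) {
        total += term();   // binding to a value happens here, not at the call
    }
    return total;
}

double sum_of_squares_up_to(int n) {
    assert(n >= 0);
    int i = 0;
    // The thunk reads the caller's `i`, which sum_by_name updates via `index`.
    return sum_by_name(i, 1, n, [&i] { return static_cast<double>(i) * i; });
}
```

The lambda carries the caller's referencing environment with it, which mirrors the implementation requirement stated above.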

## Implementing Parameter-Passing Methods

- In most languages parameter communication takes place through the run-time stack
- Pass-by-reference is the simplest to implement; only an address is placed in the stack

- Function header:
    - `void sub(int a, int b, int c, int d)`
- Function call in `main`: `sub(w, x, y, z)`
    - pass `w` by value, `x` by result, `y` by value-result, `z` by reference

![](img/param-stack.png)

## Design Considerations for Parameter Passing

- Two important considerations
    - Efficiency
    - One-way or two-way data transfer

- But the above considerations are in conflict
    - Good programming suggests limited access to variables, which means one-way whenever possible
    - But pass-by-reference is more efficient to pass structures of significant size

## Overloaded Subprograms

- An **overloaded subprogram** is one that has the same name as another subprogram in the same referencing environment
    - Every version of an overloaded subprogram has a unique protocol

- C++, Java, C#, and Ada include predefined overloaded subprograms

- Ada, Java, C++, and C# allow users to write multiple versions of subprograms with the same name
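
A minimal C++ sketch of user-written overloading. The name `largest` is hypothetical; each version differs in the number or types of its parameters.

```cpp
#include <cassert>

// Three subprograms share one name in the same referencing environment.
int largest(int a, int b) {
    return a > b ? a : b;
}

double largest(double a, double b) {
    return a > b ? a : b;
}

int largest(int a, int b, int c) {
    return largest(largest(a, b), c);
}

// The compiler selects a version from the actual parameters.
void check_resolution() {
    assert(largest(2, 7) == 7);         // (int, int)
    assert(largest(2.5, 1.5) == 2.5);   // (double, double)
    assert(largest(1, 9, 4) == 9);      // (int, int, int)
}
```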


## Generic Subprograms

- A **generic** or **polymorphic subprogram** takes parameters of different types on different activations

- Overloaded subprograms provide *ad hoc polymorphism*

- *Subtype polymorphism* means that a variable of type `T` can access any object of type `T` or any type derived from `T` (OOP languages)

- A subprogram that takes a generic parameter that is used in a type expression that describes the type of the parameters of the subprogram provides **parametric polymorphism**
    - A cheap compile-time substitute for dynamic binding
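
In C++, parametric polymorphism is commonly written with a function template. This is a minimal sketch; the name `largest` is hypothetical, and the only assumption about `T` is that it supports `<`.

```cpp
#include <cassert>
#include <string>

// One definition; the compiler generates a version per type used.
template <typename T>
T largest(T a, T b) {
    return a < b ? b : a;
}

// Each call below instantiates the template at compile time.
void check_instantiations() {
    assert(largest(3, 8) == 8);
    assert(largest(2.5, 1.0) == 2.5);
    assert(largest(std::string("pear"), std::string("apple")) == "pear");
}
```

Because the versions are created at compile time, no run-time dispatch is involved.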

## Generic Subprograms in F#

- F# infers a generic type when it cannot determine the type of a parameter or the return type of a function; this is called **automatic generalization**

- Such types are denoted with an apostrophe and a single letter, e.g., `'a`

- Functions can be defined to have generic parameters
    - These parameters are not type constrained

In [32]:
let printPair (x: 'a) (y: 'a) =
    printfn "%A %A %b" x y (x=y)
    
sgn printPair

Function: x:obj -> y:obj -> unit

In [34]:
printPair 2 1

2 1 false


In [36]:
printPair 'c' 'c'

'c' 'c' true


## Design Issues for Functions

- Are side effects allowed?
    - Parameters should always be in-mode to reduce side effects (like Ada)
- What types of return values are allowed?
    - Most imperative languages restrict the return types
    - C allows any type except arrays and functions
    - C++ is like C but also allows user-defined types
    - Java and C# methods can return any type (but because methods are not types, they cannot be returned)
    - Python and Ruby treat methods as first-class objects, so methods can be returned, as can objects of any class

## Closures

A **closure** is a subprogram and the referencing environment where it was defined
- The referencing environment is needed if the subprogram can be called from any arbitrary place in the program
- A static-scoped language that does not permit nested subprograms doesn't need closures
- Closures are only needed if a subprogram can access variables in nesting scopes and it can be called from anywhere
- To support closures, an implementation may need to provide unlimited extent to some variables (because a subprogram may access a nonlocal variable that is normally no longer alive)
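
As a sketch of the same idea in C++, a lambda that captures a variable forms a closure. Capturing by value copies `x` into the closure, which is one way to give it the extent it needs. The names `make_adder` and `add_ten` are hypothetical.

```cpp
#include <cassert>
#include <functional>

// Returns a closure: the lambda plus its own copy of `x`.
std::function<int(int)> make_adder(int x) {
    return [x](int y) { return x + y; };
}

int add_ten(int y) {
    // make_adder has already returned, yet the captured 10 is still available.
    std::function<int(int)> add10 = make_adder(10);
    assert(add10(0) == 10);
    return add10(y);
}
```

Capturing `x` by reference instead would leave the closure pointing at a local that no longer exists after `make_adder` returns, which is exactly the lifetime problem described above.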

In [37]:
let add x =
    let z = x
    let add_y y = //is a closure
        z + y 
    add_y
    
sgn add

Function: x:int -> int -> int

In [38]:
let add10 = add 10

sgn add10

Function: y:int -> int

In [20]:
add10 1

## Coroutines

- A **coroutine** is a subprogram that has multiple entries and controls them itself

- Also called *symmetric control:* caller and called coroutines are on a more equal basis

- A coroutine call is named a *resume*

- The first resume of a coroutine is to its beginning, but subsequent calls enter at the point just after the last executed statement in the coroutine

- Coroutines repeatedly resume each other, possibly forever

- Coroutines provide *quasi-concurrent execution* of program units (the coroutines); their execution is interleaved, but not overlapped


- Execution control sequences for two coroutines without loops
<div>
<img src="img/coroutines1.png" style="float: left;width: 45%; margin-right: 50px;"/>
<img src="img/coroutines2.png" style="float: left;width: 45%;"/>
</div>

- Coroutine execution sequence with loops
![](img/coroutines3.png)
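
The resume behavior can be approximated in C++ without language-level coroutines. In this rough sketch each `Worker` object remembers where its body stopped, and a driver loop stands in for the coroutines resuming each other. The class, its three-step body, and the trace format are all hypothetical.

```cpp
#include <cassert>
#include <string>

// Hand-written stand-in for a coroutine with a three-step body.
class Worker {
public:
    explicit Worker(char tag) : tag_(tag) {}

    // Runs the next step and records it; returns false once the body is done.
    bool resume(std::string& trace) {
        if (next_step_ > 3) return false;
        trace += tag_;
        trace += static_cast<char>('0' + next_step_);
        trace += ' ';
        ++next_step_;        // the next resume continues after this step
        return true;
    }

private:
    char tag_;
    int next_step_ = 1;
};

// Interleaves two workers: execution alternates but never overlaps.
std::string run_interleaved() {
    Worker a('A');
    Worker b('B');
    std::string trace;
    bool a_alive = true;
    bool b_alive = true;
    while (a_alive || b_alive) {
        a_alive = a.resume(trace);
        b_alive = b.resume(trace);
    }
    assert(!a.resume(trace) && !b.resume(trace));  // finished bodies stay finished
    return trace;
}
```

The resulting trace alternates between the two workers, step by step, which is the quasi-concurrent interleaving described above.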

## The General Semantics of Calls and Returns

- The subprogram call and return operations of a language are together called its **subprogram linkage**

- General semantics of **calls** to a subprogram
    - Parameter passing methods
    - Stack-dynamic allocation of local variables
    - Save the execution status of calling program
    - Transfer of control and arrange for the return
    - If subprogram nesting is supported, access to nonlocal variables must be arranged

- General semantics of subprogram **returns**:
    - Out/InOut mode parameters must have their values returned
    - Deallocation of stack-dynamic locals
    - Restore the execution status
    - Return control to the caller

## Implementing "Simple" Subprograms

- Call Semantics:
    - Save the execution status of the caller
    - Pass the parameters
    - Pass the return address to the called
    - Transfer control to the called

- Return Semantics:
    - If pass-by-value-result or out mode parameters are used, move the current values of those parameters to their corresponding actual parameters
    - If it is a function, move the functional value to a place the caller can get it
    - Restore the execution status of the caller
    - Transfer control back to the caller

- Required storage:
    - Status information, parameters, return address, return value for functions, temporaries

## Call Implementation 

- Two separate parts:
    - the actual code, and
    - the non-code part (local variables and data that can change)

- The format, or *layout*, of the non-code part of an executing subprogram is called an **activation record**

![](img/activation-record1.png)

- An *activation record instance* is a concrete example of an activation record (the collection of data for a particular subprogram activation)

![](img/ar-instance.png)

## Subprograms with Stack-Dynamic Local Variables

- More complex activation record
    - The compiler must generate code to cause implicit allocation and deallocation of local variables
    - Recursion must be supported (adds the possibility of multiple simultaneous activations of a subprogram)
    
![](img/activation-record2.png)

### Activation Record

- The activation record format is static, but its size may be dynamic

- The **dynamic link** points to the base of the activation record instance of the caller

- An activation record instance is dynamically created when a subprogram is called

- Activation record instances reside on the run-time stack

- The **Environment Pointer (EP)** must be maintained by the run-time system. It always points at the base of the activation record instance of the currently executing program unit


## Revised Semantic Call/Return Actions

- Caller Actions
    - Create an activation record instance
    - Save the execution status of the current program unit
    - Compute and pass the parameters
    - Pass the return address to the called
    - Transfer control to the called
- Prologue actions of the called
    - Save the old EP in the stack as the dynamic link and create the new value
    - Allocate local variables
- Epilogue actions of the called
    - If there are pass-by-value-result or out-mode parameters, the current values of those parameters are moved to the corresponding actual parameters
    - If the subprogram is a function, its value is moved to a place accessible to the caller
    - Restore the stack pointer by setting it to the value of the current EP-1 and set the EP to the old dynamic link
    - Restore the execution status of the caller
    - Transfer control back to the caller
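
The prologue and epilogue actions can be sketched with a toy run-time stack in C++. The record layout (return address, dynamic link, one parameter, one local, from the base upward) and the `Machine` type are assumptions of this sketch, not a description of any real compiler.

```cpp
#include <cassert>
#include <vector>

struct Machine {
    std::vector<int> stack;   // the run-time stack, one int per slot
    int ep = -1;              // index of the current record's base; -1 means none

    // Caller actions plus the callee's prologue.
    void call(int return_address, int parameter) {
        int new_base = static_cast<int>(stack.size());
        stack.push_back(return_address);
        stack.push_back(ep);          // dynamic link = old EP
        stack.push_back(parameter);
        stack.push_back(0);           // allocate one local variable
        ep = new_base;                // EP now points at the new record's base
    }

    // Callee's epilogue; returns the address at which the caller continues.
    int ret() {
        assert(ep >= 0);
        int return_address = stack[ep];
        int old_ep = stack[ep + 1];   // read the dynamic link
        stack.resize(ep);             // the stack top becomes EP - 1
        ep = old_ep;                  // restore the caller's EP
        return return_address;
    }
};
```

After two nested calls, two `ret` operations unwind the records in reverse order and leave the stack empty with `ep` back at -1.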

## Example Without Recursion

```cpp
void fun1(float r) {    // main calls fun1
    int s, t;           // fun1 calls fun2
    ...                 // fun2 calls fun3        
    fun2(s);
    ...
}
void fun2(int x) {
    int y;
    ...
    fun3(y);
    ...
}
void fun3(int q) {
    ...
}
int main() {
    float p;
    ...
    fun1(p);
    ...
}
```

![](img/ari-example1.png)

## Recursion

- The activation record used in the previous example supports recursion

```cpp
int factorial (int n) {
    //<-----------------------------1
    if (n <= 1) return 1;
    else return (n * factorial(n - 1));
    //<-----------------------------2
}
int main() {
    int value;
    value = factorial(3);
    //<-----------------------------3
}
```

![](img/ari-example2.png)

![](img/ari-example3.png)
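
To make the "multiple simultaneous activations" concrete, the sketch below instruments a factorial function with two counters. The counters `live_activations` and `peak_activations` are hypothetical additions and are not part of the example above.

```cpp
#include <cassert>

int live_activations = 0;   // activations of factorial currently on the stack
int peak_activations = 0;   // largest number alive at the same time

int factorial(int n) {
    assert(n >= 0);
    ++live_activations;      // a new activation record instance exists
    if (live_activations > peak_activations) {
        peak_activations = live_activations;
    }
    int result = (n <= 1) ? 1 : n * factorial(n - 1);
    --live_activations;      // this activation is about to be removed
    return result;
}
```

Calling `factorial(3)` returns 6 and leaves `peak_activations` at 3: the activations for `n = 3`, `n = 2`, and `n = 1` are all alive when the innermost call runs.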