# Scope and lifetime of objects

<div class="alert alert-block alert-info">
    You can find all of the C programs in this notebook in the subdirectory containing this notebook:
    <code>./src/scope</code>
</div>

The contiguous region of the program where an identifier (name) can be accessed is called the *scope* of the identifier. As in Java, blocks determine the scope of identifiers in C, but the two languages have
different kinds of scope.

The lifetime of an object determines when it is valid to use the object. In Java, the programmer usually is
not concerned about object lifetime because the language uses garbage collection to release memory used by
objects that can no longer be accessed. Furthermore, there are no pointers in Java, which means that it is
usually impossible to obtain a reference to a "dead" object. In C, the existence of pointers and the ability
to dynamically allocate and deallocate memory mean that the programmer must always be aware of object lifetime
issues.

This notebook mentions the term *linkage*. See the *Linkage* notebook for details.

## Scope

There are four kinds of scope in C:

* file scope
* block scope
* function prototype scope
* function scope

Function scope is relevant only to labels declared inside of a function and is not discussed in this notebook.

#### File scope

If an identifier is declared outside of any block or parameter list then it has *file scope*. The scope of
the identifier begins at the point of declaration and ends at the end of the translation unit. Furthermore, other
translation units may be able to link to such an identifier if the identifier has external linkage.

The following program has four identifiers with file scope:

1. the variable `j`
2. the function `f`
3. the variable `k`
4. the function `main`

In [None]:
// scope.c

#include <stdio.h>

int j;                                                       // j has file scope, usable from this line on

void f(int i) {                                              // f has file scope, usable from this line on
    int j = 1;
    i++;
    printf("\tfunc: i = %d, j = %d\n", i, j);
    for (int i = 0; i < 2; i++) {
        int j = 2;
        printf("\t\tfor loop, i = %d, j = %d\n", i, j);
    }
    printf("\tfunc: i = %d, j = %d\n", i, j);
    
    // UNCOMMENT NEXT LINE TO TEST IF k IS IN SCOPE
    // k = k + 1;
}

int k;                                                       // k has file scope, usable from this line on

int main(void) {                                             // main has file scope
    printf("main: j = %d\n", j);
    f(100);
    printf("main: j = %d\n", j);

    return 0;
}


The example above illustrates that where a variable is declared affects its scope. Even though `k` has
file scope, it is not usable inside the function `f` because `k` is declared after `f`, and the scope of `k`
begins only at its declaration.

#### Block scope

Braces `{ }` denote blocks of code (similar to Java).
If an identifier is declared inside of a block or in a parameter list, then the identifier has *block scope*.
The identifier is accessible everywhere in the block it is declared in after the point where it is declared.

Blocks appear inside of files and inside of other blocks which causes their scopes to be nested.
It is legal, but often confusing, to declare identifiers with the same name in different scopes.
An identifier declared at an inner scope takes precedence over an identifier declared at an outer
scope. We say that the outer scope identifier is *hidden* at the inner scope.

In [None]:
// scope.c

#include <stdio.h>

int j;

void f(int i) {                                          // i has block scope (the function f)
    int j = 1;                                           // j has block scope; hides the file scope j
    i++;
    printf("\tfunc: i = %d, j = %d\n", i, j);
    for (int i = 0; i < 2; i++) {                        // i has block scope (the loop); hides parameter i
        int j = 2;                                       // j has block scope (the loop body); hides the j declared in f
        printf("\t\tfor loop, i = %d, j = %d\n", i, j);
    }
    printf("\tfunc: i = %d, j = %d\n", i, j);
}

int main(void) {                                         // main has file scope
    printf("main: j = %d\n", j);
    f(100);
    printf("main: j = %d\n", j);

    return 0;
}


#### Function prototype scope

If a function is declared, but not defined, then the parameters of the declared function have
*function prototype scope*. 

In [None]:
// function prototype
// parameter a has function prototype scope
void some_function(int a);  

Usually, function prototype scope is not very interesting but it can affect the order in
which the parameters must appear. The following prototype is correct:

In [None]:
// function prototype
// parameters n and a have function prototype scope
void another_function(int n, int a[n]);  

The following prototype is incorrect because the parameter `n` is used before it comes into scope:

In [None]:
// function prototype
// parameters n and a have function prototype scope
// error because n is used before it is in scope
void an_incorrect_function(int a[n], int n);  

## Lifetime

The lifetime of an object is determined by its *storage duration*. There are four kinds of storage duration in C:

1. automatic
2. static
3. allocated
4. thread

Thread storage duration is not covered in CISC220 but may be relevant in CISC324 Operating Systems.

### Automatic storage duration

Automatic storage duration is the default storage duration of an object declared within a block or as a function
parameter. Memory for the object is allocated when the block in which the object was declared is entered
and deallocated when the block is exited by any means.

In [None]:
// scope.c

#include <stdio.h>

int j;

void f(int i) {                                          // lifetimes of i and j start when function is called
    int j = 1;                                           
    i++;
    printf("\tfunc: i = %d, j = %d\n", i, j);
    for (int i = 0; i < 2; i++) {                        
        int j = 2;                                       
        printf("\t\tfor loop, i = %d, j = %d\n", i, j);
    }
    printf("\tfunc: i = %d, j = %d\n", i, j);
}                                                        // lifetimes of i and j end

int main(void) {                                         
    printf("main: j = %d\n", j);
    f(100);
    printf("main: j = %d\n", j);

    return 0;
}


In [None]:
// scope.c

#include <stdio.h>

int j;

void f(int i) {                                          
    int j = 1;                                           
    i++;
    printf("\tfunc: i = %d, j = %d\n", i, j);
    for (int i = 0; i < 2; i++) {                        // lifetime of i starts on first loop iteration
                                                         //     lifetime of j starts on each loop iteration
        int j = 2;                                       // 
        printf("\t\tfor loop, i = %d, j = %d\n", i, j);  //     lifetime of j ends after this line
    }                                                    // lifetime of i ends on loop completion
    printf("\tfunc: i = %d, j = %d\n", i, j);
}                                                        

int main(void) {                                         
    printf("main: j = %d\n", j);
    f(100);
    printf("main: j = %d\n", j);

    return 0;
}


Objects with automatic storage duration *are not automatically initialized*. If you declare a (non-static)
variable inside of a block and do not provide an explicit initial value, then the value of the variable
is indeterminate (could be any value). This often leads to difficult-to-find bugs. Consider the following
program that contains a function that computes the sum of the first $n$ natural numbers:

In [None]:
// sum_of.c

#include <stdio.h>

unsigned long sum_of(unsigned int n) {
    unsigned long sum;                        // uh-oh, uninitialized block scope variable
    for (unsigned int i = 0; i <= n; i++) {
        sum += i;
    }
    return sum;
}

int main(void) {
    for (int i = 1; i < 10; i++) {
        unsigned long s = sum_of(i);
        printf("sum_of(%d) = %lu\n", i, s);
    }
    
    return 0;
} 

If you are unlucky, then running the cell above might print the correct sum for some values of `n` because
by chance, the value of variable `sum` happened to be equal to `0`.

### Static storage duration

Static storage duration is the storage duration of an object having file scope and of objects declared
to be `static`. The lifetime of such an object is the entire execution of the program. The value stored
in the object is initialized exactly once before the program begins running. Unlike objects with
automatic storage duration, static storage duration objects are given default initialization values
when no explicit initial value is specified:

* if the object has a pointer type, then it is initialized to a null pointer
* if the object has an arithmetic type, then it is initialized to positive or unsigned zero
* if the object is a struct, then every member is initialized recursively according to the previous rules

In [None]:
// scope.c

#include <stdio.h>

int j;        // lifetime of j is for the entire program execution, initial value is 0

void f(int i) {                                         
    int j = 1;                                           
    i++;
    printf("\tfunc: i = %d, j = %d\n", i, j);
    for (int i = 0; i < 2; i++) {                        
        int j = 2;                                       
        printf("\t\tfor loop, i = %d, j = %d\n", i, j);
    }
    printf("\tfunc: i = %d, j = %d\n", i, j);
}                                                        

int main(void) {                                         
    printf("main: j = %d\n", j);
    f(100);
    printf("main: j = %d\n", j);

    return 0;
}


#### Static variables

Variables may be declared using the `static` storage-class specifier. A `static` variable
has static storage duration (there is an exception to this that is beyond the scope of this notebook).
At file scope, `static` also gives the identifier what is called *internal linkage*. Internal linkage means
that the identifier is visible only inside of its translation unit. A `static` block scope variable has no linkage.

**`static` file scope variables (and functions)**  

A `static` file scope variable has static storage duration and is visible only inside of the translation
unit that it is declared in. Such a variable is somewhat similar to a `private static` field in Java (only one
copy of the field exists and the field is visible only inside of the class that declares it).

The following program uses a `static` file scope variable to compute and print the sum of the first $n$ natural
numbers:

In [None]:
// sum_of2.c

#include <stdio.h>

static unsigned int sum = 0;

void sum_of(unsigned int n) {
    for (unsigned int i = 0; i <= n; i++) {
        sum += i;
    }
}

void print_sum(void) {
    printf("sum_of(10) = %u\n", sum);
}

int main(void) {
    sum_of(10);
    print_sum();
    
    return 0;
} 

The above example illustrates a common bug when using `static` variables: The function `sum_of` computes
the wrong result if it is called more than once with a value of `n > 0`:

In [None]:
// sum_of2_fails.c

#include <stdio.h>

static unsigned int sum = 0;

void sum_of(unsigned int n) {
    for (unsigned int i = 0; i <= n; i++) {
        sum += i;
    }
}

void print_sum(void) {
    printf("sum_of(10) = %u\n", sum);
}

int main(void) {
    sum_of(10);
    print_sum();
    
    // call sum_of a second time
    sum_of(10);
    print_sum();
    
    return 0;
} 

The error is that the function `sum_of` does not set `sum` to `0` before accumulating the sum (which is
easily fixed). The real issue here is that the use of a `static` variable is inappropriate: it would
be better if `sum_of` accumulated the sum in a local variable and returned it,
and if `print_sum` took the value to print as a parameter.

<div class="alert alert-block alert-info">
    We won't be using such functions in this course, but
    a function may also be specified as being <tt>static</tt> in which case the function is visible only 
    inside of the translation unit that it is declared in. Such a function is somewhat similar to a
    <tt>private static</tt> method in Java (the function is visible only inside of the class that declares it).
</div>

**`static` block scope variables**

A `static` block scope variable can be used by a function to retain information between function calls.
The following function implements a counter whose value can be queried or incremented:

In [None]:
// counter.c

#include <stdio.h>

unsigned int count(unsigned int increment) {
    static unsigned int value = 0;           // initialized to 0 exactly once, retains value between calls;
                                             // scope is limited to the function count
    if (increment > 0) {
        // potential overflow bug
        value += increment;
    }
    return value;
}

int main(void) {
    // current value of the counter
    unsigned int c = count(0);
    printf("current count = %u\n", c);
    c = count(1);
    printf("increment, current count = %u\n", c);
    c = count(1);
    printf("increment, current count = %u\n", c);
    c = count(1);
    printf("increment, current count = %u\n", c);
    c = count(0);
    printf("current count = %u\n", c);
    
    return 0;
}

Functions that retain information between calls are uncommon. One notable exception is the
Standard Library function `strtok` declared in `<string.h>`.

### Allocated storage duration

An object having allocated storage duration is one where the programmer fully controls the lifetime
of the object. To create such an object, the programmer:

* requests memory for the object using a dynamic memory allocation function
* assigns a value to the object
* uses the object for as long as required
* releases memory used by the object using a dynamic memory deallocation function

If the object itself (as opposed to a shallow copy of the object) must persist across one or more function calls
then an object having either static or allocated storage duration is required. An object having
static storage duration must be declared in the source code, so the number and size of such objects are
fixed at compile time; thus, static storage duration is unsuitable
for objects created during the runtime of the program.

Perhaps the clearest example where allocated storage duration is required is when a function must create and
return an array (or more precisely, return a pointer to the first element of the array). The obvious approach
does not work in C:



In [None]:
// badarray.c

#include <stdio.h>

int* intarray(size_t len) {
    if (len == 0) {
        len = 1;
    }
    int arr[len];                        // arr lifetime starts here
    return arr;                          // arr lifetime ends when function returns
}

int main(void) {
    int *a = intarray(10);
    printf("a[0] = %d\n", a[0]);
    
    return 0;
}

The example above causes a warning when compiled on the author's computer, and fails at runtime when attempting
to access the array element.

The example above fails because the array created in the function `intarray` has automatic storage duration. Its
lifetime ends when the function returns which causes the `main` function to access an invalid memory
location when attempting to read from the array.

Details of creating objects having allocated storage duration can be found in the 
*Dynamic memory allocation and deallocation* notebook.