# Modularity, Objects, and State

The preceding chapters introduced the basic elements from which programs are made. We saw how primitive functions and primitive data are combined to construct compound entities, and we learned that abstraction is vital in helping us to cope with the complexity of large systems. But these tools are not sufficient for designing programs. Effective program synthesis also requires organizational principles that can guide us in formulating the overall design of a program. **In particular, we need strategies to help us structure large systems so that they will be *modular*, that is, so that they can be divided “naturally” into coherent parts that can be separately developed and maintained.**

**One powerful design strategy, which is particularly appropriate to the construction of programs for modeling physical systems, is to base the structure of our programs on the structure of the system being modeled. For each object in the system, we construct a corresponding computational object.** For each system action, we define a symbolic operation in our computational model. Our hope in using this strategy is that extending the model to accommodate new objects or new actions will require no strategic changes to the program, only the addition of the new symbolic analogs of those objects or actions. If we have been successful in our system organization, then to add a new feature or debug an old one we will have to work on only a localized part of the system.

To a large extent, then, the way we organize a large program is dictated by our perception of the system to be modeled. **In this chapter we will investigate two prominent organizational strategies arising from two rather different “world views” of the structure of systems.** The first organizational strategy concentrates on *objects*, viewing a large system as a collection of distinct objects whose behaviors may change over time. An alternative organizational strategy concentrates on the *streams* of information that flow in the system, much as an electrical engineer views a signal-processing system.

Both the object-based approach and the stream-processing approach raise significant linguistic issues in programming. With objects, we must be concerned with how a computational object can change and yet maintain its identity. This will force us to abandon our old substitution model of computation (section 1.1.5) in favor of a more mechanistic but less theoretically tractable *environment model* of computation. The difficulties of dealing with objects, change, and identity are a fundamental consequence of the need to grapple with time in our computational models. These difficulties become even greater when we allow the possibility of concurrent execution of programs. The stream approach can be most fully exploited when we decouple simulated time in our model from the order of the events that take place in the computer during evaluation. We will accomplish this using a technique known as *delayed evaluation*.

## 3.1 Assignment and Local State

We ordinarily view the world as populated by independent objects, each of which has a state that changes over time. An object is said to “have state” if its behavior is influenced by its history. A bank account, for example, has state in that the answer to the question “Can I withdraw $100?” depends upon the history of deposit and withdrawal transactions. **We can characterize an object’s state by one or more *state variables*, which among them maintain enough information about history to determine the object’s current behavior.** In a simple banking system, we could characterize the state of an account by a current balance rather than by remembering the entire history of account transactions.

In a system composed of many objects, the objects are rarely completely independent. Each may influence the states of others through interactions, which serve to couple the state variables of one object to those of other objects. Indeed, the view that a system is composed of separate objects is most useful when the state variables of the system can be grouped into closely coupled subsystems that are only loosely coupled to other subsystems.

This view of a system can be a powerful framework for organizing computational models of the system. For such a model to be modular, it should be decomposed into computational objects that model the actual objects in the system. Each computational object must have its own *local state variables* describing the actual object’s state. Since the states of objects in the system being modeled change over time, the state variables of the corresponding computational objects must also change. If we choose to model the flow of time in the system by the elapsed time in the computer, then we must have a way to construct computational objects whose behaviors change as our programs run. In particular, **if we wish to model state variables by ordinary symbolic names in the programming language, then the language must provide an *assignment operation* to enable us to change the value associated with a name.**

### 3.1.1 Local State Variables

To illustrate what we mean by having a computational object with time-varying state, let us model the situation of withdrawing money from a bank account. We will do this using a function `withdraw`, which takes as argument an amount to be withdrawn. If there is enough money in the account to accommodate the withdrawal, then `withdraw` should return the balance remaining after the withdrawal. Otherwise, `withdraw` should return the message *Insufficient funds*. For example, if we begin with $100 in the account, we should obtain the following sequence of responses using `withdraw`:

```javascript
withdraw(25);
75

withdraw(25);
50

withdraw(60);
"Insufficient funds"

withdraw(15);
35
```

Observe that the expression `withdraw(25)`, evaluated twice, yields different values. This is a new kind of behavior for a function. Until now, all our JavaScript functions could be viewed as specifications for computing mathematical functions. A call to a function computed the value of the function applied to the given arguments, and two calls to the same function with the same arguments always produced the same result.

So far, all our names have been *immutable*. When a function was applied, the values that its parameters referred to never changed, and once a declaration was evaluated, the declared name never changed its value. To implement functions like `withdraw`, we introduce *variable declarations*, which use the keyword `let`, in addition to constant declarations, which use the keyword `const`. We can declare a variable `balance` to indicate the balance of money in the account and define `withdraw` as a function that accesses `balance`. The `withdraw` function checks to see if `balance` is at least as large as the requested amount. If so, `withdraw` decrements `balance` by `amount` and returns the new value of `balance`. Otherwise, `withdraw` returns the *Insufficient funds* message. Here are the declarations of `balance` and `withdraw`:

```javascript
let balance = 100;

function withdraw(amount) {
    if (balance >= amount) {
        balance = balance - amount;
        return balance;
    } else {
        return "Insufficient funds";
    }
}
```

Decrementing `balance` is accomplished by the expression statement

```javascript
balance = balance - amount;
```

The syntax of *assignment* expressions is

*name* `=` *new-value*

Here *name* has been declared with `let` or as a function parameter and *new-value* is any expression. The assignment changes *name* so that its value is the result obtained by evaluating *new-value*. In the case at hand, we are changing `balance` so that its new value will be the result of subtracting `amount` from the previous value of `balance`.
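
As a small illustration of this rule, the following sketch assigns once to a name declared with `let` and once to a function parameter. The names `counter`, `add_fee`, `listed`, and `charged` are made up for this example; the point is only that both kinds of name can appear on the left of `=`, and that assigning to a parameter changes the function's local name, not the caller's constant.

```javascript
// Hypothetical names, used only for illustration.
let counter = 0;           // variable declaration
counter = counter + 1;     // assignment: counter now refers to 1

function add_fee(price) {
    price = price + 10;    // a parameter is also a variable and may be assigned
    return price;
}

const listed = 100;             // constant declaration
const charged = add_fee(listed); // the assignment inside add_fee is local
```

After these statements run, `counter` is 1, `charged` is 110, and `listed` is still 100.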

The function `withdraw` also uses a *sequence of statements* to cause two statements to be evaluated in the case where the `if` test is true: first decrementing `balance` and then returning the value of `balance`. In general, executing a sequence

$$
\textit{stmt}_1 \text{ } \textit{stmt}_2 \text{ } \dots \text{ } \textit{stmt}_n
$$

causes the statements $\textit{stmt}_1$ through $\textit{stmt}_n$ to be evaluated in sequence.

Although `withdraw` works as desired, the variable `balance` presents a problem. As specified above, `balance` is a name defined in the program environment and is freely accessible to be examined or modified by any function. It would be much better if we could somehow make `balance` internal to `withdraw`, so that `withdraw` would be the only function that could access `balance` directly and any other function could access `balance` only indirectly (through calls to `withdraw`). This would more accurately model the notion that `balance` is a local state variable used by `withdraw` to keep track of the state of the account.
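
To make the concern concrete, here is a minimal sketch in which an unrelated function changes `balance` behind the back of `withdraw`. The function `apply_monthly_fee` is hypothetical and stands for any other code in the program that can reach the shared name.

```javascript
let balance = 100;

function withdraw(amount) {
    if (balance >= amount) {
        balance = balance - amount;
        return balance;
    } else {
        return "Insufficient funds";
    }
}

// Hypothetical function elsewhere in the program. Nothing prevents
// it from modifying balance directly, bypassing the check in withdraw.
function apply_monthly_fee() {
    balance = balance - 90;
}

const first = withdraw(25);    // 75
apply_monthly_fee();           // balance drops to -15 without any check
const second = withdraw(25);   // "Insufficient funds"
```

The second withdrawal fails even though the account's own transaction history (a single withdrawal of 25) would have allowed it.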

We can make `balance` internal to `withdraw` by rewriting the definition as follows:

```javascript
function make_withdraw_balance_100() {
    let balance = 100;
    // anonymous function that accepts the argument `amount`
    return amount => {
        if (balance >= amount) {
            balance = balance - amount;
            return balance;
        } else {
            return "Insufficient funds";
        }
    };
}
const new_withdraw = make_withdraw_balance_100();
```

What we have done here is use `let` to establish an environment with a local variable `balance`, bound to the initial value 100. Within this local environment, we use a lambda expression to create a function that takes `amount` as an argument and behaves like our previous `withdraw` function. This function, returned as the result of evaluating the body of the `make_withdraw_balance_100` function, behaves in precisely the same way as `withdraw`, but its variable `balance` is not accessible by any other function.

> In programming-language jargon, the variable `balance` is said to be *encapsulated* within the `new_withdraw` function. Encapsulation reflects the general system-design principle known as the *hiding principle*: One can make a system more modular and robust by protecting parts of the system from each other; that is, by providing information access only to those parts of the system that have a “need to know.”

Combining assignments with variable declarations is the general programming technique we will use for constructing computational objects with local state. Unfortunately, using this technique raises a serious problem: when we first introduced functions, we also introduced the substitution model of evaluation (section 1.1.5) to provide an interpretation of what function application means. We said that applying a function whose body is a return statement should be interpreted as evaluating the return expression of the function with the parameters replaced by their values. For functions with more complex bodies, we need to evaluate the whole body with the parameters replaced by their values. The trouble is that, as soon as we introduce assignment into our language, substitution is no longer an adequate model of function application. (We will see why this is so in section 3.1.3.) As a consequence, we technically have at this point no way to understand why the `new_withdraw` function behaves as claimed above. In order to really understand a function such as `new_withdraw`, we will need to develop a new model of function application. In section 3.2 we will introduce such a model, together with an explanation of assignments and variable declarations. First, however, we examine some variations on the theme established by `new_withdraw`.

Parameters of functions as well as names declared with `let` are variables. The following function, `make_withdraw`, creates “withdrawal processors.” The parameter `balance` in `make_withdraw` specifies the initial amount of money in the account.

> In contrast with `make_withdraw_balance_100` above, we do not have to use `let` to make `balance` a local variable, since parameters are already local. This will be clearer after the discussion of the environment model of evaluation in section 3.2. (See also exercise 3.10.)

```javascript
function make_withdraw(balance) {
    return amount => {
        if (balance >= amount) {
            balance = balance - amount;
            return balance;
        } else {
            return "Insufficient funds";
        }
    };
}
```

The function `make_withdraw` can be used as follows to create two objects `W1` and `W2`:

```javascript
const W1 = make_withdraw(100);
const W2 = make_withdraw(100);

W1(50);
50

W2(70);
30

W2(40);
"Insufficient funds"

W1(40);
10
```

Observe that `W1` and `W2` are completely independent objects, each with its own local state variable `balance`. Withdrawals from one do not affect the other.

We can also create objects that handle deposits as well as withdrawals, and thus we can represent simple bank accounts. Here is a function that returns a “bank-account object” with a specified initial balance:

```javascript
function make_account(balance) {
    function withdraw(amount) {
        if (balance >= amount) {
            balance = balance - amount;
            return balance;
        } else {
            return "Insufficient funds";
        }
    }
    function deposit(amount) {
        balance = balance + amount;
        return balance;
    }
    function dispatch(m) {
        return m === "withdraw"
               ? withdraw
               : m === "deposit"
               ? deposit
               : error(m, "unknown request -- make_account");
    }
    return dispatch;
}
```

Each call to `make_account` sets up an environment with a local state variable `balance`. Within this environment, `make_account` defines functions `deposit` and `withdraw` that access `balance` and an additional function `dispatch` that takes a “message” as input and returns one of the two local functions. The `dispatch` function itself is returned as the value that represents the bank-account object. This is precisely the *message-passing* style of programming that we saw in section 2.4.3, although here we are using it in conjunction with the ability to modify local variables.
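
`make_account` relies on a function `error`, which the book's JavaScript environment supplies but standard JavaScript does not. The sketch below assumes a stand-in named `error` that simply throws; both the stand-in and its message format are assumptions made for this illustration, not the book's definition. With it in place, an unrecognized message fails visibly at the point of dispatch.

```javascript
// Hypothetical stand-in for the `error` function assumed by make_account.
// It throws a standard Error that mentions the offending value.
function error(value, message) {
    throw new Error(message + ": " + JSON.stringify(value));
}

function make_account(balance) {
    function withdraw(amount) {
        if (balance >= amount) {
            balance = balance - amount;
            return balance;
        } else {
            return "Insufficient funds";
        }
    }
    function deposit(amount) {
        balance = balance + amount;
        return balance;
    }
    function dispatch(m) {
        return m === "withdraw"
               ? withdraw
               : m === "deposit"
               ? deposit
               : error(m, "unknown request -- make_account");
    }
    return dispatch;
}

const account = make_account(100);
const after_deposit = account("deposit")(20);   // 120

// "transfer" is not a message this object understands.
let failure = "none";
try {
    account("transfer");
} catch (e) {
    failure = e.message;
}
```

Here `failure` ends up holding the message produced by the stand-in, while the balance is unaffected by the rejected request.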

The function `make_account` can be used as follows:

```javascript
const acc = make_account(100);

acc("withdraw")(50);
50

acc("withdraw")(60);
"Insufficient funds"

acc("deposit")(40);
90

acc("withdraw")(60);
30
```

Each call to `acc` returns the locally defined `deposit` or `withdraw` function, which is then applied to the specified `amount`. As was the case with `make_withdraw`, another call to `make_account`

```javascript
const acc2 = make_account(100);
```

will produce a completely separate account object, which maintains its own local `balance`.

#### Exercise 3.1

An *accumulator* is a function that is called repeatedly with a single numeric argument and accumulates its arguments into a sum. Each time it is called, it returns the currently accumulated sum. Write a function `make_accumulator` that generates accumulators, each maintaining an independent sum. The input to `make_accumulator` should specify the initial value of the sum; for example

```javascript
const a = make_accumulator(5);

a(10);
15

a(10);
25
```

```javascript
function make_accumulator(total) {
    return add => {
        total = total + add;
        return total;
    };
}
```

```javascript
const a = make_accumulator(5);

a(10);
15

a(10);
25
```

#### Exercise 3.2

In software-testing applications, it is useful to be able to count the number of times a given function is called during the course of a computation. Write a function `make_monitored` that takes as input a function, `f`, that itself takes one input. The result returned by `make_monitored` is a third function, say `mf`, that keeps track of the number of times it has been called by maintaining an internal counter. If the input to `mf` is the string `"how many calls"`, then `mf` returns the value of the counter. If the input is the string `"reset count"`, then `mf` resets the counter to zero. For any other input, `mf` returns the result of calling `f` on that input and increments the counter. For instance, we could make a monitored version of the `sqrt` function:

```javascript
const s = make_monitored(Math.sqrt);

s(100); 
10 

s("how many calls");
1
```

```javascript
function make_monitored(func) {
    let calls = 0;
    function monitored(arg) {
        calls = calls + 1;
        return func(arg);
    }
    return arg => {
        return arg === "how many calls"
               ? calls
               : arg === "reset count"
               ? (calls = 0)
               : monitored(arg);
    };
}
```

```javascript
const s = make_monitored(Math.sqrt);

s(100);
10

s("how many calls");
1
```

### 3.1.2 The Benefits of Introducing Assignment

As we shall see, introducing assignment into our programming language leads us into a thicket of difficult conceptual issues. Nevertheless, viewing systems as collections of objects with local state is a powerful technique for maintaining a modular design. As a simple example, consider the design of a function `rand` that, whenever it is called, returns an integer chosen at random.

It is not at all clear what is meant by “chosen at random.” What we presumably want is for successive calls to `rand` to produce a sequence of numbers that has statistical properties of uniform distribution. We will not discuss methods for generating suitable sequences here. Rather, let us assume that we have a function `rand_update` that has the property that if we start with a given number $x_1$ and form

$$
\begin{align}
x_2\texttt{ = rand\_update(}x_1\texttt{);}\\
x_3\texttt{ = rand\_update(}x_2\texttt{);}
\end{align}
$$

then the sequence of values $x_1, x_2, x_3, \dots,$ will have the desired statistical properties.

> One common way to implement `rand_update` is to use the rule that $x$ is updated to $ax + b$ modulo $m$, where $a, b,$ and $m$ are appropriately chosen integers. Chapter 3 of Knuth 1997b includes an extensive discussion of techniques for generating sequences of random numbers and establishing their statistical properties. Notice that the `rand_update` function computes a mathematical function: Given the same input twice, it produces the same output. Therefore, the number sequence produced by `rand_update` certainly is not “random,” if by “random” we insist that each number in the sequence is unrelated to the preceding number. The relation between “real randomness” and so-called *pseudo-random* sequences, which are produced by well-determined computations and yet have suitable statistical properties, is a complex question involving difficult issues in mathematics and philosophy. Kolmogorov, Solomonoff, and Chaitin have made great progress in clarifying these issues; a discussion can be found in Chaitin 1975.
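
The update rule in the note above can be tried out on a deliberately tiny scale. The sketch below uses made-up constants ($a = 5$, $b = 3$, $m = 16$) and hypothetical helpers `make_lcg` and `take`; these values are chosen only so that the whole cycle is easy to inspect and are far too small for real use. It illustrates the point made in the note: the sequence is completely determined by its starting value, and it eventually repeats.

```javascript
// Hypothetical helper: builds an update function x -> (a*x + b) mod m.
function make_lcg(a, b, m) {
    return x => (a * x + b) % m;
}

// Toy constants, chosen only to keep the cycle short and visible.
const tiny_update = make_lcg(5, 3, 16);

// Hypothetical helper: collects the next n values after `start`.
function take(update, start, n) {
    const result = [];
    let x = start;
    for (let i = 0; i < n; i = i + 1) {
        x = update(x);
        result.push(x);
    }
    return result;
}

const first_run = take(tiny_update, 7, 16);
const second_run = take(tiny_update, 7, 16);
// first_run is [6, 1, 8, 11, 10, 5, 12, 15, 14, 9, 0, 3, 2, 13, 4, 7]
```

Both runs produce the same 16 values, and the last value is the starting value 7 again, so the sequence would repeat from there.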

```javascript
// A very simple rand_update function computes a
// number from 0 (inclusive) to 2^31 (exclusive)
// from a value x by multiplying it by a constant a,
// adding a constant c, and taking the remainder modulo m.
// It is used here for illustration only, with no claim
// about statistical properties. Note that a * x can
// exceed 2^53, in which case its low-order bits are lost.
const m = Math.pow(2, 31);
const a = 1103515245;
const c = 12345;

function rand_update(x) {
    return (a * x + c) % m;
}
```
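
One caveat applies when running `rand_update` in standard JavaScript: for most inputs the product `a * x` exceeds $2^{53}$, beyond which numbers no longer represent every integer exactly, so its low-order bits are rounded away before the remainder is taken. A possible variant, sketched here under the assumption that `BigInt` is available, carries out the arithmetic exactly. The names `rand_update_exact`, `rand_update_float`, and the upper-case constants are hypothetical, and exact arithmetic by itself does not give the sequence good statistical properties.

```javascript
// Same constants as above, written as BigInt values (hypothetical names).
const M = 2n ** 31n;
const A = 1103515245n;
const C = 12345n;

// Exact variant: the product is computed with arbitrary precision,
// then reduced modulo 2^31 and converted back to an ordinary number.
function rand_update_exact(x) {
    return Number((A * BigInt(x) + C) % M);
}

// Floating-point version, repeated here only for comparison.
function rand_update_float(x) {
    return (1103515245 * x + 12345) % 2147483648;
}

const exact = rand_update_exact(123456789);
const rounded = rand_update_float(123456789);
// For this input the product is rounded to a multiple of 16 in the
// floating-point version, so `rounded` is a multiple of 16 while
// `exact` is not.
```

Numbers that share large powers of two as factors almost never have a GCD of 1, which matters for any experiment that tests the GCD of generated values.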

We can implement `rand` as a function with a local state variable `x` that is initialized to some fixed value `random_init`. Each call to `rand` computes `rand_update` of the current value of `x`, returns this as the random number, and also stores this as the new value of `x`.

```javascript
const random_init = 123456789;

function make_rand() {
    let x = random_init;
    return () => {
        x = rand_update(x);
        return x;
    };
}
```

Of course, we could generate the same sequence of random numbers without using assignment by simply calling `rand_update` directly. However, this would mean that any part of our program that used random numbers would have to explicitly remember the current value of `x` to be passed as an argument to `rand_update`. To realize what an annoyance this would be, consider using random numbers to implement a technique called *Monte Carlo simulation*.

The Monte Carlo method consists of choosing sample experiments at random from a large set and then making deductions on the basis of the probabilities estimated from tabulating the results of those experiments. For example, we can approximate $\pi$ using the fact that $6/\pi^2$ is the probability that two integers chosen at random will have no factors in common; that is, that their greatest common divisor will be 1. To obtain the approximation to $\pi$, we perform a large number of experiments. In each experiment we choose two integers at random and perform a test to see if their GCD is 1. The fraction of times that the test is passed gives us our estimate of $6/\pi^2$, and from this we obtain our approximation to $\pi$.

The heart of our program is a function `monte_carlo`, which takes as arguments the number of times to try an experiment, together with the experiment, represented as a no-argument function that will return either true or false each time it is run. The function `monte_carlo` runs the experiment for the designated number of trials and returns a number telling the fraction of the trials in which the experiment was found to be true.

```javascript
const rand = make_rand();
function gcd(a, b) {
    return b === 0 ? a : gcd(b, a % b);
}
function dirichlet_test() {
    return gcd(rand(), rand()) === 1;
}
function monte_carlo(trials, experiment) {
    function iter(trials_remaining, trials_passed) {
        return trials_remaining === 0
               ? trials_passed / trials
               : experiment()
               ? iter(trials_remaining - 1, trials_passed + 1)
               : iter(trials_remaining - 1, trials_passed);
    }
    return iter(trials, 0);
}
function estimate_pi(trials) {
    return Math.sqrt(6 / monte_carlo(trials, dirichlet_test));
}
```

```javascript
estimate_pi(1000);
29.277002188455995
```

Now let us try the same computation using `rand_update` directly rather than `rand`, the way we would be forced to proceed if we did not use assignment to model local state:

```javascript
function estimate_pi(trials) {
    return Math.sqrt(6 / random_gcd_test(trials, random_init));
}
function random_gcd_test(trials, initial_x) {
    function iter(trials_remaining, trials_passed, x) {
        const x1 = rand_update(x);
        const x2 = rand_update(x1);
        return trials_remaining === 0
               ? trials_passed / trials
               : gcd(x1, x2) === 1
               ? iter(trials_remaining - 1, trials_passed + 1, x2)
               : iter(trials_remaining - 1, trials_passed, x2);
    }
    return iter(trials, 0, initial_x);
}
```

While the program is still simple, it betrays some painful breaches of modularity. In our first version of the program, using `rand`, we can express the Monte Carlo method directly as a general `monte_carlo` function that takes as an argument an arbitrary experiment function. In our second version of the program, with no local state for the random-number generator, `random_gcd_test` must explicitly manipulate the random numbers `x1` and `x2` and recycle `x2` through the iterative loop as the new input to `rand_update`. This explicit handling of the random numbers intertwines the structure of accumulating test results with the fact that our particular experiment uses two random numbers, whereas other Monte Carlo experiments might use one random number or three. Even the top-level function `estimate_pi` has to be concerned with supplying an initial random number. The fact that the random-number generator’s insides are leaking out into other parts of the program makes it difficult for us to isolate the Monte Carlo idea so that it can be applied to other tasks. In the first version of the program, assignment encapsulates the state of the random-number generator within the `rand` function, so that the details of random-number generation remain independent of the rest of the program.
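
The modularity claim can be checked by reusing the general `monte_carlo` function, unchanged, with an experiment that draws one number per trial instead of two. In this sketch the helper `make_cycle` and the experiment `below_half` are hypothetical, and a fixed cycle of values stands in for a random-number generator so that the result is reproducible.

```javascript
// The general function from the first version, repeated so that
// this sketch is self-contained.
function monte_carlo(trials, experiment) {
    function iter(trials_remaining, trials_passed) {
        return trials_remaining === 0
               ? trials_passed / trials
               : experiment()
               ? iter(trials_remaining - 1, trials_passed + 1)
               : iter(trials_remaining - 1, trials_passed);
    }
    return iter(trials, 0);
}

// Hypothetical stand-in for a generator: it keeps a local index
// and hands out the given values in a repeating cycle.
function make_cycle(values) {
    let i = 0;
    return () => {
        const v = values[i];
        i = (i + 1) % values.length;
        return v;
    };
}

const source = make_cycle([0.1, 0.6, 0.3, 0.9]);

// An experiment that consumes a single number per trial.
function below_half() {
    return source() < 0.5;
}

const fraction = monte_carlo(8, below_half);   // 4 of 8 trials pass: 0.5
```

Neither `monte_carlo` nor its caller needs to know how many numbers the experiment consumes or where they come from, because that state lives inside `source`.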

The general phenomenon illustrated by the Monte Carlo example is this: From the point of view of one part of a complex process, the other parts appear to change with time. They have hidden time-varying local state. If we wish to write computer programs whose structure reflects this decomposition, we make computational objects (such as bank accounts and random-number generators) whose behavior changes with time. We model state with local state variables, and we model the changes of state with assignments to those variables.

It is tempting to conclude this discussion by saying that, by introducing assignment and the technique of hiding state in local variables, we are able to structure systems in a more modular fashion than if all state had to be manipulated explicitly, by passing additional parameters. Unfortunately, as we shall see, the story is not so simple.

### 3.1.3 The Costs of Introducing Assignment

As we have seen, assignment enables us to model objects that have local state. However, this advantage comes at a price. Our programming language can no longer be interpreted in terms of the substitution model of function application that we introduced in section 1.1.5. Moreover, no simple model with “nice” mathematical properties can be an adequate framework for dealing with objects and assignment in programming languages.

So long as we do not use assignments, two evaluations of the same function with the same arguments will produce the same result, so that functions can be viewed as computing mathematical functions. Programming without any use of assignments, as we did throughout the first two chapters of this book, is accordingly known as *functional programming*.

To understand how assignment complicates matters, consider a simplified version of the `make_withdraw` function of section 3.1.1 that does not bother to check for an insufficient amount:

```javascript
function make_withdraw_simplified(balance) {
    return amount => {
        balance = balance - amount;
        return balance;
    };
}
```

```javascript
const W = make_withdraw_simplified(25);

W(20);
5

W(10);
-5
```

Compare this function with the following `make_decrementer` function, which does not use assignment:

```javascript
function make_decrementer(balance) {
    return amount => balance - amount;
}
```

The function `make_decrementer` returns a function that subtracts its input from a designated amount `balance`, but there is no accumulated effect over successive calls, as with `make_withdraw_simplified`:

```javascript
const D = make_decrementer(25);

D(20);
5

D(10);
15
```

We can use the substitution model to explain how `make_decrementer` works. For instance, let us analyze the evaluation of the expression

```javascript
make_decrementer(25)(20)
```

We first simplify the function expression of the application by substituting 25 for `balance` in the body of `make_decrementer`. This reduces the expression to

```javascript
(amount => 25 - amount)(20)
```

Now we apply the function by substituting 20 for `amount` in the body of the lambda expression:
```javascript
25 - 20
```

The final answer is 5.

Observe, however, what happens if we attempt a similar substitution analysis with `make_withdraw_simplified`:
```javascript
make_withdraw_simplified(25)(20)
```

We first simplify the function expression by substituting 25 for `balance` in the body of `make_withdraw_simplified`. This reduces the expression to

```javascript
(amount => {
    balance = 25 - amount;
    return 25;
})(20)
```

Now we apply the function by substituting 20 for `amount` in the body of the lambda expression:

```javascript
balance = 25 - 20;
return 25;
```

If we adhered to the substitution model, we would have to say that the meaning of the function application is to first set `balance` to 5 and then return 25 as the value of the expression. This gets the wrong answer. In order to get the correct answer, we would have to somehow distinguish the first occurrence of `balance` (before the effect of the assignment) from the second occurrence of `balance` (after the effect of the assignment), and the substitution model cannot do this.

The trouble here is that substitution is based ultimately on the notion that the names in our language are essentially symbols for values. This worked well for constants. But a variable, whose value can change with assignment, cannot simply be a name for a value. A variable somehow refers to a place where a value can be stored, and the value stored at this place can change. In section 3.2 we will see how environments play this role of “place” in our computational model.

#### Sameness and change

The issue surfacing here is more profound than the mere breakdown of a particular model of computation. As soon as we introduce change into our computational models, many notions that were previously straightforward become problematical. Consider the concept of two things being “the same.” Suppose we call `make_decrementer` twice with the same argument to create two functions:

```javascript
const D1 = make_decrementer(25);
const D2 = make_decrementer(25);
```

Are `D1` and `D2` the same? An acceptable answer is yes, because `D1` and `D2` have the same computational behavior—each is a function that subtracts its input from 25. In fact, `D1` could be substituted for `D2` in any computation without changing the result.

Contrast this with making two calls to `make_withdraw_simplified`:

```javascript
const W1 = make_withdraw_simplified(25);
const W2 = make_withdraw_simplified(25);
```

Are `W1` and `W2` the same? Surely not, because calls to `W1` and `W2` have distinct effects, as shown by the following sequence of interactions:

```javascript
W1(20);
5

W1(20);
-15

W2(20);
5
```

Even though `W1` and `W2` are “equal” in the sense that they are both created by evaluating the same expression, `make_withdraw_simplified(25)`, it is not true that `W1` could be substituted for `W2` in any expression without changing the result of evaluating the expression.

A language that supports the concept that “equals can be substituted for equals” in an expression without changing the value of the expression is said to be *referentially transparent*. Referential transparency is violated when we include assignment in our computer language. This makes it tricky to determine when we can simplify expressions by substituting equivalent expressions. Consequently, reasoning about programs that use assignment becomes drastically more difficult.
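
A small experiment makes the loss of referential transparency concrete. With `make_decrementer`, the expression `D(20) + D(20)` can be simplified to `2 * D(20)`; with `make_withdraw_simplified`, the same simplification changes the answer. The variable names below are chosen for this sketch only, and a fresh withdrawal function is created for each expression so that both start from a balance of 25.

```javascript
function make_decrementer(balance) {
    return amount => balance - amount;
}

function make_withdraw_simplified(balance) {
    return amount => {
        balance = balance - amount;
        return balance;
    };
}

// Without assignment: replacing the sum by a product is harmless.
const D = make_decrementer(25);
const d_sum = D(20) + D(20);       // 5 + 5
const d_doubled = 2 * D(20);       // 2 * 5

// With assignment: the second call sees the effect of the first.
const Wa = make_withdraw_simplified(25);
const w_sum = Wa(20) + Wa(20);     // 5 + (-15)

const Wb = make_withdraw_simplified(25);
const w_doubled = 2 * Wb(20);      // 2 * 5
```

Here `d_sum` and `d_doubled` are both 10, whereas `w_sum` is -10 and `w_doubled` is 10, so the "obvious" algebraic simplification is no longer valid once assignment is involved.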

Once we forgo referential transparency, the notion of what it means for computational objects to be “the same” becomes difficult to capture in a formal way. Indeed, the meaning of “same” in the real world that our programs model is hardly clear in itself. In general, we can determine that two apparently identical objects are indeed “the same one” only by modifying one object and then observing whether the other object has changed in the same way. But how can we tell if an object has “changed” other than by observing the “same” object twice and seeing whether some property of the object differs from one observation to the next? Thus, we cannot determine “change” without some a priori notion of “sameness,” and we cannot determine sameness without observing the effects of change.

As an example of how this issue arises in programming, consider the situation where Peter and Paul have a bank account with $100 in it. There is a substantial difference between modeling this as

```javascript
const peter_acc = make_account(100); 
const paul_acc = make_account(100);
```

and modeling it as 

```javascript
const peter_acc = make_account(100); 
const paul_acc = peter_acc;
```

In the first situation, the two bank accounts are distinct. Transactions made by Peter will not affect Paul’s account, and vice versa. In the second situation, however, we have defined `paul_acc` to be *the same thing* as `peter_acc`. In effect, Peter and Paul now have a joint bank account, and if Peter makes a withdrawal from `peter_acc` Paul will observe less money in `paul_acc`. These two similar but distinct situations can cause confusion in building computational models. With the shared account, in particular, it can be especially confusing that there is one object (the bank account) that has two different names (`peter_acc` and `paul_acc`); if we are searching for all the places in our program where `paul_acc` can be changed, we must remember to look also at things that change `peter_acc`.

> The phenomenon of a single computational object being accessed by more than one name is known as *aliasing*. The joint bank account situation illustrates a very simple example of an alias. In section 3.3 we will see much more complex examples, such as “distinct” compound data structures that share parts. Bugs can occur in our programs if we forget that a change to an object may also, as a “side effect,” change a “different” object because the two “different” objects are actually a single object appearing under different aliases. These so-called *side-effect bugs* are so difficult to locate and to analyze that some people have proposed that programming languages be designed in such a way as to not allow side effects or aliasing.

With reference to the above remarks on “sameness” and “change,” observe that if Peter and Paul could only examine their bank balances, and could not perform operations that changed the balance, then the issue of whether the two accounts are distinct would be moot. In general, so long as we never modify data objects, we can regard a compound data object to be precisely the totality of its pieces. For example, a rational number is determined by giving its numerator and its denominator. But this view is no longer valid in the presence of change, where a compound data object has an “identity” that is something different from the pieces of which it is composed. A bank account is still “the same” bank account even if we change the balance by making a withdrawal; conversely, we could have two different bank accounts with the same state information. This complication is a consequence, not of our programming language, but of our perception of a bank account as an object. We do not, for example, ordinarily regard a rational number as a changeable object with identity, such that we could change the numerator and still have “the same” rational number.
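The contrast between a value that is "the totality of its pieces" and an object with identity can be sketched in code. The example below assumes simplified, hypothetical versions of `make_rat`, `equal_rat`, and `make_account`. Rational numbers are frozen so that they can never be modified, while accounts carry mutable state.

```javascript
// Immutable rational numbers: fully determined by numerator and denominator.
function make_rat(n, d) {
    return Object.freeze([n, d]);
}
function equal_rat(x, y) {
    // Compare by cross-multiplication, so only the pieces matter.
    return x[0] * y[1] === y[0] * x[1];
}

const half_a = make_rat(1, 2);
const half_b = make_rat(1, 2);
const rats_equal = equal_rat(half_a, half_b);          // true

// Mutable accounts: identity is separate from state.
function make_account(balance) {
    return {
        withdraw: amount => {
            balance = balance - amount;
            return balance;
        },
        balance: () => balance
    };
}

const acc1 = make_account(100);
const acc2 = make_account(100);
const same_state = acc1.balance() === acc2.balance();  // true: same state information
const same_object = acc1 === acc2;                     // false: two different accounts

acc1.withdraw(25);
const state_now_differs = acc1.balance() !== acc2.balance();  // true
```

Because the rationals can never change, nothing observable distinguishes `half_a` from `half_b`. The two accounts start with identical state but diverge after a single withdrawal, which is what reveals them to be different objects.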

#### Pitfalls of imperative programming

In contrast to functional programming, programming that makes extensive use of assignment is known as *imperative programming*. In addition to raising complications about computational models, programs written in imperative style are susceptible to bugs that cannot occur in functional programs. For example, recall the iterative factorial program from section 1.2.1 (here using a conditional statement instead of a conditional expression):

```javascript
function factorial(n) {
    function iter(product, counter) {
        if (counter > n) {
            return product;
        } else {
            return iter(counter * product,
                        counter + 1);
        }
    }
    return iter(1, 1);
}
```

Instead of passing arguments in the internal iterative loop, we could adopt a more imperative style by using explicit assignment to update the values of the variables `product` and `counter`:

```javascript
function factorial(n) {
    let product = 1;
    let counter = 1;
    function iter() {
        if (counter > n) {
            return product;
        } else {
            product = counter * product;
            counter = counter + 1;
            return iter();
        }
    }
    return iter();
}
```

This does not change the results produced by the program, but it does introduce a subtle trap. How do we decide the order of the assignments? As it happens, the program is correct as written. But writing the assignments in the opposite order

```javascript
            counter = counter + 1;
            product = counter * product;
```

would have produced a different, incorrect result. In general, programming with assignment forces us to carefully consider the relative orders of the assignments to make sure that each statement is using the correct version of the variables that have been changed. This issue simply does not arise in functional programs.
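To make the trap concrete, here is a small sketch that places the two orderings side by side. The name `factorial_swapped` is hypothetical and is used only to label the version with the assignments reversed.

```javascript
// Correct order: product is updated using the old value of counter.
function factorial(n) {
    let product = 1;
    let counter = 1;
    function iter() {
        if (counter > n) {
            return product;
        } else {
            product = counter * product;
            counter = counter + 1;
            return iter();
        }
    }
    return iter();
}

// Swapped order: counter is incremented first, so product is
// multiplied by the new value of counter instead of the old one.
function factorial_swapped(n) {
    let product = 1;
    let counter = 1;
    function iter() {
        if (counter > n) {
            return product;
        } else {
            counter = counter + 1;
            product = counter * product;
            return iter();
        }
    }
    return iter();
}

const right = factorial(5);          // 120
const wrong = factorial_swapped(5);  // 720
```

In the swapped version the factor 1 is never used and an extra factor of 6 is included, so the call with argument 5 yields 720 rather than 120.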

> In view of this, it is ironic that introductory programming is most often taught in a highly imperative style. This may be a vestige of a belief, common throughout the 1960s and 1970s, that programs that call functions must inherently be less efficient than programs that perform assignments. (Steele (1977) debunks this argument.) Alternatively it may reflect a view that step-by-step assignment is easier for beginners to visualize than function call. Whatever the reason, it often saddles beginning programmers with “should I set this variable before or after that one” concerns that can complicate programming and obscure the important ideas.

The complexity of imperative programs becomes even worse if we consider applications in which several processes execute concurrently. We will return to this in section 3.4. First, however, we will address the issue of providing a computational model for expressions that involve assignment, and explore the uses of objects with local state in designing simulations.

## 3.2 The Environment Model of Evaluation

When we introduced compound functions in chapter 1, we used the substitution model of evaluation (section 1.1.5) to define what is meant by applying a function to arguments:

* To apply a compound function to arguments, evaluate the return expression of the function (more generally, the body) with each parameter replaced by the corresponding argument.

Once we admit assignment into our programming language, such a definition is no longer adequate. In particular, section 3.1.3 argued that, in the presence of assignment, a name cannot be considered to be merely representing a value. Rather, a name must somehow designate a “place” in which values can be stored. In our new model of evaluation, these places will be maintained in structures called *environments*.

An environment is a sequence of *frames*. Each frame is a table (possibly empty) of *bindings*, which associate names with their corresponding values. (A single frame may contain at most one binding for any name.) Each frame also has a pointer to its *enclosing environment*, unless, for the purposes of discussion, the frame is considered to be *global*. The *value of a name* with respect to an environment is the value given by the binding of the name in the first frame in the environment that contains a binding for that name. If no frame in the sequence specifies a binding for the name, then the name is said to be *unbound* in the environment.

![fig 3.1](./figs/fig3.1.png)

Figure 3.1 shows a simple environment structure consisting of three frames, labeled I, II, and III. In the diagram, A, B, C, and D are pointers to environments. C and D point to the same environment. The names z and x are bound in frame II, while y and x are bound in frame I. The value of x in environment D is 3. The value of x with respect to environment B is also 3. This is determined as follows: We examine the first frame in the sequence (frame III) and do not find a binding for x, so we proceed to the enclosing environment D and find the binding in frame I. On the other hand, the value of x in environment A is 7, because the first frame in the sequence (frame II) contains a binding of x to 7. With respect to environment A, the binding of x to 7 in frame II is said to *shadow* the binding of x to 3 in frame I.
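The lookup rule can be sketched as a short program. This is an illustrative model only: the representation of frames and the helper names `make_frame` and `lookup` are assumptions, as are the values bound to `y` and `z` and the name `m` in frame III, which the discussion above does not specify.

```javascript
// A frame is a table of bindings plus a pointer to its enclosing
// environment (null if there is none).
function make_frame(bindings, enclosing) {
    return { bindings: new Map(Object.entries(bindings)), enclosing: enclosing };
}

// The value of a name is given by the first frame that binds it.
function lookup(name, env) {
    if (env === null) {
        throw new Error("unbound name: " + name);
    } else if (env.bindings.has(name)) {
        return env.bindings.get(name);
    } else {
        return lookup(name, env.enclosing);
    }
}

// Structure of figure 3.1 (values of y, z, and m are placeholders).
const frame_I = make_frame({ x: 3, y: 5 }, null);
const frame_II = make_frame({ z: 6, x: 7 }, frame_I);
const frame_III = make_frame({ m: 1 }, frame_I);

// An environment is identified with a pointer to its first frame.
const A = frame_II;
const B = frame_III;
const C = frame_I;
const D = frame_I;

const x_in_D = lookup("x", D);  // 3
const x_in_B = lookup("x", B);  // 3, found in frame I via frame III
const x_in_A = lookup("x", A);  // 7, shadows the binding in frame I
```

Looking up `z` in environment B would raise the "unbound name" error, since neither frame III nor frame I binds it.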

The environment is crucial to the evaluation process, because it determines the context in which an expression should be evaluated. **Indeed, one could say that expressions in a programming language do not, in themselves, have any meaning. Rather, an expression acquires a meaning only with respect to some environment in which it is evaluated.** Even the interpretation of an expression as straightforward as `display(1)` depends on an understanding that one is operating in a context in which the name `display` refers to the primitive function that displays a value. Thus, in our model of evaluation we will always speak of evaluating an expression with respect to some environment. To describe interactions with the interpreter, we will suppose that there is a global environment, consisting of a single frame (with no enclosing environment) that includes values for the names associated with the primitive functions. For example, the idea that `display` is the name for the primitive display function is captured by saying that the name `display` is bound in the global environment to the primitive display function.

Before we evaluate a program, we extend the global environment with a new frame, the *program frame*, resulting in the *program environment*. We will add the names that are declared at the top level of the program, outside of any block, to this frame. The given program is then evaluated with respect to the program environment.

### 3.2.1 The Rules for Evaluation 

The overall specification of how the interpreter evaluates a function application remains the same as when we first introduced it in section 1.1.4:

* To evaluate an application:
    1. Evaluate the subexpressions of the application.
    2. Apply the value of the function subexpression to the values of the argument subexpressions.

The environment model of evaluation replaces the substitution model in specifying what it means to apply a compound function to arguments. 

In the environment model of evaluation, a function is always a pair consisting of some code and a pointer to an environment. Functions are created in one way only: by evaluating a lambda expression. This produces a function whose code is obtained from the text of the lambda expression and whose environment is the environment in which the lambda expression was evaluated to produce the function. For example, consider the function declaration

```javascript
function square(x) {
    return x * x;
}
```

evaluated in the program environment. The function declaration syntax is equivalent to an underlying implicit lambda expression. It would have been equivalent to have used

```javascript
const square = x => x * x;
```

which evaluates `x => x * x` and binds `square` to the resulting value, all in the program environment.

![fig 3.2](./figs/fig3.2.png)

Figure 3.2 shows the result of evaluating this declaration statement. The global environment encloses the program environment. To reduce clutter, after this figure we will not display the global environment (as it is always the same), but we are reminded of its existence by the pointer from the program environment upward. The function object is a pair whose code specifies that the function has one parameter, namely `x`, and a function body `return x * x;`. The environment part of the function is a pointer to the program environment, since that is the environment in which the lambda expression was evaluated to produce the function. A new binding, which associates the function object with the name `square`, has been added to the program frame.

In general, `const`, `function`, and `let` add bindings to frames. Assignment is forbidden on constants, so our environment model needs to distinguish names that refer to constants from names that refer to variables. We indicate that a name is a constant by writing an equal sign after the colon that follows the name. We consider function declarations as equivalent to constant declarations; observe the equal signs after the colons in figure 3.2.

Now that we have seen how functions are created, we can describe how functions are applied. The environment model specifies: To apply a function to arguments, create a new environment containing a frame that binds the parameters to the values of the arguments. The enclosing environment of this frame is the environment specified by the function. Now, within this new environment, evaluate the function body.

![fig 3.3](./figs/fig3.3.png)

To show how this rule is followed, figure 3.3 illustrates the environment structure created by evaluating the expression `square(5)` in the program environment, where `square` is the function generated in figure 3.2. Applying the function results in the creation of a new environment, labeled E1 in the figure, that begins with a frame in which `x`, the parameter for the function, is bound to the argument 5. Note that the name `x` in environment E1 is followed by a colon with no equal sign, which indicates that the parameter `x` is treated as a variable. The pointer leading upward from this frame shows that the frame’s enclosing environment is the program environment. The program environment is chosen here, because this is the environment that is indicated as part of the `square` function object. Within E1, we evaluate the body of the function, `return x * x;`. Since the value of `x` in E1 is 5, the result is `5 * 5`, or 25.

The environment model of function application can be summarized by two rules:

* A function object is applied to a set of arguments by constructing a frame, binding the parameters of the function to the arguments of the call, and then evaluating the body of the function in the context of the new environment constructed. The new frame has as its enclosing environment the environment part of the function object being applied. The result of the application is the result of evaluating the return expression of the first return statement encountered while evaluating the function body.
* A function is created by evaluating a lambda expression relative to a given environment. The resulting function object is a pair consisting of the text of the lambda expression and a pointer to the environment in which the function was created.
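The two rules can be sketched as a small model. This is a simplification under stated assumptions, not the interpreter developed later in the book: the helpers `make_frame`, `lookup`, `make_function_object`, and `apply_function_object` are hypothetical, the global environment is omitted, and the function body is represented by a host-language function that receives the new environment rather than by program text.

```javascript
function make_frame(bindings, enclosing) {
    return { bindings: bindings, enclosing: enclosing };
}

function lookup(name, env) {
    if (env === null) {
        throw new Error("unbound name: " + name);
    } else if (env.bindings.has(name)) {
        return env.bindings.get(name);
    } else {
        return lookup(name, env.enclosing);
    }
}

// Creation rule: a function object pairs the code with a pointer
// to the environment in which the function was created.
function make_function_object(parameters, body, environment) {
    return { parameters: parameters, body: body, environment: environment };
}

// Application rule: build a frame binding parameters to arguments,
// enclosed by the function's environment, and evaluate the body there.
function apply_function_object(fun, args) {
    const bindings = new Map();
    fun.parameters.forEach((parameter, i) => bindings.set(parameter, args[i]));
    const new_env = make_frame(bindings, fun.environment);
    return fun.body(new_env);
}

// Model of declaring square in the program environment.
const program_env = make_frame(new Map(), null);
const square = make_function_object(
    ["x"],
    env => lookup("x", env) * lookup("x", env),  // stands in for: return x * x;
    program_env);
program_env.bindings.set("square", square);

// Model of evaluating square(5) in the program environment.
const result = apply_function_object(lookup("square", program_env), [5]);  // 25
```

The frame created inside `apply_function_object` plays the role of E1 in figure 3.3: it binds `x` to 5 and points to the program environment because that is the environment stored in the function object.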

Finally, we specify the behavior of assignment, the operation that forced us to introduce the environment model in the first place. Evaluating the expression *name* `=` *value* in some environment locates the binding of the name in the environment. That is, one finds the first frame in the environment that contains a binding for the name. If the binding is a variable binding (indicated in the frame by just `:` after the name), that binding is changed to reflect the new value of the variable. Otherwise, if the binding in the frame is a constant binding (indicated in the frame by `:=` after the name), the assignment signals an `"assignment to constant"` error. If the name is unbound in the environment, then the assignment signals a `"variable undeclared"` error.
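The assignment rule can be sketched in the same spirit. The representation below is an assumption made for illustration: each binding records a value together with a flag saying whether it is a constant, and the helpers `make_frame`, `declare`, `lookup`, `assign`, and `error_message_of` are hypothetical names.

```javascript
function make_frame(enclosing) {
    return { bindings: new Map(), enclosing: enclosing };
}

// Add a binding to a frame, recording whether it is a constant.
function declare(name, value, is_constant, frame) {
    frame.bindings.set(name, { value: value, is_constant: is_constant });
}

function lookup(name, env) {
    if (env === null) {
        throw new Error("unbound name: " + name);
    } else if (env.bindings.has(name)) {
        return env.bindings.get(name).value;
    } else {
        return lookup(name, env.enclosing);
    }
}

// Find the first frame that binds the name, then change the binding
// if it is a variable or signal an error if it is a constant.
function assign(name, value, env) {
    if (env === null) {
        throw new Error("variable undeclared");
    } else if (env.bindings.has(name)) {
        const binding = env.bindings.get(name);
        if (binding.is_constant) {
            throw new Error("assignment to constant");
        } else {
            binding.value = value;
            return value;
        }
    } else {
        return assign(name, value, env.enclosing);
    }
}

// Helper that runs a thunk and returns the message of the error it signals.
function error_message_of(thunk) {
    try {
        thunk();
        return "no error";
    } catch (e) {
        return e.message;
    }
}

const outer = make_frame(null);
declare("balance", 100, false, outer);  // variable binding
declare("rate", 5, true, outer);        // constant binding
const inner = make_frame(outer);

assign("balance", 75, inner);           // changes the binding in the outer frame
const new_balance = lookup("balance", inner);                              // 75
const constant_error = error_message_of(() => assign("rate", 6, inner));   // "assignment to constant"
const unbound_error = error_message_of(() => assign("total", 0, inner));   // "variable undeclared"
```

Note that the assignment issued in `inner` modifies the binding held in `outer`; assignment never creates a new binding in the frame where it is evaluated.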

These evaluation rules, though considerably more complex than the substitution model, are still reasonably straightforward. Moreover, the evaluation model, though abstract, provides a correct description of how the interpreter evaluates expressions. In chapter 4 we shall see how this model can serve as a blueprint for implementing a working interpreter. The following sections elaborate the details of the model by analyzing some illustrative programs.