# 1.2.1: Bikeshare (State and Change)

<br>



---



*Modeling and Simulation in Python*


Copyright 2021 Allen Downey, (License: [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International](https://creativecommons.org/licenses/by-nc-sa/4.0/))

Revised, Mike Augspurger (2021-present)

<br>

---





## Types of Models

In this course, we'll build and interpret a range of models, and they are often quite different from each other.  So as we start out, it's important to get a sense of the different kinds of models.  We can divide models in any number of ways: by their underlying math, their behavior, their relationship to time, and so on.

<br>

We've already seen one way to describe a model: its purpose.  Some models are built to explain, others to predict, and others to optimize.   Let's look at a few other ways to categorize models.

### Behavior: Stochastic vs. Deterministic

A stochastic model is one that is governed by a random function of some sort.  This "randomness" might be a result of the complexity of the system: for instance, a model of traffic at an intersection would have to take into account the fact that at some moments a large number of cars might be in the vicinity of the intersection, while at others it might be empty.  At other times the randomness is inherent in the nature of the system: the movement of quantum particles, for instance, is (as far as we can tell) genuinely random, so a quantum model needs to take this into account.

<br>

When a stochastic model is implemented, the results will vary from one simulation run to the next, even if the initial conditions stay the same.  The *results* of such models, then, tend to be reported in terms of statistics: the *average* time for a car to move through the intersection is 36 seconds, for instance.  In such cases, variability is important too: do most cars take between 34 and 38 seconds, or do some take 2 seconds while others sit at the intersection for 120 seconds?

<br>

A deterministic model, on the other hand, is governed by a hard and fast rule, and an individual simulation run with the same initial conditions will always produce the same result.  If I model the rise in temperature of a metal block that I put in boiling water, I would expect to see the same temperature curve every time I put the same metal block into the same boiling water.
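
As a minimal sketch of the difference, the two functions below are hypothetical illustrations: their names, the 10% warming rule, and the 50% arrival chance are assumptions chosen for the example, not measured values.  The first applies a fixed rule, while the second consults a random number generator, so its result can differ from run to run.

```python
import random

def warm_block(temperature, water_temperature=100):
    """Deterministic rule (assumed): close 10% of the gap to the water temperature."""
    return temperature + 0.1 * (water_temperature - temperature)

def cars_after_one_moment(cars, rng):
    """Stochastic rule (assumed): a new car arrives with a 50% chance."""
    if rng.random() < 0.5:
        cars += 1
    return cars

# The deterministic rule gives the same answer every time
print(warm_block(20), warm_block(20))

# The stochastic rule may give different answers for the same starting value
rng = random.Random()
print([cars_after_one_moment(3, rng) for _ in range(5)])
```

Running the cell several times should leave the first line of output unchanged, while the list on the second line will usually vary.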

✅ Active Reading: Describe in your own words the difference between a stochastic and deterministic model.

### Type of Implementation: Analytical vs. Computational

The way we solved the penny problem in the previous chapter was to take an equation and apply that equation to a particular situation.  This is called an *analytical* model.  This means essentially that the tool we use to solve the problem is mathematical.  As we'll see, we can use analytical tools to model much more complicated systems, too.  The advantage of an analytical model is that it gives us a clear mathematical expression of the nature of the problem--it's great for providing a way to compare one system to another.   If I say, "I'm going to model the height of a growing tree as a linear system," I'm saying that the height $h$ of the tree can be represented as a linear mathematical equation (as a function of time $t$ with a constant $C$):

$$h=Ct$$

We know a lot about linear systems, and can compare this system to other known examples: this is helpful for understanding the behavior of a system.

<br>

A *computational* model depends on using a computer, rather than just mathematical tools, to implement the model.  In doing so, we might lose sight of the mathematical nature of the system.  But the advantage is that we can model much more complex systems with computational models than with analytical ones.  There might not be a mathematical solution to a complex geometry (say, the shape of a car) moving in a complex system (say, a windy road), but with enough computational power, we can create a model that describes how that system behaves. 
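
To make the contrast concrete, here is a small sketch that treats the tree-height example both ways.  The growth constant `C = 0.5` and the function names are assumptions made up for illustration.  The analytical version evaluates $h = Ct$ directly; the computational version reaches the same height by adding a little growth at each time step.

```python
C = 0.5  # assumed growth rate in meters per year (hypothetical value)

def height_analytical(t):
    """Analytical model: evaluate h = C t directly."""
    return C * t

def height_computational(t_end, dt=1.0):
    """Computational model: add C * dt to the height at every time step."""
    h = 0.0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        h += C * dt
    return h

print(height_analytical(10))
print(height_computational(10))
```

For a system this simple the two approaches agree, and the analytical form is clearly easier.  The step-by-step version becomes worthwhile when no tidy equation is available.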



✅ Active Reading: What is the main advantage of an analytical model and of a computational model?

### Known values: Initial values vs. Boundary values

Much of the time we are interested in what happens to a system as it moves through time.  If we know the starting values in a system, and have a model for how those values interact with each other, we can implement an *initial value model*.  Such a simulation will move through time step-by-step, recording the change in conditions along the way.   As you can imagine, this sort of problem is often useful for predictive purposes: what will happen to this system in time?

<br>

A *boundary value model*, on the other hand, is concerned only with a single moment in time, but is interested in the state of the system through space.   If I have an aluminum bar of a complex shape, and know that I am going to keep one end of the bar motionless while moving the other end by 2 cm, I might want to know where the bar is likely to crack.  We could create a boundary value model that detailed the state of stress within the bar with the given change in location of one end of the bar.

<br> 

These are not the only ways to describe differences in models, but they are a good start.  Over the next couple of chapters, we'll encounter some of these types as we learn to build and improve our models and simulations.  In this chapter, for instance, we will look at a stochastic initial value problem that aims to optimize a design through a computational implementation.  In the next chapter, we'll work through an example of a deterministic model with an analytical solution designed to help us understand and predict the behavior of a system.

<br>

---

## The Bikeshare Problem

One purpose of modeling is to optimize an engineered system.  In this chapter, we are going to create a simulation to do just that.

<br>

We are interested in instituting a bike share system for students traveling between Augie and downtown Moline.  By observation, we know that more students use the bikes to get from Augie to Moline than the reverse, so bikes tend to build up at Moline.

<br>

We only want to move bikes once every day from Moline to Augie.  We know we have 12 bikes in total.  If we want as few disappointed customers as possible, how many bikes should we put in each place at the start of the day?

## State and Change in Modeling

Most models attempt to describe some system that changes over time.  How does the population grow? When will the company start to make a profit? How does the rocket re-enter the atmosphere? 

This process consists of two parts:

1) A state (the current population, the company's current financial situation, the rocket in space, etc...).      

2) A process of change over time.  

Any model will need to define both of these parts. In our simulation, we'll need to keep track of where the bikes are--that is, we'll need to keep track of the *state* of the system--and we'll also need to define how and when the bikes move.  
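
As a rough sketch of these two parts, consider the population example.  The starting value, the function name, and the fixed yearly gain below are all assumptions invented for illustration.

```python
# State: the current population (hypothetical starting value)
population = 1000

def one_year_of_growth(population, new_residents=50):
    """Change (assumed rule): the population gains a fixed number of residents each year."""
    return population + new_residents

# Apply the change once to move the state forward by one year
population = one_year_of_growth(population)
print(population)
```

The variable holds the state, and the function describes how that state changes over one step of time.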

### Storing the *state* in a Series

The state is defined by a set of values which will change over the course of the simulation.  In order to store this state, we'll use an `object` called a `Series`, which is part of a `Library` of code called Pandas.  A couple definitions:

- An `object` is a catch-all term for a "thing" in python: a collection of data of any sort.

- A `Series` is an object that holds a set of variable values in a table-like form: it's a particular kind of `object`.  We'll see an example of this in action in just a moment.

- A `Library` is a set of pre-written code that we can access in our own programs.  You can "check out" this pre-written code by "importing" it into your code.

We'll start by importing the Pandas library (which is a data analysis library), and creating a `Series` called `bikeshare`.  This will be our *state object*: it will hold information about the state of our system.

In [None]:
import pandas as pd

bikeshare = pd.Series(dict(augie=10,moline=2),name='Number of Bikes')
bikeshare


augie     10
moline     2
Name: Number of Bikes, dtype: int64

Notice the format here (We'll use a lot of cut-and-paste in our programming, so you don't need to memorize this stuff.  But you do want to start paying attention to the format):

- The first line imports the library, and gives it the shorthand `pd`.  Importing the library makes all of its objects and methods available to the program, and the shorthand version slims down the code a bit.

- The second line creates the `Series` and calls it `bikeshare`.  
- The first expression on the right hand side of the second line creates a foundational Python object called a `dictionary`, which is a set of label-value pairs (think of it like a word and a definition--thus the name dictionary!). This dictionary contains the data for our `Series`. Notice that each `value` has a `label` (the value `10`, for instance, has the label `augie`).  `name` is a *keyword argument* that describes the meaning of our values: `2` and `10` represent the number of bikes at the two locations.  

Together, these two variables are the *state variables*.  The *initial state* indicates that there are 10 bikes at Augie and 2 at Moline. 
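
To see the dictionary on its own, the sketch below builds it in two equivalent ways and then hands it to `pd.Series`.  The variable names here are made up for this example so that they don't disturb `bikeshare`.

```python
import pandas as pd

# Two equivalent ways to write the same dictionary
counts_from_dict = dict(augie=10, moline=2)
counts_from_braces = {'augie': 10, 'moline': 2}
print(counts_from_dict == counts_from_braces)

# Either dictionary can supply the labels and values of a Series
example_state = pd.Series(counts_from_braces, name='Number of Bikes')
print(example_state)
```

The curly-brace form is the more common way to write a dictionary in Python; the `dict(...)` form used in this chapter saves typing the quotation marks around each label.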

We can read the variables inside a `Series` using the *dot operator*, like this:

In [None]:
bikeshare.augie

10

And this:

In [None]:
bikeshare.moline

2

Or, to display the state variables and their values, you can just enter the name of the object:

In [None]:
bikeshare

augie     10
moline     2
Name: Number of Bikes, dtype: int64

These values make up the *state* of the system.
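
The dot operator is a convenient shorthand.  As a sketch of an alternative you may run into, a `Series` can also be read with square brackets and the label in quotes.  The `example_state` and `spaced` objects below are hypothetical, created just for this illustration.

```python
import pandas as pd

example_state = pd.Series(dict(augie=10, moline=2), name='Number of Bikes')

# Dot operator and bracket indexing read the same value
print(example_state.augie)
print(example_state['augie'])

# Brackets also work when the label is not a valid Python name
spaced = pd.Series({'rock island': 0})
print(spaced['rock island'])
```

Bracket indexing is the more general form: a label that contains a space, or that happens to match the name of a `Series` method, can only be reached this way.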

We can represent our `Series` in a way that looks more like a table by transforming the `Series` into another Pandas object called a `DataFrame` (we'll get to those later: we're just using it as a display tool here):

In [None]:
pd.DataFrame(bikeshare)

Unnamed: 0,Number of Bikes
augie,10
moline,2


You don't have to use this format, but there are a couple of reasons to use it.  One, the results look better.  Two, it provides a better visual sense of the nature of a `Series`.  But most importantly, Colab turns a `DataFrame` into an interactive table.  Click the little magic wand.  Now click 'Number of Bikes': see how it rearranges the table?  Click filter, and play with that.  Obviously, we don't really need this tool here, but it might come in handy later!

### The change function

Ok, we now have an initial state.  But any useful model is going to involve change.

<br>

In a model, the process of change must be defined by a rule, or a set of rules, about how this change occurs.  We will call these rules a *change function*.  Much of this class will be spent exploring different types of change functions.  These rules might stay the same the whole time, or they might change, gradually or suddenly.  The rules might create random *stochastic* changes, or established *deterministic* changes.

<br>

The change occurs through *time steps*.  Each time step represents a certain amount of time, and each step represents a change in the state of the system.

At its most basic, we enact the change function manually: that is, we can update the state by assigning new values to the variables. 
For example, if a student moves a bike from Augie to Moline, we can figure out the new values and assign them:

In [None]:
# First simple change function
bikeshare.augie = 9
bikeshare.moline = 3
pd.DataFrame(bikeshare)

Unnamed: 0,Number of Bikes
augie,9
moline,3


That's not particularly efficient, though.  One step better, we can avoid doing the math ourselves by using *update operators*, `-=` and `+=`, to subtract 1 from
`augie` and add 1 to `moline`:

In [None]:
# Second simple change function
bikeshare.augie -= 1
bikeshare.moline += 1
bike_df = pd.DataFrame(bikeshare)
bike_df

Unnamed: 0,Number of Bikes
augie,8
moline,4


Try running the last cell again.  What happens to the state of the system?  Now run the previous code cell (with the title "first simple change function") again.

<br>

Each time you run the cell, it performs the action described by the code, even if you had already run that cell before.
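
To see why re-running a cell with update operators keeps changing the state, while re-running a cell that assigns fixed values does not, it may help to spell out what an update operator does.  The sketch below uses a plain variable named `bikes`, which is an assumption for this example and not part of the `bikeshare` object.

```python
bikes = 10

# Long form: compute the new value, then assign it back to the same name
bikes = bikes - 1

# Short form: the update operator does both in one step
bikes -= 1

print(bikes)
```

Because an update operator starts from whatever the current value is, each run moves the value one step further.  An assignment such as `bikes = 9` produces the same result no matter how many times it runs.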

<br>

Notice here that we didn't just send the `DataFrame` to output: we also assigned it a name, so it exists as a variable.   Now click `{x}` in the left hand tool bar.  Notice that `bikeshare` is defined as a Series.  Hover over it, and you can see its current values as well as its shape `(2,)`, which indicates that it is one-dimensional and holds 2 values (a `pd.Series` has rows but no separate columns).  The `bike_df` has the same values, but a different shape: `(2, 1)`, meaning 2 rows and 1 column of data.  As we'll see, a `DataFrame` can be expanded to include multiple columns, like a table in a spreadsheet.
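
If you'd rather check the shapes in code than in the variable inspector, something like the following sketch should work.  The names `example_state` and `example_df` are made up here to avoid overwriting the objects above.

```python
import pandas as pd

example_state = pd.Series(dict(augie=10, moline=2), name='Number of Bikes')
example_df = pd.DataFrame(example_state)

# A Series has one dimension; a DataFrame has two (rows and columns)
print(example_state.shape)
print(example_df.shape)
```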

### Using a Python function to define the change function

The advantage of using a computer to create a model (i.e. of creating *simulations*) is that we don't have to manually enter the change for each time step.  We want to define the rule (that is, the change function) and let the computer do the grunt work.

<br>

Oftentimes, we will want to code the change function in a Python `function`.  This allows us to write a few lines of code, test them to confirm they do what we intend, and then use them to implement our change function. 

<br>

For example, these lines move a bike from Augie to Moline:

In [None]:
bikeshare.augie -= 1
bikeshare.moline += 1

Rather than repeat them every time a bike moves, we can define a new
function:

In [None]:
def bike_to_moline():
    bikeshare.augie -= 1
    bikeshare.moline += 1

`def` is a special word in Python that indicates we are defining a new
function. The name of the function is `bike_to_moline`. Notice some of the details:
* The empty parentheses indicate that this function requires no additional information when it runs: it can run by itself. 
* The colon indicates the beginning of an indented *code block*.
* The next two lines are the *body* of the function. They have to be indented; by convention, the indentation is four spaces.
* Notice the name of the function: always choose a name that describes what a function does, as this makes reading and editing the code much easier.


When you define a function, it has no immediate effect, because all you have done is to create a set of steps (i.e. the function). The steps are not followed until you *call* the function. Here's how to call
this function:

In [None]:
bike_to_moline()

When you call the function, it runs the steps in the body of the function, which
update the variables of the `bikeshare` object; you can check by
displaying the new state.

In [None]:
pd.DataFrame(bikeshare)

Unnamed: 0,Number of Bikes
augie,6
moline,6


When you call a function, you have to include the parentheses. If you
leave them out, you get this:

In [None]:
bike_to_moline

<function __main__.bike_to_moline()>

This result indicates that `bike_to_moline` is a function. You don't have to know what `__main__` means, but if you see something like this, it probably means that you named a function but didn't actually call it.
So don't forget the parentheses.

The chief benefit of defining functions is that you avoid repeating chunks of
code, which makes programs smaller and much easier to read, edit, and debug. 
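
As a small sketch of that benefit, the cell below moves three bikes with three short calls instead of six lines of arithmetic.  It uses its own hypothetical `demo` object and `demo_bike_to_moline` function so that the state of `bikeshare` is left alone.

```python
import pandas as pd

# A separate state object, used only for this illustration
demo = pd.Series(dict(augie=10, moline=2), name='Number of Bikes')

def demo_bike_to_moline():
    # Move one bike from Augie to Moline in the demo state
    demo.augie -= 1
    demo.moline += 1

# Each call repeats the same two steps
demo_bike_to_moline()
demo_bike_to_moline()
demo_bike_to_moline()

print(demo)
```

If the rule for moving a bike ever changes, only the body of the function needs to be edited; every call picks up the change.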




### Using `print` to monitor a simulation

As you write more complicated programs, it is easy to lose track of what
is going on. One of the most useful tools for debugging is the *print statement*, which displays text in the Jupyter notebook.

<br>

Normally when Jupyter runs the code in a cell, it displays the value of
the last line of code. For example, if you run:

In [None]:
bikeshare.augie
bikeshare.moline

6

Jupyter runs both lines, but it only displays the value of the
second. If you want to display more than one value, you can use
print statements:

In [None]:
print(bikeshare.augie)
print(bikeshare.moline)

6
6


When you call the `print` function, you can put a variable in
parentheses, as in the previous example, or you can provide a sequence
of variables separated by commas, like this:

In [None]:
print("There are", bikeshare.augie,"bikes at Augustana, and", 
      bikeshare.moline, "bikes in Moline.")

There are 6 bikes at Augustana, and 6 bikes in Moline.


Python looks up the values of the variables and displays them. Notice that when you put letters or numbers inside quotation marks, they are not treated as variables or numerical values.  These quoted `objects` are called `strings`.
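
The short sketch below shows the difference the quotation marks make.  The variable `bikes` is a stand-in invented for this example.

```python
bikes = 6

print(bikes)      # no quotation marks: Python looks up the variable
print("bikes")    # quotation marks: Python displays the text itself
print(type("6"))  # "6" in quotes is a string, not a number
print(type(6))    # without quotes, 6 is an integer
```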

<br>

Print statements are useful for debugging functions. For example, we can
add a print statement to `bike_to_moline`, like this:

In [None]:
def bike_to_moline():
    print('Moving a bike to Moline')
    bikeshare.augie -= 1
    bikeshare.moline += 1

Each time we call this version of the function, it displays a message,
which can help us keep track of what the program is doing.
The message in this example is a *string*, which is a sequence of
letters and other symbols in quotes.

<br>

Just like `bike_to_moline`, we can define a function that moves a
bike from Moline to Augustana:

In [None]:
def bike_to_augie():
    print('Moving a bike to Augustana')
    bikeshare.moline -= 1
    bikeshare.augie += 1

And call it like this:

In [None]:
bike_to_augie()

pd.DataFrame(bikeshare)

Moving a bike to Augustana


Unnamed: 0,Number of Bikes
augie,7
moline,5


When "Moving a bike to Augustana" is printed in the output, we can be sure that this function was properly called.  If that were not printed, we would know that the function was never called.

<br>



---

## Summary and Exercises

This chapter introduces the tools we need to keep track of the state of a system and to change that state.

In the next chapter, we'll use these tools to create our first simulation, which will show the change in the system over the course of a day.



### Exercise 1

✅  What happens if you spell the name of a state variable wrong?  Edit the following cell, change the spelling of `moline`, and run it.

The error message uses the word *attribute*, which is another name for what we are calling a state variable. 

In [None]:
bikeshare = pd.Series(dict(augie=10,moline=2),name="Number of Bikes")

bikeshare.moline

2

### Exercise 2

✅  Make a state object, but this time add a third state variable in addition to `augie` and `moline`.  Call the third variable `rock_island`, with initial value 0, and display the state of the system.

In [None]:
# Solution goes here
bikeshare = pd.Series(dict(augie=10,moline=2,rock_island=0),name="Number of Bikes")
pd.DataFrame(bikeshare)

Unnamed: 0,Number of Bikes
augie,10
moline,2
rock_island,0
