# 1.2.6: Bike Share (Improving the Code through Iteration)

<br>



---



*Modeling and Simulation in Python*

Copyright 2021 Allen Downey, (License: [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International](https://creativecommons.org/licenses/by-nc-sa/4.0/))

Revised, Mike Augspurger (2021-present)

<br>



---



We've done our investigation, abstraction, and even implementation, so we now have functioning code.  But it's not perfect yet, so now we enter the validation-iteration-implementation feedback loop.

In [None]:
# Import libraries
import pandas as pd
import numpy.random as npr



---



## Incremental Development

When you start writing programs that are more than a few lines, you
might find yourself spending more time debugging. The more code you write before you start debugging, the harder it is to find the problem.

<br>

*Incremental development* is a way of programming that tries to
minimize the pain of debugging. You might not have noticed, but we've been doing incremental development already.  

<br>

The fundamental steps are:

1.  Start with a working program. If you have an example from a
book, or a program you wrote that is similar to what you are working
on, start with that. Otherwise, start with something you *know* is
correct, like a simple statement of your parameters.  

2.  Make one small, testable change at a time. A "testable" change is one that displays something or has some other effect you can check. Above, we added arguments to our function.

3.  Run the program and see if the change worked. In the case above, we compared the results of our new function to the results of the old one.  You may need to print out values (see below) to tell whether the code is working as intended.  

4. If the change worked, go back to Step 2 with another alteration. If it didn't, you have to do some debugging, but if the
change you made was small, it shouldn't take long to find the problem.

In the exercises in this class, if you find yourself writing more than a few lines of code before you start testing, remember to take small steps and constantly check the results of your changes!
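
As a minimal sketch of these steps, the cell below starts from a small working function, makes one change, and checks the new version against the old one. It uses a plain dictionary instead of our state object, and the names `move_to_moline` and `move_to_moline_v2` are hypothetical, chosen only for this illustration:

```python
# Step 1: start with something known to work (a plain dict of bike counts).
bikes = {"augie": 10, "moline": 2}

def move_to_moline(state):
    """Working version: move one bike from Augie to Moline."""
    state["augie"] -= 1
    state["moline"] += 1
    return state

# Step 2: make ONE small, testable change (here, an optional argument `n`).
def move_to_moline_v2(state, n=1):
    """Changed version: move `n` bikes from Augie to Moline."""
    state["augie"] -= n
    state["moline"] += n
    return state

# Step 3: run both versions on identical copies of the input and compare.
old = move_to_moline(dict(bikes))
new = move_to_moline_v2(dict(bikes), n=1)
print(old == new)
```

If the comparison prints `True`, the change did not break the existing behavior, and it is safe to move on to the next small change.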

Let's do some incremental development with our code:

### Choosing the next step for the model

We've made some improvements to our code.  Let's turn to the more substantive issues.  The model we have so far is simple, but it is based on unrealistic
assumptions. What weaknesses did you identify in the exercises for the previous notebook?

<br>

Here are some of the weaknesses you might have found:

-   In the model, a student is equally likely to arrive during any
    15-minute period. In reality, this probability varies depending on time of day, day of the week, etc.

-   The model does not account for travel time from one bike station to another.

-   The model does not allow more than one student to arrive in a given 15-minute period.

-   The model does not check whether a bike is available, so it's
    possible for the number of bikes to be negative (as you might have
    noticed in some of your simulations).

Some of these modeling decisions are better than others:
* the first assumption might be reasonable if we simulate the system for a short period of time, like one hour.
* the second and third assumptions are not very realistic, but they might not affect the results very much, depending on what we use the model for.

The last assumption seems problematic, and not too hard to fix.  So let's start there.

<br>

This is how incremental development works: start with a simple model, identify the most
important problems, and make improvements one step at a time. It often takes several iterations to develop a model that is good enough for the intended
purpose, but no more complicated than necessary.

### Development 1: Eliminating Negative Bikes

Currently the simulation does not check whether a bike is available when a customer arrives, so the number of bikes at a location can be
negative. That's not very realistic. Here's a version of `bike_to_augie` that fixes the problem:

In [None]:
def bike_to_augie(state):
    if state.moline > 0:
        state.moline -= 1
        state.augie += 1
    return state

The `if` statement checks whether the number of bikes at Moline is greater than zero. If not, the function skips to the `return` line.  So if there are no bikes at Moline, the state is unchanged.

<br>

We can test it by initializing a state with no bikes at Moline and calling `bike_to_augie`.

In [None]:
bikeshare = pd.Series(dict(augie=12,moline=0),name="Number of Bikes")
bike_to_augie(bikeshare)

The state of the system should be unchanged.  No more negative bikes (at least at Moline)!
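
One way to make this check explicit is to compare the state before and after the call. The sketch below assumes the same `pd.Series` layout used above; the variable names `empty`, `before`, and `stocked` are only for illustration.

```python
import pandas as pd

def bike_to_augie(state):
    # Only move a bike if Moline actually has one
    if state.moline > 0:
        state.moline -= 1
        state.augie += 1
    return state

# Case 1: no bikes at Moline, so the state should not change
empty = pd.Series(dict(augie=12, moline=0), name="Number of Bikes")
before = empty.copy()
after = bike_to_augie(empty)
print(after.equals(before))

# Case 2: one bike at Moline, so exactly one bike should move
stocked = pd.Series(dict(augie=11, moline=1), name="Number of Bikes")
moved = bike_to_augie(stocked)
print(moved.augie, moved.moline)
```

Testing both the empty case and a normal case confirms that the new `if` statement blocks only the moves it is supposed to block.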

### Development 2: Running the simulation as a single function

Some incremental developments, like eliminating negative bikes, concern the accuracy of the simulation. Others, though, are designed to help the simulation run more smoothly.  Let's try one of those here.

<br>

Here's the code we have so far.  We want to put it all under the umbrella of a single function called `run_simulation()`:

In [None]:
def bike_to_moline(state):
    if state.augie > 0:
        state.augie -= 1
        state.moline += 1
    return state

def bike_to_augie(state):
    if state.moline > 0:
        state.augie += 1
        state.moline -= 1
    return state

def change_func(state, ptm, pta):
    if npr.random() < ptm:
        state = bike_to_moline(state)

    if npr.random() < pta:
        state = bike_to_augie(state)
    return state

# Call the function
bikeshare = change_func(bikeshare, 0.5, 0.4)

Notice the "nested" nature of the arguments and returned values in `change_func()`:

<br>

* When we call `change_func()`, the argument pulls our state `bikeshare` into that function, but calls it `state` inside the function.
* When the function reaches line 3, `bike_to_moline()` pulls the `state` Series into the `bike_to_moline()` environment, and after altering it, returns it to the `change_func()` environment.
* After repeating this with `bike_to_augie()`, `change_func()` returns the now twice-altered `state` Series back to the global environment.
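
The hand-off can be seen in a stripped-down sketch. The function names `inner_step` and `outer_step` are hypothetical stand-ins for `bike_to_moline()` and `change_func()`, with the random checks removed so the result is predictable:

```python
import pandas as pd

def inner_step(state):
    # Plays the role of bike_to_moline(): alter the Series, then hand it back
    state.moline += 1
    return state

def outer_step(state):
    # Plays the role of change_func(): pass the Series down, get it back, return it
    state = inner_step(state)
    return state

bikeshare = pd.Series(dict(augie=12, moline=0), name="Number of Bikes")
result = outer_step(bikeshare)

# The object that comes back is the very same Series that went in
print(result is bikeshare)
print(bikeshare.moline)
```

Because these functions alter the Series they are given rather than a copy, the name `state` inside each function and the name `bikeshare` outside refer to one and the same object.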



We can double down on this "nesting" to create a function that can run an entire simulation.  `run_simulation()`
creates a state object, uses a "for" loop to run a certain number of time steps (`num_steps`), and then returns the
state object.  

In [None]:
def run_simulation(ptm, pta, iAug, iMol, num_steps):
    state = pd.Series(dict(augie=iAug,moline=iMol),name="Number of Bikes")

    for i in range(num_steps):
        state = change_func(state, ptm, pta)

    return state

We can call `run_simulation` like this:

In [None]:
final_state = run_simulation(0.5, 0.4, 10, 2, 60)
final_state

Notice that we enter our independent variables (the state variables, `augie=10` and `moline=2`) as well as the parameters for our model (`ptm=0.5`, `pta=0.4`, and `num_steps=60`) without interacting with the code at all.  This is very efficient!
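
A call with five bare numbers can be hard to read later. As a sketch, the same call can be written with keyword arguments, a standard Python feature that attaches each value to its parameter name. The functions are repeated here only so the cell runs on its own:

```python
import numpy.random as npr
import pandas as pd

def bike_to_moline(state):
    if state.augie > 0:
        state.augie -= 1
        state.moline += 1
    return state

def bike_to_augie(state):
    if state.moline > 0:
        state.augie += 1
        state.moline -= 1
    return state

def change_func(state, ptm, pta):
    if npr.random() < ptm:
        state = bike_to_moline(state)
    if npr.random() < pta:
        state = bike_to_augie(state)
    return state

def run_simulation(ptm, pta, iAug, iMol, num_steps):
    state = pd.Series(dict(augie=iAug, moline=iMol), name="Number of Bikes")
    for i in range(num_steps):
        state = change_func(state, ptm, pta)
    return state

# Same call as before, but each value is labeled with its parameter name
final_state = run_simulation(ptm=0.5, pta=0.4, iAug=10, iMol=2, num_steps=60)
print(final_state)
```

The result varies from run to run because the moves are random, but the two counts should always add up to the 12 bikes we started with, and neither should be negative.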

### Monitoring changes with `print()`

As you write more complicated programs, it is easy to lose track of what
is going on: `run_simulation()` can perform up to 120 bike moves over 60 time steps, and we can't "see" any of them. One of the most useful tools for debugging is the *print statement*, which displays text in cell output.

<br>

Let's say we made a mistake in writing `bike_to_moline()`: in this case we accidentally made the "greater than" sign a "less than" sign.  Run the cell below:

In [None]:
def bike_to_moline(state):
    if state.augie < 0:
        state.augie -= 1
        state.moline += 1
    return state

As a result, our final state is the same every time.  Run the cell below multiple times.  Why do we keep getting the same result?

In [None]:
final_state = run_simulation(0.5, 0.4, 10, 2, 60)
final_state

So how can `print()` help us?  Why are all the bikes ending up at Augie?

<br>

We might suspect something is wrong with our change function.  Is `bike_to_moline()` not getting called?  Or is something wrong with the function?  Let's add some print statements:

In [None]:
def change_func(state, ptm, pta):
    if npr.random() < ptm:
        print("Bike should go to Moline", state.moline)
        state = bike_to_moline(state)
        print("Did bike go to moline?", state.moline)

    if npr.random() < pta:
        state = bike_to_augie(state)
    return state

Now run the simulation.  If the lines get printed, then the function is being called, and we can see if the state variable changes as it should.  But if the lines don't get printed, we'll know that the code is never getting inside the `if` clause for some reason.  

<br>

We'll start with only 20 time steps, so we can better see a problem:

In [None]:
final_state = run_simulation(0.5, 0.4, 10, 2, 20)

Ok, so now we can see that `bike_to_moline()` is being called (because the print statements before and after it are producing output)!  But the number of bikes at Moline isn't going up between the two lines, so we know that something is wrong with that function.  Now look carefully at the code and see if you can see what the problem is (and answer the questions below):




---
<br>

🟨 🟨

In [52]:
# import supporting files
from urllib.request import urlretrieve

location = 'https://github.com/MAugspurger/ModSimPy_MAugs/raw/main/'
folder = 'Support_files/'
name = 'Embedded_Qs.ipynb'
local, _ = urlretrieve(location + folder + name, name)
%run /content/$name
home = 'https://github.com/MAugspurger/ModSimPy_MAugs/raw/main/Images_and_Data/Embedded_Qs/'
efile = '1_2_bikeshare'

#@title #======================================= { run: "auto", form-width: "50%", display-mode: "form" }
#@markdown #####*Multiple Choice*:  <br><br> Choose the correct letter.  <br><br>
data = display_multC(efile,home,5)
answer = "" # @param ["", "A", "B", "C", "D", "E"]
check_multC(data,answer)

Look at the output of the previous cell.  Why does the number of bikes at Moline never go up?

A) Because the "bike_to_moline" function never gets called
B) Because the change in the number of bikes is not properly recorded in "bikeshare"
C) Because "state.augie" is always above 0, so the "if" clause in "bike_to_moline" never runs even when the function is called


In [53]:
#@title #======================================= { run: "auto", form-width: "50%", display-mode: "form" }
#@markdown #####*Multiple Choice*:  <br><br> Choose the correct letter.  <br><br>
data = display_multC(efile,home,6)
answer = "" # @param ["", "A", "B", "C", "D", "E"]
check_multC(data,answer)

Look at the output of the previous cell.  20 time steps are run.  Why are the print statements always run fewer than 20 times?

A) Because the reversed "greater than" sign means the  print statements are not always run
B) Because the function "bike_to_moline" is only run 50% of the time (and this is working properly)
C) Because the function "bike_to_moline" is only run 50% of the time (and this is a problem)


Using the print statements helps us narrow down what the problem is: hopefully, we would now notice the error with the "greater than" sign and fix it.  Fix the function in the cell below and make sure everything runs as it should:

In [None]:
def bike_to_moline(state):
    if state.augie < 0:
        state.augie -= 1
        state.moline += 1
    return state

final_state = run_simulation(0.5, 0.4, 10, 2, 20)
final_state

Alright, that's better! Now a bike gets added to Moline each time the function is called!
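
Because the simulation is random, it can be hard to tell by eye whether a fix really worked. One possible check, sketched below, is to pick extreme probabilities so the outcome is predictable: with `ptm=1.0` a bike is sent toward Moline at every step, and with `pta=0.0` none is ever sent to Augie. The cell repeats the corrected functions so it can run on its own.

```python
import numpy.random as npr
import pandas as pd

def bike_to_moline(state):
    if state.augie > 0:
        state.augie -= 1
        state.moline += 1
    return state

def bike_to_augie(state):
    if state.moline > 0:
        state.augie += 1
        state.moline -= 1
    return state

def change_func(state, ptm, pta):
    if npr.random() < ptm:
        state = bike_to_moline(state)
    if npr.random() < pta:
        state = bike_to_augie(state)
    return state

def run_simulation(ptm, pta, iAug, iMol, num_steps):
    state = pd.Series(dict(augie=iAug, moline=iMol), name="Number of Bikes")
    for i in range(num_steps):
        state = change_func(state, ptm, pta)
    return state

# Five steps, one bike to Moline per step: expect augie=5, moline=7
check = run_simulation(1.0, 0.0, 10, 2, 5)
print(check.augie, check.moline)

# More steps than bikes: Augie should stop at 0, not go negative
drained = run_simulation(1.0, 0.0, 3, 0, 10)
print(drained.augie, drained.moline)
```

This works because `npr.random()` returns values from 0 up to (but not including) 1, so the comparison with `1.0` is always true and the comparison with `0.0` is never true.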



---

## Exercises

---

<br>

🟨 🟨

### Exercise 1

Here is a copy of `run_simulation`.  Add an inline comment (with #) above each of the lines of code.  Each comment should explain what that line does.  Remember that a comment should have its own line, and should be *above* the line it is commenting on.

In [None]:
def run_simulation(ptm, pta, iAug, iMol, num_steps):
    state = pd.Series(dict(augie=iAug,moline=iMol),name="Number of Bikes")

    for i in range(num_steps):
        state = change_func(state, ptm, pta)

    return state

---

<br>

🟨 🟨

### Exercise 2

Remember back when we took data about the bikes leaving Moline and Augie, and at one point 2 bikes left at the same time, presumably because two friends arrived at the station?

<br>

Rewrite `bike_to_augie()` to capture this possibility.  Imagine that every time someone takes a bike from Moline to Augie, there is a 25% chance that 2 bikes are taken.  How can you capture that in the function?

In [None]:
# Here's the current version: add to it to create the new bike_to_augie
def bike_to_augie(state):
    if state.moline > 0:
        state.moline -= 1
        state.augie += 1
    return state

In [None]:
# Test function by running this cell
bikeshare = pd.Series(dict(augie=6,moline=6),name="Number of Bikes")
bike_to_augie(bikeshare)