# Lecture 15

#### The Truth About Variables and Objects;  Creating 2D Lists; GCD; Functions and Modular Programming

# 1. The Truth About Variables and Objects

Remember this?  What's going on here???

In [1]:
# EXAMPLE 1a: Assignment

# Compare the results of this bit of code....
x = "Old"
y = x
x = "New"
print(y)

# ...with this one... #
x = ["Old", "One"]    #
y = x                 #   BLOCK A
x = ["New", "Guy"]    #
print(y)              #

#...AND THIS ONE!!!!! #
x = ["Old", "One"]    #
y = x                 #   BLOCK B
x[0] = "New!!!!"      #
print(y)              #

Old
['Old', 'One']
['New!!!!', 'One']


Why are the last two snippets so different?  They both change `x`, but one change causes `y` to change, and the other doesn't. 

The short answer is:

* the line `y = x` (in both Block A and Block B) makes `y` and `x` refer to the same object in memory;

* the subsequent line in Block A, starting with `x = `, reassigns `x` to an *entirely new object*, while `y` still is bound to the old object;

* the subsequent line in Block B, starting with `x[0] = `, is a *mutation* which alters, but does not replace, the object which `x` and `y` both point to.  

But that probably doesn't make a lot of sense.  To make sense out of this, I need to go back to the beginning, and tell you the truth about variables, objects, and your computer's memory. 

(What follows is still not 100% accurate, but it does give a morally correct overview of how Python manages your computer's resources, and explains the above phenomenon.)

<br><br><br><br><br><br><br><br><br><br>

Let's say I ask you to multiply 5201 by 3157. You probably can't do that in your head: you need to write down those numbers, and write down the results of intermediate computations.  Maybe you write it down on the whiteboard, or on a piece of paper.

If I ask the computer to multiply 5201 by 3157, it needs to "write things down", too.  Its version of the whiteboard is the Random Access Memory, or RAM (we'll just call it the memory).  The two big differences between the computer and humans are that: the computer doesn't have eyes, so it uses numbers called *memory addresses* to keep track of exactly where it wrote everything down; and the computer writes down only 0's and 1's, using circuitry instead of ink (we already discussed this).  I'll use `0x` to indicate that I'm writing down a memory address in the examples I show you -- these addresses will be made-up numbers, just meant to assist the explanations, but your computer does use a numerical addressing system kind of like what I'm doing.

---------


For every value Python needs -- every literal value it encounters, and every evaluation of an expression -- it creates an **_object_**: a parcel of memory which contains a *value* and information about what *data type* that value represents, and which has an *address*.   

I've intimated that variables are attached to values.  It's maybe more accurate to say that variables are bound to **addresses**.  In fact, Python maintains a sort-of "table", which matches defined variables to the addresses of the  objects they are bound to.



<br><br><br><br><br><br><br><br><br><br>

Let's look at a simple program to illustrate.

In [5]:
# EXAMPLE 1b: A simple program

var = 5
x = var + 1
var = x
print(var)

print(hex(id(var)))

6
0x7ffa3142b3f0


Here's what Python does with this.

*On the first line:* As always, assignment starts on the right.  So, Python first creates an object for `5`. Then, `var` is bound to that object: it is added to the table of variables, matched with the address of the object.



![IMAGE NOT FOUND!!!!!!!!!!!!](frame1.jpg)

*On the second line:* An object for `1` is created.  Then, Python looks to address `0x1` to retrieve the value of `var`, and then the values of these two objects are added, to create a new object, with value `6`.  That new object is bound to the variable `x`.


![IMAGE NOT FOUND!!!!!!!!!!!!](frame2.jpg)


*On the third line:* Finally, the last assignment doesn't involve any new objects being created.  Python checks which object is attached to `x`, and then that object is bound to `var`.  So now, both `var` and `x` look to address `0x3` for their value.


![IMAGE NOT FOUND!!!!!!!!!!!!](frame3.jpg)

*On the last line:* Since `var` is attached to `0x3`, Python looks at address `0x3` to retrieve what it will be printing out.
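
If you'd like to check this walk-through yourself, here's a small sketch using `is` and `id()`.  One caveat: `id()` returning an actual memory address is a detail of the standard CPython interpreter, and the number you see will differ from run to run, so treat the printed address as illustrative only.

```python
# Re-run EXAMPLE 1b, then inspect the bindings.
var = 5
x = var + 1
var = x

# `is` asks: are these two names bound to the very same object?
print(var is x)            # True
print(id(var) == id(x))    # True

# id() returns an integer; hex() displays it like the 0x... addresses above.
# The exact value depends on your machine and on the run.
print(hex(id(var)))
```

Note that `==` compares *values*, while `is` compares *objects*.  Here both happen to be true, because the two names are bound to one object.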


<br><br><br><br><br><br><br><br><br><br>

So far, none of this is particularly illuminating.  But when you bring in mutable objects like lists, things get more interesting.

In [6]:
# EXAMPLE 1c: Lists

my_list = [8,9]
my_list[0] = 20

This code snippet plays out as follows:

*On the first line*: first, objects are created for the literals `8` and `9`. Then, a list object is created. **A list object contains the addresses of the values it contains.** And the list object is bound to `my_list`.


![IMAGE NOT FOUND!!!!!!!!!!!!](frame4.jpg)


*On the second line*: an object is created for the value `20`, and the address of that object is placed as the first entry of the list object at `0x13`. Let's emphasize this: **this line is a mutation, and so it doesn't change which object `my_list` is bound to -- it just alters that object**. 


![IMAGE NOT FOUND!!!!!!!!!!!!](frame5.jpg)
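
Here's a quick way to see this for yourself; it's just a sketch, and the names `address_before`, `address_after`, and `old_first` are ones I made up for the demonstration.  It checks that the list object stays the same while its first slot gets pointed at a different object.

```python
my_list = [8, 9]
address_before = id(my_list)   # identity of the list object
old_first = my_list[0]         # keep a name bound to the original first entry

my_list[0] = 20                # mutation: alters the list, doesn't replace it
address_after = id(my_list)

print(my_list)                          # [20, 9]
print(address_before == address_after)  # True: still the same list object
print(my_list[0] is old_first)          # False: slot 0 now holds a different object
```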


<br><br><br><br><br><br><br><br><br><br>

Finally, we can solve the riddle.

In [None]:
# EXAMPLE 1d: The weird code.

x = ["Old", "One"]  # Line 1a
y = x                # Line 2a
x = ["New", "Guy"]  # Line 3a

x = ["Old", "One"]  # Line 1b
y = x                # Line 2b
x[0] = "New!!!!"     # Line 3b

Line 1a/b: Creates objects for `"Old"`, `"One"`, and the list; puts the addresses of the strings into the list; and binds `x` to the list object.


Line 2a/b: `x` is bound to the object at `0x23`, so this object now gets bound to `y` as well.


![IMAGE NOT FOUND!!!!!!!!!!!!](frame6.jpg)


*Now, here's the difference*

Line 3a: Creates objects for `"New"`, `"Guy"`, and a new list object to hold them; puts the addresses of the strings into the list; and binds `x` to the new list object.  **The old object at `0x23` isn't touched, and `y` is still bound to it.**


![IMAGE NOT FOUND!!!!!!!!!!!!](frame7.jpg)

Line 3b: Modifies the contents of the list object at `0x23`.  Note that neither of the variable bindings change -- they both still point to `0x23`, which has now been changed!


![IMAGE NOT FOUND!!!!!!!!!!!!](frame8.jpg)
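
The difference between Line 3a and Line 3b can also be checked directly with `is`.  This is a sketch of the same two blocks, with a few extra variables (`same_after_rebinding` and friends) that I've added only to record what happened.

```python
# Block A: rebinding x
x = ["Old", "One"]
y = x
x = ["New", "Guy"]              # x now names a brand-new list
same_after_rebinding = x is y
y_after_rebinding = y
print(same_after_rebinding)     # False
print(y_after_rebinding)        # ['Old', 'One']

# Block B: mutating the shared list
x = ["Old", "One"]
y = x
x[0] = "New!!!!"                # alters the one list both names refer to
same_after_mutation = x is y
print(same_after_mutation)      # True
print(y)                        # ['New!!!!', 'One']
```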



<br><br><br><br><br><br><br><br><br><br>


In [None]:
# EXAMPLE 1e: What happens in this case?

x = [1,2,3]
y = x
x[0] = 4
x = [5,6,7]
x[0] = 8

# What will y be? Print it when you know.
# Also: you can run code at http://pythontutor.com/visualize.html 
# which will give nice visualizations of what we've described.

Why does Python do this?  In short, because copying addresses is faster than copying large lists. 

Also, you should know that these mechanics are similar to how *pointers* behave in C/C++.
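
If you actually *want* a separate list rather than a second name for the same one, you have to ask for a copy explicitly.  Here's a minimal sketch; the variable names are mine, and `.copy()` is only one of a few ways to do it (`list(original)` and `original[:]` behave the same way here).

```python
original = [1, 2, 3]
alias = original              # a second name for the SAME list object
duplicate = original.copy()   # a NEW list object holding the same entries

original[0] = 99

print(alias)                  # [99, 2, 3]  (changed along with original)
print(duplicate)              # [1, 2, 3]   (unaffected)
print(alias is original)      # True
print(duplicate is original)  # False
```

Be aware that this kind of copy is "shallow": the new list holds the same addresses as the old one, so if the entries are themselves lists, those inner lists are still shared.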



<br><br><br><br><br><br><br><br><br><br>

# 2. Creating 2D Lists

The above is relevant to one more topic regarding 2D lists:  creating them from scratch, rather than manipulating 2D lists that are handed to you.

For this, nested loops and `.append()` are a reasonable way to go.  Let's make code which creates a 4 by 8 list (that means 4 "rows", where each "row" has 8 entries), with each entry initialized to 0.  That sounds easy, right?

In [9]:
# EXAMPLE 2a: Create a 2D List, initialized with 0's.
# Why doesn't this work?

my_table = []

for i in range(4):
    for j in range(8):
        my_table[i][j] = 0
        
print(my_table)

IndexError: list index out of range

In [10]:
# EXAMPLE 2b: Create a 2D List, initialized with 0's.
# What about this?

my_table = []

for i in range(4):
    for j in range(8):
        my_table.append(0)
        
print(my_table)

[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]


Maybe it's not so easy!  


<br><br><br><br><br><br><br><br><br><br>


Here's a strategy that works: we will start `my_table` as empty.  For each row, we'll create an empty list, to which we'll append the requisite number of 0's.  Then, we will append this row to `my_table`.

In [15]:
# EXAMPLE 2c: Create a 2D List, initialized with 0's.
# This works.

my_table = []

for i in range(4):
    row = []
    for j in range(8):
        row.append(0)
    
    #After you've created an entire row of 0's, add this row to my_table
    my_table.append(row)
    
    
print(my_table)

my_table[0][0] = 212
print(my_table)

[[0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0]]
[[212, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0]]



<br><br><br><br><br><br><br><br><br><br>

Hey, how about this?  

`zeros = [0]*8`

`my_table = [zeros]*4`



`[0]*8` creates a list with eight 0's, and `[zeros]*4` creates a list with four entries, each of which is that same list.  At first glance, this works.

But beware! Such a 2D list is very very brittle.

In [14]:
# EXAMPLE 2d: Create a 2D List, initialized with 0's. Be EXTREMELY careful with this shortcut.

zeros = [0]*8
my_table = [zeros]*4
print(my_table)

# Watch what happens when I try to change one element!!!!!!!
my_table[0][0] = 21361366
print(my_table)

[[0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0]]
[[21361366, 0, 0, 0, 0, 0, 0, 0], [21361366, 0, 0, 0, 0, 0, 0, 0], [21361366, 0, 0, 0, 0, 0, 0, 0], [21361366, 0, 0, 0, 0, 0, 0, 0]]


What happened?????  The answer is that `zeros` is a single list object, sitting at some memory address.  When you write `my_table = [zeros]*4`, Python represents `my_table` in memory as a list with 4 addresses -- all of which are the address of `zeros`.  When I try to change `my_table[0][0]`, Python goes to `zeros` and changes an entry -- but all four entries (rows) of `my_table` still have the address of `zeros`!  So *all* the rows appear to change.

So, avoid using list multiplication to initialize 2D lists.  (Or to initialize any list which contains mutable entries.)
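
To make the sharing visible, here's a short sketch comparing the two approaches with `is`.  The names `shared_table` and `safe_table` are mine.  Notice that `[0] * 8` on its own is fine for building a single row, since its entries are immutable integers; the trouble only starts when the thing being repeated is itself a list.

```python
# The brittle shortcut: every row is the same list object.
zeros = [0] * 8
shared_table = [zeros] * 4
print(shared_table[0] is shared_table[1])   # True

# A safer version: build a fresh row on each pass through the loop.
safe_table = []
for i in range(4):
    safe_table.append([0] * 8)
print(safe_table[0] is safe_table[1])       # False

# Changing one entry now affects only one row.
safe_table[0][0] = 212
print(safe_table[0][0], safe_table[1][0])   # 212 0
```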


<br><br><br><br><br><br><br><br><br><br>

# 3. GCD

Before we move on from loops, let's look at a well-known problem: finding the greatest common divisor of two numbers.

First, what is the most basic way to perform this?

In [None]:
Get the two numbers x and y
Set GCD = 1
For all numbers up to the smaller one:
    if number goes into both x and y
        GCD = number

In [None]:
# EXAMPLE 3a: Basic GCD

x = int(input("Enter the first number: "))
y = int(input("Enter the second number: "))
lesser = min(x,y)

current_gcd = 1


for divisor in range(2, lesser + 1):
    if x % divisor == 0 and y % divisor == 0:
        current_gcd = divisor
        
print(current_gcd)



<br><br><br><br><br><br><br><br><br><br>

Here's a much faster way, called Euclid's algorithm.  Starting with two integers `x` and `y`, observe that the GCD of `x` and `y` is the same as the GCD of `y` and `x%y`. (For example, any number that goes into 100 and 108 would have to go into 8 as well.)

So, we can replace the GCD(`x`,`y`) problem with the problem of finding GCD(`y`, `x%y`).  Furthermore, since `x%y` is always less than `y`, the new problem will be "smaller."  Do this enough times and you'll be done.

In [None]:
Get the two numbers x and y
While y != 0:
    Replace x and y with y and x%y, respectively
Print x


<br><br><br><br><br><br><br><br><br><br>

In [None]:
# EXAMPLE 3b: Euclid's algorithm

x = int(input("Enter the first number: "))
y = int(input("Enter the second number: "))

while y != 0:
    # After the first pass, y is always the smaller number.
    # At each step, new x will be the old y, and new y will be the old x % old y.  A temp variable helps for the 
    # transition.
    # Once y goes into x, the next pass sets y to 0 and the loop stops, with the GCD stored in x.
    temp = y
    y = x%y
    x = temp
    
print(x)
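
To watch the algorithm shrink the problem, here's a sketch that wraps the same loop in a function and records each `(x, y)` pair along the way.  The function name `euclid_trace` is just something I made up for this illustration.

```python
def euclid_trace(x, y):
    """Return the list of (x, y) pairs visited by Euclid's algorithm."""
    pairs = [(x, y)]
    while y != 0:
        temp = y
        y = x % y
        x = temp
        pairs.append((x, y))
    return pairs

print(euclid_trace(108, 100))
# [(108, 100), (100, 8), (8, 4), (4, 0)]

print(euclid_trace(100, 108))
# [(100, 108), (108, 100), (100, 8), (8, 4), (4, 0)]
```

The GCD is the first number in the final pair (here, 4).  The second call shows what happens when the smaller number comes first: the first pass simply swaps the two.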

Notice how much faster the Euclid algorithm is than the "trial division" method proposed above.  If $n$ represents the size of the smaller input, trial division is known as an $O(n^{1})$ algorithm.  The "$O$" stands for *order* -- because we don't care about the exact number of steps involved, but we would like to know roughly how many steps will take place as a function of the size of the input.  And the "$n^{1}$" here means that the number of steps is roughly a *linear* function of the input.  After all, if your lesser number is $10468$, then you're going to have to do roughly $2 \cdot 10468$ mod operations to find the GCD. If your lesser number is $n$, you're going to have to do roughly $2n$ operations.

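On the other hand, the Euclid algorithm can be shown to run in $O(\log n)$ time when the larger number is $n$.  This means that the number of steps the algorithm needs is no more than (a constant multiple of) $\log (n)$.  If you want to be really precise: when `x` is at least as large as `y`, the number of steps is guaranteed to be at most 5 times the number of decimal digits of the smaller number, which is roughly $5\log_{10}$ of that number, where a "step" means a pass through the loop.  (If `x` starts out smaller than `y`, the first pass just swaps them, adding one step.)  This is far from obvious!  It's a result known as Lamé's theorem.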
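
You can get a feel for the difference by counting loop passes.  The sketch below wraps both methods in functions that return the GCD together with the number of passes; the function names and the sample inputs are my own choices, and `math.gcd` from the standard library is used only as a cross-check.

```python
import math

def trial_division_passes(x, y):
    """Return (gcd, loop passes) for the trial-division method."""
    current_gcd = 1
    passes = 0
    for divisor in range(2, min(x, y) + 1):
        passes += 1
        if x % divisor == 0 and y % divisor == 0:
            current_gcd = divisor
    return current_gcd, passes

def euclid_passes(x, y):
    """Return (gcd, loop passes) for Euclid's algorithm."""
    passes = 0
    while y != 0:
        temp = y
        y = x % y
        x = temp
        passes += 1
    return x, passes

a, b = 987654, 10468
slow_gcd, slow_passes = trial_division_passes(a, b)
fast_gcd, fast_passes = euclid_passes(a, b)

print(slow_gcd == fast_gcd == math.gcd(a, b))   # True
print(slow_passes)                              # 10467
print(fast_passes)                              # far fewer; try it and see
```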


<br><br><br><br><br><br><br><br><br><br>

# 4. Functions and Modular Programming

We've discussed writing functions before.  I'd like to return to that now, because our programs have gotten fairly complex, and it will become more and more of a good idea to decompose our programs into "modules": independent subprograms that can be **written**, **understood**, **tested**, and **debugged** independently.  They also help you **minimize repetition** of code. 

Two examples: 


In [None]:
# EXAMPLE 4a: Happy birthday, without functions.

birthday_boy = input("Whose birthday is it? ")
print("Happy birthday to you.")
print("Happy birthday to you.")
print("Happy birthday dear " + birthday_boy + ".")
print("Happy birthday to you.")

birthday_girl = input("Is it anyone else's birthday? ")
print("Happy birthday to you.")
print("Happy birthday to you.")
print("Happy birthday dear " + birthday_girl + ".")
print("Happy birthday to you.")


What if you want to change the song, because the copyright holders want to sue you for singing it without their permission?  Then you have to go through and change the words *twice*. Here's an alternative.

In [None]:
# EXAMPLE 4b: Happy birthday, with functions.

# Here is a Python FUNCTION.
# The top line is the SIGNATURE --
# it contains the keyword def, the NAME of the function, and the FORMAL PARAMETER list.
def bday_song(name):
    """Sing the birthday song."""
    
    print("Happy birthday to you.")
    print("Happy birthday to you.")
    print("Happy birthday dear " + name + ".")
    print("Happy birthday to you.")

##################################################
# Now, we use the function:

birthday_boy = input("Whose birthday is it? ")
bday_song(birthday_boy)

birthday_girl = input("Is it anyone else's birthday? ")
bday_song(birthday_girl)

# What will happen on this line?
# print(bday_song("Bob"))

So, this example demonstrates code re-use.  

Notice: this function has no return statement.  Any function *call* -- when you use a function -- will execute the function code until either a return statement is encountered, or until the end of the function body is reached.  If the end of the body is reached before a return statement is encountered, then the special value `None` is automatically returned.  

Also, this function does something that most of the functions we've done so far don't: it **prints**.  That's ok here, because the whole point of this function is to make printing something less tedious.  It's also a decent idea to have a function print if you are trying to debug, and want to follow how the function executes.  Generally, though, outputs of a function should be *returned*, so that the outputs can be printed, stored to a variable, or used in further computations.  Most functions should *not* print anything.
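
Here's a small sketch contrasting the two styles.  Both functions are hypothetical ones I've written for this comparison: `bday_lines` *returns* the song as a list of strings, while `print_only` prints and has no return statement.

```python
def bday_lines(name):
    """Return the song as a list of lines instead of printing it."""
    return [
        "Happy birthday to you.",
        "Happy birthday to you.",
        "Happy birthday dear " + name + ".",
        "Happy birthday to you.",
    ]

def print_only(name):
    """Print one line of the song; there is no return statement."""
    print("Happy birthday dear " + name + ".")

# A returned value can be stored, inspected, and reused.
lines = bday_lines("Bob")
print(len(lines))        # 4
print(lines[2])          # Happy birthday dear Bob.

# A function with no return statement hands back None.
result = print_only("Bob")
print(result is None)    # True
```

Because `bday_lines` returns its output, the caller gets to decide what to do with it: print it, count it, write it to a file, and so on.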


<br><br><br><br><br><br><br><br><br><br>

Here's an example of using functions to break a problem down.  First, a non-modular program: remember thing explaining? We want to check if each word in the `text_list` is in the list of 1000 common words.

In [None]:
# EXAMPLE 4c: Ten Hundred Most Common Words

my_dictionary = open("smallwords.txt", "r")
dic_list = my_dictionary.read().split()
my_dictionary.close()

# The text!
text_list = """you have a bad problem and you will not go to space today""".split()

all_good = True
for text_word in text_list:
    found_yet = False 
    for dic_word in dic_list:
        if text_word == dic_word:
            found_yet = True
        break
            
    if found_yet:
        all_good = False
        print("\"" + text_word + "\" is not in the dictionary!!")
                
if all_good:
    print("Every word was in the dictionary.")

It has a bug (perhaps an obvious one).  Let's compare this to a modular solution to the same problem.  The idea here is to separate the code that checks whether an individual word is in the dictionary from the code that keeps track of all the words in the text.


<br><br><br><br><br><br><br><br><br><br>

In [None]:
# EXAMPLE 4d: Ten Hundred Most Common Words
# A modular solution, with a function, and still a bug. 

import random

def is_word_in_dic(word, dict_list):
    """Check if the given word is in dict_list."""
    found_yet = False 
    for dic_word in dic_list:
        if text_word == dic_word:
            found_yet = True
        break
    return found_yet

###################################################################
# Here's the TEST code: what do you expect to have happen?  What actually does?
###################################################################

my_dictionary = open("smallwords.txt", "r")
dic_list = my_dictionary.read().split()
my_dictionary.close()

print(is_word_in_dic("here", dic_list))
print(is_word_in_dic("juxtaposition", dic_list))
print(is_word_in_dic("trip", dic_list))

###################################################################
# Now, here's the code that goes through the text.
###################################################################
text_list = """you have a bad problem and you will not go to space today""".split()

all_good = True
for text_word in text_list:
    found_yet = is_word_in_dic(text_word, dic_list) # USING THE FUNCTION!
            
    if found_yet:
        all_good = False
        print("\"" + text_word + "\" is not in the dictionary!!")
                
if all_good:
    print("Every word was in the dictionary.")

Now, we can clearly identify the part of the program that contains the error -- since the tests are clearly failing, the problem is in the function, not the code that calls it.  (Or at least, the FIRST problem is in the function.)
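
Once you've had a go at finding the problems yourself, here's one possible repair to compare against.  It's a sketch, not the only fix: the function uses only its own parameters, returns as soon as a match is found, and the calling code complains when a word is *not* found.  Since `smallwords.txt` isn't reproduced here, I'm assuming a tiny stand-in list called `sample_dic`.

```python
def is_word_in_dic(word, dic_list):
    """Return True if word appears in dic_list, and False otherwise."""
    for dic_word in dic_list:
        if word == dic_word:
            return True    # found it: no need to look any further
    return False           # checked every entry without finding a match

# Stand-in for the contents of smallwords.txt (an assumption for this demo).
sample_dic = ["you", "have", "a", "bad", "problem", "here"]

# TEST code for the function on its own.
print(is_word_in_dic("here", sample_dic))            # True
print(is_word_in_dic("juxtaposition", sample_dic))   # False

# Code that goes through a text and collects the words that are missing.
text_list = "you have a bad trip".split()
missing = []
for text_word in text_list:
    if not is_word_in_dic(text_word, sample_dic):
        missing.append(text_word)

print(missing)   # ['trip']
```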


<br><br><br><br><br><br><br><br><br><br>

What if you wanted to turn the Scrooge program from the homework into a modular program?  What might that look like?

In [None]:
# EXAMPLE 4e: Scrooge

import random


# Write a function that simulates a game, and returns the total amount won
def game_sim():
    """Return the amount won in one simulated game"""
    coin_sum = 0
    pick_list = [1, 5, 10, 25]
    for pick in range(10):
        index_choice = random.randrange(0,4) # OR:
        coin_pick = pick_list[index_choice]  # coin_pick = random.choice(pick_list)
                
        coin_sum += coin_pick
        
    if coin_sum >= 100:
        return coin_sum/100
    else:
        return -1
########################################################

# Now, 100,000 simulations.    
NUM_GAMES = 100000
games_won = 0
money_won = 0

for game in range(NUM_GAMES):
    #
    # WHAT CAN WE PUT HERE?
    #
    pass  # placeholder so the cell runs; replace it with your own code
        
print(games_won/NUM_GAMES)
print(money_won/NUM_GAMES)
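
Here's one way the loop might be filled in; try it yourself first.  This is only a sketch under two assumptions of mine: a game counts as "won" when `game_sim()` returns a positive amount, and `money_won` is a running *net* total, so the `-1` from a lost game is subtracted.  The helper `run_games` is a name I've introduced so the simulation can be re-run with different numbers of games.

```python
import random

def game_sim():
    """Return the amount won in one simulated game."""
    coin_sum = 0
    pick_list = [1, 5, 10, 25]
    for pick in range(10):
        coin_sum += random.choice(pick_list)

    if coin_sum >= 100:
        return coin_sum / 100
    else:
        return -1

def run_games(num_games):
    """Return (fraction of games won, average net winnings per game)."""
    games_won = 0
    money_won = 0
    for game in range(num_games):
        result = game_sim()
        if result > 0:          # assumption: a positive payout means a win
            games_won += 1
        money_won += result     # assumption: losses count as -1
    return games_won / num_games, money_won / num_games

NUM_GAMES = 100000
win_rate, average_winnings = run_games(NUM_GAMES)
print(win_rate)
print(average_winnings)
```

The results are random, so the printed numbers will vary slightly from run to run.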

