# Objects, Assignments, Passing arguments

In this lab, we will go over some of the basics of built-in data types, assignments, and passing arguments to functions. These programming details help in writing and reading code, in figuring out exactly what is happening, and in avoiding unexpected side effects. A lot of this development is from "Learning Python" by M. Lutz (O'Reilly Publishers), which is a good book for programming reference in python.

In python, you do not have to declare variables before using them. This is not because python doesn't care about data-types, rather it is because the assignment statement means something else altogether in python.

Consider for example:

In [3]:
a = 2

In the above, we did not declare a. The statement above does two things: 

(i) create an object of type "integer" (with appropriate storage) of value 2, and 

(ii) create the name "a" if it doesn't exist yet and assign the name "a" to the object created in step (i).

In particular, the object of type integer is the storage with value 2. It is not "a"---the variable or name "a" does not have type information or constraints associated with it. When you use "a" in code, it is replaced with the object it currently refers to. Since the name "a" is only created when you assign it an object, a variable or name cannot be used before it is assigned to something.
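
As a small sketch of the two points above: the type belongs to the object rather than to the name, and a name that was never assigned cannot be used. The name `fresh_name` is a hypothetical one chosen only for this illustration.

```python
a = 2

# The type belongs to the object that the name refers to, not to the name.
type_of_a = type(a)
print(type_of_a)              # <class 'int'>

# Using a name that was never assigned raises NameError.
# `fresh_name` is a hypothetical name used only for this illustration.
try:
    print(fresh_name)
    used_before_assignment = False
except NameError:
    used_before_assignment = True

print(used_before_assignment)  # True
```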

On the other hand, since names do not have type, you can reuse the name "a" for anything else subsequently. For example, we assign the name "a" to the tuple object below.

In [4]:
a = (1,2,3)

# a is now the name for the tuple object

a = "I am now a string"

# a is now a string object

As mentioned before, the name/variable "a" is just that: a name. It is just an entry maintained in the system table by python, quite distinct from the object it is assigned to. 

After the above tuple assignment, what happens to the integer object 2? Well, at this point, no name refers to it. When an object does not have any variable referencing it, it is "garbage-collected" internally by python, which means that memory is freed up for use. In python, you do not have to explicitly remove references, rather objects with no more references remaining are automatically destroyed by python. There are subtleties in exactly when they are garbage collected, but for the most part, we do not need to be concerned about that. 

In the same way, the tuple object (1,2,3) is garbage-collected when a is assigned as the name to the string object "I am now a string".
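
One way to watch an object disappear once its last name is reassigned is a weak reference, which observes an object without keeping it alive. This is only a sketch: `Payload` is a hypothetical placeholder class (plain integers and tuples cannot be weakly referenced), and the exact timing of collection depends on the interpreter.

```python
import gc
import weakref


class Payload:
    """Hypothetical placeholder class used only for this illustration."""


obj = Payload()
watcher = weakref.ref(obj)        # a weak reference does not keep the object alive
alive_before = watcher() is not None

obj = "I am now a string"         # the Payload object has no name referring to it now
gc.collect()                      # usually unnecessary in CPython; included for other interpreters
alive_after = watcher() is not None

print(alive_before, alive_after)
```

In CPython we would expect this to print `True False`: the object exists while `obj` refers to it and is gone after the name is reassigned.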

The integer, tuple and string objects in the assignments above are so-called "immutable" objects, meaning that these objects cannot be changed in place. The integer object with value 2 is only that: it is not possible to change its value in place to 3. If we need an integer object with value 3, python creates a separate object for that. The same goes for the other immutable objects: tuples and strings. This has a few important consequences down the line, and is quite different from how lists and arrays behave. 
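
To see what "cannot be changed in place" means for a string, here is a minimal sketch. The variable names are chosen only for this illustration.

```python
s = "spam"

try:
    s[0] = "S"                # strings do not support item assignment
    changed_in_place = True
except TypeError:
    changed_in_place = False

t = s.upper()                 # string methods return a new string object

print(changed_in_place, s, t)  # False spam SPAM
```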

Before we go deeper into the consequences, consider what happens in an assignment statement like b=a.

In [5]:
# create an integer object and assign the name a to it:
a = 2 

# Now consider:

b =a

In the above statement, b=a, here is what happens. The variable a is already in the system table, and it is replaced by the object it refers to (the integer object with value 2). Therefore, when you write the assignment statement b = a, it is as if we assign b to be the name of the same integer object with value 2 that a refers to. Now, both a and b refer to the same object. 

If we now write:


In [6]:
a = 3

The fact that integers are immutable objects kicks in. python cannot change the value of the integer object with value 2. Instead, in the above statement, it creates a new integer object with value 3, and assigns to it the name "a". 

The old object with value 2 still has a name referencing it (b) so it isn't garbage-collected yet. Indeed:


In [7]:
print(b)

2


When you use statements like a = a + 2, once again, python creates a new integer object (with value 5) and gives it the name a. 

To summarize, from the book "Learning python": "Unlike in some languages, in Python variables are always pointers to objects, not labels of changeable memory areas. Setting a variable pointing to an immutable object to a new (immutable) object does not alter the original object, but rather causes the variable to reference an entirely new object". 
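
The whole sequence above can be replayed in one place, using the `is` operator to ask whether two names refer to the same object. This is a sketch of the same steps, not new behavior.

```python
a = 2
b = a
same_before = a is b      # True: both names refer to one integer object

a = 3                     # a now names a different object; b is untouched
a = a + 2                 # builds yet another integer object and rebinds a
same_after = a is b       # False: a and b refer to different objects

print(same_before, same_after, a, b)   # True False 5 2
```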

Let us do a similar example, but using tuples which are also immutable.

In [8]:
a = (1,2,3)
b = a
print(b)

(1, 2, 3)


In [9]:
a = (-1,2,3)

print('a is now: ', a, 'and b is: ',b)

a is now:  (-1, 2, 3) and b is:  (1, 2, 3)


Not all objects in python are immutable. Unlike integers, tuples, or strings, objects like lists and arrays are mutable, meaning they can be changed in place. Let us consider what happens when we look at these. Note how in the list below, two of the elements are numbers while the third is a string. Lists need not be homogeneous; they are quite happy to contain objects of various types.

In [10]:
A = [ 1, 2, 'three']
B = A

You can infer what happens in the statements above. First python creates an object of type "list" ([1,2,'three']) and gives it the name "A" (note that case matters, the name A is different from a). Then in the statement B = A, A is replaced by the object it refers to---with the consequence that B now points to the same object that was created in line 1 above. 

Lists are mutable, meaning you can change the elements of the list in place. Let us change the first entry of the list to -1.

In [11]:
A[0] = -1

Can you figure out what will happen when we print(B) now?

In [12]:
print(B)

[-1, 2, 'three']


What happened was that the statement A[0]=-1 just changed the list object in place, meaning the first entry of the list just got changed to -1. B is pointing to the same list, so print(B) printed [-1, 2, 'three']. 

Unlike in the tuple example, A[0]=-1 did not create a new list object but changed the existing list object in place. Also note that in the tuple example, we did not say  "a[0] = -1" (you would get an error since tuples are immutable), rather we said "a = (-1,2,3)". 
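
The contrast between the two cases can be checked side by side. In this sketch the names `t`, `lst` and `alias` are chosen only for illustration.

```python
t = (1, 2, 3)
try:
    t[0] = -1                 # tuples do not support item assignment
    tuple_changed = True
except TypeError:
    tuple_changed = False

lst = [1, 2, 3]
alias = lst                   # a second name for the same list object
lst[0] = -1                   # lists can be changed in place

print(tuple_changed, t)       # False (1, 2, 3)
print(alias)                  # [-1, 2, 3]
```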

This now raises a question: how do we create a copy of a list? Consider: 

In [13]:
A = [1, 2, 3]
B = A[:]

A[0]=-1

print('A is now', A, 'and B is: ',B)

A is now [-1, 2, 3] and B is:  [1, 2, 3]


What happened above is that the slice A[:] builds a new list containing the elements of A, rather than referring to the list object that A points to. The statement B = A[:] therefore says: create a new list object with entries [1,2,3] and give it the name B. Now B is a different list with the same entries as A, so changing A does not change B. Another way to copy objects is via the standard library module copy.


In [14]:
import copy as c

A = [1,2,3]
C = c.copy(A)
A[0]=-1

print('A is now', A, 'and C is', C)

A is now [-1, 2, 3] and C is [1, 2, 3]
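
One caveat worth keeping in mind: both the slice and `copy.copy` make a shallow copy, so only the outer list is new. The sketch below assumes a nested list, which the examples above do not use, and the names `nested`, `shallow` and `deep` are chosen only for illustration. `copy.deepcopy` copies the inner lists as well.

```python
import copy

nested = [[1, 2], [3, 4]]
shallow = nested[:]              # new outer list, same inner lists
deep = copy.deepcopy(nested)     # new outer list and new inner lists

nested[0][0] = -1                # change an inner list in place

print(shallow)   # [[-1, 2], [3, 4]]  the inner lists are shared
print(deep)      # [[1, 2], [3, 4]]   fully independent
```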


At this point, B is [1,2,3] and so is C, but they are different objects that contain the same content. python provides two operators to tell these situations apart: 

"B == C" stands for: do the objects that B and C refer to have the same value (here, yes)

"B is C" stands for: do B and C refer to the same object (here, no).

Let us check them out.

In [15]:
print('Are the values of the objects referred to by B and C the same?', B == C)

Are the values of the objects referred to by B and C the same? True


In [16]:
print('Do B and C refer to the same object?', B is C)

Do B and C refer to the same object? False


In [17]:
D = B 

print('Do B and D refer to the same object?', B is D)

Do B and D refer to the same object? True
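
The three checks can be gathered into one self-contained sketch. It also uses the built-in `id()`, which returns an identifier that is unique among objects alive at the same time; comparing these identifiers gives the same answer as `is` here.

```python
B = [1, 2, 3]
C = [1, 2, 3]     # equal content, separate object
D = B             # second name for the same object

print(B == C, B is C, id(B) == id(C))   # True False False
print(B == D, B is D, id(B) == id(D))   # True True True
```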


While python has several other datatypes (dictionaries, sets, etc.), we will mention one more for now: the numpy array, which we will use extensively for matrices. A two-dimensional numpy array can be built from a list of lists, but it is a distinct type with extra features over the base data types (attributes such as shape and T for the transpose, and many methods that are very useful for matrix operations).

In [18]:
import numpy as np

A = np.array([[1,2,3],[3,4,5]])
print(A)

print('A has shape: ',A.shape)
print('The transpose of A is \n',A.T)

[[1 2 3]
 [3 4 5]]
A has shape:  (2, 3)
The transpose of A is 
 [[1 3]
 [2 4]
 [3 5]]


The numpy array, like the base list object, is also mutable. In the example below, A and B refer to the same object, so the object can be changed with either name.

In [40]:
A = np.array([[1,2,3],[4,5,6]])
print(A)
B = A
B[0,1]=-1

print(A)

A[0,0]=-1

print(B)

[[1 2 3]
 [4 5 6]]
[[ 1 -1  3]
 [ 4  5  6]]
[[-1 -1  3]
 [ 4  5  6]]


There are some quirks. The numpy array is not exactly the list object. A list need not be homogeneous, so you could have a list containing one number, one string and one dictionary, for example. numpy arrays are homogeneous: all elements of the array have the same type (the array's dtype). 
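
Homogeneity has visible consequences, sketched below with illustrative names: mixed numeric input is converted to a common type, and a value assigned into an integer array is converted to an integer.

```python
import numpy as np

arr = np.array([1, 2.5])
print(arr.dtype)              # float64: the integer 1 is stored as 1.0

ints = np.array([1, 2, 3])
ints[0] = 2.7                 # the array keeps its integer dtype
print(ints)                   # [2 2 3]: the fractional part is dropped
```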

The numpy array A above is best thought of as containing 

(i) the pointer to its data (the data is best accessed through A.view(), as shown below, while A.data is a low-level buffer object pointing to the start of the memory block containing the data)

(ii) the shape information accessed by A.shape 

Think of the data as a single block of memory, with the shape information organizing it into the matrix form we see. So if the matrix has two rows, the first being [1,2,3] and the second being [4,5,6], the data is stored contiguously as [1,2,3,4,5,6]. The shape (2,3) says to organize it into 2 rows, each with 3 entries, so the data is read row by row. 
Changing the shape of a matrix is therefore as simple as setting A.shape to a new tuple whose entries multiply to the same number of elements. 
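
The picture of one flat block plus a shape can be checked with a short sketch. It uses `ravel` to look at the entries in storage order and `reshape` (an alternative to assigning to `A.shape`, which returns a new array object over the same data); the names `flat` and `regrouped` are chosen only for illustration.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])

flat = A.ravel()                  # the six entries, row by row
print(flat)                       # [1 2 3 4 5 6]

regrouped = A.reshape(3, 2)       # same six values, organized as 3 rows of 2
print(regrouped)
print(np.shares_memory(A, regrouped))   # True: no data was copied
```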

### Views of a numpy array

The data is stored in memory, and the numpy array has a pointer to it. You can get access to the data stored in A using A.view(). Think of A.view() as a copy of the pointer to the data, as well as a copy of the shape---but since A.view() copies the pointer to the data, it shares the data with A. 

So in the snippet below, A.view() returns a numpy array that shares data with A (meaning changing the data through A.view() or through A affects the other), but _not_ the shape information (meaning that the shape of the object A.view() can be changed independently of the shape of the object A). In the code below, therefore, C starts out with its own copy of A's shape information. Two consequences follow:

(i) if you change data in C, you change the data of A too (or vice versa) since they actually share the same pointer to the data. 

(ii) changing the shape of A or C does not affect the other.

In [43]:
C = A.view()
print('Is C a numpy array? To check, use isinstance(C,np.ndarray): ', isinstance(C,np.ndarray))
print('C is a distinct object from A, since C is A returns', C is A)
print('C= \n', C, '\nwhile A= \n',A)
# C shares the same data with A
C[0,0]=1
print('Changing C has changed A too\n',A)
A[0,0]=2
print('Changing A has changed C too\n',C)

# but not the shape information that organizes the data into the matrix. If we change
# the shape of C, the shape of A stays as it was.
C.shape = (1,6)
print('Changing the shape of C to (1,6) yields', C)
print('but this left A unchanged.\n',A)

Is C a numpy array? To check, use isinstance(C,np.ndarray):  True
C is a distinct object from A, since C is A returns False
C= 
 [[ 2 -1  3]
 [ 4  5  6]] 
while A= 
 [[ 2 -1  3]
 [ 4  5  6]]
Changing C has changed A too
 [[ 1 -1  3]
 [ 4  5  6]]
Changing A has changed C too
 [[ 2 -1  3]
 [ 4  5  6]]
Changing the shape of C to (1,6) yields [[ 2 -1  3  4  5  6]]
but this left A unchanged.
 [[ 2 -1  3]
 [ 4  5  6]]
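
If we are unsure whether two arrays share data, numpy can tell us. The sketch below contrasts a view with a true copy using `np.shares_memory` and the `base` attribute; the names `V` and `K` are chosen only for illustration.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])
V = A.view()      # new array object, same data
K = A.copy()      # new array object, new data

print(V.base is A, np.shares_memory(A, V))     # True True
print(K.base is None, np.shares_memory(A, K))  # True False

A[0, 0] = 100
print(V[0, 0], K[0, 0])                        # 100 1
```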


In [46]:
A = np.array([[1,2,3],[4,5,6]])
print(A)
B = A
A = A.T

print('A.T returns a view of A. So if we ask A is B after A=A.T:', A is B)

# Since the data is shared between A and B, changing one 
# leads to change in another, even though they are of different shapes.

A[1,1]=-1
print('If we change data in A\n',A)
print('B also changes\n',B)

B[0,2]=-1
print('If we change in B\n',B)
print('A also changes\n',A)

print('But notice that the shapes of A and B are no longer the same')

[[1 2 3]
 [4 5 6]]
A.T returns a view of A. So if we ask A is B after A=A.T: False
If we change data in A
 [[ 1  4]
 [ 2 -1]
 [ 3  6]]
B also changes
 [[ 1  2  3]
 [ 4 -1  6]]
If we change in B
 [[ 1  2 -1]
 [ 4 -1  6]]
A also changes
 [[ 1  4]
 [ 2 -1]
 [-1  6]]
But notice that the shapes of A and B are no longer the same
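
When the sharing shown above is not wanted, one option is to copy the transpose explicitly. A minimal sketch, with `T_view` and `T_copy` as illustrative names:

```python
import numpy as np

B = np.array([[1, 2, 3], [4, 5, 6]])
T_view = B.T            # shares data with B
T_copy = B.T.copy()     # owns its own data

B[0, 2] = -1            # entry (0, 2) of B is entry (2, 0) of the transpose

print(T_view[2, 0], T_copy[2, 0])   # -1 3
```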


## Passing arguments to functions

Suppose you have a function as below. What happens here is the following: 

* When the function is called, python sets arg = a. Now arg points to the object 2. a points to the object 2 as well.
* The statement arg = 0 reassigns arg to the object 0, since integers are immutable. a continues to point to the object 2.

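Therefore, print(a) gives 2. In effect, the function does nothing visible to the caller: rebinding the local name arg never changes which object a refers to, and an immutable object such as an integer offers no way to change it in place. 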

In [56]:
def __zero(arg):
    arg = 0
    
a = 2
__zero(a)

print(a)


2


The analogous function behaves differently when you pass a numpy array and modify it in place, since arrays are mutable.

In [55]:
def __zero(matrix):
    matrix.fill(0)
    
A = np.array([[1,4],[2,-1],[-1,6]])
__zero(A)

print(A)

[[0 0]
 [0 0]
 [0 0]]
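
What matters is not only that the argument is mutable, but whether the function changes the object in place or merely rebinds its local name. The two hypothetical functions below (not part of the lab) sketch the difference.

```python
import numpy as np


def rebind_to_zeros(matrix):
    # Rebinds the local name only; the caller's array is untouched.
    matrix = np.zeros_like(matrix)


def zero_in_place(matrix):
    # Slice assignment writes into the existing array's data.
    matrix[:] = 0


A = np.array([[1, 4], [2, -1]])

rebind_to_zeros(A)
after_rebind = A.copy()       # snapshot: still the original values
zero_in_place(A)

print(after_rebind)
print(A)
```

After the first call the array still holds its original entries; only the second call zeroes it.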


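Moral of the story: when you pass mutable objects into functions, be very clear that any in-place changes the function makes are what you intend. Functions can also have a return statement, which lets them hand a result back to the caller rather than modify an argument. 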
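
As one possible way to avoid the side effect, a function can work on a copy and return it. The function name `zeroed` is a hypothetical one used only for this sketch.

```python
import numpy as np


def zeroed(matrix):
    """Return a zero-filled array of the same shape, leaving the input alone."""
    result = matrix.copy()
    result.fill(0)
    return result


A = np.array([[1, 4], [2, -1], [-1, 6]])
Z = zeroed(A)

print(A)    # unchanged
print(Z)    # all zeros, same shape as A
```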