<a href="https://colab.research.google.com/github/HurricaneCam206/Bootcamp-GT/blob/main/Module%200/Topics%201/week01_session02_NB02_refs_copies.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# References and Copies #



_Main topics covered during today's session:_

Previous NB:
1. **Python Loops**
2. **Comprehensions:  Lists, Dicts, Sets**

This NB:

4. **References and Copies**

Next NB:

5. **Troubleshooting Data for the Exams**


## References ##

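Variables are _names_ for objects. When an object is mutable (a list, dict, or set, as opposed to an immutable value such as an `int` or `str`), a modification made through one name is visible through every other name bound to that same object.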

To wit:

In [1]:
x = [1, 2, 3, 4, 5]
print("x:", x)

y = x
print("y:", y)

y[2] *= -1
print("Modified y:", y)

x: [1, 2, 3, 4, 5]
y: [1, 2, 3, 4, 5]
Modified y: [1, 2, -3, 4, 5]


**Question:** What is `x`?

In [2]:
print(x) # What does this produce?

[1, 2, -3, 4, 5]


#### Why does this occur?

Because `y = x` does not create a new list. It makes `y` a second name for the same list object that `x` refers to. Both names point to one location in memory, so a change made through either name is visible through both.
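
As a quick check of the explanation above, the sketch below uses the `is` operator and `id()` to confirm that the two names refer to one object. The variable names `same_object` and `same_id` are illustrative only.

```python
x = [1, 2, 3, 4, 5]
y = x                        # no copy: y is a second name for the same list

same_object = y is x         # identity test: are these the same object?
same_id = id(y) == id(x)     # equivalent check using identifiers

y[2] *= -1                   # mutate the list through the name y

print(same_object, same_id)  # True True
print(x)                     # [1, 2, -3, 4, 5]
```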

**What's your alternative?** If you really do need a copy, what are your options? Here are three ways to get a separate list with the same contents.

```python
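y = [1, 2, 3, 4, 5]   # build a new list from scratch with the same values
y = x.copy()          # ask the list for a (shallow) copy of itself
y = [e for e in x]    # build a new list element by element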
```

Go ahead and put the above into Python Tutor and see how each one executes.
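
If Python Tutor is not handy, a minimal sketch like the one below can confirm that these approaches give independent lists. It also tries `list(x)` and `x[:]`, which are not in the list above but are common equivalents. The dictionary `copies` is just a convenience for looping.

```python
x = [1, 2, 3, 4, 5]

# Each expression builds a new list object holding the same elements.
copies = {
    "x.copy()": x.copy(),
    "[e for e in x]": [e for e in x],
    "list(x)": list(x),   # not in the list above; added for comparison
    "x[:]": x[:],         # not in the list above; added for comparison
}

for label, c in copies.items():
    c[2] *= -1                                   # mutate the copy only
    print(label, "is a distinct object:", c is not x)

print("x:", x)   # x: [1, 2, 3, 4, 5]
```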

**A tricky case.**

In [3]:
x = [1, 2, ['a', 'b', 'c'], 4, 5]
y = x.copy()
print(y)

[1, 2, ['a', 'b', 'c'], 4, 5]


In [4]:
y[2].append('w')
print(y)

[1, 2, ['a', 'b', 'c', 'w'], 4, 5]


In [5]:
print(x) # What is the result?

[1, 2, ['a', 'b', 'c', 'w'], 4, 5]


In Python, every object has an _identifier_ that is unique for as long as the object exists. You can query it with `id()`.

In [6]:
id(x), id(y)

(138647767864640, 138647769138880)

In [7]:
id(x[2]), id(y[2])

(138647767870592, 138647767870592)

In this case, `x` and `y` are distinct objects, but `x[2]` and `y[2]` refer to the same object. When `.copy()` built `y`, it copied the reference stored in `x[2]` rather than duplicating the inner list itself. This kind of copy is called a _shallow copy_.
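
Since the numeric IDs differ from run to run, the same observation can be sketched with the `is` operator. The names `outer_distinct` and `inner_shared` are illustrative, and the example starts from a fresh `x` rather than the notebook's current state.

```python
x = [1, 2, ['a', 'b', 'c'], 4, 5]
y = x.copy()                     # shallow copy: new outer list, same elements

outer_distinct = y is not x      # the outer lists are different objects
inner_shared = y[2] is x[2]      # but slot 2 holds the very same inner list

y[0] = 99                        # rebinds a slot in y only; x is unaffected
y[2].append('w')                 # mutates the shared inner list

print(outer_distinct, inner_shared)  # True True
print(x)  # [1, 2, ['a', 'b', 'c', 'w'], 4, 5]
print(y)  # [99, 2, ['a', 'b', 'c', 'w'], 4, 5]
```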

Still not clear? Check out a Python Tutor version.

In [8]:
%%html

<iframe width="1024" height="350" frameborder="0" src="https://pythontutor.com/iframe-embed.html#code=x%20%3D%20%5B1,%202,%20%5B'a',%20'b',%20'c'%5D,%204,%205%5D%0Ay%20%3D%20x.copy%28%29%0Ay%5B2%5D.append%28'w'%29&codeDivHeight=400&codeDivWidth=350&cumulative=false&curInstr=0&heapPrimitives=nevernest&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false"> </iframe>

**What if you really need a copy for a nested data structure?** The preceding example illustrates that `.copy()` performs a _shallow_ copy. But what if you want a non-shallow, or _deep_, copy? There's a module for that!

In [9]:
from copy import deepcopy

print('x:', x)
z = deepcopy(x)
print('z:', z)

print('=== appending ===')
z[2].append('@')
print('x:', x)
print('z:', z)

x: [1, 2, ['a', 'b', 'c', 'w'], 4, 5]
z: [1, 2, ['a', 'b', 'c', 'w'], 4, 5]
=== appending ===
x: [1, 2, ['a', 'b', 'c', 'w'], 4, 5]
z: [1, 2, ['a', 'b', 'c', 'w', '@'], 4, 5]
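
To see the difference between the two kinds of copy side by side, here is a small sketch that compares the identity of the nested list under each. It starts from a fresh `x`, and the names `shallow` and `deep` are illustrative.

```python
from copy import deepcopy

x = [1, 2, ['a', 'b', 'c'], 4, 5]
shallow = x.copy()       # copies the outer list only
deep = deepcopy(x)       # recursively copies nested objects too

print(shallow[2] is x[2])  # True: the inner list is shared
print(deep[2] is x[2])     # False: the inner list was duplicated

deep[2].append('@')        # touches only the deep copy's inner list
print(x)                   # [1, 2, ['a', 'b', 'c'], 4, 5]
```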


**Exercise** (taken from Notebook 1). Let `L` be a list of strings, e.g.,

```python
L = ['abc', 'def', 'ghi']
```

Complete the function `rev_str_cat_list(L)` so that it reverses the order of the elements in the list and then concatenates these strings into a single string. It should not modify `L`.

For instance, `rev_str_cat_list(L)` on the above list would return,

```python
'ghidefabc'
```

Your friend supplies the following solution. It appears to produce the correct result, but is wrong. Why?

In [12]:
def rev_str_cat_list(L):
    L.reverse()
    return ''.join(L)

L = ['abc', 'def', 'ghi']
result = rev_str_cat_list(L)
print(repr(result)) # So right, and yet so wrong. Why? (we modified the input with .reverse() which fails the test cases)

'ghidefabc'


> _Answer:_ This function is considered _incorrect_ because it modifies its input. Try `print(L)` after the call to `rev_str_cat_list(L)` to verify this claim.
>
> In this case, the exercise stipulates that the function should not modify its input.
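
One possible fix, offered as a sketch rather than the official solution, is to iterate over the list in reverse without touching it. The built-in `reversed()` returns an iterator over `L` and leaves `L` itself alone. Joining the slice `L[::-1]` would work as well.

```python
def rev_str_cat_list(L):
    # reversed(L) yields the elements back to front; L is not modified.
    return ''.join(reversed(L))

L = ['abc', 'def', 'ghi']
result = rev_str_cat_list(L)

print(repr(result))  # 'ghidefabc'
print(L)             # ['abc', 'def', 'ghi']  (input left intact)
```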

#### However, you should always _assume_ that convention unless told otherwise. Why? Remember that you are writing code for others. Adhering to the convention that functions do not modify their inputs makes it easier for others to reason about the behavior of your code.


#### When we want your function to modify its input, we will tell you to do so.

### This is really important throughout the course!!!

## Summary ##

1. Every distinct object in Python has an ID, which you can see by `id(x)` for the object `x`.

2. An assignment _copies_ the reference, not the object. That is, after the assignment `y = x`, both names refer to the same object, so `id(y)` equals `id(x)`.

3. Shallow vs. deep copies: A list's `.copy()` method performs a shallow copy, so nested objects are shared between the original and the copy. For deep copies, use `deepcopy` from the `copy` module.