### Copying Sequences — Advanced

This notebook expands on shallow vs deep copies, covers multiple ways to copy, shows pitfalls with nested structures, explains how `copy` works for other containers, and demonstrates customizing copy behavior for your own classes.

**Topics**
- Shallow copies: `[:]`, `.copy()`, `list()`, `copy.copy`
- Deep copies: `copy.deepcopy` (and the memo table preserving shared structure)
- Copy semantics for lists/tuples/strings/sets/dicts
- Aliasing pitfalls with nested mutables
- Copying from arbitrary iterables (e.g., generators)
- Customizing copy with `__copy__` and `__deepcopy__`
- Performance notes (micro-benchmarks)


## 1) Shallow copies of lists
Four common ways to make a **shallow** copy of a list:

In [1]:
import copy

l1 = [1, 2, 3]
c_slice = l1[:]        # slicing
c_method = l1.copy()   # list method
c_ctor  = list(l1)     # constructor
c_copy  = copy.copy(l1) # generic shallow copy

(c_slice == l1, c_method == l1, c_ctor == l1, c_copy == l1), \
(c_slice is l1, c_method is l1, c_ctor is l1, c_copy is l1)

((True, True, True, True), (False, False, False, False))

**Shallow** means the outer container is new, but its elements are the **same objects** (shared references).

In [2]:
nested = [[1], [2], [3]]
shallow = nested[:]   # shallow copy
nested[0].append(99)
nested, shallow  # both reflect inner mutation because inner lists are shared

([[1, 99], [2], [3]], [[1, 99], [2], [3]])
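
Rebinding a slot and mutating a shared element behave differently on a shallow copy, which is easiest to see side by side. A minimal sketch (the names `original` and `dup` are illustrative and not part of the cells above):

```python
original = [[1], [2], [3]]
dup = original[:]          # shallow copy: new outer list, same inner lists

# Rebinding a slot in the copy changes only the copy's outer list.
dup[0] = ["replaced"]

# Mutating a still-shared inner list is visible through both names.
dup[1].append(99)

outer_is_independent = original[0] == [1]
inner_is_shared = original[1] is dup[1]

print(original)  # [[1], [2, 99], [3]]
print(dup)       # [['replaced'], [2, 99], [3]]
```

Assignment to `dup[0]` only touches the new outer list, while `append` goes through a reference that both lists still hold.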

## 2) Deep copies of nested structures
`copy.deepcopy` recursively copies nested objects, producing fully independent structures. It also keeps a **memo** to preserve shared references and handle cycles safely.

In [3]:
import copy
nested = [[1], [2], [3]]
deep = copy.deepcopy(nested)
nested[0].append(42)
nested, deep  # deep copy unaffected

([[1, 42], [2], [3]], [[1], [2], [3]])

### Preserving shared structure
If the original shares inner objects, `deepcopy` preserves those relationships in the copy:

In [4]:
a = [1, 2]
b = [a, a]             # shared reference to the same inner list
d = copy.deepcopy(b)
(b[0] is b[1], d[0] is d[1])  # original shares; deep copy also shares its own inner copy

(True, True)
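
The same memo table is what lets `deepcopy` terminate on self-referential data. A small sketch, assuming a list that contains itself (the names `cyclic` and `clone` are illustrative):

```python
import copy

cyclic = [1, 2]
cyclic.append(cyclic)          # the list now contains itself

clone = copy.deepcopy(cyclic)  # terminates: the memo maps `cyclic` to `clone`

clone_is_new = clone is not cyclic
clone_points_to_itself = clone[2] is clone   # the cycle is reproduced inside the copy

print(clone_is_new, clone_points_to_itself)  # True True
```

The copy's cycle points back at the copy, not at the original, so the two structures stay fully separate.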

## 3) Copy semantics for other containers
- **Tuples/strings/bytes** are immutable, so a full slice (`[:]`) returns the **same object** in CPython. This is an implementation optimization, not a language guarantee.
- **Sets/dicts**: use `.copy()` or constructors to make **shallow** copies. Inner mutables still alias.

In [5]:
t1 = (10, 20)
t2 = t1[:]  # immutable -> CPython typically returns same object
s1 = 'hello'
s2 = s1[:]  # same object

d1 = {'x': [1, 2], 'y': [3]}
d2 = d1.copy()           # shallow copy
d1['x'].append(99)

(t1 is t2, s1 is s2), (d1 is d2, d1['x'] is d2['x']), d2

((True, True), (False, True), {'x': [1, 2, 99], 'y': [3]})
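
The `copy` module treats tuples along the same lines. The sketch below reflects CPython behaviour as far as I know it; the identity results are implementation details, and the names `flat` and `mixed` are illustrative:

```python
import copy

flat = (1, 2, 3)      # tuple of immutable elements
mixed = (1, [2, 3])   # tuple holding a mutable list

# A shallow copy of a tuple hands back the same object.
shallow_same = copy.copy(mixed) is mixed

# deepcopy reuses a tuple when none of its elements needed copying...
flat_same = copy.deepcopy(flat) is flat

# ...but builds a new tuple when an element was actually copied.
mixed_deep = copy.deepcopy(mixed)
mixed_is_new = mixed_deep is not mixed and mixed_deep[1] is not mixed[1]

print(shallow_same, flat_same, mixed_is_new)  # True True True
```

A tuple that holds a list is therefore only protected from aliasing by `deepcopy`; slicing or `copy.copy` leaves the inner list shared.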

## 4) Copying from arbitrary iterables (materialization)
Constructors like `list()`, `tuple()`, `set()` **consume** any iterable (including generators) to build a new container. This is a *copy* in the sense of materializing elements, not a structural deep copy.

In [6]:
def gen(n):
    for i in range(n):
        yield i*i

g = gen(5)
l = list(g)  # materialize generator
l, list(gen(5))  # a generator is single-use; calling gen(5) again creates a fresh one

([0, 1, 4, 9, 16], [0, 1, 4, 9, 16])
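
Two consequences of materialization are worth checking directly: an exhausted generator yields nothing on a second pass, and the materialized elements are the same objects the iterable produced. A minimal sketch (the names `rows` and `stream` are illustrative):

```python
rows = [[1], [2], [3]]

stream = (row for row in rows)   # generator over the existing inner lists
first_pass = list(stream)        # consumes the generator
second_pass = list(stream)       # already exhausted, so nothing is left

elements_shared = first_pass[0] is rows[0]   # materializing does not copy elements

print(first_pass)       # [[1], [2], [3]]
print(second_pass)      # []
print(elements_shared)  # True
```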

## 5) When *not* to bother copying
- Immutable containers (strings, tuples of immutable elements) are safe to share. A tuple that holds a list still exposes that list to mutation.
- If you only **read** from a list (no mutation), aliasing is fine.
- Copying large containers can be expensive; copy only when you must isolate the original from side effects.

## 6) Customizing copy for your own classes
Implement `__copy__(self)` and `__deepcopy__(self, memo)` to control how instances are copied. Inside `__deepcopy__`, pass the `memo` dict through to every nested `copy.deepcopy` call so that cycles terminate and shared sub-objects are copied only once.

In [7]:
import copy

class Bag:
    def __init__(self, items):
        self.items = list(items)

    def __copy__(self):
        # shallow: new Bag with a new outer list (built by __init__); the inner elements stay shared
        new = type(self)(self.items)
        return new

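    def __deepcopy__(self, memo):
        # copy.deepcopy checks memo before calling this hook, so no lookup is needed here
        new = type(self).__new__(type(self))
        memo[id(self)] = new  # register before recursing so cycles resolve to `new`
        new.items = copy.deepcopy(self.items, memo)
        return new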

    def __repr__(self):
        return f"Bag({self.items!r})"

b1 = Bag([[1], [2]])
b2 = copy.copy(b1)
b3 = copy.deepcopy(b1)
b1.items[0].append(99)
b1, b2, b3  # shallow reflects, deep does not

(Bag([[1, 99], [2]]), Bag([[1, 99], [2]]), Bag([[1], [2]]))
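
The role of `memo` becomes clearer with an object graph that contains a cycle. The sketch below uses a hypothetical `Node` class (not part of this notebook) whose children point back to their parent, and registers the new object in `memo` before recursing:

```python
import copy

class Node:
    """Hypothetical tree node with a back-reference to its parent."""

    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child

    def __deepcopy__(self, memo):
        new = type(self).__new__(type(self))   # allocate without running __init__
        memo[id(self)] = new                   # register first so cycles find `new`
        new.name = self.name
        new.parent = copy.deepcopy(self.parent, memo)
        new.children = copy.deepcopy(self.children, memo)
        return new

root = Node("root")
leaf = root.add(Node("leaf"))

root2 = copy.deepcopy(root)
leaf2 = root2.children[0]

print(leaf2 is not leaf, leaf2.parent is root2)  # True True
```

The default `deepcopy` already copes with cycles like this for plain classes; a hand-written hook is mainly useful when some attributes need special treatment, and in that case registering in `memo` early keeps the back-references consistent.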

## 7) Micro-benchmarks (indicative)
Copy strategies have different costs. Below: timing shallow copy methods for a moderately sized list. Numbers vary by machine/interpreter; use this as a rough guide only.

In [8]:
from timeit import timeit
data = list(range(10000))
t_slice = timeit("x = data[:]", globals=globals(), number=2000)
t_copy  = timeit("x = data.copy()", globals=globals(), number=2000)
t_ctor  = timeit("x = list(data)", globals=globals(), number=2000)
t_copy_mod = timeit("x = copy.copy(data)", setup="import copy", globals=globals(), number=2000)  # import kept out of the timed statement
{'slice': t_slice, '.copy()': t_copy, 'list()': t_ctor, 'copy.copy': t_copy_mod}

{'slice': 0.13680819999717642,
 '.copy()': 0.1521898999926634,
 'list()': 0.11645990000397433,
 'copy.copy': 0.11819359999208245}
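
Single `timeit` runs are noisy, and the cell above compares only shallow strategies. A rough sketch of a steadier comparison between a shallow and a deep copy follows; the helper `best_of`, the `payload` data, and the repeat counts are illustrative choices, and no timings are shown because they depend on the machine:

```python
import copy
from timeit import repeat

payload = [[i] for i in range(1000)]   # nested list, so deepcopy has real work to do

def best_of(stmt, number=50, rounds=3):
    """Return the fastest of several timing rounds, in seconds."""
    namespace = {"copy": copy, "payload": payload}
    return min(repeat(stmt, globals=namespace, number=number, repeat=rounds))

timings = {
    "shallow": best_of("payload[:]"),
    "deep": best_of("copy.deepcopy(payload)"),
}
print(timings)
```

Taking the minimum over several rounds is a common way to reduce interference from other processes; it measures the best case rather than the average.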

## 8) Practical patterns
- Defend against callee mutation: pass `arg[:]` when only the outer list might be changed, or `copy.deepcopy(arg)` when nested elements might be mutated too.
- For nested data you intend to mutate independently, **deepcopy** (or rebuild only the parts you need).
- Prefer immutable structures (e.g., tuples, frozen dataclasses via `@dataclass(frozen=True)`) when possible, so that sharing is safe and copying is unnecessary.
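
"Rebuild only the parts you need" can be sketched as follows. The helper `append_to_row` is hypothetical: it copies the outer list and replaces just the one row that changes, leaving every other row shared with the input.

```python
def append_to_row(rows, index, value):
    """Return a new list of rows with `value` appended to one row.

    Hypothetical helper: the input is never mutated, and untouched
    rows are shared between the input and the result.
    """
    updated = rows[:]                        # new outer list
    updated[index] = rows[index] + [value]   # new inner list for the changed row only
    return updated

grid = [[1], [2], [3]]
new_grid = append_to_row(grid, 0, 99)

print(grid)                    # [[1], [2], [3]]
print(new_grid)                # [[1, 99], [2], [3]]
print(new_grid[1] is grid[1])  # True
```

This costs one shallow copy plus one small rebuilt row, which is usually far cheaper than a full `deepcopy`. It is only safe as long as nobody later mutates the shared rows in place.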


## 9) Recap with quick demonstrations
**Shallow list copy vs deep:**

In [9]:
from copy import deepcopy
m1 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
m2 = m1.copy()       # shallow copy
m3 = deepcopy(m1)    # deep copy

m1[0].append(100)
m1, m2, m3

([[1, 0, 0, 100], [0, 1, 0], [0, 0, 1]],
 [[1, 0, 0, 100], [0, 1, 0], [0, 0, 1]],
 [[1, 0, 0], [0, 1, 0], [0, 0, 1]])

**Immutable slicing returns same object (CPython optimization):**

In [10]:
t1 = (10, [1, 2], 'abc')
t2 = t1[:]
s1 = 'Python rocks!'
s2 = s1[:]
(t1 is t2, s1 is s2)

(True, True)