# copy — Shallow and deep copy operations
## Agenda

- The difference between copy and deepcopy

### Hardware

In [1]:
%%bash
system_profiler SPHardwareDataType | grep -E \
"Model Identifier"\|"Processor Name"\|"Processor Speed"\
\|"Number of Processors"\|"Memory:"

      Model Identifier: MacBookPro13,1
      Processor Name: Dual-Core Intel Core i5
      Processor Speed: 2 GHz
      Number of Processors: 1
      Memory: 16 GB


In [2]:
!sw_vers

ProductName:	Mac OS X
ProductVersion:	10.15.4
BuildVersion:	19E287


### Python

In [3]:
!python -V

Python 3.7.4


### Import

In [4]:
import numpy as np
from matplotlib import pyplot as plt
import scipy.stats as sp
import copy

## 1. Interface

- `copy.copy(x)`: returns a shallow copy of x.
- `copy.deepcopy(x)`: returns a deep copy of x.

The difference between shallow and deep copying matters only for compound objects, that is, objects that contain other objects, such as lists or class instances.
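The cells in this notebook use lists, so here is a minimal sketch of the same distinction for a class instance. The `Box` class and its `items` attribute are hypothetical names introduced only for this illustration.

```python
import copy


class Box:
    """Hypothetical container class used only for this illustration."""

    def __init__(self, items):
        self.items = items


original = Box([1, 2])
shallow = copy.copy(original)
deep = copy.deepcopy(original)

# Both copies are new Box instances.
print(shallow is original)  # False
print(deep is original)     # False

# The shallow copy still refers to the same inner list; the deep copy does not.
print(shallow.items is original.items)  # True
print(deep.items is original.items)     # False
```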

In [5]:
## Confirm that assignment is not a copy
a = "abc"
b = a

print ("id(a) = %s" % id(a))
print ("id(b) = %s" % id(b))


id(a) = 4320467632
id(b) = 4320467632
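With an immutable string the shared identity has no visible consequence. A small sketch with a mutable list makes the aliasing easier to see; the `is` operator is used here as a shorthand for comparing `id` values.

```python
a = [1, 2]
b = a          # b is another name for the same list, not a copy
b.append(3)    # mutate the list through b

print(a)       # [1, 2, 3]
print(a is b)  # True
```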


In [6]:
## shallow copy
a = [1, 2]
b = copy.copy(a)
a.append(3)

print ("a = %s" % a)
print ("b = %s" % b)

print ("id(a) = %i" % id(a))
print ("id(b) = %i" % id(b))

a = [1, 2, 3]
b = [1, 2]
id(a) = 4621721344
id(b) = 4621799168


In [7]:
## The problem with shallow copy
a = [[1, 2], [3, 4]]
b = copy.copy(a)
a[1].append(5)

print ("a = %s" % a)
print ("b = %s" % b)

print ("id(a) = %i" % id(a))
print ("id(b) = %i" % id(b))

a = [[1, 2], [3, 4, 5]]
b = [[1, 2], [3, 4, 5]]
id(a) = 4621671552
id(b) = 4621683600


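As this shows, a shallow copy of a nested object has a different id from the original, but the inner lists are shared, so a change made through `a` also appears in `b`.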
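One way to confirm where the sharing happens is to compare the inner lists directly. The sketch below rebuilds the same `a` and `b` and also shows that rebinding a slot of the outer list, unlike mutating an inner list, stays local to that outer list.

```python
import copy

a = [[1, 2], [3, 4]]
b = copy.copy(a)

# The outer lists are distinct objects...
print(a is b)        # False
# ...but each element of b is the very same inner list as in a.
print(a[0] is b[0])  # True
print(a[1] is b[1])  # True

# Rebinding a slot of the outer list affects only that outer list.
a[0] = ["replaced"]
print(b[0])          # [1, 2]
```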

In [8]:
## deep copy
a = [[1, 2], [3, 4]]
b = copy.deepcopy(a)  # changed line
a[1].append(5)

print ("a = %s" % a)
print ("b = %s" % b)

print ("id(a) = %i" % id(a))
print ("id(b) = %i" % id(b))

print ("id(a[0]) = %i" % id(a[0]))
print ("id(b[0]) = %i" % id(b[0]))

print ("id(a[1]) = %i" % id(a[1]))
print ("id(b[1]) = %i" % id(b[1]))

a = [[1, 2], [3, 4, 5]]
b = [[1, 2], [3, 4]]
id(a) = 4621563136
id(b) = 4619826912
id(a[0]) = 4621575104
id(b[0]) = 4621552448
id(a[1]) = 4621564416
id(b[1]) = 4621549968
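The `copy` module documentation notes that `deepcopy` keeps a memo of the objects it has already copied. A small sketch of what that means in practice: references that are shared inside the original stay shared inside the copy, and a self-referencing list does not cause infinite recursion. The variable names are illustrative only.

```python
import copy

shared = [1, 2]
a = [shared, shared]   # the same inner list appears twice
a.append(a)            # and the outer list contains itself

b = copy.deepcopy(a)

# The copy is independent of the original...
print(b[0] is shared)  # False
# ...but the internal sharing is preserved within the copy.
print(b[0] is b[1])    # True
# The self-reference points at the copy, not at the original.
print(b[2] is b)       # True
```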


### Slicing and copy

- A shallow copy can also be made with a slice.

In [9]:
a = [1, 2, 3]
b = a[:]
print ("a = %s" % a)
print ("b = %s" % b)

print ("id(a) = %i" % id(a))
print ("id(b) = %i" % id(b))

a = [1, 2, 3]
b = [1, 2, 3]
id(a) = 4621682320
id(b) = 4623355328
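Slicing is one of several equivalent idioms. As a sketch, `a[:]`, `list(a)` and `a.copy()` each build a new outer list that still refers to the same inner objects, so all three are shallow.

```python
a = [[1, 2], [3, 4]]

by_slice = a[:]
by_constructor = list(a)
by_method = a.copy()

for name, b in [("a[:]", by_slice), ("list(a)", by_constructor), ("a.copy()", by_method)]:
    # Prints whether the outer list is the same object, then whether the
    # first inner list is the same object.
    print(name, b is a, b[0] is a[0])
```

Each line should report `False` for the outer list and `True` for the inner list.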


## 2. view and shallow copy

- Here we confirm that a view, which is NumPy's shallow copy, shares its data with the original array.
- We use `numpy.ndarray`.

### The view case

In [10]:
a = np.arange(12)
c = a.view()

In [11]:
c is a

False

In [12]:
c.base is a

True

In [13]:
c.flags.owndata

False

In [14]:
c.shape = 2,6

In [15]:
c

array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11]])

In [16]:
a

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

In [17]:
c[0,4] = 1234

In [18]:
a

array([   0,    1,    2,    3, 1234,    5,    6,    7,    8,    9,   10,
         11])

In [19]:
s = a[1:3]
s[:] = 10
a

array([   0,   10,   10,    3, 1234,    5,    6,    7,    8,    9,   10,
         11])

### view, part 2

In [20]:
c

array([[   0,   10,   10,    3, 1234,    5],
       [   6,    7,    8,    9,   10,   11]])

In [21]:
s = c[0, :]

In [22]:
s.base is c

False

In [23]:
s[:] = 9999
c

array([[9999, 9999, 9999, 9999, 9999, 9999],
       [   6,    7,    8,    9,   10,   11]])
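The `False` from `s.base is c` may look surprising, since `s` was sliced from `c` and writing to `s` clearly changed `c`. A sketch of what appears to be happening, assuming a reasonably recent NumPy in which the `base` of a view of a view refers to the array that owns the memory:

```python
import numpy as np

a = np.arange(12)
c = a.view()
c.shape = 2, 6
s = c[0, :]

print(s.base is c)             # False
print(s.base is a)             # True: base is the array that owns the data
print(np.shares_memory(s, c))  # True: s still shares memory with c
```

So `base` identifies the owner of the buffer, not necessarily the array the view was created from.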

### Deepcopy

- The `copy` method of `numpy.ndarray` makes a complete copy of the array and its data; the NumPy tutorial calls this a deep copy.

In [24]:
s = c.copy()

In [25]:
s.base is c

False

In [26]:
s[:] = 0
s

array([[0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0]])

In [27]:
c

array([[9999, 9999, 9999, 9999, 9999, 9999],
       [   6,    7,    8,    9,   10,   11]])
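One caveat applies to arrays with `dtype=object`. According to the NumPy documentation, `copy` duplicates the array of references but not the Python objects they point to, and `copy.deepcopy` is the suggested way to copy those as well. A minimal sketch, adapted from the example in the NumPy reference:

```python
import copy
import numpy as np

a = np.array([1, "m", [2, 3, 4]], dtype=object)
b = a.copy()          # copies the array of references only
d = copy.deepcopy(a)  # also copies the contained list

b[2][0] = 10          # mutate the inner list through b

print(a[2])  # [10, 3, 4]  (a sees the change)
print(d[2])  # [2, 3, 4]   (the deep copy is unaffected)
```

For numeric dtypes such as the integer arrays used in this section, the distinction does not arise.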

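## 3. Difference between slicing a list and slicing a `numpy.ndarray`

Assigning a slice of a Python list to a variable gives a copy, whereas assigning a slice of a `numpy.ndarray` gives a view.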

### Python List

In [28]:
a = [1, 2, 3]
b = a[:]
id(a) == id(b)

False

In [29]:
b[1] = 9999
b

[1, 9999, 3]

In [30]:
a

[1, 2, 3]

### `numpy.ndarray`

In [31]:
a = np.arange(0, 12, dtype=float)  # np.float was removed in newer NumPy releases; the builtin float is equivalent
a

array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11.])

In [32]:
b = a[:]
id(a) == id(b)

False

In [33]:
b.base is a

True

In [34]:
b[1] = 999.0
b

array([  0., 999.,   2.,   3.,   4.,   5.,   6.,   7.,   8.,   9.,  10.,
        11.])

In [35]:
a

array([  0., 999.,   2.,   3.,   4.,   5.,   6.,   7.,   8.,   9.,  10.,
        11.])
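When list-like behaviour is wanted, an explicit `copy()` on the slice gives an independent array. A minimal sketch:

```python
import numpy as np

a = np.arange(0, 12, dtype=float)
b = a[:].copy()   # explicit copy of the slice

b[1] = 999.0

print(b.base is None)          # True: b owns its data
print(a[1])                    # 1.0: a is unchanged
print(np.shares_memory(a, b))  # False
```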

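## 4. How to check whether an array is a copy or a view

- `numpy.may_share_memory` determines whether two arrays might share memory.
- A result of False means the arrays do not share memory, but a result of True does not guarantee that they do (there are no false negatives, but false positives are possible).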

### syntax
```
numpy.may_share_memory(a, b, max_work=None)
```

- a,b : ndarray

In [36]:
np.may_share_memory(np.array([1,2]), np.array([5,8,9]))

False

In [37]:
x = np.zeros([3, 4])
np.may_share_memory(x[:,0], x[:,1])

True

For an exact answer, use `np.shares_memory`. Note that it can be considerably slower.

In [38]:
np.shares_memory(x[:,0], x[:,1])

False
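The two functions disagree here because the columns of `x` are interleaved in memory: the address ranges they span overlap, but no element belongs to both. The same effect can be sketched in one dimension, assuming the default bounds-only behaviour of `may_share_memory`. The names `evens` and `odds` are illustrative only.

```python
import numpy as np

x = np.arange(10)
evens = x[::2]   # elements at positions 0, 2, 4, ...
odds = x[1::2]   # elements at positions 1, 3, 5, ...

# The address ranges spanned by the two views overlap, so the quick check says True.
print(np.may_share_memory(evens, odds))  # True
# No element is common to both views, so the exact check says False.
print(np.shares_memory(evens, odds))     # False

# Writing through one view leaves the other untouched.
evens[:] = -1
print(odds)  # [1 3 5 7 9]
```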