# Indexing and Slicing (Views and Copies)

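In NumPy, basic indexing (integers and slices) returns a view of the original data, so modifying the result also modifies the original array. Boolean indexing and fancy indexing, covered in sections 3 and 4, return copies instead.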

In [1]:
import numpy as np

## 1. Views and Copies

In [2]:
arr = np.arange(10)
arr

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [3]:
arr_slice = arr[5:8]
arr_slice[1] = 12345
arr  # the original array has been modified as well

array([    0,     1,     2,     3,     4,     5, 12345,     7,     8,
           9])

In [4]:
arr_slice[:] = 12
arr

array([ 0,  1,  2,  3,  4, 12, 12, 12,  8,  9])
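Whether a result is a view can also be checked directly rather than inferred from side effects. The following is a minimal sketch, assuming a fresh `arr` as above; it uses the `.base` attribute and `np.shares_memory`, neither of which appears in the original cells.

```python
import numpy as np

arr = np.arange(10)
arr_slice = arr[5:8]  # basic slice: a view onto arr's buffer

# A view does not own its data; its .base is the array it was taken from.
is_view = arr_slice.base is arr
# np.shares_memory reports whether two arrays overlap in memory.
shares = np.shares_memory(arr, arr_slice)

print(is_view, shares)  # True True
```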

To get an independent copy, call the array's `copy()` method (the function form `np.copy()` works as well):

In [5]:
arr = np.arange(10)
arr_slice = arr[5:8].copy()  
arr_slice

array([5, 6, 7])

In [6]:
arr_slice[:] = 12
arr

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [7]:
arr_slice

array([12, 12, 12])
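The same independence can be confirmed without modifying anything. This is a small sketch under the assumption that `arr` is a fresh `np.arange(10)`; the name `arr_copy` is only illustrative.

```python
import numpy as np

arr = np.arange(10)
arr_copy = arr[5:8].copy()  # explicit copy of the slice

# A copy owns its own buffer, so .base is None and no memory is shared.
owns_data = arr_copy.base is None
shares = np.shares_memory(arr, arr_copy)

arr_copy[:] = 12  # only the copy changes
print(owns_data, shares)  # True False
print(arr)                # [0 1 2 3 4 5 6 7 8 9]
```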

## 2. Indexing

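Plain element indexing is not shown here.  
Different indexing styles also produce arrays of different shapes: an integer index drops the axis it is applied to, so mixing integer indices with slices reduces the number of dimensions.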

In [8]:
arr = np.arange(9).reshape(3,3)
arr

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

In [9]:
a1 = arr[2]
a2 = arr[2,:]
a3 = arr[2:,:]
a1.shape,a2.shape,a3.shape

((3,), (3,), (1, 3))

In [10]:
a1,a2,a3

(array([6, 7, 8]), array([6, 7, 8]), array([[6, 7, 8]]))

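Mixing integer indices with slices yields lower-dimensional slices such as a1 and a2, while a3, which uses slices only, keeps both dimensions.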
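The same rule applies along the column axis. As a short illustration, assuming the same 3x3 `arr`, an integer index on axis 1 drops that axis while a length-one slice keeps it; the variable names here are made up for the example.

```python
import numpy as np

arr = np.arange(9).reshape(3, 3)

col_1d = arr[:, 1]    # integer index on axis 1: that axis is dropped
col_2d = arr[:, 1:2]  # length-one slice on axis 1: the axis is kept
element = arr[2, 1]   # integers on both axes: a single element

print(col_1d.shape, col_2d.shape)  # (3,) (3, 1)
print(element)                     # 7
```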

### ravel vs. flatten: differences and similarities

In [11]:
# ravel() also flattens an array to 1-D; it returns a view whenever possible (as it does here)
a3r = a3.ravel()
a3r.shape

(3,)

In [12]:
a3r

array([6, 7, 8])

In [13]:
a3r[:] = 2
a3,a3.shape #可以看到原数组发生了变化，但维度不变

(array([[2, 2, 2]]), (1, 3))

flatten() also flattens the array, but it always returns a copy

In [14]:
a3f = a3.flatten()
a3f.shape

(3,)

In [15]:
a3,a3f

(array([[2, 2, 2]]), array([2, 2, 2]))

In [16]:
a3f[:] = 4
a3f,a3 # the original data is unchanged

(array([4, 4, 4]), array([[2, 2, 2]]))
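One caveat worth keeping in mind: `ravel()` returns a view only when the memory layout allows it. The sketch below, which uses a transposed array as an assumed example of a layout that is not C-contiguous, shows `ravel()` falling back to a copy, whereas `flatten()` copies in every case.

```python
import numpy as np

arr = np.arange(9).reshape(3, 3)

r_contig = arr.ravel()        # arr is C-contiguous: ravel can return a view
r_transposed = arr.T.ravel()  # arr.T is not C-contiguous: ravel must copy
f = arr.flatten()             # flatten always copies

print(np.shares_memory(arr, r_contig))      # True
print(np.shares_memory(arr, r_transposed))  # False
print(np.shares_memory(arr, f))             # False
```

If code depends on the result being independent of the source, `flatten()` or an explicit `copy()` is the safer choice.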

## 3. Boolean Indexing

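With boolean indexing, a boolean array acts as a mask: assuming each entry of `names` corresponds to one row of the `data` matrix, we can extract the matching rows.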

In [17]:
names = np.array(['Bob','Joe','Will','Bob','Will','Joe','Joe'])
data = np.random.randn(7,4)
names,data

(array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'], dtype='<U4'),
 array([[ 0.98267129, -0.05551004, -1.22733747,  0.73921443],
        [-0.63333885,  1.91393381, -0.64437453, -0.04670214],
        [ 0.56654464, -1.78194391,  0.03301866, -1.25012035],
        [ 0.32416562,  0.81225429, -0.39467472, -0.88892041],
        [ 0.49965283, -0.37160871,  1.23493762, -0.7490566 ],
        [-0.76157313, -0.27671444, -0.85417906,  0.88740656],
        [ 2.05707804,  0.57419182, -0.10137799,  1.18776646]]))

In [18]:
data[names == 'Bob']

array([[ 0.98267129, -0.05551004, -1.22733747,  0.73921443],
       [ 0.32416562,  0.81225429, -0.39467472, -0.88892041]])

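The boolean array must have the same length as the number of rows in the matrix; here we assume each row holds one person's data.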

In [19]:
data[names == 'Bob',2:]

array([[-1.22733747,  0.73921443],
       [-0.39467472, -0.88892041]])

In [20]:
data[names == 'Bob',3]

array([ 0.73921443, -0.88892041])

In [21]:
mask = (names == 'Bob') | (names == 'Will')
mask

array([ True, False,  True,  True,  True, False, False])

In [22]:
data[mask]

array([[ 0.98267129, -0.05551004, -1.22733747,  0.73921443],
       [ 0.56654464, -1.78194391,  0.03301866, -1.25012035],
       [ 0.32416562,  0.81225429, -0.39467472, -0.88892041],
       [ 0.49965283, -0.37160871,  1.23493762, -0.7490566 ]])
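Two further points may be useful here: a mask can be inverted with `~`, and although reading through a mask produces a copy, assigning through a mask writes into the original array. The sketch below replaces the random `data` with a deterministic `np.arange` matrix, purely as an assumption to keep the output reproducible.

```python
import numpy as np

names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
data = np.arange(28).reshape(7, 4)  # deterministic stand-in for the random data

# ~ inverts a boolean mask: every row that does not belong to Bob.
not_bob = data[~(names == 'Bob')]

# Reading through a mask returns a copy, so this leaves data untouched.
bob_rows = data[names == 'Bob']
bob_rows[:] = -1

# Assigning through a mask modifies data in place.
data[names == 'Joe'] = 0

print(not_bob.shape)  # (5, 4)
print(data[0])        # [0 1 2 3]
print(data[1])        # [0 0 0 0]
```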

## 4. Fancy Indexing

Fancy indexing means indexing with integer arrays; the result is always a copy of the data.

In [23]:
fi = np.empty((8,4)) 
for i in range(8):
    fi[i] = i
fi

array([[0., 0., 0., 0.],
       [1., 1., 1., 1.],
       [2., 2., 2., 2.],
       [3., 3., 3., 3.],
       [4., 4., 4., 4.],
       [5., 5., 5., 5.],
       [6., 6., 6., 6.],
       [7., 7., 7., 7.]])

In [24]:
# select rows of an array or matrix in a specified order
fi[[4,3,0,6]]

array([[4., 4., 4., 4.],
       [3., 3., 3., 3.],
       [0., 0., 0., 0.],
       [6., 6., 6., 6.]])

In [25]:
fi2 = np.arange(32).reshape((8,4))
fi2

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23],
       [24, 25, 26, 27],
       [28, 29, 30, 31]])

In [26]:
fi2[[1,5,7,2]]

array([[ 4,  5,  6,  7],
       [20, 21, 22, 23],
       [28, 29, 30, 31],
       [ 8,  9, 10, 11]])
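Negative integers are also accepted in the index list and count rows from the end. A brief sketch, assuming the same `fi2` as above:

```python
import numpy as np

fi2 = np.arange(32).reshape((8, 4))

# -1 is the last row, -3 is the third row from the end.
from_end = fi2[[-1, -3]]
print(from_end)
# [[28 29 30 31]
#  [20 21 22 23]]
```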

In [27]:
# specify the column order
fi2[:,[3,2,1,0]]

array([[ 3,  2,  1,  0],
       [ 7,  6,  5,  4],
       [11, 10,  9,  8],
       [15, 14, 13, 12],
       [19, 18, 17, 16],
       [23, 22, 21, 20],
       [27, 26, 25, 24],
       [31, 30, 29, 28]])

In [28]:
fi2[[1,5,7,2],[0,3,1,2]]

array([ 4, 23, 29, 10])

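This does not reorder the rows and then the columns; instead it picks the individual elements at positions (1,0), (5,3), (7,1), and (2,2).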
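To make the pairing explicit, the sketch below compares the fancy-indexed result with an element-by-element loop over the zipped row and column lists. The names `rows`, `cols`, `paired`, and `manual` are illustrative only.

```python
import numpy as np

fi2 = np.arange(32).reshape((8, 4))
rows = [1, 5, 7, 2]
cols = [0, 3, 1, 2]

# Two index lists are matched position by position.
paired = fi2[rows, cols]

# Equivalent explicit loop over (row, column) pairs.
manual = np.array([fi2[r, c] for r, c in zip(rows, cols)])

print(paired)                          # [ 4 23 29 10]
print(np.array_equal(paired, manual))  # True
```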

In [29]:
fi2[[1,5,7,2]][:,[0,3,1,2]]

array([[ 4,  7,  5,  6],
       [20, 23, 21, 22],
       [28, 31, 29, 30],
       [ 8, 11,  9, 10]])

In [30]:
# a simpler alternative
fi2[np.ix_([1,5,7,2],[0,3,1,2])]

array([[ 4,  7,  5,  6],
       [20, 23, 21, 22],
       [28, 31, 29, 30],
       [ 8, 11,  9, 10]])
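As a rough explanation of why `np.ix_` produces the rectangular selection: it reshapes the two lists into a column-shaped and a row-shaped index array, which then broadcast against each other. The sketch below inspects those shapes; `row_idx` and `col_idx` are names chosen for the example.

```python
import numpy as np

fi2 = np.arange(32).reshape((8, 4))

# np.ix_ returns one index array per axis, shaped so that they broadcast.
row_idx, col_idx = np.ix_([1, 5, 7, 2], [0, 3, 1, 2])
print(row_idx.shape, col_idx.shape)  # (4, 1) (1, 4)

# Broadcasting (4, 1) against (1, 4) yields every row/column combination.
block = fi2[row_idx, col_idx]
print(block.shape)  # (4, 4)
```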

In [31]:
res = fi2[[1,5,7,2]][:,[0,3,1,2]]
res[:] = 0
res,fi2 # the original array is unchanged

(array([[0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]]), array([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11],
        [12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23],
        [24, 25, 26, 27],
        [28, 29, 30, 31]]))
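The copy semantics apply to reading only. When a fancy index appears on the left-hand side of an assignment, NumPy writes into the original array. A minimal sketch of the contrast, assuming a fresh `fi2`:

```python
import numpy as np

fi2 = np.arange(32).reshape((8, 4))

# Reading with a fancy index gives a copy; changing it leaves fi2 alone.
picked = fi2[[1, 5]]
picked[:] = 0
row1_before = fi2[1].tolist()
print(row1_before)  # [4, 5, 6, 7]

# Assigning through a fancy index modifies fi2 in place.
fi2[[1, 5]] = 0
print(fi2[1], fi2[5])  # [0 0 0 0] [0 0 0 0]
```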