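# Array Indexing and Slicing
Contents:
* Basic array indexing and slicing
* Fancy indexing
* Boolean indexing
    * Basic usage of boolean arrays
    * Combining boolean arrays with slices and integers
    * Negating a condition with != and ~
    * Combining several conditions with & (and) and | (or)
    * Setting values with a boolean array
    * Using np.where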

## Basic array indexing and slicing

In [1]:
import numpy as np

ndarray1 = np.arange(10)
ndarray2 = np.arange(15).reshape((3, 5))

In [2]:
ndarray2

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

In [3]:
ndarray1[3]

3

In [4]:
ndarray1[2:5]

array([2, 3, 4])

In [5]:
ndarray2[2][1]

11

In [6]:
ndarray2[1:3]

array([[ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

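#### Notes:
1. When a scalar is assigned to a slice, the value is broadcast to the whole selection. Unlike a list slice, an array slice is a view of the original array: the data is not copied, so any change made through the view is reflected directly in the source array.
2. This may seem surprising, but NumPy is designed to handle large datasets, and copying data on every slice would cause serious performance and memory problems.
3. To get a copy of a slice rather than a view, copy it explicitly, for example with ndarray1[2:5].copy().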
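The three notes above can be checked with a short experiment. This is only an illustrative sketch; the names arr, view, snapshot, lst and part are made up for the example and do not appear in the cells of this notebook.

```python
import numpy as np

arr = np.arange(10)

# A slice is a view: broadcasting a scalar into it changes the source array.
view = arr[2:5]
view[:] = 99
print(arr)  # [ 0  1 99 99 99  5  6  7  8  9]

# An explicit copy owns its data, so the source array is left untouched.
snapshot = arr[5:8].copy()
snapshot[:] = -1
print(arr)  # [ 0  1 99 99 99  5  6  7  8  9]

# A list slice, by contrast, is always a new list.
lst = list(range(10))
part = lst[2:5]
part[0] = 99
print(lst[2])  # 2
```

np.shares_memory(arr, view) is one way to confirm whether two arrays refer to the same underlying data.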

In [7]:
import numpy as np

ndarray1 = np.arange(10)
print('ndarray1->', ndarray1)
print('ndarray1[3]->', ndarray1[3])
print('ndarray1[2:5]->', ndarray1[2:5])
print('--------------------------')

ndarray2 = np.arange(15).reshape((3, 5))
print('ndarray2->')
print(ndarray2)
print('ndarray2[2][1] ->', ndarray2[2][1])
print('ndarray2[2, 1]->', ndarray2[2, 1])
print('ndarray2[:2][:1]-> ', ndarray2[:2][:1])
print('ndarray2[:2, :2]-> ')
print(ndarray2[:2, :2])
print('ndarray2[2, 1:3]-> ', ndarray2[2, 1:3])
print('ndarray2[:2, 1]-> ', ndarray2[:2, 1])

ndarray1-> [0 1 2 3 4 5 6 7 8 9]
ndarray1[3]-> 3
ndarray1[2:5]-> [2 3 4]
--------------------------
ndarray2->
[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]]
ndarray2[2][1] -> 11
ndarray2[2, 1]-> 11
ndarray2[:2][:1]->  [[0 1 2 3 4]]
ndarray2[:2, :2]-> 
[[0 1]
 [5 6]]
ndarray2[2, 1:3]->  [11 12]
ndarray2[:2, 1]->  [1 6]
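The output above shows that ndarray2[:2][:1] returns a whole row rather than a column selection. The sketch below spells out why: chained brackets slice the rows twice, while a comma-separated index slices rows and columns together. The names arr, chained and combined are chosen only for this illustration.

```python
import numpy as np

arr = np.arange(15).reshape((3, 5))

# Chained: arr[:2] keeps the first two rows, then [:1] slices the rows again.
chained = arr[:2][:1]
# Tuple index: :2 applies to the rows and :1 to the columns in a single step.
combined = arr[:2, :1]

print(chained.shape)   # (1, 5)
print(combined.shape)  # (2, 1)
```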


## Fancy indexing

In [8]:
import numpy as np
ndarray1 = np.empty((8, 4))
for i in range(8):
    ndarray1[i] = np.arange(i, i + 4)

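# Select specific rows by passing a list of row indices
ret1 = ndarray1[[0, 1, 6, 7]]

# Negative indices select rows counting from the end
ret2 = ndarray1[[-1, 0, -2]]

# Passing several index arrays at once selects the elements (1, 1), (3, 2), (5, 3)
ret3 = ndarray1[[1, 3, 5], [1, 2, 3]]
# Chained fancy indexing: take rows 1, 3, 5, then rows 1 and 2 of that result
ret4 = ndarray1[[1, 3, 5]][[1, 2]]

# Select a rectangular region (every combination of the given rows and columns)
ret5 = ndarray1[[1, 3, 5]][:, [1, 2, 3]]
ret6 = ndarray1[np.ix_([1, 2, 4], [1, 2, 3])]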

In [9]:
ndarray1

array([[  0.,   1.,   2.,   3.],
       [  1.,   2.,   3.,   4.],
       [  2.,   3.,   4.,   5.],
       [  3.,   4.,   5.,   6.],
       [  4.,   5.,   6.,   7.],
       [  5.,   6.,   7.,   8.],
       [  6.,   7.,   8.,   9.],
       [  7.,   8.,   9.,  10.]])

In [10]:
ret1

array([[  0.,   1.,   2.,   3.],
       [  1.,   2.,   3.,   4.],
       [  6.,   7.,   8.,   9.],
       [  7.,   8.,   9.,  10.]])

In [11]:
ret2

array([[  7.,   8.,   9.,  10.],
       [  0.,   1.,   2.,   3.],
       [  6.,   7.,   8.,   9.]])

In [12]:
ret3

array([ 2.,  5.,  8.])

In [13]:
ret4

array([[ 3.,  4.,  5.,  6.],
       [ 5.,  6.,  7.,  8.]])

In [14]:
ret5

array([[ 2.,  3.,  4.],
       [ 4.,  5.,  6.],
       [ 6.,  7.,  8.]])

In [15]:
ret6

array([[ 2.,  3.,  4.],
       [ 3.,  4.,  5.],
       [ 5.,  6.,  7.]])
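The difference between ret3 and ret6 is easy to miss, so here is a small sketch that contrasts paired index arrays with np.ix_, and also shows that fancy indexing returns a copy rather than a view. The names arr, rows, cols, paired and region are illustrative only.

```python
import numpy as np

arr = np.empty((8, 4))
for i in range(8):
    arr[i] = np.arange(i, i + 4)

rows, cols = [1, 3, 5], [1, 2, 3]

# Paired index arrays pick one element per (row, col) pair.
paired = arr[rows, cols]
# np.ix_ builds an open mesh, selecting every row/column combination.
region = arr[np.ix_(rows, cols)]

print(paired.shape)  # (3,)
print(region.shape)  # (3, 3)

# Fancy indexing returns a copy, so writing to the result leaves arr alone.
region[:] = 0
print(arr[1, 1])  # 2.0
```

This is the opposite of basic slicing, where the result is a view of the source array.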

## Boolean indexing

1. Basic usage of boolean arrays

In [16]:
import numpy as np

names = np.array(['aaa', 'bbb', 'ccc', 'ddd', 'eee', 'fff', 'ggg'])
data = np.arange(35).reshape((7, 5))
# == is applied to every element and returns a boolean array
mask = names == 'aaa'

In [17]:
mask

array([ True, False, False, False, False, False, False], dtype=bool)

In [18]:
names[mask]

array(['aaa'],
      dtype='<U3')

In [19]:
data

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24],
       [25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34]])

In [20]:
data[mask]

array([[0, 1, 2, 3, 4]])

In [21]:
data[names == 'bbb']

array([[5, 6, 7, 8, 9]])

2. Combining boolean arrays with slices and integers

In [22]:
import numpy as np

names = np.array(['aaa', 'bbb', 'ccc', 'ddd', 'eee', 'fff', 'ggg'])
data = np.arange(35).reshape((7, 5))

ret1 = data[names == 'ccc']

# Combine a boolean array with an integer index
ret2 = data[names == 'ccc', 2]

# Combine a boolean array with a slice
ret3 = data[names == 'ccc', 1:]

In [23]:
data

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24],
       [25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34]])

In [24]:
ret1

array([[10, 11, 12, 13, 14]])

In [25]:
ret2

array([12])

In [26]:
ret3

array([[11, 12, 13, 14]])
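When a boolean array is mixed with an integer or a slice, the mask selects along the first axis and must have one entry per row. The following sketch checks the result shapes and shows what happens with a mask of the wrong length; it assumes a recent NumPy release, where a mismatched mask raises IndexError. The names short_mask and mismatch_rejected are made up for this example.

```python
import numpy as np

names = np.array(['aaa', 'bbb', 'ccc', 'ddd', 'eee', 'fff', 'ggg'])
data = np.arange(35).reshape((7, 5))

mask = names == 'ccc'            # one boolean per row of data
print(data[mask, 2].shape)       # (1,)
print(data[mask, 1:].shape)      # (1, 4)

# A mask whose length differs from the axis it indexes is rejected.
short_mask = mask[:5]
try:
    data[short_mask]
    mismatch_rejected = False
except IndexError:
    mismatch_rejected = True
print(mismatch_rejected)
```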

3. Negating a condition with != and ~

In [29]:
import numpy as np

names = np.array(['aaa', 'bbb', 'ccc', 'ddd', 'eee', 'fff', 'ggg'])
data = np.arange(35).reshape((7, 5))

ret1 = data[names != 'ccc']
ret2 = data[~(names == 'ccc')]
ret3 = data[~(names > 'ccc')]

In [30]:
data

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24],
       [25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34]])

In [31]:
ret1

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24],
       [25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34]])

In [32]:
ret2

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24],
       [25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34]])

In [33]:
ret3

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

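4. Combining several boolean conditions with & (and) and | (or)
```
Note: Python's and / or keywords do not work on boolean arrays. They require a single truth value, so NumPy raises a ValueError. Use & and | instead, and put each comparison in parentheses, because & and | bind more tightly than == or <.
```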

In [34]:
import numpy as np

names = np.array(['aaa', 'bbb', 'ccc', 'ddd', 'eee', 'fff', 'ggg'])
data = np.arange(35).reshape((7, 5))

# Note: Python's and / or keywords do not work on boolean arrays; use & and |
ret1 = data[(names == 'aaa') | (names == 'ccc')]
ret2 = data[(names > 'ddd') | (names == 'aaa')]
ret3 = data[(names < 'eee') & (names > 'bbb') ]

In [35]:
data

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24],
       [25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34]])

In [36]:
ret1

array([[ 0,  1,  2,  3,  4],
       [10, 11, 12, 13, 14]])

In [37]:
ret2

array([[ 0,  1,  2,  3,  4],
       [20, 21, 22, 23, 24],
       [25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34]])

In [38]:
ret3

array([[10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])
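The note about and / or in this subsection can be verified directly. The sketch below deliberately triggers the error and then shows the element-wise form; error_raised is a hypothetical flag added only so the outcome can be inspected.

```python
import numpy as np

names = np.array(['aaa', 'bbb', 'ccc', 'ddd', 'eee', 'fff', 'ggg'])

# `or` asks the left-hand array for a single truth value, which is ambiguous.
error_raised = False
try:
    mask = (names == 'aaa') or (names == 'ccc')
except ValueError as exc:
    error_raised = True
    print('ValueError:', exc)

# The element-wise operator works; the parentheses are required.
mask = (names == 'aaa') | (names == 'ccc')
print(mask.nonzero()[0])  # [0 2]
```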

5. Setting values with a boolean array (a very common technique)

In [39]:
import numpy as np

ndarray1 = np.arange(5)
ndarray2 = np.arange(16).reshape((4, 4))
names = np.array(['aaa', 'bbb', 'ccc', 'ddd'])

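# Set every element of ndarray1 that is greater than 2 to 8
ndarray1[ndarray1 > 2] = 8

# Set every element in the 'aaa' row of ndarray2 to 0
ndarray2[names == 'aaa'] = 0
# Set the elements from column 2 onward in the 'bbb' row of ndarray2 to 1
ndarray2[names == 'bbb', 2:] = 1
# Set every element in the 'ccc' and 'ddd' rows of ndarray2 to 2
ndarray2[(names == 'ccc') | (names == 'ddd')] = 2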

In [40]:
ndarray1

array([0, 1, 2, 8, 8])

In [41]:
ndarray2

array([[0, 0, 0, 0],
       [4, 5, 1, 1],
       [2, 2, 2, 2],
       [2, 2, 2, 2]])

6. Using np.where
```
Given two arrays:
ndarray1 = np.array([6, 7, 8, 6, 8, 3, 4, 5, 8, 7])
ndarray2 = np.array([3, 5, 3, 7, 2, 1, 2, 2, 7, 4])
compare the values at corresponding positions and build a new array from the larger value of each pair.
```

In [44]:
import numpy as np

# Create the two arrays
ndarray1 = np.array([6, 7, 8, 6, 8, 3, 4, 5, 8, 7])
ndarray2 = np.array([3, 5, 3, 7, 2, 1, 2, 2, 7, 4])
# Pure-Python version: take n1 where the condition is True, otherwise n2
result1 = [n1 if c else n2 for n1, n2, c in zip(ndarray1, ndarray2, ndarray1 > ndarray2)]
# NumPy's where function does the same thing element-wise
# Usage: result = np.where(condition, value_if_true, value_if_false)
result2 = np.where(ndarray1 > ndarray2, ndarray1, ndarray2)

In [45]:
result1

[6, 7, 8, 7, 8, 3, 4, 5, 8, 7]

In [46]:
result2

array([6, 7, 8, 7, 8, 3, 4, 5, 8, 7])
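For this particular task (taking the larger of two values), np.maximum is a different function that happens to give the same result; it is shown here only as a cross-check, not as the approach used in this notebook. The sketch also shows that the two value arguments of np.where may be scalars. The names a, b, via_where, via_maximum and flags are illustrative.

```python
import numpy as np

a = np.array([6, 7, 8, 6, 8, 3, 4, 5, 8, 7])
b = np.array([3, 5, 3, 7, 2, 1, 2, 2, 7, 4])

via_where = np.where(a > b, a, b)
via_maximum = np.maximum(a, b)   # element-wise maximum of the two arrays
print(np.array_equal(via_where, via_maximum))  # True

# The value arguments may also be scalars, which are broadcast.
flags = np.where(a > b, 1, -1)
print(flags)  # [ 1  1  1 -1  1  1  1  1  1  1]
```

np.where is the more general tool, since the condition and the two value sources can be unrelated to each other.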

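#### Exercise
* Given the array ndarray3 = np.arange(32).reshape((8, 4)): 8 rows and 4 columns, holding the values 0 to 31 in order from left to right and top to bottom.
* Replace every element greater than 20 with 666.
* Replace every element greater than 13 and less than 17 with 100.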

In [47]:
import numpy as np

ndarray3 = np.arange(32).reshape((8, 4))
# Replace elements greater than 20 with 666
ret1 = np.where(ndarray3 > 20, 666, ndarray3)
# Replace elements greater than 13 and less than 17 with 100
ret2 = np.where(ndarray3 > 13, np.where(ndarray3 < 17, 100, ndarray3), ndarray3)

In [48]:
ndarray3

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23],
       [24, 25, 26, 27],
       [28, 29, 30, 31]])

In [49]:
ret1

array([[  0,   1,   2,   3],
       [  4,   5,   6,   7],
       [  8,   9,  10,  11],
       [ 12,  13,  14,  15],
       [ 16,  17,  18,  19],
       [ 20, 666, 666, 666],
       [666, 666, 666, 666],
       [666, 666, 666, 666]])

In [50]:
ret2

array([[  0,   1,   2,   3],
       [  4,   5,   6,   7],
       [  8,   9,  10,  11],
       [ 12,  13, 100, 100],
       [100,  17,  18,  19],
       [ 20,  21,  22,  23],
       [ 24,  25,  26,  27],
       [ 28,  29,  30,  31]])
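The nested np.where call in the exercise solution can also be written with a single combined condition, using the & operator from the boolean indexing section. This is one possible alternative, sketched with illustrative names (arr, nested, combined, in_place), together with an in-place variant that uses boolean-mask assignment on a copy.

```python
import numpy as np

arr = np.arange(32).reshape((8, 4))

nested = np.where(arr > 13, np.where(arr < 17, 100, arr), arr)
combined = np.where((arr > 13) & (arr < 17), 100, arr)
print(np.array_equal(nested, combined))  # True

# In-place variant on a copy, using boolean-mask assignment.
in_place = arr.copy()
in_place[(in_place > 13) & (in_place < 17)] = 100
print(in_place[3])  # [ 12  13 100 100]
```

np.where leaves the input untouched and returns a new array, whereas mask assignment modifies the array it is applied to, which is why the sketch works on a copy.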