NumPy 100 exercises:

https://github.com/rougier/numpy-100/blob/master/100_Numpy_exercises.md

IgLM: Generative language modeling for antibody design

Google Colab notebook: https://colab.research.google.com/github/Graylab/IgLM/blob/main/IgLM.ipynb#scrollTo=5AEsv1z5KXXA

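Basic ways to create and manipulate arrays with NumPy:

np.array creates an array from a (nested) Python list.

.shape returns the array's dimensions as a tuple.

Elements can be read and modified by index.

A one-dimensional array has a shape such as (3,); a two-dimensional array has a shape such as (2, 3).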

In [1]:
import numpy as np

a = np.array([1, 2, 3])   # Create a rank 1 array
print(type(a))            # Prints "<class 'numpy.ndarray'>"
print(a.shape)            # Prints "(3,)"
print(a[0], a[1], a[2])   # Prints "1 2 3"
a[0] = 5                  # Change an element of the array
print(a)                  # Prints "[5 2 3]"

b = np.array([[1,2,3],[4,5,6]])    # Create a rank 2 array
print(b.shape)                     # Prints "(2, 3)"
print(b[0, 0], b[0, 1], b[1, 0])   # Prints "1 2 4"

<class 'numpy.ndarray'>
(3,)
1 2 3
[5 2 3]
(2, 3)
1 2 4
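
As a small supplement to the cell above, the following sketch relates .shape to two other array attributes. The attributes `ndim` and `size` do not appear in the original notes and are introduced here only for illustration.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([[1, 2, 3], [4, 5, 6]])

# ndim is the number of axes (the "rank"); it equals len(shape).
print(a.ndim, a.shape)   # 1 (3,)
print(b.ndim, b.shape)   # 2 (2, 3)

# size is the total number of elements, i.e. the product of the shape entries.
print(b.size)            # 6

# Assigning through an index changes the array in place.
b[1, 2] = 60
print(b[1])              # [ 4  5 60]
```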


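Creating arrays with different initial values:

These functions quickly build arrays with a given initialization and are useful in data processing and scientific computing:

np.zeros / np.ones: filled with 0 or 1;

np.full: filled with a specified constant;

np.eye: an identity matrix (useful in linear algebra);

np.random.random: an array of the given shape filled with random values drawn uniformly from [0, 1).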

In [2]:
import numpy as np

a = np.zeros((2,2))   # Create an array of all zeros
print(a)              # Prints "[[0. 0.]
                      #          [0. 0.]]"

b = np.ones((1,2))    # Create an array of all ones
print(b)              # Prints "[[1. 1.]]"

c = np.full((2,2), 7)  # Create a constant array; the integer fill value gives an integer dtype
print(c)               # Prints "[[7 7]
                       #          [7 7]]"

d = np.eye(2)         # Create a 2x2 identity matrix
print(d)              # Prints "[[1. 0.]
                      #          [0. 1.]]"

e = np.random.random((2,2))  # Create an array filled with random values in [0, 1)
print(e)                     # Might print "[[0.22370715 0.65774805]
                             #               [0.07732815 0.63378913]]"


[[0. 0.]
 [0. 0.]]
[[1. 1.]]
[[7 7]
 [7 7]]
[[1. 0.]
 [0. 1.]]
[[0.22370715 0.65774805]
 [0.07732815 0.63378913]]
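
The output above shows np.full((2,2), 7) printing integers while the other constructors print floats. The sketch below checks the dtypes directly. The variable names `c_int`, `c_float`, and `c_forced` are made up for this illustration.

```python
import numpy as np

c_int = np.full((2, 2), 7)                        # integer fill value
c_float = np.full((2, 2), 7.0)                    # float fill value
c_forced = np.full((2, 2), 7, dtype=np.float64)   # dtype given explicitly

print(c_int.dtype.kind)   # i  (a signed integer type; the width is platform-dependent)
print(c_float.dtype)      # float64
print(c_forced.dtype)     # float64

# np.zeros, np.ones and np.eye default to float64 when no dtype is given.
print(np.zeros((2, 2)).dtype, np.ones((1, 2)).dtype, np.eye(2).dtype)

# np.random.random draws from the half-open interval [0, 1).
e = np.random.random((2, 2))
print(e.shape, bool(((e >= 0) & (e < 1)).all()))   # (2, 2) True
```

If float values are wanted from np.full, passing a float fill value or an explicit dtype avoids surprises such as integer truncation on later assignment.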


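Array slicing in NumPy: a slice is a view of the original array

Use : to slice out a subarray;

A slice is a view into the original array's data, not a copy;

Modifying the view also modifies the original array.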

In [2]:
import numpy as np

# Create the following rank 2 array with shape (3, 4)
# [[ 1  2  3  4]
#  [ 5  6  7  8]
#  [ 9 10 11 12]]
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])

# Use slicing to pull out the subarray consisting of the first 2 rows
# and columns 1 and 2; b is the following array of shape (2, 2):
# [[2 3]
#  [6 7]]
b = a[:2, 1:3]

# A slice of an array is a view into the same data, so modifying it
# will modify the original array.
print(a[0, 1])   # Prints "2"
b[0, 0] = 77     # b[0, 0] is the same piece of data as a[0, 1]
print(a[0, 1])   # Prints "77"


2
77
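
When an independent subarray is needed, an explicit copy avoids writing through to the original. The sketch below contrasts the two cases. `.copy()` and `np.shares_memory` are not used in the original notes and are added here for illustration.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

view = a[:2, 1:3]          # slice: shares memory with a
copy = a[:2, 1:3].copy()   # explicit copy: independent data

print(np.shares_memory(a, view))   # True
print(np.shares_memory(a, copy))   # False

copy[0, 0] = 77    # does not touch a
print(a[0, 1])     # 2

view[0, 0] = 77    # writes through to a
print(a[0, 1])     # 77
```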


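Slicing versus integer indexing in NumPy:

Integer index mixed with slices: the result has lower rank (the indexed dimension is dropped)

Slices only: the result keeps the rank of the original array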

In [3]:
import numpy as np

# Create the following rank 2 array with shape (3, 4)
# [[ 1  2  3  4]
#  [ 5  6  7  8]
#  [ 9 10 11 12]]
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])

# Two ways of accessing the data in the middle row of the array.
# Mixing integer indexing with slices yields an array of lower rank,
# while using only slices yields an array of the same rank as the
# original array:
row_r1 = a[1, :]    # Rank 1 view of the second row of a
row_r2 = a[1:2, :]  # Rank 2 view of the second row of a
print(row_r1, row_r1.shape)  # Prints "[5 6 7 8] (4,)"
print(row_r2, row_r2.shape)  # Prints "[[5 6 7 8]] (1, 4)"

# We can make the same distinction when accessing columns of an array:
col_r1 = a[:, 1]
col_r2 = a[:, 1:2]
print(col_r1, col_r1.shape)  # Prints "[ 2  6 10] (3,)"
print(col_r2, col_r2.shape)  # Prints "[[ 2]
                             #          [ 6]
                             #          [10]] (3, 1)"

[5 6 7 8] (4,)
[[5 6 7 8]] (1, 4)
[ 2  6 10] (3,)
[[ 2]
 [ 6]
 [10]] (3, 1)
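
The two forms can be converted into each other. As a minimal sketch, the rank 1 column below is turned back into a (3, 1) column with `np.newaxis`, which is not part of the original notes and is assumed here only to illustrate the shape difference.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

col_r1 = a[:, 1]      # integer index on axis 1: rank 1, shape (3,)
col_r2 = a[:, 1:2]    # slice on axis 1: rank 2, shape (3, 1)

# A rank 1 result can be turned back into a column with np.newaxis.
col_again = col_r1[:, np.newaxis]
print(col_again.shape)                     # (3, 1)
print(np.array_equal(col_again, col_r2))   # True

# Both forms are views of a, not copies.
print(np.shares_memory(a, col_r1), np.shares_memory(a, col_r2))   # True True
```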


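Integer array indexing in NumPy:

Given two integer arrays [i1, i2, ...] and [j1, j2, ...], several elements can be extracted at once: a[[i1, i2], [j1, j2]] returns [a[i1, j1], a[i2, j2]].

The returned one-dimensional array has the same length as the index arrays.

Indices may repeat and may pick from any rows and columns, which makes this well suited to gathering elements from arbitrary positions.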

In [4]:
import numpy as np

a = np.array([[1,2], [3, 4], [5, 6]])

# An example of integer array indexing.
# The returned array will have shape (3,).
print(a[[0, 1, 2], [0, 1, 0]])  # Prints "[1 4 5]"

# The above example of integer array indexing is equivalent to this:
print(np.array([a[0, 0], a[1, 1], a[2, 0]]))  # Prints "[1 4 5]"

# When using integer array indexing, you can reuse the same
# element from the source array:
print(a[[0, 0], [1, 1]])  # Prints "[2 2]"

# Equivalent to the previous integer array indexing example
print(np.array([a[0, 1], a[0, 1]]))  # Prints "[2 2]"


[1 4 5]
[1 4 5]
[2 2]
[2 2]
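
A common use of integer array indexing is picking (or updating) one element from each row. The sketch below assumes a 4x3 array and a made-up column choice `cols`; neither comes from the original notes.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
cols = np.array([0, 2, 0, 1])   # hypothetical column choice, one per row
rows = np.arange(4)             # row indices 0..3

# Pairs up (0, 0), (1, 2), (2, 0), (3, 1).
print(a[rows, cols])            # [ 1  6  7 11]

# The same index pair can be used on the left-hand side to update in place.
a[rows, cols] += 10
print(a)
# [[11  2  3]
#  [ 4  5 16]
#  [17  8  9]
#  [10 21 12]]
```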


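Boolean indexing in NumPy:

Boolean indexing uses a Boolean array to select the elements whose corresponding position is True;

The result is a one-dimensional array whose elements keep the order of the original array (row by row);

It is concise and efficient, and is commonly used to select the part of an array that satisfies a condition.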

In [5]:
import numpy as np

a = np.array([[1,2], [3, 4], [5, 6]])

bool_idx = (a > 2)   # Find the elements of a that are bigger than 2;
                     # this returns a numpy array of Booleans of the same
                     # shape as a, where each slot of bool_idx tells
                     # whether that element of a is > 2.

print(bool_idx)      # Prints "[[False False]
                     #          [ True  True]
                     #          [ True  True]]"

# We use boolean array indexing to construct a rank 1 array
# consisting of the elements of a corresponding to the True values
# of bool_idx
print(a[bool_idx])  # Prints "[3 4 5 6]"

# We can do all of the above in a single concise statement:
print(a[a > 2])     # Prints "[3 4 5 6]"

[[False False]
 [ True  True]
 [ True  True]]
[3 4 5 6]
[3 4 5 6]
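
Boolean masks can also be combined and used for assignment. This is a minimal sketch. The compound condition and the name `mask` are chosen here for illustration and are not from the original notes.

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])

# Combine conditions with & (and) or | (or); the parentheses are required
# because these operators bind more tightly than the comparisons.
mask = (a > 2) & (a % 2 == 0)
print(a[mask])      # [4 6]

# A mask on the left-hand side assigns to the selected positions in place.
a[mask] = 0
print(a)
# [[1 2]
#  [3 0]
#  [5 0]]

# Counting matches does not require extracting them first.
print(np.count_nonzero(a > 2))   # 2
```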


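Automatic inference and explicit specification of array data types (dtype):

By default NumPy infers a suitable dtype from the data;

The dtype argument forces a particular data type;

The dtype affects both the memory an array occupies and computation performance.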

In [6]:
import numpy as np

x = np.array([1, 2])   # Let numpy choose the datatype
print(x.dtype)         # Prints "int64" on most platforms; the output below
                       # shows "int32", the default on Windows before NumPy 2.0

x = np.array([1.0, 2.0])   # Let numpy choose the datatype
print(x.dtype)             # Prints "float64"

x = np.array([1, 2], dtype=np.int64)   # Force a particular datatype
print(x.dtype)                         # Prints "int64"


int32
float64
int64
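
The effect of dtype on memory can be checked directly. The sketch below uses `astype`, `itemsize`, and `nbytes`, which are not in the original notes and are added for illustration.

```python
import numpy as np

x64 = np.array([1, 2, 3, 4], dtype=np.int64)
x8 = x64.astype(np.int8)          # astype returns a converted copy

print(x64.itemsize, x64.nbytes)   # 8 32  (bytes per element, total bytes)
print(x8.itemsize, x8.nbytes)     # 1 4

# Mixing ints and floats makes NumPy infer a float dtype for the whole array.
mixed = np.array([1, 2.5])
print(mixed.dtype)                # float64
```

Smaller integer types save memory but wrap around or raise on out-of-range values, so the narrower dtype is only safe when the value range is known.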


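Common elementwise arithmetic in NumPy (addition, subtraction, multiplication, division, and square root), applied to two-dimensional float arrays of the same shape:

The operators (+, -, *, /) and the corresponding NumPy functions (np.add, np.subtract, np.multiply, np.divide) perform the same elementwise computation and give identical results.

Elementwise operations act position by position on arrays of the same shape.

Functions such as np.sqrt apply a mathematical operation to each element of an array.

These operations are concise and efficient, and form the basis of NumPy array computation.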

In [8]:
import numpy as np

x = np.array([[1,2],[3,4]], dtype=np.float64)
y = np.array([[5,6],[7,8]], dtype=np.float64)

# Elementwise sum; both produce the array
# [[ 6.0  8.0]
#  [10.0 12.0]]
print(x + y)
print(np.add(x, y))

# Elementwise difference; both produce the array
# [[-4.0 -4.0]
#  [-4.0 -4.0]]
print(x - y)
print(np.subtract(x, y))

# Elementwise product; both produce the array
# [[ 5.0 12.0]
#  [21.0 32.0]]
print(x * y)
print(np.multiply(x, y))

# Elementwise division; both produce the array
# [[ 0.2         0.33333333]
#  [ 0.42857143  0.5       ]]
print(x / y)
print(np.divide(x, y))

# Elementwise square root; produces the array
# [[ 1.          1.41421356]
#  [ 1.73205081  2.        ]]
print(np.sqrt(x))


[[ 6.  8.]
 [10. 12.]]
[[ 6.  8.]
 [10. 12.]]
[[-4. -4.]
 [-4. -4.]]
[[-4. -4.]
 [-4. -4.]]
[[ 5. 12.]
 [21. 32.]]
[[ 5. 12.]
 [21. 32.]]
[[0.2        0.33333333]
 [0.42857143 0.5       ]]
[[0.2        0.33333333]
 [0.42857143 0.5       ]]
[[1.         1.41421356]
 [1.73205081 2.        ]]
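
Because * is elementwise, it is easy to confuse with matrix multiplication. The sketch below contrasts the two on the same x and y. The `@` operator and `np.dot` are not covered in the notes above and appear here only to mark the difference.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]], dtype=np.float64)
y = np.array([[5, 6], [7, 8]], dtype=np.float64)

# Elementwise product: each entry is x[i, j] * y[i, j].
print(x * y)
# [[ 5. 12.]
#  [21. 32.]]

# Matrix product: each entry is a row of x dotted with a column of y.
print(x @ y)
# [[19. 22.]
#  [43. 50.]]

# For two-dimensional arrays, @ and np.dot agree.
print(np.array_equal(x @ y, np.dot(x, y)))   # True
```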
