## Chapter 4: Arrays and Vectorized Computation

Here is an example of the efficiency gains from using NumPy arrays instead of Python lists:

In [None]:
import numpy as np

my_arr = np.arange(1_000_000)

my_list = list(range(1_000_000))

%timeit my_arr2 = my_arr * 2

In [None]:
%timeit my_list2 = [x * 2 for x in my_list]
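
The %timeit magic only works inside IPython or Jupyter. As a rough sketch of the same comparison in a plain Python script, the standard-library timeit module can be used. The array size `n` and the repetition count are arbitrary choices made for this illustration, and the timings will vary by machine.

```python
import timeit

import numpy as np

n = 100_000  # assumed size, smaller than above so the script finishes quickly
my_arr = np.arange(n)
my_list = list(range(n))

# Total time for 10 repetitions of each approach
numpy_time = timeit.timeit(lambda: my_arr * 2, number=10)
list_time = timeit.timeit(lambda: [x * 2 for x in my_list], number=10)

print(f"NumPy: {numpy_time:.4f} s, list comprehension: {list_time:.4f} s")
```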

### 4.1 The NumPy ndarray: A Multidimensional Array Object

In [None]:
import numpy as np

data = np.array([[1.5, -0.1, 3.0], [0, -3, 6.5]])

print(data)
data

In [None]:
print(data)
print(data * 10)
print(data + data)

In [None]:
print(data.shape)
print(data.dtype)
print(data.ndim)

#### Creating ndarrays

In [None]:
data1 = [6, 7.5, 8, 0, 1]
arr1 = np.array(data1)
arr1

In [None]:
data2 = [[1, 2, 3, 4], [5, 6, 7, 8]]
arr2 = np.array(data2)
arr2

In [None]:
print(arr2.ndim)
print(arr2.shape)

In [None]:
print(arr1.dtype)
print(arr2.dtype)

In [None]:
print(np.zeros(10))
print(np.zeros((3, 6)))
print(np.empty((2, 3, 2)))

The numpy.arange function is an array-valued version of the built-in range function. It returns an ndarray of evenly spaced values within a given interval instead of a range object.

In [None]:
np.arange(15)
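
Like range, arange also accepts start, stop, and step arguments. The following is a small sketch with values chosen only for illustration:

```python
import numpy as np

# start, stop (exclusive), and step, just like range
evens = np.arange(2, 11, 2)
print(evens)  # [ 2  4  6  8 10]

# Unlike range, arange also accepts a non-integer step
halves = np.arange(0, 2, 0.5)
print(halves)  # [0.  0.5 1.  1.5]
```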

#### Data Types for ndarrays

In [None]:
arr1 = np.array([1, 2, 3], dtype=np.float64)
print(arr1.dtype)
arr2 = np.array([1, 2, 3], dtype=np.int32)
print(arr2.dtype)

We can cast a NumPy array to a different data type using the astype method, which returns a new array and leaves the original unchanged. This is useful when we want to ensure that our data is in a specific format for calculations or storage.

In [None]:
arr = np.arange(6)
print(arr)
print(arr.dtype)

float_arr = arr.astype(np.float64)
print(float_arr)
print(float_arr.dtype)

If we cast floating-point numbers to an integer data type, the decimal part is truncated, not rounded. If we cast integers to a floating-point type, we get the same values stored as floats.

In [None]:
arr = np.array([3.7, -1.2, -2.6, 0.5, 12.9, 10.1])
print(arr)

arr.astype(np.int32)
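
To make the difference between truncating and rounding concrete, here is a small sketch. It uses np.rint to round to the nearest integer before casting; the sample values are chosen for illustration. Note that np.rint rounds halves to the nearest even integer, so 0.5 becomes 0.

```python
import numpy as np

arr = np.array([3.7, -1.2, -2.6, 0.5])

# astype drops the fractional part (truncates toward zero)
truncated = arr.astype(np.int32)

# Rounding first gives the nearest integer instead
rounded = np.rint(arr).astype(np.int32)

print(truncated)  # [ 3 -1 -2  0]
print(rounded)    # [ 4 -1 -3  0]
```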

We can convert an array of strings that represent numbers to floats using astype(float). This is useful when numbers arrive as text and we want to perform calculations on them. NumPy raises a ValueError if any of the strings cannot be converted.

In [None]:
numeric_strings = np.array(['1.25', '-9.6', '42'], dtype = np.str_)
print(numeric_strings)

numeric_strings.astype(float)

In [None]:
int_array = np.arange(10)
print(int_array)

calibers = np.array([.22, .270, .357, .380, .44, .50], dtype=np.float64)
print(calibers)

int_array.astype(calibers.dtype)

In [None]:
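# Without a dtype argument, np.zeros returns float64 values
zeros_uint32 = np.zeros(10)
print(zeros_uint32)
# "u4" is the shorthand type code for uint32
zeros_uint32 = np.zeros(10, dtype="u4")
zeros_uint32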

#### Arithmetic with NumPy Arrays

You can perform arithmetic operations on NumPy arrays without writing loops. Operations between arrays of equal shape are applied element-wise: each element is combined with the element at the same position in the other array.
Python lists behave differently: + concatenates and * repeats, so element-wise arithmetic on lists requires an explicit loop or comprehension.

In [None]:
arr = np.array([[1., 2., 3.], [4., 5., 6.]])
print(arr)
print(arr * arr)
print(arr - arr)
print(arr + arr)
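
As a short sketch of that contrast, the same operators applied to a list and to an array give quite different results. The names `values` and `doubled` are made up for this example.

```python
import numpy as np

values = [1.0, 2.0, 3.0]
arr = np.array(values)

print(values * 2)  # [1.0, 2.0, 3.0, 1.0, 2.0, 3.0] (the list is repeated)
print(arr * 2)     # [2. 4. 6.] (each element is doubled)

print(values + values)  # [1.0, 2.0, 3.0, 1.0, 2.0, 3.0] (concatenation)
print(arr + arr)        # [2. 4. 6.] (element-wise sum)

# Element-wise arithmetic on a list needs an explicit comprehension
doubled = [x * 2 for x in values]
print(doubled)  # [2.0, 4.0, 6.0]
```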

Arithmetic operations with scalars propagate the scalar to each element in the array, so every element is transformed without an explicit loop.

In [None]:
print(1 / arr)
print(arr ** 2)

Comparisons between arrays of the same shape yield a Boolean array of the same shape.
The comparison is vectorized: it is applied element-wise, and the result can be used to filter or mask data based on conditions.

In [None]:
print(arr)

arr2 = np.array([[0., 4., 1.], [7., 2., 12.]])
print(arr2)

arr2 > arr
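
A Boolean array produced this way can be summarized directly, because True counts as 1 and False as 0. The sketch below repeats the two arrays from the cell above and stores the comparison in a variable named `mask`, a name introduced only for this example.

```python
import numpy as np

arr = np.array([[1., 2., 3.], [4., 5., 6.]])
arr2 = np.array([[0., 4., 1.], [7., 2., 12.]])

mask = arr2 > arr
print(mask)
# [[False  True False]
#  [ True False  True]]

print(mask.sum())  # 3, the number of positions where arr2 is larger
print(mask.any())  # True, at least one position satisfies the condition
print(mask.all())  # False, not every position satisfies it
```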

#### Basic Indexing and Slicing

In [None]:
arr = np.arange(10)
print(arr)

print(arr[0])
print(arr[1])
print(arr[5])
print(arr[-1])
print(arr[-2])

print(arr[5:8])

arr[5:8] = 12
print(arr)

An important first distinction from Python's built-in lists is that array slices are views on the original array: the data is not copied, so any modification to the slice is reflected in the source array. Slicing a Python list, by contrast, creates a new list.
Keep this in mind when working with NumPy arrays, because assigning to a slice can change the original data unintentionally. To get an independent copy of a slice, call its copy method explicitly.

In [None]:
arr_slice = arr[5:8]
print(arr_slice)

arr_slice[1] = 12345
print(arr)

In [None]:
arr_slice[:] = 64
print(arr)
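
One way to check whether two arrays share data is np.shares_memory. The sketch below contrasts a view, an explicit copy, and a list slice; the variable names are chosen for illustration.

```python
import numpy as np

arr = np.arange(10)

view = arr[5:8]         # a view: shares memory with arr
copy = arr[5:8].copy()  # an independent copy of the same elements

print(np.shares_memory(arr, view))  # True
print(np.shares_memory(arr, copy))  # False

# Modifying the copy leaves the original untouched
copy[:] = -1
print(arr)  # [0 1 2 3 4 5 6 7 8 9]

# A list slice is always a new list
py_list = list(range(10))
list_slice = py_list[5:8]
list_slice[0] = -1
print(py_list)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```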

In [None]:
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(arr2d)
print(arr2d[2])
print(arr2d[0][2])

In [None]:
arr3d = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
print(arr3d)
print(arr3d[0])

In [None]:
print(arr3d)

old_values = arr3d[0].copy()
print(old_values)

arr3d[0] = 42
print(arr3d)

arr3d[0] = old_values
print(arr3d)

In [None]:
print(arr3d[1, 0])
print(arr3d[1, 0, 2])

In [None]:
print(arr3d)
x = arr3d[1]
print(x)
print(x[0])

#### Indexing with Slices

Like Python lists, ndarrays can be sliced with the familiar start:stop syntax, and a higher-dimensional array can be sliced along several axes at once. A slice is a view, so you can extract or modify part of an array without copying the data.
This is useful for working with large datasets, as it avoids unnecessary copies.

In [None]:
print(arr)
print(arr[1:6])
print(arr[:6])
print(arr[6:])

In [None]:
print(arr2d)
print(arr2d[:2])

In [None]:
print(arr2d)
arr2d[:2, 1:]

In [None]:
lower_dim_slice = arr2d[1, :2]
print(lower_dim_slice)
print(lower_dim_slice.shape)
print(lower_dim_slice.dtype)

In [None]:
print(arr2d)
arr2d[:2, 2]

In [None]:
print(arr2d)
print(arr2d[:, :1])
print(arr2d[:, 0])
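
The cells above mix integer indexes and slices. As a sketch of the rule at work, each integer index removes one dimension from the result, while a slice keeps that dimension even when it selects a single row or column:

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Slices on both axes: the result stays two-dimensional
print(arr2d[:, :1].shape)   # (3, 1)
print(arr2d[1:2, :].shape)  # (1, 3)

# An integer index on one axis: the result is one-dimensional
print(arr2d[:, 0].shape)    # (3,)
print(arr2d[1, :].shape)    # (3,)
```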

In [None]:
print(arr2d)
arr2d[:2, 1:] = 0
print(arr2d)

#### Boolean Indexing

Boolean indexing lets you select elements based on conditions. Applying a comparison to an array produces a Boolean array, which can then be used as an index into another array whose length along the indexed axis matches.
This is useful for filtering data, such as selecting values greater than a threshold or the rows that correspond to a particular name.

In [None]:
names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
data = np.array([[4, 7], [0, 2], [-5, 6], [0, 0], [1,2], [-12, -4], [3, 4]])
print(names)
print(data)

In [None]:
print(names == "Bob")
data[names == "Bob"]

In [None]:
print(data)
print(data[names == "Bob", 1:])
print(data[names == "Bob", 1])

In [None]:
print(names != "Bob")
print(~(names == "Bob"))
print(data[~(names == "Bob")])

In [None]:
cond = names == "Bob"
print(data[~cond])

In [None]:
print(data)
print(names)
mask = (names == "Bob") | (names == "Will")
print(mask)
print(data[mask])

In [None]:
print(data)
data[data < 0] = 0
print(data)

In [None]:
print(names)
data[names != "Joe"] = 7
print(data)
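
Two details are easy to miss. Selecting with a Boolean array returns a copy, unlike slicing, and conditions are combined with & and | because the Python keywords and and or do not work on Boolean arrays. The sketch below rebuilds the original `names` and `data` arrays; `selected` and `both` are names introduced only for this example.

```python
import numpy as np

names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
data = np.array([[4, 7], [0, 2], [-5, 6], [0, 0], [1, 2], [-12, -4], [3, 4]])

# Boolean selection copies the data, so changing the result
# does not change the original array
selected = data[names == "Bob"]
selected[:] = 99
print(data[0])  # [4 7]

# Combine conditions with & (and) or | (or); the parentheses are required
both = (names == "Joe") & (data[:, 0] > 0)
print(data[both])  # [[3 4]]
```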

#### Fancy Indexing

In [None]:
arr = np.zeros((8, 4))

for i in range(8):
    arr[i] = i
print(arr)

In [None]:
print(arr)
print(arr[[4, 3, 0, 6]])
print(arr[[-3, -5, -7]])

In [None]:
arr = np.arange(32).reshape((8, 4))
print(arr)
arr[[1, 5, 7, 2], [0, 3, 1, 2]]

In [None]:
print(arr)
print(arr[[1, 5, 7, 2]])
print(arr[[1, 5, 7, 2]][:, [0, 3, 1, 2]])

In [None]:
print(arr)
print(arr[[1, 5, 7, 2], [0, 3, 1, 2]])
arr[[1, 5, 7, 2], [0, 3, 1, 2]] = 0
print(arr)

#### Transposing Arrays and Swapping Axes

Transposing swaps the axes of an array; for a two-dimensional array, it swaps the rows and columns. Both the T attribute and the transpose method return a view on the underlying data without copying anything.
This is useful for changing the orientation of data, for example when computing the matrix product of an array with its own transpose.
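
A minimal sketch of these operations follows. The array shapes and the names `gram` and `arr3d` are chosen only for illustration.

```python
import numpy as np

arr = np.arange(15).reshape((3, 5))

# Transposing swaps rows and columns and returns a view
print(arr.T.shape)                   # (5, 3)
print(np.shares_memory(arr, arr.T))  # True

# Matrix product of the transpose with the original array
gram = arr.T @ arr
print(gram.shape)  # (5, 5)

# For higher-dimensional arrays, swapaxes exchanges two axes and
# transpose accepts a full permutation of the axes
arr3d = np.arange(24).reshape((2, 3, 4))
print(arr3d.swapaxes(1, 2).shape)        # (2, 4, 3)
print(arr3d.transpose((1, 0, 2)).shape)  # (3, 2, 4)
```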