# NumPy: the absolute basics for beginners

## This tutorial closely follows [NumPy: the absolute basics for beginners](https://numpy.org/doc/stable/user/absolute_beginners.html) from the official NumPy documentation.

***

We shorten the imported name to `np` for better readability of code using NumPy. This is a widely adopted convention that you should follow so that anyone working with your code can easily understand it.

In [115]:
# import numpy
import numpy as np
rng = np.random.default_rng(0)

### How to create a basic array

> `np.array()`

> `np.zeros()`

> `np.ones()`

> `np.empty()`

> `np.arange()`

> `np.linspace()`

In [3]:
np.array([[1, 2, 3, 4], [1, 2, 3, 4]])

array([[1, 2, 3, 4],
       [1, 2, 3, 4]])

In [4]:
np.zeros(5)

array([0., 0., 0., 0., 0.])

In [7]:
np.ones([5, 5])

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])

Or even an empty array! The function `empty` creates an array whose initial content is arbitrary (uninitialized) and depends on the state of the memory. **The reason to use `empty` over `zeros` (or something similar) is speed**. Just make sure to fill every element afterwards!

In [8]:
np.empty([3, 2])

array([[4.67121671e-310, 0.00000000e+000],
       [0.00000000e+000, 0.00000000e+000],
       [0.00000000e+000, 0.00000000e+000]])
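
As a minimal sketch of the "fill every element afterwards" pattern, the snippet below allocates with `np.empty` and then assigns each position before reading any of them. The names `n` and `squares` are illustrative only and do not come from the original tutorial.

```python
import numpy as np

n = 5
# Allocate without initializing: the contents are arbitrary at this point
squares = np.empty(n, dtype=np.int64)

# Overwrite every element before reading any of them
for i in range(n):
    squares[i] = i * i

print(squares.tolist())  # [0, 1, 4, 9, 16]
```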

In [9]:
# create range of elements
np.arange(10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [11]:
np.arange(2, 9, 3)

array([2, 5, 8])

In [14]:
np.linspace(1, 2, 5, endpoint=True)

array([1.  , 1.25, 1.5 , 1.75, 2.  ])

#### Specifying your data type

> `dtype`

In [16]:
np.ones(2, dtype=np.float32)

array([1., 1.], dtype=float32)

### Adding, removing, and sorting elements 

> `numpy.sort(a, axis=-1, kind=None, order=None)`

Return a sorted copy of an array.

**Parameters:**
1. **a : array_like**

Array to be sorted.

2. **axis : int or None, optional**

Axis along which to sort. If None, the array is flattened before sorting. The default is -1, which sorts along the last axis.

3. **kind : {‘quicksort’, ‘mergesort’, ‘heapsort’, ‘stable’}, optional**

Sorting algorithm. The default is ‘quicksort’. Note that both ‘stable’ and ‘mergesort’ use timsort or radix sort under the covers and, in general, the actual implementation will vary with data type. The ‘mergesort’ option is retained for backwards compatibility.

Changed in version 1.15.0: The ‘stable’ option was added.

4. **order : str or list of str, optional**

When a is an array with fields defined, this argument specifies which fields to compare first, second, etc. A single field can be specified as a string, and not all fields need be specified, but unspecified fields will still be used, in the order in which they come up in the dtype, to break ties.

> `numpy.concatenate((a1, a2, ...), axis=0, out=None, dtype=None, casting="same_kind")`

Join a sequence of arrays along an existing axis.



In [17]:
arr = np.array([2, 1, 5, 3, 7, 4, 6, 8])

In [18]:
np.sort(arr)

array([1, 2, 3, 4, 5, 6, 7, 8])
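
To illustrate the `axis` parameter described above, here is a small sketch on a 2-D array. The array `m` and the result names are made up for this example.

```python
import numpy as np

m = np.array([[3, 7, 2],
              [9, 1, 8]])

sorted_rows = np.sort(m)             # default axis=-1: sort within each row
sorted_cols = np.sort(m, axis=0)     # sort within each column
sorted_flat = np.sort(m, axis=None)  # flatten first, then sort

print(sorted_rows.tolist())  # [[2, 3, 7], [1, 8, 9]]
print(sorted_cols.tolist())  # [[3, 1, 2], [9, 7, 8]]
print(sorted_flat.tolist())  # [1, 2, 3, 7, 8, 9]
```

Note that `np.sort` returns a sorted copy, so `m` itself is left unchanged.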

In [19]:
a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])

In [21]:
np.concatenate((a, b))

array([1, 2, 3, 4, 5, 6, 7, 8])

In [37]:
x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6]])
np.concatenate((x, y))

array([[1, 2],
       [3, 4],
       [5, 6]])

In [38]:
x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])
np.concatenate((x, y), axis=1)

array([[1, 2, 5, 6],
       [3, 4, 7, 8]])

### How do you know the shape and size of an array?

> `ndarray.ndim`

will tell you the number of axes, or dimensions, of the array.

> `ndarray.size`

will tell you the total number of elements of the array. This is the product of the elements of the array’s shape.

> `ndarray.shape`

will display a tuple of integers that indicate the number of elements stored along each dimension of the array. If, for example, you have a 2-D array with 2 rows and 3 columns, the shape of your array is (2, 3).

In [47]:
array_example = np.arange(24).reshape(2, 3, -1)
array_example

array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])

In [48]:
array_example.ndim

3

In [49]:
array_example.size

24

In [50]:
array_example.shape

(2, 3, 4)
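
The relationship between the three attributes can be checked directly. This is a small sketch with an illustrative array `arr`: `ndim` is the length of `shape`, and `size` is the product of its entries.

```python
import numpy as np

arr = np.arange(24).reshape(2, 3, 4)

ndim_matches = arr.ndim == len(arr.shape)            # 3 axes, 3 shape entries
size_matches = arr.size == int(np.prod(arr.shape))   # 2 * 3 * 4 == 24

print(ndim_matches, size_matches)  # True True
```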

### Can you reshape an array?


> `numpy.reshape(a, newshape, order='C')`

will give a new shape to an array without changing the data. Just remember that when you use the reshape method, the array you want to produce needs to have the same number of elements as the original array.

**Parameters :**

1. **a : array_like**

Array to be reshaped.

2. **newshape : int or tuple of ints**

The new shape should be compatible with the original shape. If an integer, then the result will be a 1-D array of that length. One shape dimension can be -1. In this case, the value is inferred from the length of the array and remaining dimensions.

3. **order : {‘C’, ‘F’, ‘A’}, optional** 

Read the elements of a using this index order, and place the elements into the reshaped array using this index order. ‘C’ means to read / write the elements using C-like index order, with the last axis index changing fastest, back to the first axis index changing slowest. ‘F’ means to read / write the elements using Fortran-like index order, with the first index changing fastest, and the last index changing slowest. Note that the ‘C’ and ‘F’ options take no account of the memory layout of the underlying array, and only refer to the order of indexing. ‘A’ means to read / write the elements in Fortran-like index order if a is Fortran contiguous in memory, C-like order otherwise.

In [53]:
a = np.arange(6)
a

array([0, 1, 2, 3, 4, 5])

In [54]:
b = a.reshape(3, 2)
b

array([[0, 1],
       [2, 3],
       [4, 5]])

In [56]:
np.reshape(a, newshape=(1, 6), order='C')

array([[0, 1, 2, 3, 4, 5]])
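
The `-1` placeholder and the `order` parameter described above can be sketched as follows. The variable names are illustrative, and the shape is passed positionally.

```python
import numpy as np

a = np.arange(6)

inferred = a.reshape(2, -1)                 # -1 is inferred as 3
c_order = np.reshape(a, (2, 3), order='C')  # last axis changes fastest
f_order = np.reshape(a, (2, 3), order='F')  # first axis changes fastest

print(inferred.shape)    # (2, 3)
print(c_order.tolist())  # [[0, 1, 2], [3, 4, 5]]
print(f_order.tolist())  # [[0, 2, 4], [1, 3, 5]]
```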

### How to convert a 1D array into a 2D array (how to add a new axis to an array)

> `np.newaxis` 

> `np.expand_dims`

Using `np.newaxis` will increase the dimensions of your array by one dimension when used once. This means that a 1D array will become a 2D array, a 2D array will become a 3D array, and so on.

In [57]:
a = np.array([1, 2, 3, 4, 5, 6])
a.shape

(6,)

In [59]:
a2 = a[np.newaxis, :]
a2.shape

(1, 6)

You can explicitly convert a 1D array to either a row vector or a column vector using `np.newaxis`. For example, you can convert a 1D array to a row vector by inserting an axis along the first dimension, or to a column vector by inserting an axis along the second dimension:

In [63]:
row_vector = a[np.newaxis, :]
row_vector.shape

(1, 6)

In [64]:
col_vector = a[:, np.newaxis]
col_vector.shape

(6, 1)

You can also expand an array by inserting a new axis at a specified position with `np.expand_dims`.

In [65]:
a = np.array([1, 2, 3, 4, 5, 6])
a.shape

(6,)

In [66]:
b = np.expand_dims(a, axis=1)
b.shape

(6, 1)

In [67]:
c = np.expand_dims(a, axis=0)
c.shape

(1, 6)

In [97]:
d = np.expand_dims(a, axis=(0, 2, 3))
d.shape

(1, 6, 1, 1)

In [99]:
print(d)

[[[[1]]

  [[2]]

  [[3]]

  [[4]]

  [[5]]

  [[6]]]]


In [100]:
d[0, 4, 0, 0]

5

### Indexing and slicing

In [103]:
data = np.array([1, 2, 3])

In [104]:
data[1]

2

In [105]:
data[0:2]

array([1, 2])

In [106]:
data[1:]

array([2, 3])

In [107]:
data[-2:]

array([2, 3])

![](./images/np_indexing.png)

You may want to take a section of your array or specific array elements to use in further analysis or additional operations. To do that, you’ll need to subset, slice, and/or index your arrays.

If you want to select values from your array that fulfill certain conditions, it’s straightforward with NumPy.

For example, if you start with this array:

In [108]:
a = np.array([[1 , 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

You can easily print all of the values in the array that are less than 5.


In [109]:
print(a[a < 5])

[1 2 3 4]


You can also select, for example, numbers that are equal to or greater than 5, and use that condition to index an array.



In [112]:
five_up = (a >= 5)
five_up

array([[False, False, False, False],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True]])

In [111]:
print(a[five_up])


[ 5  6  7  8  9 10 11 12]


In [113]:
divisible_by_2 = a[a%2==0]
divisible_by_2

array([ 2,  4,  6,  8, 10, 12])

In [114]:
c = a[(a > 2) & (a < 11)]
c

array([ 3,  4,  5,  6,  7,  8,  9, 10])

In [115]:
five_up = (a > 5) | (a == 5)
print(five_up)

[[False False False False]
 [ True  True  True  True]
 [ True  True  True  True]]
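
Boolean masks can also be used on the left-hand side of an assignment, or counted. The sketch below is not part of the original tutorial; `clipped` and `count_even` are illustrative names, and `np.count_nonzero` is used to count the `True` entries of a mask.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# Work on a copy so the original array stays untouched
clipped = a.copy()
clipped[clipped > 10] = 10   # assign through a boolean mask

# Count how many elements satisfy a condition
count_even = np.count_nonzero(a % 2 == 0)

print(clipped.tolist())  # [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 10, 10]]
print(count_even)        # 6
```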


You can also use `np.nonzero()` to select elements or indices from an array.

Starting with this array:

In [123]:
a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
b = np.nonzero(a > 5)
a

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [124]:
b

(array([1, 1, 1, 2, 2, 2, 2]), array([1, 2, 3, 0, 1, 2, 3]))

In [125]:
a[b]

array([ 6,  7,  8,  9, 10, 11, 12])

In this example, a tuple of arrays was returned: one for each dimension. The first array represents the row indices where these values are found, and the second array represents the column indices where the values are found.

If you want to generate a list of coordinates where the elements exist, you can zip the arrays, iterate over the list of coordinates, and print them. For example:

In [126]:
list_of_coordinates= list(zip(b[0], b[1]))

In [127]:
for coord in list_of_coordinates:
    print(coord)

(1, 1)
(1, 2)
(1, 3)
(2, 0)
(2, 1)
(2, 2)
(2, 3)


### How to create an array from existing data

This section covers slicing and indexing, `np.vstack()`, `np.vsplit()`, `np.hstack()`, `np.hsplit()`, `.view()`, and `.copy()`.

In [18]:
a = np.array([1,  2,  3,  4,  5,  6,  7,  8,  9, 10])
a

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [21]:
b = a[3:8]
b

array([4, 5, 6, 7, 8])

In [22]:
a1 = np.array([[1, 1],
               [2, 2]])

a2 = np.array([[3, 3],
               [4, 4]])

In [24]:
arr1 = np.vstack((a1, a2))
arr1

array([[1, 1],
       [2, 2],
       [3, 3],
       [4, 4]])

In [28]:
arr2 = np.vsplit(arr1, 2)
arr2

[array([[1, 1],
        [2, 2]]),
 array([[3, 3],
        [4, 4]])]

In [30]:
arr3 = np.hstack((a1, a2))
arr3

array([[1, 1, 3, 3],
       [2, 2, 4, 4]])

You can split an array into several smaller arrays using `hsplit`. You can specify either the **number of equally shaped arrays to return or the columns after which the division should occur**.

In [31]:
x = np.arange(1, 25).reshape(2, 12)
x

array([[ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12],
       [13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24]])

In [34]:
np.hsplit(x, 3)

[array([[ 1,  2,  3,  4],
        [13, 14, 15, 16]]),
 array([[ 5,  6,  7,  8],
        [17, 18, 19, 20]]),
 array([[ 9, 10, 11, 12],
        [21, 22, 23, 24]])]

If you wanted to **split** your array **after the third and fourth column**, you’d run:

In [33]:
np.hsplit(x, (3, 4))

[array([[ 1,  2,  3],
        [13, 14, 15]]),
 array([[ 4],
        [16]]),
 array([[ 5,  6,  7,  8,  9, 10, 11, 12],
        [17, 18, 19, 20, 21, 22, 23, 24]])]

You can use the `view` method to create a new array object that looks at the same data as the original array (a shallow copy).

Views are an important NumPy concept! NumPy functions, as well as operations like indexing and slicing, will return views whenever possible. This saves memory and is faster (no copy of the data has to be made). However it’s important to be aware of this - modifying data in a view also modifies the original array!

In [36]:
a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
a

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [41]:
b = a[0, :] 
b[0] = 99
b

array([99,  2,  3,  4])

In [42]:
a

array([[99,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

Using the copy method will make a complete `copy` of the array and its data (**a deep copy**). To use this on your array, you could run:

In [65]:
b2 = a[0, :].copy()
b2

array([99,  2,  3,  4])

In [67]:
b2[0] = 90
b2

array([90,  2,  3,  4])

In [68]:
a

array([[99,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])
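
If you are unsure whether an array is a view or a copy, one way to check is `np.shares_memory` together with the `.base` attribute. This is a sketch with illustrative names (`row_view`, `row_copy`) rather than part of the original tutorial.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

row_view = a[0, :]          # basic slicing returns a view
row_copy = a[0, :].copy()   # .copy() allocates new memory

view_shares = np.shares_memory(a, row_view)   # True
copy_shares = np.shares_memory(a, row_copy)   # False
view_base_is_a = row_view.base is a           # True: the view points back to a

row_copy[0] = 90            # does not touch a
row_view[0] = 99            # writes through to a

print(view_shares, copy_shares, view_base_is_a)  # True False True
print(a[0].tolist())                             # [99, 2, 3, 4]
```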

### Basic array operations

In [69]:
data = np.array([1, 2])
ones = np.ones(2, dtype=int)
data + ones

array([2, 3])

![](./images/np_data_plus_ones.png)

In [70]:
data - ones

array([0, 1])

In [71]:
data * data

array([1, 4])

In [72]:
data / data

array([1., 1.])

![](./images/np_sub_mult_divide.png)

In [73]:
a = np.array([1, 2, 3, 4])
a.sum()

10

In [74]:
b = np.array([[1, 1], [2, 2]])
b.sum(axis=0)

array([3, 3])

In [75]:
b.sum(axis=1)

array([2, 4])

### Broadcasting

There are times when you might want to carry out an operation between an array and a single number (also called *an operation between a vector and a scalar*) or between arrays of two different sizes. For example, your array (we’ll call it “data”) might contain information about distance in miles but you want to convert the information to kilometers. You can perform this operation with:

In [76]:
data = np.array([1.0, 2.0])

In [77]:
data * 1.6

array([1.6, 3.2])

![](./images/np_multiply_broadcasting.png)

NumPy understands that the multiplication should happen with each cell. That concept is called broadcasting. Broadcasting is a mechanism that allows NumPy to perform operations on arrays of different shapes. The dimensions of your array must be compatible, for example, when the dimensions of both arrays are equal or when one of them is 1. If the dimensions are not compatible, you will get a `ValueError`.
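
As a sketch of the compatibility rule, the example below broadcasts a `(3, 1)` column against a `(3,)` row, and then triggers the `ValueError` with two incompatible shapes. All names are illustrative.

```python
import numpy as np

col = np.array([[0], [10], [20]])   # shape (3, 1)
row = np.array([1, 2, 3])           # shape (3,)

# Shapes (3, 1) and (3,) are compatible: the result has shape (3, 3)
grid = col + row
print(grid.tolist())  # [[1, 2, 3], [11, 12, 13], [21, 22, 23]]

# Shapes (3, 2) and (3,) are not compatible: trailing dimensions 2 and 3 differ
error_message = None
try:
    np.ones((3, 2)) + np.ones(3)
except ValueError as exc:
    error_message = str(exc)

print(error_message is not None)  # True
```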

### More useful array operations

NumPy also performs aggregation functions. In addition to `min`, `max`, and `sum`, you can easily run `mean` to get the average, `prod` to get the result of multiplying the elements together, `std` to get the standard deviation, and more.



In [83]:
data = np.array([1, 2, 3])

In [84]:
data.max()

3

In [85]:
data.min()

1

In [86]:
data.prod()

6

In [87]:
data.std()

0.816496580927726

In [88]:
data.sum()

6

![](./images/np_aggregation.png)

In [89]:
a = np.array([[0.45053314, 0.17296777, 0.34376245, 0.5510652],
              [0.54627315, 0.05093587, 0.40067661, 0.55645993],
              [0.12697628, 0.82485143, 0.26590556, 0.56917101]])

In [90]:
a.sum()

4.8595784

In [91]:
a.min()

0.05093587

You can specify on which axis you want the aggregation function to be computed. For example, you can find the minimum value within each column by specifying `axis=0`.

In [94]:
a.min(axis=0)

array([0.12697628, 0.05093587, 0.26590556, 0.5510652 ])

The four values listed above correspond to the number of columns in your array. With a four-column array, you will get four values as your result.
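
For comparison, here is a short sketch of the same aggregation along both axes on a smaller, made-up array. `keepdims=True` is an optional argument that keeps the reduced axis with length 1.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])

col_min = a.min(axis=0)                    # one value per column
row_min = a.min(axis=1)                    # one value per row
row_min_2d = a.min(axis=1, keepdims=True)  # same values, shape (2, 1)

print(col_min.tolist())   # [1, 2, 3]
print(row_min.tolist())   # [1, 4]
print(row_min_2d.shape)   # (2, 1)
```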

### Creating matrices

In [95]:
data = np.array([[1, 2], [3, 4], [5, 6]])
data

array([[1, 2],
       [3, 4],
       [5, 6]])

![](./images/np_create_matrix.png)

In [96]:
data[0, 1]

2

In [99]:
data[1:3]

array([[3, 4],
       [5, 6]])

In [103]:
data[0:2, 0]

array([1, 3])

![](./images/np_matrix_indexing.png)

In [106]:
print(
    data.max(),
    data.min(),
    data.sum(),
    sep='\n'
)

6
1
21


![](./images/np_matrix_aggregation.png)

In [107]:
data.max(axis=0)

array([5, 6])

In [108]:
data.max(axis=1)


array([2, 4, 6])

![](./images/np_matrix_aggregation_row.png)

In [109]:
data = np.array([[1, 2], [3, 4], [5, 6]])
ones_row = np.array([[1, 1]])
ones_row

array([[1, 1]])

In [110]:
data + ones_row

array([[2, 3],
       [4, 5],
       [6, 7]])

![](./images/np_matrix_broadcasting.png)

In [111]:
np.ones((3, 2))

array([[1., 1.],
       [1., 1.],
       [1., 1.]])

In [112]:
np.zeros((3, 2))

array([[0., 0.],
       [0., 0.],
       [0., 0.]])

In [116]:
rng.random((3, 2))

array([[0.63696169, 0.26978671],
       [0.04097352, 0.01652764],
       [0.81327024, 0.91275558]])

![](./images/np_ones_zeros_matrix.png)

### Generating random numbers


The use of random number generation is an important part of the configuration and evaluation of many numerical and machine learning algorithms. Whether you need to randomly initialize weights in an artificial neural network, split data into random sets, or randomly shuffle your dataset, being able to generate random numbers (actually, repeatable pseudo-random numbers) is essential.

With `Generator.integers`, you can generate random integers from low (remember that this is inclusive with NumPy) to high (exclusive). You can set `endpoint=True` to make the high number inclusive.

You can generate a 2 x 4 array of random integers between 0 and 5 (inclusive, because `endpoint=True`) with:

In [120]:
rng.integers(5, size=(2, 4), endpoint=True)

array([[4, 4, 5, 1],
       [0, 5, 0, 3]])
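
The "repeatable" part can be sketched by seeding two generators with the same value. The seed `42` and the variable names are arbitrary choices for this example.

```python
import numpy as np

# Two generators created with the same seed produce the same sequence
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

draw_a = rng_a.integers(5, size=(2, 4), endpoint=True)
draw_b = rng_b.integers(5, size=(2, 4), endpoint=True)

same_draws = np.array_equal(draw_a, draw_b)
in_range = bool(draw_a.min() >= 0 and draw_a.max() <= 5)

print(same_draws, in_range)  # True True
```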

### How to get unique items and counts

You can find the unique elements in an array easily with `np.unique`.

For example, if you start with this array:

In [123]:
a = np.array([11, 11, 12, 13, 14, 15, 16, 17, 12, 13, 11, 14, 18, 19, 20, 20, 20, 20])

In [124]:
unique_val = np.unique(a)
unique_val

array([11, 12, 13, 14, 15, 16, 17, 18, 19, 20])

To get the indices of unique values in a NumPy array (an array of first index positions of unique values in the array), just pass the `return_index` argument in `np.unique()` as well as your array.

In [125]:
np.unique(a, return_index=True)

(array([11, 12, 13, 14, 15, 16, 17, 18, 19, 20]),
 array([ 0,  2,  3,  4,  5,  6,  7, 12, 13, 14]))

You can pass the `return_counts` argument in `np.unique()` along with your array to get the frequency count of unique values in a NumPy array.

In [127]:
np.unique(a, return_index=True, return_counts=True)

(array([11, 12, 13, 14, 15, 16, 17, 18, 19, 20]),
 array([ 0,  2,  3,  4,  5,  6,  7, 12, 13, 14]),
 array([3, 2, 2, 2, 1, 1, 1, 1, 1, 4]))

This also works with 2D arrays! If you start with this array:

In [128]:
a_2d = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [1, 2, 3, 4]])

In [129]:
np.unique(a_2d, return_index=True, return_counts=True)

(array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12]),
 array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11]),
 array([2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1]))

If you want to get the unique rows or columns, make sure to pass the `axis` argument. To find the unique rows, specify `axis=0` and for columns, specify `axis=1`.

In [130]:
np.unique(a_2d, axis=0)

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [131]:
np.unique(a_2d, axis=1)

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12],
       [ 1,  2,  3,  4]])
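
A common follow-up is to turn the values and counts into a frequency table. The sketch below uses a shortened, made-up array and illustrative names (`frequency`, `most_common`).

```python
import numpy as np

a = np.array([11, 11, 12, 13, 11, 20, 20])

values, counts = np.unique(a, return_counts=True)

# Pair each unique value with its count
frequency = dict(zip(values.tolist(), counts.tolist()))

# The value with the highest count
most_common = int(values[np.argmax(counts)])

print(frequency)    # {11: 3, 12: 1, 13: 1, 20: 2}
print(most_common)  # 11
```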

### Transposing and reshaping a matrix

It’s common to need to transpose your matrices. NumPy arrays have the attribute `T`, which returns the transposed array.

![](./images/np_transposing_reshaping.png)

You may also need to switch the dimensions of a matrix. This can happen when, for example, you have a model that expects a certain input shape that is different from your dataset. This is where the reshape method can be useful. You simply need to pass in the new dimensions that you want for the matrix.

In [133]:
data.reshape(2, 3)


array([[1, 2, 3],
       [4, 5, 6]])

In [134]:
data.reshape(3, 2)

array([[1, 2],
       [3, 4],
       [5, 6]])

![](./images/np_reshape.png)

You can also use `.transpose()` to reverse or change the axes of an array according to the values you specify. Called without arguments, it reverses the axes and is equivalent to `.T`.



In [136]:
arr = np.arange(6).reshape((2, 3))
arr

array([[0, 1, 2],
       [3, 4, 5]])

In [137]:
arr.transpose()

array([[0, 3],
       [1, 4],
       [2, 5]])

In [138]:
arr.T

array([[0, 3],
       [1, 4],
       [2, 5]])
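
For arrays with more than two dimensions, `.transpose()` accepts the new axis order. This is a sketch with an illustrative 3-D array `arr3d`.

```python
import numpy as np

arr3d = np.arange(24).reshape(2, 3, 4)

reversed_axes = arr3d.T              # axes reversed: shape (4, 3, 2)
swapped = arr3d.transpose(1, 0, 2)   # swap the first two axes: shape (3, 2, 4)

print(reversed_axes.shape)  # (4, 3, 2)
print(swapped.shape)        # (3, 2, 4)

# The element at [i, j, k] in the original is found at [j, i, k] after the swap
print(swapped[2, 1, 3] == arr3d[1, 2, 3])  # True
```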

### How to reverse an array

NumPy’s `np.flip()` function allows you to flip, or reverse, the contents of an array along an axis. When using `np.flip()`, specify the array you would like to reverse and the axis. If you don’t specify the axis, NumPy will reverse the contents along all of the axes of your input array.

#### Reversing a 1D array 

In [141]:
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

In [142]:
reversed_arr = np.flip(arr)
reversed_arr

array([8, 7, 6, 5, 4, 3, 2, 1])

#### Reversing a 2D array

In [143]:
arr_2d = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

In [144]:
reversed_arr = np.flip(arr_2d)
reversed_arr

array([[12, 11, 10,  9],
       [ 8,  7,  6,  5],
       [ 4,  3,  2,  1]])

In [145]:
np.flip(arr_2d, axis=0)

array([[ 9, 10, 11, 12],
       [ 5,  6,  7,  8],
       [ 1,  2,  3,  4]])

In [146]:
np.flip(arr_2d, axis=1)

array([[ 4,  3,  2,  1],
       [ 8,  7,  6,  5],
       [12, 11, 10,  9]])

In [149]:
arr_2d[1] = np.flip(arr_2d[1])
arr_2d

array([[ 1,  2,  3,  4],
       [ 8,  7,  6,  5],
       [ 9, 10, 11, 12]])

In [150]:
arr_2d[:,1] = np.flip(arr_2d[:,1])
arr_2d

array([[ 1, 10,  3,  4],
       [ 8,  7,  6,  5],
       [ 9,  2, 11, 12]])

### Reshaping and flattening multidimensional arrays

There are two popular ways to flatten an array: `.flatten()` and `.ravel()`. The primary difference between the two is that `flatten()` always returns a copy, while the new array created using `ravel()` **is, whenever possible, a reference to the parent array (i.e., a “`view`”).** This means that changes to an array returned by `ravel()` can affect the parent array as well. Since `ravel()` avoids creating a copy when it can, it’s memory efficient.

In [151]:
x = np.array([[1 , 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

In [152]:
x.flatten()

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12])

In [153]:
a1 = x.flatten()

In [159]:
a1[0] = 99
x # Original array

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [158]:
a1 # New array

array([99,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12])

In [160]:
a2 = x.ravel()
a2[0] = 98
x # Original array

array([[98,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [161]:
a2 # New array

array([98,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12])
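
The difference can be checked with `np.shares_memory`. The sketch below also shows a case where `ravel()` has to fall back to a copy (a transposed array, which is not contiguous in C order). The variable names are illustrative.

```python
import numpy as np

x = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

flatten_shares = np.shares_memory(x, x.flatten())  # flatten() always copies
ravel_shares = np.shares_memory(x, x.ravel())      # contiguous input: a view
ravel_t_shares = np.shares_memory(x, x.T.ravel())  # transposed input: a copy

print(flatten_shares, ravel_shares, ravel_t_shares)  # False True False
```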

### How to access the docstring for more information

When it comes to the data science ecosystem, Python and NumPy are built with the user in mind. One of the best examples of this is the built-in access to documentation. Every object contains a reference to a string, which is known as the docstring. In most cases, this docstring contains a quick and concise summary of the object and how to use it. Python has a built-in `help()` function that can help you access this information. This means that nearly any time you need more information, you can use `help()` to quickly find it.

In [162]:
help(max)

Help on built-in function max in module builtins:

max(...)
    max(iterable, *[, default=obj, key=func]) -> value
    max(arg1, arg2, *args, *[, key=func]) -> value
    
    With a single iterable argument, return its biggest item. The
    default keyword-only argument specifies an object to return if
    the provided iterable is empty.
    With two or more arguments, return the largest argument.



Because access to additional information is so useful, IPython uses the `?` character as a shorthand for accessing this documentation along with other relevant information. IPython is a command shell for interactive computing in multiple languages. You can find more information in the IPython documentation.



In [163]:
max?

Docstring:
max(iterable, *[, default=obj, key=func]) -> value
max(arg1, arg2, *args, *[, key=func]) -> value

With a single iterable argument, return its biggest item. The
default keyword-only argument specifies an object to return if
the provided iterable is empty.
With two or more arguments, return the largest argument.
Type:      builtin_function_or_method


You can even use this notation for object methods and objects themselves.

Let’s say you create this array:


In [164]:
a = np.array([1, 2, 3, 4, 5, 6])

In [165]:
a?

Type:            ndarray
String form:     [1 2 3 4 5 6]
Length:          6
File:            ~/anaconda3/envs/ds-env/lib/python3.9/site-packages/numpy/__init__.py
Docstring:       <no docstring>
Class docstring:
ndarray(shape, dtype=float, buffer=None, offset=0,
        strides=None, order=None)

An array object represents a multidimensional, homogeneous array
of fixed-size items.  An associated data-type object describes the
format of each element in the array (its byte-order, how many bytes it
occupies in memory, whether it is an integer, a floating point number,
or something else, etc.)

Arrays should be constructed using `array`, `zeros` or `empty` (refer
to the See Also section below).  The parameters given here refer to
a low-level method (`ndarray(...)`) for instantiating an array.

For more information, refer to the `numpy` module and examine the
methods and attributes of an array.

Parameters
----------
(for the 

You can obtain information about the function:



In [166]:
def double(a):
    '''Return a * 2'''
    return a * 2

In [167]:
double?

Signature: double(a)
Docstring: Return a * 2
File:      /tmp/ipykernel_5996/2713554790.py
Type:      function


In [168]:
double??

Signature: double(a)
Source:   
def double(a):
    '''Return a * 2'''
    return a * 2
File:      /tmp/ipykernel_5996/2713554790.py
Type:      function


If the object in question is compiled in a language other than Python, using ?? will return the same information as ?. You’ll find this with a lot of built-in objects and types, for example:

In [170]:
len?

Signature: len(obj, /)
Docstring: Return the number of items in a container.
Type:      builtin_function_or_method


In [171]:
len??

Signature: len(obj, /)
Docstring: Return the number of items in a container.
Type:      builtin_function_or_method


### Working with mathematical formulas

The ease of implementing mathematical formulas that work on arrays is one of the things that make NumPy so widely used in the scientific Python community.

For example, this is the mean square error formula (a central formula used in supervised machine learning models that deal with regression):

![](./images/np_MSE_formula.png)

Implementing this formula is simple and straightforward in NumPy:

![](./images/np_MSE_implementation.png)

What makes this work so well is that `predictions` and `labels` can contain one or a thousand values. They only need to be the same size.

You can visualize it this way:

![](./images/np_mse_viz1.png)

In this example, both the predictions and labels vectors contain three values, meaning n has a value of three. After we carry out subtractions the values in the vector are squared. Then NumPy sums the values, and your result is the error value for that prediction and a score for the quality of the model.

![](./images/np_mse_viz2.png)
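
Because the implementation above appears only as an image, here is a minimal runnable sketch. It assumes the formula is the standard mean squared error, that is, the sum of squared differences divided by `n`. The sample values for `predictions` and `labels` are made up for illustration.

```python
import numpy as np

# Illustrative values; both arrays must have the same size
predictions = np.array([1.0, 1.0, 1.0])
labels = np.array([1.0, 2.0, 3.0])

n = predictions.size

# Subtract, square element-wise, sum, then divide by the number of values
error = (1 / n) * np.sum(np.square(predictions - labels))

print(error)  # (0 + 1 + 4) / 3, roughly 1.667
```

The same expression works unchanged for arrays with one value or a thousand, since every step operates element-wise.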


### How to save and load NumPy objects

Fortunately, there are several ways to save and load objects with NumPy. The `ndarray` objects can be saved to and loaded from the disk files with `loadtxt` and `savetxt` functions that handle normal text files, `load` and `save` functions that handle NumPy binary files with a `.npy` file extension, and a `savez` function that handles NumPy files with a `.npz` file extension.

The `.npy` and `.npz` files store data, shape, dtype, and other information required to reconstruct the `ndarray` in a way that allows the array to be correctly retrieved, even when the file is on another machine with a different architecture.

If you want to store a single `ndarray` object, store it as a `.npy` file using `np.save`. If you want to store more than one `ndarray` object in a single file, save it as a `.npz` file using `np.savez`. You can also save several arrays into a single file in compressed `.npz` format with `np.savez_compressed`.

It’s easy to save and load an array with `np.save()` and `np.load()`. Just make sure to specify the array you want to save and a file name. For example, if you create this array:


In [172]:
a = np.array([1, 2, 3, 4, 5, 6])

In [173]:
np.save('./saved_files/filename', a)

In [174]:
b = np.load('./saved_files/filename.npy')
b

array([1, 2, 3, 4, 5, 6])
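
Saving several arrays into one `.npz` file with `np.savez` can be sketched as follows. The keyword names `first` and `second`, the file name `arrays.npz`, and the use of a temporary directory are assumptions made for this example.

```python
import tempfile
from pathlib import Path

import numpy as np

first = np.arange(3)
second = np.array([[1.5, 2.5], [3.5, 4.5]])

with tempfile.TemporaryDirectory() as tmp_dir:
    archive_path = Path(tmp_dir) / "arrays.npz"

    # Each keyword argument becomes a named array inside the archive
    np.savez(archive_path, first=first, second=second)

    # np.load returns a dict-like object for .npz files
    with np.load(archive_path) as archive:
        names = sorted(archive.files)
        first_loaded = archive["first"]
        second_loaded = archive["second"]

print(names)                  # ['first', 'second']
print(first_loaded.tolist())  # [0, 1, 2]
```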

You can save a NumPy array as a plain text file like a `.csv` or `.txt` file with `np.savetxt`.

In [178]:
np.savetxt('./saved_files/txt_filename.csv', b)

In [180]:
c = np.loadtxt('./saved_files/txt_filename.csv', dtype=np.int32)
c

array([1, 2, 3, 4, 5, 6], dtype=int32)

The `savetxt()` and `loadtxt()` functions accept additional optional parameters such as a delimiter, and `savetxt()` also accepts a header and footer. While text files can be easier for sharing, `.npy` and `.npz` files are smaller and faster to read. If you need more sophisticated handling of your text file (for example, if you need to work with lines that contain missing values), you will want to use the `genfromtxt` function.

With `savetxt`, you can specify headers, footers, comments, and more.
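
As a sketch of those options, the example below writes a small table with a comma delimiter and a header line, then reads it back. The file name `table.csv`, the column names in the header, and the temporary directory are illustrative assumptions.

```python
import tempfile
from pathlib import Path

import numpy as np

table = np.array([[1, 2], [3, 4], [5, 6]])

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = Path(tmp_dir) / "table.csv"

    # fmt="%d" writes integers; the header is written as a comment line
    np.savetxt(csv_path, table, fmt="%d", delimiter=",", header="x,y")

    first_line = csv_path.read_text().splitlines()[0]

    # loadtxt skips comment lines (starting with '#') by default
    loaded = np.loadtxt(csv_path, delimiter=",", dtype=int)

print(first_line)       # # x,y
print(loaded.tolist())  # [[1, 2], [3, 4], [5, 6]]
```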