# From Python to Numpy

[reference book](https://www.labri.fr/perso/nrougier/from-python-to-numpy/)

https://numpy.org/doc/stable/user/absolute_beginners.html#broadcasting


https://github.com/rougier/numpy-100/blob/master/100_Numpy_exercises_with_solutions.md

https://github.com/rougier/numpy-100/blob/master/100_Numpy_exercises.ipynb


NumPy (Numerical Python) is an open source Python library that’s used in almost every field of science and engineering. It’s the universal standard for working with numerical data in Python, and it’s at the core of the scientific Python and PyData ecosystems. NumPy users include everyone from beginning coders to experienced researchers doing state-of-the-art scientific and industrial research and development. The NumPy API is used extensively in Pandas, SciPy, Matplotlib, scikit-learn, scikit-image and most other data science and scientific Python packages.

In [2]:
import numpy as np
import scipy as sp
import matplotlib.pyplot as plt

In [3]:
def p(v):
    return print(v)

In [3]:
np?

### What’s the difference between a Python list and a NumPy array?
NumPy gives you an enormous range of fast and efficient ways of creating arrays and manipulating numerical data inside them. While a Python list can contain different data types within a single list, all of the elements in a NumPy array should be homogeneous. 
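As a small illustration of the homogeneity point above, the sketch below (variable names are made up for this example) shows NumPy choosing one dtype for a whole array, and how arithmetic differs between an array and a list.

```python
import numpy as np

# A Python list can hold values of different types side by side.
mixed_list = [1, 2.5, "three"]

# NumPy stores one dtype per array: mixing ints and floats upcasts to float64.
numeric = np.array([1, 2.5, 3])
print(numeric.dtype)            # float64

# Arithmetic is element-wise on an array, while `*` repeats a list.
doubled = numeric * 2
repeated = [1, 2, 3] * 2
print(doubled.tolist())         # [2.0, 5.0, 6.0]
print(repeated)                 # [1, 2, 3, 1, 2, 3]
```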

In [4]:
a= np.arange(6)
p(a)
p(a.shape)
a2= a[np.newaxis, :]
p(a2)
p(a2.shape)

[0 1 2 3 4 5]
(6,)
[[0 1 2 3 4 5]]
(1, 6)


### What is an array?
An array is a central data structure of the NumPy library. An array is a grid of values and it contains information about the raw data, how to locate an element, and how to interpret an element. It has a grid of elements that can be indexed in various ways. The elements are all of the same type, referred to as the array dtype.

An array can be indexed by a tuple of nonnegative integers, by booleans, by another array, or by integers. The rank of the array is the number of dimensions. The shape of the array is a tuple of integers giving the size of the array along each dimension.

One way we can initialize NumPy arrays is from Python lists, using nested lists for two- or higher-dimensional data.

In [13]:
a = np.array([1, 2, 3, 4, 5, 6])
p(a)

[1 2 3 4 5 6]


In [15]:
a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
p(a)

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]


In [16]:
p(a[0])

[1 2 3 4]


### More information about arrays
*This section covers 1D array, 2D array, ndarray, vector, matrix*

You might occasionally hear an array referred to as an **“ndarray,”** which is shorthand for “N-dimensional array.” An N-dimensional array is simply an array with any number of dimensions. You might also hear 1-D, or one-dimensional array, 2-D, or two-dimensional array, and so on. 

The NumPy ndarray class is used to represent both **matrices and vectors**. `A vector is an array with a single dimension` (there’s no difference between row and column vectors), while `a matrix refers to an array with two dimensions`. For 3-D or higher dimensional arrays, the term tensor is also commonly used.
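To make the vocabulary concrete, here is a minimal sketch (the names `vector`, `matrix` and `tensor` are only illustrative) that builds one array of each kind and checks its number of dimensions and shape.

```python
import numpy as np

vector = np.array([1, 2, 3])                 # 1-D: a vector
matrix = np.array([[1, 2, 3], [4, 5, 6]])    # 2-D: a matrix
tensor = np.zeros((2, 3, 4))                 # 3-D: often called a tensor

for name, arr in [("vector", vector), ("matrix", matrix), ("tensor", tensor)]:
    # All three are instances of the same ndarray class.
    print(name, type(arr).__name__, arr.ndim, arr.shape)
```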

####  How to create a basic array?
This section covers np.array(), np.zeros(), np.ones(), np.empty(), np.arange(), np.linspace(), dtype

To create a NumPy array, you can use the function np.array().

In [6]:
a = np.array([1, 2, 3])
p(a)
p(a[:, np.newaxis])
p(a.shape)
p(a[:, np.newaxis].shape)

[1 2 3]
[[1]
 [2]
 [3]]
(3,)
(3, 1)


In [22]:
#You can easily create an array filled with 0’s:
p(np.zeros(2))
# You can easily create an array filled with 1’s:
p(np.ones(2))

[0. 0.]
[1. 1.]


Or even an empty array! The function empty creates an array whose initial content is random and depends on the state of the memory. The reason to use empty over zeros (or something similar) is speed - just make sure to fill every element afterwards!

In [26]:
p(np.empty(5))

[0.   0.25 0.5  0.75 1.  ]
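The values printed above are just whatever happened to be in memory. A minimal sketch of the intended pattern, assuming you know the final size up front, is to allocate with `np.empty` and then write every slot before reading any of them:

```python
import numpy as np

n = 5
out = np.empty(n)        # contents are arbitrary at this point

for i in range(n):
    out[i] = i ** 2      # every element is assigned before it is used

print(out.tolist())      # [0.0, 1.0, 4.0, 9.0, 16.0]
```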


In [27]:
#You can create an array with a range of elements:
np.arange(4)

array([0, 1, 2, 3])

And even an array that contains a range of evenly spaced intervals. To do this, you will specify the `first number, last number, and the step size`.



In [28]:
np.arange(2, 9, 2)

array([2, 4, 6, 8])

You can also use `np.linspace()` to create an array with values that are spaced linearly in a specified interval:

In [29]:
np.linspace(0, 10, num=5)

array([ 0. ,  2.5,  5. ,  7.5, 10. ])

#### Specifying your data type

While the default data type is floating point (np.float64), you can explicitly specify which data type you want using the dtype keyword.

In [8]:
x = np.ones(2, dtype=np.int64)
p(x)
x.dtype

[1 1]


dtype('int64')

### Adding, removing, and sorting elements
This section covers np.sort(), np.concatenate()

Sorting an array is simple with np.sort(). You can specify the axis, kind, and order when you call the function.

In [31]:
arr = np.array([2, 1, 5, 3, 7, 4, 6, 8])
np.sort(arr)

array([1, 2, 3, 4, 5, 6, 7, 8])
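The `axis` and `kind` arguments are easier to see on a 2-D array. The sketch below uses a made-up `grid` array; the `order` argument only applies to structured arrays and is not shown.

```python
import numpy as np

grid = np.array([[9, 1, 8],
                 [3, 7, 2]])

rows_sorted = np.sort(grid, axis=1)   # sort within each row
cols_sorted = np.sort(grid, axis=0)   # sort down each column
flat_sorted = np.sort(grid, axis=None, kind="stable")  # flatten, then sort

print(rows_sorted.tolist())   # [[1, 8, 9], [2, 3, 7]]
print(cols_sorted.tolist())   # [[3, 1, 2], [9, 7, 8]]
print(flat_sorted.tolist())   # [1, 2, 3, 7, 8, 9]
```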

In addition to sort, which returns a sorted copy of an array, you can use:

- **argsort**, which is an indirect sort along a specified axis,

- **lexsort**, which is an indirect stable sort on multiple keys,

- **searchsorted**, which will find elements in a sorted array, and

- **partition**, which is a partial sort.
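The four functions in the list above can be tried on small inputs. This is only a sketch; the `primary` and `secondary` key arrays are invented for the `lexsort` part.

```python
import numpy as np

arr = np.array([2, 1, 5, 3, 7, 4, 6, 8])

# argsort: indices that would sort the array.
order = np.argsort(arr)
print(order.tolist())                  # [1, 0, 3, 5, 2, 6, 4, 7]

# searchsorted: insertion position in an already sorted array.
sorted_arr = np.sort(arr)
pos = np.searchsorted(sorted_arr, 5)
print(int(pos))                        # 4

# partition: the element at index k lands in its sorted position;
# smaller values come before it, larger ones after, in no particular order.
part = np.partition(arr, 3)
print(int(part[3]))                    # 4

# lexsort: the LAST key in the tuple is the primary sort key.
primary = np.array([1, 0, 1, 0])
secondary = np.array([3, 2, 1, 0])
lex_order = np.lexsort((secondary, primary))
print(lex_order.tolist())              # [3, 1, 2, 0]
```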

In [33]:
a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])
np.concatenate((a, b))

array([1, 2, 3, 4, 5, 6, 7, 8])

In [36]:
# Or, if you start with these arrays:
x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6]])

#You can concatenate them with:
np.concatenate((x, y), axis=0)

array([[1, 2],
       [3, 4],
       [5, 6]])

In order to remove elements from an array, it’s simple to use indexing to select the elements that you want to keep.
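Two ways of "removing" by selecting what to keep are sketched below, together with `np.delete`, which returns a new array without the given indices. The example array is made up.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

# Keep everything except the value 30, using a boolean mask.
without_30 = arr[arr != 30]
print(without_30.tolist())        # [10, 20, 40, 50]

# Keep only chosen positions, using a list of indices.
picked = arr[[0, 1, 4]]
print(picked.tolist())            # [10, 20, 50]

# np.delete drops the listed indices and returns a new array.
tail = np.delete(arr, [0, 1])
print(tail.tolist())              # [30, 40, 50]

# The original array is unchanged in all three cases.
print(arr.tolist())               # [10, 20, 30, 40, 50]
```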

### How do you know the shape and size of an array?
This section covers ndarray.ndim, ndarray.size, ndarray.shape

- ndarray.ndim will tell you the number of axes, or dimensions, of the array.

- ndarray.size will tell you the total number of elements of the array. This is the product of the elements of the array’s shape.

- ndarray.shape will display a tuple of integers that indicate the number of elements stored along each dimension of the array. If, for example, you have a 2-D array with 2 rows and 3 columns, the shape of your array is (2, 3).




In [47]:
# To find the number of dimensions of the array, run:
p(a.ndim)
p(a[np.newaxis, :].ndim)
#==> ndim ==> number of axes

1
2


In [49]:
#To find the total number of elements in the array, run:
a.size
b.size

4

In [52]:
array_example = np.array([[[0, 1, 2, 3],
                           [4, 5, 6, 7]],

                          [[0, 1, 2, 3],
                           [4, 5, 6, 7]],

                          [[0 ,1 ,2, 3],
                           [4, 5, 6, 7]]])
array_example

array([[[0, 1, 2, 3],
        [4, 5, 6, 7]],

       [[0, 1, 2, 3],
        [4, 5, 6, 7]],

       [[0, 1, 2, 3],
        [4, 5, 6, 7]]])

In [51]:
array_example.size

24

In [53]:
# And to find the shape of your array, run:
array_example.shape

(3, 2, 4)

### Can you reshape an array?
`arr.reshape()`

Using arr.reshape() will give a new shape to an array without changing the data. Just remember that when you use the reshape method, the array you want to produce needs to have the same number of elements as the original array. If you start with an array with 12 elements, you’ll need to make sure that your new array also has a total of 12 elements.



In [54]:
a = np.arange(6)
b = a.reshape(3, 2)
b

array([[0, 1],
       [2, 3],
       [4, 5]])
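The element-count rule can be checked directly. In this sketch a 12-element array is reshaped in two valid ways (using `-1` to let NumPy infer one dimension) and once with an incompatible shape, which raises a `ValueError`.

```python
import numpy as np

a12 = np.arange(12)

print(a12.reshape(3, 4).shape)    # (3, 4)
print(a12.reshape(2, -1).shape)   # (2, 6): -1 lets NumPy infer the size

failed = False
try:
    a12.reshape(5, 3)             # 15 slots for 12 elements
except ValueError as err:
    failed = True
    print("reshape failed:", err)
```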

### How to convert a 1D array into a 2D array (how to add a new axis to an array)
This section covers `np.newaxis, np.expand_dims`

You can use **np.newaxis** and **np.expand_dims** to increase the dimensions of your existing array.

Using np.newaxis will increase the dimensions of your array by one dimension when used once. This means that a 1D array will become a 2D array, a 2D array will become a 3D array, and so on.

In [58]:
#You can use np.newaxis to add a new axis:
p(a)
a2 = a[:, np.newaxis]
p(a2)
p(a2.shape)


[0 1 2 3 4 5]
[[0]
 [1]
 [2]
 [3]
 [4]
 [5]]
(6, 1)


In [13]:
# You can use np.expand_dims to add an axis at index position 0 with:
b = np.expand_dims(a, axis=0)
p(a)
p(a.shape)
p(b)
p(b.shape)

[1 2 3]
(3,)
[[1 2 3]]
(1, 3)


### Indexing and slicing
You can index and slice NumPy arrays in the same ways you can slice Python lists.

In [64]:
p(a)
p(a[:2])
p(a[:-2])
p(a[-2:])

[0 1 2 3 4 5]
[0 1]
[0 1 2 3]
[4 5]


In [67]:
p(a[a<4])
p(a[a>3])
p(a[a<=4])

[0 1 2 3]
[4 5]
[0 1 2 3 4]


In [69]:
three_up = (a >= 3)
p(a[three_up])

[3 4 5]


In [70]:
divisible_by_2 = a[a%2==0]
p(divisible_by_2)

[0 2 4]


In [72]:
a = np.array([[1 , 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
c = a[(a > 2) & (a < 11)]
p(c)

[ 3  4  5  6  7  8  9 10]


In [73]:
five_up = (a > 5) | (a == 5)
print(five_up)

[[False False False False]
 [ True  True  True  True]
 [ True  True  True  True]]


In [74]:
p(a[five_up])

[ 5  6  7  8  9 10 11 12]


In [75]:
# You can also use np.nonzero() to select elements or indices from an array.
b = np.nonzero(a < 5)
b
#==> returns the coordinates (row, col) of the values that pass the filter

(array([0, 0, 0, 0]), array([0, 1, 2, 3]))

In this example, a tuple of arrays was returned: one for each dimension. The first array represents the row indices where these values are found, and the second array represents the column indices where the values are found.

If you want to generate a list of coordinates where the elements exist, you can zip the arrays, iterate over the list of coordinates, and print them. For example:

In [76]:
list_of_coordinates= list(zip(b[0], b[1]))

for coord in list_of_coordinates:
    print(coord)

(0, 0)
(0, 1)
(0, 2)
(0, 3)


In [77]:
# You can also use np.nonzero() to print the elements in an array that are less than 5 with:
print(a[b])

[1 2 3 4]


### How to create an array from existing data
This section covers slicing and indexing, `np.vstack(), np.hstack(), np.hsplit(), .view(), copy()`



In [81]:
# You can create a new array from a section of your array any time by specifying where
# you want to slice your array.
a = np.array([1,  2,  3,  4,  5,  6,  7,  8,  9, 10])
arr1 = a[3:8]
arr1

array([4, 5, 6, 7, 8])

You can also stack two existing arrays, both vertically and horizontally. Let’s say you have two arrays, a1 and a2:

In [82]:
a1 = np.array([[1, 1],
               [2, 2]])

a2 = np.array([[3, 3],
               [4, 4]])

# You can stack them vertically with vstack:
np.vstack((a1, a2))

array([[1, 1],
       [2, 2],
       [3, 3],
       [4, 4]])

In [83]:
# Or stack them horizontally with hstack:
np.hstack((a1, a2))

array([[1, 1, 3, 3],
       [2, 2, 4, 4]])

You can split an array into several smaller arrays using `hsplit`. You can specify either the number of equally shaped arrays to return or the columns after which the division should occur.

Let’s say you have this array:

In [84]:
x = np.arange(1, 25).reshape(2, 12)
x

array([[ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12],
       [13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24]])

In [85]:
#If you wanted to split this array into three equally shaped arrays, you would run:
np.hsplit(x, 3)


[array([[ 1,  2,  3,  4],
        [13, 14, 15, 16]]),
 array([[ 5,  6,  7,  8],
        [17, 18, 19, 20]]),
 array([[ 9, 10, 11, 12],
        [21, 22, 23, 24]])]

In [86]:
#If you wanted to split your array after the third and fourth column, you’d run:
np.hsplit(x, (3, 4))

[array([[ 1,  2,  3],
        [13, 14, 15]]),
 array([[ 4],
        [16]]),
 array([[ 5,  6,  7,  8,  9, 10, 11, 12],
        [17, 18, 19, 20, 21, 22, 23, 24]])]

You can use the `view` method to create a new array object that looks at the same data as the original array (a shallow copy).

Views are an important NumPy concept! NumPy functions, as well as operations like indexing and slicing, will return views whenever possible. This saves memory and is faster (no copy of the data has to be made). However it’s important to be aware of this - modifying data in a view also modifies the original array!

Let’s say you create this array:


In [87]:
a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
p(a)

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]


Now we create an array b1 by slicing a and modify the first element of b1. This will modify the corresponding element in a as well!

In [91]:
b1 = a[0, :]
b1[0] = 99
p(b1)
p(a)

[99  2  3  4]
[[99  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]


Using the `copy` method will make a complete copy of the array and its data (a deep copy). To use this on your array, you could run:

In [92]:
b2 = a.copy()
b2

array([[99,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])
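One way to check whether two arrays share data is `np.shares_memory`. The sketch below (with made-up names `row_view` and `row_copy`) contrasts a slice with an explicit copy of the same row.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

row_view = a[0, :]           # slicing returns a view
row_copy = a[0, :].copy()    # copy() duplicates the data

print(np.shares_memory(a, row_view))   # True
print(np.shares_memory(a, row_copy))   # False

row_copy[0] = -1             # only the copy changes
print(a[0, 0])               # 1

row_view[0] = 99             # writes through to the original
print(a[0, 0])               # 99
```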

### Basic array operations

In [94]:
# You can add the arrays together with the plus sign.
data = np.array([1, 2])
ones = np.ones(2, dtype=int)
aa= data + ones
p(aa)
data - ones

[2 3]


array([0, 1])

In [95]:
p(a.sum())

176


In [99]:
p(a.shape)
p(a.sum(axis=0)) # axis=0 collapses the rows: one sum per column

(3, 4)
[113  18  21  24]


In [100]:
p(a.sum(axis=1)) # axis=1 collapses the columns: one sum per row

[108  26  42]


### Broadcasting
There are times when you might want to carry out an operation between an array and a single number (also called an operation between a vector and a scalar) or between arrays of two different sizes. For example, your array (we’ll call it “data”) might contain information about distance in miles but you want to convert the information to kilometers. You can perform this operation with:

In [101]:
data = np.array([1.0, 2.0])
data * 1.6

array([1.6, 3.2])

NumPy understands that the multiplication should happen with each cell. That concept is called broadcasting. Broadcasting is a mechanism that allows NumPy to perform operations on arrays of different shapes. The dimensions of your array must be compatible, for example, when the dimensions of both arrays are equal or when one of them is 1. If the dimensions are not compatible, you will get a ValueError.
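Broadcasting also applies between two arrays, not only an array and a scalar. The sketch below uses made-up `column` and `row` arrays: shapes (3, 1) and (4,) are compatible and combine into (3, 4), while shapes (3,) and (4,) are not compatible and raise a `ValueError`.

```python
import numpy as np

column = np.array([[0], [10], [20]])   # shape (3, 1)
row = np.array([1, 2, 3, 4])           # shape (4,)

# Each size-1 dimension is stretched to match the other operand.
total = column + row
print(total.shape)       # (3, 4)
print(total.tolist())    # [[1, 2, 3, 4], [11, 12, 13, 14], [21, 22, 23, 24]]

incompatible = False
try:
    np.ones(3) + np.ones(4)            # 3 vs 4: neither equal nor 1
except ValueError as err:
    incompatible = True
    print("not broadcastable:", err)
```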

### How to reverse an array
This section covers `np.flip()`

NumPy’s np.flip() function allows you to flip, or reverse, the contents of an array along an axis. When using np.flip(), specify the array you would like to reverse and the axis. If you don’t specify the axis, NumPy will reverse the contents along all of the axes of your input array.

In [102]:
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])
reversed_arr = np.flip(arr)
p(reversed_arr)

[8 7 6 5 4 3 2 1]


In [105]:
# If you start with this array:

arr_2d = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
p(arr_2d)
reversed_arr = np.flip(arr_2d)
p(reversed_arr)

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]
[[12 11 10  9]
 [ 8  7  6  5]
 [ 4  3  2  1]]


In [106]:
#You can easily reverse only the rows with:

reversed_arr_rows = np.flip(arr_2d, axis=0)
print(reversed_arr_rows)

[[ 9 10 11 12]
 [ 5  6  7  8]
 [ 1  2  3  4]]


In [107]:
#Or reverse only the columns with:

reversed_arr_columns = np.flip(arr_2d, axis=1)
print(reversed_arr_columns)

[[ 4  3  2  1]
 [ 8  7  6  5]
 [12 11 10  9]]


In [108]:
# You can also reverse the contents of only one column or row. For example, 
# you can reverse the contents of the row at index position 1 (the second row):

arr_2d[1] = np.flip(arr_2d[1])
print(arr_2d)

[[ 1  2  3  4]
 [ 8  7  6  5]
 [ 9 10 11 12]]


In [109]:
# You can also reverse the column at index position 1 (the second column):
arr_2d[:,1] = np.flip(arr_2d[:,1])
print(arr_2d)

[[ 1 10  3  4]
 [ 8  7  6  5]
 [ 9  2 11 12]]


### Reshaping and flattening multidimensional arrays
This section covers `.flatten(), ravel()`

There are two popular ways to flatten an array: .flatten() and .ravel(). The primary difference between the two is that .flatten() always returns a copy, while the array returned by .ravel() is a reference to the parent array (i.e., a “view”) whenever possible. This means that changes to an array created with ravel() can affect the parent array as well. Because ravel() avoids a copy when it can, it’s memory efficient.


In [14]:
x = np.array([[1 , 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
p(x)
#You can use flatten to flatten your array into a 1D array.

x.flatten()

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]


array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12])

When you use `flatten`, changes to your new array won’t change the parent array.

But when you use `ravel`, the changes you make to the new array will affect the parent array.

In [15]:
a2 = x.ravel()
a2[0] = 98
p(x)

[[98  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]


In [114]:
p(a2)

[98  2  3  4  5  6  7  8  9 10 11 12]
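The copy-versus-view behaviour can be verified with `np.shares_memory`. This sketch also shows one case where `ravel` has to fall back to a copy: flattening the transpose in the default row-major order. Variable names are illustrative.

```python
import numpy as np

x = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

flat = x.flatten()
flat[0] = 98                            # does not touch x
print(x[0, 0])                          # 1
print(np.shares_memory(x, flat))        # False

rav = x.ravel()                         # contiguous input: a view
print(np.shares_memory(x, rav))         # True

rav_t = x.T.ravel()                     # transposed input: ravel must copy
print(np.shares_memory(x, rav_t))       # False
```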


### How to save and load NumPy objects
This section covers `np.save, np.savez, np.savetxt, np.load, np.loadtxt`

You will, at some point, want to save your arrays to disk and load them back without having to re-run the code. Fortunately, there are several ways to save and load objects with NumPy. The ndarray objects can be saved to and loaded from the disk files with `loadtxt` and `savetxt` functions that handle normal text files, `load` and `save` functions that handle NumPy binary files with a `.npy` file extension, and a `savez` function that handles NumPy files with a `.npz` file extension.

The `.npy` and `.npz` files store data, shape, dtype, and other information required to reconstruct the ndarray in a way that allows the array to be correctly retrieved, even when the file is on another machine with different architecture.

If you want to store a single ndarray object, store it as a `.npy` file using `np.save`. If you want to store more than one ndarray object in a single file, save it as a `.npz` file using `np.savez`. You can also save several arrays into a single file in compressed `npz` format with savez_compressed.

It’s easy to save and load an array with np.save(). Just make sure to specify the array you want to save and a file name. For example, if you create this array:

In [115]:
a = np.array([1, 2, 3, 4, 5, 6])
#You can save it as “filename.npy” with:
np.save('filename', a)

In [116]:
!ls filename.npy

filename.npy


In [117]:
#You can use np.load() to reconstruct your array.
b = np.load('filename.npy')
p(b)

[1 2 3 4 5 6]
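For several arrays in one file, a sketch with `np.savez` might look like this. The file name `arrays.npz` and the keyword names `first` and `second` are arbitrary choices for the example, and a temporary directory is used so nothing is left on disk.

```python
import os
import tempfile

import numpy as np

a = np.array([1, 2, 3])
b = np.array([[1.5, 2.5], [3.5, 4.5]])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "arrays.npz")
    # Keyword names become the keys used to look the arrays up again.
    np.savez(path, first=a, second=b)

    with np.load(path) as data:
        names = sorted(data.files)
        first = data["first"]
        second = data["second"]

print(names)              # ['first', 'second']
print(first.tolist())     # [1, 2, 3]
print(second.shape)       # (2, 2)
```

`np.savez_compressed` takes the same arguments if you want the archive compressed.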


In [118]:
# You can save a NumPy array as a plain text file like a .csv or .txt file with np.savetxt.
csv_arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

#You can easily save it as a .csv file with the name “new_file.csv” like this:
np.savetxt('new_file.csv', csv_arr)

In [119]:
!ls new_file.csv

new_file.csv


In [120]:
# You can quickly and easily load your saved text file using loadtxt():
b= np.loadtxt('new_file.csv')
p(b)


[1. 2. 3. 4. 5. 6. 7. 8.]


The `savetxt()` and `loadtxt()` functions accept additional optional parameters such as header, footer, and delimiter. While text files can be easier for sharing, .npy and .npz files are smaller and faster to read. If you need more sophisticated handling of your text file (for example, if you need to work with lines that contain missing values), you will want to use the genfromtxt function.

With savetxt, you can specify headers, footers, comments, and more.
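A sketch of these options, assuming a small made-up table and a temporary file: `savetxt` writes a comma-delimited file with a header line, `loadtxt` reads it back (the header is written as a `#` comment and skipped on load), and `genfromtxt` handles a row with a missing value.

```python
import io
import os
import tempfile

import numpy as np

table = np.array([[1, 2, 3], [4, 5, 6]])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "table.csv")
    np.savetxt(path, table, delimiter=",", header="a,b,c", fmt="%d")

    with open(path) as fh:
        text = fh.read()
    loaded = np.loadtxt(path, delimiter=",")

print(text)               # header line starts with "# "
print(loaded.tolist())    # [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]

# genfromtxt fills the missing field with nan instead of failing.
filled = np.genfromtxt(io.StringIO("1,2,3\n4,,6"), delimiter=",")
print(filled)
```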