# NumPy beginner tutorial
## This tutorial closely follows the [NumPy quickstart](https://numpy.org/devdocs/user/quickstart.html) from the official NumPy site.

In [1]:
# imports
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

### Basics

NumPy’s main object is the homogeneous multidimensional array. It is a table of elements (usually numbers), all of the same type, indexed by a tuple of non-negative integers. In NumPy dimensions are called `axes`.

In [2]:
# For example, the nested list below describes an array with 2 axes:
# the first axis has a length of 2, the second axis has a length of 3.
[[1., 0., 0.],
 [0., 1., 0.]]

[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
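To connect the nested list above with the vocabulary of axes, here is a minimal sketch that wraps the same values in `np.array` and inspects the result. The variable name `example` is only illustrative and does not appear elsewhere in this tutorial.

```python
import numpy as np

# The nested list from the cell above, converted to an ndarray.
example = np.array([[1., 0., 0.],
                    [0., 1., 0.]])

print(example.ndim)    # number of axes
print(example.shape)   # length of each axis
print(len(example))    # length of the first axis only
```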

NumPy’s array class is called `ndarray`. It is also known by the alias `array`. Note that `numpy.array` **is not** the same as the Standard Python Library class `array.array`, which only handles one-dimensional arrays and offers less functionality.

The more important attributes of an ndarray object are:

***

> `ndarray.ndim`

the number of axes (dimensions) of the array.

> `ndarray.shape`

the dimensions of the array. This is a tuple of integers indicating the size of the array in each dimension. For a matrix with n rows and m columns, `shape` will be `(n,m)`. The length of the `shape` tuple is therefore the number of axes, `ndim`.

> `ndarray.size`

the total number of elements of the array. This is equal to the product of the elements of `shape`.

> `ndarray.dtype`

an object describing the type of the elements in the array. One can create or specify dtypes using standard Python types. **Additionally**, NumPy provides types of its own. **`numpy.int32`, `numpy.int16`, and `numpy.float64` are some examples**.

> `ndarray.itemsize`

the size in bytes of each element of the array. For example, an array of elements of type `float64` has `itemsize` 8 (=64/8), while one of type `float32` has `itemsize` 4 (=32/8). It is equivalent to `ndarray.dtype.itemsize`.

> `ndarray.data`

the buffer containing the actual elements of the array. Normally, we won’t need to use this attribute because we will access the elements in an array using indexing facilities.
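As a small illustration of the `dtype` and `itemsize` descriptions above, the sketch below builds one array from a standard Python type and one from a NumPy type. The variable names are hypothetical and chosen only for this example.

```python
import numpy as np

# A standard Python type is mapped to a NumPy dtype.
from_python_type = np.array([1, 2, 3], dtype=float)

# A NumPy type gives explicit control over the width of each element.
from_numpy_type = np.array([1, 2, 3], dtype=np.int16)

print(from_python_type.dtype, from_python_type.itemsize)
print(from_numpy_type.dtype, from_numpy_type.itemsize)
```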


In [3]:
lst = np.arange(15).reshape(3, 5)
lst

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

In [4]:
lst.shape

(3, 5)

In [5]:
lst.size

15

In [6]:
lst.dtype

dtype('int64')

In [7]:
lst.itemsize

8

In [8]:
lst.data

<memory at 0x7fb932c29790>

In [9]:
type(lst)

numpy.ndarray
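The relationships between the attributes described earlier can also be checked directly. This is a minimal sketch, assuming Python 3.8 or later for `math.prod`; the name `arr` is illustrative.

```python
import math
import numpy as np

arr = np.arange(15).reshape(3, 5)

# ndim is the length of the shape tuple.
print(arr.ndim == len(arr.shape))
# size is the product of the entries of shape.
print(arr.size == math.prod(arr.shape))
# itemsize is shorthand for dtype.itemsize.
print(arr.itemsize == arr.dtype.itemsize)
# nbytes is the total number of bytes used by the elements.
print(arr.nbytes == arr.size * arr.itemsize)
```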

In [10]:
# create numpy array
basic_lst = [1, 2, 3, 4]
own_lst = np.array(basic_lst)

own_lst

array([1, 2, 3, 4])

In [11]:
type(own_lst)

numpy.ndarray

### Array Creation

In [12]:
# create array using np.array function
int_lst = np.array([1, 2, 3])
int_lst

array([1, 2, 3])

In [13]:
int_lst.dtype

dtype('int64')

In [14]:
float_lst = np.array([1., 2., 3.])
float_lst

array([1., 2., 3.])

In [15]:
float_lst.dtype

dtype('float64')

A frequent error is calling `array` with multiple arguments, rather than providing a single sequence as an argument.

In [16]:
# WRONG
np.array(1, 2, 3, 4)

TypeError: array() takes from 1 to 2 positional arguments but 4 were given

In [17]:
# RIGHT
np.array([1, 2, 3, 4, 5])

array([1, 2, 3, 4, 5])

`array` transforms sequences of sequences into two-dimensional arrays, sequences of sequences of sequences into three-dimensional arrays, and so on.

In [18]:
# from a list of tuples
lst_2dim = np.array([(1, 2, 3), (4, 5, 6)])
lst_2dim

array([[1, 2, 3],
       [4, 5, 6]])

In [19]:
# from a list of lists
lst_2dim = np.array([[1, 2, 3], [4, 5, 6]])
lst_2dim

array([[1, 2, 3],
       [4, 5, 6]])

In [20]:
lst_2dim.ndim

2
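The cells above cover the two-dimensional case. For the "sequences of sequences of sequences" case, a minimal sketch might look like this; the name `lst_3dim` is hypothetical and only mirrors the naming used above.

```python
import numpy as np

# Three nesting levels produce an array with three axes.
lst_3dim = np.array([[[1, 2], [3, 4]],
                     [[5, 6], [7, 8]],
                     [[9, 10], [11, 12]]])

print(lst_3dim.ndim)
print(lst_3dim.shape)
```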

In [21]:
# The type of the array can also be explicitly specified at creation time.
complex_lst = np.array([1, 2, 3, 4], dtype=complex)
complex_lst

array([1.+0.j, 2.+0.j, 3.+0.j, 4.+0.j])

Often, the elements of an array are originally unknown, but its size is known. Hence, NumPy offers several functions to create arrays with initial placeholder content. These minimize the necessity of growing arrays, an expensive operation.

> `numpy.zeros(shape, dtype=float, order='C', *, like=None)`

Return a new array of given shape and type, filled with zeros.

> `numpy.ones(shape, dtype=None, order='C', *, like=None)`

Return a new array of given shape and type, filled with ones.

> `numpy.empty(shape, dtype=float, order='C', *, like=None)`

Return a new array of given shape and type, without initializing entries **(the content is arbitrary and depends on the state of the memory)**.


_By default, the `dtype` of the created array is `float64`, but it can be specified via the keyword argument `dtype`._

In [27]:
np.zeros([2, 3])
# or
# np.zeros((2, 3))

array([[0., 0., 0.],
       [0., 0., 0.]])

In [28]:
np.ones([3, 4])
# or
# np.ones((3, 4))

array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

In [33]:
np.empty([2, 4])

array([[4.65328105e-310, 0.00000000e+000, 6.93833211e-310,
        6.93838078e-310],
       [6.93838077e-310, 6.93838078e-310, 6.93835596e-310,
        6.93838036e-310]])

In [41]:
# with numpy type
np.empty((2, 4), dtype=np.int8)

array([[-1, -1, -1, -1],
       [-1, -1, -1, -1]], dtype=int8)

In [53]:
# with python type
np.empty([2, 5], dtype=int)

array([[     94183456056560,                   0,                   0,
                          0,                   0],
       [7076336329807914035, 3617566314198085933, 3257002151774073654,
        3618752481072527665,        521388909921]])
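To illustrate why placeholder arrays help avoid growing arrays, the sketch below fills a preallocated array in place and compares it with an array grown by repeated `np.append` calls. The names `squares` and `grown` are illustrative, and the loop is deliberately simple rather than idiomatic.

```python
import numpy as np

n = 5

# Preallocate once, then overwrite every entry in place.
squares = np.empty(n, dtype=np.int64)
for i in range(n):
    squares[i] = i * i

# Growing with np.append copies the whole array on every call.
grown = np.array([], dtype=np.int64)
for i in range(n):
    grown = np.append(grown, i * i)

print(squares)
print(np.array_equal(squares, grown))
```

Both approaches give the same values, but the preallocated version allocates memory only once. Every entry of an `np.empty` array should be overwritten before it is read.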

To create sequences of numbers, NumPy provides the `arange` function which is analogous to the Python built-in `range`, but returns an array.

> `numpy.arange([start, ]stop, [step, ]dtype=None, *, like=None)`

Return evenly spaced values within a given interval.

Values are generated within the half-open interval **[start, stop)** (in other words, the interval including start but excluding stop). For integer arguments the function is equivalent to the Python built-in `range` function, but returns an ndarray rather than a `range` object.

**When using a non-integer step, such as 0.1, the results will often not be consistent. It is better to use** `numpy.linspace` **for these cases.**

> `numpy.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None, axis=0)`

Return evenly spaced numbers over a specified interval.

Returns num evenly spaced samples, calculated over the interval **[start, stop] when endpoint=True, or [start, stop) when endpoint=False**.

In [56]:
np.arange(1, 16)

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15])

In [57]:
np.arange(1, 15, 3)

array([ 1,  4,  7, 10, 13])

In [84]:
# RISKY
# A non-integer step is allowed, but floating-point rounding makes the
# number of elements hard to predict; prefer np.linspace (next cell).
np.arange(1, 5, 0.1)

array([1. , 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2. , 2.1, 2.2,
       2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, 3. , 3.1, 3.2, 3.3, 3.4, 3.5,
       3.6, 3.7, 3.8, 3.9, 4. , 4.1, 4.2, 4.3, 4.4, 4.5, 4.6, 4.7, 4.8,
       4.9])

In [85]:
# GOOD
# with endpoint=False
# [start, stop) 

np.linspace(1, 5, 40, endpoint=False)

array([1. , 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2. , 2.1, 2.2,
       2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, 3. , 3.1, 3.2, 3.3, 3.4, 3.5,
       3.6, 3.7, 3.8, 3.9, 4. , 4.1, 4.2, 4.3, 4.4, 4.5, 4.6, 4.7, 4.8,
       4.9])

In [86]:
# with endpoint=True
# [start, stop] 

np.linspace(1, 5, 41)

array([1. , 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2. , 2.1, 2.2,
       2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, 3. , 3.1, 3.2, 3.3, 3.4, 3.5,
       3.6, 3.7, 3.8, 3.9, 4. , 4.1, 4.2, 4.3, 4.4, 4.5, 4.6, 4.7, 4.8,
       4.9, 5. ])
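The difference between the two functions comes down to how the number of elements is decided. The following sketch, with illustrative variable names, prints the lengths side by side; the length of the `arange` result is left unstated here because it depends on floating-point rounding.

```python
import numpy as np

# With a float step, the length is derived from (stop - start) / step
# in floating point, so it can be hard to predict.
from_arange = np.arange(0.1, 0.4, 0.1)
print(len(from_arange), from_arange)

# linspace takes the number of samples directly, so the length is exact.
from_linspace = np.linspace(0.1, 0.4, 3, endpoint=False)
print(len(from_linspace), from_linspace)

# retstep=True also returns the spacing that linspace computed.
samples, step = np.linspace(1, 5, 41, retstep=True)
print(len(samples), step)
```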

### Printing Arrays

When you print an array, NumPy displays it in a similar way to nested lists, but with the following layout:

* the last axis is printed from left to right,

* the second-to-last is printed from top to bottom,

* the rest are also printed from top to bottom, with each slice separated from the next by an empty line.

One-dimensional arrays are then printed as rows, two-dimensional arrays as matrices, and three-dimensional arrays as lists of matrices.

In [88]:
# 1d array
lst_1d = np.arange(1, 6)
print(lst_1d)

[1 2 3 4 5]


In [92]:
# 2d array
lst_2d = np.arange(12).reshape(4, 3)
print(lst_2d)

[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]]


In [94]:
# 3d array
lst_3d = np.arange(24).reshape(2, 3, 4)
print(lst_3d)

[[[ 0  1  2  3]
  [ 4  5  6  7]
  [ 8  9 10 11]]

 [[12 13 14 15]
  [16 17 18 19]
  [20 21 22 23]]]


If an array is too large to be printed, NumPy automatically skips the central part of the array and only prints the corners:

In [97]:
big_lst = np.arange(10000).reshape(100,100)
big_lst

array([[   0,    1,    2, ...,   97,   98,   99],
       [ 100,  101,  102, ...,  197,  198,  199],
       [ 200,  201,  202, ...,  297,  298,  299],
       ...,
       [9700, 9701, 9702, ..., 9797, 9798, 9799],
       [9800, 9801, 9802, ..., 9897, 9898, 9899],
       [9900, 9901, 9902, ..., 9997, 9998, 9999]])

To disable this behaviour and force NumPy to print the entire array, you can change the printing options using `set_printoptions`.

In [98]:
import sys

# Uncomment to print every element of large arrays (this changes a global setting):
# np.set_printoptions(threshold=sys.maxsize)
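Since `set_printoptions` changes a global setting, a temporary override can be more convenient. The sketch below assumes a NumPy version that provides the `np.printoptions` context manager and uses `np.array2string` to capture the text instead of printing it; the variable names are illustrative.

```python
import sys
import numpy as np

big = np.arange(2000)

# Default options: large arrays are summarized with "...".
short_repr = np.array2string(big)

# Inside the context manager the threshold is raised only temporarily.
with np.printoptions(threshold=sys.maxsize):
    full_repr = np.array2string(big)

print("..." in short_repr)
print("..." in full_repr)
```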

### Basic Operations

Arithmetic operators on arrays apply *elementwise*. A new array is created and filled with the result.



In [101]:
a = np.array([10, 20, 30, 40])
b = np.arange(4)

print(a - b)
print(b ** 2)
print(10 * np.sin(a))
print(a < 21)

[10 19 28 37]
[0 1 4 9]
[-5.44021111  9.12945251 -9.88031624  7.4511316 ]
[ True  True False False]


In [108]:
# Important: `*` is the elementwise product, not the matrix product
A = np.array([[1, 1],
              [0, 1]])

B = np.array([[2, 0],
              [3, 4]])
print('elementwise')
print(A * B)    # elementwise product
print('\nmatrix')
print(A @ B)    # matrix product
print('\nmatrix')
print(A.dot(B)) # the same matrix product via the dot method



elementwise
[[2 0]
 [0 4]]

matrix
[[5 4]
 [3 4]]

matrix
[[5 4]
 [3 4]]
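One practical consequence of the distinction above is that operand order matters for the matrix product but not for the elementwise one. A minimal sketch, reusing the same `A` and `B`:

```python
import numpy as np

A = np.array([[1, 1],
              [0, 1]])
B = np.array([[2, 0],
              [3, 4]])

# The elementwise product does not depend on operand order...
print(np.array_equal(A * B, B * A))
# ...but the matrix product generally does.
print(np.array_equal(A @ B, B @ A))
# np.matmul is the function form of the @ operator.
print(np.array_equal(A @ B, np.matmul(A, B)))
```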


Some operations, such as `+=` and `*=`, act in place to **modify an existing array** rather than create a new one.

In [125]:
a = np.ones([3, 4], dtype=int)
# np.empty leaves b uninitialized, so its starting values below are arbitrary
b = np.empty([3, 4], dtype=np.float16)

In [126]:
print('before:\n', a)
a *= 3
print('\nafter:\n', a)

before:
 [[1 1 1 1]
 [1 1 1 1]
 [1 1 1 1]]

after:
 [[3 3 3 3]
 [3 3 3 3]
 [3 3 3 3]]


In [127]:
print('before:\n', b)
b += a
print('\nafter:\n', b)

before:
 [[ 9.     9.     9.    10.984]
 [ 9.     9.     9.    10.99 ]
 [ 9.     9.     9.    11.   ]]

after:
 [[12.    12.    12.    13.984]
 [12.    12.    12.    13.99 ]
 [12.    12.    12.    14.   ]]


In [128]:
a += b  # b is not automatically converted to integer type

UFuncTypeError: Cannot cast ufunc 'add' output from dtype('float64') to dtype('int64') with casting rule 'same_kind'
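When the error above appears, there are a couple of common ways around it. The sketch below is one possible approach, not the only one; it uses `np.full` to give `b` a known value, and the names `c` and `a_float` are illustrative.

```python
import numpy as np

a = np.ones((3, 4), dtype=int)
b = np.full((3, 4), 0.5)   # float64 array with a known value

try:
    a += b                 # in place: a float result cannot be stored in an int array
except TypeError as exc:   # the casting error is a TypeError subclass
    print("in-place add failed:", type(exc).__name__)

# Option 1: build a new array; the result is upcast to float64.
c = a + b
print(c.dtype)

# Option 2: convert explicitly first, then work in place on the float copy.
a_float = a.astype(np.float64)
a_float += b
print(a_float.dtype)
```

Option 1 allocates a new array, while option 2 keeps the in-place update at the cost of one explicit conversion.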

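When operating on arrays of different types, the type of the resulting array corresponds to the more general or precise one (a behavior known as upcasting).

In [138]: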
a = np.ones(3, dtype=np.int32)
b = np.linspace(0, np.pi, 3)
b.dtype.name

'float64'

In [140]:
c = a + b
c

array([1.        , 2.57079633, 4.14159265])

In [141]:
c.dtype.name

'float64'

In [143]:
d = np.exp(c * 1j)
d

array([ 0.54030231+0.84147098j, -0.84147098+0.54030231j,
       -0.54030231-0.84147098j])

In [144]:
d.dtype.name

'complex128'
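The cells above show the result type moving from `int32` to `float64` to `complex128`. As a rough sketch, `np.result_type` can report the type NumPy would choose for a combination without performing the operation:

```python
import numpy as np

# Combining an integer type with a float type gives a float type.
print(np.result_type(np.int32, np.float64))
# Combining a float type with a complex type gives a complex type.
print(np.result_type(np.float64, np.complex128))

# The same rule applied to actual arrays.
a = np.ones(3, dtype=np.int32)
b = np.linspace(0, np.pi, 3)
print((a + b).dtype == np.result_type(a, b))
```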