Now let's import `numpy`. Most people import it as `np`:

In [1]:
!pip install numpy



In [2]:
import numpy as np

Quick Exercise:
Try importing `pandas` as `pd`, and also import the `math` module.







## `np.zeros`

The `zeros` function creates an array containing any number of zeros:

In [3]:
np.zeros(5)

array([0., 0., 0., 0., 0.])

It's just as easy to create a 2D array (i.e., a matrix) by providing a tuple with the desired number of rows and columns. For example, here's a 3x4 matrix:

In [4]:
np.zeros((3,4))

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

## Some vocabulary

* In NumPy, each dimension is called an **axis**.
* The number of axes is called the **rank**.
    * For example, the above 3x4 matrix is an array of rank 2 (it is 2-dimensional).
    * The first axis has length 3, the second has length 4.
* An array's list of axis lengths is called the **shape** of the array.
    * For example, the above matrix's shape is `(3, 4)`.
    * The rank is equal to the shape's length.
* The **size** of an array is the total number of elements, which is the product of all axis lengths (e.g., 3*4=12).

In [5]:
a = np.zeros((3,4))
a

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [6]:
a.shape

(3, 4)

In [7]:
a.ndim  # equal to len(a.shape)

2

In [8]:
a.size

12

## N-dimensional arrays
You can also create an N-dimensional array of arbitrary rank. For example, here's a 3D array (rank=3), with shape `(2,3,4)`:

In [10]:
np.zeros((2,3,4))

array([[[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]],

       [[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]]])
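To connect the vocabulary introduced earlier with this 3D array, here is a minimal sketch that checks its rank, shape and size. The variable name `t` is only illustrative.

```python
import numpy as np

t = np.zeros((2, 3, 4))  # 2 matrices, each with 3 rows and 4 columns

print(t.ndim)   # number of axes (rank): 3
print(t.shape)  # length of each axis: (2, 3, 4)
print(t.size)   # total number of elements: 2 * 3 * 4 = 24
```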

## Array type
NumPy arrays have the type `ndarray`:

In [11]:
type(np.zeros((3,4)))

numpy.ndarray

## `np.ones`
Many other NumPy functions create `ndarrays`.

Here's a 3x4 matrix full of ones:

In [12]:
np.ones((3,4))

array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

## `np.full`
Creates an array of the given shape initialized with the given value. Here's a 3x4 matrix full of `5`s:

In [15]:
np.full((3,4), 5)

array([[5, 5, 5, 5],
       [5, 5, 5, 5],
       [5, 5, 5, 5]])
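When no `dtype` is passed, `np.full` takes the data type from the fill value, which is why the output above contains integers rather than floats. The sketch below compares an integer fill with a float fill (`np.pi`); the variable names are illustrative.

```python
import numpy as np

int_arr = np.full((3, 4), 5)        # integer fill value gives an integer array
float_arr = np.full((3, 4), np.pi)  # float fill value gives a float64 array

print(int_arr.dtype, float_arr.dtype)
print(float_arr[0, 0])
```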

## `np.empty`
An uninitialized 2x3 array (its content is not predictable, as it is whatever is in memory at that point):

In [16]:
np.empty((2,3))

array([[0., 0., 0.],
       [0., 0., 0.]])
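The zeros shown here are incidental, so an array from `np.empty` should be fully written before it is read. A small sketch of that pattern, with `buf` as an illustrative name:

```python
import numpy as np

buf = np.empty((2, 3))  # contents are arbitrary at this point
buf[:] = 7.0            # overwrite every element before reading anything
print(buf)
```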

## `np.array`
Of course you can initialize an `ndarray` using a regular Python list (or a list of lists). Just call the `array` function:

In [17]:
np.array([[1,2,3,4], [10, 20, 30, 40]])

array([[ 1,  2,  3,  4],
       [10, 20, 30, 40]])

## `np.arange`
You can create an `ndarray` using NumPy's `arange` function, which is similar to Python's built-in `range` function:

In [18]:
np.arange(1, 5)

array([1, 2, 3, 4])

It also works with floats:

In [20]:
np.arange(1.0, 5.0)

array([1., 2., 3., 4.])

Of course you can provide a step parameter:

In [22]:
np.arange(1, 5, 0.5)

array([1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5])

However, when dealing with floats, the exact number of elements in the array is not always predictable. For example, consider this:

In [23]:
print(np.arange(0, 5/3, 1/3)) # depending on floating point errors, the max value is 4/3 or 5/3.
print(np.arange(0, 5/3, 0.333333333))
print(np.arange(0, 5/3, 0.333333334))


[0.         0.33333333 0.66666667 1.         1.33333333 1.66666667]
[0.         0.33333333 0.66666667 1.         1.33333333 1.66666667]
[0.         0.33333333 0.66666667 1.         1.33333334]
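When the number of elements matters more than the exact step, one common alternative is `np.linspace`, which takes the number of points instead of a step and includes the endpoint by default. A minimal sketch, assuming you want six values from 0 to 5/3:

```python
import numpy as np

# Ask for exactly 6 evenly spaced values from 0 to 5/3 (endpoint included).
pts = np.linspace(0, 5/3, 6)
print(pts)
print(len(pts))  # always 6, regardless of floating point rounding
```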


## `np.random.rand` and `np.random.randn` and `np.random.randint`
A number of functions are available in NumPy's `random` module to create `ndarray`s initialized with random values.

In [25]:
np.random.rand(3,4)  # uniform random floats in [0, 1)

array([[0.03387914, 0.34610383, 0.7021621 , 0.30737065],
       [0.74307006, 0.97152655, 0.94129165, 0.35018269],
       [0.91622479, 0.39979979, 0.20906984, 0.82966277]])

Here's a 3x4 matrix containing random floats sampled from a univariate [normal distribution](https://en.wikipedia.org/wiki/Normal_distribution) (Gaussian distribution) of mean 0 and variance 1:

In [26]:
np.random.randn(3,4)

array([[ 0.01333425, -0.43165955, -0.11339523, -0.76988188],
       [-1.62194688, -1.30028136, -1.82880316,  0.03755902],
       [-1.35411059, -1.58069615, -0.46310323,  0.81808936]])

In [28]:
np.random.randint(0,10, size=5)

array([9, 5, 6, 3, 5])
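These functions return different values on every run. If you need repeatable results, one option is to seed the generator with `np.random.seed` first. The sketch below uses an arbitrary seed of 42 and also checks that the upper bound of `randint` is exclusive.

```python
import numpy as np

np.random.seed(42)              # 42 is an arbitrary seed chosen for illustration
first = np.random.rand(3, 4)

np.random.seed(42)              # same seed, so the same numbers come out again
second = np.random.rand(3, 4)

print(np.array_equal(first, second))

ints = np.random.randint(0, 10, size=1000)
print(ints.min() >= 0, ints.max() <= 9)  # values lie in [0, 10)
```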

Quick Exercise:


1. Create a random array of rank 2.

2. Create an array containing 10 ones that has 4 dimensions (rank 4).

3. Create a random integer array of shape (5,10).

In [29]:
np.random.rand(3,5)

array([[0.49786015, 0.19260779, 0.83646785, 0.25208959, 0.58823444],
       [0.20141598, 0.95581556, 0.31643211, 0.34734558, 0.73222253],
       [0.66520761, 0.4024246 , 0.2332443 , 0.82242101, 0.87630087]])

In [32]:
np.ones((2,5,1,1))

array([[[[1.]],

        [[1.]],

        [[1.]],

        [[1.]],

        [[1.]]],


       [[[1.]],

        [[1.]],

        [[1.]],

        [[1.]],

        [[1.]]]])

In [33]:
np.random.randint(0,20, size=(5,10))

array([[ 6, 10,  8,  0, 10, 14,  8,  8, 13,  7],
       [11, 18,  8,  6, 16,  0, 11,  9,  8, 14],
       [ 7,  3,  5, 15,  7,  4, 16,  9, 12, 14],
       [ 0,  6, 12, 12,  7,  1, 17,  2, 14, 10],
       [10,  7,  4, 15,  7, 13,  6, 11, 13, 16]])

# Array data
## `dtype`
NumPy's `ndarray`s are also efficient in part because all their elements must have the same type (usually numbers).
You can check what the data type is by looking at the `dtype` attribute:

In [35]:
c = np.arange(1, 5)
print(c.dtype, c)

int32 [1 2 3 4]


In [39]:
c = np.arange(1.0, 5)
print(c.dtype, c)

float64 [1. 2. 3. 4.]


Instead of letting NumPy guess what data type to use, you can set it explicitly when creating an array by setting the `dtype` parameter:

In [38]:
d = np.arange(1, 5, dtype=np.complex64)
print(d.dtype, d)

complex64 [1.+0.j 2.+0.j 3.+0.j 4.+0.j]
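The data type can also be changed after creation. As a small sketch, `astype` returns a converted copy and `itemsize` reports the number of bytes per element; the names `ints` and `floats32` are illustrative.

```python
import numpy as np

ints = np.arange(1, 5)
floats32 = ints.astype(np.float32)  # astype returns a converted copy

print(floats32.dtype, floats32)
print(floats32.itemsize)            # bytes per element: 4 for float32
print(ints.dtype)                   # the original array keeps its integer dtype
```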


# Arithmetic operations
All the usual arithmetic operators (`+`, `-`, `*`, `/`, `//`, `**`, etc.) can be used with `ndarray`s. They apply *elementwise*:

In [41]:
a = np.array([14, 23, 32, 41])
b = np.array([5,  4,  3,  2])
print("a + b  =", a + b)
print("a - b  =", a - b)
print("a * b  =", a * b)
print("a / b  =", a / b)
print("a // b  =", a // b)
print("a % b  =", a % b)
print("a ** b =", a ** b)

a + b  = [19 27 35 43]
a - b  = [ 9 19 29 39]
a * b  = [70 92 96 82]
a / b  = [ 2.8         5.75       10.66666667 20.5       ]
a // b  = [ 2  5 10 20]
a % b  = [4 3 2 1]
a ** b = [537824 279841  32768   1681]


Note that the multiplication is *not* a matrix multiplication. We will discuss matrix operations below.

# Conditional operators

The conditional operators also apply elementwise:

In [42]:
m = np.array([20, -5, 30, 40])
m < [15, 16, 35, 36]

array([False,  True,  True, False])

And using broadcasting:

In [43]:
m < 25  # equivalent to m < [25, 25, 25, 25]

array([ True,  True, False, False])

This is most useful in conjunction with boolean indexing (discussed below).

In [44]:
m[m < 25]

array([20, -5])
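Several conditions can be combined into one mask with the elementwise operators `&` (and) and `|` (or). A minimal sketch using the same values as `m`; note that each comparison needs its own parentheses because `&` binds more tightly than `<` and `>`.

```python
import numpy as np

m = np.array([20, -5, 30, 40])

mask = (m > 0) & (m < 35)  # parentheses are required around each comparison
print(mask)                # [ True False  True False]
print(m[mask])             # [20 30]
```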

# Mathematical and statistical functions

Many mathematical and statistical functions are available for `ndarray`s.

## `ndarray` methods
Some functions are simply `ndarray` methods, for example:

In [3]:
a = np.array([[-2.5, 3.1, 7], [10, 11, 12]])
print(a)
print("mean =", a.mean())

[[-2.5  3.1  7. ]
 [10.  11.  12. ]]
mean = 6.766666666666667


Note that this computes the mean of all elements in the `ndarray`, regardless of its shape.

Here are a few more useful `ndarray` methods:

In [4]:
tup = (a.min, a.max, a.sum, a.prod, a.std, a.var)
for func in tup:
    print(func.__name__, "=", func())

min = -2.5
max = 12.0
sum = 40.6
prod = -71610.0
std = 5.084835843520964
var = 25.855555555555554


These functions accept an optional argument `axis` which lets you ask for the operation to be performed on elements along the given axis. For example:

In [5]:
c=np.arange(24).reshape(2,3,4)
c

array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])

In [6]:
c.sum(axis=0)  # sum across matrices

array([[12, 14, 16, 18],
       [20, 22, 24, 26],
       [28, 30, 32, 34]])

In [7]:
c.sum(axis=1)  # sum across rows

array([[12, 15, 18, 21],
       [48, 51, 54, 57]])

You can also sum over multiple axes:

In [None]:
c.sum(axis=(0,2))  # sum across matrices and columns

In [None]:
0+1+2+3 + 12+13+14+15, 4+5+6+7 + 16+17+18+19, 8+9+10+11 + 20+21+22+23
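One way to keep track of what `axis` does is to look at the shape of the result: the axes you sum over disappear and the others remain. The sketch below also shows the optional `keepdims` argument, which keeps the collapsed axes with length 1.

```python
import numpy as np

c = np.arange(24).reshape(2, 3, 4)

s = c.sum(axis=(0, 2))                 # axes 0 and 2 are collapsed, axis 1 remains
print(s.shape, s)                      # (3,) [ 60  92 124]

k = c.sum(axis=(0, 2), keepdims=True)  # collapsed axes are kept with length 1
print(k.shape)                         # (1, 3, 1)
```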

## Universal functions
NumPy also provides fast elementwise functions called *universal functions*, or **ufunc**. They are vectorized wrappers of simple functions. For example `square` returns a new `ndarray` which is a copy of the original `ndarray` except that each element is squared:

In [None]:
a = np.array([[-2.5, 3.1, 7], [10, 11, 12]])
np.square(a)

Here are a few more useful unary ufuncs:

In [8]:
print("Original ndarray")
print(a)
tup = (np.abs, np.sqrt, np.exp, np.log, np.sign, np.ceil, np.modf, np.isnan, np.cos)
for func in tup:
    print("\n", func.__name__)
    print(func(a))

Original ndarray
[[-2.5  3.1  7. ]
 [10.  11.  12. ]]

 absolute
[[ 2.5  3.1  7. ]
 [10.  11.  12. ]]

 sqrt
[[       nan 1.76068169 2.64575131]
 [3.16227766 3.31662479 3.46410162]]

 exp
[[8.20849986e-02 2.21979513e+01 1.09663316e+03]
 [2.20264658e+04 5.98741417e+04 1.62754791e+05]]

 log
[[       nan 1.13140211 1.94591015]
 [2.30258509 2.39789527 2.48490665]]

 sign
[[-1.  1.  1.]
 [ 1.  1.  1.]]

 ceil
[[-2.  4.  7.]
 [10. 11. 12.]]

 modf
(array([[-0.5,  0.1,  0. ],
       [ 0. ,  0. ,  0. ]]), array([[-2.,  3.,  7.],
       [10., 11., 12.]]))

 isnan
[[False False False]
 [False False False]]

 cos
[[-0.80114362 -0.99913515  0.75390225]
 [-0.83907153  0.0044257   0.84385396]]
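The `nan` entries in the `sqrt` and `log` outputs come from the negative element `-2.5`, and NumPy also emits a `RuntimeWarning` in that case. If the `nan` is expected, one option is to silence the warning locally with `np.errstate`. A minimal sketch:

```python
import numpy as np

a = np.array([[-2.5, 3.1, 7], [10, 11, 12]])

# Silence the "invalid value" warning inside this block only.
with np.errstate(invalid="ignore"):
    roots = np.sqrt(a)

print(np.isnan(roots))  # True only where the input was negative
```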




## Binary ufuncs
There are also many binary ufuncs that apply elementwise to two `ndarray`s.

In [1]:
import numpy as np
a = np.array([1, -2, 3, 4])
b = np.array([2, 8, -1, 7])
np.add(a, b)  # equivalent to a + b

array([ 3,  6,  2, 11])

In [2]:
np.greater(a, b)  # equivalent to a > b

array([False, False,  True, False])

In [3]:
np.maximum(a, b)

array([2, 8, 3, 7])

In [4]:
np.copysign(a, b)

array([ 1.,  2., -3.,  4.])

# Array indexing
## One-dimensional arrays
One-dimensional NumPy arrays can be accessed more or less like regular Python lists:

In [None]:
a = np.array([1, 5, 3, 19, 13, 7, 3])
a[3]

In [None]:
a[2:5]

In [None]:
a[2:-1]

In [None]:
a[:2]

In [None]:
a[2::2]

In [None]:
a[::-1]

Of course, you can modify elements:

In [None]:
a[3]=999
a

You can also modify an `ndarray` slice:

In [None]:
a[2:5] = [997, 998, 999]
a

## Differences from regular Python lists
Unlike regular Python lists, if you assign a single value to an `ndarray` slice, it is copied across the whole slice.

In [None]:
a[2:5] = -1
a

Also, you cannot grow or shrink `ndarray`s this way:

In [None]:
try:
    a[2:5] = [1,2,3,4,5,6]  # too long
except ValueError as e:
    print(e)

You cannot delete elements either:

In [None]:
try:
    del a[2:5]
except ValueError as e:
    print(e)
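Since an `ndarray` cannot change size in place, functions such as `np.delete` and `np.append` return a new array and leave the original untouched. A small sketch, using a separate array named `arr` so that `a` is not affected:

```python
import numpy as np

arr = np.array([1, 5, 3, 19, 13, 7, 3])

shorter = np.delete(arr, [2, 3, 4])  # new array without indices 2, 3 and 4
longer = np.append(arr, [100, 200])  # new array with two extra elements

print(shorter)  # [1 5 7 3]
print(longer)   # [  1   5   3  19  13   7   3 100 200]
print(arr)      # the original array is unchanged
```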

Last but not least, `ndarray` **slices are actually *views*** on the same data buffer. This means that if you create a slice and modify it, you are actually going to modify the original `ndarray` as well!

In [None]:
a_slice = a[2:6]
a_slice[1] = 1000
a  # the original array was modified!

In [None]:
a[3] = 2000
a_slice  # similarly, modifying the original array modifies the slice!

If you want a copy of the data, you need to use the `copy` method:

In [None]:
another_slice = a[2:6].copy()
another_slice[1] = 3000
a  # the original array is untouched

In [None]:
a[3] = 4000
another_slice  # similarly, modifying the original array does not affect the slice copy
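If you are unsure whether two arrays share the same data buffer, `np.shares_memory` and the `base` attribute can help. The sketch below uses illustrative names (`arr`, `view`, `dup`) rather than the variables above.

```python
import numpy as np

arr = np.array([1, 5, 3, 19, 13, 7, 3])
view = arr[2:6]         # a slice is a view on the same buffer
dup = arr[2:6].copy()   # an independent copy

print(np.shares_memory(arr, view))  # True
print(np.shares_memory(arr, dup))   # False
print(view.base is arr)             # True: the view refers back to arr
print(dup.base is None)             # True: the copy owns its data
```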