# NumPy

NumPy is the fundamental package for scientific computing in Python.

In [3]:
import numpy as np
np.__version__

'1.18.1'

## The Basics

NumPy’s main object is the homogeneous multidimensional array. It is a table of elements (usually numbers), all of the same type, indexed by a tuple of non-negative integers. In NumPy, dimensions are called axes.

For example, the array holding the coordinates of a point in 3D space, [1, 2, 1], has one axis. That axis has 3 elements in it, so we say it has a length of 3.
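
A minimal sketch of the axis terminology. The `point` array is the one from the text; the two-axis `grid` array is an illustrative addition, not part of the original notebook:

```python
import numpy as np

point = np.array([1, 2, 1])          # one axis of length 3
grid = np.array([[1., 0., 0.],
                 [0., 1., 2.]])      # illustrative: two axes, lengths 2 and 3

print(point.ndim, point.shape)       # 1 (3,)
print(grid.ndim, grid.shape)         # 2 (2, 3)
```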

### Array Creation

There are several ways to create arrays.

1. For example, you can create an array from a regular Python list or tuple using the array function. The type of the resulting array is deduced from the type of the elements in the sequences.

In [4]:
a=np.array([2,3,4])
a

array([2, 3, 4])

NOTICE:

A frequent error is calling array with multiple numeric arguments instead of a single list (or other sequence) of numbers.

In [5]:
#a = np.array(1,2,3,4)    # WRONG
a = np.array([1,2,3,4])  # RIGHT
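
To see what the wrong call actually does, the sketch below catches the exception it raises. The exception type and message are an assumption that depends on the NumPy version, so both `TypeError` and `ValueError` are caught:

```python
import numpy as np

error = None
try:
    np.array(1, 2, 3, 4)             # WRONG: four separate positional arguments
except (TypeError, ValueError) as exc:
    # The exact exception type and message depend on the NumPy version.
    error = exc
    print(type(exc).__name__)

a = np.array([1, 2, 3, 4])           # RIGHT: one sequence argument
print(a.shape)                       # (4,)
```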

2D array:

In [6]:
b=np.array([[1,2,3],
            [4,5,6]])
b

array([[1, 2, 3],
       [4, 5, 6]])

Another 2D array (3 rows, 3 columns):

In [7]:
c=np.array([[1,2,3],
            [4,5,6],
            [7,8,9]])
c

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

The type of the array can also be explicitly specified at creation time:

In [8]:
d = np.array( [ [1,2], [3,4] ], dtype=complex )
d

array([[1.+0.j, 2.+0.j],
       [3.+0.j, 4.+0.j]])

In [9]:
e = np.array( [ [1,2], [3,4] ], dtype='float32' )
e

array([[1., 2.],
       [3., 4.]], dtype=float32)

2. To create sequences of numbers, NumPy provides a function analogous to range that returns arrays instead of lists.

In [10]:
np.arange( 10, 30, 5 )

array([10, 15, 20, 25])

It also accepts float arguments:

In [11]:
np.arange( 0, 2, 0.3 )

array([0. , 0.3, 0.6, 0.9, 1.2, 1.5, 1.8])

3. When arange is used with floating point arguments, it is generally not possible to predict the number of elements obtained, due to the finite floating point precision. For this reason, it is usually better to use the function linspace that receives as an argument the number of elements that we want, instead of the step:

In [12]:
np.linspace( 0, 2, 9 )                 # 9 numbers from 0 to 2

array([0.  , 0.25, 0.5 , 0.75, 1.  , 1.25, 1.5 , 1.75, 2.  ])
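
The contrast between the two functions can be sketched as follows. The interval 1.0 to 1.3 is an illustrative choice; the length of the `arange` result is printed rather than asserted, because it depends on floating point rounding:

```python
import numpy as np

# With a float step, the element count follows from floating-point division,
# so the stop value may or may not be excluded as expected.
stepped = np.arange(1.0, 1.3, 0.1)
print(len(stepped), stepped)

# linspace takes the element count directly and includes both endpoints.
spaced = np.linspace(1.0, 1.3, 4)
print(len(spaced), spaced)
```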

In [13]:
from numpy import pi
x = np.linspace( 0, 2*pi, 100 )        # useful to evaluate function at lots of points
f = np.sin(x)

To set the display precision of an array:

In [14]:
a=np.random.random((2,3))
a

array([[0.81621903, 0.66408771, 0.50022361],
       [0.65540712, 0.00284629, 0.58907271]])

In [15]:
np.set_printoptions(precision=1,suppress=True)
a

array([[0.8, 0.7, 0.5],
       [0.7, 0. , 0.6]])
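
Print options change only how values are displayed, not what is stored. A small sketch of that distinction, using the `np.printoptions` context manager (available in NumPy 1.15 and later) so the setting is undone afterwards; the sample values are copied from the output above:

```python
import numpy as np

a = np.array([0.81621903, 0.66408771, 0.00284629])

with np.printoptions(precision=1, suppress=True):
    shown = str(a)               # rounded only for display
print(shown)
print(a[0])                      # the stored value keeps full precision
```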

### Array Attributes

NumPy’s array class is called ndarray. It is also known by the alias array. Note that numpy.array is not the same as the Standard Python Library class array.array, which only handles one-dimensional arrays and offers less functionality. The more important attributes of an ndarray object are described below, using the following example arrays:

In [16]:
a=np.arange(6)
a

array([0, 1, 2, 3, 4, 5])

In [17]:
b=np.arange(10,30,5).reshape(2,2)
b

array([[10, 15],
       [20, 25]])

In [18]:
c = np.arange(15).reshape(3, 5)
c

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

#### ndarray.ndim

the number of axes (dimensions) of the array.

In [20]:
b.ndim

2

#### ndarray.shape

the dimensions of the array. This is a tuple of integers indicating the size of the array in each dimension. For a matrix with n rows and m columns, shape will be (n,m). The length of the shape tuple is therefore the number of axes, ndim.

In [21]:
c.shape

(3, 5)

#### ndarray.size

the total number of elements of the array. This is equal to the product of the elements of shape.

In [60]:
c.size

15
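
The relationships between ndim, shape, and size stated above can be checked directly. A minimal sketch using the same 3 x 5 array:

```python
import numpy as np

c = np.arange(15).reshape(3, 5)

print(c.ndim == len(c.shape))          # True: one shape entry per axis
print(c.size == np.prod(c.shape))      # True: 3 * 5 == 15
```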

#### ndarray.dtype

an object describing the type of the elements in the array. One can create or specify dtype’s using standard Python types. Additionally NumPy provides types of its own. numpy.int32, numpy.int16, and numpy.float64 are some examples.

In [61]:
c.dtype

dtype('int64')

#### ndarray.itemsize

the size in bytes of each element of the array. For example, an array of elements of type float64 has itemsize 8 (=64/8), while one of type int32 has itemsize 4 (=32/8). It is equivalent to ndarray.dtype.itemsize.

In [62]:
c.itemsize

8
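
As a rough illustration, the loop below tabulates itemsize for a few dtypes. The dtype list and the array length of 10 are arbitrary choices; `nbytes` is shown alongside because it equals `size * itemsize`:

```python
import numpy as np

sizes = {}
for name in ("int16", "int32", "float64", "complex128"):
    arr = np.zeros(10, dtype=name)
    # itemsize is the dtype's width in bytes; nbytes covers the whole buffer.
    sizes[name] = (arr.itemsize, arr.nbytes)

print(sizes)
```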

#### ndarray.data

the buffer containing the actual elements of the array. Normally, we won’t need to use this attribute because we will access the elements in an array using indexing facilities.

In [63]:
a.data

<memory at 0x7f00643c7a08>

More examples:

#### 0D tensor: Scalar

In [64]:
X0 = np.array(12)
print("X0 dimensions: ", X0.ndim)
print("X0 shape: ", X0.shape)
print("X0 type: ", X0.dtype)

X0 dimensions:  0
X0 shape:  ()
X0 type:  int64


#### 1D tensor: Vector

In [65]:
X1 = np.array([12.5,3,6.4,4])
print("X1 dimensions: ", X1.ndim)
print("X1 shape: ", X1.shape)
print("X1 type: ", X1.dtype)

X1 dimensions:  1
X1 shape:  (4,)
X1 type:  float64


#### 2D tensor: Matrix

In [66]:
X2 = np.array([[1,3,6,4],
              [3,43,1,2],
              [14,5,7,4]])
print("X2 dimensions: ", X2.ndim)
print("X2 shape: ", X2.shape)
print("X2 type: ", X2.dtype)

X2 dimensions:  2
X2 shape:  (3, 4)
X2 type:  int64


### Random Functions

In [9]:
a=np.random.random((2,3))
a

array([[0.28, 0.25, 0.8 ],
       [0.16, 0.73, 0.43]])

In [10]:
a=np.random.randint(1,10,5)
a

array([6, 6, 9, 8, 2])

### Array creation routines

Often, the elements of an array are originally unknown, but its size is known. Hence, NumPy offers several functions to create arrays with initial placeholder content. These minimize the necessity of growing arrays, an expensive operation.

The function zeros creates an array full of zeros, the function ones creates an array full of ones, and the function empty creates an array whose initial content is random and depends on the state of the memory. By default, the dtype of the created array is float64.

In [67]:
np.zeros( (3,4) )

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [68]:
np.ones( (2,3,4), dtype=np.int16 )                # dtype can also be specified

array([[[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]],

       [[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]]], dtype=int16)

In [69]:
np.empty( (2,3) )                                 # uninitialized, output may vary

array([[4.67718176e-310, 0.00000000e+000, 0.00000000e+000],
       [0.00000000e+000, 0.00000000e+000, 0.00000000e+000]])
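
One way to use these placeholder arrays is to allocate once and then fill in the values, instead of growing an array element by element. A minimal sketch, with the squares computation chosen only for illustration:

```python
import numpy as np

n = 5
squares = np.empty(n, dtype=np.int64)   # contents are arbitrary until assigned
for i in range(n):
    squares[i] = i * i                   # fill every slot before reading
print(squares)                           # [ 0  1  4  9 16]
```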

### Printing Arrays

When you print an array, NumPy displays it in a similar way to nested lists, but with the following layout:

- the last axis is printed from left to right,
- the second-to-last is printed from top to bottom,
- the rest are also printed from top to bottom, with each slice separated from the next by an empty line.

One-dimensional arrays are then printed as rows, bidimensionals as matrices and tridimensionals as lists of matrices.

In [70]:
a = np.arange(6)                         # 1d array
print(a)

[0 1 2 3 4 5]


In [71]:
b = np.arange(12).reshape(4,3)           # 2d array
print(b)

[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]]


In [72]:
c = np.arange(24).reshape(2,3,4)         # 3d array
print(c)

[[[ 0  1  2  3]
  [ 4  5  6  7]
  [ 8  9 10 11]]

 [[12 13 14 15]
  [16 17 18 19]
  [20 21 22 23]]]


If an array is too large to be printed, NumPy automatically skips the central part of the array and only prints the corners:

In [73]:
print(np.arange(10000))

[   0    1    2 ... 9997 9998 9999]


In [74]:
print(np.arange(10000).reshape(100,100))

[[   0    1    2 ...   97   98   99]
 [ 100  101  102 ...  197  198  199]
 [ 200  201  202 ...  297  298  299]
 ...
 [9700 9701 9702 ... 9797 9798 9799]
 [9800 9801 9802 ... 9897 9898 9899]
 [9900 9901 9902 ... 9997 9998 9999]]


To disable this behaviour and force NumPy to print the entire array, you can change the printing options using <span style="color:red">set_printoptions</span>.

In [75]:
#import sys
#np.set_printoptions(threshold=sys.maxsize)       # sys module should be imported
#np.set_printoptions(threshold=np.inf)
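
If the full printout is needed only in one place, a scoped alternative is the `np.printoptions` context manager (NumPy 1.15 and later), which restores the previous settings on exit. A sketch, assuming the default threshold of 1000 elements is in effect:

```python
import sys
import numpy as np

big = np.arange(2000)

summarized = str(big)                          # default threshold is 1000 elements
with np.printoptions(threshold=sys.maxsize):   # restored on exiting the block
    full = str(big)

print("..." in summarized, "..." in full)      # True False
```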

### Universal Functions

NumPy provides familiar mathematical functions such as sin, cos, and exp. In NumPy, these are called “<span style="color:red">universal functions</span>” (ufunc). Within NumPy, these functions operate elementwise on an array, producing an array as output.

In [76]:
B = np.arange(3)
B

array([0, 1, 2])

In [77]:
np.exp(B)

array([1.        , 2.71828183, 7.3890561 ])

In [78]:
np.sqrt(B)

array([0.        , 1.        , 1.41421356])

In [79]:
np.sin(B)

array([0.        , 0.84147098, 0.90929743])

In [80]:
C = np.array([2., -1., 4.])
C

array([ 2., -1.,  4.])

In [81]:
np.add(B, C)

array([2., 0., 6.])
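
The arithmetic operators on arrays are backed by the same ufuncs. A short check, reusing the arrays `B` and `C` from above:

```python
import numpy as np

B = np.arange(3)
C = np.array([2., -1., 4.])

print(isinstance(np.add, np.ufunc))              # True
print(np.array_equal(np.add(B, C), B + C))       # True: + gives the same result as np.add
```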

### Basic Operations

Arithmetic operators on arrays apply elementwise. A new array is created and filled with the result.

In [82]:
a = np.array( [20,30,40,50] )
a

array([20, 30, 40, 50])

In [83]:
b = np.arange( 4 )
b

array([0, 1, 2, 3])

In [84]:
c=a-b
c

array([20, 29, 38, 47])

In [85]:
b**2

array([0, 1, 4, 9])

In [86]:
10*np.sin(a)

array([ 9.12945251, -9.88031624,  7.4511316 , -2.62374854])

In [87]:
a<35

array([ True,  True, False, False])

Unlike in many matrix languages, the product operator * operates elementwise in NumPy arrays. The matrix product can be performed using the @ operator (in Python >= 3.5) or the dot function or method:

In [88]:
A = np.array( [[1,1],
               [0,1]] )
B = np.array( [[2,0],
               [3,4]] )

Elementwise product:

In [89]:
A * B

array([[2, 0],
       [0, 4]])

Matrix product:

In [90]:
A@B

array([[5, 4],
       [3, 4]])

Another matrix product:

In [91]:
np.dot(A,B)     #or A.dot(B)

array([[5, 4],
       [3, 4]])

Some operations, such as += and *=, act in place to modify an existing array rather than create a new one.

In [92]:
a = np.ones((2,3), dtype=int)
b = np.random.random((2,3))

In [93]:
a *= 3
a

array([[3, 3, 3],
       [3, 3, 3]])

In [94]:
b += a
b

array([[3.95049525, 3.88305307, 3.27878775],
       [3.19956348, 3.10677591, 3.25301166]])

In [95]:
a += b                  # b is not automatically converted to integer type

UFuncTypeError: Cannot cast ufunc 'add' output from dtype('float64') to dtype('int64') with casting rule 'same_kind'
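
Two possible ways around this error are sketched below, depending on what is wanted: build a new float array, or explicitly allow the lossy cast. The array values are illustrative, and `casting="unsafe"` silently drops the fractional part, so it should be a deliberate choice:

```python
import numpy as np

a = np.full((2, 3), 3)                 # integer array
b = np.full((2, 3), 0.5)               # float array

total = a + b                          # new float64 array; a is untouched
np.add(a, b, out=a, casting="unsafe")  # in place; fractional part is discarded
print(total.dtype, a.dtype)
```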

When operating with arrays of different types, the type of the resulting array corresponds to the more general or precise one (a behavior known as upcasting).

In [96]:
a = np.ones(3, dtype=np.int32)
b = np.linspace(0,pi,3)

In [98]:
b.dtype.name

'float64'

In [99]:
c = a+b
c

array([1.        , 2.57079633, 4.14159265])

In [104]:
c.dtype.name

'float64'

In [105]:
d = np.exp(c*1j)
d

array([ 0.54030231+0.84147098j, -0.84147098+0.54030231j,
       -0.54030231-0.84147098j])

In [106]:
d.dtype.name

'complex128'
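
The result type of a mixed operation can also be queried without building any arrays, using `np.result_type`. A small sketch mirroring the int32/float64/complex chain above:

```python
import numpy as np

print(np.result_type(np.int32, np.float64))        # float64
print(np.result_type(np.float64, np.complex128))   # complex128
```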

Many unary operations, such as computing the sum of all the elements in the array, are implemented as methods of the ndarray class.

In [107]:
a = np.random.random((2,3))
a

array([[0.29311509, 0.93323196, 0.07294179],
       [0.78993439, 0.11464553, 0.15210529]])

In [108]:
a.sum()

2.355974058263352

In [109]:
a.min()

0.07294178953971897

In [110]:
a.max()

0.9332319630220062

In [112]:
a.mean()

0.392662343043892

In [113]:
a.var()

0.11622208769246933

In [121]:
a.std()               #standard deviation

0.3409136073735827

By default, these operations apply to the array as though it were a list of numbers, regardless of its shape. However, by specifying the axis parameter you can apply an operation along the specified axis of an array:

In [122]:
b = np.arange(12).reshape(3,4)
b

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [123]:
b.sum(axis=0)                            # sum of each column

array([12, 15, 18, 21])

In [124]:
b.min(axis=1)                            # min of each row

array([0, 4, 8])

In [125]:
b.cumsum(axis=1)                         # cumulative sum along each row

array([[ 0,  1,  3,  6],
       [ 4,  9, 15, 22],
       [ 8, 17, 27, 38]])
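
A way to read the axis parameter is that the named axis is the one that disappears from a reduction's result. A sketch on the same 3 x 4 array:

```python
import numpy as np

b = np.arange(12).reshape(3, 4)

print(b.sum().shape)          # (): every element reduced to one scalar
print(b.sum(axis=0).shape)    # (4,): axis 0 (rows) collapsed, one value per column
print(b.sum(axis=1).shape)    # (3,): axis 1 (columns) collapsed, one value per row
```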

## Shape Manipulation

### Changing the shape of an array

An array has a shape given by the number of elements along each axis:

In [2]:
a = np.floor(10*np.random.random((3,4)))
a

array([[2., 3., 5., 7.],
       [5., 3., 2., 7.],
       [7., 6., 3., 0.]])

In [3]:
a.shape

(3, 4)

The shape of an array can be changed with various commands. Note that the following three commands all return a modified array, but do not change the original array:

In [4]:
a.ravel()  # returns the array, flattened

array([2., 3., 5., 7., 5., 3., 2., 7., 7., 6., 3., 0.])

In [5]:
a.reshape(6,2)  # returns the array with a modified shape

array([[2., 3.],
       [5., 7.],
       [5., 3.],
       [2., 7.],
       [7., 6.],
       [3., 0.]])

In [6]:
a.T  # returns the array, transposed

array([[2., 5., 7.],
       [3., 3., 6.],
       [5., 2., 3.],
       [7., 7., 0.]])

In [7]:
a.shape

(3, 4)

In [8]:
a.T.shape

(4, 3)

The <span style="color:red">reshape</span> function returns its argument with a modified shape, whereas the <span style="color:red">ndarray.resize</span> method modifies the array itself:

In [9]:
a

array([[2., 3., 5., 7.],
       [5., 3., 2., 7.],
       [7., 6., 3., 0.]])

In [11]:
a.resize((2,6))
a

array([[2., 3., 5., 7., 5., 3.],
       [2., 7., 7., 6., 3., 0.]])

In [12]:
a.reshape(3,-1)

array([[2., 3., 5., 7.],
       [5., 3., 2., 7.],
       [7., 6., 3., 0.]])
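
The difference between the two, and the meaning of -1 in a shape, can be sketched as follows. The array values are copied from the example above:

```python
import numpy as np

a = np.array([[2., 3., 5., 7.],
              [5., 3., 2., 7.],
              [7., 6., 3., 0.]])

inferred = a.reshape(2, -1).shape   # -1 asks NumPy to infer that dimension
print(inferred, a.shape)            # (2, 6) (3, 4): reshape left a unchanged

a.resize((2, 6))                    # in place; returns None
print(a.shape)                      # (2, 6)
```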

Shuffle the array:

In [14]:
a=np.arange(10)
a

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [15]:
np.random.shuffle(a)
a

array([1, 2, 4, 5, 9, 8, 0, 3, 7, 6])

In [16]:
np.random.choice(a)

3
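
For repeatable results, the same operations are also available on a seeded `Generator` object (NumPy 1.17 and later). This is an alternative to the `np.random.*` calls above, not what the notebook itself uses; the seed value 0 is arbitrary:

```python
import numpy as np

rng = np.random.default_rng(seed=0)   # seed value is arbitrary, for repeatability
a = np.arange(10)

shuffled = rng.permutation(a)         # shuffled copy; a keeps its order
rng.shuffle(a)                        # shuffles a in place, like np.random.shuffle
picked = rng.choice(a)                # one element drawn at random
print(shuffled, a, picked)
```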

### Stacking together different arrays

Several arrays can be stacked together along different axes:

In [14]:
a = np.floor(10*np.random.random((2,2)))
a

array([[8., 2.],
       [2., 6.]])

In [15]:
b = np.floor(10*np.random.random((2,2)))
b

array([[5., 9.],
       [1., 1.]])

In [16]:
np.vstack((a,b))

array([[8., 2.],
       [2., 6.],
       [5., 9.],
       [1., 1.]])

In [17]:
np.hstack((a,b))

array([[8., 2., 5., 9.],
       [2., 6., 1., 1.]])

The function <span style="color:blue">column_stack</span> stacks 1D arrays as columns into a 2D array. It is equivalent to <span style="color:blue">hstack</span> only for 2D arrays:

In [19]:
from numpy import newaxis
np.column_stack((a,b))     # with 2D arrays

array([[8., 2., 5., 9.],
       [2., 6., 1., 1.]])

In [20]:
a = np.array([4.,2.])
b = np.array([3.,8.])
np.column_stack((a,b))     # returns a 2D array

array([[4., 3.],
       [2., 8.]])

In [21]:
np.hstack((a,b))           # the result is different

array([4., 2., 3., 8.])

In [22]:
a[:,newaxis]               # view a as a 2D column vector

array([[4.],
       [2.]])

In [23]:
np.column_stack((a[:,newaxis],b[:,newaxis]))

array([[4., 3.],
       [2., 8.]])

In [24]:
np.hstack((a[:,newaxis],b[:,newaxis]))   # the result is the same

array([[4., 3.],
       [2., 8.]])

On the other hand, the function <span style="color:blue">row_stack</span> is equivalent to <span style="color:blue">vstack</span> for any input arrays. In general, for arrays with more than two dimensions, <span style="color:blue">hstack</span> stacks along their second axes, <span style="color:blue">vstack</span> stacks along their first axes, and <span style="color:blue">concatenate</span> allows for an optional argument giving the number of the axis along which the concatenation should happen.
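
For 2D inputs, the relationship between the stacking helpers and concatenate can be checked directly. A sketch using the values of `a` and `b` from the stacking example above:

```python
import numpy as np

a = np.array([[8., 2.], [2., 6.]])
b = np.array([[5., 9.], [1., 1.]])

print(np.array_equal(np.vstack((a, b)), np.concatenate((a, b), axis=0)))  # True
print(np.array_equal(np.hstack((a, b)), np.concatenate((a, b), axis=1)))  # True
```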

Note

In complex cases, <span style="color:blue">r_</span> and <span style="color:blue">c_</span> are useful for creating arrays by stacking numbers along one axis. They allow the use of range literals (“:”).

In [25]:
np.r_[1:4,0,4]

array([1, 2, 3, 0, 4])

When used with arrays as arguments, <span style="color:blue">r_</span> and <span style="color:blue">c_</span> are similar to <span style="color:blue">vstack</span> and <span style="color:blue">hstack</span> in their default behavior, but allow for an optional argument giving the number of the axis along which to concatenate.
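
A short sketch of the default behavior with 1D array arguments, reusing the two-element vectors from the column_stack example:

```python
import numpy as np

a = np.array([4., 2.])
b = np.array([3., 8.])

print(np.r_[a, b])     # 1D inputs are joined end to end: shape (4,)
print(np.c_[a, b])     # 1D inputs become columns: shape (2, 2)
```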

### Splitting one array into several smaller ones

Using <span style="color:blue">hsplit</span>, you can split an array along its horizontal axis, either by specifying the number of equally shaped arrays to return, or by specifying the columns after which the division should occur:

In [3]:
a = np.floor(10*np.random.random((2,12)))
a

array([[3., 5., 6., 7., 1., 3., 7., 6., 7., 5., 0., 7.],
       [3., 7., 3., 1., 0., 3., 6., 6., 7., 5., 9., 6.]])

In [4]:
np.hsplit(a,3)   # Split a into 3

[array([[3., 5., 6., 7.],
        [3., 7., 3., 1.]]), array([[1., 3., 7., 6.],
        [0., 3., 6., 6.]]), array([[7., 5., 0., 7.],
        [7., 5., 9., 6.]])]

In [5]:
np.hsplit(a,(3,4))   # Split a after the third and the fourth column

[array([[3., 5., 6.],
        [3., 7., 3.]]), array([[7.],
        [1.]]), array([[1., 3., 7., 6., 7., 5., 0., 7.],
        [0., 3., 6., 6., 7., 5., 9., 6.]])]

<span style="color:blue">vsplit</span> splits along the vertical axis, and <span style="color:blue">array_split</span> allows one to specify along which axis to split.
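
A sketch of both functions on an illustrative 4 x 6 array. Note that array_split, unlike hsplit and vsplit, also accepts a split count that does not divide the axis evenly:

```python
import numpy as np

a = np.arange(24).reshape(4, 6)

top, bottom = np.vsplit(a, 2)               # two blocks of 2 rows each
print(top.shape, bottom.shape)              # (2, 6) (2, 6)

parts = np.array_split(a, 4, axis=1)        # 6 columns into 4 parts: uneven is allowed
print([p.shape for p in parts])             # [(4, 2), (4, 2), (4, 1), (4, 1)]
```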

## Copies and Views

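When operating on and manipulating arrays, their data is sometimes copied into a new array and sometimes not. This is often a source of confusion for beginners. There are three cases: no copy at all (plain assignment), a view or shallow copy (a new array object that shares the same data), and a deep copy (a new array with its own data).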
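
Whether data is shared or copied can be checked directly. A minimal sketch, where the variable names are illustrative and `reshape` stands in for any operation that returns a view:

```python
import numpy as np

a = np.arange(6)

same = a                    # no copy: a second name for the same object
view = a.reshape(2, 3)      # view: new array object sharing a's data
deep = a.copy()             # deep copy: independent data

a[0] = 99
print(same is a)            # True
print(view[0, 0])           # 99, the view sees the change
print(deep[0])              # 0, the copy does not
```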

## Import text file

In [35]:
data=np.loadtxt("data.txt",dtype=np.uint8, delimiter=",",skiprows=1)
data

array([[1, 2, 3],
       [3, 4, 5],
       [6, 7, 8]], dtype=uint8)
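
The contents of data.txt are not shown in the notebook. The sketch below assumes a file with one header line followed by comma-separated integers, writes it to a temporary directory, and loads it with the same arguments; the header names `x,y,z` are hypothetical:

```python
import os
import tempfile
import numpy as np

# Hypothetical file contents: one header line, then comma-separated integers.
text = "x,y,z\n1,2,3\n3,4,5\n6,7,8\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.txt")
    with open(path, "w") as fh:
        fh.write(text)
    # skiprows=1 drops the header; delimiter="," splits the columns.
    data = np.loadtxt(path, dtype=np.uint8, delimiter=",", skiprows=1)

print(data.shape, data.dtype)   # (3, 3) uint8
```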

## Reference:

    1- https://numpy.org/devdocs/user/quickstart.html#shape-manipulation