# NumPy
In this lesson we will learn the basic Python libraries for machine learning. I will show you the commands that matter most for machine learning in `numpy`, `pandas`, `matplotlib`, `seaborn` and `plotly`. For more information you have, as always, the docs, the source code, the built-in help (put a `?` at the beginning of a name), the forums, Stack Overflow, and much more.

* NumPy 
* pandas 
* matplotlib
* seaborn
* plotly

In [1]:
import numpy as np
from answ2 import *

## What is NumPy

NumPy’s main object is the homogeneous multidimensional array. It is a table of elements (usually numbers), all of the same type, indexed by a tuple of non-negative integers. In NumPy, dimensions are called axes.
http://www.numpy.org/

### The Basics 

NumPy’s array class is called `ndarray`

- **ndarray.ndim**
    the number of axes (dimensions) of the array.
- **ndarray.shape**
    the dimensions of the array. This is a tuple of integers indicating the size of the array in each dimension. For a matrix with n rows and m columns, shape will be (n,m). The length of the shape tuple is therefore the number of axes, ndim.
- **ndarray.size**
    the total number of elements of the array. This is equal to the product of the elements of shape.
- **ndarray.dtype**
    an object describing the type of the elements in the array. One can create or specify dtype’s using standard Python types. Additionally NumPy provides types of its own. numpy.int32, numpy.int16, and numpy.float64 are some examples.

In [2]:
a = np.arange(15).reshape(3, 5); a

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

To create sequences of numbers, NumPy provides `arange`, a function analogous to `range` that returns arrays instead of lists. It also accepts float arguments.

In [3]:
a.shape

(3, 5)

In [4]:
a.ndim

2

In [5]:
a.dtype

dtype('int64')

In [6]:
a.size

15

In [7]:
type(a)

numpy.ndarray

### Array Creation
An array can be created from a list or a tuple.

In [8]:
np.array([2,3,4])

array([2, 3, 4])

In [9]:
np.array((2,3,4))

array([2, 3, 4])

We can nest lists if we need more dimensions.

In [10]:
b = np.array([[1,2,3],[4,5,6]]); b

array([[1, 2, 3],
       [4, 5, 6]])

In [11]:
b.shape

(2, 3)

We can specify the data type when creating the array.

In [12]:
np.array([[1,2],[3,4]]).dtype

dtype('int64')

In [13]:
np.array([[1,2],[3,4]],dtype=float).dtype

dtype('float64')

In [14]:
np.array([[1,2],[3,4]],dtype=complex).dtype

dtype('complex128')

Often, the elements of an array are originally unknown, but its size is known. Hence, NumPy offers several functions to create arrays with initial placeholder content. These minimize the necessity of growing arrays, an expensive operation.

In [15]:
np.zeros( (3,4) )

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [16]:
np.ones( (3,4) )

array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

In [17]:
np.empty( (3,3) )

array([[2.31619951e-316, 2.33245546e-316, 7.76598372e-299],
       [9.11576605e-304, 1.53719845e-202, 6.92768354e-246],
       [5.01743526e-111, 7.45317905e-304, 3.95252517e-322]])
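The remark above about growing arrays can be illustrated with a small sketch. The variable names (`n`, `squares`, `grown`) are made up for this example; the point is only that preallocating with `np.zeros` fills one buffer in place, while `np.append` builds a new array on every call.

```python
import numpy as np

n = 5

# Preallocate the result once, then fill it in place.
squares = np.zeros(n, dtype=np.int64)
for k in range(n):
    squares[k] = k * k

# Growing with np.append allocates a new array and copies the data on every call.
grown = np.array([], dtype=np.int64)
for k in range(n):
    grown = np.append(grown, k * k)

print(squares)                         # [ 0  1  4  9 16]
print(np.array_equal(squares, grown))  # True
```

Both loops give the same values; for large `n` the preallocated version avoids the repeated copies.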

We have already seen `arange`. A similar function, `linspace`, is useful when we want to specify the number of points instead of the step.

In [18]:
np.linspace(0,2*np.pi,10)

array([0.        , 0.6981317 , 1.3962634 , 2.0943951 , 2.7925268 ,
       3.4906585 , 4.1887902 , 4.88692191, 5.58505361, 6.28318531])
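As a quick side-by-side, the following sketch contrasts fixing the step with fixing the number of points. The names `by_step` and `by_count` are chosen only for illustration.

```python
import numpy as np

# arange: fix the step; the stop value is excluded.
by_step = np.arange(0, 1, 0.25)

# linspace: fix the number of points; the stop value is included by default.
by_count = np.linspace(0, 1, 5)

print(by_step)   # [0.   0.25 0.5  0.75]
print(by_count)  # [0.   0.25 0.5  0.75 1.  ]
```

With float steps the number of elements returned by `arange` can be hard to predict because of rounding, so `linspace` is usually the safer choice when the endpoint matters.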

### Basic Operations
Arithmetic operators on arrays apply elementwise.

In [19]:
a = np.array( [20,30,40,50] )
b = np.arange(4)
a,b

(array([20, 30, 40, 50]), array([0, 1, 2, 3]))

In [20]:
a-b

array([20, 29, 38, 47])

In [21]:
b**2

array([0, 1, 4, 9])

In [22]:
10*np.sin(a)

array([ 9.12945251, -9.88031624,  7.4511316 , -2.62374854])

In [23]:
a<35

array([ True,  True, False, False])

Unlike in many matrix languages, the product operator * operates elementwise in NumPy arrays. The matrix product can be performed using the @ operator or the dot function or method:

In [24]:
A = np.array( [[1,1],[0,1]] ); A

array([[1, 1],
       [0, 1]])

In [25]:
B = np.array( [[2,0], [3,4]] ); B

array([[2, 0],
       [3, 4]])

In [26]:
A*B

array([[2, 0],
       [0, 4]])

In [27]:
A@B

array([[5, 4],
       [3, 4]])

In [28]:
A.dot(B)

array([[5, 4],
       [3, 4]])

Many operations, such as computing the sum of all the elements in the array, are implemented as methods of the ndarray class.

In [29]:
np.random.seed(0) # important to replicate results
a = np.random.random((2,3)); a

array([[0.5488135 , 0.71518937, 0.60276338],
       [0.54488318, 0.4236548 , 0.64589411]])

In [30]:
a.sum()

3.481198341773846

In [31]:
a.sum(axis=0)

array([1.09369669, 1.13884417, 1.24865749])

In [32]:
a.min(),a.max()

(0.4236547993389047, 0.7151893663724195)

In [33]:
a.min(axis=0)

array([0.54488318, 0.4236548 , 0.60276338])
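The `axis` argument is easier to follow on an array with known values. In this small sketch (the names `m`, `col_sums` and `row_sums` are illustrative), `axis=0` collapses the rows and `axis=1` collapses the columns.

```python
import numpy as np

m = np.arange(6).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]

col_sums = m.sum(axis=0)  # collapse the rows: one value per column
row_sums = m.sum(axis=1)  # collapse the columns: one value per row

print(col_sums)  # [3 5 7]
print(row_sums)  # [ 3 12]
```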

NumPy provides familiar mathematical functions such as `sin`, `cos`, and `exp`. In NumPy, these are called “universal functions” (`ufunc`). Within NumPy, these functions operate elementwise on an array, producing an array as output.

In [34]:
B = np.arange(3); B

array([0, 1, 2])

In [35]:
np.exp(B)

array([1.        , 2.71828183, 7.3890561 ])

In [36]:
np.sqrt(B)

array([0.        , 1.        , 1.41421356])

In [37]:
C = np.array([2., -1., 4.])

In [38]:
np.add(B, C)

array([2., 0., 6.])

### Indexing, Slicing and Iterating
#### One-dimensional

In [39]:
a = np.arange(10)**3

In [40]:
a[2]

8

In [41]:
a[2:5]

array([ 8, 27, 64])

In [42]:
a[:6:2] = -1000 

In [43]:
a

array([-1000,     1, -1000,    27, -1000,   125,   216,   343,   512,
         729])

A step of -1 reverses the array:

In [44]:
a[::-1]     

array([  729,   512,   343,   216,   125, -1000,    27, -1000,     1,
       -1000])

#### Multidimensional

In [45]:
def f(x,y):
    return 10*x+y

In [46]:
b = np.fromfunction(f,(5,4),dtype=int)

In [47]:
b

array([[ 0,  1,  2,  3],
       [10, 11, 12, 13],
       [20, 21, 22, 23],
       [30, 31, 32, 33],
       [40, 41, 42, 43]])

In [48]:
b[2,3]

23

In [49]:
b[0:5,1]

array([ 1, 11, 21, 31, 41])

In [50]:
b[:,1]

array([ 1, 11, 21, 31, 41])

In [51]:
b[1:3,:]

array([[10, 11, 12, 13],
       [20, 21, 22, 23]])

The dots (...) represent as many colons as needed to produce a complete indexing tuple. For example, if x is an array with 5 axes, then

`x[1,2,...]` is equivalent to `x[1,2,:,:,:]`
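The equivalence can be checked on a concrete array. The shape `(2, 3, 4, 5, 6)` below is an arbitrary choice for the example.

```python
import numpy as np

x = np.zeros((2, 3, 4, 5, 6))  # an array with 5 axes

print(x[1, 2, ...].shape)      # (4, 5, 6)
print(x[1, 2, :, :, :].shape)  # (4, 5, 6)
print(x[..., 0].shape)         # (2, 3, 4, 5)
```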

### Shape Manipulation 

In [52]:
a = np.floor(10*np.random.random((3,4)))

In [53]:
a

array([[4., 8., 9., 3.],
       [7., 5., 5., 9.],
       [0., 0., 0., 8.]])

In [54]:
a.shape

(3, 4)

In [55]:
a.ravel()  # returns the array, flattened

array([4., 8., 9., 3., 7., 5., 5., 9., 0., 0., 0., 8.])

In [56]:
a.reshape(6,2)  # returns the array with a modified shape

array([[4., 8.],
       [9., 3.],
       [7., 5.],
       [5., 9.],
       [0., 0.],
       [0., 8.]])

In [57]:
a.T  # returns the array, transposed

array([[4., 7., 0.],
       [8., 5., 0.],
       [9., 5., 0.],
       [3., 9., 8.]])

In [58]:
a.shape, a.T.shape

((3, 4), (4, 3))
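One detail worth knowing is that `reshape` leaves the original array's shape unchanged and, for a contiguous array like the one below, returns a view on the same data. The following sketch uses made-up names (`r`, `c`) to show the consequence.

```python
import numpy as np

a = np.arange(12.0).reshape(3, 4)

r = a.reshape(6, 2)            # new shape, same underlying data
print(a.shape)                 # (3, 4): the original array keeps its shape
print(np.shares_memory(a, r))  # True

r[0, 0] = -1.0                 # writing through the view changes the original too
print(a[0, 0])                 # -1.0

c = a.reshape(6, 2).copy()     # an explicit copy is independent
c[0, 0] = 99.0
print(a[0, 0])                 # -1.0
```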

### Less Basic

#### Broadcasting rules

Broadcasting allows universal functions to deal in a meaningful way with inputs that do not have exactly the same shape.

The first rule of broadcasting is that if all input arrays do not have the same number of dimensions, a “1” will be repeatedly prepended to the shapes of the smaller arrays until all the arrays have the same number of dimensions.

The second rule of broadcasting ensures that arrays with a size of 1 along a particular dimension act as if they had the size of the array with the largest shape along that dimension. The value of the array element is assumed to be the same along that dimension for the “broadcast” array.

After application of the broadcasting rules, the sizes of all arrays must match. More details can be found in the broadcasting section of the NumPy documentation.
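A minimal sketch of the two rules in action follows. The arrays `row` and `col` are invented for the example, and the last case shows what happens when the shapes cannot be matched.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)        # shape (3, 4)
row = np.array([10, 20, 30, 40])       # shape (4,)
col = np.array([[100], [200], [300]])  # shape (3, 1)

# Rule 1: row's shape (4,) is treated as (1, 4).
# Rule 2: the size-1 axis is stretched to match, giving (3, 4).
print((a + row).shape)  # (3, 4)
print(a + row)

# (3, 1) and (4,) -> (3, 1) and (1, 4) -> (3, 4)
print((col + row).shape)  # (3, 4)

# Shapes (3, 4) and (3,) still disagree after the rules are applied.
try:
    a + np.array([1, 2, 3])
except ValueError as err:
    print("broadcast error:", err)
```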

### Fancy indexing and index tricks
#### Indexing with Arrays of Indices

In [59]:
a = np.arange(12)**2  

In [60]:
i = np.array( [ 1,1,3,8,5 ] )  # an array of indices

In [61]:
a[i] # the elements of a at the positions i

array([ 1,  1,  9, 64, 25])

In [62]:
j = np.array( [ [ 3, 4], [ 9, 7 ] ] ) 

In [63]:
a[j]

array([[ 9, 16],
       [81, 49]])

#### Indexing with Boolean Arrays

In [64]:
a = np.arange(12).reshape(3,4)

In [65]:
b = a > 4

In [66]:
b  

array([[False, False, False, False],
       [False,  True,  True,  True],
       [ True,  True,  True,  True]])

In [67]:
a[b]

array([ 5,  6,  7,  8,  9, 10, 11])
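Boolean masks are also handy for counting and for assignment. The sketch below reuses the same array and threshold as the cells above; the variable `count` is introduced only for the example.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

count = (a > 4).sum()  # True counts as 1, so this counts the matches
print(count)           # 7

a[a > 4] = 0           # assign to every element where the mask is True
print(a)
# [[0 1 2 3]
#  [4 0 0 0]
#  [0 0 0 0]]
```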

### Linear Algebra 
#### Simple Array Operations

In [68]:
a =  np.array([[1.0, 2.0], [3.0, 4.0]])

In [69]:
a

array([[1., 2.],
       [3., 4.]])

In [70]:
a.transpose()

array([[1., 3.],
       [2., 4.]])

In [71]:
np.linalg.inv(a)

array([[-2. ,  1. ],
       [ 1.5, -0.5]])

In [72]:
np.trace(a)

5.0

In [73]:
y = np.array([[5.], [7.]])

In [74]:
np.linalg.solve(a, y)

array([[-3.],
       [ 4.]])

In [75]:
np.linalg.eig(j)

(array([-1.32455532, 11.32455532]), array([[-0.67902243, -0.43310182],
        [ 0.73411752, -0.901345  ]]))
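The results of `solve` and `eig` can be checked numerically. This sketch rebuilds the same matrices as the cells above (the name `m` stands in for the array `j` used there) and compares with `np.allclose`, since floating-point results are rarely exact.

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])
y = np.array([[5.0], [7.0]])

# The solution x of a @ x = y should reproduce y.
x = np.linalg.solve(a, y)
print(np.allclose(a @ x, y))  # True

# eig returns the eigenvalues and a matrix whose columns are the eigenvectors.
m = np.array([[3, 4], [9, 7]])
w, v = np.linalg.eig(m)
for k in range(len(w)):
    # m @ v_k equals w_k * v_k for each eigenpair
    print(np.allclose(m @ v[:, k], w[k] * v[:, k]))  # True
```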

### Tricks and Tips
#### “Automatic” Reshaping

In [76]:
a = np.arange(30)

In [77]:
a.shape = 2,-1,3  # -1 means "whatever is needed"

In [78]:
a.shape

(2, 5, 3)

#### Vector Stacking

In [79]:
x = np.arange(0,10,2)

In [80]:
y = np.arange(5) 

In [81]:
m = np.vstack([x,y]) 

In [82]:
xy = np.hstack([x,y]) 
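Since the cells above do not display their results, here is the same stacking with the shapes and values printed.

```python
import numpy as np

x = np.arange(0, 10, 2)  # [0 2 4 6 8]
y = np.arange(5)         # [0 1 2 3 4]

m = np.vstack([x, y])    # stack as rows
xy = np.hstack([x, y])   # join end to end

print(m.shape)   # (2, 5)
print(m)
# [[0 2 4 6 8]
#  [0 1 2 3 4]]
print(xy.shape)  # (10,)
print(xy)        # [0 2 4 6 8 0 1 2 3 4]
```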

### Exercises

1. Import numpy as np and see the version

In [83]:
#numpy_ex1()

2. Test element-wise for NaN of a given array.

In [84]:
a = np.array([1, 0, np.nan, np.inf])

In [85]:
#numpy_ex2()

3. Create an array of the integers from 30 to 70

In [86]:
#numpy_ex3()

4.  Create a vector of length 10 with values evenly distributed between 5 and 50

In [87]:
#numpy_ex4()

5. Create a 10x10 matrix, in which the elements on the borders will be equal to 1, and inside 0.

In [88]:
#numpy_ex5()

6. Save a given array to a binary file.

In [89]:
a = np.arange(20)

In [90]:
#numpy_ex6()

7. Convert a given list into an array and then convert it into a list again.

In [91]:
a = [[1, 2], [3, 4]]

In [92]:
#numpy_ex7()

8. Create a numpy array of ints and convert it to floats.

In [93]:
a = np.array([1, 2, 3, 4]); a.dtype

dtype('int64')

In [94]:
#numpy_ex8()

9.  Convert the values of Centigrade degrees into Fahrenheit degrees:

formula: °C = (°F − 32) × 5/9, equivalently °F = °C × 9/5 + 32 (for example, 0 °C = 32 °F)

In [95]:
fvalues = [0, 12, 45.21, 34, 99.91]

In [96]:
#numpy_ex9()

10. Find the set exclusive-or of two arrays

In [97]:
array1 = np.array([0, 10, 20, 40, 60, 80])
array2 = [10, 30, 40, 50, 70]

In [98]:
#numpy_ex10()

11. Compare two arrays using numpy

In [99]:
a = np.array([1, 2])
b = np.array([4, 5])

In [100]:
#numpy_ex11()

12. Change the shape of an array to (3, 3)

In [101]:
x = np.array([1,2,3,4,5,6,7,8,9])

In [102]:
#numpy_ex12()

13. Insert a new axis within a 2-D array, the final shape has to be (3,1,4)

In [103]:
x = np.zeros((3, 4))

In [104]:
#numpy_ex13()

14. Compute the determinant of a given square array

In [105]:
a = np.array([[1, 0], [1, 2]])

In [106]:
#numpy_ex14()

15. Compute the eigenvalues and right eigenvectors of a given square array

In [107]:
m = np.asmatrix("3 -2;1 0");m  # np.mat was removed in NumPy 2.0

matrix([[ 3, -2],
        [ 1,  0]])

In [108]:
#numpy_ex15()

16. Generate five random numbers from the normal distribution

In [109]:
#numpy_ex16()

17. Generate the same five random numbers 2 times from the normal distribution

In [110]:
#numpy_ex17()

18. Generate six random integers between 10 and 30

In [111]:
#numpy_ex18()

19. Create a 3x3x3 array with random values.

In [112]:
#numpy_ex19()

We will continue to explore NumPy when we cover the visualization tools.