# NumPy 

NumPy is Python's core library for numerical arrays and linear algebra. It matters so much for data science with Python because almost every library in the Python data ecosystem relies on NumPy as one of its main building blocks.

NumPy is fast because its array operations run in compiled code, which makes it well suited to matrix manipulation problems. Image processing and computer vision are two areas of advanced application.

## Using NumPy

Once you've installed NumPy, you can import it as a library.
NumPy ships with the Anaconda distribution of Python by default, so you are good to go if you have Anaconda installed.

In [1]:
import numpy as np

NumPy has many built-in functions and capabilities. We can't cover them all, so we will focus on some of the most important aspects of NumPy: vectors, arrays, matrices, and number generation.

# NumPy Arrays

NumPy arrays are the main way we use NumPy in data science. The arrays we meet most often are vectors and matrices. Vectors are strictly 1-dimensional arrays, while matrices are 2-dimensional (note that a matrix can still have only one row or one column). Arrays can also have three or more dimensions.

Let's begin our introduction by exploring how to create NumPy arrays.

## Creating NumPy Arrays

### From a Python List

We can create an array by directly converting a list or list of lists:

In [2]:
sample_list = [1,2,3,4,5,6]
sample_list

[1, 2, 3, 4, 5, 6]

In [4]:
np.array(sample_list)

array([1, 2, 3, 4, 5, 6])

In [5]:
sample_matrix = [[10,20,30],[40,50,60],[70,80,90]]
sample_matrix

[[10, 20, 30], [40, 50, 60], [70, 80, 90]]

In [6]:
np.array(sample_matrix)

array([[10, 20, 30],
       [40, 50, 60],
       [70, 80, 90]])

In [7]:
dim3matrix = [[[10, 20],[30, 40]], [[50, 60],[70,80]], [[90,100],[110, 120]], [[90,100],[110, 120]]]
np.array(dim3matrix)

array([[[ 10,  20],
        [ 30,  40]],

       [[ 50,  60],
        [ 70,  80]],

       [[ 90, 100],
        [110, 120]],

       [[ 90, 100],
        [110, 120]]])

In [8]:
sample_tuple = (3,6,9)
sample_tuple

(3, 6, 9)

In [9]:
np.array(sample_tuple)

array([3, 6, 9])

In [10]:
sample_dict ={0:10,1:20,2:30,3:40,4:50}

In [12]:
type(sample_dict.keys())

dict_keys

In [14]:
tuple(sample_dict.keys())

(0, 1, 2, 3, 4)

In [16]:
np.array(tuple(sample_dict.keys()))

array([0, 1, 2, 3, 4])

In [17]:
np.array(tuple(sample_dict.values()))

array([10, 20, 30, 40, 50])

What if we want to return a collection of key-value pairs in our dictionary?

In [19]:
list(sample_dict.items())

[(0, 10), (1, 20), (2, 30), (3, 40), (4, 50)]

In [20]:
np.array(list(sample_dict.items()))

array([[ 0, 10],
       [ 1, 20],
       [ 2, 30],
       [ 3, 40],
       [ 4, 50]])
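
When a list is converted, NumPy picks a single data type for the whole array. The short sketch below illustrates this; the variable names are only for illustration.

```python
import numpy as np

ints = np.array([1, 2, 3])                 # all integers -> an integer dtype
mixed = np.array([1, 2.5, 3])              # one float promotes every element to float64
forced = np.array([1, 2, 3], dtype=float)  # the dtype can also be requested explicitly

print(ints.dtype, mixed.dtype, forced.dtype)
```

The exact integer dtype printed for `ints` depends on the platform and NumPy version.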

## Built-in Methods

NumPy has many built-in methods. We will consider the most commonly used ones.

### arange

Returns evenly spaced values within a given interval. The start is included and the stop is excluded. The default step is 1.

In [21]:
np.arange(0,10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [22]:
np.arange(0,11,2)

array([ 0,  2,  4,  6,  8, 10])
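
As a small illustration, the step can also be fractional or negative. In both cases the stop value itself is left out. The variable names here are illustrative.

```python
import numpy as np

quarters = np.arange(0, 1, 0.25)   # float step; 1 itself is excluded
countdown = np.arange(5, 0, -1)    # negative step counts down; 0 is excluded

print(quarters)    # [0.   0.25 0.5  0.75]
print(countdown)   # [5 4 3 2 1]
```

For fractional steps that are not exactly representable in floating point, `linspace` is usually the safer choice because the number of points is stated explicitly.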

### linspace
Returns evenly spaced numbers over a specified interval. It takes three main arguments: the low and high ends of the interval and the number of items to return. Both ends are included by default; pass `endpoint=False` to exclude the high end.

In [27]:
np.linspace(0,20,6)

array([ 0.,  4.,  8., 12., 16., 20.])

In [28]:
np.linspace(0,20,6, endpoint = False)

array([ 0.        ,  3.33333333,  6.66666667, 10.        , 13.33333333,
       16.66666667])
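
To see the spacing that `linspace` chose, the `retstep=True` argument returns it alongside the array. A minimal sketch, with illustrative variable names:

```python
import numpy as np

# With the endpoint included, 6 points span 5 equal gaps of width 4
points, step = np.linspace(0, 20, 6, retstep=True)
print(step)   # 4.0

# Without the endpoint, the same 6 points span 6 gaps of width 20/6
open_points, open_step = np.linspace(0, 20, 6, endpoint=False, retstep=True)
print(open_step)
```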

## Random 

NumPy also has many ways to create arrays of random numbers:

### rand
Creates an array of the given shape and populates it with random samples from a uniform distribution over `[0, 1)`.

In [30]:
np.random.rand(10)

array([0.10402218, 0.2358936 , 0.84065475, 0.8492015 , 0.01949417,
       0.96412721, 0.74617418, 0.28109075, 0.33516056, 0.48634035])

In [31]:
np.random.rand(3,2)

array([[0.85138031, 0.82310643],
       [0.45838854, 0.18348454],
       [0.35377905, 0.03281761]])

In [32]:
np.random.rand(4,3,2)

array([[[0.24297053, 0.62054653],
        [0.73431944, 0.71570273],
        [0.39979985, 0.8886768 ]],

       [[0.90468449, 0.06709277],
        [0.0996484 , 0.04065687],
        [0.51811424, 0.24932416]],

       [[0.22196698, 0.79881862],
        [0.34940514, 0.45534528],
        [0.12365614, 0.97056969]],

       [[0.06757215, 0.03196877],
        [0.80477655, 0.10310146],
        [0.15725005, 0.25715685]]])
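
Random output differs on every run unless the generator is seeded. The sketch below shows one way to get repeatable numbers. The seed value 42 is arbitrary, and `np.random.default_rng` is NumPy's newer generator interface, shown here only as an alternative.

```python
import numpy as np

np.random.seed(42)            # fix the global random state
first = np.random.rand(3)
np.random.seed(42)            # reset to the same state
second = np.random.rand(3)
print(np.array_equal(first, second))   # True: same seed, same numbers

rng = np.random.default_rng(42)        # newer Generator-based interface
samples = rng.random((3, 2))           # uniform samples over [0, 1), shape (3, 2)
print(samples.shape)
```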

### randint

Returns random integers from the interval [low, high), so `low` is included and `high` is excluded. With a single argument, the interval is [0, high). It returns one integer unless `size` is specified.

In [46]:
#np.random.seed(4)
np.random.randint(15)

10

In [50]:
np.random.randint(15,50)

In [49]:
# size argument specifies the number of items to return
np.random.randint(1,7, size=5)

array([1, 1, 3, 2, 3])

In [51]:
np.random.randint(1,7, size=(3,5))

array([[5, 6, 2, 1, 5],
       [3, 5, 3, 5, 4],
       [1, 6, 6, 2, 6]])
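
Because the upper bound is excluded, `randint(1, 7)` behaves like a six-sided die. The following sketch, with illustrative names, checks this on a larger sample.

```python
import numpy as np

rolls = np.random.randint(1, 7, size=10_000)   # simulated die rolls
print(rolls.min(), rolls.max())                # never below 1, never above 6

# Count how often each face appeared
faces, counts = np.unique(rolls, return_counts=True)
print(dict(zip(faces.tolist(), counts.tolist())))
```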

### choice

Returns a random sample from a given array. By default it returns just one sample. When `size` is given, sampling is done with replacement unless `replace=False` is passed.

In [53]:
np.random.choice([0,1, 6, 3, 9])

6

In [55]:
np.random.choice?

In [54]:
np.random.choice([21,12,35,14,5,6], size=5)

array([ 6, 21, 12, 35, 35])

In [56]:
np.random.choice([21,12,35,14,5,6], size=5, replace = False)

array([ 5, 12, 35, 14,  6])
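
`choice` also accepts a `p` argument with one probability per item. The sketch below contrasts sampling without replacement and weighted sampling. The weights are made up for illustration and must sum to 1.

```python
import numpy as np

values = [21, 12, 35, 14, 5, 6]

# Drawing every item without replacement gives a random permutation
unique_draw = np.random.choice(values, size=6, replace=False)

# Hypothetical weights: the first value is five times as likely as each of the others
weights = [0.5, 0.1, 0.1, 0.1, 0.1, 0.1]
weighted_draw = np.random.choice(values, size=5, p=weights)

print(unique_draw)
print(weighted_draw)
```

With `replace=False`, asking for more samples than there are items raises a `ValueError`.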

### nan

This is a NumPy constant. In computing, NaN ("not a number") is a special floating-point value that represents an undefined or unrepresentable result. We can use NaN to represent missing or null values in a dataset. Unfortunately, dirty datasets often encode null values in other ways (e.g. Unknown, —, and n/a), which makes them difficult to detect and drop.

In [57]:
np.nan

nan
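
A minimal sketch of how NaN behaves in practice and how missing entries can be detected and dropped. The `data` array is a made-up example.

```python
import numpy as np

data = np.array([1.0, np.nan, 3.0])

print(np.nan == np.nan)    # False: NaN never compares equal, even to itself
mask = np.isnan(data)      # so use np.isnan to locate missing entries
clean = data[~mask]        # keep only the valid values

print(data.mean())         # nan: ordinary reductions propagate NaN
print(np.nanmean(data))    # 2.0: the NaN-aware mean ignores the missing value
```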

In [58]:
np.repeat?

In [60]:
nanarray = np.repeat(np.nan, 5)
nanarray

array([nan, nan, nan, nan, nan])

### zeros and ones

Generate arrays of zeros or ones. The required argument is the shape, which can be an integer or a tuple. An optional `dtype` argument sets the element type (float by default).

In [61]:
np.zeros(3)

array([0., 0., 0.])

In [62]:
np.zeros((3,3))

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

In [63]:
np.ones(13)

array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])
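
The outputs above are floats because that is the default element type. A short sketch of requesting another type follows. `np.full` is a related function, not covered above, that fills with an arbitrary value.

```python
import numpy as np

int_zeros = np.zeros((2, 3), dtype=int)   # integer zeros instead of the float default
bool_ones = np.ones(4, dtype=bool)        # ones of boolean type are all True
sevens = np.full((2, 2), 7)               # fill a 2x2 array with the value 7

print(int_zeros)
print(bool_ones)
print(sevens)
```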

In [None]:
np.eye(3)

In [None]:
np.diag([[4,6],[2,3]])

In [None]:
np.diag([[4,6],[2,3]], k = 1)
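
For reference, here is a sketch of what the `eye` and `diag` calls above do. `np.eye(n)` builds an identity matrix. `np.diag` extracts a diagonal from a 2-D input and builds a diagonal matrix from a 1-D input. The variable names are illustrative.

```python
import numpy as np

identity = np.eye(3)                          # 3x3 identity matrix of floats
main_diag = np.diag([[4, 6], [2, 3]])         # 2-D input: extract the main diagonal
upper_diag = np.diag([[4, 6], [2, 3]], k=1)   # k=1 selects the diagonal above the main one
built = np.diag([4, 3])                       # 1-D input: build a diagonal matrix

print(main_diag)    # [4 3]
print(upper_diag)   # [6]
print(built)
```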

## Array Attributes and Methods

Let's discuss some useful attributes and methods of an array:

In [68]:
array = np.arange(25)
array

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24])

In [65]:
# the code below will generate 10 random integers from 0 up to (but not including) 50
ranarray = np.random.randint(0,50,10)
ranarray

array([33, 32, 45, 28,  3, 15, 34, 31, 17, 48])

In [66]:
ranarray.shape

(10,)

### Reshape

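Returns an array containing the same data with a new shape.
The new shape must hold the same total number of values as the original. One dimension can be given as -1, in which case NumPy infers its size from the others.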

In [70]:
array = np.arange(24)
array.shape

(24,)

In [74]:
array.reshape(3,2,4)

array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7]],

       [[ 8,  9, 10, 11],
        [12, 13, 14, 15]],

       [[16, 17, 18, 19],
        [20, 21, 22, 23]]])

In [77]:
array.reshape( -1,4, 2)

array([[[ 0,  1],
        [ 2,  3],
        [ 4,  5],
        [ 6,  7]],

       [[ 8,  9],
        [10, 11],
        [12, 13],
        [14, 15]],

       [[16, 17],
        [18, 19],
        [20, 21],
        [22, 23]]])

In [None]:
array.reshape( 2, -1)
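
The sketch below illustrates both points about the total number of values: a `-1` dimension is inferred from the rest, and a shape that does not match the element count raises a `ValueError`. The names `inferred` and `failed` are only for illustration.

```python
import numpy as np

array = np.arange(24)

# 24 values split into 2 rows, so the -1 is inferred as 12
inferred = array.reshape(2, -1)
print(inferred.shape)   # (2, 12)

# 5 * 5 = 25 does not match 24 values, so reshape refuses
failed = False
try:
    array.reshape(5, 5)
except ValueError as err:
    failed = True
    print("reshape failed:", err)
```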

### max, min, argmax, argmin

These methods find the maximum or minimum values. To find their index locations instead, use argmax or argmin. On a multi-dimensional array, the `axis` argument computes the result along that axis.

In [78]:
ranarray

array([33, 32, 45, 28,  3, 15, 34, 31, 17, 48])

In [79]:
ranarray.max()

48

In [80]:
ranarray.argmax()

9

In [81]:
ranarray.min()

3

In [82]:
ranarray.argmin()

4

In [83]:
ranarray2d = ranarray.reshape(5,2)
ranarray2d

array([[33, 32],
       [45, 28],
       [ 3, 15],
       [34, 31],
       [17, 48]])

In [84]:
ranarray2d.max()

48

In [85]:
ranarray2d.max(axis = 0)

array([45, 48])

In [86]:
ranarray2d.max(axis = 1)

array([33, 45, 15, 34, 48])
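
On a 2-D array, `argmax()` without an axis returns a position in the flattened array. The sketch below, which rebuilds the same `ranarray2d` values by hand, converts that position back to a row and column with `np.unravel_index`.

```python
import numpy as np

ranarray2d = np.array([[33, 32], [45, 28], [3, 15], [34, 31], [17, 48]])

flat_index = ranarray2d.argmax()                             # position in the flattened array
row, col = np.unravel_index(flat_index, ranarray2d.shape)    # convert to (row, column)
print(flat_index, row, col)   # 9 4 1

# With an axis, argmax returns one index per column (axis=0) or per row (axis=1)
col_argmax = ranarray2d.argmax(axis=0)
print(col_argmax)             # [1 4]
```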

### Shape

This is an attribute of an array, not a method. It returns a tuple giving the size of the array along each dimension.

In [87]:
array1 = np.array([4,5,0,9, 1,2])
array1.shape

(6,)

In [88]:
array2 = np.array([[4,5,0,9, 1,2]])
array2.shape

(1, 6)

In [89]:
array3 = np.array([[4],[5],[0],[9], [1],[2]], dtype = 'float32')
array3.shape

(6, 1)

In [None]:
array3

In [90]:
array4 = np.array([[4,5], [0,9], [1,2]])
array4.shape

(3, 2)

The size attribute of an array returns the number of elements in the array.

In [91]:
array4.size

6

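The dtype attribute returns the data type of the array values. The default integer type depends on the platform and NumPy version, so you may see `int64` instead of `int32`.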

In [92]:
# attribute that reports the data type of the array's elements
array1.dtype

dtype('int32')

The ndim attribute returns the number of dimensions of the array

In [None]:
array4.ndim

In [None]:
array1.ndim
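
To tie the attributes together, here is a small sketch that collects them in one place. The `describe` helper is hypothetical and not part of NumPy.

```python
import numpy as np

def describe(arr):
    """Collect the basic attributes of an array in a dictionary (hypothetical helper)."""
    return {"shape": arr.shape, "ndim": arr.ndim, "size": arr.size, "dtype": arr.dtype}

array4 = np.array([[4, 5], [0, 9], [1, 2]])
info = describe(array4)
print(info)

# ndim is the length of the shape tuple, and size is the product of its entries
print(len(info["shape"]) == info["ndim"])
print(int(np.prod(info["shape"])) == info["size"])
```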