# Checking the NumPy version

In [1]:
!conda list numpy

# packages in environment at C:\Users\muhammad_ali\Anaconda3:
#
# Name                    Version                   Build  Channel
numpy                     1.14.3           py36h9fa60d3_1  
numpy-base                1.14.3           py36h555522e_1  
numpydoc                  0.8.0                    py36_0  


# Numpy Documentation

- [NumPy Manual](https://docs.scipy.org/doc/numpy-1.13.0/contents.html)
- [NumPy User Guide](https://docs.scipy.org/doc/numpy-1.13.0/user/index.html)
- [NumPy Reference](https://docs.scipy.org/doc/numpy-1.13.0/reference/index.html#reference)
- [Scipy Lectures](http://www.scipy-lectures.org/intro/numpy/index.html)

# Why NumPy?

NumPy's **speed** comes from its memory-efficient arrays and from the optimized algorithms it uses for arithmetic, statistical, and linear algebra operations. Furthermore, its core is implemented in C, so these operations run as compiled code rather than in the Python interpreter.

It has **multidimensional array data structures** that can represent vectors and matrices. You will learn all about vectors and matrices in the Linear Algebra section of this course later on, and as you will soon see, a lot of machine learning algorithms rely on matrix operations. For example, when training a Neural Network, you often have to carry out many matrix multiplications. NumPy is optimized for matrix operations and it allows us to do Linear Algebra operations effectively and efficiently, making it very suitable for solving machine learning problems.

Another great advantage of NumPy over Python lists is that NumPy has a large number of optimized **built-in mathematical functions.**

# Importing NumPy

In [2]:
import numpy as np # as == alias

import time # for checking time of an 
                # operation

In [3]:
x = np.random.random(100000000)
# 100 million random floats between 0 and 1 

In [4]:
# Simple Python
start = time.time()
sum(x) / len(x) # Mean of x
print(time.time() - start)

26.463828563690186


In [5]:
# Numpy mean function 
start = time.time()
np.mean(x) # Mean of x using NumPy
print(time.time() - start)
# present - past

0.24953508377075195
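
The comparison above can be repeated on a smaller array so that it finishes quickly. This is only a sketch: the array size and the variable names (`python_seconds`, `numpy_seconds`) are assumptions made for illustration, and `time.perf_counter` is swapped in for `time.time` because it is better suited to measuring short intervals.

```python
import time
import numpy as np

# Smaller array than in the cells above, so the pure-Python loop stays short.
x = np.random.random(1000000)

# Pure Python: sum() iterates over the array one element at a time.
start = time.perf_counter()
python_mean = sum(x) / len(x)
python_seconds = time.perf_counter() - start

# NumPy: the loop runs in compiled code.
start = time.perf_counter()
numpy_mean = np.mean(x)
numpy_seconds = time.perf_counter() - start

print("Python:", python_seconds, "seconds")
print("NumPy: ", numpy_seconds, "seconds")
print("same result:", np.isclose(python_mean, numpy_mean))
```

The exact timings depend on the machine, so only the relative difference is meaningful.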


# NumPy Arrays

Arrays are grid-like objects that can take many shapes and require every element to have the same type. They are used to optimize computations on large amounts of data. NumPy's array type, the ndarray, is a multi-dimensional array that holds a group of elements that all share one data type.

Generally, there are two ways to build an array. The first is to use NumPy's `array` function to create arrays from other array-like Python objects such as lists. The second is to use one of NumPy's built-in functions that quickly generate specific types of arrays.

In [8]:
x = np.array([1,2,3,4,5]) # 1-D Array 
print(x)

[1 2 3 4 5]


In [9]:
print(type(x)) # Data Structure/Container type
print()

# dtype attribute
print(x.dtype) # data type of each element

<class 'numpy.ndarray'>

int32


In [10]:
# shape attribute
x.shape # (N,) --> Vector == 1D == 1 axis

(5,)

- `shape` is a tuple of N integers
- that specify the size of each dimension,
- where N is the number of dimensions

In [11]:
# 2-D array == 2 axis
D_2 = np.array([[1,2,3], [4,5,6], [7,8,9], [10,11,12]]) # 2-D array == list of lists
print(D_2) # rows, columns OR x, y

[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]


In [13]:
D_2.shape # No.of rows 4, No. of columns 3

(4, 3)

In [14]:
# size == product of the entries in shape
D_2.size # total number of elements in an array
            # size = rows * columns 

12

An array with N dimensions has rank N. Therefore, a 1-D array has rank 1 and a 2-D array has rank 2. Here "rank" means the number of axes, not the matrix rank from Linear Algebra.
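
The number of dimensions can be read directly from the `ndim` attribute. A minimal sketch, with `vector` and `matrix` as illustrative names:

```python
import numpy as np

vector = np.array([1, 2, 3, 4, 5])          # one axis
matrix = np.array([[1, 2, 3], [4, 5, 6]])   # two axes

print(vector.ndim, vector.shape, vector.size)  # 1 (5,) 5
print(matrix.ndim, matrix.shape, matrix.size)  # 2 (2, 3) 6
```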

In [15]:
# let's create a rank-1 array
x = np.array(['Hello', 'World!'])
print(x)

['Hello' 'World!']


In [16]:
print('shape:', x.shape)
print('type_of_structure:', type(x))
print('dtype:', x.dtype) # Unicode strings of at most 6 characters

shape: (2,)
type_of_structure: <class 'numpy.ndarray'>
dtype: <U6


If we try to pass heterogeneous data:

In [17]:
x = np.array([1, 2.5, 'World'])
print(x) # all elements were upcast to strings
            # to make the array
                # homogeneous

['1' '2.5' 'World']


In [18]:
print('Shape:', x.shape)
print('type:', type(x))
print('dtype:', x.dtype)

Shape: (3,)
type: <class 'numpy.ndarray'>
dtype: <U32


In [20]:
for i in x: print(type(i))

<class 'numpy.str_'>
<class 'numpy.str_'>
<class 'numpy.str_'>


In [21]:
# specifying a particular dtype (typecasting); the floats lose their fractional part
x = np.array([1.5, 2.2, 3.7], dtype=np.int64)
print(x)
print('dtype:', x.dtype)

[1 2 3]
dtype: int64
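
An existing array can also be converted with the `astype` method. The sketch below (variable names are illustrative) shows that casting floats to integers drops the fractional part rather than rounding, so rounding has to be requested explicitly, here with `np.rint`.

```python
import numpy as np

floats = np.array([1.5, 2.2, -3.7])

# astype returns a new array; the fractional part is simply dropped.
truncated = floats.astype(np.int64)

# Round to the nearest integer first if that is what is wanted.
rounded = np.rint(floats).astype(np.int64)

print(truncated.tolist())  # [1, 2, -3]
print(rounded.tolist())    # [2, 2, -4]
print(floats.dtype)        # float64, the original is unchanged
```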


# Saving

In [23]:
x = np.array([1,2,3,4,5])
np.save('my_array.npy',x) # save in the current dir
                            # save as my_array.npy

# Loading

In [25]:
y = np.load('my_array.npy')
print(y)

[1 2 3 4 5]
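
Several arrays can be stored in one file with `np.savez`. This is a sketch under assumptions: the file name `my_arrays.npz` and the keys `first` and `second` are made up for illustration, and a temporary directory is used so that nothing is left on disk.

```python
import os
import tempfile
import numpy as np

a = np.array([1, 2, 3])
b = np.zeros((2, 2))

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name, chosen only for this example.
    path = os.path.join(tmp_dir, "my_arrays.npz")

    # Each keyword becomes the key under which the array is stored.
    np.savez(path, first=a, second=b)

    # np.load on an .npz file returns a dict-like archive.
    with np.load(path) as archive:
        loaded_a = archive["first"]
        loaded_b = archive["second"]

print(loaded_a.tolist())  # [1, 2, 3]
print(loaded_b.shape)     # (2, 2)
```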


# Built-in Functions to create ndarrays

In [2]:
import numpy as np

In [3]:
# creating numpy array of zeros
# with a specified shape
X = np.zeros((3,4)) # 3 rows, 4 cols
print(X)

[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]


In [4]:
print("dtype:", X.dtype)

dtype: float64


In [6]:
X = np.zeros((3,4), dtype=int)
print(X)

[[0 0 0 0]
 [0 0 0 0]
 [0 0 0 0]]


In [7]:
X.dtype

dtype('int32')

In [8]:
# ones
X = np.ones((4,5))
print(X)

[[1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]]


In [10]:
# creating an array filled with any value
X = np.full((4,3), 5.)
print(X)


[[5. 5. 5.]
 [5. 5. 5.]
 [5. 5. 5.]
 [5. 5. 5.]]


In [11]:
X = np.full((4,3), 5 , dtype=float)
print(X)

[[5. 5. 5.]
 [5. 5. 5.]
 [5. 5. 5.]
 [5. 5. 5.]]


`arange` generates arrays with specific numerical ranges.

`np.arange(start, stop, step)` takes up to three arguments; only `stop` is required.

It returns a 1-D array of evenly spaced values.

In [12]:
# with only one argument
np.arange(10) # stop is exclusive -> N-1

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [13]:
# with two arguments
np.arange(1,11) # start is inclusive, stop is exclusive

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [14]:
# with all three arguments
# (step defaults to one when omitted)
np.arange(1,10,2) # step of 2: take every second number

array([1, 3, 5, 7, 9])

A fundamental array in Linear Algebra is the identity matrix.
Matrix == 2-D array with rows and columns

An **Identity Matrix** is 

- a square matrix
- that has ones along its main diagonal
- and zeros everywhere else

In [33]:
# identity matrix
np.eye(5) # using 5 will give us 5x5 identity matrix
            # from top-left to bottom-right == diagonal
            # diagonal will be filled with ones

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])

In [34]:
# diagonal matrix with the given values on the main diagonal
np.diag([10,20])

array([[10,  0],
       [ 0, 20]])

For non-integer steps, use `linspace`.

It takes `start`, `stop` and `N` (the `num` argument)

and returns N evenly spaced numbers from start to stop, with both being inclusive.

Unlike `arange`, `linspace` requires two arguments: start and stop.

If N is not specified, the size of the array defaults to 50.
 

In [18]:
# rank 1 array with 10 numbers evenly spaced from zero to 25
np.linspace(0, 25, 10)

array([ 0.        ,  2.77777778,  5.55555556,  8.33333333, 11.11111111,
       13.88888889, 16.66666667, 19.44444444, 22.22222222, 25.        ])

In [19]:
# making stop excluded
np.linspace(0, 25, 10, endpoint=False)

array([ 0. ,  2.5,  5. ,  7.5, 10. , 12.5, 15. , 17.5, 20. , 22.5])

In [30]:
X = np.linspace(2,100) # will return an array of len = 50.
print(X)

[  2.   4.   6.   8.  10.  12.  14.  16.  18.  20.  22.  24.  26.  28.
  30.  32.  34.  36.  38.  40.  42.  44.  46.  48.  50.  52.  54.  56.
  58.  60.  62.  64.  66.  68.  70.  72.  74.  76.  78.  80.  82.  84.
  86.  88.  90.  92.  94.  96.  98. 100.]


In [32]:
len(X)

50
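
To see which step size `linspace` actually chose, it can be asked to return it with `retstep=True`. A minimal sketch:

```python
import numpy as np

# retstep=True makes linspace return (array, step) instead of just the array.
values, step = np.linspace(0, 25, 10, retstep=True)

print(len(values))   # 10
print(step)          # spacing between neighbouring values, 25 / 9
print(values[-1])    # 25.0, the stop value is included by default
```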

What if we want rank-2 arrays from arange and linspace?

We can use the reshape function along with arange or linspace.

# Reshape

**Reshape** - Converts an array into a specified shape

Note:

The new shape must be compatible with the number of elements in the original array.
Compatible means array.size = rows x columns

In [35]:
X = np.arange(20)
X

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19])

In [42]:
x = np.reshape(X, (4, 5)) # 4 x 5 = 20 = len(X)
                            
x

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

Some functions can also be applied as methods, which lets us chain several operations in a single line of code. NumPy methods are accessed like attributes, using dot notation, but are called with parentheses.

In [43]:
x = np.arange(20).reshape((4, 5))
x

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

In [44]:
x = np.linspace(0,25,10 ,endpoint=False).reshape((5,2))
x

array([[ 0. ,  2.5],
       [ 5. ,  7.5],
       [10. , 12.5],
       [15. , 17.5],
       [20. , 22.5]])
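
Two details of `reshape` are worth trying out: one dimension can be given as `-1` so that NumPy infers it, and an incompatible shape raises a `ValueError`. A short sketch (the name `auto_cols` is illustrative):

```python
import numpy as np

X = np.arange(20)

# -1 asks NumPy to work out the missing dimension: 20 / 4 = 5 columns.
auto_cols = X.reshape(4, -1)
print(auto_cols.shape)  # (4, 5)

# 3 x 7 = 21 does not match the 20 elements, so reshape refuses.
try:
    X.reshape(3, 7)
except ValueError as err:
    print("reshape failed:", err)
```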

# Random Numbers

Let's create NumPy arrays with random numbers. In Machine Learning we often need random matrices, for example when initializing the weights of a neural network.

**random function** - Creates an array of a given shape filled with random floats between zero and one, where zero is inclusive and one is exclusive.


In [1]:
import numpy as np

In [2]:
# NumPy library --> Random Module ==> functions
X = np.random.random((3,3))
print(X)

[[0.87479122 0.78680294 0.54901421]
 [0.70370197 0.04263657 0.89169666]
 [0.55885828 0.82215084 0.77093119]]
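
Random output changes on every run, which makes results hard to reproduce. One way to get repeatable numbers is a seeded generator; the sketch below uses `np.random.RandomState`, and the seed value 42 is an arbitrary choice for illustration.

```python
import numpy as np

# Two generators created with the same seed produce the same sequence.
rng_a = np.random.RandomState(42)
rng_b = np.random.RandomState(42)

first = rng_a.random_sample((3, 3))
second = rng_b.random_sample((3, 3))

print(np.array_equal(first, second))            # True
print(bool((first >= 0).all() and (first < 1).all()))  # True, values lie in [0, 1)
```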


# Random Integers

- random integers within a particular interval
- Takes three arguments i.e. 
    - lower boundary (inclusive)
    - upper boundary (exclusive)
    - shape

In [3]:
X = np.random.randint(4, 15, (4,5))
print(X)

[[14  9 14  7 14]
 [10  6 13 11 10]
 [ 8 11  6 10  8]
 [12 14  7  7 12]]
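
The inclusive lower and exclusive upper boundary can be checked on a larger sample. A minimal sketch, with the sample size chosen arbitrarily:

```python
import numpy as np

# 1000 random integers drawn from the interval [4, 15).
X = np.random.randint(4, 15, (1000,))

print(X.min() >= 4)   # True
print(X.max() < 15)   # True, 15 itself is never drawn
```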


Random arrays that satisfy certain statistical properties.

In [None]:
# Let's consider a random array with an average (mean) of zero
# mean = sum of elements / number of elements

# this uses two statistics:
# the mean
# and the standard deviation

# a 1000 x 1000 array that contains random floats drawn from
# a normal distribution
# with a given mean = 0 and standard deviation = 0.1

In [6]:
X = np.random.normal(0, 0.1, (1000,1000)) # mean, std, shape
X[:5,:5]

array([[-0.02536077,  0.08397534, -0.00807158, -0.13762636, -0.02852658],
       [ 0.05908222,  0.12873567, -0.08593089, -0.02532583, -0.20754659],
       [-0.0059777 , -0.00911044, -0.03823916, -0.05316137, -0.10648712],
       [-0.1100033 , -0.04201538,  0.11409059, -0.0670748 , -0.00805775],
       [ 0.01491143, -0.07625395,  0.23245333,  0.1327165 , -0.0076708 ]])

In [7]:
print('mean:', X.mean()) # close to zero
print('std:', X.std()) # very close to 0.1
print('max:', X.max()) # max and min are roughly
                        # symmetric about zero (the avg)
print('min:', X.min())
print('positive', (X>0).sum()) # count of +ve values
print('negative', (X<0).sum())
# about the same number of +ve and -ve floats

mean: -5.699276389837806e-05
std: 0.09995673915064661
max: 0.47289055362544286
min: -0.49012810416081676
positive 500131
negative 499869


Because the float approximation of 0.1 is actually slightly more than 0.1, and every intermediate sum is rounded again, adding several of them together shows the difference between the mathematically correct answer and the one that Python computes.

For further information, see the [Python tutorial on floating-point arithmetic](https://docs.python.org/3/tutorial/floatingpoint.html).

In [8]:
0.1+0.1+0.1+0.1+0.1+0.1+0.1+0.1+0.1+0.1

0.9999999999999999
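
Because of this, floats are usually compared with a tolerance instead of `==`. A minimal sketch using `math.isclose` and `np.isclose` with their default tolerances:

```python
import math
import numpy as np

total = sum([0.1] * 10)   # same sum as the cell above

print(total == 1.0)              # False, exact comparison fails
print(math.isclose(total, 1.0))  # True, equal within a small tolerance
print(np.isclose(total, 1.0))    # True, also works element-wise on arrays
```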

In [9]:
# Short-project
import numpy as np

# Using the built-in functions, create a
# 4 x 4 ndarray that ONLY
# contains consecutive odd numbers from 1 to 31 (inclusive)

X = 0 
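
A 4 x 4 array holds 16 elements, and the first 16 consecutive odd numbers run from 1 to 31. One possible solution sketch, combining `arange` and `reshape` from the sections above (other approaches work just as well):

```python
import numpy as np

# Odd numbers 1, 3, ..., 31: start at 1, stop before 32, step 2.
X = np.arange(1, 32, 2).reshape(4, 4)
print(X)
```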

# Accessing, Deleting, and Inserting Elements Into ndarrays

- NumPy arrays are mutable, i.e. they can be changed
- They support indexing 
- They can be sliced

This can be used in Machine Learning for separating the data, for example dividing a dataset into training, cross-validation and testing sets.

# Modifying Arrays
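
A minimal sketch of reading and changing elements of a rank-1 array by index (the values are chosen only for illustration):

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])

# Indexing starts at 0; negative indices count from the end.
print(x[0], x[-1])   # 1 5

# Assigning to an index changes the array in place.
x[3] = 20
print(x.tolist())    # [1, 2, 3, 20, 5]
```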

# Modifying 2D array elements
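
For rank-2 arrays an element is addressed with a row index and a column index. A short sketch with illustrative values:

```python
import numpy as np

X = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# X[row, column]
print(X[1, 2])       # 6

# Change a single element.
X[0, 0] = 100

# Replace a whole row at once.
X[2] = [0, 0, 0]

print(X.tolist())    # [[100, 2, 3], [4, 5, 6], [0, 0, 0]]
```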

# Deleting elements
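
Elements can be removed with `np.delete(ndarray, elements, axis)`. It returns a new array and leaves the original untouched. A sketch with illustrative names:

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
Y = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Rank 1: delete the first and last element.
x_trimmed = np.delete(x, [0, 4])

# Rank 2: axis=0 deletes rows, axis=1 deletes columns.
Y_no_row = np.delete(Y, 0, axis=0)
Y_no_cols = np.delete(Y, [0, 2], axis=1)

print(x_trimmed.tolist())   # [2, 3, 4]
print(Y_no_row.tolist())    # [[4, 5, 6], [7, 8, 9]]
print(Y_no_cols.tolist())   # [[2], [5], [8]]
```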

# Adding Values to Numpy Arrays

Note: the added rows and columns must have a shape that matches the array along the other axis
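
Values can be added at the end with `np.append(ndarray, elements, axis)`, which returns a new array. A sketch with illustrative values:

```python
import numpy as np

x = np.array([1, 2, 3])
Y = np.array([[1, 2, 3], [4, 5, 6]])

# Rank 1: append two values.
x_more = np.append(x, [4, 5])

# Rank 2: a new row needs 3 columns, a new column needs 2 rows.
Y_row = np.append(Y, [[7, 8, 9]], axis=0)
Y_col = np.append(Y, [[9], [10]], axis=1)

print(x_more.tolist())  # [1, 2, 3, 4, 5]
print(Y_row.tolist())   # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(Y_col.tolist())   # [[1, 2, 3, 9], [4, 5, 6, 10]]
```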

# Inserting Values in the array
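
Values can be placed at any position with `np.insert(ndarray, index, elements, axis)`; the new values go in front of the given index. A sketch with illustrative values:

```python
import numpy as np

x = np.array([1, 2, 5, 6])
Y = np.array([[1, 2, 3], [7, 8, 9]])

# Rank 1: insert 3 and 4 before index 2.
x_filled = np.insert(x, 2, [3, 4])

# Rank 2: insert a row before row 1.
Y_row = np.insert(Y, 1, [4, 5, 6], axis=0)

# Rank 2: insert a column of 5s before column 1.
Y_col = np.insert(Y, 1, 5, axis=1)

print(x_filled.tolist())  # [1, 2, 3, 4, 5, 6]
print(Y_row.tolist())     # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(Y_col.tolist())     # [[1, 5, 2, 3], [7, 5, 8, 9]]
```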

# Stacking

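NumPy also allows us to stack arrays on top of each other or side by side.

There are two options for stacking:

- Horizontal Stacking
- Vertical Stacking

The shapes of the arrays must be compatible for stacking: for 2-D arrays, vertical stacking needs the same number of columns and horizontal stacking needs the same number of rows.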
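
A minimal sketch of both options using `np.vstack` and `np.hstack`. The rank-1 array is reshaped into a column before horizontal stacking so that the row counts agree.

```python
import numpy as np

x = np.array([1, 2])
Y = np.array([[3, 4], [5, 6]])

# Vertical: x becomes a new row on top of Y (both have 2 columns).
stacked_v = np.vstack((x, Y))

# Horizontal: x is reshaped to a 2 x 1 column and placed to the right of Y.
stacked_h = np.hstack((Y, x.reshape(2, 1)))

print(stacked_v.tolist())  # [[1, 2], [3, 4], [5, 6]]
print(stacked_h.tolist())  # [[3, 4, 1], [5, 6, 2]]
```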

# Slicing

Types of Slicing:

1. ndarray[start:end] 
2. ndarray[start:]
3. ndarray[:end]

Note:
Start is included and end is excluded
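
For rank-2 arrays a slice is given per axis, separated by a comma. A short sketch of the three forms on a 4 x 5 array (variable names are illustrative):

```python
import numpy as np

X = np.arange(20).reshape(4, 5)

middle = X[1:3, 2:5]    # rows 1-2, columns 2-4
bottom = X[2:, :]       # from row 2 to the end, all columns
left = X[:, :2]         # all rows, first two columns
column = X[:, 2]        # a single index drops that axis: rank-1 result

print(middle.tolist())  # [[7, 8, 9], [12, 13, 14]]
print(bottom.shape)     # (2, 5)
print(left.shape)       # (4, 2)
print(column.tolist())  # [2, 7, 12, 17]
```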

# Copy vs View
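
A slice does not copy data: it is a view onto the original array, so changing the slice changes the original. An explicit `.copy()` gives independent data. A minimal sketch (the values 555 and 777 are arbitrary markers):

```python
import numpy as np

X = np.arange(20).reshape(4, 5)

# A slice is a view that shares memory with X.
view = X[1:3, 1:3]
view[0, 0] = 555                      # this also changes X[1, 1]

# copy() creates a new array with its own data.
copied = X[1:3, 1:3].copy()
copied[0, 0] = 777                    # X is not affected

print(X[1, 1])                        # 555
print(np.shares_memory(X, view))      # True
print(np.shares_memory(X, copied))    # False
```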

# Using arrays as indices
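
Instead of a single index or a slice, an array (or list) of indices or a boolean mask can be used to pick elements. A sketch with illustrative names:

```python
import numpy as np

X = np.arange(20).reshape(4, 5)

# Integer arrays select specific rows or columns.
rows = np.array([1, 3])
picked_rows = X[rows, :]
picked_cols = X[:, [0, 4]]

# A boolean mask selects the elements where the condition holds.
mask = X > 10
large = X[mask]

print(picked_rows.tolist())  # [[5, 6, 7, 8, 9], [15, 16, 17, 18, 19]]
print(picked_cols.tolist())  # [[0, 4], [5, 9], [10, 14], [15, 19]]
print(large.tolist())        # [11, 12, 13, 14, 15, 16, 17, 18, 19]
```

Unlike slices, these selections return copies, so modifying `picked_rows` does not change `X`.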

# Using Diagonal to slice out elements

`np.diag(ndarray, k=N)`

Extracts the elements along the diagonal defined by N. The default is k=0, which refers to the main diagonal. Values of k > 0 select elements in diagonals above the main diagonal, and values of k < 0 select elements in diagonals below the main diagonal.
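
A minimal sketch of the three cases on a 5 x 5 array (variable names are illustrative):

```python
import numpy as np

X = np.arange(25).reshape(5, 5)

main = np.diag(X)          # k=0, the main diagonal
above = np.diag(X, k=1)    # one diagonal above the main diagonal
below = np.diag(X, k=-1)   # one diagonal below the main diagonal

print(main.tolist())   # [0, 6, 12, 18, 24]
print(above.tolist())  # [1, 7, 13, 19]
print(below.tolist())  # [5, 11, 17, 23]
```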

# Unique Elements Extraction
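
The distinct values of an array can be extracted with `np.unique`, which returns them sorted. A sketch with illustrative values; `return_counts=True` additionally reports how often each value occurs.

```python
import numpy as np

X = np.array([[1, 2, 3], [5, 2, 8], [1, 2, 3]])

# Sorted unique values of the flattened array.
unique_values = np.unique(X)

# The same values together with their number of occurrences.
values, counts = np.unique(X, return_counts=True)

print(unique_values.tolist())  # [1, 2, 3, 5, 8]
print(counts.tolist())         # [2, 3, 2, 1, 1]
```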