## Hello guys! Welcome to Day-3 of the sessions. Hope you were able to complete the Day-2 task without any issues. If you did run into any, please post them on the issues tab.

### Let's begin this session right away!

Today we will look at what NumPy and Matplotlib are. NumPy is a Python library for working with arrays and matrices.

# Part-1 NumPy Basics

In [1]:
import numpy as np

## Create NumPy array

In [2]:
## Creating array using a list
ls1 = [1, 3, 4, 5] ## 1 dimensional list
arr1 = np.array(ls1)
print(arr1)

ls2 = [[1, 2, 3, 4],[6, 7, 8, 9]] ## Multi-dimensional list
data = np.array(ls2) ## creating a NumPy array
print(data)

print(data.ndim) 
print(data.shape)
print(data.dtype) 

[1 3 4 5]
[[1 2 3 4]
 [6 7 8 9]]
2
(2, 4)
int64


In [3]:
## Array of zeros
arr2 = np.zeros(5)
arr3 = np.zeros((1,5))
## Note the difference in shape of arr2 and arr3
arr4 = np.zeros((4,5)) ## higher dimensional array with 0s
print(arr2.shape)
print(arr3.shape)
print(arr4.shape)

(5,)
(1, 5)
(4, 5)


In [4]:
## empty creates an array without initializing its values, so the contents are whatever happened to be in memory.
np.empty((4, 2, 2))

array([[[ 7.90505033e-323,  0.00000000e+000],
        [-1.28822975e-231, -1.28822975e-231]],

       [[ 6.91691904e-323,  0.00000000e+000],
        [ 0.00000000e+000,  0.00000000e+000]],

       [[ 0.00000000e+000,  0.00000000e+000],
        [-1.28822975e-231, -1.28822975e-231]],

       [[ 4.94065646e-323,  0.00000000e+000],
        [-1.28822975e-231, -1.28822975e-231]]])

Let's pick up a few more details from the tables below:
![](data/arr2.png)
![](data/arr1.png)

### Data Types for ndarrays
The data type, or dtype, is a special object containing the information the ndarray needs to interpret a chunk of memory as a particular type of data.

In [5]:
ls = [[1, 2, 3, 4],[6, 7, 8, 9]]
arr1 = np.array(ls, dtype=np.float64)
print(arr1.dtype)
print(arr1)

float64
[[1. 2. 3. 4.]
 [6. 7. 8. 9.]]


The numerical dtypes are named the same way: a type name, like float or int, followed by a number indicating the number of bits per element. `Check out the tables below for more information.`
![](data/arr4.png)
![](data/arr3.png)
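
The naming rule above can be checked directly. This is a minimal sketch: the dictionary name `bits_per_element` and the array `small` are illustrative names, not part of the original notebook.

```python
import numpy as np

# The number in a dtype name is the size of one element in bits;
# itemsize reports the same size in bytes.
bits_per_element = {}
for dt in (np.int8, np.int32, np.float32, np.float64):
    d = np.dtype(dt)
    bits_per_element[d.name] = d.itemsize * 8
    print(d.name, d.itemsize, "bytes")

# Memory used by the data buffer = number of elements * itemsize
small = np.zeros(10, dtype=np.int16)
print(small.nbytes)  # 10 elements * 2 bytes each
```

Picking a smaller dtype is one simple way to reduce memory use when the value range allows it.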

In [6]:
## You can explicitly convert or cast an array from one dtype to another using ndarray’s astype method:
## Int ------> float
arr = np.array([1, 2, 3, 4, 5])
print(arr.dtype)
float_arr = arr.astype(np.float64)
print(float_arr.dtype)

## Float ------> Int
## If I cast some floating point numbers to be of integer dtype, the decimal part will be truncated:
arr = np.array([3.7, -1.2, -2.6, 0.5, 12.9, 10.1])
print(arr)
print(arr.astype(np.int32))

## You can also use another array’s dtype attribute:
int_array = np.arange(10)
calibers = np.array([.22, .270, .357, .380, .44, .50], dtype=np.float64)
print(int_array.astype(calibers.dtype))

int64
float64
[ 3.7 -1.2 -2.6  0.5 12.9 10.1]
[ 3 -1 -2  0 12 10]
[0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]


In [7]:
## If you have an array of strings representing numbers, you can use astype to convert them to numeric form:
numeric_strings = np.array(['1.25', '-9.6', '42'], dtype=np.bytes_) ## np.string_ was removed in NumPy 2.0; np.bytes_ is the same type
print(numeric_strings.astype(float))
## If casting fails for some reason (like a string that cannot be converted to float64), a ValueError will be raised

[ 1.25 -9.6  42.  ]
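
To see what a failed cast looks like, here is a small sketch. The array `bad_strings` and the flag `cast_failed` are made-up names for illustration. In current NumPy versions the failure surfaces as a `ValueError`.

```python
import numpy as np

# 'abc' cannot be parsed as a number, so the cast is expected to fail
bad_strings = np.array(['1.25', 'abc'])

try:
    bad_strings.astype(float)
    cast_failed = False
except ValueError as err:
    cast_failed = True
    print('cast failed:', err)
```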


## Operations between Arrays and Scalars
Arrays are important because they enable you to express batch operations on data `without writing any for loops.` This is usually called `vectorization`. Any arithmetic operation between equal-size arrays applies the operation elementwise:

In [8]:
ls = [[1, 2, 3, 4],[6, 7, 8, 9]]
arr = np.array(ls)
print(arr*arr)
print(arr-arr)
print(arr/arr)
print(1/arr)
print(arr**2)

[[ 1  4  9 16]
 [36 49 64 81]]
[[0 0 0 0]
 [0 0 0 0]]
[[1. 1. 1. 1.]
 [1. 1. 1. 1.]]
[[1.         0.5        0.33333333 0.25      ]
 [0.16666667 0.14285714 0.125      0.11111111]]
[[ 1  4  9 16]
 [36 49 64 81]]


Operations between differently sized arrays are called `broadcasting` and will be discussed in more detail later in this notebook.

## Basic Indexing and Slicing
![](data/arr5.png)

In [9]:
ls = [[1, 2, 3, 4],[6, 7, 8, 9]]
arr = np.array(ls)
print(arr)
print(arr[1][3])
## or use 
print(arr[1, 3])

[[1 2 3 4]
 [6 7 8 9]]
9
9


**`Array slices are views on the original array.` This means that the data is not copied, and any modifications to the view will be reflected in the source array**

In [10]:
arr = np.arange(7)
print(arr[3: 6])

arr[5:7] = 12 ## the value is propagated (broadcast) to the entire selection
print(arr)

[3 4 5]
[ 0  1  2  3  4 12 12]


**If you want a copy of a slice of an ndarray instead of a view, you will need to explicitly copy the array; for example arr[5:7].copy().**
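
A short sketch of the difference between a view and a copy. The names `view` and `copy` are illustrative, and `np.shares_memory` is used only to confirm which array points at the original data.

```python
import numpy as np

arr = np.arange(7)
view = arr[2:5]          # a slice: shares memory with arr
copy = arr[2:5].copy()   # an explicit copy: independent data

view[0] = 99             # also changes arr[2]
copy[1] = -1             # leaves arr untouched

print(arr)
print(np.shares_memory(arr, view), np.shares_memory(arr, copy))
```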

In [11]:
ls = [[1, 2, 3, 4],[6, 7, 8, 9]]
arr = np.array(ls)
print(arr)
print(arr[:2, 1:])

[[1 2 3 4]
 [6 7 8 9]]
[[2 3 4]
 [7 8 9]]


![](data/arr6.png)

In [12]:
## Let's involve boolean values as well
names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
data = np.random.randn(7, 4)
print(data)
bool1 = names == 'Will'
data[bool1, 2:4]

[[-1.26341676  0.19807986 -1.65446289  1.64156278]
 [ 0.50165996 -0.911429   -0.88195149 -0.15204294]
 [ 0.1529012  -0.70054799  1.8318881  -0.78679042]
 [ 0.20472269  1.3426561   1.17777795  0.53897844]
 [-1.00923506  0.13651547 -0.49546943  2.20156266]
 [-0.48036817 -0.6699796   0.08173798  0.07929976]
 [-0.41385267  0.43974374  0.84247203  0.22686388]]


array([[ 1.8318881 , -0.78679042],
       [-0.49546943,  2.20156266]])

In [13]:
## Some Fancy indexing
print(data[[6, -2, -1], [0, 3, 2]])
data[[6, -2, -1]][:, [0, 3, 2]]

[-0.41385267  0.07929976  0.84247203]


array([[-0.41385267,  0.22686388,  0.84247203],
       [-0.48036817,  0.07929976,  0.08173798],
       [-0.41385267,  0.22686388,  0.84247203]])

**`Note:`** ***Keep in mind that fancy indexing, unlike slicing, always copies the data into a new array.***
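
The copy behaviour can be verified with a small experiment. This is only a sketch; `picked` and `after_edit` are names chosen for illustration. Note that assigning *through* a fancy index still writes to the original array.

```python
import numpy as np

arr = np.arange(10)
picked = arr[[1, 3, 5]]   # fancy indexing returns a new array
picked[:] = 0             # editing the result does not touch arr
after_edit = arr.copy()   # snapshot of arr at this point

arr[[1, 3, 5]] = -1       # assignment through a fancy index does modify arr
print(after_edit)
print(arr)
```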

## Transposing Arrays and Swapping Axes
Transposing is a special form of reshaping which similarly returns a view on the underlying data `without copying anything.`

In [14]:
arr = np.arange(15).reshape((3, 5))
print(arr, '\n')
print(arr.T, '\n')


arr = np.arange(16).reshape((2, 2, 4))
print(arr, '\n')
print(arr.T, '\n')
print(arr.swapaxes(0,1), '\n') ## simply swaps two axes
print(arr.transpose((1,0,2)), '\n') ## can permute several axes at once

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]] 

[[ 0  5 10]
 [ 1  6 11]
 [ 2  7 12]
 [ 3  8 13]
 [ 4  9 14]] 

[[[ 0  1  2  3]
  [ 4  5  6  7]]

 [[ 8  9 10 11]
  [12 13 14 15]]] 

[[[ 0  8]
  [ 4 12]]

 [[ 1  9]
  [ 5 13]]

 [[ 2 10]
  [ 6 14]]

 [[ 3 11]
  [ 7 15]]] 

[[[ 0  1  2  3]
  [ 8  9 10 11]]

 [[ 4  5  6  7]
  [12 13 14 15]]] 

[[[ 0  1  2  3]
  [ 8  9 10 11]]

 [[ 4  5  6  7]
  [12 13 14 15]]] 



## Universal Functions: Element-wise Array Functions

In [15]:
arr = np.array([[1, 2, 3, 4],[5, 6, 7, 8]])
print(np.sqrt(arr))
print(np.exp(arr))

[[1.         1.41421356 1.73205081 2.        ]
 [2.23606798 2.44948974 2.64575131 2.82842712]]
[[2.71828183e+00 7.38905610e+00 2.00855369e+01 5.45981500e+01]
 [1.48413159e+02 4.03428793e+02 1.09663316e+03 2.98095799e+03]]


![](data/arr7.png)
![](data/arr8.png)

In [16]:
arr1 = np.arange(0, 7, 0.01)
arr2 = np.arange(0, 2, 0.1)
np.meshgrid(arr1, arr2) ## builds two coordinate grids covering every combination of values from arr1 and arr2

[array([[0.  , 0.01, 0.02, ..., 6.97, 6.98, 6.99],
        [0.  , 0.01, 0.02, ..., 6.97, 6.98, 6.99],
        [0.  , 0.01, 0.02, ..., 6.97, 6.98, 6.99],
        ...,
        [0.  , 0.01, 0.02, ..., 6.97, 6.98, 6.99],
        [0.  , 0.01, 0.02, ..., 6.97, 6.98, 6.99],
        [0.  , 0.01, 0.02, ..., 6.97, 6.98, 6.99]]),
 array([[0. , 0. , 0. , ..., 0. , 0. , 0. ],
        [0.1, 0.1, 0.1, ..., 0.1, 0.1, 0.1],
        [0.2, 0.2, 0.2, ..., 0.2, 0.2, 0.2],
        ...,
        [1.7, 1.7, 1.7, ..., 1.7, 1.7, 1.7],
        [1.8, 1.8, 1.8, ..., 1.8, 1.8, 1.8],
        [1.9, 1.9, 1.9, ..., 1.9, 1.9, 1.9]])]

## Sorting

NumPy arrays can be sorted `in-place using the sort method`, meaning that the array contents are rearranged without producing a new array.
The top-level function np.sort returns a `sorted copy of an array` instead of modifying the array in place.

In [17]:
arr = np.array([[2, 5, 100, 4],[153, 62, 37, 82]])
np.sort(arr)

array([[  2,   4,   5, 100],
       [ 37,  62,  82, 153]])

In [18]:
## using sort method
arr.sort()
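
The two sorting styles can be compared side by side. This is a minimal sketch using the same array as the cells above; `sorted_copy`, `col_sorted` and `result` are illustrative names.

```python
import numpy as np

arr = np.array([[2, 5, 100, 4], [153, 62, 37, 82]])

sorted_copy = np.sort(arr)          # new array, arr is unchanged
col_sorted = np.sort(arr, axis=0)   # sort down each column instead of along rows

result = arr.sort()                 # sorts arr itself and returns None
print(result)
print(arr)
```

Because the method returns `None`, writing `arr = arr.sort()` would throw the array away.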

### Indirect Sorts: argsort and lexsort

In [19]:
values = np.array([5, 0, 1, 3, 2]) 
indexer = values.argsort()
print(indexer)
values[indexer] 

[1 2 4 3 0]


array([0, 1, 2, 3, 5])

In [20]:
## lexsort is similar to argsort, but it performs an indirect lexicographical sort on multiple key arrays. The LAST key passed is the primary sort key.
first_name = np.array([22, 11, 45, 2]) 
last_name = np.array([34, 43, 2, 32]) 
sorter = np.lexsort((first_name, last_name))
print(first_name[sorter])
(last_name[sorter])

[45  2 22 11]


array([ 2, 32, 34, 43])
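
The key order in `lexsort` is easier to see when the primary key has ties. A small sketch with made-up data, where `dept` and `age` are hypothetical column names:

```python
import numpy as np

dept = np.array([2, 1, 2, 1])      # primary key (passed last)
age = np.array([30, 45, 25, 22])   # tie-breaker (passed first)

order = np.lexsort((age, dept))    # sort by dept, then by age within each dept
print(order)
print(dept[order], age[order])
```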

## Unique and Set logic

Try implementing the functions below yourself in a new cell.
![](data/arr9.png)
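
One possible starting point for the exercise, assuming the table covers the usual NumPy set routines. The arrays `a` and `b` are made-up data.

```python
import numpy as np

names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
unique_names = np.unique(names)   # sorted unique values

a = np.array([3, 1, 2, 3, 5])
b = np.array([2, 3, 7])

common = np.intersect1d(a, b)     # values present in both
either = np.union1d(a, b)         # values present in at least one
only_a = np.setdiff1d(a, b)       # values in a that are not in b
mask = np.isin(a, b)              # per-element membership test of a in b

print(unique_names, common, either, only_a, mask)
```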

## Random Number Generation
The numpy.random module supplements the built-in Python random module with functions for efficiently generating whole arrays of sample values from many kinds of probability distributions.

See Table 4-8 for a partial list of functions available in numpy.random.
![](data/arr10.png)
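
A brief sketch of generating whole arrays of samples. It uses the `np.random.default_rng` generator interface (available in NumPy 1.17 and later) rather than the older `np.random.randn` style used earlier in this notebook. The seed value 42 is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=42)   # seeded so the run is repeatable

normal_samples = rng.standard_normal((4, 4))   # standard normal, shape (4, 4)
uniform_samples = rng.random(5)                # uniform on [0, 1)
dice = rng.integers(1, 7, size=10)             # integers from 1 to 6 inclusive

print(normal_samples.shape, uniform_samples.shape, dice)
```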

## Vectorization
Vectorization speeds up Python code by replacing explicit loops with operations on whole arrays, which can reduce running time considerably.
To do so, Python libraries provide standard mathematical functions that operate quickly on entire arrays of data without having to write loops. One such library is `numpy`.
Refer to the functions below for implementing vectorization in NumPy.
![](data/arr15.png)
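
To make the idea concrete, here is a sketch that computes a sum of squares twice, once with a Python loop and once with array operations. The names `total_loop` and `total_vec` are illustrative.

```python
import numpy as np

values = np.arange(1_000, dtype=np.float64)

# Loop version: one Python-level multiplication and addition per element
total_loop = 0.0
for v in values:
    total_loop += v * v

# Vectorized version: the loop runs inside NumPy's compiled code
total_vec = np.sum(values * values)

print(total_loop, total_vec)
```

Wrapping each version in `%timeit` in a notebook cell is a simple way to compare their speed on your own machine.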

## Matrix Multiplication

In [21]:
## Matrix multiplication using dot is the same as matrix multiplication in maths
## (the number of columns of x must equal the number of rows of y)
x = np.array([[1., 2., 3.], [4., 5., 6.]])
y = np.array([[6., 23.], [-1, 7], [8, 9]])

print(x.dot(y))
## or use this
np.dot(x, y)

[[ 28.  64.]
 [ 67. 181.]]


array([[ 28.,  64.],
       [ 67., 181.]])
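
For 2-D arrays the `@` operator (Python 3.5 and later) gives the same result as `dot`. A short sketch with the same matrices; `product` is an illustrative name.

```python
import numpy as np

x = np.array([[1., 2., 3.], [4., 5., 6.]])   # shape (2, 3)
y = np.array([[6., 23.], [-1, 7], [8, 9]])   # shape (3, 2)

product = x @ y        # matrix product, result has shape (2, 2)
print(product)
```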

In [22]:
## For elementwise multiplication
x = np.array([[1., 2., 3.], [4., 5., 6.]])
y = np.array([[6., 23., 9], [-1, 7, 8]])
print(np.multiply(x, y))

## np.multiply(x, y) is equivalent to x * y, including array broadcasting.
x*y

[[ 6. 46. 27.]
 [-4. 35. 48.]]


array([[ 6., 46., 27.],
       [-4., 35., 48.]])

# Part-2 Advanced NumPy

## Reshaping Arrays
Given what we know about NumPy arrays, it should come as little surprise that you can convert an array from one shape to another `without copying any data.`

In [23]:
arr = np.arange(20)
arr = arr.reshape((5,4))
print(arr)
## Reshaping from a higher dimension to 1 dimension is called raveling or flattening
print(arr.flatten())
print(arr.ravel())
## The flatten method behaves like ravel, except that it always returns a copy of the data

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]
 [16 17 18 19]]
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]
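
The copy-versus-view difference between `flatten` and `ravel` can be checked as follows. This is a sketch; for a contiguous array like this one `ravel` returns a view, but it may have to copy in other cases.

```python
import numpy as np

arr = np.arange(20).reshape((5, 4))

flat_copy = arr.flatten()   # always a copy
flat_view = arr.ravel()     # a view here, because arr is contiguous

flat_view[0] = -1           # writes through to arr[0, 0]
flat_copy[1] = -2           # arr[0, 1] is not affected

print(arr[0])
print(np.shares_memory(arr, flat_view), np.shares_memory(arr, flat_copy))
```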


## C versus Fortran Order
By default, NumPy arrays are created in row major order. Spatially this means that if you have a two-dimensional array of data, the items in each row of the array are stored in adjacent memory locations. The alternative to row major ordering is column major order, which means that (you guessed it) values within each column of data are stored in adjacent memory locations.
![](data/arr11.png)

In [24]:
print(arr.ravel('F')) ## F argument for column major
print(arr.flatten('C')) ## C argument for row major

[ 0  4  8 12 16  1  5  9 13 17  2  6 10 14 18  3  7 11 15 19]
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]


## Concatenating and Splitting Arrays

In [25]:
arr1 = np.array([[1, 2, 3], [4, 5, 6]]) 
arr2 = np.array([[7, 8, 9], [10, 11, 12]]) 
print(np.concatenate([arr1, arr2], axis=0)) ## along row axis
print(np.concatenate([arr1, arr2], axis=1)) ## along column axis

arr3 = np.arange(20).reshape((5,4))
a1, a2, a3 = np.split(arr3, [1, 3]) ## cuts before row indices 1 and 3 along axis=0
print(a1)
print(a2)
print(a3)

[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]
[[ 1  2  3  7  8  9]
 [ 4  5  6 10 11 12]]
[[0 1 2 3]]
[[ 4  5  6  7]
 [ 8  9 10 11]]
[[12 13 14 15]
 [16 17 18 19]]


See Table 12-1 for a list of all relevant concatenation and splitting functions, some of which are provided only as a convenience wrapper around the very general-purpose concatenate.
![](Images/arr12.png)

## Fancy Indexing Equivalents: Take and Put

In [26]:
arr = np.arange(10) * 100
inds = [7, 1, 2, 6]
print(arr.take(inds))

arr.put(inds, 42)
print(arr)
## As of this writing, the take and put functions in general have better performance than their fancy indexing equivalents by a significant margin

[700 100 200 600]
[  0  42  42 300 400 500  42  42 800 900]


## Broadcasting

See Figures 12-4 and 12-5 for an illustration of this operation.
![](data/arr13.png)
![](data/arr14.png)

In [27]:
arr = np.random.rand(4, 3)
print(arr.mean(0))
demeaned = arr - arr.mean(0)
demeaned

[0.74122388 0.75013318 0.33044501]


array([[ 0.2332351 ,  0.18333927,  0.05186026],
       [ 0.05869641,  0.07708805, -0.13090015],
       [-0.29802306, -0.03521836,  0.3347184 ],
       [ 0.00609155, -0.22520896, -0.2556785 ]])
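
Subtracting column means works directly because the shapes `(4, 3)` and `(3,)` line up on the trailing axis. Subtracting row means needs the means reshaped to `(4, 1)` first. A sketch, with the seed and variable names chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
arr = rng.random((4, 3))

col_demeaned = arr - arr.mean(axis=0)         # (4, 3) minus (3,) broadcasts over rows

row_means = arr.mean(axis=1, keepdims=True)   # shape (4, 1) instead of (4,)
row_demeaned = arr - row_means                # (4, 3) minus (4, 1) broadcasts over columns

try:
    arr - arr.mean(axis=1)                    # (4, 3) minus (4,) does not line up
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print(row_means.shape, mismatch_raised)
```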

In [28]:
## Setting array values using broadcasting
arr = np.empty((4, 3))
arr[:] = 5
arr

array([[5., 5., 5.],
       [5., 5., 5.],
       [5., 5., 5.],
       [5., 5., 5.]])

## Structured Arrays
A structured array is an ndarray in which each element can be thought of as representing a struct in C (hence the “structured” name) or a row in a SQL table with multiple named fields.

Since each element in the array is represented in memory as a fixed number of bytes, structured arrays provide a very fast and efficient way of writing data to and from disk, transporting it over the network, and other such uses.

In [29]:
dtype = [('x', np.float64), ('y', np.int32)]
sarr = np.array([(1.5, 6), (np.pi, -2)], dtype=dtype)
print(sarr)
print(sarr[0])
print(sarr[0]['y'])
print(sarr['x'])

[(1.5       ,  6) (3.14159265, -2)]
(1.5, 6)
6
[1.5        3.14159265]
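
The fixed-size layout can be seen by round-tripping the array through raw bytes. A minimal sketch: `raw` and `restored` are illustrative names, and the byte counts assume the default unaligned (packed) layout for this dtype.

```python
import numpy as np

dtype = np.dtype([('x', np.float64), ('y', np.int32)])
sarr = np.array([(1.5, 6), (np.pi, -2)], dtype=dtype)

print(dtype.itemsize)   # 8 bytes for x + 4 bytes for y per element

raw = sarr.tobytes()                         # plain bytes, ready to write or send
restored = np.frombuffer(raw, dtype=dtype)   # reinterpret the bytes with the same dtype

print(len(raw), restored)
```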


## Performance Tips
Getting good performance out of code utilizing NumPy is often straightforward, as array operations typically replace comparatively slow pure Python loops. Here is a brief list of some of the things to keep in mind:
<ol>
    <li>Convert Python loops and conditional logic to array operations and boolean array operations</li>
    <li>Use broadcasting whenever possible</li>
    <li>Avoid copying data using array views (slicing)</li>
</ol>
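
As an illustration of the first tip, here is a sketch that replaces a loop containing an `if`/`else` with a single `np.where` call. The data and names are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
values = rng.standard_normal(1000)

# Loop with conditional logic: negative values are replaced by 0
clipped_loop = np.empty_like(values)
for i, v in enumerate(values):
    clipped_loop[i] = v if v > 0 else 0.0

# Same logic as one boolean array operation
clipped_vec = np.where(values > 0, values, 0.0)

print(np.array_equal(clipped_loop, clipped_vec))
```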

### The Importance of Contiguous memory
Operations accessing contiguous blocks of memory (for example, summing the rows of a C order array) will generally be the fastest because:
<ol>
<li>The memory subsystem will buffer the appropriate blocks of memory into the ultrafast L1 or L2 CPU cache.</li>
    <li>Also, certain code paths inside NumPy’s C codebase have been optimized for the contiguous case in which generic strided memory access can be avoided.</li>
</ol>

To say that an array’s memory layout is contiguous means that the elements are stored in memory in the order that they appear in the array with respect to Fortran (column major) or C (row major) ordering. These properties can be explicitly checked via the flags attribute on the ndarray:

In [30]:
arr_c = np.ones((1000, 1000), order='C') 
arr_f = np.ones((1000, 1000), order='F')
print(arr_c.flags) 
arr_f.flags

  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False



  C_CONTIGUOUS : False
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False

In [31]:
%timeit arr_c.sum(1)

399 µs ± 17.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


In [32]:
%timeit arr_f.sum(1)

645 µs ± 11.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
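
The `strides` attribute gives a rough picture of why the two layouts behave differently: it holds the number of bytes to step along each axis. A small sketch on a `(3, 4)` array, assuming the default `float64` dtype (8 bytes per element):

```python
import numpy as np

arr_c = np.ones((3, 4), order='C')
arr_f = np.ones((3, 4), order='F')

print(arr_c.strides)   # (32, 8): neighbours within a row are adjacent in memory
print(arr_f.strides)   # (8, 24): neighbours within a column are adjacent in memory

# Transposing a C-order array gives an F-contiguous view without copying
transposed = arr_c.T
print(transposed.flags['F_CONTIGUOUS'], np.shares_memory(arr_c, transposed))
```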


## Next, let's move on to Matplotlib [link](matplotlib.ipynb)