# Numpy

Numpy was released as an open source project in 2005 with the goal of bringing scientific computing to Python. It was based on two earlier packages, Numeric and Numarray. Its strength lies in its ability to work with multi-dimensional array objects much faster than is possible with plain Python lists. Numpy leverages BLAS (Basic Linear Algebra Subprograms) and LAPACK (Linear Algebra PACKage) to supercharge its linear algebra capabilities. The library is focused solely on numbers and excels in numerical analysis, linear algebra and simulation. However, when it comes to data analysis and manipulation across a wide range of data sources, that's where pandas steps in.

### Characteristics of Numpy
- Numpy arrays have a fixed size at creation, unlike python lists that can grow dynamically. Changing the size of an ndarray will create a new array and delete the original.


- The elements in a numpy array are required to be of a homogeneous datatype, hence they will be the same size in memory. The exception is an array of objects, which allows for elements of different sizes.


- It supports an object-oriented approach


- Numpy is fast due to vectorization and broadcasting capabilities.

**Vectorization** describes the absence of explicit looping and indexing in the code. Vectorized code is more concise and easier to read because there are fewer lines of code, which generally means fewer bugs. The code also more closely resembles standard mathematical notation.

**Broadcasting** describes the implicit element-by-element behaviour of operations. All numpy operations broadcast.
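
As a small illustration of the characteristics above, the sketch below compares an explicit loop with its vectorized equivalent and shows that appending produces a new array. The `prices` values are made up for the example.

```python
import numpy as np

prices = np.array([10.0, 20.0, 30.0])   # hypothetical sample data

# Explicit loop: every element is indexed by hand
doubled_loop = np.empty_like(prices)
for i in range(len(prices)):
    doubled_loop[i] = prices[i] * 2

# Vectorized: the loop runs inside numpy, and the scalar 2 is broadcast
doubled_vec = prices * 2

print(doubled_vec)                                # [20. 40. 60.]
print(np.array_equal(doubled_loop, doubled_vec))  # True

# Appending returns a new, larger array; the original keeps its size
longer = np.append(prices, 40.0)
print(prices.shape, longer.shape)                 # (3,) (4,)
```

Both versions give the same numbers; the vectorized one simply leaves the looping to numpy.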

In [1]:
#installing numpy

!pip install numpy



In [4]:
lst = [1,2,3,4,5,6,7,8,9]
print(type(lst))
print('List: ', lst)

<class 'list'>
List:  [1, 2, 3, 4, 5, 6, 7, 8, 9]


## Creating a numpy array

In [5]:
import numpy as np

my_array = np.array(lst)
print(type(my_array))
print('Numpy array: ', my_array)

<class 'numpy.ndarray'>
Numpy array:  [1 2 3 4 5 6 7 8 9]


## Creating a simple numpy array using np.arange

It creates arrays of evenly spaced values within a specific range.

In [6]:
arr = np.arange(30)

print(type(arr))
print(arr)

<class 'numpy.ndarray'>
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
 24 25 26 27 28 29]


## Comparing lists and numpy

In [7]:
%%time

#always put the "%%time" at the very beginning of the code, before any comments

#what we are trying to do is an element-wise multiplication of a list with itself

#the list that we defined
lst = list(range(1000000))

#multiplying each element by itself

for i in range(1000000):
    lst[i] * lst[i]
    
#Note that "i" serves as the index here: we get the value at
#the current index and then multiply it by itself

Wall time: 213 ms


In [8]:
%%time

arr = np.arange(1000000)

arr = arr*arr

Wall time: 6 ms


You can note the difference in the computation time. Wall time means the elapsed real time or running time. It is the actual time taken from the start of a process to its completion measured by a real-world clock.

## Numpy array and its attributes/properties - reminds me of OOP

In [17]:
import numpy as np

arr = np.array([[1,2,3,4], [3,4,5,6]])

print(f'Array: {arr}')

#print shape of the array
print('Shape: ', arr.shape)

#print datatype
print('Datatype: ', arr.dtype)

#print item size in bytes of each element
print('Item size: ', arr.itemsize)

#print the dimensionality of the numpy array
print('Dimensionality: ', arr.ndim)

Array: [[1 2 3 4]
 [3 4 5 6]]
Shape:  (2, 4)
Datatype:  int32
Item size:  4
Dimensionality:  2


In [18]:
arr

array([[1, 2, 3, 4],
       [3, 4, 5, 6]])

## Functions for creating numpy array

### 1. np.arange()

Takes 1 to 3 arguments.

We've seen the one argument case above.

Notice also that for np.arange(10) it returns 10 integers, starting from 0 to 9.

In [19]:
#two argument case

import numpy as np
arr = np.arange(1, 10) #second parameter defines the stop point. It is exclusive as well

print('Array: ', arr)
print('Shape:',arr.shape)


Array:  [1 2 3 4 5 6 7 8 9]
Shape: (9,)


In [22]:
#three argument case

import numpy as np
arr = np.arange(1,20,2)

print('Array:', arr)
print('Shape: ', arr.shape)

Array: [ 1  3  5  7  9 11 13 15 17 19]
Shape:  (10,)


In [23]:
help(np.arange)

Help on built-in function arange in module numpy:

arange(...)
    arange([start,] stop[, step,], dtype=None, *, like=None)
    
    Return evenly spaced values within a given interval.
    
    Values are generated within the half-open interval ``[start, stop)``
    (in other words, the interval including `start` but excluding `stop`).
    For integer arguments the function is equivalent to the Python built-in
    `range` function, but returns an ndarray rather than a list.
    
    When using a non-integer step, such as 0.1, the results will often not
    be consistent.  It is better to use `numpy.linspace` for these cases.
    
    Parameters
    ----------
    start : integer or real, optional
        Start of interval.  The interval includes this value.  The default
        start value is 0.
    stop : integer or real
        End of interval.  The interval does not include this value, except
        in some cases where `step` is not an integer and floating point
        round-off 

### 2. np.ones()

In [27]:
arr = np.ones((3,3))

print('Array: ', arr)
print('Shape: ', arr.shape)
print('Data type: ', arr.dtype)
print('Item size: ', arr.itemsize)

Array:  [[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]
Shape:  (3, 3)
Data type:  float64
Item size:  8


### 3. np.zeros()

In [28]:
arr = np.zeros((3,3))

print('Array: ', arr)
print('Shape: ', arr.shape)
print('Data type: ', arr.dtype)
print('Item size: ', arr.itemsize)

Array:  [[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]
Shape:  (3, 3)
Data type:  float64
Item size:  8


### 4. np.eye()

In [30]:
arr = np.eye(3,3)

print('Array: ', arr)
print('Shape: ', arr.shape)
print('Data type: ', arr.dtype)
print('Item size: ', arr.itemsize)

Array:  [[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
Shape:  (3, 3)
Data type:  float64
Item size:  8


In [32]:
arr =np.eye(3,4)

print('Array: ', arr)
print('Shape: ', arr.shape)
print('Data type: ', arr.dtype)
print('Item size: ', arr.itemsize)

Array:  [[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]]
Shape:  (3, 4)
Data type:  float64
Item size:  8


## Data types in Numpy

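- i = (signed) integer
- b = boolean
- S = (byte) string
- f = float
- m = timedelta
- M = datetime
- O = object
- u = unsigned integer
- c = complex float
- U = unicode string
- V = fixed chunk of memory of other type (void)

These letters are numpy's dtype *kind* codes. When a single letter is passed as `dtype = '...'`, it is read as a type character code instead, so `'b'` gives a signed byte (`int8`), as the next cell shows. Use `'?'` or `bool` for booleans.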
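
To make the codes less abstract, here is a minimal sketch that asks numpy what a few one-letter strings and built-in types resolve to. It only uses `np.dtype` and its `kind` attribute.

```python
import numpy as np

# A one-letter string passed to np.dtype is read as a type character code
print(np.dtype('b'))   # int8 (a signed byte, not a boolean)
print(np.dtype('?'))   # bool
print(np.dtype('f'))   # float32

# The .kind attribute reports the general category of a dtype
print(np.dtype(bool).kind)        # b
print(np.dtype(np.int64).kind)    # i
print(np.dtype(np.float64).kind)  # f
print(np.dtype('U5').kind)        # U
```

Note how `'b'` means boolean as a kind but signed byte as a type character, which is easy to mix up.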

In [35]:
import numpy as np

arr = np.array([1,3,5,6,7], dtype = 'b')

print('Array: ', arr)
print('Data type: ', arr.dtype)

Array:  [1 3 5 6 7]
Data type:  int8


In [36]:
import numpy as np

arr = np.array([1,3,5,6,7], dtype = 'i')

print('Array: ', arr)
print('Data type: ', arr.dtype)

Array:  [1 3 5 6 7]
Data type:  int32


In [37]:
import numpy as np

arr = np.array([1,3,5,6,7], dtype = 'O')

print('Array: ', arr)
print('Data type: ', arr.dtype)

Array:  [1 3 5 6 7]
Data type:  object


In [38]:
import numpy as np

arr = np.array([1,3,5,6,7], dtype = 'm')

print('Array: ', arr)
print('Data type: ', arr.dtype)

Array:  [1 3 5 6 7]
Data type:  timedelta64


## Numpy Random numbers

1. np.random.rand = generates an array with random numbers that are **uniformly distributed** between 0 (inclusive) and 1 (exclusive)


2. np.random.randn = generates an array with random numbers that are **normally distributed** with mean  = 0 and stdev = 1


3. np.random.randint = generates an array with random integers that are **uniformly distributed** between a lower bound (0 when only one number is given) and an upper bound, which is excluded


4. np.random.uniform = generates an array with random floats that are uniformly distributed between two given numbers.
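
Because these functions return different numbers on every run, it can help to fix a seed while experimenting. The sketch below is one way to do that; the seed value 42 is arbitrary, and `np.random.default_rng` assumes a reasonably recent numpy (1.17 or later).

```python
import numpy as np

np.random.seed(42)            # 42 is an arbitrary seed
first = np.random.rand(3)

np.random.seed(42)            # resetting the same seed repeats the sequence
second = np.random.rand(3)

print(np.array_equal(first, second))   # True

# The newer Generator interface keeps the seed inside its own object
rng = np.random.default_rng(42)
ints = rng.integers(0, 10, size=5)     # upper bound 10 is excluded
print(ints.shape)                      # (5,)
```

With a fixed seed, the outputs in a notebook stay the same when cells are re-run.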

In [42]:
# help(np.random)

#you can obtain more information on the use of np.random by seeking help as seen above

In [44]:
#random generation of an array from uniform distribution

import numpy as np

#prints 5 random numbers generated from a uniform distribution (between 0 and 1)
arr = np.random.rand(5)

print('Array: ', arr)

Array:  [0.2261839  0.17714786 0.05305274 0.45187612 0.12897024]


In [45]:
#randomly generate an array with 10 rows and 2 columns from uniform distribution

arr = np.random.rand(10,2)
print('Array: ', arr)

Array:  [[0.65530671 0.17934809]
 [0.69203059 0.1417652 ]
 [0.99304708 0.45380322]
 [0.6825353  0.16783538]
 [0.03194995 0.16719442]
 [0.77207637 0.54592942]
 [0.69072642 0.38213148]
 [0.95283933 0.81162294]
 [0.79492393 0.79651552]
 [0.6347463  0.34470269]]


Can the precision of these numbers generated be altered? I wonder.
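
One possible answer, sketched below: the values can be rounded with `np.round`, only the display can be shortened with `np.printoptions`, or the array can be converted to a smaller float type with `astype`. The three input values are copied from the first output above.

```python
import numpy as np

arr = np.array([0.2261839, 0.17714786, 0.05305274])  # values from the output above

# Round the stored values to 2 decimal places (returns a new array)
rounded = np.round(arr, 2)
print(rounded)   # [0.23 0.18 0.05]

# Change only how the array is displayed; the stored values are untouched
with np.printoptions(precision=3):
    print(arr)   # [0.226 0.177 0.053]

# Store the numbers in a smaller float type (4 bytes instead of 8)
small = arr.astype(np.float32)
print(small.dtype, small.itemsize)   # float32 4
```

Rounding changes the data, while print options only change how it is shown.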

In [47]:
#randomly generate an array from normal distribution

import numpy as np

arr = np.random.randn(5)
print('Array: ', arr)

Array:  [-0.18218322  0.04749454 -0.98962698  2.01068176 -1.10098055]


In [48]:
#randomly generate an array with 5 rows and 4 columns from a normal distribution

import numpy as np

arr = np.random.randn(5,4)
print('Array: ', arr)

Array:  [[-6.29923415e-01 -8.79633452e-01 -4.22773993e-01 -1.09965252e-01]
 [ 1.12146865e-01 -2.41262694e-01 -2.14369689e+00  3.64377990e-01]
 [-6.55830159e-01 -3.76506189e-01 -4.48950853e-01  1.56743313e+00]
 [-7.23262750e-02 -6.40407606e-01 -1.13361504e+00  6.76568877e-01]
 [-8.65238223e-04 -7.22651899e-01  1.41491735e+00 -6.54596590e-01]]


In [49]:
#generate one random number between 0 to 9

import numpy as np

value = np.random.randint(10)
print(value)

4


In [51]:
#randomly generate a 5 by 4 array containing values in the range of 0 to 9

arr = np.random.randint(10, size = (5,4))

print('Array: \n', arr)

Array: 
 [[2 6 7 1]
 [0 6 6 8]
 [3 6 0 9]
 [7 3 1 3]
 [5 9 0 4]]


In [53]:
#randomly generating a 5 by 10 array containing values in the range of 10 to 50

arr = np.random.randint(10, 51, size = (5, 10))

print('Array: \n', arr)

Array: 
 [[17 24 43 23 39 32 32 38 38 24]
 [16 19 39 21 12 27 20 42 27 37]
 [10 46 28 43 19 28 46 40 44 32]
 [17 30 42 18 28 30 16 19 10 48]
 [50 26 15 22 39 19 27 14 46 15]]


In [54]:
#randomly generate one decimal number between 0 and 10
#(np.random.uniform(10) would set low = 10 and leave high at its default of 1.0)

value = np.random.uniform(0, 10)

print(value)

3.7037892780723665


In [56]:
arr = np.random.uniform(10, 50, size = (2,3))
arr

array([[17.02372786, 48.47400111, 16.80233751],
       [49.84379724, 24.40858166, 45.41640197]])

## Numpy array - indexing, slicing and updating

Data in numpy arrays is stored sequentially, so it can be accessed with indexing and slicing operations. Numpy offers more indexing facilities than regular python sequences: besides integers and slices, arrays can also be indexed by arrays of integers and arrays of booleans.

In addition, numpy arrays are mutable, that is, data stored in numpy arrays can be changed or updated.

### Data accessing using Indexing

In [57]:
#randomly generating 1 dimensional array

arr = np.random.randint(100, size = (5,))
print('Array: ', arr)

Array:  [50 95 44 85 74]


In [58]:
#accessing values at index 2 and 4

print(arr[2])
print(arr[4])

44
74


In [69]:
#randomly generating 2 dimensional array

arr = np.random.randint(100, size = (4,6))
print('Array: \n', arr)

Array: 
 [[39 85 73 21 49 69]
 [17 41 49 79 11  0]
 [84 92 70 73 99 94]
 [86 40  8 24 15 95]]


In [70]:
#accessing the row at index 2 (the third row)
#remember that the 2D array above behaves like a list of lists

print(arr[2])

[84 92 70 73 99 94]


In [71]:
#to retrieve the second value of the third row (index 2)

print(arr[2,1])

#OR 

print(arr[2][1])
#this method is useful to access multiple values

92
92


In [72]:
#to retrieve the 4th and 2nd values from the first row

#the first list specifies the rows to select from
#the second list specifies the columns to select from

print(arr[[0,0], [3,1]])


[21 85]


In [73]:
#to retrieve the third value from the first row
#and the 4th value from the 4th row

print(arr[[0,3], [2,3]])

[73 24]


### Data accessing using slicing

In [74]:
#randomly generating 1D array

import numpy as np

arr = np.random.randint(100, size = (10, ))
print('Array: ', arr)

Array:  [19 51 42 32 62 91 89 59 76 35]


In [77]:
#accessing using slicing

print(arr[1:4])
print(arr[0:-4])

[51 42 32]
[19 51 42 32 62 91]


In [78]:
print(arr[::2])

[19 42 62 89 76]


In [81]:
arr = np.array([[1,2,3,4], [5,6,7,8], [9,0,1,2], [3,4,5,6]])
print('Array: \n', arr)
print('Shape: ', arr.shape)

Array: 
 [[1 2 3 4]
 [5 6 7 8]
 [9 0 1 2]
 [3 4 5 6]]
Shape:  (4, 4)


In [82]:
#using two way accessing

print(arr[0:2, 2:4])

[[3 4]
 [7 8]]


### Indexing with boolean arrays

In [83]:
#randomly generating one dimensional array
import numpy as np

arr = np.random.randint(100, size = (10,))

print('Array:', arr)

Array: [32 53 79 36 87 80 51 75 82 65]


In [84]:
index = [True, False, False, True, False, True, True, True, False, True]

print(arr[index])

[32 36 80 51 75 65]


### Updating data in numpy arrays

In [88]:
#randomly generate a 2 dimensional array

arr = np.random.randint(100, size = (5, 2))

print('Original array: \n',arr)

Original array: 
 [[ 6 12]
 [43 84]
 [30 60]
 [92 11]
 [91 98]]


In [90]:
arr[1,1] = 99
print('Updated array:\n', arr)

Updated array:
 [[ 6 12]
 [43 99]
 [30 60]
 [92 11]
 [91 98]]


## Numpy Flatten and Ravel

In [92]:
#randomly generating a 5 by 10 array

arr = np.random.randint(10, 40, size = (5,10))

print('Array: \n', arr)
print('Shape: ', arr.shape)

Array: 
 [[11 31 26 37 24 26 26 35 34 21]
 [15 27 10 13 36 24 10 14 32 28]
 [37 18 11 23 31 18 36 15 30 19]
 [36 36 39 21 37 37 37 39 11 24]
 [12 15 36 21 11 37 36 11 39 14]]
Shape:  (5, 10)


In [94]:
#using flatten()

flatten_arr = arr.flatten()

print('Array: \n', flatten_arr)
print('Shape: ', flatten_arr.shape)

Array: 
 [11 31 26 37 24 26 26 35 34 21 15 27 10 13 36 24 10 14 32 28 37 18 11 23
 31 18 36 15 30 19 36 36 39 21 37 37 37 39 11 24 12 15 36 21 11 37 36 11
 39 14]
Shape:  (50,)


In [95]:
ravel_arr = arr.ravel()

print('Array: \n', ravel_arr)
print('Shape: ', ravel_arr.shape)

Array: 
 [11 31 26 37 24 26 26 35 34 21 15 27 10 13 36 24 10 14 32 28 37 18 11 23
 31 18 36 15 30 19 36 36 39 21 37 37 37 39 11 24 12 15 36 21 11 37 36 11
 39 14]
Shape:  (50,)


Both functions produce the same result here. But what's the difference?

In [97]:
#updating value in the array
print('Original array: \n', arr)

arr[1,1] = 78

print('Updated array: \n', arr)


Original array: 
 [[11 31 26 37 24 26 26 35 34 21]
 [15 27 10 13 36 24 10 14 32 28]
 [37 18 11 23 31 18 36 15 30 19]
 [36 36 39 21 37 37 37 39 11 24]
 [12 15 36 21 11 37 36 11 39 14]]
Updated array: 
 [[11 31 26 37 24 26 26 35 34 21]
 [15 78 10 13 36 24 10 14 32 28]
 [37 18 11 23 31 18 36 15 30 19]
 [36 36 39 21 37 37 37 39 11 24]
 [12 15 36 21 11 37 36 11 39 14]]


The differences between the two are as follows:

- Flatten creates a copy of the original array with a flattened layout. Any modifications made to the flattened array won't affect the original. Also, because it creates a new copy, it consumes more memory.

- Ravel returns a view of the original array whenever possible, which makes it more memory efficient. When it returns a view, any changes made to the raveled array will also modify the original array since they point to the same data. When a view is not possible (for example, on a non-contiguous array), it falls back to returning a copy.
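
The copy-versus-view behaviour can be checked directly. The sketch below uses a small `np.arange` array rather than the random one above so that the printed values are predictable.

```python
import numpy as np

arr = np.arange(6).reshape(2, 3)

flat_copy = arr.flatten()   # always a copy
flat_view = arr.ravel()     # a view here, because arr is contiguous

flat_copy[0] = 100          # changing the copy leaves arr alone
print(arr[0, 0])            # 0

flat_view[0] = 100          # changing the view changes arr as well
print(arr[0, 0])            # 100

print(np.shares_memory(arr, flat_copy))  # False
print(np.shares_memory(arr, flat_view))  # True

# ravel has to copy when a view is impossible, e.g. on a transposed array
print(np.shares_memory(arr.T, arr.T.ravel()))  # False
```

`np.shares_memory` is a handy way to confirm whether two arrays point at the same data.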

## Numpy reshape

In [98]:
#randomly generating a 5 by 10 array

arr = np.random.randint(50, 90, size = (5,10))

print('Array: \n', arr)
print('Shape: ', arr.shape)

Array: 
 [[67 84 84 56 83 72 60 68 71 76]
 [56 67 67 60 85 72 63 60 87 60]
 [69 80 70 61 70 79 51 81 75 84]
 [67 60 72 75 81 67 55 79 70 82]
 [57 64 54 88 51 52 50 64 76 81]]
Shape:  (5, 10)


In [99]:
arr_reshaped = arr.reshape(10,5)
print(arr_reshaped)

[[67 84 84 56 83]
 [72 60 68 71 76]
 [56 67 67 60 85]
 [72 63 60 87 60]
 [69 80 70 61 70]
 [79 51 81 75 84]
 [67 60 72 75 81]
 [67 55 79 70 82]
 [57 64 54 88 51]
 [52 50 64 76 81]]


In [100]:
arr_reshaped2 = arr.reshape(25,2)
print(arr_reshaped2)

[[67 84]
 [84 56]
 [83 72]
 [60 68]
 [71 76]
 [56 67]
 [67 60]
 [85 72]
 [63 60]
 [87 60]
 [69 80]
 [70 61]
 [70 79]
 [51 81]
 [75 84]
 [67 60]
 [72 75]
 [81 67]
 [55 79]
 [70 82]
 [57 64]
 [54 88]
 [51 52]
 [50 64]
 [76 81]]


In [101]:
arr_reshaped3 = arr.reshape(3,3)
print(arr_reshaped3)

ValueError: cannot reshape array of size 50 into shape (3,3)

## Iterating over Numpy arrays

### Generating a one dimensional array

In [114]:
arr1 = np.random.randint(10,49, size = (10,))
print('Array: \n', arr1)
print('Shape: ', arr1.shape)

Array: 
 [43 21 13 20 26 34 35 27 19 38]
Shape:  (10,)


In [115]:
#looping over items in the array
#(each print call ends with a newline by default, so every item gets its own line)

for i in arr1:
    print(i)

43
21
13
20
26
34
35
27
19
38


In [116]:
#redoing the above, this time printing on one line with end = ' '

for i in arr1:
    print(i, end = ' ')

43 21 13 20 26 34 35 27 19 38 

### Iterating over a 2 dimensional array

In [117]:
#randomly generating a 5 by 10 array

arr2 = np.random.randint(50, 90, size = (5,10))

print('Array: \n', arr2)
print('Shape: ', arr2.shape)

Array: 
 [[84 78 50 86 59 79 62 77 63 71]
 [76 82 64 69 51 52 53 82 50 88]
 [73 57 72 78 59 68 73 64 67 73]
 [78 84 76 66 59 57 74 67 53 74]
 [73 79 66 59 63 87 61 62 77 68]]
Shape:  (5, 10)


In [118]:
#looping over rows in the 2D array

for row in arr2:
    print(row)

[84 78 50 86 59 79 62 77 63 71]
[76 82 64 69 51 52 53 82 50 88]
[73 57 72 78 59 68 73 64 67 73]
[78 84 76 66 59 57 74 67 53 74]
[73 79 66 59 63 87 61 62 77 68]


In [119]:
#looping over each item in the arr2 array

for item in arr2.ravel():
    print(item, end = ' ')

84 78 50 86 59 79 62 77 63 71 76 82 64 69 51 52 53 82 50 88 73 57 72 78 59 68 73 64 67 73 78 84 76 66 59 57 74 67 53 74 73 79 66 59 63 87 61 62 77 68 

### Iterating using np.nditer()

In [120]:
#using nditer for the one dimensional array

for item in np.nditer(arr1):
    print(item, end = ' ')

43 21 13 20 26 34 35 27 19 38 

In [121]:
#using nditer for the 2D array

for item in np.nditer(arr2):
    print(item, end = ' ')

84 78 50 86 59 79 62 77 63 71 76 82 64 69 51 52 53 82 50 88 73 57 72 78 59 68 73 64 67 73 78 84 76 66 59 57 74 67 53 74 73 79 66 59 63 87 61 62 77 68 

In [122]:
#performing some calculations with nditer()

print('Original array: ', arr1)

for item in np.nditer(arr1):
    if item > 20:
        item[...] = (item*0)
print('Updated array: ', arr1)

Original array:  [43 21 13 20 26 34 35 27 19 38]


ValueError: assignment destination is read-only

We get this error because np.nditer opens the array as read-only by default, so the items it yields during iteration cannot be assigned to. To modify the array in place while iterating, we have to request write access with `op_flags = ['readwrite']`.

In [123]:
#trying again

print('Original array: ', arr1)

for item in np.nditer(arr1, op_flags = ['readwrite']):
    if item>20:
        item[...] = (item * 0)
        
print('Updated array: ', arr1)

Original array:  [43 21 13 20 26 34 35 27 19 38]
Updated array:  [ 0  0 13 20  0  0  0  0 19  0]


### **Exercise 1**

Write a program to generate a random 5 by 4 array containing positive integers. Update it by replacing all odd numbers with -1 (using a loop).

In [137]:
my_array = np.random.randint(100, size = (5, 4))

print('Original array: \n', my_array)

for item in np.nditer(my_array, op_flags = ['readwrite']):
    if item%2 != 0:
        item[...] = -1
        
print('Updated array: \n', my_array)

Original array: 
 [[20 99 20 84]
 [ 9 84 52 75]
 [89 20 14 20]
 [29 15  8 54]
 [45 68 67 78]]
Updated array: 
 [[20 -1 20 84]
 [-1 84 52 -1]
 [-1 20 14 20]
 [-1 -1  8 54]
 [-1 68 -1 78]]


In [138]:
#help(np.nditer)

### **Exercise 2**

Given an array [1, -10, 2,3,0,6], print the array in this order [0,6,-10,2,1,3]

In [166]:
array1 = np.array([1,-10, 2,3,0,6])

array2 = np.concatenate((array1[4:6], array1[1:3], np.array([array1[0]]), np.array([array1[3]])))

array2

array([  0,   6, -10,   2,   1,   3])
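
An alternative sketch for the same exercise: since arrays can be indexed by a list of integers, the target order can be written as a list of positions. The name `order` is introduced here only for illustration.

```python
import numpy as np

array1 = np.array([1, -10, 2, 3, 0, 6])

# Positions of the original elements, listed in the order we want them
order = [4, 5, 1, 2, 0, 3]

array2 = array1[order]
print(array2)   # [  0   6 -10   2   1   3]
```

This avoids building several small arrays and concatenating them.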

## Python Operators on Numpy Array

In [10]:
x = np.array([[1,2,5,7,4], [4,7,8,9,0]])

print('Array: \n', x)

Array: 
 [[1 2 5 7 4]
 [4 7 8 9 0]]


In [11]:
print(x+5)

[[ 6  7 10 12  9]
 [ 9 12 13 14  5]]


In [12]:
print(x%2)

[[1 0 1 1 0]
 [0 1 0 1 0]]


In [13]:
print(x>=3)

[[False False  True  True  True]
 [ True  True  True  True False]]


In [14]:
print(x//2)

[[0 1 2 3 2]
 [2 3 4 4 0]]


### Exercise

Write a program to generate a random 5 by 4 array containing positive integers. Update it by replacing all odd numbers with -1 (without using a loop).

In [48]:
my_arr = np.random.randint(10,60, size = (5, 4))
my_arr

array([[15, 38, 58, 33],
       [18, 18, 12, 39],
       [51, 18, 48, 28],
       [49, 24, 42, 11],
       [23, 29, 17, 53]])

In [49]:
index = (my_arr%2!=0)
index

array([[ True, False, False,  True],
       [False, False, False,  True],
       [ True, False, False, False],
       [ True, False, False,  True],
       [ True,  True,  True,  True]])

In [50]:
#making a copy of the array

arr_copy = my_arr.copy()

arr_copy[index] = -1
arr_copy

array([[-1, 38, 58, -1],
       [18, 18, 12, -1],
       [-1, 18, 48, 28],
       [-1, 24, 42, -1],
       [-1, -1, -1, -1]])

The boolean array `index` is True wherever the value is odd, so assigning through it replaces exactly those values with -1.

### Exercise

Write a program to filter the values from the array based on the conditions below:

- The value should be divisible by 5,
- or the value should be an odd number and a multiple of 7.

In [51]:
arr_copy2 = my_arr.copy()
arr_copy2

array([[15, 38, 58, 33],
       [18, 18, 12, 39],
       [51, 18, 48, 28],
       [49, 24, 42, 11],
       [23, 29, 17, 53]])

In [54]:
for item in np.nditer(arr_copy2, op_flags = ['readwrite']):
    if item%5==0 or (item%2 != 0 and item%7 == 0):
        item[...] = item
    else:
        item[...] = 0
arr_copy2

array([[15,  0,  0,  0],
       [ 0,  0,  0,  0],
       [ 0,  0,  0,  0],
       [49,  0,  0,  0],
       [ 0,  0,  0,  0]])
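
The same filter can also be written without a loop by combining boolean arrays with `|` (or) and `&` (and). The sketch below uses only the first and fourth rows of `my_arr` to keep it short; the names `values` and `keep` are introduced for illustration.

```python
import numpy as np

values = np.array([[15, 38, 58, 33],
                   [49, 24, 42, 11]])   # two rows taken from my_arr above

# Divisible by 5, or (odd and divisible by 7)
keep = (values % 5 == 0) | ((values % 2 != 0) & (values % 7 == 0))

# Pull out only the matching values as a 1D array
print(values[keep])               # [15 49]

# Or keep the shape and replace everything else with 0
zeroed = np.where(keep, values, 0)
print(zeroed)
```

The parentheses around each comparison are required because `&` and `|` bind more tightly than `==` and `!=`.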

## Numpy Maths

- `np.sqrt()`
- `np.exp()`
- `np.sin()`
- `np.cos()`
- etcetera

### Element wise operations

- `np.add()`
- `np.subtract()`
- `np.multiply()`
- `np.divide()`

### Matrix multiplication
For 2D arrays, all of the options below produce the same result

- `np.matmul()`
- `np.dot()`
- `@`

### Others
- `np.diag()`
- `T` - for transpose

In [58]:
#help(np)

#you would find more in the documentation for numpy

In [62]:
arr = np.array([[2,3,4], [4,8,9]])

print('Root: \n', np.sqrt(arr))
print('Sine values: \n', np.sin(arr))
print('Exponent: \n', np.exp(arr))
print('Cosine values: \n', np.cos(arr))

Root: 
 [[1.41421356 1.73205081 2.        ]
 [2.         2.82842712 3.        ]]
Sine values: 
 [[ 0.90929743  0.14112001 -0.7568025 ]
 [-0.7568025   0.98935825  0.41211849]]
Exponent: 
 [[7.38905610e+00 2.00855369e+01 5.45981500e+01]
 [5.45981500e+01 2.98095799e+03 8.10308393e+03]]
Cosine values: 
 [[-0.41614684 -0.9899925  -0.65364362]
 [-0.65364362 -0.14550003 -0.91113026]]


In [65]:
## Element wise operations
x = np.random.randint(10,21, size = (2,2))

y = np.random.randint(10,21, size = (2,2))

print('First array: \n', x)
print('Second array: \n', y)

First array: 
 [[18 10]
 [15 14]]
Second array: 
 [[15 16]
 [20 12]]


In [71]:
#Let E represent "Elementwise"

print('EAddition: \n', np.add(x,y),'\n')
print('Esubtraction: \n', np.subtract(x,y), '\n')
print('EMultiplication: \n', np.multiply(x,y), '\n')
print('EDivision: \n', np.divide(x,y), '\n')

EAddition: 
 [[33 26]
 [35 26]] 

Esubtraction: 
 [[ 3 -6]
 [-5  2]] 

EMultiplication: 
 [[270 160]
 [300 168]] 

EDivision: 
 [[1.2        0.625     ]
 [0.75       1.16666667]] 



In [72]:
## Matrix multiplication --> Let MM represent Matrix Multiplication

print('MM (way-1): \n', np.matmul(x,y), '\n')
print('MM (way-2): \n', np.dot(x,y), '\n')
print('MM (way-3): \n', x @ y, '\n')

MM (way-1): 
 [[470 408]
 [505 408]] 

MM (way-2): 
 [[470 408]
 [505 408]] 

MM (way-3): 
 [[470 408]
 [505 408]] 



In [73]:
#to retrieve the diagonal elements

x = np.random.randint(10,25, size = (2,3))

print('Original array: \n', x, '\n')

print('Diagonal values: \n', np.diag(x))


Original array: 
 [[14 21 18]
 [24 13 23]] 

Diagonal values: 
 [14 13]


In [74]:
#to transpose an array

print('Transposed matrix: \n', x.T)

Transposed matrix: 
 [[14 24]
 [21 13]
 [18 23]]


## Numpy statistics
You will often need to specify the axis for an operation.

`axis = 0` ==> Column wise operation

`axis = 1` ==> Row wise operation

- `np.sum()`
- `np.min()` and `np.max()`
- `np.median()` and `np.mean()`
- `np.var()`
- `np.std()`
- `np.corrcoef()` - calculates the Pearson product-moment correlation coefficient between two sets of data. It indicates the strength and direction of the linear relationship between two variables.

In [75]:
arr = np.random.randint(1, 10, size = (3,3))

arr

array([[9, 8, 7],
       [7, 7, 1],
       [3, 1, 9]])

In [77]:
print('Min: ', np.min(arr))
print('Sum: ', np.sum(arr))

print('Row-sum: ', np.sum(arr, axis = 1))

print('Column-sum: ', np.sum(arr, axis = 0))

print('Row-median: ', np.median(arr, axis = 1))

Min:  1
Sum:  52
Row-sum:  [24 15 13]
Column-sum:  [19 16 17]
Row-median:  [8. 7. 3.]


In [80]:
heights = np.random.randint(150, 200, size = (6,))

weights = np.random.randint(40, 61, size = (6,))

np.corrcoef(heights, weights)

array([[ 1.       , -0.0398085],
       [-0.0398085,  1.       ]])

## More functions from Numpy

### Linspace
Generates an array of evenly spaced numbers over a specified interval

`Syntax => np.linspace(begin, end, #number of elements)`


In [83]:
print(np.linspace(5, 20, 9))

[ 5.     6.875  8.75  10.625 12.5   14.375 16.25  18.125 20.   ]
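
Two optional arguments of `np.linspace` are worth knowing: `retstep=True` also returns the spacing between the values, and `endpoint=False` leaves out the end value. A minimal sketch:

```python
import numpy as np

# retstep=True returns the spacing along with the array
points, step = np.linspace(5, 20, 9, retstep=True)
print(step)          # 1.875
print(len(points))   # 9

# endpoint=False excludes the end value from the result
print(np.linspace(0, 1, 5, endpoint=False))   # [0.  0.2 0.4 0.6 0.8]
```

The step of 1.875 is (20 - 5) / 8, because 9 points leave 8 gaps between them.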


### Sorting

Column and row wise sorting works here as well.

`Syntax => np.sort(array, axis = )`

In [85]:
arr = np.random.randint(50,100, size = (5,10))
arr

array([[69, 61, 63, 94, 53, 89, 52, 66, 78, 58],
       [98, 82, 83, 94, 84, 56, 82, 81, 81, 56],
       [80, 81, 76, 69, 62, 77, 86, 72, 50, 84],
       [95, 95, 77, 87, 86, 74, 75, 72, 72, 89],
       [62, 92, 69, 76, 51, 52, 74, 67, 62, 91]])

In [86]:
# Column-wise sorting
#This goes to each column and arranges the values from the smallest to the largest

np.sort(arr, axis = 0)

array([[62, 61, 63, 69, 51, 52, 52, 66, 50, 56],
       [69, 81, 69, 76, 53, 56, 74, 67, 62, 58],
       [80, 82, 76, 87, 62, 74, 75, 72, 72, 84],
       [95, 92, 77, 94, 84, 77, 82, 72, 78, 89],
       [98, 95, 83, 94, 86, 89, 86, 81, 81, 91]])

In [88]:
#Row-wise sorting

np.sort(arr, axis = 1)

array([[52, 53, 58, 61, 63, 66, 69, 78, 89, 94],
       [56, 56, 81, 81, 82, 82, 83, 84, 94, 98],
       [50, 62, 69, 72, 76, 77, 80, 81, 84, 86],
       [72, 72, 74, 75, 77, 86, 87, 89, 95, 95],
       [51, 52, 62, 62, 67, 69, 74, 76, 91, 92]])

### Stacking

Horizontal stacking adds arrays side by side. Here, the number of rows in both arrays has to be the same.

`Syntax ==> np.hstack([array1, array2])`


Vertical stacking adds arrays on top of each other, that is, vertically. Here, the number of columns in both arrays has to be the same.

`Syntax ==> np.vstack([array1, array2])`

In [95]:
arr1 = np.arange(5,15).reshape(2,5)
arr1

array([[ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

In [92]:
arr2 = np.arange(25, 35).reshape(2,5)
arr2

array([[25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34]])

In [96]:
#vertical stacking

np.vstack([arr1, arr2])

array([[ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34]])

In [97]:
#horizontal stacking

np.hstack([arr1, arr2])

array([[ 5,  6,  7,  8,  9, 25, 26, 27, 28, 29],
       [10, 11, 12, 13, 14, 30, 31, 32, 33, 34]])

### Concatenate

- Horizontal concatenation = horizontal (column-wise) stacking
`Syntax ==> np.concatenate([array1, array2], axis = 1)`


- Vertical concatenation = vertical (row-wise) stacking
`Syntax ==> np.concatenate([array1, array2], axis = 0)`

In [98]:
#horizontal concatenate

np.concatenate([arr1, arr2], axis = 1)

array([[ 5,  6,  7,  8,  9, 25, 26, 27, 28, 29],
       [10, 11, 12, 13, 14, 30, 31, 32, 33, 34]])

In [99]:
#vertical concatenate

np.concatenate([arr1, arr2], axis = 0)

array([[ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34]])

### Append

- Horizontal append = horizontal (column-wise) stacking
`Syntax ==> np.append(array1, array2, axis = 1)`


- Vertical append = vertical (row-wise) stacking
`Syntax ==> np.append(array1, array2, axis = 0)`

In [100]:
#horizontal append

np.append(arr1, arr2, axis = 1)

array([[ 5,  6,  7,  8,  9, 25, 26, 27, 28, 29],
       [10, 11, 12, 13, 14, 30, 31, 32, 33, 34]])

In [101]:
#vertical append

np.append(arr1, arr2, axis = 0)

array([[ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34]])

## Where - np.where()

Processes array elements conditionally.

`np.where(condition, x, y)`

This reads: Where True, yield x, otherwise y.

In [102]:
arr = np.arange(50,100).reshape(5,10)
arr

array([[50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
       [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
       [70, 71, 72, 73, 74, 75, 76, 77, 78, 79],
       [80, 81, 82, 83, 84, 85, 86, 87, 88, 89],
       [90, 91, 92, 93, 94, 95, 96, 97, 98, 99]])

In [103]:
#using where

np.where(arr < 60, 0, 1)

array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])

In [105]:
#another example
#where arr is > 70 return arr/10, else return the arr value

np.where(arr>70, arr/10, arr)

array([[50. , 51. , 52. , 53. , 54. , 55. , 56. , 57. , 58. , 59. ],
       [60. , 61. , 62. , 63. , 64. , 65. , 66. , 67. , 68. , 69. ],
       [70. ,  7.1,  7.2,  7.3,  7.4,  7.5,  7.6,  7.7,  7.8,  7.9],
       [ 8. ,  8.1,  8.2,  8.3,  8.4,  8.5,  8.6,  8.7,  8.8,  8.9],
       [ 9. ,  9.1,  9.2,  9.3,  9.4,  9.5,  9.6,  9.7,  9.8,  9.9]])

## argsort

Used to sort an array indirectly. 

- Instead of modifying the original array, it returns an array of indices that would sort the original array (in ascending order by default)

- This output contains the positions/indices of the elements in the original array, listed in sorted order

- These indices can then be used to index the original array and get the actual sorted array

In [106]:
arr = np.array([10, -5, 7, -3])

print('Indices: ', arr.argsort())

print('Sorted array: ', arr[arr.argsort()])

Indices:  [1 3 2 0]
Sorted array:  [-5 -3  7 10]


A bit confusing, right?

Here's how it works:

We know that indices always start at 0, therefore the elements of the array have the following indices

`[10, -5, 7, -3]` has indices `[0, 1, 2, 3]`

This is the natural index of each number in the array. When we use argsort, it returns these natural indices rearranged into the order that sorts the values in ascending order: the smallest value (-5) sits at index 1, the next (-3) at index 3, then 7 at index 2, and finally 10 at index 0.

`My array = [10, -5, 7, -3]`

`Natural index = [0, 1, 2, 3]`

`argsort() = [1, 3, 2, 0]`
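
Where argsort becomes really useful is in reordering one array by the values of another. In the sketch below, the `names` array is hypothetical and the `scores` are the same numbers as above.

```python
import numpy as np

names = np.array(['Ada', 'Ben', 'Cy', 'Di'])   # hypothetical labels
scores = np.array([10, -5, 7, -3])

order = np.argsort(scores)
print(order)          # [1 3 2 0]

# Reorder the labels so they follow the ascending scores
print(names[order])   # ['Ben' 'Di' 'Cy' 'Ada']

# Reverse the indices to get descending order
print(scores[order[::-1]])   # [10  7 -3 -5]
```

A plain `np.sort(scores)` would sort the scores but lose the link to the names.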



## Numpy broadcasting

This is a mechanism that allows us to perform arithmetic operations on arrays with different shapes under certain conditions. It automates the process of making arrays compatible for element-wise operations. How does it work?

- Because numpy needs arrays to be compatible shape wise to allow for element wise operations, broadcasting comes into play where these arrays are not exactly the same.

- It allows a smaller array to be stretched to match the shape of the bigger one
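
The "certain conditions" can be stated as a rule: shapes are compared from the last dimension backwards, and two dimensions are compatible when they are equal or one of them is 1. The sketch below checks that rule on arrays of ones; the names `a` to `d` are arbitrary.

```python
import numpy as np

a = np.ones((2, 3))
b = np.ones((3,))     # treated as (1, 3), then stretched to (2, 3)
c = np.ones((2, 1))   # the length-1 axis is stretched to 3
d = np.ones((2,))     # last dimensions are 3 and 2: not compatible

print((a + b).shape)   # (2, 3)
print((a + c).shape)   # (2, 3)

try:
    a + d
except ValueError as err:
    # numpy refuses to broadcast shapes (2, 3) and (2,)
    print('Cannot broadcast:', err)
```

The last case fails because neither trailing dimension is 1 and they are not equal.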

In [107]:
arr1 = np.array([[1,2,3], [4,5,6]])

arr2 = np.array([1,2,3])

print(arr1+arr2)

[[2 4 6]
 [5 7 9]]


In [108]:
arr1 = np.array([[1,2,3], [4,5,6]])

arr2 = np.array([1])

print(arr1+arr2)

[[2 3 4]
 [5 6 7]]


In [112]:
arr1 = np.array([[1,2,3], [4,5,6]])

arr2 = np.array([[1],[2]])

print(arr1+arr2)

[[2 3 4]
 [6 7 8]]


In [115]:
arr1 = np.array([[1],[2],[3], [4],[5],[6]])

arr2 = np.array([1,2,3])

print('Array 1: \n', arr1, '\n')

print('Array 2: \n', arr2, '\n')

print('Result: \n', arr1+arr2)

Array 1: 
 [[1]
 [2]
 [3]
 [4]
 [5]
 [6]] 

Array 2: 
 [1 2 3] 

Result: 
 [[2 3 4]
 [3 4 5]
 [4 5 6]
 [5 6 7]
 [6 7 8]
 [7 8 9]]
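
To see the stretching explicitly, and what happens when the shapes are not compatible, here is a small sketch. `np.broadcast_to` is used only for illustration (numpy does not need it to broadcast), and `arr3` is a hypothetical array chosen because its shape does not fit:

```python
import numpy as np

arr1 = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)
arr2 = np.array([1, 2, 3])                # shape (3,)

# np.broadcast_to shows what arr2 looks like once stretched to arr1's shape
stretched = np.broadcast_to(arr2, arr1.shape)
print(stretched)

# adding the stretched array gives the same result as letting numpy broadcast
print(np.array_equal(arr1 + arr2, arr1 + stretched))

# shapes (2, 3) and (2,) do not line up: the trailing dimensions are 3 and 2
arr3 = np.array([1, 2])
try:
    arr1 + arr3
except ValueError as err:
    print('Cannot broadcast:', err)
```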


## Numpy masking

This is the process of creating a boolean array (also called a mask) based on a condition applied to another array. The mask acts as a filter, indicating which elements of the original array satisfy the condition. `np.where()` accepts the same kind of condition: with only the condition it returns the indices where the condition is True, and with two extra arguments it chooses between two values element by element.

In [21]:
import numpy as np

arr = np.random.randint(10, 31, size = (10,))

#creating a mask for elements that are even
mask = arr%2 == 0

#using the mask to filter elements from the original array

filtered_arr = arr[mask]

print('Original array: ', arr)
print('Mask:', mask)

print('Filtered array:', filtered_arr)

Original array:  [12 13 17 14 23 19 27 18 21 29]
Mask: [ True False False  True False False False  True False False]
Filtered array: [12 14 18]
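
A short sketch of `np.where()` and of combining masks. The array is fixed (values copied from the output above) so the results are reproducible; the variable names are illustrative only:

```python
import numpy as np

# fixed values (copied from the output above) so the result is reproducible
arr = np.array([12, 13, 17, 14, 23, 19, 27, 18, 21, 29])

# with only a condition, np.where returns the positions where it is True
even_positions = np.where(arr % 2 == 0)[0]
print('Positions of even values:', even_positions)

# with three arguments, np.where picks between two values element by element
replaced = np.where(arr % 2 == 0, arr, 0)   # keep evens, replace odds with 0
print('Odds replaced by 0:', replaced)

# masks can be combined with & (and), | (or) and ~ (not); the parentheses are required
combined = (arr % 2 == 0) & (arr > 12)
print('Even and greater than 12:', arr[combined])
```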


## Reading CSV into a Numpy Array

In [19]:
#help(np.loadtxt)

In [1]:
import numpy as np

csv_array = np.loadtxt('Data/nyc_weather.csv',dtype = 'str', delimiter = ',')

print(csv_array)

[['EST' 'Temperature' 'DewPoint' 'Humidity' 'Sea Level PressureIn'
  'VisibilityMiles' 'WindSpeedMPH' 'PrecipitationIn' 'CloudCover'
  'Events' 'WindDirDegrees']
 ['1/1/2016' '38' '23' '52' '30.03' '10' '8' '0' '5' '' '281']
 ['1/2/2016' '36' '18' '46' '30.02' '10' '7' '0' '3' '' '275']
 ['1/3/2016' '40' '21' '47' '29.86' '10' '8' '0' '1' '' '277']
 ['1/4/2016' '25' '9' '44' '30.05' '10' '9' '0' '3' '' '345']
 ['1/5/2016' '20' '-3' '41' '30.57' '10' '5' '0' '0' '' '333']
 ['1/6/2016' '33' '4' '35' '30.5' '10' '4' '0' '0' '' '259']
 ['1/7/2016' '39' '11' '33' '30.28' '10' '2' '0' '3' '' '293']
 ['1/8/2016' '39' '29' '64' '30.2' '10' '4' '0' '8' '' '79']
 ['1/9/2016' '44' '38' '77' '30.16' '9' '8' 'T' '8' 'Rain' '76']
 ['1/10/2016' '50' '46' '71' '29.59' '4' '' '1.8' '7' 'Rain' '109']
 ['1/11/2016' '33' '8' '37' '29.92' '10' '' '0' '1' '' '289']
 ['1/12/2016' '35' '15' '53' '29.85' '10' '6' 'T' '4' '' '235']
 ['1/13/2016' '26' '4' '42' '29.94' '10' '10' '0' '0' '' '284']
 ['1/14/2016' '3
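
With `dtype = 'str'` every value, including the numbers, is loaded as text. When only the numeric columns are needed, `skiprows` and `usecols` let `np.loadtxt` return floats directly. A minimal sketch, assuming a small inline sample (the first rows and columns of the output above) in place of the real file:

```python
import io
import numpy as np

# first rows of the output above, kept inline so the sketch runs without the CSV file
sample = io.StringIO(
    "EST,Temperature,DewPoint,Humidity\n"
    "1/1/2016,38,23,52\n"
    "1/2/2016,36,18,46\n"
    "1/3/2016,40,21,47\n"
)

# skiprows=1 drops the header row; usecols keeps the numeric columns only
numeric = np.loadtxt(sample, delimiter=',', skiprows=1, usecols=(1, 2, 3))

print(numeric.dtype)
print(numeric)
print('Mean temperature:', numeric[:, 0].mean())
```

Columns that contain empty cells or text such as `'T'` cannot be converted this way. `np.genfromtxt` is the usual choice when values are missing.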

# Task -  Dice Rolling Simulation

Create a simulation of a dice rolling game using NumPy. The game will involve rolling two dice and calculating the sum of their values. Players will guess whether the next roll will result in a higher, lower, or equal sum compared to the previous roll. The simulation will track the player's score based on their guesses.

In [23]:
#number of rounds in the game
num_of_rounds = 4

#range of possible values for the die roll
min_value = 1
max_value = 6

#initializing player's score
score = 0

#a list to record the dice sums
sums_list = []

while num_of_rounds > 0:
    rolling = input('\nType "roll": ')


    if rolling == 'roll':
        rolls_array = np.random.randint(min_value, max_value+1, size = (2,))
        dice_sums = np.sum(rolls_array)
        print ("Dice sum:", dice_sums)
        sums_list.append(dice_sums)   

        if len(sums_list) > 1:
            if (guess == 'higher' and (sums_list[-1] > sums_list[-2])) or \
               (guess == 'lower' and (sums_list[-1] < sums_list[-2])) or \
               (guess == 'equal' and (sums_list[-1] == sums_list[-2])):
                print('Correct guess!\n')
                score += 1
            else:
                print('Incorrect guess!\n')

    else:
        print('Please type "roll".')
        #ask again without asking for a guess or using up a round
        continue
    
    
    
    print('\nWill the next sum be higher, lower, or equal?')
      
    guess = input().strip().lower()

    #keep asking until the guess is one of the three valid options
    while guess not in ["higher", "lower", "equal"]:
        guess = input("Invalid input! Please enter 'higher', 'lower', or 'equal': ").strip().lower()

    num_of_rounds -= 1
        
else:
    print('\nRounds exhausted! Come back later\n')
    print('View of dice rolls: ', sums_list)
    print('Your total score:', score)


Type "roll": roll
Dice sum: 8

Will the next sum be higher, lower, or equal?
higher

Type "roll": roll
Dice sum: 5
Incorrect guess!


Will the next sum be higher, lower, or equal?
lower

Type "roll": roll
Dice sum: 5
Incorrect guess!


Will the next sum be higher, lower, or equal?
equal

Type "roll": roll
Dice sum: 4
Incorrect guess!


Will the next sum be higher, lower, or equal?
lower

Rounds exhausted! Come back later

View of dice rolls:  [8, 5, 5, 4]
Your total score: 0


**Note:**

Whenever you enter "roll", the simulation generates two random integers between 1 and 6 (inclusive), sums them, and prints the result. From the second roll onwards, it compares the new sum with the previous one to check your guess. It also keeps track of your score, which is the number of correct guesses you make.
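
The same scoring logic can also be written without a loop. This is only a sketch of the idea, not a replacement for the interactive game: the guesses are hypothetical and fixed in advance, and `np.random.default_rng` with a seed is an assumption made so that the run is repeatable.

```python
import numpy as np

rng = np.random.default_rng(seed=0)   # seeded only so the run is repeatable

num_of_rounds = 4
# one row per round, one column per die; the upper bound (7) is exclusive
rolls = rng.integers(1, 7, size=(num_of_rounds, 2))
sums = rolls.sum(axis=1)

# hypothetical guesses for rounds 2..4 (the first roll has nothing to compare to)
guesses = np.array(['higher', 'lower', 'equal'])

# np.diff gives each sum minus the previous one; its sign encodes the outcome
change = np.sign(np.diff(sums))                    # -1, 0 or 1
outcomes = np.array(['lower', 'equal', 'higher'])[change + 1]

score = int(np.sum(guesses == outcomes))
print('Dice sums:', sums)
print('Outcomes: ', outcomes)
print('Score:    ', score)
```

Here all the rolls are generated in one call and the comparison with the previous roll is done for every round at once, which is the vectorized style described at the start of this notebook.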