
# NUMPY

NumPy, short for "Numerical Python", provides high-performance multi-dimensional arrays, which we can use as vectors or matrices, along with tools for working with these arrays.

The key features of numpy are:

- ndarrays: n-dimensional arrays of the same data type which are fast and space-efficient.  
- There are a number of built-in methods for ndarrays which allow for rapid processing of data without using loops (e.g., compute the mean).
- Broadcasting: a useful tool which defines implicit behavior between multi-dimensional arrays of different sizes.
- Vectorization: applies numeric operations to whole ndarrays at once, without writing explicit Python loops.
- Input/Output: simplifies reading and writing of data from/to file.
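
As a minimal sketch of the vectorization and broadcasting features listed above, the snippet below uses made-up sample data (`prices`, `matrix` and `row` are hypothetical names, not part of the original notebook). It should illustrate the idea rather than serve as a complete reference.

```python
import numpy as np

# Vectorization: one expression is applied to every element, no Python loop.
prices = np.array([10.0, 20.0, 30.0])   # hypothetical example data
discounted = prices * 0.9

# Broadcasting: the shape (3,) row is stretched across both rows
# of the shape (2, 3) matrix before the addition happens.
matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])
row = np.array([10, 20, 30])
shifted = matrix + row

print(discounted)
print(shifted)
```

Broadcasting only works when the trailing dimensions are compatible (equal, or one of them is 1); otherwise numpy raises a `ValueError`.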

## ndarrays
ndarrays are the time- and space-efficient multidimensional arrays at the core of numpy. As with the data structures in Week 2, let's get started by creating ndarrays using the numpy package.

In [1]:
# importing numpy
import numpy as np

In [2]:
# Create a rank 1 array
a_np_array = np.array([1, 2, 4])
print(type(a_np_array)) 

<class 'numpy.ndarray'>


In [3]:
# shape of the created array, (rank 1)
a_np_array.shape

(3,)

In [6]:
# accessing the elements of this array.
# only one index is needed because it's a one-dimensional array.
a_np_array[0], a_np_array[1], a_np_array[2]

(1, 2, 4)

In [8]:
# since arrays are mutable, let's change the first element.
a_np_array[0] = 0
print(a_np_array)

[0 2 4]


In [11]:
# Create rank 2 array
rank2_array = np.array([[1,2,3,4],[4,5,6,7],[7,8,9,10]])
print(rank2_array)

[[ 1  2  3  4]
 [ 4  5  6  7]
 [ 7  8  9 10]]


In [16]:
# shape of the created array (rank 2)
print("Shape of rank2_array is 3 rows, 4 columns: ", rank2_array.shape)
# accessing the 2nd element of the 1st row and the 3rd element of the 3rd row (indices are 0-based).
print("-Element at (1,2) is: ", rank2_array[0, 1], "  -Element at (3,3) is: ", rank2_array[2, 2])

Shape of rank2_array is 3 rows, 4 columns:  (3, 4)
-Element at (1,2) is:  2   -Element at (3,3) is:  9
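
The "rank" used in the comments above corresponds to the `ndim` attribute of an ndarray. The following sketch rebuilds the same `rank2_array` and inspects a few related attributes; the `reshaped` name is only introduced here for illustration.

```python
import numpy as np

rank2_array = np.array([[1, 2, 3, 4], [4, 5, 6, 7], [7, 8, 9, 10]])

print(rank2_array.ndim)    # number of axes (the "rank")
print(rank2_array.size)    # total number of elements
print(rank2_array.dtype)   # element type, platform-dependent for integers

# reshape returns the same 12 elements laid out as 4 rows, 3 columns
reshaped = rank2_array.reshape(4, 3)
print(reshaped.shape)
```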


In [28]:
# creating np arrays pre-filled with default values.
print("Zero matrix of shape (3,4) \n",np.zeros((3, 4)))
print("One's matrix of shape (2,2) \n", np.ones((2,2)))
print("Matrix of shape (2,3) with all values as '5' \n", np.full((2,3), 5))
print("Identity matrix of shape (2,2) \n", np.eye(2,2))
print("Matrix of shape (3,3) filled with some random values \n", np.random.randn(3,3))

Zero matrix of shape (3,4) 
 [[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]
One's matrix of shape (2,2) 
 [[1. 1.]
 [1. 1.]]
Matrix of shape (2,3) with all values as '5' 
 [[5 5 5]
 [5 5 5]]
Identity matrix of shape (2,2) 
 [[1. 0.]
 [0. 1.]]
Matrix of shape (3,3) filled with some random values 
 [[ 0.53369317 -1.63595566 -0.51224718]
 [ 0.96108709  0.67195819 -0.04493374]
 [-1.04836031 -0.26269445  0.42037945]]
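
The random values above differ on every run. If reproducible output is wanted, one option is a seeded generator, sketched below under the assumption that NumPy 1.17 or newer is installed; the seed value 42 is arbitrary.

```python
import numpy as np

# A seeded generator produces the same sequence every time it is recreated.
rng = np.random.default_rng(seed=42)      # seed chosen arbitrarily
first = rng.standard_normal((3, 3))

rng_again = np.random.default_rng(seed=42)
second = rng_again.standard_normal((3, 3))

print(first)
print(np.array_equal(first, second))      # both draws are identical
```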


In [30]:
# ARRAY INDEXING
# Create the following rank 2 array with shape (3, 4)
# [[ 1  2  3  4]
#  [ 5  6  7  8]
#  [ 9 10 11 12]]
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])

# Use slicing to pull out the subarray consisting of the first 2 rows
# and columns 1 and 2; a_slice is the following array of shape (2, 2):
# [[2 3]
#  [6 7]]
a_slice = a[:2, 1:3]
print(a_slice)

[[2 3]
 [6 7]]


In [34]:
#When you modify a slice, you actually modify the underlying array.
print("Before:", a[0, 1])   #inspect the element at 0, 1  
a_slice[0, 0] = 100    # a_slice[0, 0] is the same piece of data as a[0, 1]
print("After:", a[0, 1])   

Before: 2
After: 100
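
When the original array must stay untouched, a slice can be copied explicitly. This is a small sketch of that idea; `a_view` and `a_copy` are names introduced here for illustration.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

a_view = a[:2, 1:3]          # a slice is a view onto a's data
a_copy = a[:2, 1:3].copy()   # an explicit copy owns its own data

a_copy[0, 0] = 100           # changes only the copy
print(a[0, 1])               # a is unchanged

# shares_memory reports whether two arrays overlap in memory
print(np.shares_memory(a, a_view), np.shares_memory(a, a_copy))
```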


In [35]:
#We can use combinations of integer indexing and slice indexing to create different shaped matrices.
# Create a Rank 2 array of shape (3, 4)
an_array = np.array([[11,12,13,14], [21,22,23,24], [31,32,33,34]])
print(an_array)
# Using both integer indexing & slicing generates an array of lower rank
row_rank1 = an_array[1, :]    # Rank 1 view 

print(row_rank1, row_rank1.shape)  # notice only a single []
# Slicing alone: generates an array of the same rank as the an_array
row_rank2 = an_array[1:2, :]  # Rank 2 view 

print(row_rank2, row_rank2.shape)   # Notice the [[ ]]

[[11 12 13 14]
 [21 22 23 24]
 [31 32 33 34]]
[21 22 23 24] (4,)
[[21 22 23 24]] (1, 4)


In [41]:
#Sometimes it's useful to use an array of indexes to access or change elements.

# Create a new array
an_array = np.array([[10,20,30], [30,40,50], [50,60,70], [70,80,90]])

print('Original Array:')
print(an_array)
# Create an array of indices
col_indices = np.array([0, 1, 2, 0])
print('\nCol indices picked : ', col_indices)

row_indices = np.arange(4)
print('\nRows indices picked : ', row_indices)
# Examine the pairings of row_indices and col_indices.  These are the elements we'll change next.
for row,col in zip(row_indices,col_indices):
    print(row, ", ",col)
# Select one element from each row
print('\nValues in the array at those indices: ',an_array[row_indices, col_indices])
# Change one element from each row using the indices selected
an_array[row_indices, col_indices] += 100

print('\nChanged Array:')
print(an_array)

Original Array:
[[10 20 30]
 [30 40 50]
 [50 60 70]
 [70 80 90]]

Col indices picked :  [0 1 2 0]

Rows indices picked :  [0 1 2 3]
0 ,  0
1 ,  1
2 ,  2
3 ,  0

Values in the array at those indices:  [10 40 70 70]

Changed Array:
[[110  20  30]
 [ 30 140  50]
 [ 50  60 170]
 [170  80  90]]
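
Unlike slicing, reading with an array of indices produces a new array rather than a view, so edits to the result do not flow back. A short sketch with the same starting data (`picked` is a name introduced here):

```python
import numpy as np

an_array = np.array([[10, 20, 30], [30, 40, 50], [50, 60, 70], [70, 80, 90]])
row_indices = np.arange(4)
col_indices = np.array([0, 1, 2, 0])

# Integer-array indexing on the right-hand side returns a copy.
picked = an_array[row_indices, col_indices]
picked[0] = -1

print(picked)
print(an_array[0, 0])   # the source array keeps its original value
```

Assigning through the index expression itself, as in `an_array[row_indices, col_indices] += 100`, still modifies `an_array` in place.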


In [45]:
# Boolean indexing

# Find the elements of an_array that are bigger than 50;
# this returns a numpy array of Booleans of the same
# shape as an_array, where each slot of bool_idx tells
# whether that element of an_array is > 50.
bool_idx = (an_array > 50)  

print(bool_idx)

print(an_array[bool_idx])

# We can do all of the above in a single concise statement:
print(an_array[an_array > 50])

[[ True False False]
 [False  True False]
 [False  True  True]
 [ True  True  True]]
[110 140  60 170 170  80  90]
[110 140  60 170 170  80  90]
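
Several conditions can be combined in one mask. The sketch below uses hypothetical sample values; note that the element-wise operators `&` and `|` are needed (not `and`/`or`), and each condition needs its own parentheses.

```python
import numpy as np

values = np.array([10, 60, 110, 140, 30])   # hypothetical sample data

# element-wise AND: keep values strictly between 50 and 120
in_range = values[(values > 50) & (values < 120)]

# element-wise OR: keep values at or outside those bounds
outside = values[(values <= 50) | (values >= 120)]

# counting matches without extracting them
count_over_50 = np.count_nonzero(values > 50)

print(in_range, outside, count_over_50)
```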


In [46]:
#We can actually change elements in the array by applying a similar logical filter. Let's add 100 to all the values divisible by 20.

an_array[an_array % 20 == 0] +=100
print(an_array)

[[110 120  30]
 [ 30 240  50]
 [ 50 160 170]
 [170 180  90]]
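
If the original array should be kept as is, `np.where` can build a new array from the same kind of filter. A minimal sketch with made-up data (`values` and `bumped` are hypothetical names):

```python
import numpy as np

values = np.array([[110, 20, 30], [30, 140, 50]])   # hypothetical sample data

# np.where(condition, x, y) takes from x where the condition holds, else from y.
bumped = np.where(values % 20 == 0, values + 100, values)

print(bumped)
print(values)   # the input array is not modified
```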


In [49]:
# Specifying a datatype
floatedarray = np.array([1, 2, 3, 4], dtype=np.float64)
print(floatedarray)
floatedarray.dtype

[1. 2. 3. 4.]


dtype('float64')
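
An existing array can also be converted to another type with `astype`, which returns a new array. The sample values below are made up for illustration.

```python
import numpy as np

float_values = np.array([1.7, 2.2, 3.9])

# float -> int conversion drops the fractional part (no rounding)
int_values = float_values.astype(np.int64)
print(int_values, int_values.dtype)

# strings holding numbers can be parsed the same way
numeric_strings = np.array(["1.5", "2.5"])
parsed = numeric_strings.astype(np.float64)
print(parsed, parsed.dtype)
```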

## Arithmetic Operations

In [50]:
x = np.array([[111,102],[101,122]], dtype=np.int64)
y = np.array([[211.1,212.1],[201.1,222.1]], dtype=np.float64)

print(x)
print()
print(y)

[[111 102]
 [101 122]]

[[211.1 212.1]
 [201.1 222.1]]


In [51]:
# add
print(x + y)
print()
print(np.add(x, y))

[[322.1 314.1]
 [302.1 344.1]]

[[322.1 314.1]
 [302.1 344.1]]


In [52]:
# subtract
print(x - y)
print()
print(np.subtract(x, y))

[[-100.1 -110.1]
 [-100.1 -100.1]]

[[-100.1 -110.1]
 [-100.1 -100.1]]


In [53]:
# multiply
print(x * y)
print()
print(np.multiply(x, y))

[[23432.1 21634.2]
 [20311.1 27096.2]]

[[23432.1 21634.2]
 [20311.1 27096.2]]


In [54]:
# divide
print(x / y)
print()
print(np.divide(x, y))

[[0.52581715 0.48090523]
 [0.50223769 0.54930212]]

[[0.52581715 0.48090523]
 [0.50223769 0.54930212]]


In [55]:
# square root
print(np.sqrt(x))

[[10.53565375 10.09950494]
 [10.04987562 11.04536102]]


In [65]:
# exponent (e ** y)
print(np.exp(y))

[[4.78151068e+91 1.29974936e+92]
 [2.17080249e+87 2.86288848e+96]]
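
All of the operators above work element by element, including `*`. For a true matrix product, `@` (or `np.dot`) can be used instead. The small integer matrices in this sketch are chosen only so the results are easy to check by hand.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])

elementwise = x * y        # multiplies matching positions
matrix_product = x @ y     # rows of x times columns of y

print(elementwise)
print(matrix_product)
print(np.array_equal(matrix_product, np.dot(x, y)))
```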


### Basic Statistical Operations:

In [59]:
# set up a random 2 x 5 matrix
arr = 15 * np.random.randn(2,5)
print(arr)
print("Mean: ")
# compute the mean for all elements
print(arr.mean())
print("Mean by row: ")
# compute the means by row
print(arr.mean(axis = 1))
print("Mean by column: ")
# compute the means by column
print(arr.mean(axis = 0))
print("Sum of all the elements:")
# sum all the elements
print(arr.sum())
print("Medians:")
# compute the medians
print(np.median(arr, axis = 1))

[[ -5.50674253 -15.99972284 -11.54997606   4.20223224 -21.70002818]
 [ 10.58460679   0.15928781   6.39411631  12.68264437   5.84187096]]
Mean: 
-1.4891711117799673
Mean by row: 
[-10.11084747   7.13250525]
Mean by column: 
[ 2.53893213 -7.92021751 -2.57792988  8.44243831 -7.92907861]
Sum of all the elements:
-14.891711117799673
Medians:
[-11.54997606   6.39411631]
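
A few more reductions follow the same `axis` convention. The sketch below uses a small fixed array instead of random data so the results are repeatable.

```python
import numpy as np

arr = np.array([[1.0, 2.0, 3.0, 4.0],
                [8.0, 6.0, 4.0, 2.0]])

print(arr.min(), arr.max())    # smallest and largest element overall
print(arr.std(axis=1))         # population standard deviation of each row
print(arr.argmax(axis=1))      # column index of the largest value in each row
print(arr.cumsum(axis=1))      # running totals along each row
```

Note that `std` divides by N by default; passing `ddof=1` gives the sample standard deviation.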


### Sorting

In [62]:
# create a 10 element array of randoms
unsorted = np.random.randn(10)

print(unsorted)

# create copy and sort
sorted_copy = np.array(unsorted)  # avoid shadowing Python's built-in sorted()
sorted_copy.sort()

print(sorted_copy)
print()
print("UnSorted output: \n",unsorted)

# inplace sorting
unsorted.sort() 
print("Sorted output: \n", unsorted)

[-0.26419144 -0.8977148   1.29521363  0.52932755 -0.97411037 -0.07303587
  0.03733673 -0.40920998 -1.55720265  0.00595524]
[-1.55720265 -0.97411037 -0.8977148  -0.40920998 -0.26419144 -0.07303587
  0.00595524  0.03733673  0.52932755  1.29521363]

UnSorted output: 
 [-0.26419144 -0.8977148   1.29521363  0.52932755 -0.97411037 -0.07303587
  0.03733673 -0.40920998 -1.55720265  0.00595524]
Sorted output: 
 [-1.55720265 -0.97411037 -0.8977148  -0.40920998 -0.26419144 -0.07303587
  0.00595524  0.03733673  0.52932755  1.29521363]
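
As an alternative to copying and then sorting in place, `np.sort` returns a sorted copy directly, and `np.argsort` gives the ordering as indices. The `scores` data here is hypothetical.

```python
import numpy as np

scores = np.array([0.5, -1.2, 3.0, 0.1])   # hypothetical sample data

sorted_scores = np.sort(scores)    # new array; scores itself is untouched
order = np.argsort(scores)         # indices that would sort scores
descending = sorted_scores[::-1]   # reversed for largest-first order

print(sorted_scores)
print(order)
print(descending)
```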


In [70]:
#Finding Unique elements

array = np.array([4,1,2,4,2,3,2,4,1])

print("Unique elements of array [4,1,2,4,2,3,2,4,1] are: ",np.unique(array))

set1 = np.array(['desk','chair','bulb'])
set2 = np.array(['lamp','bulb','chair'])
print("\nThe sets are: \n", set1, set2)
print("\nTheir intersection: \n",np.intersect1d(set1, set2) ) 
print("\nTheir union: \n", np.union1d(set1, set2) )
print("\nTheir set diff: \n", np.setdiff1d(set1, set2) )  # elements in set1 that are not in set2
print("\nMembership of set1 elements in set2: \n", np.isin(set1, set2) )  # which elements of set1 are also in set2

Unique elements of array [4,1,2,4,2,3,2,4,1] are:  [1 2 3 4]

The sets are: 
 ['desk' 'chair' 'bulb'] ['lamp' 'bulb' 'chair']

Their intersection: 
 ['bulb' 'chair']

Their union: 
 ['bulb' 'chair' 'desk' 'lamp']

Their set diff: 
 ['desk']

Membership of set1 elements in set2: 
 [False  True  True]
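
`np.unique` can also report how often each value occurs. A brief sketch using the same integer array as above:

```python
import numpy as np

array = np.array([4, 1, 2, 4, 2, 3, 2, 4, 1])

# return_counts=True adds a second array with the number of occurrences
values, counts = np.unique(array, return_counts=True)

for value, count in zip(values, counts):
    print(value, "appears", count, "times")
```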
