<p style="font-family: Arial; font-size:3.75em;color:purple; font-style:bold"><br>
Introduction to numpy:
</p><br>

<p style="font-family: Arial; font-size:1.25em;color:#2462C0; font-style:bold"><br>
Package for scientific computing with Python
</p><br>

Numerical Python, or "Numpy" for short, is a foundational package on which many of the most common data science packages are built.  Numpy provides us with high performance multi-dimensional arrays which we can use as vectors or matrices.  

The key features of numpy are:

- ndarrays: n-dimensional arrays of the same data type which are fast and space-efficient.  There are a number of built-in methods for ndarrays which allow for rapid processing of data without using loops (e.g., compute the mean).
- Broadcasting: a set of rules which defines how arithmetic operations behave between arrays of different shapes.
- Vectorization: enables numeric operations on whole ndarrays at once, without writing explicit Python loops.
- Input/Output: simplifies reading and writing of data from/to file.
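
As a rough illustration of the first three features, here is a minimal sketch. The temperature values and variable names are made up for this example.

```python
import numpy as np

# Hypothetical data for illustration: five temperatures in Celsius
celsius = np.array([0.0, 12.5, 25.0, 37.5, 100.0])

# Vectorization: one expression is applied to every element, no Python loop
fahrenheit = celsius * 9 / 5 + 32

# Built-in ndarray method: compute the mean without a loop
mean_f = fahrenheit.mean()

# Broadcasting: a (5, 1) column combined with a rank 1 array of shape (3,)
# gives a (5, 3) result
offsets = np.array([-1.0, 0.0, 1.0])
grid = celsius.reshape(5, 1) + offsets

print(fahrenheit)
print(mean_f)
print(grid.shape)
```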

<b>Additional Recommended Resources:</b><br>
<a href="https://docs.scipy.org/doc/numpy/reference/">Numpy Documentation</a><br>
<i>Python for Data Analysis</i> by Wes McKinney<br>
<i>Python Data Science Handbook</i> by Jake VanderPlas



<p style="font-family: Arial; font-size:2.75em;color:purple; font-style:bold"><br>

Getting started with ndarray<br><br></p>

**ndarrays** are time and space-efficient multidimensional arrays at the core of numpy.  Like the data structures in Week 2, let's get started by creating ndarrays using the numpy package.

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

How to create Rank 1 numpy arrays:
</p>

In [1]:
import numpy as np

# Create a rank 1 array (single dimension array = a vector)
an_array = np.array([3, 33, 333])  

# use type() to see class of the ndarray object
print(type(an_array))

<class 'numpy.ndarray'>


In [2]:
# check the shape of the array via the array.shape attribute (it is not a method)
# it should have just one dimension (Rank 1) w/ 3 elements
print(an_array.shape)

(3,)


In [3]:
# because this is a rank 1 array, we need only one index to access each element via brackets
print(an_array[0], an_array[1], an_array[2]) 

3 33 333


In [4]:
# ndarrays are mutable (can change an element of the array via assignment)
an_array[0] = 888

print(an_array)

[888  33 333]


<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

How to create a Rank 2 numpy array:</p>

A rank 2 **ndarray** is one with two dimensions.  Notice the format below of [ [row] , [row] ].  2 dimensional arrays are great for representing matrices which are often useful in data science.

In [5]:
# create a Rank 2 ndarray (2 dimensions) w/ 2 rows, 3 cols
another = np.array([[11,12,13],[21,22,23]])

print(another)

[[11 12 13]
 [21 22 23]]


In [7]:
# get dimensions of the array via the array.shape attribute
print("The shape is 2 rows, 3 columns: ", another.shape)  # rows x columns                   

# get 1st 2 elements of the 1st row of the array and the 1st element of the 2nd row
print("Accessing elements [0,0], [0,1], and [1,0] of the ndarray: ", another[0, 0], ", ",another[0, 1],", ", another[1, 0])

The shape is 2 rows, 3 columns:  (2, 3)
Accessing elements [0,0], [0,1], and [1,0] of the ndarray:  11 ,  12 ,  21


<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

There are many ways to create numpy arrays:
</p>

Here we create a number of arrays with different shapes and different pre-filled values.  numpy has a number of built-in functions which help us quickly and easily create multidimensional arrays.

In [8]:
import numpy as np

# create a 2x2 array of zeros via .zeros()
ex1 = np.zeros((2,2))      
print(ex1)                              

[[ 0.  0.]
 [ 0.  0.]]


In [9]:
# create a 2x2 array filled with 9.0's
ex2 = np.full((2,2), 9.0)  
print(ex2)   

[[ 9.  9.]
 [ 9.  9.]]


In [10]:
# create a 2x2 matrix with the diagonal 1s and the others 0
# .eye() = creates IDENTITY MATRIX
ex3 = np.eye(2,2)
print(ex3)  

[[ 1.  0.]
 [ 0.  1.]]


In [11]:
# create an array of ones
ex4 = np.ones((1,2))
print(ex4)    

[[ 1.  1.]]


In [12]:
# notice that the above ndarray (ex4) is actually rank 2, it is a 1x2 array [1 row 2 cols]
print(ex4.shape)

(1, 2)


In [14]:
# this means we need to use 2 indexes to access an element --> get 1st row, 2nd element
print(ex4[0,1])

1.0


In [15]:
# create a 2x2 array of random floats between 0 and 1
ex5 = np.random.random((2,2))
print(ex5)    

[[ 0.39654922  0.61591795]
 [ 0.6500051   0.03578252]]
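
A few other creation functions are worth knowing. The sketch below is only a sample, and the names ex6 through ex9 and template are made up here to continue the numbering above.

```python
import numpy as np

# evenly spaced integers: start (inclusive), stop (exclusive), step
ex6 = np.arange(0, 10, 2)
print(ex6)            # values 0, 2, 4, 6, 8

# 5 evenly spaced floats from 0 to 1, both endpoints included
ex7 = np.linspace(0, 1, 5)
print(ex7)            # values 0, 0.25, 0.5, 0.75, 1

# new arrays with the same shape and dtype as an existing array
template = np.array([[11, 12, 13], [21, 22, 23]])
ex8 = np.zeros_like(template)
ex9 = np.ones_like(template)
print(ex8.shape, ex9.shape)
```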


<p style="font-family: Arial; font-size:2.75em;color:purple; font-style:bold"><br>

Array Indexing
<br><br></p>

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>
Slice indexing:
</p>

Similar to the use of slice indexing with lists and strings, we can use slice indexing to pull out sub-regions of ndarrays.

In [16]:
import numpy as np

# Rank 2 array (2 dimensional) of shape (3, 4)

an_array = np.array([[11,12,13,14], [21,22,23,24], [31,32,33,34]])
print(an_array)

[[11 12 13 14]
 [21 22 23 24]
 [31 32 33 34]]


Use array slicing to get a subarray consisting of the first 2 rows x 2 columns.

In [15]:
an_array[:2,:2]

array([[11, 12],
       [21, 22]])

In [9]:
# get 1st 2 rows and 2nd + 3rd cols and put into its own ndarray 
# np.array() - creates copy that does NOT point to same data as an_array = has its own data
 
a_slice = np.array(an_array[:2, 1:3]) # has different indices than an_array
print(a_slice)

[[12 13]
 [22 23]]


When you modify a slice, you actually modify the underlying array.

In [17]:
a_slice2 = an_array[:2, 1:3] # points to same data as an_array

print("Before:", an_array[0, 1])   #inspect the element at row 0, col 1  --> 1st row, 2nd col

a_slice2[0, 0] = 1000    # a_slice2[0, 0] = same piece of data as an_array[0, 1]

print("After:", an_array[0, 1])    

Before: 12
After: 1000
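
If it is unclear whether two arrays share data, np.shares_memory can check. A minimal sketch, using a fresh array named base (a name chosen for this example) so the arrays above are not affected:

```python
import numpy as np

base = np.array([[11, 12, 13, 14], [21, 22, 23, 24], [31, 32, 33, 34]])

view = base[:2, 1:3]           # basic slicing returns a view
copy = base[:2, 1:3].copy()    # .copy() allocates independent data

print(np.shares_memory(base, view))   # True
print(np.shares_memory(base, copy))   # False

copy[0, 0] = -1                # does not touch base
view[0, 0] = 1000              # writes through to base[0, 1]
print(base[0, 1])              # 1000
```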


<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Use both integer indexing & slice indexing
</p>

We can use combinations of integer indexing and slice indexing to create different shaped matrices.

In [23]:
# Create a Rank 2 array of shape (3, 4)

an_array = np.array([[11,12,13,14], [21,22,23,24], [31,32,33,34]])
print(an_array)

[[11 12 13 14]
 [21 22 23 24]
 [31 32 33 34]]


In [24]:
# Using both integer indexing & slicing generates an array of lower rank

row_rank1 = an_array[1, :]    # 2nd row, all cols = Rank 1 view 

print(row_rank1, row_rank1.shape)  # notice only a single []

[21 22 23 24] (4,)


In [25]:
# Slicing alone: generates an array of the SAME rank as the an_array

row_rank2 = an_array[1:2, :]  # 2nd row up to + not including 3rd row + all cols = Rank 2 view 

print(row_rank2, row_rank2.shape)   # Notice the [[ ]]

[[21 22 23 24]] (1, 4)


In [28]:
#We can do the same thing for columns of an array:

col_rank1 = an_array[:, 1] # all rows, col 2 --> Rank 1 View
col_rank2 = an_array[:, 1:2] # all rows, col 2 up to + not including col3 --> Rank 2 View

print(col_rank1, col_rank1.shape)  # Rank 1 --> vector
print()
print(col_rank2, col_rank2.shape)  # Rank 2 --> column vector (3x1 matrix)

[12 22 32] (3,)

[[12]
 [22]
 [32]] (3, 1)


<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Array Indexing for changing elements:
</p>

Sometimes it's useful to use an array of indexes to access or change elements.

In [18]:
# Create new array
an_array = np.array([[11,12,13], [21,22,23], [31,32,33], [41,42,43]])

print('Original Array:')
print(an_array)

Original Array:
[[11 12 13]
 [21 22 23]
 [31 32 33]
 [41 42 43]]


In [19]:
# Create an array of indices

col_indices = np.array([0, 1, 2, 0])
print('\nCol indices picked : ', col_indices)

row_indices = np.arange(4)
print('\nRow indices picked : ', row_indices)


Col indices picked :  [0 1 2 0]

Row indices picked :  [0 1 2 3]


In [20]:
# Examine the pairings of row_indices and col_indices.  These are the elements we'll change next.
# zip() creates a pairing of 2 lists 

for row,col in zip(row_indices,col_indices):
    print(row, ", ",col)

0 ,  0
1 ,  1
2 ,  2
3 ,  0


In [21]:
# Select one element from each row

print('Values in the array at those indices: ',an_array[row_indices, col_indices])

# should return the values at (row, col) = (0,0), (1,1), (2,2), (3,0)

Values in the array at those indices:  [11 22 33 41]


In [22]:
# Change 1 element from each row using the indices selected
# add 100,000 to each selected element

an_array[row_indices, col_indices] += 100000

print('\nChanged Array:')
print(an_array)


Changed Array:
[[100011     12     13]
 [    21 100022     23]
 [    31     32 100033]
 [100041     42     43]]
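
One caveat when the index arrays contain repeated positions: += updates each distinct position only once. A small sketch with made-up names (counts, idx) shows the difference from np.add.at, which accumulates every occurrence:

```python
import numpy as np

idx = np.array([0, 0, 2])      # index 0 appears twice

# Buffered update: position 0 is incremented only once
counts = np.zeros(3, dtype=np.int64)
counts[idx] += 1
print(counts)                  # [1 0 1]

# Unbuffered update: np.add.at accumulates for every occurrence
counts2 = np.zeros(3, dtype=np.int64)
np.add.at(counts2, idx, 1)
print(counts2)                 # [2 0 1]
```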


<p style="font-family: Arial; font-size:2.75em;color:purple; font-style:bold"><br>
Boolean Indexing

<br><br></p>
<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Boolean indexing for selecting and changing elements:
</p>

In [48]:
# create a Rank 2 3x2 array

an_array = np.array([[11,12], [21, 22], [31, 32]])
print(an_array)

[[11 12]
 [21 22]
 [31 32]]


In [32]:
# create a boolean filter for whether each element value meets this condition of being > 15

filter = (an_array > 15)
filter

array([[False, False],
       [ True,  True],
       [ True,  True]], dtype=bool)

Notice that the filter is an ndarray of the *same size* as an_array. It holds **True** wherever the corresponding element of an_array is greater than 15 and **False** wherever it is less than or equal to 15.

In [40]:
# select just those elements which meet that criteria

print(an_array[filter],'\n') # rank 1 array = vector (boolean indexing returns a copy, not a view)
print(an_array[filter].shape)

[21 22 31 32] 

(4,)


In [43]:
# For short, we could have just used the approach below w/out the need for the separate filter array.

an_array[(an_array > 15)]

array([21, 22, 31, 32])

In [49]:
# get all values between 20 and 30

an_array[(an_array >= 20) & (an_array <= 30)]

array([21, 22])

In [44]:
# get all even values

an_array[(an_array % 2 == 0)]

array([12, 22, 32])

What is particularly useful is that we can actually change elements in the array applying a similar logical filter.  Let's add 100 to all the even values.

In [50]:
# add 100 to even values

an_array[an_array % 2 == 0] +=100
print(an_array)

[[ 11 112]
 [ 21 122]
 [ 31 132]]
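
Boolean filters can also be combined with the element-wise operators & (and), | (or) and ~ (not). A minimal sketch on a fresh copy of the original values; the names values, is_even and is_large are chosen for this example:

```python
import numpy as np

values = np.array([[11, 12], [21, 22], [31, 32]])

is_even = values % 2 == 0
is_large = values > 20

print(values[is_even & is_large])    # even AND greater than 20
print(values[~is_even | is_large])   # odd OR greater than 20
print(is_even.sum())                 # number of True entries
```

Python's and/or keywords do not work element-wise on arrays, which is why the bitwise operators are used here.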


<p style="font-family: Arial; font-size:2.75em;color:purple; font-style:bold"><br>

Datatypes and Array Operations
<br><br></p>

Each ndarray *has its OWN data type*, shared by all of its elements.

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Datatypes:
</p>

In [51]:
ex1 = np.array([11, 12]) # NumPy infers an integer dtype (int32 or int64, depending on the platform)
print(ex1.dtype)

int32


In [52]:
ex2 = np.array([11.0, 12.0]) # NumPy infers the float64 dtype for this rank 1 array
print(ex2.dtype)

float64


In [53]:
ex3 = np.array([11, 21], dtype = np.int64) # specify the data type
print(ex3.dtype)

int64


In [58]:
# force floats into integers (the fractional part is dropped, i.e. truncation toward zero)
ex4 = np.array([11.1,12.7], dtype = np.int64)

print(ex4.dtype,'\n')
print(ex4)

int64 

[11 12]


In [60]:
# you can use this to force integers into floats if you anticipate the values may change to floats later
# good because we are not actually losing any data here

ex5 = np.array([11, 21], dtype=np.float64)
print(ex5.dtype,'\n')
print(ex5)

float64 

[ 11.  21.]
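
For an array that already exists, astype performs the same kind of conversion. The sketch below uses made-up values (including a negative one) to show how the cast to integers differs from np.floor and np.rint:

```python
import numpy as np

floats = np.array([11.1, 12.7, -1.5])

# astype returns a new array; the cast drops the fractional part
as_ints = floats.astype(np.int64)
print(as_ints)                 # [11 12 -1]

# np.floor rounds toward negative infinity, so -1.5 becomes -2
floored = np.floor(floats).astype(np.int64)
print(floored)                 # [11 12 -2]

# np.rint rounds to the nearest integer before the cast
rounded = np.rint(floats).astype(np.int64)
print(rounded)                 # [11 13 -2]
```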


<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Arithmetic Array Operations:

</p>

In [62]:
x = np.array([[111,112],[121,122]], dtype = np.int64)  # np.int was removed in newer NumPy releases
y = np.array([[211.1,212.1],[221.1,222.1]], dtype = np.float64)

# view the array

print(x,'\n')
print(y)

[[111 112]
 [121 122]] 

[[ 211.1  212.1]
 [ 221.1  222.1]]


When an integer array is combined with a floating point array, the result of most arithmetic operations is **upcast** to floating point to avoid losing precision.
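
The resulting data type can be checked directly. A quick sketch with two small arrays whose names (ints, floats) are chosen for this example:

```python
import numpy as np

ints = np.array([1, 2], dtype=np.int64)
floats = np.array([0.5, 1.5], dtype=np.float64)

print((ints + floats).dtype)           # float64
print(np.result_type(ints, floats))    # float64
print((ints + ints).dtype)             # int64: no upcast needed
print((ints / ints).dtype)             # float64: true division always gives floats
```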

In [63]:
# add
print(x + y,'\n')

# same operation via numpy function "add"
print(np.add(x, y))  

[[ 322.1  324.1]
 [ 342.1  344.1]] 

[[ 322.1  324.1]
 [ 342.1  344.1]]


In [64]:
# subtract
print(x - y,'\n')

# same operation via numpy function "subtract"
print(np.subtract(x, y))

[[-100.1 -100.1]
 [-100.1 -100.1]] 

[[-100.1 -100.1]
 [-100.1 -100.1]]


In [65]:
# multiply
print(x * y,'\n')

# same operation via numpy function "multiply"
print(np.multiply(x, y))

[[ 23432.1  23755.2]
 [ 26753.1  27096.2]] 

[[ 23432.1  23755.2]
 [ 26753.1  27096.2]]


In [66]:
# divide
print(x / y,'\n')

# same operation via numpy function "divide"
print(np.divide(x, y))

[[ 0.52581715  0.52805281]
 [ 0.54726368  0.54930212]] 

[[ 0.52581715  0.52805281]
 [ 0.54726368  0.54930212]]


In [70]:
# square root
print(np.sqrt(x),'\n')
print(np.sqrt(y))

[[ 10.53565375  10.58300524]
 [ 11.          11.04536102]] 

[[ 14.52928078  14.56365339]
 [ 14.86943173  14.90301983]]


In [69]:
# exponent (e ** x)
print(np.exp(x),'\n')
print(np.exp(y))

[[  1.60948707e+48   4.37503945e+48]
 [  3.54513118e+52   9.63666567e+52]] 

[[  4.78151068e+91   1.29974936e+92]
 [  1.05319781e+96   2.86288848e+96]]


<p style="font-family: Arial; font-size:2.75em;color:purple; font-style:bold"><br>

Statistical Methods, Sorting, and <br> <br> Set Operations:
<br><br>
</p>

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Basic Statistical Operations:
</p>

In [2]:
# set up a Rank 2 array --> 2 x 5 matrix w/ random #'s from a standard normal distribution, multiplied by 10

arr = 10 * np.random.randn(2,5)
print(arr)

[[  2.04634129  -2.58827571  -9.49416007  -5.05161017   6.97683136]
 [  0.10173662   6.62993441 -11.46428431   9.17327269   0.28204893]]


In [3]:
# compute the mean from all elements in the array/matrix

print(arr.mean())

-0.338816496895


In [4]:
# compute row means

print(arr.mean(axis = 1))

[-1.62217466  0.94454167]


In [5]:
# compute column means

print(arr.mean(axis = 0))

[  1.07403896   2.02082935 -10.47922219   2.06083126   3.62944014]


In [6]:
# sum all the elements

print(arr.sum())

-3.38816496895


In [7]:
# compute row medians

print(np.median(arr, axis = 1))

[-2.58827571  0.28204893]
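
Because the array above is random, the axis argument is easier to follow on fixed values. A minimal sketch using a made-up 2 x 3 array named small:

```python
import numpy as np

small = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])   # shape (2, 3)

col_means = small.mean(axis=0)   # collapses the rows: one value per column
row_means = small.mean(axis=1)   # collapses the columns: one value per row
print(col_means)                 # [2.5 3.5 4.5]
print(row_means)                 # [2. 5.]

# keepdims=True keeps the collapsed axis with length 1, so the result
# has shape (2, 1) and can be subtracted from each row
centered = small - small.mean(axis=1, keepdims=True)
print(centered)
```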


<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Sorting:
</p>


In [8]:
# create a 10-element array of random #'s from a normal distribution

unsorted = np.random.randn(10)

print(unsorted)

[ 0.33976748  0.89658729 -1.13403508  0.00509504  0.02651432 -0.93240787
  1.6446378  -0.43795395  0.14064008  1.07002401]


In [10]:
# create a copy and sort it
# (named sorted_arr so Python's built-in sorted() is not shadowed)

sorted_arr = np.array(unsorted)
sorted_arr.sort()

print(sorted_arr,'\n')
print(unsorted)

[-1.13403508 -0.93240787 -0.43795395  0.00509504  0.02651432  0.14064008
  0.33976748  0.89658729  1.07002401  1.6446378 ] 

[ 0.33976748  0.89658729 -1.13403508  0.00509504  0.02651432 -0.93240787
  1.6446378  -0.43795395  0.14064008  1.07002401]


In [11]:
# inplace sorting to sort the original array

unsorted.sort() 

print(unsorted)

[-1.13403508 -0.93240787 -0.43795395  0.00509504  0.02651432  0.14064008
  0.33976748  0.89658729  1.07002401  1.6446378 ]
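
Two related functions are np.sort, which returns a sorted copy, and np.argsort, which returns the indices that would sort the array. A short sketch with made-up values:

```python
import numpy as np

scores = np.array([0.3, -1.1, 0.9, 0.0])

ascending = np.sort(scores)        # sorted copy, scores is left untouched
order = np.argsort(scores)         # indices that would sort scores
descending = ascending[::-1]       # reverse the sorted copy

print(ascending)                   # [-1.1  0.   0.3  0.9]
print(order)                       # [1 3 0 2]
print(scores[order])               # same values as ascending
print(descending)
```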


<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Finding Unique elements:
</p>

In [12]:
array = np.array([1,2,1,4,2,1,4,2])

print(np.unique(array))

[1 2 4]
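
np.unique can also report how often each value occurs. A small sketch using the same values; the names labels, uniq and counts are chosen for this example:

```python
import numpy as np

labels = np.array([1, 2, 1, 4, 2, 1, 4, 2])

# return_counts=True adds a second array with the number of occurrences
uniq, counts = np.unique(labels, return_counts=True)

for value, count in zip(uniq, counts):
    print(value, "appears", count, "times")
```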


<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Set Operations with np.array data type:
</p>

In [13]:
# create 2 string ndarrays

set1 = np.array(['desk','chair','bulb'])
set2 = np.array(['lamp','bulb','chair'])

print(set1,set2)

['desk' 'chair' 'bulb'] ['lamp' 'bulb' 'chair']


In [14]:
# get elements common to both arrays via intersect1d which expects 1D arrays

print(np.intersect1d(set1,set2))

['bulb' 'chair']


In [15]:
# get all unique elements across both arrays

print(np.union1d(set1,set2))

['bulb' 'chair' 'desk' 'lamp']


In [16]:
# get elements that are in set1 and are not in set2

print(np.setdiff1d(set1,set2))

# get elements that are in set2 and are not in set1

print(np.setdiff1d(set2,set1))

['desk']
['lamp']


In [17]:
# get booleans to see which elements of set1 are in set2
# (np.isin replaces np.in1d, which is deprecated in recent NumPy releases)

print(np.isin(set1,set2))

[False  True  True]


<p style="font-family: Arial; font-size:2.75em;color:purple; font-style:bold"><br>

Broadcasting:
<br><br>
</p>

Introduction to broadcasting. <br>

**Broadcasting** is one of the more advanced features of Numpy and can help make array operations more convenient.

When operating on 2 arrays, Numpy compares their shapes dimension by dimension. It starts with the trailing dimensions and works its way forward.

2 dimensions are compatible if:
* they are equal in size
* one of them is 1 (a scalar is treated as size 1 in every dimension)

For more details, please see: <br>
https://docs.scipy.org/doc/numpy-1.10.1/user/basics.broadcasting.html
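
To make the compatibility rule concrete, the sketch below applies it by hand. The helper broadcast_shape is hypothetical (it is not part of Numpy) and is only meant to mirror the rule as stated above; the last line cross-checks it against Numpy's own np.broadcast.

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: apply the compatibility rule by hand."""
    result = []
    # walk both shapes from the trailing dimension backwards
    for i in range(1, max(len(shape_a), len(shape_b)) + 1):
        a = shape_a[-i] if i <= len(shape_a) else 1   # a missing dimension counts as 1
        b = shape_b[-i] if i <= len(shape_b) else 1
        if a != b and a != 1 and b != 1:
            raise ValueError(f"incompatible dimensions: {a} vs {b}")
        result.append(a if b == 1 else b)
    return tuple(reversed(result))

print(broadcast_shape((4, 3), (3,)))     # (4, 3)
print(broadcast_shape((4, 3), (4, 1)))   # (4, 3)

# cross-check against Numpy itself
print(np.broadcast(np.empty((4, 3)), np.empty((4, 1))).shape)
```

Shapes (4, 3) and (4,) are not compatible under this rule, because the trailing dimensions 3 and 4 differ and neither is 1.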

In [18]:
import numpy as np

#create 4x3 matrix of zeroes

start = np.zeros((4,3))
print(start)

[[ 0.  0.  0.]
 [ 0.  0.  0.]
 [ 0.  0.  0.]
 [ 0.  0.  0.]]


In [19]:
# create a rank 1 ndarray with 3 values

add_rows = np.array([1,0,2])
print(add_rows)

[1 0 2]


In [20]:
# add add_rows array to EACH row of 'start' using broadcasting
# broadcasting will figure out which dimensions we want to add 
#   - it will see we have the same # of cols, so it performs the operation of adding add_rows to each row in start
#   - where both cols will match up in size

y = start + add_rows  
print(y)

[[ 1.  0.  2.]
 [ 1.  0.  2.]
 [ 1.  0.  2.]
 [ 1.  0.  2.]]


In [25]:
# create a 1x4 ndarray (transposed into 4x1 in the next cell) to broadcast across columns

add_cols = np.array([[0,1,2,3]]) # double []'s = rank 2 array of shape (1, 4)
print(add_cols)

[[0 1 2 3]]


In [26]:
# transpose this add_cols array to make vertical

add_cols = add_cols.T
print(add_cols)

[[0]
 [1]
 [2]
 [3]]


In [27]:
# add add_cols values to each column of 'start' using broadcasting

y = start + add_cols 
print(y)

[[ 0.  0.  0.]
 [ 1.  1.  1.]
 [ 2.  2.  2.]
 [ 3.  3.  3.]]


In [28]:
# using a scalar (single) value will just broadcast in both dimensions

add_scalar = np.array([1])  
print(add_scalar)

[1]


In [29]:
y = start + add_scalar 
print(y)

[[ 1.  1.  1.]
 [ 1.  1.  1.]
 [ 1.  1.  1.]
 [ 1.  1.  1.]]


In [32]:
a = np.array([[0,0],[0,0]])
b1 = np.array([1,1]) # rank 1 array broadcasts across each row
b2 = 1 #scalar broadcasts to each value

print(a+b1,'\n','\n','\n',a+b2)

[[1 1]
 [1 1]] 
 
 
 [[1 1]
 [1 1]]


Example from the slides:

In [33]:
# create our 3x4 matrix

arrA = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])
print(arrA)

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]


In [34]:
# create our 4-element list (NumPy treats it as a rank 1 array of shape (4,))

arrB = [0,1,0,2]
print(arrB)

[0, 1, 0, 2]


In [35]:
# add the two together using broadcasting --> adds to each row

print(arrA + arrB)

[[ 1  3  3  6]
 [ 5  7  7 10]
 [ 9 11 11 14]]


<p style="font-family: Arial; font-size:2.75em;color:purple; font-style:bold"><br>

Speedtest: ndarrays vs lists
<br><br>
</p>

First set up parameters for the speed test. We'll be testing time to sum elements in an ndarray versus a list.

In [36]:
from numpy import arange
from timeit import Timer

size    = 1000000
timeits = 1000

In [38]:
# create ndarray with values from 0 to size-1

nd_array = arange(size)
print( type(nd_array) )

<class 'numpy.ndarray'>


In [39]:
# timer expects the operation as a parameter --> pass nd_array.sum()

# this runs a sum operation on 1 million elements

timer_numpy = Timer("nd_array.sum()", "from __main__ import nd_array")

print("Time taken by numpy ndarray: %f seconds" % 
      (timer_numpy.timeit(timeits)/timeits))

Time taken by numpy ndarray: 0.000633 seconds


In [40]:
# create the list with values from 0 to size-1

a_list = list(range(size))
print (type(a_list) )

<class 'list'>


In [41]:
# timer expects the operation as a parameter --> pass sum(a_list)

# this runs a sum operation on 1 million elements

timer_list = Timer("sum(a_list)", "from __main__ import a_list")

print("Time taken by list:  %f seconds" % 
      (timer_list.timeit(timeits)/timeits))

Time taken by list:  0.051841 seconds
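
The same comparison can be written without the "from __main__ import" setup string by passing the names through timeit's globals argument. This is only a sketch: the size and repeat count are reduced (values chosen here so it runs quickly), and the timings will differ from machine to machine. It also times the built-in sum() applied to an ndarray, which loops over the elements one at a time.

```python
import timeit
import numpy as np

n = 100_000          # smaller than above so the sketch runs quickly
repeats = 20

nd = np.arange(n, dtype=np.int64)   # int64 so the total cannot overflow
lst = list(range(n))

t_numpy = timeit.timeit("nd.sum()", globals={"nd": nd}, number=repeats) / repeats
t_list = timeit.timeit("sum(lst)", globals={"lst": lst}, number=repeats) / repeats
t_mixed = timeit.timeit("sum(nd)", globals={"nd": nd}, number=repeats) / repeats

print(f"nd.sum() on ndarray:  {t_numpy:.6f} seconds")
print(f"sum() on list:        {t_list:.6f} seconds")
print(f"sum() on ndarray:     {t_mixed:.6f} seconds")
```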


<p style="font-family: Arial; font-size:2.75em;color:purple; font-style:bold"><br>

Read or Write to Disk:
<br><br>
</p>

<p style="font-family: Arial; font-size:1.3em;color:#2462C0; font-style:bold"><br>

Binary Format:</p>

In [43]:
binary_array = np.array([ 23.23, 24.24] )

# store this array in a numpy file (extension = .npy)

np.save('binary_array', binary_array)

# now load it back in

np.load('binary_array.npy')

array([ 23.23,  24.24])

<p style="font-family: Arial; font-size:1.3em;color:#2462C0; font-style:bold"><br>

Text Format:</p>

In [44]:
# save the same array into a text file, separated by a comma
np.savetxt('array.txt', X = binary_array, delimiter = ',')

In [4]:
# display the contents of the text file with a shell command
# (use !cat array.txt on macOS/Linux, !type array.txt on Windows)

!type array.txt

2.323000000000000043e+01
2.423999999999999844e+01


In [5]:
# load that file back in

np.loadtxt('array.txt', delimiter=',')

array([ 23.23,  24.24])
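
Several arrays can be stored in one file with np.savez. The sketch below writes into a temporary directory so nothing is left on disk; the file names and the keys first and second are made up for this example.

```python
import os
import tempfile
import numpy as np

a = np.array([23.23, 24.24])
b = np.arange(6).reshape(2, 3)

with tempfile.TemporaryDirectory() as tmp:
    # .npz archive: several named arrays in a single file
    npz_path = os.path.join(tmp, "arrays.npz")
    np.savez(npz_path, first=a, second=b)
    with np.load(npz_path) as data:
        a_back = data["first"]
        b_back = data["second"]

    # text file with integer formatting and a header line
    txt_path = os.path.join(tmp, "table.csv")
    np.savetxt(txt_path, b, delimiter=",", fmt="%d", header="c0,c1,c2")
    b_txt = np.loadtxt(txt_path, delimiter=",")   # header line starts with '#', so it is skipped

print(np.array_equal(a, a_back), np.array_equal(b, b_back))
print(b_txt)
```

Note that np.loadtxt returns floats by default, so b_txt holds the same values as b but with a float dtype.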

<p style="font-family: Arial; font-size:2.75em;color:purple; font-style:bold"><br>

Additional Common ndarray Operations
<br><br></p>

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Dot Product on Matrices and Inner Product on Vectors:

</p>

In [7]:
# determine the dot product of two matrices

x2d = np.array([[1,1],[1,1]])
y2d = np.array([[2,2],[2,2]])

# dot/inner product = (1*2 + 1*2), (1*2 + 1*2)
#                     (1*2 + 1*2), (1*2 + 1*2)

print(x2d.dot(y2d),'\n',)
print(np.dot(x2d, y2d))

[[4 4]
 [4 4]] 

[[4 4]
 [4 4]]


In [11]:
# determine the dot product of two matrices

x2d = np.array([[3,5],[7,1]])
y2d = np.array([[4,1],[2,4]])

print(x2d,'\n')
print(y2d)

[[3 5]
 [7 1]] 

[[4 1]
 [2 4]]


In [12]:
# dot/inner product = (3*4 + 5*2), (3*1 + 5*4)
#                     (7*4 + 1*2), (7*1 + 1*4)

print(x2d.dot(y2d),'\n',)
print(np.dot(x2d, y2d))

[[22 23]
 [30 11]] 

[[22 23]
 [30 11]]


In [13]:
# determine the inner (dot) product of two vectors

a1d = np.array([9 , 9 ])
b1d = np.array([10, 10])

# dot/inner product = 9*10 + 9*10

print(a1d.dot(b1d),'\n',)
print(np.dot(a1d, b1d))

180 

180


In [14]:
# dot product of a matrix (rank 2 array) and a vector

print(x2d.dot(a1d))
print()
print(np.dot(x2d, a1d))

[72 72]

[72 72]
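
In Python 3.5 and later the @ operator performs the same matrix product. A brief sketch reusing the values from above under new names (A, B, v), which also contrasts it with the element-wise * operator:

```python
import numpy as np

A = np.array([[3, 5], [7, 1]])
B = np.array([[4, 1], [2, 4]])
v = np.array([9, 9])

print(A @ B)     # matrix product, same as A.dot(B)
print(A @ v)     # matrix-vector product
print(A * B)     # element-wise product, NOT matrix multiplication
```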


<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Sum:
</p>

In [15]:
# sum elements in the array

np.sum(np.array([[11,12],[21,22]]))

66

In [17]:
# column-wise sum

print(np.sum(np.array([[11,12],[21,22]]), axis=0))  

[32 34]


In [18]:
# row-wise sum

print(np.sum(np.array([[11,12],[21,22]]), axis=1))  

[23 43]


<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Element-wise Functions: </p>

For example, let's compare two arrays' values element by element to get the maximum at each position.

In [22]:
# 8 random #'s from normal distribution
x = np.random.randn(8)
y = np.random.randn(8)

print(x,'\n')
print(y)

[-1.37378548  0.57871972 -0.32993242 -0.5345772   0.73003726 -0.06312801
 -1.42582655  0.83213456] 

[ 0.45813042 -1.99832727 -2.01170411 -1.16895361 -0.00239344  0.52069672
  1.4524876   0.87435966]


In [23]:
# returns element wise maximum (max at certain index) between 2 arrays

np.maximum(x, y)

array([ 0.45813042,  0.57871972, -0.32993242, -0.5345772 ,  0.73003726,
        0.52069672,  1.4524876 ,  0.87435966])

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Reshaping array:
</p>

In [24]:
# grab values from 0 through 19 in an array

arr = np.arange(20)
print(arr)

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]


In [25]:
# reshape to be a 4 x 5 matrix

arr.reshape(4,5)

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])
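
Two details about reshape are easy to miss: it returns a new array object rather than changing the original in place, and one dimension can be given as -1 so that it is inferred. A minimal sketch with made-up names (seq, m, flat):

```python
import numpy as np

seq = np.arange(20)

m = seq.reshape(4, -1)      # -1 lets NumPy infer the missing dimension (5)
print(m.shape)              # (4, 5)
print(seq.shape)            # (20,): seq itself keeps its shape

m[0, 0] = 99                # for this contiguous array, m is a view of seq
print(seq[0])               # 99

flat = m.ravel()            # back to rank 1
print(flat.shape)           # (20,)
```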

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Transpose:

</p>

In [28]:
# transpose

ex1 = np.array([[11,12],[21,22]])
print(ex1,'\n')
print(ex1.T)

[[11 12]
 [21 22]] 

[[11 21]
 [12 22]]


<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Indexing using where():</p>

In [30]:
# create 2 arrays
x_1 = np.array([1,2,3,4,5])
y_1 = np.array([11,22,33,44,55])

# create a filter
filter = np.array([True, False, True, False, True])

In [32]:
# take the element from the 2nd arg (x_1) where the filter is True
# otherwise take the element from the 3rd arg (y_1)

out = np.where(filter, x_1, y_1)
print(out)

[ 1 22  3 44  5]


In [33]:
mat = np.random.rand(5,5)
mat

array([[ 0.9252558 ,  0.66538338,  0.39752032,  0.71909384,  0.47376204],
       [ 0.45009932,  0.76245426,  0.89662439,  0.33237054,  0.99147958],
       [ 0.95951848,  0.84846857,  0.76153086,  0.49279184,  0.68645096],
       [ 0.34810691,  0.15888041,  0.45240514,  0.71747593,  0.64381488],
       [ 0.4538235 ,  0.17882713,  0.8608744 ,  0.50154922,  0.4476627 ]])

In [34]:
# wherever mat is greater than 0.5, use 1000; otherwise use -1
np.where(mat > 0.5, 1000, -1)

array([[1000, 1000,   -1, 1000,   -1],
       [  -1, 1000, 1000,   -1, 1000],
       [1000, 1000, 1000,   -1, 1000],
       [  -1,   -1,   -1, 1000, 1000],
       [  -1,   -1, 1000, 1000,   -1]])
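
When called with only a condition, np.where returns the positions where the condition holds instead of choosing between two values. A small sketch on a fixed 2 x 2 array (values made up for this example):

```python
import numpy as np

m = np.array([[0.9, 0.2], [0.4, 0.7]])

rows, cols = np.where(m > 0.5)     # one index array per dimension
print(rows, cols)                  # positions (0, 0) and (1, 1)
print(m[rows, cols])               # the matching values

# np.argwhere returns the same positions as (row, col) pairs
print(np.argwhere(m > 0.5))
```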

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

"any" or "all" conditionals:</p>

In [35]:
arr_bools = np.array([ True, False, True, True, False ])

In [36]:
# are any booleans in the array true?

arr_bools.any()

True

In [37]:
# are all booleans in the array true?

arr_bools.all()

False

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Random Number Generation:
</p>

In [43]:
# create 1x5 matrix + get the 1st row (only row) of the matrix (array)

Y = np.random.normal(size = (1,5))[0]
print(Y)

[-1.75406548 -0.08639028  0.40645585  0.85696603  0.48069403]


In [40]:
# get an array of 4 random ints between 2 (inclusive) and 50 (exclusive)
Z = np.random.randint(low=2,high=50,size=4)
print(Z)

[17 48 39 41]


In [47]:
# rearrange/return a new ordering of elements in Z
np.random.permutation(Z)

array([39, 48, 17, 41])

In [48]:
# get 4 #'s from a uniform distribution
np.random.uniform(size=4)

array([ 0.34931839,  0.40121988,  0.96335452,  0.8353164 ])

In [49]:
# get 4 #'s from a normal distribution
np.random.normal(size=4)

array([ 1.01830886,  0.8881716 ,  0.2300275 , -0.40472579])
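
Newer NumPy releases (1.17 and later) also offer a Generator interface through np.random.default_rng, which makes it easy to get reproducible results from a seed. A sketch, with the seed value chosen arbitrarily:

```python
import numpy as np

rng = np.random.default_rng(seed=42)    # seed chosen arbitrarily

normal = rng.normal(size=4)                    # 4 #'s from a normal distribution
ints = rng.integers(low=2, high=50, size=4)    # high is exclusive by default
shuffled = rng.permutation(ints)               # new ordering, ints is unchanged

# a second generator with the same seed reproduces the same first draw
rng2 = np.random.default_rng(seed=42)
same = np.array_equal(normal, rng2.normal(size=4))
print(same)
```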

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Merging data sets:
</p>

In [50]:
# get two 2x2 matrices of random ints between 2 and 50
K = np.random.randint(low=2,high=50,size=(2,2))
print(K)

print()
M = np.random.randint(low=2,high=50,size=(2,2))
print(M)

[[16 47]
 [29 42]]

[[36 13]
 [39 17]]


In [51]:
# stack these matrices vertically w/ K on top
np.vstack((K,M))

array([[16, 47],
       [29, 42],
       [36, 13],
       [39, 17]])

In [52]:
# stack these matrices horizontally w/ K on the left
np.hstack((K,M))

array([[16, 47, 36, 13],
       [29, 42, 39, 17]])

In [53]:
# concatenate along axis 0 (rows of M go below K, same result as vstack)
np.concatenate([K, M], axis = 0)

array([[16, 47],
       [29, 42],
       [36, 13],
       [39, 17]])

In [54]:
# concatenate K and the transpose of M along axis 1 (side by side)
np.concatenate([K, M.T], axis = 1)

array([[16, 47, 36, 39],
       [29, 42, 13, 17]])
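
The stacking functions behave a little differently for rank 1 arrays. A short sketch with two made-up vectors a and b:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print(np.vstack((a, b)).shape)         # (2, 3): each vector becomes a row
print(np.hstack((a, b)).shape)         # (6,): joined end to end
print(np.column_stack((a, b)).shape)   # (3, 2): each vector becomes a column
print(np.stack((a, b), axis=0).shape)  # (2, 3): stacked along a new leading axis
```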