## Introduction to Libraries

Import the `string` library and then use it

In [1]:
import string

print('The lower ascii letters are', string.ascii_lowercase)
print(string.capwords('capitalise this sentence please.'))

The lower ascii letters are abcdefghijklmnopqrstuvwxyz
Capitalise This Sentence Please.


Use help to learn about the contents of a library

In [2]:
help(string)

Help on module string:

NAME
    string - A collection of string constants.

MODULE REFERENCE
    https://docs.python.org/3.6/library/string
    
    The following documentation is automatically generated from the Python
    source files.  It may be incomplete, incorrect or include features that
    are considered implementation detail and may vary between Python
    implementations.  When in doubt, consult the module reference at the
    location listed above.

DESCRIPTION
    Public module variables:
    
    whitespace -- a string containing all ASCII whitespace
    ascii_lowercase -- a string containing all ASCII lowercase letters
    ascii_uppercase -- a string containing all ASCII uppercase letters
    ascii_letters -- a string containing all ASCII letters
    digits -- a string containing all ASCII decimal digits
    hexdigits -- a string containing all ASCII hexadecimal digits
    octdigits -- a string containing all ASCII octal digits
    punctuation -- a string containing all

Import specific items from a library to shorten code. This does not save memory: Python still loads the whole module and only binds the names you ask for.

In [3]:
from string import ascii_letters

print('The ASCII letters are', ascii_letters)

The ASCII letters are abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ
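
A quick way to see what a `from ... import ...` statement does is to look at the standard `sys.modules` registry, which lists every module Python has loaded. The sketch below is only illustrative; the variable name `module_loaded` is made up for this example.

```python
import sys
from string import ascii_letters

# The whole module is loaded and cached, even though only one name was requested.
module_loaded = 'string' in sys.modules
print(module_loaded)       # True

# The imported name can be used directly, without the `string.` prefix.
print(ascii_letters[:5])   # abcde
```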


Create an alias for a library module when importing it to shorten programs.

In [4]:
import string as s

print(s.capwords('capitalise this sentence again please.'))

Capitalise This Sentence Again Please.


## NumPy

### Create a numpy array

NumPy is a library for numerical computing in Python. Its core object is the `ndarray`, an n-dimensional array that supports fast element-wise operations.

Create a 1d numpy array

In [5]:
import numpy as np

In [7]:
# Create a 1d array from a list
import numpy as np
list1 = [0,1,2,3,4]
arr1d = np.array(list1)

In [9]:
# Print the array and its type
print(type(arr1d))
arr1d

<class 'numpy.ndarray'>


array([0, 1, 2, 3, 4])

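So how are arrays better than lists? Arithmetic on an array is applied to every element, while the same expression on a list raises an error.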

In [10]:
list1 + 2  # error

TypeError: can only concatenate list (not "int") to list

In [11]:
# Add 2 to each element of arr1d
arr1d + 2

array([2, 3, 4, 5, 6])
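
To make the comparison concrete, here is a small sketch of the two ways to add 2 to every element. The names `values`, `list_plus_two` and `arr_plus_two` are illustrative only.

```python
import numpy as np

values = [0, 1, 2, 3, 4]

# With a plain list, each element has to be visited explicitly.
list_plus_two = [v + 2 for v in values]

# With an array, the scalar is applied to every element in one expression.
arr_plus_two = np.array(values) + 2

print(list_plus_two)          # [2, 3, 4, 5, 6]
print(arr_plus_two.tolist())  # [2, 3, 4, 5, 6]
```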

That covers 1d arrays. You can also pass a list of lists to create a 2d array, which behaves like a matrix.

In [13]:
# Create a 2d array from a list of lists
list2 = [[0,1,2], [3,4,5], [6,7,8]]
arr2d = np.array(list2)
arr2d

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

You may also specify the datatype by setting the dtype argument. Some of the most commonly used numpy dtypes are: 'float', 'int', 'bool', 'str' and 'object'.

In [14]:
# Create a float 2d array
arr2d_f = np.array(list2, dtype='float')
arr2d_f

array([[0., 1., 2.],
       [3., 4., 5.],
       [6., 7., 8.]])

In [15]:
# Convert to 'int' datatype
arr2d_f.astype('int')

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
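
Two details of `astype` are easy to miss: it returns a new array rather than changing the original, and converting floats to ints drops the fractional part. A minimal sketch, with made-up sample values:

```python
import numpy as np

arr_f = np.array([1.9, -1.9, 2.5])

# astype returns a new array; arr_f itself keeps its float dtype.
arr_i = arr_f.astype('int')

print(arr_i.tolist())  # [1, -1, 2]  (truncated toward zero, not rounded)
print(arr_f.dtype)     # float64
```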

In [16]:
# Convert to int then to str datatype
arr2d_f.astype('int').astype('str')

array([['0', '1', '2'],
       ['3', '4', '5'],
       ['6', '7', '8']], dtype='<U21')

Note: Unlike a list, a numpy array requires all of its items to have the same data type.
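
The sketch below shows what this means in practice: mixed numeric input is promoted to a common type, and `dtype='object'` is the escape hatch when the items really must keep their own Python types. The sample values and variable names are illustrative.

```python
import numpy as np

# Mixing ints and floats: every item is promoted to float.
mixed_num = np.array([1, 2.5, 3])
print(mixed_num.dtype)  # float64

# dtype='object' lets each item keep its own Python type.
# Operations on object arrays are typically slower than on numeric arrays.
mixed_obj = np.array([1, 'a', 2.5], dtype='object')
item_types = [type(x).__name__ for x in mixed_obj]
print(item_types)       # ['int', 'str', 'float']
```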

In [31]:
# Create an array of random numbers
arr_rand = np.random.randint(0, 10, size=[2,2])

In [35]:
# Create an array of floats at equal intervals
arr_inter = np.linspace(0, 10, 100)
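
`np.linspace` takes the number of points rather than a step size, and it includes both endpoints by default. A small sketch comparing it with `np.arange`, using fewer points than the cell above so the result is easy to read:

```python
import numpy as np

# linspace: 5 evenly spaced points from 0 to 10, both ends included.
points = np.linspace(0, 10, 5)
print(points.tolist())  # [0.0, 2.5, 5.0, 7.5, 10.0]

# arange: fixed step of 2.5, and the stop value is excluded.
steps = np.arange(0, 10, 2.5)
print(steps.tolist())   # [0.0, 2.5, 5.0, 7.5]
```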

Finally, you can convert an array back to a Python list using `tolist()`.

In [32]:
# Convert an array back to a list
arr_rand.tolist()

[[4, 1], [5, 3]]
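
`tolist()` converts every level of the array, including the individual items, to plain Python objects. Calling `list()` on an array only unpacks the first dimension. A sketch with fixed values in place of the random ones:

```python
import numpy as np

arr = np.array([[4, 1], [5, 3]])

nested = arr.tolist()  # nested Python lists of Python ints
rows = list(arr)       # a list whose items are still numpy arrays

print(type(nested[0]).__name__, type(nested[0][0]).__name__)  # list int
print(type(rows[0]).__name__)                                  # ndarray
```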

### Basic array operations

How to inspect the size of a numpy array?

In [20]:
# Create a 2d array with 3 rows and 4 columns
list2 = [[1, 2, 3, 4],[3, 4, 5, 6], [5, 6, 7, 8]]
arr2 = np.array(list2, dtype='float')
arr2

array([[1., 2., 3., 4.],
       [3., 4., 5., 6.],
       [5., 6., 7., 8.]])

In [21]:
# shape
print('Shape: ', arr2.shape)

# dtype
print('Datatype: ', arr2.dtype)

# size
print('Size: ', arr2.size)

# ndim
print('Num Dimensions: ', arr2.ndim)

Shape:  (3, 4)
Datatype:  float64
Size:  12
Num Dimensions:  2
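
These attributes are related to each other, which the following sketch checks on an array of zeros with the same 3x4 shape:

```python
import numpy as np

arr = np.zeros((3, 4))

# ndim is the length of the shape tuple; size is the product of its entries.
print(arr.ndim == len(arr.shape))               # True
print(arr.size == arr.shape[0] * arr.shape[1])  # True

# len() only reports the first dimension (the number of rows here).
print(len(arr))                                 # 3
```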


How to extract specific items from an array?

In [22]:
arr2

array([[1., 2., 3., 4.],
       [3., 4., 5., 6.],
       [5., 6., 7., 8.]])

In [23]:
# Extract the first 2 rows and columns
arr2[:2, :2]
list2[:2, :2]  # error: a list does not accept a tuple of slices

TypeError: list indices must be integers or slices, not tuple
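
For comparison, here is a sketch of how the same 2x2 block can be taken from the array and from the list of lists. The result names `top_left_arr` and `top_left_list` are illustrative.

```python
import numpy as np

list2 = [[1, 2, 3, 4], [3, 4, 5, 6], [5, 6, 7, 8]]
arr2 = np.array(list2, dtype='float')

# Array: one indexing expression covers rows and columns.
top_left_arr = arr2[:2, :2]

# List of lists: slice the rows, then slice each row separately.
top_left_list = [row[:2] for row in list2[:2]]

print(top_left_arr.tolist())  # [[1.0, 2.0], [3.0, 4.0]]
print(top_left_list)          # [[1, 2], [3, 4]]
```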

In [24]:
# Get the boolean output by applying the condition to each element.
b = arr2 > 4
b

array([[False, False, False, False],
       [False, False,  True,  True],
       [ True,  True,  True,  True]])

In [25]:
arr2[b]

array([5., 6., 5., 6., 7., 8.])
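
A boolean mask can do more than select items. The sketch below counts the matches and assigns through the mask, working on a copy so the original array is left alone. The names `mask`, `n_matches` and `capped` are illustrative.

```python
import numpy as np

arr = np.array([[1., 2., 3., 4.],
                [3., 4., 5., 6.],
                [5., 6., 7., 8.]])
mask = arr > 4

# True counts as 1, so summing the mask counts the matching items.
n_matches = int(mask.sum())

# Assigning through the mask changes only the matching items.
capped = arr.copy()
capped[mask] = 4

print(n_matches)     # 6
print(capped.max())  # 4.0
```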

How to represent missing values and infinity?

In [26]:
# Insert a nan and an inf
arr2[1,1] = np.nan  # not a number
arr2[1,2] = np.inf  # infinite
arr2

array([[ 1.,  2.,  3.,  4.],
       [ 3., nan, inf,  6.],
       [ 5.,  6.,  7.,  8.]])

In [27]:
# Replace nan and inf with -1.
# Don't use arr2 == np.nan: nan never compares equal to anything, even itself.
missing_bool = np.isnan(arr2) | np.isinf(arr2)
arr2[missing_bool] = -1
arr2

array([[ 1.,  2.,  3.,  4.],
       [ 3., -1., -1.,  6.],
       [ 5.,  6.,  7.,  8.]])
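
The reason for avoiding `== np.nan` can be checked directly. The sketch below also uses `np.isfinite`, which is not used in the cells above but flags nan and both infinities in a single call.

```python
import numpy as np

# nan never compares equal to anything, including itself.
print(np.nan == np.nan)        # False

arr = np.array([1.0, np.nan, np.inf, -np.inf])

# An equality test therefore finds nothing, while isnan finds the nan.
print((arr == np.nan).any())   # False
print(np.isnan(arr).tolist())  # [False, True, False, False]

# ~np.isfinite marks nan, inf and -inf together.
not_finite = ~np.isfinite(arr)
print(not_finite.tolist())     # [False, True, True, True]
```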

How to compute mean, min, max on the ndarray?

In [28]:
# mean, max and min
print("Mean value is: ", arr2.mean())
print("Max value is: ", arr2.max())
print("Min value is: ", arr2.min())

Mean value is:  3.5833333333333335
Max value is:  8.0
Min value is:  -1.0
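
The same methods accept an `axis` argument to compute the statistic per column or per row instead of over the whole array. A sketch on the same values as above:

```python
import numpy as np

arr = np.array([[1., 2., 3., 4.],
                [3., -1., -1., 6.],
                [5., 6., 7., 8.]])

# axis=0 collapses the rows, giving one result per column.
col_min = arr.min(axis=0)

# axis=1 collapses the columns, giving one result per row.
row_max = arr.max(axis=1)
row_mean = arr.mean(axis=1)

print(col_min.tolist())   # [1.0, -1.0, -1.0, 4.0]
print(row_max.tolist())   # [4.0, 6.0, 8.0]
print(row_mean.tolist())  # [2.5, 1.75, 6.5]
```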


Reshaping and Flattening Multidimensional arrays

In [29]:
# Reshape a 3x4 array to 4x3 array
arr2.reshape(4, 3)

array([[ 1.,  2.,  3.],
       [ 4.,  3., -1.],
       [-1.,  6.,  5.],
       [ 6.,  7.,  8.]])

In [30]:
# Flatten it to a 1d array
arr2.flatten()

array([ 1.,  2.,  3.,  4.,  3., -1., -1.,  6.,  5.,  6.,  7.,  8.])
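
Two related points, sketched on a small example array: `reshape` can infer one dimension when given `-1`, and `flatten` always returns a copy, whereas `ravel` (not used above) returns a view when it can.

```python
import numpy as np

arr = np.arange(12.0).reshape(3, 4)

# -1 asks numpy to infer that dimension from the total size.
inferred = arr.reshape(-1, 3)
print(inferred.shape)  # (4, 3)

# flatten always returns a copy, so editing it leaves arr untouched.
flat_copy = arr.flatten()
flat_copy[0] = 100
print(arr[0, 0])       # 0.0

# ravel returns a view here, so editing it changes arr as well.
flat_view = arr.ravel()
flat_view[0] = 100
print(arr[0, 0])       # 100.0
```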

How to get the unique items and their counts?

In [33]:
# Create 10 random integers in the range [0, 10)
np.random.seed(100)
arr_rand = np.random.randint(0, 10, size=10)
print(arr_rand)


[8 8 3 7 7 0 4 2 5 2]


In [34]:
# Get the unique items and their counts
uniqs, counts = np.unique(arr_rand, return_counts=True)
print("Unique items : ", uniqs)
print("Counts       : ", counts)

Unique items :  [0 2 3 4 5 7 8]
Counts       :  [1 2 1 1 1 2 2]
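
The two returned arrays line up position by position, so they can be paired or filtered together. A sketch that uses the printed values directly instead of the random generator; `freq` and `repeated` are illustrative names.

```python
import numpy as np

arr = np.array([8, 8, 3, 7, 7, 0, 4, 2, 5, 2])
uniqs, counts = np.unique(arr, return_counts=True)

# Pair each unique value with its count.
freq = dict(zip(uniqs.tolist(), counts.tolist()))
print(freq)      # {0: 1, 2: 2, 3: 1, 4: 1, 5: 1, 7: 2, 8: 2}

# Values that occur more than once.
repeated = uniqs[counts > 1].tolist()
print(repeated)  # [2, 7, 8]
```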
