# MSc Artificial Intelligence Python Primer
# Unit 5 Jupyter Notebook 
# Built-in Functions and Exceptions 


## Goals
This notebook has been created to familiarise you with aspects of Python that we have not covered elsewhere in the module. Most of the code needed to progress through this notebook has been provided for you. However, there are several coding tasks that you will need to complete yourself.

The topics in this notebook include:
* Built-in Functions (such as `eval()`, `filter()`, `min()`, `max()`, `sorted()` and `zip()`)
* String Methods
* File Handling using Python
* Exception Handling

# Built-in Functions
We have already encountered some of the built-in functions in the module, namely `format()`, `help()`, `input()`, `print()` and `range()`. The following section describes some of the other functions that are available to you. It does not list all of the built-in functions; we have tried to capture the most popular and useful ones. You can find a detailed description of all the built-in functions at https://docs.python.org/3/library/functions.html.

## `abs(x)`
This function returns the absolute value of a number. An absolute value is defined in mathematics as the non-negative value of `x` without regard to its sign. The parameter `x` may be an integer or a floating point number.

In [None]:
print(abs(-100)) # negative integer

print(abs(3.1415)) # positive float

## `chr(x)` and `ord(x)`
The `chr()` function returns the Unicode character whose code point is given by the parameter `x`, which must be an integer between 0 and 1,114,111. As of Unicode 14.0, 144,697 characters have been defined, covering 159 modern and historic scripts as well as symbols and emojis. See https://en.wikipedia.org/wiki/List_of_Unicode_characters for a list of the most commonly used characters. The `ord()` function performs the opposite conversion, that is, from a given character to its Unicode value.

In [None]:
print(chr(97), ord('a')) # Unicode value for the lower case a

print(chr(49), ord('1')) # Unicode value for the digit 1

print(chr(920), ord('Θ')) # Unicode value for upper case theta

## `complex(r, i)`
This function returns a complex number with a real and imaginary component. The `r` parameter represents the real part and `i` represents the imaginary part. In addition, this function can also accept a string representation of a complex number. Care must be taken in formatting the string. It must be in the form *"real+imagj"* or *"real-imagj"* with no spaces around the `+` or `-`, otherwise Python will raise a `ValueError`. When a string is given, the `i` parameter must be omitted.

In [None]:
print(complex(2, -3)) # real and imaginary parts are given

print(complex(1)) # only real part is given, imag part defaults to 0

print(complex()) # no parameters are given, 
                 # both real and imag default to 0

print(complex('5-9j')) # a string is given

print(complex('2 + 6j')) # a string with incorrect format (contains spaces) - raises a ValueError

## `divmod(x, y)`
When performing division using integer values it is common to use both the `//` and `%` operators to give the quotient and remainder. This function takes two parameters, `x` and `y`, and returns a pair of numbers consisting of their quotient and remainder. `x` is the numerator and `y` is the denominator. Care must be taken when using this function with float parameter values, because the results are subject to floating point rounding. If you are interested in how the function works for float parameters, take a look at the following link: https://docs.python.org/3/library/functions.html#divmod.

In [None]:
q, r = divmod(10, 3) # divide 10 by 3 -> quotient = 3, remainder = 1
print("q =",q,"r =",r)

q, r = divmod(3, 10) # divide 3 by 10 -> quotient = 0, remainder = 3
print("q =",q,"r =",r)

q, r = divmod(8.0, 3.6) # divide 8.0 by 3.6 -> quotient = 2.0, remainder is approximately 0.8 (floating point rounding)
print("q =",q,"r =",r)
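
For integer parameters, the pair returned by `divmod(x, y)` satisfies `x == q * y + r`, and it matches `(x // y, x % y)`. The following short sketch checks this for a few pairs, including negative values, where Python rounds the quotient down (towards negative infinity). The names `pairs` and `results` are used for illustration only.

```python
# Check the identity x == q * y + r for a few integer pairs,
# including negative values (Python floors the quotient).
pairs = [(10, 3), (3, 10), (-7, 2), (7, -2)]

results = {}
for x, y in pairs:
    q, r = divmod(x, y)
    results[(x, y)] = (q, r)
    # divmod(x, y) gives the same values as (x // y, x % y) for integers
    assert (q, r) == (x // y, x % y)
    assert q * y + r == x
    print(f"divmod({x}, {y}) -> q = {q}, r = {r}")
```

Note that the remainder always takes the sign of the denominator `y`, which is why `divmod(-7, 2)` gives a remainder of `1` rather than `-1`.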


## `eval(expr)`
This function can be used to evaluate a given string parameter, `expr`, as a Python expression. The function returns the result of evaluating the given expression. Because `eval()` executes whatever expression it is given, it should never be used on text from an untrusted source. This function can be configured using additional optional parameters. Again, if you are interested in all the options available with this function, take a look at the following link: https://docs.python.org/3/library/functions.html#eval.

In [None]:
x = 1 
print(eval('x * 2')) # evaluate the expression x * 2

a = 43
b = 17 
print(eval("divmod(a, b)")) # evaluate the expression divmod(a, b)
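
The optional parameters mentioned above are two dictionaries that control which names the expression can see: the first holds global names and the second holds local names. The sketch below is a minimal illustration only; `allowed_names` and `user_expression` are hypothetical names, and emptying the built-ins in this way limits accidental name access but should not be treated as a security sandbox.

```python
# Hypothetical names for illustration: only 'x' and 'abs' are made available
allowed_names = {"x": 4, "abs": abs}
user_expression = "abs(x - 10) * 2"

# second argument: global names (built-ins emptied); third argument: local names
result = eval(user_expression, {"__builtins__": {}}, allowed_names)
print(result)

# a name that has not been supplied is no longer visible to the expression
try:
    eval("len('abc')", {"__builtins__": {}}, allowed_names)
    blocked = False
except NameError:
    blocked = True
print(blocked)
```

The `try` and `except` keywords used here are covered in the Exception Handling topic of this notebook.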

## `filter(function, iterable)`
This function can be used to select the items in an `iterable` for which the given `function` returns `True`; all other items are filtered out. An iterable is either a sequence (such as a List, Tuple or string), a container (such as a Set or Dictionary) or an iterator (an object that is used to traverse through all values in a sequence). The `function` parameter should be a Python function that checks each item for acceptance or rejection. Accepted items are indicated by returning `True`. The `filter()` function returns an iterator. This can be converted back into a sequence type using the `list()`, `set()` or `tuple()` function (if required).

In [None]:
# define a function which checks if a given letter is a vowel.
# Returns: True if letter is a vowel, False if letter is not a vowel
def filter_vowels(letter):
    vowels = ['a', 'e', 'i', 'o', 'u']
    return letter in vowels # the 'in' test already evaluates to True or False

# filter out non-vowels from a given string and iterate over the result
for c in filter(filter_vowels,"hello world"):
    print(c)

# filter out all non-vowels from a given list of letters and convert the result to a tuple
letters = ['a', 'b', 'd', 'e', 'i', 'j', 'o']
print(tuple(filter(filter_vowels,letters)))

# define a function which checks a given key-value pair to see if the key is even
# Returns: True if key is even, False if key is odd
def f(item):
    # access the first value in the pair, i.e. the key
    return item[0] % 2 == 0

# filter out all odd keys from a dictionary and store the result in a new dictionary
aDict = {1:'one',2:'two',3:'three',4:'four'}
newDict = dict(filter(f, aDict.items())) 
print(newDict)
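
For a short test such as the vowel check, the filtering function does not have to be defined separately. The sketch below shows two alternatives that give the same selection: an inline `lambda` function passed to `filter()`, and a list comprehension. The variable names are for illustration only, and neither form is required for the exercises.

```python
letters = ['a', 'b', 'd', 'e', 'i', 'j', 'o']
vowels = 'aeiou'

# filter() with an inline lambda instead of a named function
with_lambda = list(filter(lambda letter: letter in vowels, letters))

# the same selection written as a list comprehension
with_comprehension = [letter for letter in letters if letter in vowels]

print(with_lambda)
print(with_comprehension)
```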

## `float(x)`
This function converts the given parameter, `x`, into a float value. The parameter can be a number or a string. The string representation of the floating point number must be valid, otherwise Python will raise a `ValueError`. If no parameters are given, `float()` returns `0.0`.

In [None]:
# an integer example
print(float(10))

# a simple string example
print(float("-13.33"))

# a string using scientific notation
print(float("-1.333E1"))

# an invalid float string - which raises a ValueError
print(float("abc"))

## `int(x, [base])`
This function converts the given parameter, `x`, into an int value. The parameter can be a number or a string. A float is truncated towards zero rather than rounded. There is an optional parameter, `base`, which allows you to specify which number base to use when converting a string. The default base is 10. If no parameters are given, `int()` returns `0`.

In [None]:
# a float example
print(int(123.23))

# a binary example using base=2
print(int('1010', base=2))

# a string example
print(int('123'))
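
The `base` parameter applies only when `x` is a string. The following sketch shows a few further cases under that assumption: hexadecimal digits, a string carrying a base prefix, base `0` (which infers the base from the prefix), truncation of a negative float, and a string that is not valid in the requested base. The variable names are illustrative.

```python
hex_value = int('ff', 16)      # hexadecimal digits, base 16
prefixed = int('0b1010', 2)    # a prefix that matches the base is accepted
inferred = int('0x1f', 0)      # base 0 infers the base from the prefix
truncated = int(-3.9)          # floats are truncated towards zero
print(hex_value, prefixed, inferred, truncated)

# the digit 2 does not exist in base 2, so the conversion is rejected
try:
    int('12', 2)
    invalid_rejected = False
except ValueError:
    invalid_rejected = True
print(invalid_rejected)
```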

## `len(x)`
This function returns the number of items in a sequence (such as a List, Tuple or string) or collection (such as a Dictionary or Set). It takes a single parameter, `x`, and returns the length of the sequence or collection.

In [None]:
# length of a string
greeting = "hello" # avoid naming a variable 'str', which would hide the built-in str type
print(len(greeting))

# length of a List
languages = ['Python', 'Java', 'JavaScript']
print(len(languages))

# size of a dictionary
aDict = {1:'one',2:'two',3:'three',4:'four'}
print(len(aDict))

## `map(function, iterable)`
This function can be used to apply a given function, `function` to each item in a given `iterable`. An iterable is either a sequence (such as a List, Tuple and strings), a container (Set and Dictionary) or an iterator (an object that is used to traverse through all values in a sequence). The function parameter should be a Python function that returns the result of processing the given item. The `map()` function returns an iterator. This can be converted back into a sequence type using the `list()`, `set()` or `tuple()` function (if required).

In [None]:
# returns square of a number
def square(number):
    return number * number

# square every number in the given List and convert the result to a List
numbers = [2, 4, 6, 8, 10]
print(list(map(square, numbers)))

# find the length of each key in a dictionary
languages = {"Python":0, "CSharp":0, "Java":0}
print(list(map(len, languages))) # here the built-in len() function has been used


## `min(iter)` & `max(iter)`
These functions return the smallest (`min`) and largest (`max`) items in a given iterable object, `iter`. As you have already seen, an iterable can be a List, Tuple, Set, Dictionary or string. For number values, the largest (or smallest) value is obvious. For string values, the values are compared character by character using their Unicode values, which gives alphabetical order for strings of the same case (note that all upper-case letters come before lower-case letters). For dictionaries, the keys are compared. Both functions can also accept two or more separate parameters instead of an iterable.

In [None]:
# integers
numbers = [9, 34, 11, -4, 27]
print(min(numbers), max(numbers))

# strings
languages = ["Python", "C Programming", "Java", "JavaScript"]
print(min(languages)) # sorting A-Z
print(max(languages)) # sorting Z-A

# dictionaries
staff = {"Dave": 1, "Rhys": 2, "Jan": 3, "Khoa": 4}
print(min(staff)) # key sorted A-Z
print(max(staff)) # key sorted Z-A

# two parameter usage
print(min(34,23))
print(max(34,23))
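
String comparison is case-sensitive, which can give surprising results when a list mixes upper and lower case. The sketch below illustrates this and shows one possible way around it using the optional `key` parameter of `min()` and `max()`, which is not described above. The list `words` is a made-up example.

```python
words = ["banana", "Cherry", "apple"]

# default comparison uses Unicode values, so upper-case 'C' sorts before 'a'
smallest_default = min(words)
largest_default = max(words)

# compare lower-cased copies instead, to ignore case
smallest_ignoring_case = min(words, key=str.lower)
largest_ignoring_case = max(words, key=str.lower)

print(smallest_default, largest_default)
print(smallest_ignoring_case, largest_ignoring_case)
```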

## `pow(x, y,[mod])`
This function returns x<sup>y</sup>, where the parameter `x` represents the base and `y` represents the exponent; both must be numeric. This is equivalent to using the power operator `x**y`. There is a third optional parameter, `mod`, which represents the modulus value to be applied after the power. The modulus is equivalent to the `%` operator that we covered in Unit 1, i.e. the remainder of a division. When `mod` is given, `x`, `y` and `mod` must all be integers.

In [None]:
# find 5 squared
print(pow(5, 2))

# show pow() and ** are equivalent
print(5**2)

# find square root of 81
print(pow(81, 0.5))

# find inverse of 10
print(pow(10,-1))

# use the modulus parameter to show the result
# of 7 squared modulo 5
print(pow(7,2,5))

## `reversed()`
This function returns a reverse iterator to provide a way to access a sequence of values in reverse order. The sequence can be a List, Tuple, string, `range()` or (from Python 3.8 onwards) a Dictionary. A Set cannot be reversed because it has no order. The iterator can be used to create a new sequence object using the `list()` or `tuple()` function (if required).

In [None]:
# for string
seq_string = 'Python'
print(list(reversed(seq_string)))

# for list
seq_list = [1, 2, 4, 3, 5]
print(list(reversed(seq_list)))

# for range
seq_range = range(5, 9)
print(list(reversed(seq_range)))

# for dictionary
staff = {"Dave": 1, "Rhys": 2, "Jan": 3, "Khoa": 4}
print(list(reversed(staff))) # uses the key only

## `round(n, [p])`
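This function returns the number `n` rounded to the specified number of decimal places, `p`. If the second parameter, `p`, is omitted then the `round()` function rounds to the nearest whole number and returns an integer. A value that lies exactly halfway between two options is rounded to the nearest even choice, so `round(1.5)` and `round(2.5)` both return `2`.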

In [None]:
# round up
print(round(1.6))

# round down
print(round(1.4))

# a value exactly halfway between two options rounds to the nearest even number
print(round(1.5))

# round to 4 d.p.
print(round(3.1415926535, 4))
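
In Python 3, `round()` does not always round a halfway value upwards: it rounds to the nearest even number. The short sketch below makes this visible, and also shows the return type with and without the second parameter. The names `halves` and `rounded` are illustrative.

```python
# halfway values round to the nearest even whole number
halves = [0.5, 1.5, 2.5, 3.5]
rounded = [round(value) for value in halves]
print(rounded)

# without p the result is an int; with p it stays a float
whole = round(1.6)
one_place = round(1.6, 0)
print(type(whole), type(one_place))

# most decimal fractions cannot be stored exactly as floats,
# so a value that looks halfway may in fact sit just below it
print(round(2.675, 2))
```

If your results depend on a particular rounding rule, it is worth testing a few halfway values like these rather than assuming the behaviour.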

## `sorted(iterable,[reverse],[key])`
This function sorts the items in a given `iterable` in ascending or descending order and returns the result as a List.  An iterable is either a sequence (such as a List, Tuple and strings), a container (Set and Dictionary) or an iterator (an object that is used to traverse through all values in a sequence). The optional parameter `reverse` is used to switch between ascending and descending order. The default is ascending order. Additionally, the optional parameter `key` is used to specify a function that can be used to generate the key to be used during the sort operation. The function can be user-defined or a built-in function (such as `len()`). 

In [None]:
# a list
py_list = ['e', 'a', 'u', 'o', 'i']
print(sorted(py_list))

# a tuple
py_tuple = ('e', 'a', 'u', 'o', 'i')
print(sorted(py_tuple))

# a set in reverse order
py_set = {"Dave", "Rhys", "Khoa", "Jan"}
print(sorted(py_set, reverse=True))

# a list sorted by string length using built-in len() function
languages = ["Python", "C Programming", "Java", "JavaScript"]
print(sorted(languages,key=len))

# a list of (key, value) pairs sorted by value (rather than key)
def use_value(elem):
    return elem[1]

pairs = [(2, 2), (3, 4), (4, 1), (1, 3)]
print(sorted(pairs, key=use_value))

## `sum(iterable,[start])`
This function adds all the items in the given `iterable` together and returns the total. The items in the iterable should be numbers. As previously described, an iterable can be a List, Tuple, Set or Dictionary. There is an optional parameter, `start`, which specifies the initial value to which the items are added. The default value of `start` is `0`. If you wish to join a set of strings together, you should use the `join()` method (as described in the next section).

In [None]:
# a list
marks = [65, 71, 68, 74, 61]
print(sum(marks))

# a set with start value 10
ages = {65, 71, 68, 74, 61}
print(sum(ages, 10))

## `zip(iterables)` and unzipping
This function takes zero or more `iterables`, joins them together one item at a time in a tuple and returns the result. In other words, the i<sup>th</sup> tuple contains the i<sup>th</sup> element from each `iterable` given as a parameter. This can become more complicated if any of the `iterable` parameters has a different length from the others. By default, `zip()` will stop once the shortest `iterable` has been exhausted. The `zip()` function returns an iterator. This can be converted back into a sequence type using the `list()`, `set()` or `tuple()` function (if required).

The `zip()` function can also be used to unzip an iterable that has previously been zipped (or that is in the correct format). Simply add a `*` in front of the zipped iterable that is being unzipped. The function will return each iterable separately as a tuple. These can be assigned to variables using the `,` operator, e.g. `x,y = zip(*zipped)`

In [None]:
# example with two iterables of equal length
languages = ['Java', 'Python', 'JavaScript']
versions = [14, 3, 6]

print(list(zip(languages, versions))) # convert result to a list

print("")

# example with three iterables of equal length
names = {'Muhammad', 'Deborah', 'Oladeji'} # a set of names (sets are unordered, so the pairing below may vary)
degrees = ['Data Science', 'Fintech', 'Data Science'] # a list of degrees
ages = (25, 28, 43) # a tuple of ages

print(set(zip(names, degrees, ages))) # convert result to a set

print("")

# example with two iterables of unequal length
numbersList = [1, 2, 3]
numbersListText = ['ONE', 'TWO', 'THREE', 'FOUR']
print(list(zip(numbersList, numbersListText))) # convert result to a list

print("")

# unzipping an existing list of aggregated values
zippedList = [(1, "one"), (2, "two"), (3, "three")]
number, number_str = zip(*zippedList) # the * operator indicates unzipping
print("Iterable 1: ",number)
print("Iterable 2: ", number_str )


### <font color='red'><u>Worksheet Exercises</u></font>
1. Find the minimum, maximum and mean average of the following list of numbers:  `37.14, 32.53, 62.21, 96.55, 44.68, 31.18, 34.29, 27.22, 63.13, 54.09, 59.66, 23.3, 88.93, 71.63, 99.1, 93.86, 65.13, 96.81, 10.23, 64.06`
2. Using the `filter()` function, show all the prime numbers between 2 and 100.
3. Using the `map()` and `len()` functions, create a `Tuple` containing the length of each string in the following `List`: `['Dave', 'Jan', 'Rhys', 'Khoa']`
4. Using the `sorted()` function and an appropriate function, sort the following Dictionary by value string length: `{ 'Colorado':'Rockies', 'Boston':'Red Sox', 'Minnesota':'Twins', 'Milwaukee':'Brewers', 'Seattle':'Mariners'}`
5. Given the following two lists, use the `zip()` and `dict()` functions to create a new Dictionary: `keys = ["First Name", "Last Name", "Workshop Room", "Time"]` and `values = ["Dave", "Wyatt", "3Q44", "4-6pm"]`

In [None]:
# 1.
numbers = [37.14, 32.53, 62.21, 96.55, 44.68, 31.18, 34.29, 
           27.22, 63.13, 54.09, 59.66, 23.3, 88.93, 71.63, 
           99.1, 93.86, 65.13, 96.81, 10.23, 64.06]

print("Min: ", min(numbers), "Max: ", max(numbers), "Mean:", sum(numbers)/len(numbers))

# 2.
# this function only works for values greater than or equal to 2
def filter_prime(num):
    flag = True # retain by default
    
    # check if the current number is divisible by any number 
    # up to but not including the current number
    for i in range(2, num):
        if (num % i) == 0:
            flag = False # reject if divisible 
            break
    
    # returns True if value should be retained, False if it should not
    return flag

# use the filter function to print all prime numbers between 2 and 100
for n in filter(filter_prime,range(2,100)):
    print(n)
    
# 3.
# define a function that returns the length of the supplied iterable
def find_len(text): # 'text' is used rather than 'str' to avoid hiding the built-in str type
    return len(text)

# provide a List of strings and use the map() function to create a Tuple of lengths
names = ['Dave', 'Jan', 'Rhys', 'Khoa']
print(tuple(map(find_len,names)))

# 4.
# define a function that returns the length of the value part of the key:value pair
def find_value_len(elem):
    return len(elem[1])

baseball= {'Colorado':'Rockies', 'Boston':'Red Sox', 
           'Minnesota':'Twins', 'Milwaukee':'Brewers', 
           'Seattle':'Mariners'}

# use the key parameter to specify the find_value_len() function should be used when sorting
print(sorted(baseball.items(), key=find_value_len))

# 5.
keys = ["First Name", "Last Name", "Workshop Room", "Time"]
values = ["Dave", "Wyatt", "3Q44", "4-6pm"]

tutorDict = dict(zip(keys,values))
print(tutorDict)

# String Methods
A string is a sequence of characters enclosed in quotation marks, `""` or `''`. Strings are represented in Python as a string object. String objects have their own set of built-in methods that can be used to manipulate the string. To use a string method, you simply add a `.` after a string literal or a string variable followed by the required method. We have already seen an example of this in Unit 1 using the `format()` method, e.g. `"My name is {0}".format("Dave")`.

## `capitalize()`
This method returns a string with the first character capitalised and all other characters converted to lower case. It does not modify the original string. If the first character is not a letter, it is left unchanged and the remaining characters are still converted to lower case.

In [None]:
# a string with all lower-case characters
example = "this is an example sentence"
print(example.capitalize())

# a string with a mixture of upper and lower-case characters
example2 = "tHiS Is An ExAmPlE sEnTeNcE"
print(example2.capitalize())

# a string that starts with a non-alphabetic character
example3 = "!this text is important!"
print(example3.capitalize())


## `count(substr, [start], [end])`
This method returns the number of occurrences of a substring, `substr`, in the string on which it acts. There are two optional parameters. `start` refers to the starting index to be used when searching while `end` refers to the ending index.

In [None]:
# example for single character substrings
message = 'python is popular programming language'
print(message.count('p')) # there are 4 'p' characters in the message

# example for multiple character substrings
message = "Python is awesome, isn't it?"
print(message.count('is')) # there are 2 'is' strings in the message

# example using the start and end index parameters
message = "Python is awesome, isn't it?"
print(message.count('is',10, 25)) # there is 1 'is' string between index 10 and 25 (end index not included)

## `find(substr)` or `index(substr)`
These methods return the index of the first occurrence of the substring, `substr`, if found. The difference between them is that `find()` returns `-1` if the substring is not found, while `index()` raises a `ValueError`. As we saw in the `count()` method above, these methods also have optional parameters `start` and `end` to limit the search space for the substring.

In [None]:
# find the first occurrence of the given substring
quote = 'Let it be, let it be, let it be'
print(quote.find('let it')) # using find()
print(quote.index('let it')) # using index()

# limit the search space using the start and end parameters
quote = 'Let it be, let it be, let it be'
print(quote.find('let it',20,30)) # using find()
print(quote.index('let it',20,30)) # using index()

# look for a substring that cannot be found
print(quote.find('long')) # using find()
print(quote.index('long')) # using index() - raises a ValueError
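
Because the two methods signal a missing substring differently, the calling code has to handle each one differently. The following sketch shows one way to do this: checking for `-1` after `find()`, and catching the `ValueError` raised by `index()`. It also uses the `in` operator, which is often sufficient when only a yes/no answer is needed. The variable names are illustrative.

```python
quote = 'Let it be, let it be, let it be'

# find() reports a missing substring with the value -1
position = quote.find('long')
found_with_find = position != -1

# index() reports a missing substring by raising a ValueError
try:
    quote.index('long')
    found_with_index = True
except ValueError:
    found_with_index = False

# the 'in' operator gives a simple True/False answer
found_with_in = 'long' in quote

print(position, found_with_find, found_with_index, found_with_in)
```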


## `isalnum()`
This method returns `True` if all of the characters in the given string are alphanumeric (i.e. letters or digits, with no spaces or symbols); otherwise it returns `False`.

In [None]:
# example string with letters and numbers only
example="UFCFVQ302"
print(example.isalnum())

# example string with letters, numbers and spaces
example2="I teach in 3Q44"
print(example2.isalnum()) # False because string contains spaces

## `isalpha()`
This method returns `True` if all of the characters in the given string are either upper or lower case letters otherwise it returns `False`.

In [None]:
# example string with letters only
example="Dave"
print(example.isalpha())

# example string with letters, numbers and symbols
example2="david2.wyatt@uwe.ac.uk"
print(example2.isalpha())

## `isdigit()`
This method returns `True` if all of the characters in the given string are digits otherwise it returns `False`.

In [None]:
# example string with digits only
example="1234567890"
print(example.isdigit())

# example string with digits and a decimal point
example2="3.1415"
print(example2.isdigit())

## `islower()` or `isupper()`
These methods return `True` if all of the letters in the given string are lowercase or uppercase, respectively; otherwise they return `False`. Symbols, digits and spaces are ignored when checking, but the string must contain at least one letter.

In [None]:
# example string with lowercase letters and other symbols
example="this is a good string."
print(example.islower())
print(example.isupper())

# example string with uppercase letters and other symbols and digits
example2="THIS IS 2ND GOOD STRING."
print(example2.isupper())
print(example2.islower())

## `join(iterable)`
This method returns a string which is constructed from all elements in the given `iterable`, each of which must itself be a string. As you have already seen, an iterable can be a List, Tuple, Set, Dictionary or string. Each element can be separated by a provided string. To do this, the `join()` method should be called on a string containing the required separator string.

In [None]:
# join the elements of a List together using a space character as a separator
text = ['Python', 'is', 'a', 'fun', 'programming', 'language']
print(' '.join(text))

# use join to add spaces between each character in a string
print(' '.join('Data Science'))

# join the keys of a dictionary using a string as a separator
test = {'Dave': 1, 'Jan': 2, 'Khoa': 3, 'Rhys': 4 }
s = '->'
print(s.join(test))

## `lower()` or `upper()`
These methods convert all characters in the string on which they are called to lower or upper case characters, respectively. They take no parameters. The methods return the resulting string; the original string is unchanged.

In [None]:
text = "ThIs StRiNg CoNtAiNs UpPeR aNd LoWeR cAsE lEtTeRs" # 'text' avoids hiding the built-in str type
print(text.lower()) # convert all letters to lower-case
print(text.upper()) # convert all letters to upper-case


## `strip([chars])`
This method removes leading and trailing characters from a given string. By default, whitespace is removed (i.e. spaces, tabs and newlines). However, the optional parameter, `chars`, can be used to customise the method: it is treated as a set of individual characters, and any leading or trailing characters that appear in it are removed, in any order. The method returns the resulting string.

In [None]:
# print string without stripping characters
print("\n\n\t     Hello     World      \t\n\n")

# strip leading and trailing whitespaces
print("\n\n\t     Hello     World      \t\n\n".strip())

# custom strip 
print("x0x0 Hello 0x0x".strip(" 0x"))

# custom strip with missing character in parameter
print("x0x0 Hello 0x0x".strip(" 0"))


## `replace(old, new, [count])`
This method returns a copy of the string in which all occurrences of the string parameter `old` have been replaced with the string parameter `new`. If the substring `old` cannot be found then a string equal to the original is returned. The original string remains unchanged. The optional parameter, `count`, allows you to limit the number of times that the replacement takes place for a given string. The default is to replace all occurrences.

In [None]:
# replace a character with a character
example1 = "apologize organize recognize"
print(example1.replace('z', 's'))

# replacing a string with a string
example2 = 'cold, cold heart'
print(example2.replace('old', 'alm'))

# replacing only two occurrences of 'let'
example3 = 'Let it be, let it be, let it be, let it be'
print(example3.replace('let', "don't let", 2))

## `split([separator], [maxsplit])`
This method breaks the given string down into a List of substrings using a specified separator, `separator`, to identify each substring. If no `separator` parameter is provided, the method uses whitespace (i.e. spaces, tabs and newlines). The second optional parameter, `maxsplit`, limits the number of splits that are made.

In [None]:
# split a string using the default separators
example1 = 'Torture the data and it will confess to anything'
print(example1.split())

# split a string using a specified separator string
example2 = 'Milk, Chicken, Bread'
print(example2.split(', ')) # note: the separator used is a comma AND a space

# limit the number of splits made to a given string
example3 = "Without data you're just another person with an opinion"
print(example3.split(' ',3)) # note: 4 substrings are created

### <font color='red'><u>Worksheet Exercises</u></font>
1. Write a Python program that accepts a comma separated sequence of words as input and prints the unique words in sorted form (alphanumerically).
2. Write a Python program that finds the first occurrence of the substrings `'not'` and `'poor'` in a given string. If `'not'` occurs before `'poor'`, then replace the whole substring between `'not...poor'` with the string `'good'`, e.g. the `not that poor` substring in `The weather is not that poor` would be replaced by the substring `good` resulting in `The weather is good`.
3. Write a Python program to compute the sum of all digits in a given string, e.g. `'123abcd45'` would result in the sum `15`
4. Write a Python program to remove unwanted characters from a given string assuming the unwanted characters are given in a List. For this exercise use the following List: `['#', '*', '!', '^', '%']`
5. Given an incorrectly formatted IP address (such as `164. 11.  4. 40`) where each part of the address may include leading whitespace, write a Python program to remove the leading whitespace and add an appropriate web protocol, e.g. `http://164.11.4.40`.

In [None]:
# 1.
inputStr = 'red,white,black,red,green,black' # black and red are repeated
words = inputStr.split(',')
print(','.join(sorted(set(words)))) # Set ensures uniqueness; sorted() accepts the Set directly

# 2. 
exampleStr = 'Bristol music culture is not at all poor!'

notIndex = exampleStr.find('not') # returns -1 if 'not' not found in exampleStr
poorIndex = exampleStr.find('poor') # returns -1 if 'poor' not found in exampleStr

# ensure there is a 'poor' and 'not' substring before we try to replace
if notIndex != -1 and poorIndex != -1:
    if poorIndex > notIndex:
        # when slicing the string we need to add 4 onto the poor index
        # to account for the word `poor` which is 4 characters long
        poorEnd = poorIndex + len('poor')
        # rebuild the string around the first 'not...poor' span only
        exampleStr = exampleStr[:notIndex] + 'good' + exampleStr[poorEnd:]
    
print(exampleStr)

# 3.
exampleStr = '123abc45'
digitSum = 0
for c in exampleStr:
    if c.isdigit():  # check if the character is a digit
        digitSum += int(c)
        
print(digitSum)

# 4.
exampleStr = "Pyth*^on Exercis^es"
unwantedChars = ['#', '*', '!', '^', '%']
for c in unwantedChars:
    # replacing the given character with nothing effectively deletes it
    exampleStr = exampleStr.replace(c, '') 
    
print(exampleStr)

# 5.
badIP = '164. 11.  4. 40'
# use strip function to remove leading spaces from each substring split on '.'
goodIP = 'http://' + '.'.join([i.strip() for i in badIP.split('.')])

print(goodIP)

# File Handling using Python
The only built-in function that provides functionality for handling files is `open()`. This function returns a `File` object (or a file handle) which provides a set of methods for accessing the file's contents such as `read()`, `write()`, `tell()` and `seek()`. In this section, we will look at how to open a file and how to read from and write data to a file.

## `open(file,[mode])`
The `open()` function accepts several different parameters, but for this tutorial we will focus on just two of them: `file` and `mode`. You can find a detailed description of `open()` at https://docs.python.org/3/library/functions.html#open.

### The `file` parameter
This parameter provides the name of the file to be opened. You can use either an absolute or relative pathname to identify the file. An absolute pathname includes the full path from the root directory of the file system, e.g., `C:\Users\drdav\Downloads\data\test.txt` assuming the `test.txt` file was located in the given folder. Note that in a Python string literal the backslash starts an escape sequence, so a Windows path like this should be written as a raw string (`r'C:\Users\drdav\Downloads\data\test.txt'`) or with forward slashes. You should avoid (where possible) using absolute pathnames when accessing files because other users of your Python code may not have the same file structure on their system. A relative pathname specifies a location that is relative to the current working directory in which the Python script is being executed, e.g. `'test.txt'` or `'data/test.txt'` assuming the data file is located in a sub-folder called `data`. When you are using Jupyter Notebooks this means the file should be located in the same folder (or a subfolder) as the Jupyter Notebook file.
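
One way to build relative pathnames that work on any operating system is the standard library's `pathlib` module, which is not otherwise used in this notebook. The sketch below is illustrative only: the folder and file names are hypothetical and no file is actually opened.

```python
from pathlib import Path

# build a relative path from its parts (hypothetical folder and file names)
relative_path = Path("data") / "test.txt"
print(relative_path.name)    # the file name part
print(relative_path.parent)  # the folder part

# relative paths are resolved against the current working directory
print(Path.cwd())

# a raw string keeps backslashes as ordinary characters
print(len(r'\t'), len('\t'))  # two characters versus a single tab character
```

A `Path` object can be passed to `open()` in place of a string.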

### The `mode` parameter
Python provides several different options for accessing files depending on whether you wish to read file contents or write new contents to a file. In addition, the structure of a file can also be taken into account when accessing the file. Some files are stored in text format (such as files with the extension `.csv` or `.txt`) while other files are stored in raw binary format (such as files with the extension `.exe`, `.dll` and commonly `.dat`). The `mode` parameter provides a way to specify which kind of access is needed for a given Python solution. The following table provides a list of all options:

| <div align="center">Mode</div> | <div align="center">Description</div> |
|----|---|
| <div align="center">r</div>  | <div align="center"><strong>Opens a file for reading. (default)</strong></div> |
| <div align="center">w</div>  | <div align="center">Opens a file for writing. <br/> Creates a new file if it does not exist or <br/> truncates (or removes all contents from) the file if it exists. </div> |
| <div align="center">x</div>  | <div align="center">Opens a file for exclusive creation.<br/> If the file already exists, the operation fails. </div> |
| <div align="center">a</div>  | <div align="center">Opens a file for appending at the end of the file without truncating it. <br/> Creates a new file if it does not exist. </div> |
| <div align="center">t</div>  | <div align="center"><strong>Opens in text mode. (default)</strong></div> |
| <div align="center">b</div>  | <div align="center">Opens in binary mode. </div> |
| <div align="center">+</div>  | <div align="center">Opens a file for updating (reading and writing) </div> |


In [None]:
# open a file using the default read in text mode
file = open("data/test.txt")
file.close() # every open file must be closed when access is no longer required

# open a file for writing in text mode. This will truncate the file.
file = open("data/test.txt", 'w')
file.close()

# open a file for reading and writing in binary mode.
file = open("data/test.txt", 'r+b')
file.close()

# open a file for appending in binary mode.
file = open("data/test.txt", 'ab')
file.close()

# exclusive creation: creates the file if it does not exist, otherwise raises FileExistsError
file = open("data/test.txt", 'x')
file.close()
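
The difference between the `w` and `a` modes is easiest to see side by side. The following is a minimal sketch, assuming a scratch file in a temporary folder (the file name `modes_demo.txt` is hypothetical and not part of the notebook's `data` folder):

```python
import os
import tempfile

# Hypothetical scratch file in a temporary folder (an assumption for this demo)
scratch_dir = tempfile.mkdtemp()
path = os.path.join(scratch_dir, "modes_demo.txt")

with open(path, 'w') as file:   # 'w' creates the file because it does not exist yet
    file.write("first")

with open(path, 'a') as file:   # 'a' keeps the existing content and adds to the end
    file.write(" second")

with open(path) as file:        # no mode given, so the default 'rt' is used
    after_append = file.read()

with open(path, 'w') as file:   # 'w' again: the existing content is discarded
    file.write("third")

with open(path) as file:
    after_truncate = file.read()

print(after_append)
print(after_truncate)
```

After the append the file holds both pieces of text, whereas reopening it with `w` leaves only the text from the final write.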

## The `close()` method
Every file that has been opened **MUST** be closed once the current Python program has finished operations on the file. This is particularly important if the file being accessed is used by other programs or users. On some platforms, an open file may remain locked, preventing others from using it until the `close()` method has been called. Closing a file also ensures that any buffered writes are saved to disk. You have already seen examples of the `close()` method being used in the previous code example.

In fact, there is another way to access files that automatically closes the file following access: the `with` keyword. It is good practice to use this approach where applicable. The following example shows how this can be used to access a file and read its contents.

In [None]:
# open a file using the 'with' keyword - the system will automatically
# close the file once the indented block has finished
with open("data/test.txt",'rb') as file:
    print(file.read()) 
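
One way to confirm that `with` really does close the file is to inspect the file object's `closed` attribute inside and after the block. This is a small sketch using a hypothetical scratch file in a temporary folder:

```python
import os
import tempfile

# Hypothetical scratch file (an assumption for this demo)
path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")

with open(path, 'w') as file:
    file.write("some text")
    open_inside = file.closed   # still open while inside the block

closed_after = file.closed      # closed as soon as the block has ended

print(open_inside, closed_after)
```

The file is closed even if the code inside the block fails part-way through, which is the main reason this form is preferred over calling `close()` by hand.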

### The `read([size])` method
This method reads the quantity of data specified by the optional `size` parameter into memory and returns the result as either a string (in text mode) or a `bytes` object (in binary mode). If `size` is omitted then the entire file contents are read into memory. Once the end of the file is reached, the method returns an empty string `''` or `bytes` literal `b''` when called.

In [None]:
# read the whole content
with open("data/president_heights.csv") as file:
    print(file.read())

# read the first 21 characters of the file
with open("data/president_heights.csv") as file:
    print(file.read(21))
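
The empty string returned at the end of the file is what makes it possible to read a large file in fixed-size pieces. The sketch below assumes a hypothetical ten-character scratch file and a chunk size of 4, both chosen purely for illustration:

```python
import os
import tempfile

# Hypothetical scratch file containing ten characters (an assumption for this demo)
path = os.path.join(tempfile.mkdtemp(), "read_demo.txt")
with open(path, 'w') as file:
    file.write("abcdefghij")

chunks = []
with open(path) as file:
    while True:
        chunk = file.read(4)    # read at most 4 characters
        if chunk == '':         # an empty string signals the end of the file
            break
        chunks.append(chunk)

print(chunks)
```

Each call to `read(4)` continues from where the previous call stopped, so the final chunk is shorter than the others.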

### The `readline()` and `readlines()` methods
These methods read lines of text from the open file. A line is defined as a sequence of characters ending with a newline character (`\n`). The `readline()` method reads a single line while the `readlines()` method reads all lines in the open file and places them into a `list`. NOTE: these methods retain the newline character in their return values.

In [None]:
# read the first line in the given file
with open("data/president_heights.csv", 'r') as file:
    print(file.readline())

# read the first line in the given file in binary mode
with open("data/president_heights.csv", 'rb') as file:
    print(file.readline())

# read all the lines in the given file
with open("data/president_heights.csv", 'r') as file:
    print(file.readlines())

It is common to iterate through a file one line at a time. There is another, simpler and more efficient way to access each line in a file: using a `for` loop.

In [None]:
# print the contents of a given file one line at a time using for
with open("data/president_heights.csv", 'r') as file:
    for line in file:
        print(line)
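
Because each line keeps its trailing newline character, printing it directly produces a blank line between lines of output. A common remedy, sketched here with a hypothetical two-line scratch file, is to strip the newline with `rstrip('\n')`:

```python
import os
import tempfile

# Hypothetical scratch file with two lines (an assumption for this demo)
path = os.path.join(tempfile.mkdtemp(), "lines_demo.txt")
with open(path, 'w') as file:
    file.write("first line\nsecond line\n")

raw_lines = []
clean_lines = []
with open(path) as file:
    for line in file:
        raw_lines.append(line)                 # newline character retained
        clean_lines.append(line.rstrip('\n'))  # newline character removed

print(raw_lines)
print(clean_lines)
```

An alternative is to keep the line as it is and call `print(line, end='')` so that `print()` does not add a second newline.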

### The `write(data)` method
This method writes the given `data` to the open file. The form of data written depends on whether the file was opened in text mode or binary mode. In text mode, the `data` parameter should be a string. In binary mode, the `data` parameter should be a `bytes` object. Any other data you wish to write to a file needs to be converted into one of these before it can be written. The method returns the number of characters/bytes written to the file.

In [None]:
# write a string to a file in text mode
with open("data/test.txt",'w') as file:
    file.write('Here is some text!')

# write a bytes object to a file in binary mode
#data = bytes(10) # create a bytes object with 10 bytes initialised to zero
#with open("data/test.txt",'wb') as file:
#    file.write(data)
    
# write a tuple to a file in text mode
#with open("data/test.txt",'w') as file:
#    value = ('the answer', 42)
#    file.write(str(value)) # convert the tuple to a string
    
# review the results of the write operation (uncomment each example above to see result)
with open("data/test.txt",'r') as file:
    print(file.read()) 
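
The value returned by `write()` can be captured and checked. The sketch below, which again assumes a hypothetical scratch file, records the count in both text mode and binary mode:

```python
import os
import tempfile

# Hypothetical scratch file (an assumption for this demo)
path = os.path.join(tempfile.mkdtemp(), "write_demo.txt")

with open(path, 'w') as file:
    text_count = file.write('Here is some text!')  # number of characters written

with open(path, 'wb') as file:
    byte_count = file.write(bytes(10))             # number of bytes written

print(text_count, byte_count)
```

Comparing the returned count with `len()` of the data is a simple way to confirm that everything was written.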

### The `writelines(data)` method
This method writes lines of text to an open file. The form of data written depends on whether the file was opened in text mode or binary mode. In text mode, the `data` parameter should be an iterable of strings (typically a `list`). In binary mode, the `data` parameter should be an iterable of `bytes` objects. Note that `writelines()` does not add line separators, so each string must supply its own newline character where one is needed.

In [None]:
# write a list of strings to a file in text mode
list_of_strings = ["\nA dozen, a gross, and a score", "\nPlus three times the square root of four",
                  "\nDivided by seven", "\nPlus five times eleven",
                   "\nIs nine squared and not a bit more"]
with open("data/test.txt",'w') as file:
    file.writelines(list_of_strings)

# write a list of bytes objects to a file in binary mode
#bytes1 = bytes(10)
#bytes2 = bytes(10)
#list_of_bytes = [bytes1, bytes2]
#with open("data/test.txt",'wb') as file:
#    file.writelines(list_of_bytes)

# review the results of the write operation (uncomment each example above to see result)
with open("data/test.txt",'r') as file:
    print(file.read()) 
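
Despite its name, `writelines()` writes the strings exactly as given, one after another. The following sketch (using a hypothetical scratch file and an illustrative list of words) contrasts writing a list as it stands with writing it after a newline has been appended to each item:

```python
import os
import tempfile

# Hypothetical scratch file and sample data (assumptions for this demo)
path = os.path.join(tempfile.mkdtemp(), "writelines_demo.txt")
words = ["one", "two", "three"]

with open(path, 'w') as file:
    file.writelines(words)                          # no separators are added
with open(path) as file:
    joined = file.read()

with open(path, 'w') as file:
    file.writelines(word + "\n" for word in words)  # add a newline to each item
with open(path) as file:
    separated = file.readlines()

print(repr(joined))
print(separated)
```

The second call passes a generator expression rather than a list, which works because `writelines()` accepts any iterable.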

### <font color='red'><u>Worksheet Exercises</u></font>
1. Write a Python function, `printLines()`, which takes two parameters, `filename` and `n`, where `filename` is the file to use and `n` is the number of lines to print. Positive `n` values indicate printing should begin at the start of the file while negative `n` values indicate printing should begin at the end of the file. A sample file `data/sample1.txt` has been provided for this exercise. Use your function to print the first 3 and last 3 lines.
2. The file `data/sample2.csv` contains 100 integer values. Write a Python program to find the minimum, maximum and average value. Print these values to the screen.
3. One of the earliest forms of cipher, used by Julius Caesar in Roman times, involves shifting letters a given number of positions in the alphabet (see https://en.wikipedia.org/wiki/Caesar_cipher). For example, assuming the shift value is 3, then all occurrences of the letter `A` in a text are substituted with the letter `D`, `B` with `E` and so on. Write a Python program that uses this technique to encrypt a given file `data/sample3.txt` using the shift value 12 and write the result into a new file `data/sample3_shifted.txt`. Assume that only capital letters are shifted and that other characters (such as a space or a full stop) should be passed through unchanged.
4. Create a new solution (based on your solution to 3. above) to decrypt the file `data/sample4_shifted.txt` and write the result into a new file `data/sample4.txt` assuming the shift value used was 4. NOTE: you will need to shift the letters in the other direction.

In [None]:
# 1.
def printLines(filename, n):
    with open(filename) as handle:
        allLines = handle.readlines()
    
    if n > 0:
        for i in range(n):
            print(allLines[i])
    else:
        for i in range(len(allLines)-abs(n), len(allLines), 1):
            print(allLines[i])
        
printLines('data/sample1.txt', 3)
printLines('data/sample1.txt', -3)

# 1. alternative solution
#def printLines(filename, n):
#    with open(filename) as handle:
#        allLines = handle.readlines()
#    
#    if n > 0:
#        for line in allLines[:n]:
#            print(line)
#    elif n < 0:
#        for line in allLines[n:]:
#            print(line)
#        
#printLines('data/sample1.txt', 3)
#printLines('data/sample1.txt', -3)

# 2.
with open('data/sample2.csv') as handle:
    data = handle.read()

aList = data.split(',') # assuming each substring is separated by a comma

# use the map() function to convert each substring number into an actual number
aList = list(map(int, aList)) 

print('Min: ', min(aList), 'Max: ', max(aList), 'Mean: ', sum(aList)/len(aList))

# 3.
shift = 12
with open('data/sample3.txt') as handle:
    allLines = handle.readlines()

resultFile = open('data/sample3_shifted.txt', 'w') # open a file to record result

for line in allLines:
    shiftedLine = "" # create an empty string to store the shifted letters into
    for c in line:
        # only CAPITAL letters will be shifted
        if c.isalpha() and c.isupper():
            unicodeVal = ord(c) - ord('A') # find value for letter
            shifted = unicodeVal + shift # shift letter the given number of places

            # if a shifted letter goes past Z then wrap it around to start again at A
            fixed_shifted = shifted % 26
            
            # add shifted letter to the string holding the encrypted message
            shiftedLine += chr(fixed_shifted + ord('A'))
        else:
            shiftedLine += c # letter is added unchanged
    resultFile.write(shiftedLine) # write the shifted letters to the file
resultFile.close() # must close the open file

# 4.
shift = 4
with open('data/sample4_shifted.txt') as handle:
    allLines = handle.readlines()

resultFile = open('data/sample4.txt', 'w') # open a file to record result

for line in allLines:
    shiftedLine = "" # create an empty string to store the shifted letters into
    for c in line:
        # only CAPITAL letters will be shifted
        if c.isalpha() and c.isupper():
            unicodeVal = ord(c) - ord('A') # find value for letter
            shifted = unicodeVal - shift # shift letter back the given number of places

            # if a shifted letter goes before A then wrap it around to end at Z
            # (Python's % operator returns a value from 0 to 25 here)
            fixed_shifted = shifted % 26
            
            # add shifted letter to the string holding the decrypted message
            shiftedLine += chr(fixed_shifted + ord('A'))
        else:
            shiftedLine += c # letter is added unchanged
    resultFile.write(shiftedLine) # write the shifted letters to the file
resultFile.close() # must close the open file
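
Solutions 3 and 4 differ only in the direction of the shift, so the shared logic could be wrapped in a single function. The following is a minimal sketch rather than a required solution; the function name `caesar_shift` and the sample message are illustrative assumptions and are not part of the worksheet files:

```python
def caesar_shift(text, shift):
    """Shift the capital letters A-Z by `shift` places; leave other characters unchanged."""
    result = ""
    for c in text:
        if 'A' <= c <= 'Z':
            # position of the shifted letter in the alphabet, wrapped into 0..25
            position = (ord(c) - ord('A') + shift) % 26
            result += chr(position + ord('A'))
        else:
            result += c  # spaces, punctuation and so on pass through unchanged
    return result

message = "HELLO, WORLD."                  # hypothetical sample message
encrypted = caesar_shift(message, 3)       # encrypt with a shift of 3
decrypted = caesar_shift(encrypted, -3)    # decrypt by shifting the other way

print(encrypted)
print(decrypted)
```

Decryption is simply encryption with the negated shift, which works because the `%` operator wraps negative positions back into the range 0 to 25.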

# Exception Handling

No matter your skill as a programmer, you will eventually make a coding mistake. Such mistakes come in three basic flavors:

* *Syntax errors*: Errors where the code is not valid Python (generally easy to fix)
* *Runtime errors*: Errors where syntactically valid code fails to execute, perhaps due to invalid user input (sometimes easy to fix)
* *Semantic/Logic errors*: Errors in logic: code executes without a problem, but the result is not what you expect (often very difficult to track-down and fix)

Here we're going to focus on how to deal cleanly with Python *runtime errors* via its exception handling framework. Exception handling makes your code more robust and helps prevent potential failures that would cause your program to stop in an uncontrolled manner. Imagine you have written code that is deployed in production and it terminates because of an unhandled exception: your client would not appreciate that, so it is better to anticipate and handle such exceptions in advance.

Below are two example runtime errors which cause execution of the script to fail:

In [None]:
a = 2
b = 'three'
a + b

In [None]:
100 / 0

There are dozens of built-in Python exceptions. The type is printed as part of the error message when an exception occurs. The types in the above two examples are `TypeError` and `ZeroDivisionError` respectively. The remaining part of the error line provides the details of what caused the error based on the type of exception. The following link provides a complete list of errors: https://docs.python.org/3/library/exceptions.html.

## Catching Exceptions: `try` and `except`
The main tool Python gives you for handling runtime exceptions is the `try...except` statement. The `try` clause lets you test a set of statements for runtime errors. The `except` clause handles the runtime error. The following example illustrates the basic exception handling formulation:

In [None]:
try:
    x = 1 / 0 # ZeroDivisionError
except:
    print("something bad happened!")

However, it is possible to specify which exception to "catch" and process by specifying the name of the exception in the `except` statement. A full list of built-in exceptions is given at: https://docs.python.org/3/library/exceptions.html. The term "catch" is commonly used in exception handling to describe an error that is explicitly processed. In the following example, the error has changed to a `TypeError`.

In [None]:
try:
    x = 1 / "0" # TypeError
except ZeroDivisionError:
    print("something bad happened!")

The `except` clause does not catch the `TypeError` exception, so the exception goes unhandled and the program stops with an error message. In fact, we can list several types of exception to be caught by a single `except` clause by placing them in a tuple. The following example adds the `TypeError` exception to the list of exceptions it can deal with:

In [None]:
try:
    x = 1 / "0" # TypeError
except (ZeroDivisionError, TypeError):
    print("something bad happened!")
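
When several exception types share one `except` clause, it can be useful to find out which one actually occurred. Binding the exception to a name with `as` makes this possible. The sketch below uses a hypothetical helper function, `describe_division()`, written only for this illustration:

```python
def describe_division(a, b):
    """Return a / b, or the name of the exception type if the division fails."""
    try:
        return a / b
    except (ZeroDivisionError, TypeError) as err:
        # err is the exception object; type(err).__name__ gives its class name
        return type(err).__name__

print(describe_division(6, 3))
print(describe_division(1, 0))
print(describe_division(1, "0"))
```

Calling `str(err)` instead would give the message text that normally appears after the exception type in an error report.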

Finally, we may need to provide different functionality for each type of exception. In such cases, we can add multiple exception clauses. It is common when doing this to add a final `except` clause to catch all other types of runtime error. In this example, a `NameError` exception is generated and caught by the final exception clause.

In [None]:
try:
    x = y / 0 # NameError
except ZeroDivisionError:
    print("You tried to divide by zero")
except TypeError:
    print("You have used an incorrect type")
except:
    print("something bad happened!")


## The `else` keyword
When using exception handling we may wish to execute a block of code only if no runtime errors were raised by the `try` clause. To do this we use the `else` keyword. The ordering of the clauses is important. The `else` clause should come after all `except` clauses. The following example illustrates this:

In [None]:
try:
    print("Hello World!")
except:
    print("Something went wrong")
else:
    print("Nothing went wrong")
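
A typical use of `else` is to keep the `try` clause as small as possible, so that only the statement that might fail is protected. In this sketch, the hypothetical function `halve()` (an assumption made for illustration) converts some text to a number and only performs the calculation if the conversion succeeded:

```python
def halve(text):
    """Return half of the number written in `text`, or None if it is not a number."""
    try:
        number = float(text)   # the only statement expected to fail
    except ValueError:
        return None            # conversion failed, so the else clause is skipped
    else:
        return number / 2      # runs only when no exception was raised

print(halve("10"))
print(halve("ten"))
```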

## The `finally` keyword
This keyword is used to specify code that should be executed after the exception code has been processed. While on the surface this looks identical to the `else` keyword above, there is a key difference between them. The `else` code is only executed if there is no runtime error while the `finally` code is executed regardless of whether there is an error or not. Again, the ordering of the clauses is important: `try`..`except`..`else`..`finally`.

In [None]:
try:
    print(x) # NameError
except:
    print("Something went wrong")
finally:
    print("Executed after the try...except has finished")
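
The order in which the four clauses run can be traced by recording each step in a list. The function `trace()` below is a hypothetical helper written only for this illustration:

```python
def trace(divisor):
    """Record which clauses run when 10 is divided by `divisor`."""
    steps = []
    try:
        steps.append("try")
        10 / divisor
    except ZeroDivisionError:
        steps.append("except")   # runs only if the division failed
    else:
        steps.append("else")     # runs only if the division succeeded
    finally:
        steps.append("finally")  # runs in both cases
    return steps

print(trace(2))
print(trace(0))
```

In both calls `"finally"` is the last entry, while only one of `"except"` and `"else"` ever appears.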

## Raising an exception using `raise`
As we have already seen, Python will generate exceptions automatically when a given runtime error has occurred. Sometimes we may wish to make use of exceptions programmatically. By this we mean recognising an issue using program logic (such as an `if` statement) and forcing an exception to occur. This can be useful when developing your own custom exceptions (see later section for more information on custom exceptions) or during the development process to help identify bugs. To do this we use the `raise` keyword. The `raise` keyword is followed by the exception we wish to raise. The example below illustrates this:

In [None]:
N = 100
if N > 50:
    raise RuntimeError("Input Value too large!")

It is possible to use the `raise` keyword without any exception included. In this situation, we are re-raising a previous exception. It is common to do this in an `except` clause. This causes the exception to be passed back to the calling function. If that function has its own try..except code blocks the exception can be dealt with again. This can be useful to help track down issues within your Python code where information can be added to the console output at each step of the algorithm. The following example illustrates this:

In [None]:
def doSomething(N):
    try:
        if N > 50:
            raise RuntimeError
    except:
        print("doSomething(): Input Value too large!")
        raise # re-raise the RuntimeError exception

N = 100
try:
    doSomething(N)
except:
    print("Error occurred in doSomething()")
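
A re-raised exception keeps its original type and message, so the caller can still inspect them. The following sketch is a variation on the example above; the function name `check_size()` and the `log` list are assumptions introduced for illustration:

```python
log = []  # records what happened at each level

def check_size(n):
    try:
        if n > 50:
            raise RuntimeError("Input value too large!")
    except RuntimeError:
        log.append("check_size(): handling locally")
        raise  # pass the same exception on to the caller

try:
    check_size(100)
except RuntimeError as err:
    log.append("caller: " + str(err))  # the original message is still available

print(log)
```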

## Defining Custom Exceptions
Programs may name their own exceptions by creating a new exception class. Exceptions should typically be derived from the base class `Exception`, either directly or indirectly. Exception classes can be defined which do anything any other class can do, but are usually kept simple. This definition of a custom exception uses class inheritance. In other words, your new custom exception is defined as a special kind of existing exception. The following example illustrates how to create and use a custom exception called `MySpecialError` which is derived from the base class `Exception`:

In [None]:
# simple custom exception class which inherits from Exception
class MySpecialError(Exception): 
    pass # no operation command required for this empty class

raise MySpecialError("An error message!") # raise the custom exception to test
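
A custom exception can be caught by name in exactly the same way as a built-in one. This short sketch repeats the class definition so that it runs on its own, then catches the exception and reads its message:

```python
class MySpecialError(Exception):
    pass  # inherits all of its behaviour from Exception

try:
    raise MySpecialError("An error message!")
except MySpecialError as err:
    caught_message = str(err)  # the text passed in when the exception was raised

is_exception = issubclass(MySpecialError, Exception)

print(caught_message)
print(is_exception)
```

Because `MySpecialError` inherits from `Exception`, a clause written as `except Exception:` would also catch it.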

### <font color='red'><u>Worksheet Exercises</u></font>
1. Add appropriate exception handling to the following code snippet:
```
    f = open('insurance.csv')
    s = f.readline()
```
2. Add an appropriate clause to your solution to 1. above to output a message when the code executes without exception. Adjust your code to open the file `data/insurance.csv`.
3. Add an appropriate clause to close the file once the code has finished.
4. Create a custom exception called `ParameterError`.
5. Create a new `addNumbers()` function to add two numbers together and return the result. Use your new `ParameterError` exception to cause an error if either parameter is not a number (i.e. int or float).
6. Using appropriate exception handling, use your new `addNumbers()` function to add the following two values: `2` and `three`.

In [None]:
# 1.

try:
    f = open('insurance.csv')
    s = f.readline()
except OSError as err: # captures errors related to OS functionality
    print(err) # print the exception details
    
# 2.

try:
    f = open('data/insurance.csv')
    s = f.readline()
except OSError as err: # captures errors related to OS functionality
    print(err) # print the exception details
else:
    print("File opened!")
    
# 3.

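f = None # ensures f exists even if open() fails
try:
    f = open('data/insurance.csv')
    s = f.readline()
except OSError as err: # captures errors related to OS functionality
    print(err) # print the exception details
else:
    print("File opened!")
finally:
    if f is not None: # only close the file if it was actually opened
        f.close()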
    
# 4.

class ParameterError(Exception): 
    pass # no operation command required for this empty class

# 5.
import numbers # provides functionality for numbers in Python

def addNumbers(x, y):
    # check if x is a number
    if not isinstance(x, numbers.Number):
        raise ParameterError("Parameter 1 must be a number")
    # check if y is a number
    elif not isinstance(y, numbers.Number):
        raise ParameterError("Parameter 2 must be a number")
    else:
        return (x + y) # since both are numbers we can do the addition

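# 6.
try:
    print(addNumbers(2, "three"))
except ParameterError as err: # the custom exception raised by addNumbers()
    print(err) # print the exception details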
