# Introduction to Python
## Lesson 17 - Decorators & Regular Expressions
**Ian Clark - 30.11.2020**

------

## Objectives
By the end of today's lesson, we'll have looked at:

* Decorators
* Regular expressions

----

## Decorators
* A decorator is, quite simply, a function which *alters* the behaviour of another function or class
* Let's return to lesson 6 (advanced functions)...

In [None]:
from time import time

def print_with_time(fn, *args, **kwargs):
    start = time()
    print('Answer: ', fn(*args, **kwargs))
    duration = round(time() - start, 2)
    print("Total time:", duration, "seconds")

* Recall the `print_with_time()` function
* We used it to "profile" our custom range functions - to work out how long they took to generate the result
* For simplicity, let's say we had a function which took 3 seconds to complete

In [None]:
from time import sleep

def delay(seconds):
    sleep(seconds)
    return "My result"

* We could profile it by "wrapping" it in our `print_with_time()` function

In [None]:
print_with_time(delay, 0.2)

* Using decorators, Python allows us to express this kind of wrapping a little more cleanly
* Let's now rewrite this `print_with_time()` function as a decorator

In [None]:
from time import time

def profile(fn):
    def wrapper(*args, **kwargs):
        start = time()
        result = fn(*args, **kwargs)
        duration = round(time() - start, 2)
        print("(Total time:", duration, "seconds)")
        return result
    
    return wrapper

* So, how does this differ from our original `print_with_time()` function?
* The first thing to note is, our decorator is a _function which returns a function_
* The function which is returned can change the behaviour of the function it is given
* In this case, all we do is capture the time _before and after_, and then log the result
* We *could* use our decorator by just overwriting the function...

In [None]:
delay = profile(delay)

In [None]:
print(delay(0.3))

* But Python provides a cleaner interface for this
* Above the function we want to "decorate", we use the `@` symbol, followed by the name of the decorator

In [None]:
@profile
def delay(seconds):
    sleep(seconds)
    return "My result"

In [None]:
print(delay(0.15))
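
* The two forms above (overwriting by hand, and the `@` syntax) do the same job
* Here is a minimal sketch showing them side by side. The `shout()` decorator and both greeting functions are made-up names for illustration only

```python
def shout(fn):
    """Hypothetical decorator: upper-case whatever the wrapped function returns"""
    def wrapper(*args, **kwargs):
        return fn(*args, **kwargs).upper()
    return wrapper

# Form 1: wrap the function by hand and rebind its name
def greet_manual(name):
    return f"Hello {name}!"

greet_manual = shout(greet_manual)

# Form 2: the @ syntax performs the same rebinding for us
@shout
def greet_decorated(name):
    return f"Hello {name}!"

print(greet_manual("Alice"))
print(greet_decorated("Alice"))
```

* Both calls print the same upper-cased greeting, because both names now point at a `wrapper` function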

* And we can combine these decorators...

In [None]:
# Adapted from: https://realpython.com/primer-on-python-decorators/#simple-decorators

def debug(fn):
    """Print the function signature and return value"""
    def wrapper(*args, **kwargs):
        # Get the name of the function
        fn_name = fn.__name__

        # Convert all of the "args" to strings
        args_repr = [str(a) for a in args]

        # Convert all of the "kwargs" to strings                
        kwargs_repr = [f"{k}={v}" for k, v in kwargs.items()]

        # Combine all of our arg and kwarg strings
        signature = ", ".join(args_repr + kwargs_repr)

        # Print out the function signature, as if we did it ourselves
        print(f">>> {fn_name}({signature})")

        # Call and return the original function
        return fn(*args, **kwargs)
    return wrapper

In [None]:
@profile
@debug
def delay(seconds):
    sleep(seconds)
    return "My result"

In [None]:
delay(0.02)

* Decorators can also accept arguments themselves

In [None]:
def slow_down(seconds):
    """Sleep for the given number of seconds before calling the function"""
    def outer(fn):
        def inner(*args, **kwargs):
            sleep(seconds)
            return fn(*args, **kwargs)
        return inner
    return outer

In [None]:
@slow_down(0.4)
@profile
@debug
def greet(name):
    return f"Hello {name}!"

In [None]:
greet("Alice")

* But, there's a bug here! It's reporting the time taken as 0 seconds
* Remember that our decorators are really just shorthand for wrapping one function in another
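* This means the *order* in which we stack decorators matters: the one closest to the function is applied first, and the one at the top ends up outermost
* A small sketch of this, using two made-up decorators (`exclaim()` and `brackets()` are not part of the lesson code)

```python
def exclaim(fn):
    """Hypothetical decorator: add "!" to the end of the result"""
    def wrapper(*args, **kwargs):
        return fn(*args, **kwargs) + "!"
    return wrapper

def brackets(fn):
    """Hypothetical decorator: put square brackets around the result"""
    def wrapper(*args, **kwargs):
        return "[" + fn(*args, **kwargs) + "]"
    return wrapper

# Same as: word_a = brackets(exclaim(word_a))
@brackets
@exclaim
def word_a():
    return "hi"

# Same as: word_b = exclaim(brackets(word_b))
@exclaim
@brackets
def word_b():
    return "hi"

print(word_a())  # [hi!]
print(word_b())  # [hi]!
```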

In [None]:
"""
Task 1: Rewrite the greet() function by explicitly wrapping our decorators in turn

Example:
function = decorator(function)
"""

In [None]:
"""
Task 2: Fix the bug by rewriting the original greet() using the decorator syntax

Example:
@decorator
def function():
    pass
"""

----

## Regular Expressions
* A regular expression is a pattern written in a small "formal language", which uses sequences of characters to define "search patterns"
* Regular expressions are used everywhere, and available in some form in almost all programming languages
* They are an incredibly powerful and expressive way of searching for, and manipulating text
* Regular expressions are often shortened to "regexes"

### The Basics
* At its core, a regular expression is described using a string

In [None]:
pattern = 'def'

* This string forms a "pattern"
* By using the special "raw" string syntax, we can write regular expressions more cleanly

In [None]:
# We'll see the benefit of this approach later
# But notice that your code editor will highlight this a little differently
pattern = r'def'

* These patterns can then be "matched" against other strings
* Python provides the built-in `re` module for regular expression usage
* One way of matching is to use the `search()` function

In [None]:
import re

# Given a string that includes our pattern, we get back a match
print(re.search(pattern, 'abcdefghi'))

In [None]:
# And given one that doesn't, we get back None
print(re.search(pattern, 'does not contain the pattern'))
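
* Because `search()` returns either a match object or `None`, we can use the result directly in an `if` statement
* A minimal sketch of this (the variable names `match` and `missing` are just for illustration)

```python
import re

match = re.search(r'def', 'abcdefghi')
missing = re.search(r'def', 'does not contain the pattern')

if match:
    print(match.group())  # The text that matched: def
    print(match.span())   # Where it was found, as (start, end): (3, 6)

if missing is None:
    print("No match found")
```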

* Alternatively, we can "compile" a regular expression pattern, and use that directly 

In [None]:
# We use the re.compile() function to compile a pattern
pattern = re.compile(r'def')

# And then we call the search() method on the *compiled* pattern,
# rather than using the module-level re.search() function
print(pattern.search('abcdefghi'))

* OK, so we've seen that we can use regular expressions to find out whether `"def"` was within some given text
* But, we could just do this... 

In [None]:
print('def' in 'abcdefghi')

* So, what do we need regular expressions for?

### The language
* The regular expression language is extremely powerful, but it is also extremely complicated
* Do not be put off by this fact - many senior programmers still struggle with them
* By the end of this lesson we want to understand *the basics*
  * We should be able to spot regular expressions
  * We should know _how_ to work out what they do
* Like all languages, regular expressions have a "grammar"
* This is made up of certain characters - metacharacters - which have a special meaning
  * Some of the most common examples are below  

* `.` - matches *any* single character (except a newline, by default)
* `[a-z]` - a character "class" - matches any one character from the set inside the brackets
* `(abc)` - describes a *group* - we can split our regex into multiple groups
* `*` - match the preceding character / group *0 to infinite* times
* `+` - match the preceding character / group *1 to infinite* times
* `?` - "optionally" match the preceding character / group (0 or 1 times)
* `\w` - any "word" character: a letter, digit or underscore (think: word)
* `\s` - any whitespace character (think: space)
* `^` - the start of a string
* `$` - the end of a string
* `\` - to use a "literal" value (e.g. if we want to actually match a `.` we need to write `\.`)
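
* To get a feel for these, here are a few of the metacharacters in action
* This is only a sketch: the patterns and test strings are made up for illustration, and each `assert` passes silently when the statement is true

```python
import re

# . matches any single character
assert re.search(r'c.t', 'cat')
assert re.search(r'c.t', 'cut')

# [a-c] matches one character from the class
assert re.search(r'[a-c]at', 'bat')
assert not re.search(r'[a-c]at', 'hat')

# *, + and ? apply to the character (or group) just before them
assert re.search(r'^ab*c$', 'ac')         # * allows zero "b"s
assert not re.search(r'^ab+c$', 'ac')     # + needs at least one "b"
assert re.search(r'^colou?r$', 'color')   # the "u" is optional
assert re.search(r'^colou?r$', 'colour')

# ^ anchors the start of the string, $ anchors the end
assert re.search(r'^abc', 'abcdef')
assert not re.search(r'^abc', 'xabcdef')
assert re.search(r'def$', 'abcdef')

# \. matches a literal dot, rather than "any character"
assert re.search(r'a\.b', 'a.b')
assert not re.search(r'a\.b', 'axb')

print("All patterns behaved as described")
```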

### Example - email address validation
* We can combine everything we have above to create a regex which can be used to validate email addresses
* Let's simplify the rules for an email address
  * It needs to start with an alphanumeric character
    * We need at least one of these, but we can have more
  * We then have an `@` sign
  * We then need at least one more alphanumeric character
  * We then need a `.`
  * We then need at least one more alphanumeric character
  * Then we need the string to end (i.e. nothing else)
* Let's work on this together
  * We'll use the really useful [regex101](https://regex101.com/) to help us here

In [None]:
import re

email_regex = re.compile(r'solution here')

# It should return a match for a valid email
assert email_regex.search('valid@email.com')

# It should not return a match for invalid emails
assert not email_regex.search('invalid')
assert not email_regex.search('no@topleveldomain')
assert not email_regex.search('no.at')
assert not email_regex.search('has spaces @ in it.com')

----

## Homework - Regular Expressions

### Number validator
* Create a regular expression to validate a *number*
* Rules
  * A number must be one or more digits
  * The first digit cannot be 0
  * It can then be followed by a dot, and then must have at least one more digit
* Hints
  * Use character classes to describe your possible numbers
  * Use the `+` character to match *at least once*
  * Use the `*` character to match *zero or many* times
  * Use a "group" to make the "floating" (non-integer) part optional `?`
  * Use regex101 to help you - and the internet is your friend!
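
* As a reminder of how an optional group works, here is a sketch on a *different* problem: matching a lower-case word which may be followed by a hyphen and a second word
* The pattern and the name `word` are made up for illustration, and this is not the solution to the number validator

```python
import re

# One or more lower-case letters, then optionally "-" and more letters
word = re.compile(r'^[a-z]+(-[a-z]+)?$')

print(bool(word.search('well')))        # True
print(bool(word.search('well-known')))  # True
print(bool(word.search('well-')))       # False: the hyphen needs letters after it
print(bool(word.search('Well')))        # False: upper-case "W" is not in [a-z]
```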

### Name replacer
* Given the following text:

> My name is Benjamin. Or Ben. My friends called me benji.

* Replace all of Benjamin's names with your own
* Hints
  * Python's `re` module provides the `sub()` function for replacing text. The first argument is a regular expression, the second is the replacement (for which you'll use your name), and the third is the text to search
  * You'll need to replace all versions of Benjamin, regardless of whether they're spelled with upper or lower case letters. To do this, you'll need a "modifier" (Python calls these "flags") - specifically the "case insensitive" one, `re.IGNORECASE`
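
* To show the shape of a case-insensitive replacement, here is a minimal sketch using `re.sub()` with the `re.IGNORECASE` flag
* The text and names are made up, so the homework pattern is still yours to work out

```python
import re

text = "Call me Sam. SAM is fine too, and so is sam."

# flags=re.IGNORECASE makes the pattern match "Sam", "SAM" and "sam" alike
result = re.sub(r'sam', 'Alex', text, flags=re.IGNORECASE)

print(result)  # Call me Alex. Alex is fine too, and so is Alex.
```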

----

## Homework - Recursion
* Recall that in lesson 6 "advanced functions", we learned about recursive functions
* Here's a clear example of how recursion can be used in an algorithm to efficiently sort a list (of numbers or strings, for example)
* The algorithm below is called "quick sort", and was taken from a good repository on learning Python
* Notice that the function is written with clear documentation and examples
* Your task is to learn how it works
  * Use the `debug()` and `profile()` decorators we created above to help you do this

In [None]:
# Taken from https://github.com/TheAlgorithms/Python
def quick_sort(collection: list) -> list:
    """A pure Python implementation of quick sort algorithm
    :param collection: a mutable collection of comparable items
    :return: the same collection ordered by ascending
    Examples:
    >>> quick_sort([0, 5, 3, 2, 2])
    [0, 2, 2, 3, 5]
    >>> quick_sort([])
    []
    >>> quick_sort([-2, 5, 0, -45])
    [-45, -2, 0, 5]
    """
    if len(collection) < 2:
        return collection
    pivot = collection.pop()  # Use the last element as the first pivot
    greater = []  # All elements greater than pivot
    lesser = []  # All elements less than or equal to pivot
    for element in collection:
        (greater if element > pivot else lesser).append(element)
    return quick_sort(lesser) + [pivot] + quick_sort(greater)

In [None]:
print(quick_sort([2, 3, 4, 5]))
print(quick_sort(['r', 'e', 'd', 'i']))