# Class 3 on Python
## Topics include:
* Custom functions, with optional named parameters, default values
* Docstrings in functions
* Annotations vs Forcing Type-checking 
* Installing packages & some calculations needed for hurricane tracking.
* Lambda functions intro (show using for custom sort)
* First-class functions & Closures

# Scoping 
Make sure you understand the basics of variable scope so you write proper code.

In [1]:
# This code is syntactically valid Python and works. So what's wrong with it?  
# Why does it matter?  How do we fix it properly?  

my_list = [1, 5, 8, 9, 12, 18, 14, 2]

def compute_mean():
    total = sum(my_list)
    num_items = len(my_list)
    global mean
    mean = total / num_items

compute_mean()

print('the result is', mean)

the result is 8.625


The problems are in the use of the variables `my_list` and `mean`.  The function should not access (and potentially modify) `my_list`, because that variable isn't local.  Doing so makes `compute_mean()` unpredictable: its result depends on hidden global state rather than only on its inputs, when it should be deterministic.  Also, what would happen if code outside the function were using a variable called `mean` for something else?  Calling `compute_mean()` would silently overwrite it and cause a bug.
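To make the name-collision risk concrete, here is a minimal sketch. The function name `compute_mean_global` and the string stored in `mean` are made up for illustration; only the pattern matters.

```python
# Some other part of the program uses the name `mean` for its own purpose.
mean = 'arithmetic'   # hypothetical label, unrelated to any calculation


def compute_mean_global(values):
    """Deliberately bad: writes its result into a global instead of returning it."""
    global mean
    mean = sum(values) / len(values)


compute_mean_global([1, 2, 3])
print(mean)   # 2.0 -- the label 'arithmetic' was silently overwritten
```

Nothing warned us that the label was lost, which is why returning the value is the safer design.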

In [2]:
# This code is a much better version of the above, following best-practices for accessing variables 
# within their scope by using parameters and return values, and not defining a global within a function.

my_list = [1, 5, 8, 9, 12, 18, 14, 2]

def compute_mean(some_list):
    total = sum(some_list)
    num_items = len(some_list)
    return total / num_items

mean = compute_mean(my_list)

print('the result is', mean)


the result is 8.625


## Custom Functions
Defining new functions in Python is generally straightforward.  

Parameters must be defined, but because of dynamic typing, we aren't *required* to specify data types.  But as we'll see, there are some advantages and multiple ways to do so.

In [3]:
def print_multiple_times(string, number_of_reps):
    for i in range(number_of_reps):
        print(string)

In [4]:
print_multiple_times('abcdefg', 3)
print_multiple_times('Welcome!', 2)

abcdefg
abcdefg
abcdefg
Welcome!
Welcome!


In the previous example, which works reasonably, notice that we have 2 REQUIRED parameters, and there's no explicit *return* keyword, so the function always returns the pseudo-value *None*.
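A quick way to confirm the implicit `None` is to capture the return value. This sketch repeats the function definition so it runs on its own; the variable name `result` is just for illustration.

```python
def print_multiple_times(string, number_of_reps):
    for i in range(number_of_reps):
        print(string)


# The call prints 'abc' once, then hands back None because there is no return statement.
result = print_multiple_times('abc', 1)
print(result is None)   # True
```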

But there are some things we can easily do to improve this simple function's re-usability and clarity.  For example, suppose that after using it a while in real coding, we discover that we most-often use it to print something twice.  So we decide to make the *number_of_reps* optional, with a default value of 2, and *refactor* by creating a new function name as a convenient wrapper.  Why did I use return here?  And why did I call the other function instead of just copying its innards?

In [5]:
def print_again(string, number_of_reps=2):
    return print_multiple_times(string, number_of_reps)

In [6]:
print_again(number_of_reps=3, string='Hurray!')

Hurray!
Hurray!
Hurray!


# Docstrings
Next, our co-workers, who have been trying to use the functions we wrote in their own programs, complain that there's no documentation on the API like they find with the standard library functions.  We can fix that easily by adding Docstrings to our functions, modules, and classes.  PyCharm will even automatically create the format for us.  You'll also want to work in PyCharm instead of Jupyter Notebooks to fully benefit from autocompletion and inline documentation.

In [7]:
def print_again(string, number_of_reps=2):
    """Prints a given string, the specified number of times.
    
    :param string: The string to print
    :param number_of_reps: How many times to print the string (default=2)
    :returns: None
    """
    return print_multiple_times(string, number_of_reps)

In [8]:
print_again?
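The `?` suffix only works in IPython and Jupyter. In a plain Python script or console, the same docstring is reachable through the standard library. Here is a small sketch, with a shortened stand-in for `print_again` so the example is self-contained:

```python
import inspect


def print_again(string, number_of_reps=2):
    """Prints a given string, the specified number of times.

    :param string: The string to print
    :param number_of_reps: How many times to print the string (default=2)
    :returns: None
    """
    for _ in range(number_of_reps):
        print(string)


# inspect.getdoc() returns the docstring with its indentation cleaned up.
doc = inspect.getdoc(print_again)
print(doc)

# The raw text is also stored on the function object itself as print_again.__doc__,
# and help(print_again) shows it together with the function signature.
```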

In [9]:
print_again('Echo!', 3)
print_again(number_of_reps=2, string="Who's there?")

Echo!
Echo!
Echo!
Who's there?
Who's there?


With the above function definition, PyCharm will assume that number_of_reps is supposed to be an integer and that the function should always return None.  But it doesn't actually know what data type the variable called "string" is.  As written, this function works fine when "string" is something else entirely. For example, let's see if it will work with a float:

In [10]:
print_again(3.14159, 3)

3.14159
3.14159
3.14159


It does work! Why?  Because we're passing the "string" variable through unmodified to the built-in print() function, and it can handle a wide variety of data types.

# Type Annotations
But suppose we want our function to be a lot more precise or cautious and only accept strings as the "string".  How can we do that in Python?  Let's try using an annotation to tell Python we expect a string there.

In [11]:
def print_string_again(string: str, number_of_reps=2):
    """Prints a given string, repeated a specified number of times, 
    and tries to make SURE it's a string.
    
    :param string: The string to print
    :param number_of_reps: How many times to print the string (default=2)
    :returns: None
    """
    return print_multiple_times(string, number_of_reps)
        
print_string_again(('Nevermore!', 'said the Raven.'))
print_string_again([34.5, 82.0], 4)

('Nevermore!', 'said the Raven.')
('Nevermore!', 'said the Raven.')
[34.5, 82.0]
[34.5, 82.0]
[34.5, 82.0]
[34.5, 82.0]


Well, clearly that type annotation didn't do what you might expect.  Unlike a type declaration in Java or C, it DOES NOT enforce the data type of the parameter!  So what good is annotation?  **Go back to PyCharm and try the popup documentation key and using autocomplete on that parameter to see.**

Or in Jupyter Notebooks or other IPython environments, invoke the documentation like this, which will show in a popup-window:

In [12]:
print_string_again?
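Another way to see what an annotation really is: Python just stores it on the function object as metadata. The sketch below uses a shortened copy of `print_string_again` so it runs standalone. Tools such as PyCharm or a static checker like mypy read this metadata, but the interpreter itself does not check it at call time.

```python
def print_string_again(string: str, number_of_reps=2):
    for _ in range(number_of_reps):
        print(string)


# Annotations are kept in a plain dictionary attached to the function.
print(print_string_again.__annotations__)   # {'string': <class 'str'>}

# Passing an int is still accepted at runtime; this prints 42 once.
print_string_again(42, 1)
```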

# Type Checking
Most of the time in Python we want the flexibility and low overhead of dynamic typing. **Only if we really need to enforce it**, we can do so explicitly in several ways, such as this for example:

In [13]:
def print_string_again(string='Hurrah!', number_of_reps=2):
    """Prints a given string, the specified number of times, and 
    tries to make SURE it's a string.
    
    :param string: The string to print
    :param number_of_reps: How many times to print the string (default=2)
    :returns: None
    """
    # notice that str following here is NOT in quotes. str is the symbol for 
    # the actual string class in Python.
    if not isinstance(string, str):  
        raise ValueError('parameter "string" should be a str type (or a subclass).')
    return print_multiple_times(string, number_of_reps)


In [14]:
print_string_again('Something else', 3)

Something else
Something else
Something else


In [15]:
print_string_again(["This doesn't work with a list."])

ValueError: parameter "string" should be a str type (or a subclass).

If we want to catch the exception our function can raise, we'd do it like this:

In [16]:
try:
    print_string_again(('Nevermore!', 'said the Raven.'))
except ValueError:
    print('That was NOT a string.')

That was NOT a string.
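As one of the other "several ways", here is a sketch of a decorator that checks arguments against their annotations at call time. The names `enforce_annotations` and `repeat_text` are hypothetical, invented for this example. It only handles annotations that are plain classes (like `str` or `int`) and raises `TypeError`, which is the conventional exception for a wrong argument type.

```python
import functools
import inspect
import typing


def enforce_annotations(func):
    """Hypothetical helper: reject arguments that don't match plain-class annotations."""
    signature = inspect.signature(func)
    hints = typing.get_type_hints(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Match the passed values to parameter names, as Python itself would.
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = hints.get(name)
            # Only check simple classes; skip missing or more complex annotations.
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError('{} should be {}, not {}'.format(
                    name, expected.__name__, type(value).__name__))
        return func(*args, **kwargs)
    return wrapper


@enforce_annotations
def repeat_text(text: str, times: int = 2) -> str:
    return ' '.join([text.upper()] * times)


print(repeat_text('hello'))      # HELLO HELLO
try:
    repeat_text(3.14)
except TypeError as err:
    print('Rejected:', err)      # Rejected: text should be str, not float
```

This adds overhead to every call, so it fits the same rule as above: use it only where enforcement really matters.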


### Return values in functions
Here's a very simple example of a function with return values:

In [17]:
def add(a, b):
    """Add two values together.  Note that with dynamic data typing, it works properly
    for numbers, strings, and even lists, etc.  Any data types that support the '+' operator
    will work.
    
    :param a: first value
    :param b: second value
    :return: the result after 'adding' the values a & b."""
    return a + b

In [18]:
print('answer = {:.2f}'.format(add(5.2, 6.99)))

answer = 12.19


In [19]:
round(12.19, 1)


12.2

In [20]:
add(4, 'Apple')   #  Is this going to crash?

TypeError: unsupported operand type(s) for +: 'int' and 'str'

In [21]:
add(['a', 'b'], ['c'])

['a', 'b', 'c']

# Latitude/Longitude computations

First, install the PyGeodesy module, as follows:

## This package isn't available from Anaconda, so we get it from pip

Run this command in your Anaconda Prompt or in the "Terminal" prompt launched from PyCharm:

`pip install pygeodesy` 

In [22]:
from pygeodesy import ellipsoidalVincenty as ev
a = ev.LatLon('0.0N', '0.0W')
b = ev.LatLon('1.0N', '0.0W')
c = ev.LatLon('0.0N', '1.0W')
d = ev.LatLon('1.0N', '1.0W')

# Do some sanity checks to make sure we understand bearings and how this function works:
print(a.distanceTo3(b))   # This is moving straight NORTH, 0 degrees
print(a.distanceTo3(c))   # This is moving straight WEST, 270 degrees
print(a.distanceTo3(d))   # This is moving NORTHWEST, 314.8 degrees
print(c.distanceTo3(b))   # This is moving NORTHEAST, 45.2 degrees

(110574.38855804392, 0.0, 0.0)
(111319.49079331082, 270.0, 270.0)
(156899.5682914189, 314.8119597706482, 314.8032326783622)
(156899.5682914189, 45.18804022935184, 45.19676732163782)


In [23]:
a.distanceTo3?

In [24]:
# Or the first two lines in the example for Hurricane Irene:
#20110821, 0000,  , TS, 15.0N,  59.0W,  45, 1006,  105,    ... 
#20110821, 0600,  , TS, 16.0N,  60.6W,  45, 1006,  130,    ...
a = ev.LatLon('15.0N', '59.0W')
b = ev.LatLon('16.0N', '60.6W')

meters = a.distanceTo(b)   # Calculate 'great circle' distance
distance = meters / 1852.0  # Divide to convert meters into nautical miles
bearing = a.bearingTo(b)
print('IRENE (2011) first moved {:.2f} nm at initial heading of {:.2f} deg.'.format(distance, bearing))

IRENE (2011) first moved 110.28 nm at initial heading of 303.02 deg.
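As a rough cross-check that needs no extra packages, the haversine formula gives the great-circle distance on a perfect sphere. This is a sketch, not what PyGeodesy does: the `ellipsoidalVincenty` module works on an ellipsoid, so small differences are expected. The function name `haversine_meters` and the mean Earth radius of 6371008.8 m are my assumptions for illustration.

```python
import math


def haversine_meters(lat1, lon1, lat2, lon2, radius=6371008.8):
    """Great-circle distance on a sphere.

    Coordinates are signed decimal degrees (North and East positive).
    The default radius is an assumed mean Earth radius in meters.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    h = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(h))


# Same two Hurricane Irene points as above; West longitudes become negative.
meters = haversine_meters(15.0, -59.0, 16.0, -60.6)
print('{:.2f} nm'.format(meters / 1852.0))   # expect a value close to the 110.28 nm above
```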


In [25]:
# Or from later lines in the example for Hurricane Irene:
#20110825, 1200,  , HU, 25.4N,  76.6W,  90,  950,  250,    ... 
#20110825, 1800,  , HU, 26.5N,  77.2W,  90,  946,  250,    ...
a = ev.LatLon('25.4N', '76.6W')
b = ev.LatLon('26.5N', '77.2W')

meters = a.distanceTo(b)    # Calculate 'great circle' distance
distance = meters / 1852.0  # Divide to convert meters into nautical miles
bearing = a.bearingTo(b)
knots = distance / 6        # nautical miles / hours = knots
print('IRENE (2011) later moved {:.2f} nm at initial heading of {:.2f} deg. \
at a speed of {:.2f} kts'.format(distance, bearing, knots))

IRENE (2011) later moved 73.37 nm at initial heading of 333.88 deg. at a speed of 12.23 kts
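The divisor of 6 above relies on these two rows being exactly six hours apart. A more general sketch derives the elapsed hours from the date and time fields of the two rows. The helper names `hours_between` and `speed_in_knots` are hypothetical, and the distance is simply the 73.37 nm result from above converted back to meters.

```python
from datetime import datetime


def hours_between(date1, time1, date2, time2):
    """Elapsed hours between two 'YYYYMMDD' + 'HHMM' timestamps."""
    fmt = '%Y%m%d%H%M'
    start = datetime.strptime(date1 + time1, fmt)
    end = datetime.strptime(date2 + time2, fmt)
    return (end - start).total_seconds() / 3600.0


def speed_in_knots(meters, hours):
    """Convert a distance in meters and a duration in hours to knots."""
    return (meters / 1852.0) / hours


hours = hours_between('20110825', '1200', '20110825', '1800')
print(hours)                                                 # 6.0

meters = 73.37 * 1852.0   # distance taken from the output above
print('{:.2f} kts'.format(speed_in_knots(meters, hours)))    # 12.23 kts
```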


In [26]:
# Another sample from Hurricane Irene, where it's heading EXACTLY North:
a = ev.LatLon('27.7N', '77.3W')
b = ev.LatLon('28.8N', '77.3W')

meters = a.distanceTo(b)    # Calculate 'great circle' distance
distance = meters / 1852.0  # Divide to convert meters into nautical miles
bearing = a.bearingTo(b)
knots = distance / 6        # nautical miles / hours = knots
print('IRENE (2011) later moved {:.2f} nm at initial heading of {:.2f} deg. \
at a speed of {:.2f} kts'.format(distance, bearing, knots))

IRENE (2011) later moved 65.82 nm at initial heading of 0.00 deg. at a speed of 10.97 kts


# Coordinate Out of Range problems
Note that with recent versions of the PyGeodesy library, the following points will cause an exception like this:

dms.RangeError: -359.1 beyond -180 degrees

Last semester we never encountered this problem because older versions, such as 17.9, would gladly accept those coordinates and work anyway. Now it's pickier. 

### Good news though...
It appears that in the latest revision of the HURDAT2 data files, NHC have corrected the out-of-range longitude value so this workaround probably won't be necessary this semester.

In [27]:
a = ev.LatLon('43.2N', '359.1W')
b = ev.LatLon('44.0N', '358.4W')

meters = a.distanceTo(b)    # Calculate 'great circle' distance
distance = meters / 1852.0  # Divide to convert meters into nautical miles
bearing = a.bearingTo(b)
knots = distance / 6        # nautical miles / hours = knots
print('AL051952 later moved {:.2f} nm at initial heading of {:.2f} deg. \
at a speed of {:.2f} kts'.format(distance, bearing, knots))

RangeError: -359.1 beyond -180 degrees

In [28]:
def flip_direction(direction: str) -> str:
    """Given a compass direction 'E', 'W', 'N', or 'S', return the opposite.
    Raises exception with none of those.

    :param direction: a string containing 'E', 'W', 'N', or 'S'
    :return: a string containing 'E', 'W', 'N', or 'S'
    """
    if direction == 'E':
        return 'W'
    elif direction == 'W':
        return 'E'
    elif direction == 'N':
        return 'S'
    elif direction == 'S':
        return 'N'
    else:
        raise ValueError('Invalid or unsupported direction {} given.'.format(direction))
        
        
def myLatLon(lat: str, lon: str):
    """Given a latitude and longitude, normalize them if necessary,
    to return a valid ellipsoidalVincenty.LatLon object.

    :param lat: the latitude as a string
    :param lon: the longitude as a string
    """

    # Split the longitude into its numeric part and its direction letter:
    if lon[-1] in ['E', 'W']:
        lon_num = float(lon[:-1])
        lon_dir = lon[-1]
    else:
        lon_num = float(lon)
        lon_dir = 'E'  # assumption: a longitude with no letter is degrees East
    if lon_num > 180.0:  # Does longitude exceed range?
        lon_num = 360.0 - lon_num
        lon_dir = flip_direction(lon_dir)
        lon = str(lon_num) + lon_dir

    return ev.LatLon(lat, lon)

Notice below how my wrapper function `myLatLon()` correctly handles the same out-of-range values that crashed above.

In [29]:
a = myLatLon('43.2N', '359.1W')
b = myLatLon('44.0N', '358.4W')

meters = a.distanceTo(b)    # Calculate 'great circle' distance
distance = meters / 1852.0  # Divide to convert meters into nautical miles
bearing = a.bearingTo(b)
knots = distance / 6        # nautical miles / hours = knots
print('AL051952 later moved {:.2f} nm at initial heading of {:.2f} deg. \
at a speed of {:.2f} kts'.format(distance, bearing, knots))

AL051952 later moved 56.87 nm at initial heading of 32.21 deg. at a speed of 9.48 kts
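If the longitudes are already signed numbers (East positive, West negative) rather than strings with a direction letter, the same wrap-around can be done arithmetically. This is a sketch under that assumption; `normalize_longitude` is a made-up name, not part of PyGeodesy.

```python
def normalize_longitude(degrees_east):
    """Wrap a signed longitude (East positive) into the range [-180, 180)."""
    return (degrees_east + 180.0) % 360.0 - 180.0


# 359.1W is -359.1 in signed form, which is the same meridian as 0.9E.
print(round(normalize_longitude(-359.1), 4))   # 0.9
print(round(normalize_longitude(190.0), 4))    # -170.0
print(round(normalize_longitude(-76.6), 4))    # -76.6
```

The rounding is only there to hide floating-point noise in the printed values.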


# Lambda functions
Lambda functions can seem like a bizarre or abstract concept. They are related to both "first-class functions" and "anonymous functions", which are capabilities of many languages including Python.

# Anonymous functions
Often, they are just a short function that is used only once, so that we don't even feel like giving them a name. Thus you'll also see the phrase "anonymous function" used in this situation.  It's helpful to understand that you can always write equivalent code where the function *does* have a name.  There's an example both ways below.

There are a few situations in Python where we occasionally want to use a temporary function, but the most likely place you'll see other coders do it is for customized sorting.

In [30]:
list_of_tuples = [(1, 'Joe'),
                  (2, 'James'),
                  (5, 'Smith'),
                  (10, 'Charles'),
                  (3, 'Alberta'),
                  (4, 'Francine'),
                  (2, 'Adam'),
                  (7, 'Charles')]

sorted(list_of_tuples)  # this produces the standard sort order

[(1, 'Joe'),
 (2, 'Adam'),
 (2, 'James'),
 (3, 'Alberta'),
 (4, 'Francine'),
 (5, 'Smith'),
 (7, 'Charles'),
 (10, 'Charles')]

Suppose we want to sort this structure by the *names* instead of the numbers. Notice the names are always in index position 1 instead of index zero.  The sorted() function lets us optionally provide a **function** as its *key=* parameter, and it will sort instead by whatever values the key function returns from each item.  So we could implement it like this:

In [31]:
def get_name_from_tuple(t: tuple) -> str:
    return t[1]

get_name_from_tuple((2, 'adam'))

'adam'

In [32]:
sorted?

In [33]:
sorted(list_of_tuples, key=get_name_from_tuple)

[(2, 'Adam'),
 (3, 'Alberta'),
 (10, 'Charles'),
 (7, 'Charles'),
 (4, 'Francine'),
 (2, 'James'),
 (1, 'Joe'),
 (5, 'Smith')]

In [34]:
sorted(list_of_tuples, key=lambda t: t[1])

[(2, 'Adam'),
 (3, 'Alberta'),
 (10, 'Charles'),
 (7, 'Charles'),
 (4, 'Francine'),
 (2, 'James'),
 (1, 'Joe'),
 (5, 'Smith')]

Once you untangle how that previous example is working, you can see that the lambda expression is just a shortcut for the same idea.

So how do we make it sort FIRST by name, and then (if there are matching names -- like 'Charles') sub-sort by the number?

In [35]:
sorted(list_of_tuples, key=lambda t: (t[1], t[0]) )

[(2, 'Adam'),
 (3, 'Alberta'),
 (7, 'Charles'),
 (10, 'Charles'),
 (4, 'Francine'),
 (2, 'James'),
 (1, 'Joe'),
 (5, 'Smith')]
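For keys that simply pick items out by index, the standard library's `operator.itemgetter` is a common alternative to writing the lambda by hand. Here is a sketch with a shortened copy of the list so it runs on its own:

```python
from operator import itemgetter

list_of_tuples = [(10, 'Charles'), (2, 'Adam'), (7, 'Charles'), (1, 'Joe')]

# itemgetter(1) behaves like: lambda t: t[1]
by_name = sorted(list_of_tuples, key=itemgetter(1))
print(by_name)
# [(2, 'Adam'), (10, 'Charles'), (7, 'Charles'), (1, 'Joe')]

# itemgetter(1, 0) behaves like: lambda t: (t[1], t[0])
by_name_then_number = sorted(list_of_tuples, key=itemgetter(1, 0))
print(by_name_then_number)
# [(2, 'Adam'), (7, 'Charles'), (10, 'Charles'), (1, 'Joe')]
```

In the first result the two 'Charles' entries keep their original relative order, because Python's sort is stable.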

In [36]:
# Numbers with their names in English, German, and Spanish:
number_names = [[1, 'one',   'eins',   'uno'],
                [2, 'two',   'zwei',   'dos'],
                [3, 'three', 'drei',   'tres'],
                [4, 'four',  'vier',   'quatro'],
                [5, 'five',  'fünf',   'cinco'],
                [6, 'six',   'sechs',  'seis'],
                [7, 'seven', 'sieben', 'siete'],
                [8, 'eight', 'acht',   'ocho'],
                [9, 'nine',  'neun',   'nueve'],
                [10, 'ten',  'zehn',   'diez']]

sorted(number_names)

[[1, 'one', 'eins', 'uno'],
 [2, 'two', 'zwei', 'dos'],
 [3, 'three', 'drei', 'tres'],
 [4, 'four', 'vier', 'quatro'],
 [5, 'five', 'fünf', 'cinco'],
 [6, 'six', 'sechs', 'seis'],
 [7, 'seven', 'sieben', 'siete'],
 [8, 'eight', 'acht', 'ocho'],
 [9, 'nine', 'neun', 'nueve'],
 [10, 'ten', 'zehn', 'diez']]

If we want to sort these alphabetically by their English names, then like the previous example, we still want to use the item at index 1 as the key. 

In [37]:
sorted(number_names, key=lambda x: x[1])

[[8, 'eight', 'acht', 'ocho'],
 [5, 'five', 'fünf', 'cinco'],
 [4, 'four', 'vier', 'quatro'],
 [9, 'nine', 'neun', 'nueve'],
 [1, 'one', 'eins', 'uno'],
 [7, 'seven', 'sieben', 'siete'],
 [6, 'six', 'sechs', 'seis'],
 [10, 'ten', 'zehn', 'diez'],
 [3, 'three', 'drei', 'tres'],
 [2, 'two', 'zwei', 'dos']]

How do we sort this in German?  Or Spanish?

In [38]:
sorted(number_names, key=lambda x: x[2])

[[8, 'eight', 'acht', 'ocho'],
 [3, 'three', 'drei', 'tres'],
 [1, 'one', 'eins', 'uno'],
 [5, 'five', 'fünf', 'cinco'],
 [9, 'nine', 'neun', 'nueve'],
 [6, 'six', 'sechs', 'seis'],
 [7, 'seven', 'sieben', 'siete'],
 [4, 'four', 'vier', 'quatro'],
 [10, 'ten', 'zehn', 'diez'],
 [2, 'two', 'zwei', 'dos']]

# Another way to look at Lambda functions:

In [39]:
triple = lambda x: x * 3

print(triple(5))
print(triple(40))

15
120


In [40]:
# That did the same thing as defining a "triple" function normally like this:
def triple(x):
    return x * 3

print(triple(5))
print(triple(40))

15
120
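One small difference between the two forms is visible in debugging output: a function made with `def` knows its own name, while a lambda does not. A sketch, using the made-up names `triple_lambda` and `triple_def` so both versions can exist side by side:

```python
triple_lambda = lambda x: x * 3


def triple_def(x):
    return x * 3


print(triple_lambda(5) == triple_def(5))   # True
print(triple_lambda.__name__)              # <lambda>
print(triple_def.__name__)                 # triple_def
```

That is one reason the PEP 8 style guide suggests using `def` whenever you are going to bind the function to a name anyway.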


# First Class Functions & Closures
The above examples work only because functions are themselves objects in Python, that we can use (like any other objects) as parameters to other functions and even return values, or store them in other objects. This is a feature of many advanced programming languages, and is formally called "first class functions".
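Here is a small sketch of those three uses together: functions stored in a dictionary, looked up like any other value, and passed as an argument. All of the names (`shout`, `whisper`, `apply_to_all`, `styles`) are invented for this example.

```python
def shout(text):
    return text.upper() + '!'


def whisper(text):
    return text.lower() + '...'


def apply_to_all(func, items):
    """Call func on every item and collect the results."""
    return [func(item) for item in items]


# Functions are objects, so they can be values in a dictionary.
styles = {'loud': shout, 'quiet': whisper}

print(apply_to_all(styles['loud'], ['hi', 'there']))    # ['HI!', 'THERE!']
print(apply_to_all(styles['quiet'], ['Hi', 'There']))   # ['hi...', 'there...']
```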

A **closure** is a more subtle concept.  It's where a function gets defined dynamically in a way that it depends on variables outside its own scope at creation time. In the examples that follow, the variable `n` is _outside_ the scope of the inner function we're creating. This only works because Python supports closures:

In [41]:
def multiplier_creator(n: int):
    def new_function(x): 
        print(x, 'times', n, 'is', x * n)
        
    return new_function

doubler = multiplier_creator(2)
tripler = multiplier_creator(3)
quadrupler = multiplier_creator(4)

doubler(10)
tripler(9)
quadrupler(4)

10 times 2 is 20
9 times 3 is 27
4 times 4 is 16


In [42]:
quadrupler('Pizza')

Pizza times 4 is PizzaPizzaPizzaPizza


To show that each returned function keeps using the `n` from its own call to `multiplier_creator()`, and is unaffected by a global variable that happens to share that name, see this experiment:

In [43]:
n = 42
doubler(8)

8 times 2 is 16
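We can also look inside a closure directly. Each function object carries a `__closure__` attribute holding the variables it captured. The sketch below uses a version of `multiplier_creator()` that returns the product instead of printing it, so the results are easy to check.

```python
def multiplier_creator(n):
    def new_function(x):
        return x * n
    return new_function


doubler = multiplier_creator(2)
tripler = multiplier_creator(3)

n = 42   # a global that merely shares the name; the closures never look at it

print(doubler.__code__.co_freevars)            # ('n',)
print(doubler.__closure__[0].cell_contents)    # 2
print(tripler.__closure__[0].cell_contents)    # 3
print(doubler(8))                              # 16
```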


In [44]:
def expander_creator(n: int, operation='exponent'):
    """A more general function creator than multiplier_creator(). 
    This returns a multiplication, exponentiation, or addition function.

    :param n: the multiplier or exponent value to use
    :param operation: one of a list of possible operations (default 'exponent')
    :returns: a new function
    """
    if operation == 'exponent' or operation is None:
        return lambda x: x ** n
    elif operation in ['multiply', 'times']:
        return lambda x: x * n
    elif operation in ['addition', 'add', 'plus']:
        return lambda x: x + n
    else:
        raise ValueError('operation parameter was invalid: {}'.format(operation))
    
new_tripler = expander_creator(3, 'multiply')
print(new_tripler(5))

squarer = expander_creator(2, 'exponent')
print(squarer(8))

cuber = expander_creator(3)
print(cuber(3))

incrementer = expander_creator(1, 'add')
print(incrementer(5))

15
64
27
6
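As a design alternative, the chain of `if`/`elif` tests can be replaced by a lookup table that maps each operation name to a function from the standard `operator` module. This is only a sketch; `OPERATIONS` and `expander_creator_table` are hypothetical names, and the behavior is meant to mirror `expander_creator()` above.

```python
import operator

# Each accepted operation name maps to a two-argument function.
OPERATIONS = {
    'exponent': operator.pow,
    'multiply': operator.mul,
    'times': operator.mul,
    'addition': operator.add,
    'add': operator.add,
    'plus': operator.add,
}


def expander_creator_table(n, operation='exponent'):
    """Return a one-argument function that applies the chosen operation with n."""
    if operation is None:
        operation = 'exponent'
    try:
        op = OPERATIONS[operation]
    except KeyError:
        raise ValueError('operation parameter was invalid: {!r}'.format(operation)) from None
    # The returned lambda is a closure over both op and n.
    return lambda x: op(x, n)


print(expander_creator_table(3, 'multiply')(5))   # 15
print(expander_creator_table(2)(8))               # 64
print(expander_creator_table(1, 'add')(5))        # 6
```

Adding a new operation then means adding one dictionary entry rather than another `elif` branch.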
