
<div style="text-align: center; line-height: 0; padding-top: 9px;">
  <img src="https://databricks.com/wp-content/uploads/2018/03/db-academy-rgb-1200px.png" alt="Databricks Learning" style="width: 600px">
</div>

# Methods, Functions &amp; Packages

## In this lesson you:
* Define and use functions
  * with and without arguments
  * with and without type hints
* Use **`assert`** statements to "unit test" functions
* Employ the **`help()`** function to learn about modules, functions, classes, and keywords
* Identify differences between functions and methods
* Import libraries

##![Spark Logo Tiny](https://files.training.databricks.com/images/105/logo_spark_tiny.png) Functions

In this lesson, we're going to see how we can use functions to make code reusable.

Let's start with a simple example:

We know that we can use Python to do math.

The modulo operator returns the remainder of a division.

The code below returns 0 because 42 is even, so the division has no remainder.

In [0]:
42 % 2

The code below returns 1 because 41 is odd.

In [0]:
41 % 2
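
As a small aside, the sketch below shows how **`%`** behaves with negative and non-integer operands, which matters once a function is tested with values other than positive whole numbers. The variable names are illustrative only.

```python
# In Python the remainder takes the sign of the divisor (here, 2),
# so the result for a negative odd number is still 1
negative_remainder = -1 % 2
print(negative_remainder)    # 1

# With a float operand the result is a float, and it may be neither 0 nor 1
float_remainder = 2.5 % 2
print(float_remainder)       # 0.5
```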

If we want to determine whether a whole bunch of numbers are even or odd, we can package this same code into a **function**. 

A <a href="https://docs.python.org/3/tutorial/controlflow.html#defining-functions" target="_blank">function</a> is created with the **`def`** keyword, followed by the name of the function, any parameters (variables) in parentheses, and a colon.

In [0]:
# General syntax
# def function_name(parameter_name):
#   """Optional doc string explaining the function"""
#   block of code that is run every time function is called

# defining the function
def printEvenOdd(num):
  """Prints the string "even", "odd", or "UNKNOWN"."""
  
  if num % 2 == 0:
    print("even")
  elif num % 2 == 1:
    print("odd")
  else:
    print("UNKNOWN")

# execute the function by passing it a number
printEvenOdd(42)

One problem with printing the result is that the function does not return anything: if you assign the function's result to a variable, the value is **`None`**.

As we see here:

In [0]:
result = printEvenOdd(42)
print(f"The result is: {result}")

To make the function more useful, we can instead introduce a **`return`** statement.

This directs the function to stop executing and to "return" the specified value to the caller.

In [0]:
def evenOdd(num):
  """Returns the string "even", "odd", or "UNKNOWN"."""
  
  if num % 2 == 0:
    return "even"
  elif num % 2 == 1:
    return "odd"
  else:
    return "UNKNOWN"

# execute the function by passing it a number
result = evenOdd(42)
print(f"The result is: {result}")
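
To illustrate that **`return`** stops the function immediately, here is a minimal sketch. The function name **`absoluteValue`** is hypothetical and not part of this lesson's code.

```python
def absoluteValue(num):
  """Returns the distance of num from zero."""

  if num < 0:
    return -num   # execution stops here for negative numbers

  return num      # only reached when num is zero or positive

print(absoluteValue(-5))   # 5
print(absoluteValue(3))    # 3
```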

##![Spark Logo Tiny](https://files.training.databricks.com/images/105/logo_spark_tiny.png) Testing Functions

When developing functions like this, it is best practice to test them with various inputs, expecting different outputs.

This is commonly referred to as **Unit Testing** - testing a "unit" of code.

To do that, we can employ the **`assert`** statement as seen in the previous lesson.

If a test fails, execution will stop and an alert (**AssertionError**) will be rendered to the console.

To see this in action, alter the code below so that the various assertions fail.

In [0]:
result = evenOdd(101)
assert "odd" == result, f"Expected odd, found {result}"
print("Test #1 passed")

result = evenOdd(400)
assert "even" == result, f"Expected even, found {result}"
print("Test #2 passed")

result = evenOdd(5)
assert "odd" == result, f"Expected odd, found {result}"
print("Test #3 passed")

result = evenOdd(2)
assert "even" == result, f"Expected even, found {result}"
print("Test #4 passed")

result = evenOdd(3780) 
assert "even" == result, f"Expected even, found {result}"
print("Test #5 passed")

result = evenOdd(78963)
assert "odd" == result, f"Expected odd, found {result}"
print("Test #6 passed")

# A slightly more robust version:
value = 1/3
expected = "UNKNOWN"
result = evenOdd(value)
assert expected == result, f"Expected {expected}, found {result} for {value}"
print("Test #7 passed")

The secret to unit testing is twofold:
* Write your code so that it is testable - a whole other topic in and of itself.
* Provide as much information as reasonable in the assertion - in most cases you will be looking at a report of failures and not the code itself.
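
To see what an informative assertion message looks like when a test fails, consider the sketch below. It repeats **`evenOdd()`** so that it runs on its own, uses a deliberately wrong expectation, and catches the error with **`try`**/**`except`** (not covered in this lesson) only so the message can be printed instead of stopping execution.

```python
def evenOdd(num):
  """Returns the string "even", "odd", or "UNKNOWN"."""
  if num % 2 == 0:
    return "even"
  elif num % 2 == 1:
    return "odd"
  else:
    return "UNKNOWN"

value = 7
expected = "even"   # deliberately wrong so that the assertion fails

try:
  result = evenOdd(value)
  assert expected == result, f"Expected {expected}, found {result} for {value}"
except AssertionError as e:
  message = str(e)
  print(f"Test failed: {message}")   # Test failed: Expected even, found odd for 7
```

The message names the expected value, the actual value, and the input, which is usually enough to diagnose the failure from a report alone.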

##![Spark Logo Tiny](https://files.training.databricks.com/images/105/logo_spark_tiny.png) The Help Function

The <a href="https://docs.python.org/3/library/functions.html#help" target="_blank">help()</a> function can display additional information about the functions you develop.

It relies in large part on the documentation string provided at the top of a function, for example:

In [0]:
help(evenOdd)

A properly documented function will include information on:
* The parameters
* The parameters' data types
* The return value
* The return value's data type
* A one-line description
* Possibly a more verbose description
* Example usage
* and more...

In [0]:
def documentedEvenOdd(num):
  """
  Returns the string "even", "odd", or "UNKNOWN".
  
  This would be a more verbose description of 
  what this function might do and could drone 
  on for quite a while
  
  Args:
    num (int): The number to be tested as even or odd
    
  Returns:
    str: "even" if the number is even, "odd" if the number is odd, otherwise "UNKNOWN"
    
  Examples:
    documentedEvenOdd(32)
    documentedEvenOdd(13)
  """
    
  if num % 2 == 0:
    return "even"
  elif num % 2 == 1:
    return "odd"
  else:
    return "UNKNOWN"

In [0]:
help(documentedEvenOdd)
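
The documentation string that **`help()`** displays is stored on the function itself in its **`__doc__`** attribute. The sketch below uses a hypothetical function, **`documentedDouble`**, to show this.

```python
def documentedDouble(num):
  """Returns the given number multiplied by two."""
  return num * 2

# The docstring is available as plain text on the function object
print(documentedDouble.__doc__)   # Returns the given number multiplied by two.
```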

##![Spark Logo Tiny](https://files.training.databricks.com/images/105/logo_spark_tiny.png) Function Arguments

Functions can accept more than one argument.

So far, our function takes only one argument and returns English labels only.

Now let's modify our function to make it easy to localize for different languages by accepting two additional arguments for the local version of "even" and "odd".

In [0]:
def evenOddInt(num, evenLabel, oddLabel):
  if num % 2 == 0:
    return evenLabel
  elif num % 2 == 1:
    return oddLabel
  else:
    return "UNKNOWN"

# execute the function by passing it a number and two labels
print( evenOddInt(41, "even", "odd") )
print( evenOddInt(42, "even", "odd") )

A new set of unit tests can verify that our new function works as expected.

And because a unit test is just more code, we can wrap that up in a test function as well.

In [0]:
def testEvenOddInt(value, evenLabel, oddLabel, expected):
  result = evenOddInt(value, evenLabel, oddLabel)
  
  assert expected == result, f"Expected {expected}, found {result} for {value}"
  print(f"Test {value}/{evenLabel}/{oddLabel} passed")

With the test function defined, testing many permutations gets even easier.

In [0]:
# Test our "UNKNOWN" case
testEvenOddInt(1/3, "what", "ever", "UNKNOWN")

# Test around zero
testEvenOddInt(-1, "even", "odd", "odd")
testEvenOddInt(0, "even", "odd", "even")
testEvenOddInt(1, "even", "odd", "odd")
testEvenOddInt(2, "even", "odd", "even")
testEvenOddInt(3, "even", "odd", "odd")

# Additional tests focused on the even/odd labels
testEvenOddInt(400, "pair", "impair", "pair")
testEvenOddInt(5, "ζυγός", "περιττός", "περιττός")
testEvenOddInt(2, "gerade", "ungerade", "gerade")
testEvenOddInt(3780, "genap", "ganjil", "genap")
testEvenOddInt(78963, "sudé", "liché", "liché")
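
Another way to organize many test permutations is to keep the inputs and expected outputs in a list and loop over them. This is only a sketch: the name **`test_cases`** and the choice of cases are illustrative, and **`evenOddInt()`** is repeated so the example runs on its own.

```python
def evenOddInt(num, evenLabel, oddLabel):
  if num % 2 == 0:
    return evenLabel
  elif num % 2 == 1:
    return oddLabel
  else:
    return "UNKNOWN"

# Each entry is (value, evenLabel, oddLabel, expected)
test_cases = [
  (0, "even", "odd", "even"),
  (7, "even", "odd", "odd"),
  (400, "pair", "impair", "pair"),
  (1/3, "what", "ever", "UNKNOWN"),
]

passed = 0
for value, evenLabel, oddLabel, expected in test_cases:
  result = evenOddInt(value, evenLabel, oddLabel)
  assert expected == result, f"Expected {expected}, found {result} for {value}"
  passed += 1

print(f"{passed} of {len(test_cases)} tests passed")   # 4 of 4 tests passed
```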

##![Spark Logo Tiny](https://files.training.databricks.com/images/105/logo_spark_tiny.png) Type Hints

Type hinting allows us to indicate the argument and return type of a function.

This is done by appending a colon and data type to the end of each parameter.
* For example **`num`** would become **`num:int`**.

The return type is documented by adding an "arrow" and data type to the end of a function.
 * For example, **`def someFunc():`** would become **`def someFunc() -> str:`** to indicate that it returns a string.
 
We can see below how our **`evenOddInt()`** function would look with type hints.

In [0]:
def evenOddInt(num:int, evenLabel:str, oddLabel:str) -> str:
  if num % 2 == 0:
    return evenLabel
  elif num % 2 == 1:
    return oddLabel
  else:
    return "UNKNOWN"

# execute the function by passing it a number and two labels
evenOddInt(42, "even", "odd")

But there is a catch!

Python is a dynamically typed language, so even if you declare a parameter to be a string, there is little to nothing stopping you from passing other data types to that function.

For example:

In [0]:
resultA = evenOddInt(12, "EVEN", "ODD")
print(f"""The result is "{resultA}" and of type {type(resultA)}""")

resultB = evenOddInt(13, True, False)
print(f"""The result is "{resultB}" and of type {type(resultB)}""")

resultC = evenOddInt(17, 1, 0)
print(f"""The result is "{resultC}" and of type {type(resultC)}""")

**Question:** If type hints are not enforced in Python, then why provide them?
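
One possible part of the answer: although Python does not enforce the hints, it does record them on the function, where **`help()`**, code editors, and external static type checkers can read them. The sketch below repeats **`evenOddInt()`** and inspects its **`__annotations__`** attribute.

```python
def evenOddInt(num:int, evenLabel:str, oddLabel:str) -> str:
  if num % 2 == 0:
    return evenLabel
  elif num % 2 == 1:
    return oddLabel
  else:
    return "UNKNOWN"

# The hints are stored as a mapping of parameter name to type,
# with the return type stored under the key "return"
print(evenOddInt.__annotations__)
```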

**Question:** If type hints won't stop callers of my function from passing the wrong value, what can I do?<br/>
Hint: Consider the following version of our even/odd function:

In [0]:
def evenOddInt(num:int, evenLabel:str, oddLabel:str) -> str:
  # What can one do to protect a function from bad input?
  
  if num % 2 == 0:
    return evenLabel
  elif num % 2 == 1:
    return oddLabel
  else:
    return "UNKNOWN"

# This should always work
print( evenOddInt(12, "EVEN", "ODD") )

# But can we stop this code from executing?
print( evenOddInt(13, True, False) )
print( evenOddInt(17, 1, 0) )
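
One possible answer, offered as a sketch rather than the only approach, is to check the argument types at the top of the function with **`isinstance()`** and raise a **`TypeError`** for bad input. The function name **`checkedEvenOddInt`** and the error message are hypothetical.

```python
def checkedEvenOddInt(num:int, evenLabel:str, oddLabel:str) -> str:
  # Reject labels that are not strings before doing any work
  if not isinstance(evenLabel, str) or not isinstance(oddLabel, str):
    raise TypeError("evenLabel and oddLabel must both be strings")

  if num % 2 == 0:
    return evenLabel
  elif num % 2 == 1:
    return oddLabel
  else:
    return "UNKNOWN"

# This still works
print( checkedEvenOddInt(12, "EVEN", "ODD") )

# This is now rejected; the error is caught here only so the example keeps running
try:
  checkedEvenOddInt(13, True, False)
  rejected = False
except TypeError as e:
  rejected = True
  print(f"Rejected: {e}")
```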

##![Spark Logo Tiny](https://files.training.databricks.com/images/105/logo_spark_tiny.png) Default Values

Assuming 90% of the callers of this function expect to use English, requiring them to specify the **`evenLabel`** and **`oddLabel`** every time is cumbersome.

We can improve our function further by adding default values for even and odd so these two arguments are not needed for English.

In [0]:
def evenOddInt(num: int, evenLabel:str = "even", oddLabel:str = "odd") -> str:
  if num % 2 == 0:
    return evenLabel
  elif num % 2 == 1:
    return oddLabel
  else:
    return "UNKNOWN"

# execute the function with and without the optional labels
print(evenOddInt(42))
print(evenOddInt(32, "EVEN", "ODD"))
print(evenOddInt(65, "pair", "impair"))

Of course we now need to update our unit tests to account for the fact that we now have default values.

But we can save that for a later date.
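
For readers who want a head start, here is a minimal sketch of what such tests might look like. It repeats the function so that it runs on its own, and the specific cases chosen are only a suggestion.

```python
def evenOddInt(num: int, evenLabel:str = "even", oddLabel:str = "odd") -> str:
  if num % 2 == 0:
    return evenLabel
  elif num % 2 == 1:
    return oddLabel
  else:
    return "UNKNOWN"

# Both labels fall back to their defaults
assert evenOddInt(42) == "even"
assert evenOddInt(41) == "odd"

# Only evenLabel is overridden; oddLabel still uses its default
assert evenOddInt(42, "pair") == "pair"
assert evenOddInt(41, "pair") == "odd"

# Both labels are overridden
assert evenOddInt(41, "pair", "impair") == "impair"

print("Default-value tests passed")
```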

##![Spark Logo Tiny](https://files.training.databricks.com/images/105/logo_spark_tiny.png) Named Arguments

By default, Python assigns a value to a parameter based on its position in the call.

In the previous example, the first value is assigned to the **`num`** parameter, the second value to the **`evenLabel`** parameter, and so on.

Alternatively, you can name the arguments as the function is called - note that the order of the named arguments no longer matters, although any positional arguments must still come first.

In [0]:
def evenOddInt(num: int, evenLabel:str = "even", oddLabel:str = "odd") -> str:
  if num % 2 == 0:
    return evenLabel
  elif num % 2 == 1:
    return oddLabel
  else:
    return "UNKNOWN"

# execute the function by naming some or all of its arguments
print(evenOddInt(42, oddLabel="impair", evenLabel="pair"))
print(evenOddInt(evenLabel="EVEN", oddLabel="ODD", num=32))
print(evenOddInt(oddLabel="ganjil", num=3780, evenLabel="genap"))

Hint: Naming the arguments when calling a function that has 3+ parameters can make your code far more readable.

Compare the two examples:

**`db.record("Mike", "Smith", 32, 1695, "Plummer Dr", 75087)`**

**`db.record(first="Mike", last="Smith", age=32, house_num=1695, street="Plummer Dr", zip=75087)`**
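
The sketch below stands in for the **`db.record()`** call above with a hypothetical, free-standing **`record()`** function (its body is invented for illustration). It shows that the positional and named call styles produce the same result, and that supplying the same parameter twice raises a **`TypeError`**.

```python
def record(first, last, age, house_num, street, zip):
  # Hypothetical body: simply format the values into one line of text
  return f"{first} {last} ({age}), {house_num} {street}, {zip}"

positional = record("Mike", "Smith", 32, 1695, "Plummer Dr", 75087)
named = record(zip=75087, street="Plummer Dr", house_num=1695, age=32, last="Smith", first="Mike")

print(positional)            # Mike Smith (32), 1695 Plummer Dr, 75087
print(positional == named)   # True

# "first" is already filled by position, so naming it again is an error
try:
  record("Mike", "Smith", 32, 1695, "Plummer Dr", 75087, first="Michael")
  duplicate_rejected = False
except TypeError:
  duplicate_rejected = True

print(duplicate_rejected)    # True
```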

##![Spark Logo Tiny](https://files.training.databricks.com/images/105/logo_spark_tiny.png) Arbitrary Arguments

You can define a function in Python that accepts an arbitrary number of arguments with syntax like this:

```
def my_func(*args):
  ...
```

The parameter name **`args`** is not required, but it is a common convention.

The **`args`** parameter is a tuple containing all of the positional arguments passed to the function.

In [0]:
# Note: naming this function sum replaces Python's built-in sum() in this notebook
def sum(*args):
  total = 0
  for value in args:
    total += value
  return total

sum_a = sum(1, 2, 3, 4, 5)
sum_b = sum(32, 123, -100, 9)
sum_c = sum(13)
sum_d = sum()

print(f"Example A: {sum_a}")
print(f"Example B: {sum_b}")
print(f"Example C: {sum_c}")
print(f"Example D: {sum_d}")
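
As a complement, the sketch below shows two related details: inside the function the arguments arrive as a tuple, and an existing list can be spread into separate arguments by prefixing it with **`*`** at the call site. The name **`total_of`** is hypothetical and was chosen to avoid clashing with Python's built-in **`sum()`**.

```python
def total_of(*args):
  print(f"Received {len(args)} values as a {type(args).__name__}")
  total = 0
  for value in args:
    total += value
  return total

total_a = total_of(1, 2, 3)      # prints: Received 3 values as a tuple

values = [10, 20, 30, 40]
total_b = total_of(*values)      # same as total_of(10, 20, 30, 40)

print(f"Example A: {total_a}")   # Example A: 6
print(f"Example B: {total_b}")   # Example B: 100
```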

##![Spark Logo Tiny](https://files.training.databricks.com/images/105/logo_spark_tiny.png) Arbitrary Keyword Arguments

A Python function can also accept arbitrary named arguments, which are referred to as _keyword arguments_, with syntax like this:

```
def my_func(**kwargs):
  ...
```

The parameter name `kwargs` is not required, but it is a common convention. 

The `kwargs` parameter is treated as a dictionary containing all of the argument names and values passed to the function.

We will talk more about dictionaries in the next lesson.

In [0]:
def my_func(**kwargs):
  print("Arguments received:")
  for key in kwargs:
    value = kwargs[key]
    print(f"  {key:15s} = {value}")
  print()

my_func(first_name="Jeff", last_name="Lebowski", drink="White Russian")

my_func(movie_title="The Big Lebowski", release_year=1998)
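
The reverse direction also works: an existing dictionary can be spread into keyword arguments by prefixing it with **`**`** at the call site. This is a sketch that leans on dictionaries before they are formally introduced, and the names **`describe`** and **`character`** are hypothetical.

```python
def describe(**kwargs):
  parts = []
  for key in kwargs:
    parts.append(f"{key}={kwargs[key]}")
  return ", ".join(parts)

character = {"first_name": "Jeff", "last_name": "Lebowski"}

# The dictionary entries and the extra named argument all end up in kwargs
summary = describe(**character, drink="White Russian")
print(summary)   # first_name=Jeff, last_name=Lebowski, drink=White Russian
```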

##![Spark Logo Tiny](https://files.training.databricks.com/images/105/logo_spark_tiny.png) Methods

In Python, a method refers to a special kind of function that is called on an object.

We will talk more about objects and classes in a later lesson, but for now, consider that:
* A function only operates on the parameters passed to it (generally).
* A method has parameters, but it also has additional state as defined by the object it is attached to.

For example, a string (**`str`**) is a type of object:

In [0]:
name = 'Databricks'
print(name)

The string object has a method called **`upper()`** which we can invoke.

In [0]:
result = name.upper()
print(result)

Note that the original object was not modified, but that the **`upper()`** method used the string to produce a new string.

In [0]:
print(name)

As with functions, certain methods expect an argument.

The string class also has a **`count(str)`** method that returns the total number of times the specified substring is found.

In this case, let's count how many times we find the letter **`a`**.

In [0]:
total = name.count('a')
print(f"I found {total} instances of that substring")
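
To make the relationship between methods and functions more concrete, the sketch below calls **`upper()`** in two equivalent ways: as a method on the string object, and as a function looked up on the **`str`** class with the object passed in explicitly. It also calls **`len()`**, a plain function that receives the string as an ordinary argument.

```python
name = 'Databricks'

via_method = name.upper()        # the object before the dot is passed implicitly
via_function = str.upper(name)   # the same code, with the object passed explicitly

print(via_method)                   # DATABRICKS
print(via_method == via_function)   # True

# len() is a function, not a method: the string is just an argument
print(len(name))                    # 10
```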

##![Spark Logo Tiny](https://files.training.databricks.com/images/105/logo_spark_tiny.png) Libraries

Python includes a rich set of libraries developed by an ever-growing community of software developers.

Some libraries are included with Python by default.

Others are external libraries that have to be "installed".

One such library is <a href="https://numpy.org/doc/stable" target="_blank">NumPy</a> - while a 3rd-party library, NumPy is pre-installed on the Databricks runtime.
  
To use any library, you need to first import it into the current namespace:

In [0]:
import numpy

numpy.sqrt(9)

As with other functions, we can use the **`help()`** function on **`numpy.sqrt`** to get more information.

In [0]:
help(numpy.sqrt)

You can also change the name of the library when you import it.

This can be handy when the code gets too verbose and abbreviating or renaming the imported object makes the code easier to read.

In [0]:
import numpy as np

np.sqrt(12)
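
A third style imports a single name from a library so that it can be used without any prefix. The sketch below uses the standard library's **`math`** module rather than NumPy, purely so that it runs on any Python installation.

```python
from math import sqrt

# No "math." prefix is needed because only sqrt was imported by name
root = sqrt(9)
print(root)   # 3.0
```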

And instead of importing names one at a time, you can import every public name from the NumPy package using the wildcard **`*`**.

Caution: importing too many things into the same namespace can create confusion.
* Are you using Python's built-in **`sum()`** function? Spark's **`sum()`**? NumPy's **`sum()`** function?

In [0]:
from numpy import *

# print each result; a notebook cell only displays its last expression
print(sqrt(12))
print(absolute(-123))
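
To make the caution about wildcard imports concrete, here is a small sketch. It uses the standard library's **`math`** module instead of NumPy (an assumption made so that the example runs without extra installs) and shows the built-in **`pow()`** being silently replaced by **`math.pow()`**.

```python
before = pow(2, 3)   # Python's built-in pow() returns an int here
print(before, type(before).__name__)   # 8 int

from math import *   # also imports math.pow, replacing the built-in name

after = pow(2, 3)    # math.pow() always returns a float
print(after, type(after).__name__)     # 8.0 float
```

The same call now behaves differently, and nothing in the code at the call site indicates which version is in use.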

&copy; 2020 Databricks, Inc. All rights reserved.<br/>
Apache, Apache Spark, Spark and the Spark logo are trademarks of the <a href="http://www.apache.org/">Apache Software Foundation</a>.<br/>