<div class="alert alert-block alert-info">
Author:<br>Felix Gonzalez, P.E. <br> Adjunct Instructor, <br> Division of Professional Studies <br> Computer Science and Electrical Engineering <br> University of Maryland Baltimore County <br> fgonzale@umbc.edu
</div>

This notebook provides an overview of creating custom functions in Python and how to use them. It also formally introduces modules, libraries, and packages, as well as how to explore and call the functions within them. One of the advantages of Python is that further functionality can be added by importing modules. As a program gets longer, Python allows you to split scripts and functions into several files and modules. This makes maintenance easier because functions and scripts can be called from separate modules without copying each one into every program. Further information can be found in the Python documentation at https://docs.python.org/3/tutorial/modules.html#standard-modules

The Anaconda Distribution is one of the Python distributions that bundles data-science-centric modules. The list of packages included in the Anaconda Distribution can be found at: https://docs.anaconda.com/anaconda/packages/pkg-docs/.

# Table of Contents
[Defining Custom Functions](#Defining-Custom-Functions)

[Python Modules](#Python-Modules)

[Importing Modules](#Importing-Modules)

[Installing Modules](#Installing-Modules)


# Defining Custom Functions
[Return to Table of Contents](#Table-of-Contents)

One of Python's features is the ability to define functions. Once defined, a function can be called and executed whenever its parameters are provided.

Documentation Reference:
- https://docs.python.org/3/tutorial/controlflow.html#defining-functions

In [1]:
# For example, say we want to calculate the tip given the bill amount.
# Previously we would do something like the following:
bill_amount = 20 # in dollars
tip_percentage = 15 # in percent

tip_dollars = bill_amount*(tip_percentage/100)
total_amount = bill_amount + tip_dollars

print(f'The bill of ${bill_amount} includes a tip of ${tip_dollars} for a total of ${total_amount}.')

The bill of $20 includes a tip of $3.0 for a total of $23.0.
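
The output above shows the amounts as `$3.0` and `$23.0`. As a minimal sketch, an f-string format specifier can display dollar amounts with two decimal places. The variable `message` is introduced here only for illustration.

```python
bill_amount = 20      # in dollars
tip_percentage = 15   # in percent

tip_dollars = bill_amount * (tip_percentage / 100)
total_amount = bill_amount + tip_dollars

# The ":.2f" format specifier shows each value with two decimal places.
message = (
    f"The bill of ${bill_amount:.2f} includes a tip of ${tip_dollars:.2f} "
    f"for a total of ${total_amount:.2f}."
)
print(message)
```

The calculation itself is unchanged; only the way the numbers are displayed differs.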


In [2]:
# Let's define a function with the parameters bill_amount and tip_percentage.
def bill_calculations(bill_amount, tip_percentage):

    tip_dollars = bill_amount*(tip_percentage/100)
    total_amount = bill_amount + tip_dollars

    print(f'The bill of ${bill_amount} includes a tip of ${tip_dollars} for a total of ${total_amount}.')

In [3]:
# We call the function with the given parameters.
bill_calculations(bill_amount = 50, tip_percentage = 15)

The bill of $50 includes a tip of $7.5 for a total of $57.5.


In [4]:
# We don't need to specify the parameter names as long as the arguments are given in the right number and order.
bill_calculations(40, 20)

The bill of $40 includes a tip of $8.0 for a total of $48.0.


In [5]:
# If we leave an argument blank before the comma we will get a syntax error.
bill_calculations(, 20)

SyntaxError: invalid syntax (2619297919.py, line 2)
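
Omitting an argument altogether, rather than leaving a gap before the comma, is valid syntax but raises a `TypeError` when the function is called. The sketch below shows that case, and how a default parameter value lets the caller skip an argument. The function names `total_with_tip` and `total_with_default_tip`, and the default of 15 percent, are assumptions made for this illustration.

```python
def total_with_tip(bill_amount, tip_percentage):
    """Return the bill total including the tip."""
    return bill_amount + bill_amount * (tip_percentage / 100)


# Calling the function with a missing argument raises a TypeError.
try:
    total_with_tip(20)
except TypeError as error:
    error_message = str(error)
    print(f"TypeError: {error_message}")


def total_with_default_tip(bill_amount, tip_percentage=15):
    """Same calculation, but tip_percentage falls back to 15 when omitted."""
    return bill_amount + bill_amount * (tip_percentage / 100)


default_total = total_with_default_tip(20)      # uses the default of 15 percent
custom_total = total_with_default_tip(20, 20)   # overrides the default
print(default_total, custom_total)
```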

In [6]:
# Note that if we access a variable used within the function from outside of it, we still get the result from the first example.
total_amount

23.0

In [7]:
# This is because variables assigned inside a function are local to it and are not available outside unless they are declared as global.
def bill_calculations(bill_amount, tip_percentage):
    # Declaring variables as global allows them to be used outside of the defined function.
    global tip_dollars, total_amount
    
    tip_dollars = bill_amount*(tip_percentage/100)
    total_amount = bill_amount + tip_dollars

    # The results are shared through the global variables, so the function only needs to print the message.
    print(f'The bill of ${bill_amount} includes a tip of ${tip_dollars} for a total of ${total_amount}.')

In [8]:
bill_calculations(bill_amount = 50, tip_percentage = 25)

The bill of $50 includes a tip of $12.5 for a total of $62.5.


In [9]:
# I can now access the variables assigned within the function.
total_amount

62.5
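
An alternative to global variables is to have the function return its results, so the caller decides where to store them. The following is a minimal sketch of that approach; the function name `compute_bill` and the variable names `tip` and `total` are hypothetical and chosen to avoid overwriting the variables used above.

```python
def compute_bill(bill_amount, tip_percentage):
    """Return the tip and the total as a tuple instead of using globals."""
    tip_dollars = bill_amount * (tip_percentage / 100)
    total_amount = bill_amount + tip_dollars
    return tip_dollars, total_amount


# The returned tuple is unpacked into two variables chosen by the caller.
tip, total = compute_bill(bill_amount=50, tip_percentage=25)
print(f"The bill of $50 includes a tip of ${tip} for a total of ${total}.")
```

Returning values keeps the function independent of names defined elsewhere in the notebook, which tends to make it easier to reuse and test.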

In [10]:
# If I want to access the input parameters outside of the function, there are two ways.
# The first is to define them as variables outside of the function.
bill_amount_input = 60
tip_percentage_input = 30

bill_calculations(bill_amount = bill_amount_input, tip_percentage = tip_percentage_input)

The bill of $60 includes a tip of $18.0 for a total of $78.0.


In [11]:
print(bill_amount_input)

60


In [12]:
# The second option is to assign the parameters to new variable names inside the function.
# These names must also be declared as global; otherwise they stay local to the function.
def bill_calculations(bill_amount, tip_percentage):
    # Declaring variables as global allows them to be used outside of the defined function.
    global tip_dollars, total_amount, bill_amount_input, tip_percentage_input
    
    # Assigning the input parameters to new variables so they can be accessed outside of the function.
    bill_amount_input = bill_amount
    tip_percentage_input = tip_percentage
    
    tip_dollars = bill_amount*(tip_percentage/100)
    total_amount = bill_amount + tip_dollars

    print(f'The bill of ${bill_amount} includes a tip of ${tip_dollars} for a total of ${total_amount}.')

In [13]:
bill_calculations(bill_amount = 70, tip_percentage = 35)
print(bill_amount_input) # I can now access the input variables.

The bill of $70 includes a tip of $24.5 for a total of $94.5.
70


In [14]:
# When designing a program it is important to start thinking in a modular fashion.
# Modular coding allows much easier maintenance, because functions can be reused without copying them into each program.
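
To illustrate the modular idea, the sketch below saves a function in its own file and imports it as a module. The module name `bill_tools`, the function `tip_amount`, and the use of a temporary folder are all assumptions made for this example; in practice the `.py` file would usually sit next to the notebook.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Source code of a small, hypothetical module named bill_tools.
module_source = '''
def tip_amount(bill_amount, tip_percentage):
    """Return the tip in dollars for a bill and a tip percentage."""
    return bill_amount * (tip_percentage / 100)
'''

# Write the module to a temporary folder so the sketch does not touch the working directory.
module_dir = Path(tempfile.mkdtemp())
(module_dir / "bill_tools.py").write_text(module_source)

# Python searches the folders listed in sys.path when it imports a module.
sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()

import bill_tools

print(bill_tools.tip_amount(40, 20))
```

Any other notebook or script that can see the same folder could import `bill_tools` in the same way, without copying the function.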

# Python Modules
[Return to Table of Contents](#Table-of-Contents)

Various Python modules are used for different activities within data science. The following list provides a few of the most used Python modules in data science:

- NumPy: NumPy offers comprehensive mathematical functions, random number generators, linear algebra routines, Fourier transforms, and more. It is designed to be interoperable with other libraries. https://numpy.org/doc/
- datetime: The datetime module supplies classes for manipulating dates and times. https://docs.python.org/3/library/datetime.html
- pandas: pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language. https://pandas.pydata.org/
- re: This module provides regular expression matching operations. https://docs.python.org/3/library/re.html
- Matplotlib: Data visualization library. https://matplotlib.org/
- scikit-learn: Machine learning library for predictive data analysis built on NumPy, SciPy, and Matplotlib. https://scikit-learn.org/stable/
- NLTK: Natural Language Processing library. https://www.nltk.org/

All of these and more are included in the Anaconda Distribution.
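
As a brief taste of two of the modules listed above, the following sketch uses `datetime` to count the days between two dates and `re` to pull the dollar amounts out of a sentence. The dates and the sample sentence are made up for illustration.

```python
import re
from datetime import date

# datetime: subtracting two dates gives a timedelta with a .days attribute.
start = date(2024, 1, 1)
end = date(2024, 3, 1)
days_between = (end - start).days
print(days_between)

# re: find every number that follows a dollar sign.
text = "The bill of $50 includes a tip of $7.5 for a total of $57.5."
amounts = re.findall(r"\$(\d+(?:\.\d+)?)", text)
print(amounts)
```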

# Importing Modules
[Return to Table of Contents](#Table-of-Contents)

Once a module is imported, the functions of the module can be used. Each module has specific capabilities, advantages, disadvantages, limitations, and functions with specific parameters. This is why it is important to understand the documentation of each module. Modules can also overlap and offer functions that do the same or similar things.
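
Besides reading the online documentation, a module can be explored from within Python. The sketch below uses the standard library `math` module as an example; the variable name `public_names` is only illustrative.

```python
import math

# dir() lists the names defined in a module; names starting with "_" are internal.
public_names = [name for name in dir(math) if not name.startswith("_")]
print(public_names[:5])

# Each function carries a short description in its docstring.
# help(math.sqrt) shows the same information in a longer form.
print(math.sqrt.__doc__)

# A single function can also be imported directly from the module.
from math import sqrt

print(sqrt(16))
```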

In [15]:
# Let's import the numpy module.
import numpy as np # To save time we may import modules using a short alias. In this case numpy as np.
# The library documentation will typically include best practices for that module.

In [16]:
# Let's take the square of each element of an array.
# An array is a type of data collection in numpy.
# An array is similar to a list, but all of its elements share one data type (typically numeric), which makes numerical operations faster than with a list.

a = np.linspace(start = 0, stop = 10, num = 21) # Returns evenly spaced numbers over a specified interval.
print("An array of numbers from 0 to 10")
print(a)

An array of numbers from 0 to 10
[ 0.   0.5  1.   1.5  2.   2.5  3.   3.5  4.   4.5  5.   5.5  6.   6.5
  7.   7.5  8.   8.5  9.   9.5 10. ]


In [17]:
for i in range(len(a)):
    a[i] = a[i] * a[i] # could have also used the pow() function.
print("====== Square of Numbers ========")    
print(a)

====== Square of Numbers ========
[  0.     0.25   1.     2.25   4.     6.25   9.    12.25  16.    20.25
  25.    30.25  36.    42.25  49.    56.25  64.    72.25  81.    90.25
 100.  ]
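
The loop above squares one element at a time. NumPy arrays also support element-wise operations directly, so the same result can be sketched without a loop. The variable names `squares` and `same_squares` are chosen only for this illustration.

```python
import numpy as np

a = np.linspace(start=0, stop=10, num=21)

# The ** operator is applied to every element of the array at once.
squares = a ** 2

# np.square() performs the same element-wise operation.
same_squares = np.square(a)

print(squares)
```

Unlike the loop, this version leaves the original array `a` unchanged and stores the result in a new array.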


# Installing Modules
[Return to Table of Contents](#Table-of-Contents)

Installing packages is generally easy using conda on the command line. To open the command line, search for the application "Anaconda Prompt" (i.e., the command line, also called the terminal).

You can use conda to install a package from the Anaconda package index with the following line:
- "conda install name_of_package"
    
Another method is to use Python's pip package manager. A package can be installed with pip from a Jupyter Notebook code cell using the following line of code:
- "!pip install name_of_package"

pip can also be used in the Anaconda Prompt command line. Conda installs packages from the Anaconda repository, while pip installs them from the Python Package Index (PyPI). We will talk about this more later.
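
When several Python installations exist on one machine, it can help to run pip through the same interpreter that the notebook is using. The sketch below only builds that command and prints it; the helper name `pip_install_command` is hypothetical, and the line that would actually run the installation is left commented out.

```python
import sys


def pip_install_command(package_name):
    """Build a pip command that targets the interpreter running this notebook."""
    # sys.executable is the path of the Python interpreter currently in use.
    return [sys.executable, "-m", "pip", "install", package_name]


command = pip_install_command("names")
print(" ".join(command))

# To actually run the installation, uncomment the two lines below.
# import subprocess
# subprocess.run(command, check=True)
```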

<b>Example</b>

Let's use the "names" library as an example. To install it, uncomment and run the cell below containing "!pip install names". The names module documentation is at: https://pypi.org/project/names/. After installing, comment out the line again; there is no need to rerun "!pip install names". Note that an asterisk (*) next to the cell means that the cell is running.

In [18]:
# To install the "names" module from the Jupyter Notebook interface uncomment the line below.
#!pip install names 

# Replace the "names" with the name of the library you want to install.

In [19]:
import names # After installation, the only thing that needs to be done is importing the module.

In [20]:
print(names.__version__) # After importing I can check the version of a module.
# There are other ways to check the version.

0.3.0
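
Not every module defines a `__version__` attribute. As one of the other ways to check a version, the standard library `importlib.metadata` module (Python 3.8 and later) reads the version recorded when the package was installed. This is a minimal sketch; the helper name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Prints the version string if "names" is installed, otherwise None.
print(installed_version("names"))
```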


In [21]:
# After importing I can use functions from the names library.
# The get_full_name() function gets a random full name.
names.get_full_name()

'Pauline Armstrong'

# NOTEBOOK END