<br>
<p style="text-align: left;"><img src='https://s3.amazonaws.com/weclouddata/images/logos/sunlife_logo.png' width='35%'></p>
<p style="text-align:left;"><font size='15'><b> Python Functions & Modules</b></font> <br><font color='#FC7307' size=6>Student Copy</font> </p>
<h2 align='left' > Sunlife Data Science Training </h2>

<h4 align='left'>  Prepared by: <img src='https://s3.amazonaws.com/weclouddata/images/logos/wcd_logo.png' width='15%'>

---

# <font color='#347B98'> 1. Functions 

In Python, a `function` is a group of related statements that performs a specific task.

- Functions help break our program into smaller and modular chunks. As our program grows larger and larger, functions make it more **organized and manageable**.

- Functions also **avoid repetition** and make code **reusable**.

    
__Syntax of functions__:

    def function_name(parameters):
        """docstring"""
        statement(s)
    
    

___Function components___:
- The keyword `def` marks the start of the function header.
- A `function name` uniquely identifies the function. Function names follow the same rules as other Python identifiers.
- `Parameters` receive the values (arguments) that we pass to a function. They are optional.
- A `colon (:)` marks the end of the function header.
- An optional documentation string (`docstring`) describes what the function does.
- One or more valid Python statements make up the `function body`. Statements in the body must share the same indentation level (usually 4 spaces).
- An optional `return` statement returns a value from the function; without it, the function returns `None`.
    
</font> 

## $\Delta$ 1.1 Define/Call a Function


### Functions - <font color='#FC7307'> Example 1

In [1]:
# Define a function

def absolute_value(num):
    """This function returns the absolute
    value of the entered number"""

    if num >= 0:
        return num
    else:
        return -num


In [2]:
# Print the docstring
print(absolute_value.__doc__)

This function returns the absolute
    value of the entered number


In [3]:
# Function calls
print(absolute_value(2))  # Output: 2

2


In [4]:
print(absolute_value(-4))  # Output: 4

4


### Functions - <font color='#FC7307'> Example 2

In [6]:
### Data Retrieval

import urllib.request
from bs4 import BeautifulSoup
import re

link = 'https://triangle.canadiantire.ca/en/rewards-cards.html'
link_html = urllib.request.urlopen(urllib.request.Request(link)).read()
soup_obj = BeautifulSoup(link_html, 'html.parser')

texts = soup_obj.find_all(string=True)  # every text node in the page

def visible(element):
    if element.parent.name in ['style', 'script', '[document]']:
        return False
    elif re.match('<!--.*-->', str(element)):
        return False
    elif re.match('\n', str(element)): 
        return False
    return True

#visible_texts = filter(visible, texts)
visible_texts = [text for text in texts if visible(text)]

triangle_page = ' '.join(visible_texts)

In [7]:
# Define a function

def parse_dollars(text):
    """ The parse_dollars function takes string as input
    and extracts all dollar strings from the text
    """
    
    import re
    
    dollars = re.findall(r"(\$\d+[Kk]?((,\d+)*([.]\d+)?)?)", text)
    return dollars

In [8]:
parse_dollars.__doc__

' The parse_dollars function takes string as input\n    and extracts all dollar strings from the text\n    '

In [12]:
# Call the parse_dollars function on the page text
parsed = parse_dollars(triangle_page)


In [13]:
parsed

[('$12,000', ',000', ',000', ''),
 ('$100', '', '', ''),
 ('$12,000', ',000', ',000', ''),
 ('$4', '', '', ''),
 ('$25', '', '', ''),
 ('$2', '', '', ''),
 ('$10', '', '', ''),
 ('$200', '', '', ''),
 ('$10', '', '', ''),
 ('$10', '', '', ''),
 ('$10', '', '', ''),
 ('$10', '', '', ''),
 ('$100', '', '', ''),
 ('$500', '', '', ''),
 ('$1000', '', '', ''),
 ('$2000', '', '', ''),
 ('$1.64', '.64', '', '.64'),
 ('$8.22', '.22', '', '.22'),
 ('$16.43', '.43', '', '.43'),
 ('$32.86', '.86', '', '.86')]

In [14]:
print([x[0] for x in parsed])

['$12,000', '$100', '$12,000', '$4', '$25', '$2', '$10', '$200', '$10', '$10', '$10', '$10', '$100', '$500', '$1000', '$2000', '$1.64', '$8.22', '$16.43', '$32.86']
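
Because the pattern in `parse_dollars` contains capturing groups, `re.findall` returns one tuple per match, which is why the first element of each tuple is picked out above. A minimal sketch of an alternative, assuming non-capturing groups `(?:...)` are acceptable for this pattern; the name `parse_dollars_flat` and the sample sentence are made up for illustration:

```python
import re

def parse_dollars_flat(text):
    """Return every dollar amount found in text as a plain string."""
    # (?:...) groups the sub-patterns without capturing them,
    # so findall returns the whole match instead of a tuple of groups
    return re.findall(r"\$\d+[Kk]?(?:(?:,\d+)*(?:[.]\d+)?)?", text)

sample = "Earn $10 back, up to $12,000 a year, or pay $1.64 per day."
print(parse_dollars_flat(sample))   # ['$10', '$12,000', '$1.64']
```

With this variant the list comprehension `[x[0] for x in parsed]` is no longer needed.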


<br> 

## $\Delta$ 1.2 Scope and lifetime of variables


The `scope` of a variable is the portion of a program where the variable is recognized.   
> `Parameters and variables defined inside a function are not visible from outside it`. Hence, they have a **`local scope`**.

The `lifetime` of a variable is the period during which the variable exists in memory. 
> The `lifetime` of variables inside a function lasts only as long as the function executes.
They are destroyed once we return from the function. Hence, `a function does not remember the value of a variable from its previous calls`.


In [16]:
def my_func():
    x = 10
    print("Value inside function:",x)

x = 20

my_func()

print("Value outside function:",x)


Value inside function: 10
Value outside function: 20


20
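
To make the two ideas concrete, here is a small sketch (the names `local_value` and `count_calls` are hypothetical). It shows that a local variable cannot be reached from outside its function, and that a local variable starts fresh on every call:

```python
def my_func():
    local_value = 10          # exists only while my_func is running
    return local_value

print(my_func())              # 10

try:
    print(local_value)        # the name is not defined outside my_func
except NameError as err:
    print("NameError:", err)

def count_calls():
    calls = 0                 # re-created on every call
    calls += 1
    return calls

print(count_calls(), count_calls())   # 1 1
```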

###  <font color='#559E54'>$\Omega$  Python Function Lab 1 </font>


**Question: Implement the `add_product()` and `cancel_product()` functions for an insurance shopping cart app**

1. `add_product()`
2. `cancel_product()`

**Data**


In [23]:
my_slf_products = ['RRSP', 'TFSA', 'Life']

#### Q1. Write an `add_product` function

In [24]:
#################
# Your Code Here
#################

def add_product(, ):
    
    return 

In [25]:
add_product(my_slf_products, 'AD&D')

['RRSP', 'TFSA', 'Life', 'AD&D']

In [27]:
print(my_slf_products)

['RRSP', 'TFSA', 'Life', 'AD&D']


#### Q2. Write a `cancel_product` function

In [28]:
add_product(my_slf_products, 'RESP')

['RRSP', 'TFSA', 'Life', 'AD&D', 'RESP']

In [29]:
#################
# Your Code Here
#################

def cancel_product(, ):

    return 


In [30]:
cancel_product(my_slf_products, 'RRSP')

['TFSA', 'Life', 'AD&D', 'RESP']

In [31]:
print(my_slf_products)

['TFSA', 'Life', 'AD&D', 'RESP']


---

## $\Delta$ 1.3 Function Arguments


### Default arguments

* In the `greet` function below, the parameter `name` does not have a default value, so it is required (mandatory) in every call.
* The parameter `msg`, on the other hand, has a default value of "Good morning!", so it is optional in a call. If a value is provided, it overrides the default.
* Any number of parameters in a function can have a default value. `But once a parameter has a default value, all the parameters to its right must also have default values`.



In [32]:
def greet(name, msg = "Good morning!"):
    """
    This function greets
    the person with the
    provided message.

    If message is not provided,
    it defaults to "Good
    morning!"
    """

    print("Hello",name + ', ' + msg)

greet("Jeff")
greet("Jeff","How are you doing?")


Hello Jeff, Good morning!
Hello Jeff, How are you doing?
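
The ordering rule for default values can be checked directly. The sketch below uses a variant of `greet` that returns the text instead of printing it (an assumption made so the result is easy to inspect), calls it with keyword arguments, and then asks Python to compile a definition that puts a required parameter after a default one:

```python
def greet(name, msg="Good morning!"):
    """Return the greeting instead of printing it."""
    return "Hello " + name + ", " + msg

print(greet("Jeff"))                              # default msg is used
print(greet(name="Jeff", msg="How are you?"))     # keyword arguments

# A parameter without a default after one with a default is rejected
# when the definition is compiled, before any call happens.
try:
    compile('def bad(msg="Hi", name): pass', "<example>", "exec")
except SyntaxError as err:
    print("SyntaxError:", err.msg)
```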


### Python Arbitrary Arguments
Sometimes we do not know in advance how many arguments will be passed into a function. Python handles this situation by letting a function accept an arbitrary number of arguments.

In the function definition we use an asterisk `(*)` before the parameter name to denote this kind of argument.

In [33]:
def greet(*names):
    """This function greets all
    people in the names tuple."""

    # names is a tuple with arguments
    for name in names:
        print("Hello",name)

greet("Monica","Luke","Steve","John")

Hello Monica
Hello Luke
Hello Steve
Hello John
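
The asterisk also works in the other direction, at the call site. As a small sketch (the function `greet_all` and the list `team` are hypothetical), a list can be unpacked into separate arguments, and calling with no arguments simply yields an empty tuple:

```python
def greet_all(*names):
    """Return one greeting per name; names arrives as a tuple."""
    print(type(names).__name__, len(names))
    return ["Hello " + name for name in names]

team = ["Monica", "Luke"]
print(greet_all(*team))   # the * unpacks the list into two arguments
print(greet_all())        # no arguments: names is an empty tuple
```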


###  <font color='#559E54'>$\Omega$  Python Function Lab 2 </font>

**Question: Implement the `add_multiple_products()` and `cancel_multiple_products()` functions for an insurance shopping cart app**

1. `add_multiple_products()`
2. `cancel_multiple_products()`

**Note**
- Use arbitrary arguments to add or cancel multiple products in one call

In [42]:
#################
# Your Code Here
#################

def add_multiple_products(, *products):

    
    return 

def cancel_multiple_products(, *products):

    
    return 

In [43]:
my_slf_products = ['RRSP', 'TFSA', 'Life']

In [44]:
add_multiple_products(my_slf_products, 'AD&D', 'RESP')

In [45]:
cancel_multiple_products(my_slf_products, 'RRSP', 'RESP')

In [46]:
my_slf_products

['TFSA', 'Life', 'AD&D']

---
## $\Delta$ 1.4 Built-in Functions

**Reference**
> https://docs.python.org/3/library/functions.html

<img src='https://s3.amazonaws.com/weclouddata/images/python/pythonb_builtin_funcs.png' width='50%'>

### Built-in `enumerate()`

In [116]:
# Iterate over indices and items of a list
alist = ['a1', 'a2', 'a3']

for i, a in enumerate(alist):
    print(i, a)

0 a1
1 a2
2 a3


### Built-in `zip()`

In [118]:
# Iterate over two lists in parallel
alist = ['a1', 'a2', 'a3']
blist = ['b1', 'b2', 'b3']

for a, b in zip(alist, blist):
    print(a, b)

a1 b1
a2 b2
a3 b3
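
One detail worth knowing when iterating in parallel: `zip` stops at the end of the shortest input. The sketch below shortens `blist` on purpose (an assumption for illustration) and compares `zip` with `itertools.zip_longest`, which pads the missing values instead:

```python
from itertools import zip_longest

alist = ['a1', 'a2', 'a3']
blist = ['b1', 'b2']          # deliberately one element shorter

print(list(zip(alist, blist)))
# [('a1', 'b1'), ('a2', 'b2')]

print(list(zip_longest(alist, blist, fillvalue=None)))
# [('a1', 'b1'), ('a2', 'b2'), ('a3', None)]
```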


In [119]:
# Enumerate with zip
alist = ['a1', 'a2', 'a3']
blist = ['b1', 'b2', 'b3']

for i, (a, b) in enumerate(zip(alist, blist)):
    print(i, a, b)

0 a1 b1
1 a2 b2
2 a3 b3


In [123]:
# Alternative: itertools.count() supplies the index without nested unpacking
from itertools import count
alist = ['a1', 'a2', 'a3']
blist = ['b1', 'b2', 'b3']

for i, a, b in zip(count(), alist, blist):
    print(i, a, b)

0 a1 b1
1 a2 b2
2 a3 b3


---
## $\Delta$ 1.5 Anonymous Function

### $\delta$ About functional programming
> In computer science, functional programming is a programming paradigm—a style of building the structure and elements of computer programs—that treats computation as the evaluation of mathematical functions and avoids changing-state and mutable data. It is a declarative programming paradigm, which means programming is done with expressions or declarations instead of statements.

Most of us learn Python as an **Object Oriented** language, using classes to build programs. The code is written in an **imperative** (procedural) manner, meaning we tell the program **how** to do something, usually in detailed steps. **Functional Programming**, on the other hand, is done in a **declarative** manner, meaning we tell the program **what** needs to be done.

Python is a **general purpose** language that supports multiple paradigms, so the same task can be written in more than one style. As a data scientist, you may find that some things are more easily accomplished using functional programming techniques. Some libraries also incorporate these features. The biggest use case is PySpark, where most of the code follows a functional paradigm. This is because Spark is written in **Scala**, a language that runs on the **Java** Virtual Machine and has strong support for functional programming. Parallel computing code is usually easier and safer to write this way because it doesn't rely on sequentially manipulating shared data.

For a more detailed look at Python's functional programming features, visit their docs at: https://docs.python.org/3/howto/functional.html

### $\omega$ 1.5.1 Lambda Functions

A **lambda** (or **anonymous**) function is a function that's not bound to a name. In Python, functions are objects, so they can be passed into other functions as arguments. It's often convenient to define the function inline, right where it is passed, if it's simple and only needs to be used once.

In [124]:
# regular functions
def apply_to_ten(func):
    return func(10)

def add_one(x):
    return x + 1

apply_to_ten(add_one)

11

In [125]:
# lambda function
apply_to_ten(lambda x: x + 1)

11
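
A frequent everyday use of this pattern is the `key` argument of the built-in `sorted`. In the sketch below the product counts are made-up values for illustration; the lambda tells `sorted` which part of each tuple to compare:

```python
# (product, number of holders): hypothetical figures
products = [("RRSP", 3), ("TFSA", 1), ("Life", 2)]

# Sort by the second element of each tuple
by_count = sorted(products, key=lambda item: item[1])
print(by_count)   # [('TFSA', 1), ('Life', 2), ('RRSP', 3)]
```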

### $\omega$ 1.5.2 `map()` Functions

A common pattern in Python is looping over a collection of items. This is done in a **procedural** way (defining **how** to accomplish the task). For example, if we want to square all the numbers in a list, the steps would be:
- Create an empty list `squared`
- Loop over the list of numbers
- Square each number
- Append the result to `squared`
- Return `squared`

To accomplish the same task using functional programming, we use a **declarative** approach (defining **what** the results should look like). In this case:
- Return a list with a `squarer` function applied to each element

In [19]:
lst = range(10)

In [20]:
def squarer(x):
    return x**2

list(map(squarer, lst))

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

In [23]:
square = map(squarer, lst)

In [26]:
list(lst)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [30]:
# Each call returns the next square; the value shown depends on how many times the cell has run
square.__next__()

16

In [129]:
# With a lambda function
list(map(lambda x: x**2, lst))

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

**Note**
> In functional programming there is the idea of **lazy evaluation**: values are not computed until they are needed. In Python, a `map object` is an iterator that works this way. We use `list` to compute all of the elements at once. We can demonstrate laziness by evaluating one element at a time using the `__next__` method, which is normally called through the built-in `next()`. All iterators have this method.

In [130]:
squared_map = map(lambda x: x**2, lst)

squared_map

<map at 0x117f02860>

In [134]:
# Run multiple times to compute each element
squared_map.__next__()

9
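
The same behaviour can be seen from start to finish with the built-in `next()`, which calls `__next__` for us. This sketch uses a short range so the iterator runs out quickly; once exhausted, an iterator raises `StopIteration` and yields nothing more:

```python
squares = map(lambda x: x**2, range(3))

print(next(squares))   # 0
print(next(squares))   # 1
print(next(squares))   # 4

try:
    next(squares)      # nothing left to compute
except StopIteration:
    print("iterator exhausted")

print(list(squares))   # [] because the elements were already consumed
```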

### $\omega$ 1.5.3 `filter()` Functions
Another common pattern is filtering a list. **Procedurally** the steps are:
- Create an empty list, `filtered`
- Loop through every number
- If the number is greater than six,
- Append the number to `filtered`
- Return `filtered`

In [135]:
lst = range(10)

filtered = []
for x in lst:
    if x > 6:
        filtered.append(x)

filtered

[7, 8, 9]

In [136]:
# As a list comprehension
lst = range(10)

[x for x in lst if x > 6]

[7, 8, 9]

In [137]:
# filter()
lst = range(10)

list(filter(lambda x: x > 6, lst))

[7, 8, 9]

### $\omega$ 1.5.4 `reduce()` Functions
Another common pattern in Python is accumulating many values into one value. For example, to compute a sum, we could do:
- Create an accumulator (`summed`) set to zero
- For every number in the list,
- Replace the value of `summed` with the previous value of `summed` plus the number
- Return `summed`

In [64]:
summed = 0
for x in lst:
    summed += x

summed

45

Using functional programming, we first import `reduce` from the standard library's `functools` module. `reduce` works similarly to the loop above. It applies a function of two parameters to the elements one at a time: the first parameter is the accumulator and the second is the current element. In this case, `x` is the accumulator and `y` is the current element, so `x + y` adds each element `y` to the running total `x` at every step.

In [65]:
from functools import reduce

reduce(lambda x, y: x + y, lst)

45
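
`reduce` also accepts an optional third argument, the starting value of the accumulator. A short sketch of why that can matter: with a starting value the call still works on an empty sequence, and the same pattern covers other accumulations such as a product:

```python
from functools import reduce

lst = range(10)

total = reduce(lambda x, y: x + y, lst, 0)             # start from 0
product = reduce(lambda x, y: x * y, range(1, 6), 1)   # 1*2*3*4*5
empty_total = reduce(lambda x, y: x + y, [], 0)        # no elements at all

print(total, product, empty_total)   # 45 120 0
```

Without the starting value, `reduce` raises a `TypeError` when the sequence is empty.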

---
# <font color='#347B98'> 2. Modules

### $\delta$ Script vs Module

**Script**
- If you quit from the Python interpreter and enter it again, the definitions you have made (functions and variables) are lost. Therefore, if you want to write a somewhat longer program, you are better off using a text editor to prepare the input for the interpreter and running it with that file as input instead. This is known as creating a **`script`**. 

**Module**
- As your program gets longer, you may want to split it into several files for easier maintenance. You may also want to use a handy function that you’ve written in several programs without copying its definition into each program. To support this, Python has a way to put definitions in a file and use them in a script or in an interactive instance of the interpreter. Such a file is called a **`module`**; `definitions from a module can be imported into other modules or into the main module` (the collection of variables that you have access to in a script executed at the top level and in calculator mode).

A module is a file containing Python definitions and statements. The file name is the module name with the suffix `.py` appended. Within a module, the module's name (as a string) is available as the value of the global variable `__name__`. For instance, you could use your favorite text editor to create a file called `fibo.py` in the current directory and then import it as the module `fibo`.
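
The contents of `fibo.py` are not shown in this notebook. As a minimal sketch, such a file could hold two small Fibonacci functions, along the lines of the example in the official Python tutorial; treat the body below as an illustration rather than the file used in class:

```python
# Possible contents of fibo.py (illustrative)

def fib(n):
    """Print the Fibonacci numbers below n."""
    a, b = 0, 1
    while a < n:
        print(a, end=' ')
        a, b = b, a + b
    print()

def fib2(n):
    """Return a list of the Fibonacci numbers below n."""
    result = []
    a, b = 0, 1
    while a < n:
        result.append(a)
        a, b = b, a + b
    return result

# __name__ is "__main__" when the file is run directly,
# and the module name ("fibo") when the file is imported.
if __name__ == "__main__":
    fib(50)
```

After saving the file, `import fibo` would make the functions available as `fibo.fib` and `fibo.fib2`.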

> Beyond the built-in definitions, the standard Python distribution includes perhaps tens of thousands of other values, functions, and classes that are organized in additional libraries, known as **modules**, that can be imported from within a program.
* As an example, we consider the math module. While the **`built-in namespace`** includes a few mathematical functions (e.g., abs, min, max, round), many more are relegated to the **`math module (e.g., sin, cos, sqrt)`**. That module also defines approximate values for the mathematical constants, pi and e.
* Python’s **`import statement` loads definitions from a module into the current namespace**. One form of an import statement uses a syntax such as the following:
**`from math import pi, sqrt`**

<img src="https://s3.amazonaws.com/weclouddata/images/python/python_built_in_modules.png" align="center" width="500">



## $\Delta$ 2.1 Create a new module
> * To create a new module, one simply has to put the relevant definitions in a file named with a .py suffix. Those definitions can be imported from any other .py file within the same project directory.
* It is worth noting that top-level commands within the module source code are executed when the module is first imported, almost as if the module were its own script.
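
The point about top-level commands can be observed with a throwaway module. In this sketch the module name `demo_toplevel` and the temporary directory are assumptions made for the demonstration; the module prints a message at import time, and a second `import` does not print again because the module is already cached in `sys.modules`:

```python
import importlib
import os
import sys
import tempfile

# Write a tiny module into a temporary directory
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "demo_toplevel.py"), "w") as f:
    f.write("print('top-level code is running')\n")
    f.write("VALUE = 42\n")

sys.path.append(module_dir)       # make the directory importable
importlib.invalidate_caches()

import demo_toplevel              # runs the print statement
import demo_toplevel              # cached: nothing is printed this time
print(demo_toplevel.VALUE)        # 42

importlib.reload(demo_toplevel)   # re-executes the module, prints again
```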

### $\delta$ Use the `%%file` cell magic to save code to a script

In [47]:
%ls -l

total 275428
-rw-r--r-- 1 sasdemo sas    131072 May  1 11:38 both.sas7bdat
-rw-r--r-- 1 sasdemo sas     44267 May  2 22:17 ds_2_data_api_twitter_azure_sentiment_instructor.ipynb
-rw-r--r-- 1 sasdemo sas     13875 May  2 22:17 ds_2_data_api_twitter_azure_sentiment_student.ipynb
-rw-r--r-- 1 sasdemo sas  13818084 May  2 22:17 ds_4_python_data_visualization_instructor.ipynb
-rw-r--r-- 1 sasdemo sas  13797073 May  2 22:17 ds_4_python_data_visualization_student.ipynb
-rw-r--r-- 1 sasdemo sas    223135 May  2 22:17 ds_5_numpy_instructor.ipynb
-rw-r--r-- 1 sasdemo sas   1174100 May  2 22:17 ds_6_pandas_1_instructor.ipynb
-rw-r--r-- 1 sasdemo sas    888502 May  2 22:17 ds_6_pandas_1_student.ipynb
-rw-r--r-- 1 sasdemo sas    279617 May  2 22:17 ds_7_pandas_2_instructor.ipynb
-rw-r--r-- 1 sasdemo sas    244984 May  2 22:17 ds_7_pandas_2_student.ipynb
-rw-r--r-- 1 sasdemo sas    540747 May  2 22:17 ds_8_pandas_3_instructor.ipynb
-rw-r--r-- 1 sasdemo sas    210252 May  2 22:17 ds_8_pandas_3_studen

In [48]:
%%file insure.py

def add_product(myproducts, product):
    """ Purchase new products"""
    myproducts.append(product)
    return myproducts

def cancel_product(myproducts, product):
    """ cancel existing products"""
    myproducts.remove(product)
    return myproducts


Writing insure.py


In [49]:
%ls -l

total 275436
-rw-r--r-- 1 sasdemo sas    131072 May  1 11:38 both.sas7bdat
-rw-r--r-- 1 sasdemo sas     44267 May  2 22:17 ds_2_data_api_twitter_azure_sentiment_instructor.ipynb
-rw-r--r-- 1 sasdemo sas     13875 May  2 22:17 ds_2_data_api_twitter_azure_sentiment_student.ipynb
-rw-r--r-- 1 sasdemo sas  13818084 May  2 22:17 ds_4_python_data_visualization_instructor.ipynb
-rw-r--r-- 1 sasdemo sas  13797073 May  2 22:17 ds_4_python_data_visualization_student.ipynb
-rw-r--r-- 1 sasdemo sas    223135 May  2 22:17 ds_5_numpy_instructor.ipynb
-rw-r--r-- 1 sasdemo sas   1174100 May  2 22:17 ds_6_pandas_1_instructor.ipynb
-rw-r--r-- 1 sasdemo sas    888502 May  2 22:17 ds_6_pandas_1_student.ipynb
-rw-r--r-- 1 sasdemo sas    279617 May  2 22:17 ds_7_pandas_2_instructor.ipynb
-rw-r--r-- 1 sasdemo sas    244984 May  2 22:17 ds_7_pandas_2_student.ipynb
-rw-r--r-- 1 sasdemo sas    540747 May  2 22:17 ds_8_pandas_3_instructor.ipynb
-rw-r--r-- 1 sasdemo sas    210252 May  2 22:17 ds_8_pandas_3_studen

### $\delta$ Use the `%load` magic to load code from a script
> Uncomment the line below and run

In [50]:
# %load insure.py

def add_product(myproducts, product):
    """ Purchase new products"""
    myproducts.append(product)
    return myproducts

def cancel_product(myproducts, product):
    """ cancel existing products"""
    myproducts.remove(product)
    return myproducts


### $\delta$ Get names defined in a module

In [51]:
import insure

In [52]:
dir(insure)

['__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 'add_product',
 'cancel_product']

## $\Delta$ 2.2 Import a module

We can import the definitions inside a module to another module or the interactive interpreter in Python.

- We use the `import` keyword to do the import
- We can import a module using the `import` statement and **`access the definitions`** inside it using the `dot operator` 

### Import module - <font color='#FC7307'> Example 1

In [53]:
import math
dir(math)

['__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 'acos',
 'acosh',
 'asin',
 'asinh',
 'atan',
 'atan2',
 'atanh',
 'ceil',
 'copysign',
 'cos',
 'cosh',
 'degrees',
 'e',
 'erf',
 'erfc',
 'exp',
 'expm1',
 'fabs',
 'factorial',
 'floor',
 'fmod',
 'frexp',
 'fsum',
 'gamma',
 'gcd',
 'hypot',
 'inf',
 'isclose',
 'isfinite',
 'isinf',
 'isnan',
 'ldexp',
 'lgamma',
 'log',
 'log10',
 'log1p',
 'log2',
 'modf',
 'nan',
 'pi',
 'pow',
 'radians',
 'sin',
 'sinh',
 'sqrt',
 'tan',
 'tanh',
 'trunc']

In [54]:
# import statement example
# to import standard module math

import math
print("The value of pi is", math.pi)

The value of pi is 3.141592653589793


### Import module - <font color='#FC7307'> Example 2

In [55]:
# import module by renaming it

import math as m
print("The value of pi is", m.pi)

The value of pi is 3.141592653589793


### Import module - <font color='#FC7307'> Example 3

> We can import specific names from a module without importing the module as a whole.

In [56]:
# import only pi from math module

from math import pi
print("The value of pi is", pi)

The value of pi is 3.141592653589793


In [57]:
# import all names from the standard module math

from math import *
print("The value of pi is", pi)

The value of pi is 3.141592653589793
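
A wildcard import copies every public name from the module into the current namespace, which can silently replace names that already exist. As a small sketch of the effect, `math` defines its own `pow`, which differs from the built-in one; the `exec` call into a separate dictionary is only there to inspect what `from math import *` would bind:

```python
import builtins
import math

print(builtins.pow(2, 3))   # built-in pow keeps integers: 8
print(math.pow(2, 3))       # math.pow always returns a float: 8.0

# Run the wildcard import in an isolated namespace and look at "pow"
namespace = {}
exec("from math import *", namespace)
print(namespace["pow"] is math.pow)   # True: pow now refers to math.pow
```

Importing only the names that are needed, as in `from math import pi`, avoids this kind of surprise.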


### $\delta$ <font color='#FC7307'> Python Module Search Path

When importing a module, Python looks in several places. The interpreter first looks for a `built-in` module with that name and then, if none is found, searches the list of directories in **`sys.path`**, which is built from the following locations in this order:

- The current directory.
- PYTHONPATH (an environment variable with a list of directories).
- The installation-dependent default directory.

In [58]:
import sys
print(sys.path)

['', '/usr/lib64/python35.zip', '/usr/lib64/python3.5', '/usr/lib64/python3.5/plat-linux', '/usr/lib64/python3.5/lib-dynload', '/folders/myfolders/.local/lib/python3.5/site-packages', '/usr/lib64/python3.5/site-packages', '/usr/lib64/python3.5/site-packages/IPython/extensions', '/folders/myfolders/.ipython']


### Import module from different directory

In [2]:
### Save insure_test.py module to a different folder

In [69]:
!mkdir ../test

In [71]:
%%file ../test/insure_test.py

def add_product(myproducts, product):
    """ Purchase new products"""
    myproducts.append(product)
    return myproducts

def cancel_product(myproducts, product):
    """ cancel existing products"""
    myproducts.remove(product)
    return myproducts


Overwriting ../test/insure_test.py


In [72]:
import insure_test  # fails with "No module named 'insure_test'" because ../test is not on sys.path yet

In [73]:
import sys
print(sys.path)
sys.path.append('../test')  # include module path

['', '/usr/lib64/python35.zip', '/usr/lib64/python3.5', '/usr/lib64/python3.5/plat-linux', '/usr/lib64/python3.5/lib-dynload', '/folders/myfolders/.local/lib/python3.5/site-packages', '/usr/lib64/python3.5/site-packages', '/usr/lib64/python3.5/site-packages/IPython/extensions', '/folders/myfolders/.ipython']


In [74]:
print(sys.path)

['', '/usr/lib64/python35.zip', '/usr/lib64/python3.5', '/usr/lib64/python3.5/plat-linux', '/usr/lib64/python3.5/lib-dynload', '/folders/myfolders/.local/lib/python3.5/site-packages', '/usr/lib64/python3.5/site-packages', '/usr/lib64/python3.5/site-packages/IPython/extensions', '/folders/myfolders/.ipython', '../test']


In [76]:
import insure_test 

> After appending the path, the module can be found

---
##  <font color='#559E54'>$\Delta$ 2.3 Python Module Labs </font>

**Question: Import the `insure.py` module we created in the previous steps and build a `my_slf_products` list by calling the `add_product()` function**

1. Create an empty `my_slf_products` list
2. Add several products to the list, such as 'RRSP', 'Life', 'RESP'.
3. Remove 'RRSP' from the products the customer owns

In [77]:
import insure

In [78]:
import importlib
importlib.reload(insure)

<module 'insure' from '/folders/myfolders/sunlife/insure.py'>

In [88]:
my_slf_products = []

###########################
## Your Code Below
###########################





['RRSP', 'Life', 'RESP']

In [89]:
print('There\'re {} products in my account: {}'.format(len(my_slf_products), my_slf_products))

There're 3 products in my account: ['RRSP', 'Life', 'RESP']


In [90]:
insure.cancel_product(my_slf_products, 'RRSP')

['Life', 'RESP']

In [91]:
print('There\'re {} products in my account: {}'.format(len(my_slf_products), my_slf_products))

There're 2 products in my account: ['Life', 'RESP']


---

# <font color='#347B98'> 3. Package

We don't usually store all of the files on our computer in the same location. We use a well-organized hierarchy of directories for easier access.

Similar files are kept in the same directory, for example, we may keep all the songs in the "music" directory. 
> Python has `packages for directories` and `modules for files`.

As our application program grows larger in size with a lot of modules, we place similar modules in one package and different modules in different packages. This makes a project (program) easy to manage and conceptually clear.

Similarly, just as a directory can contain sub-directories and files,
> a Python package can have `sub-packages and modules`.

> A directory should contain a file named **`__init__.py`** for Python to treat it as a regular package. This file can be left empty, but we generally place the initialization code for that package in it. (Since Python 3.3, a directory without `__init__.py` can still be imported as a namespace package, but an explicit `__init__.py` remains the usual choice.)
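
The directory analogy can be tried out without touching the project folder. The sketch below builds a throwaway package in a temporary directory; the names `demo_pkg`, `sub_pkg`, `tools` and `extra` are all hypothetical. It then imports a module from the package and another from its sub-package:

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")
sub_dir = os.path.join(pkg_dir, "sub_pkg")
os.makedirs(sub_dir)

# File path -> source code for each file in the package
files = {
    os.path.join(pkg_dir, "__init__.py"): "GREETING = 'hello from demo_pkg'\n",
    os.path.join(pkg_dir, "tools.py"): "def double(x):\n    return 2 * x\n",
    os.path.join(sub_dir, "__init__.py"): "",
    os.path.join(sub_dir, "extra.py"): "def triple(x):\n    return 3 * x\n",
}
for path, source in files.items():
    with open(path, "w") as f:
        f.write(source)

sys.path.append(root)             # the package's parent directory
importlib.invalidate_caches()

import demo_pkg                   # runs demo_pkg/__init__.py
from demo_pkg import tools
from demo_pkg.sub_pkg import extra

print(demo_pkg.GREETING)
print(tools.double(21), extra.triple(14))   # 42 42
```

Note that it is the directory containing the package, not the package directory itself, that has to be on `sys.path`.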


In [10]:
!mkdir ./db

## $\Delta$ 3.1 `__init__.py`

In [99]:
%%file ./db/__init__.py

import pandas as pd
from sqlalchemy import create_engine

DB_TYPE = 'mysql'
DB_DRIVER = 'pymysql'
DB_USER = 'sunlife'
DB_PASS = 'Noisybutter764'
DB_HOST = 'sunlife-mysql.c0h2bhc51r9d.us-east-1.rds.amazonaws.com'   # hostname of the MySQL server
DB_PORT = '3306'
DB_NAME = 'sunlife'
POOL_SIZE = 50
SQLALCHEMY_DATABASE_URI = '{0}+{1}://{2}:{3}@{4}:{5}/{6}'.format(DB_TYPE, 
                                                                 DB_DRIVER, 
                                                                 DB_USER,
                                                                 DB_PASS, 
                                                                 DB_HOST, 
                                                                 DB_PORT, 
                                                                 DB_NAME)
    

print(SQLALCHEMY_DATABASE_URI)
    
engine = create_engine(SQLALCHEMY_DATABASE_URI, pool_size=POOL_SIZE, max_overflow=0)
print(engine)

Overwriting ./db/__init__.py
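
Keeping a password in source code, and printing the full connection string, makes it easy to leak. One possible alternative is sketched below, assuming the settings are supplied through environment variables; the variable names (`DB_USER`, `DB_PASS`, and so on), the fallback values and the helper `build_database_uri` are hypothetical and not part of the course package:

```python
import os
from urllib.parse import quote_plus

def build_database_uri(env=os.environ):
    """Assemble a SQLAlchemy-style URI from environment settings."""
    user = env.get("DB_USER", "example_user")
    password = env.get("DB_PASS", "example_password")
    host = env.get("DB_HOST", "localhost")
    port = env.get("DB_PORT", "3306")
    name = env.get("DB_NAME", "example_db")
    # quote_plus escapes characters such as '@' that would break the URI
    return "mysql+pymysql://{0}:{1}@{2}:{3}/{4}".format(
        user, quote_plus(password), host, port, name)

# Passing a plain dict makes the function easy to try out
settings = {"DB_USER": "u", "DB_PASS": "p@ss", "DB_HOST": "h", "DB_NAME": "d"}
print(build_database_uri(settings))   # mysql+pymysql://u:p%40ss@h:3306/d
```

The resulting string could be passed to `create_engine` in place of the hard-coded `SQLALCHEMY_DATABASE_URI`.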


## $\Delta$ 3.2 Create modules

In [95]:
%%file ./db/schema.py

from db import engine
import pandas as pd

def list_tables():
    tables = engine.execute("""show tables;""").fetchall()
    return [t[0] for t in tables]

def describe_table(tbname):
    results = engine.execute("""describe {}""".format(tbname)).fetchall()
    return pd.DataFrame(results, columns=['name','type','nullable','pkey','other', 'na'])

Writing ./db/schema.py


In [96]:
%%file ./db/queries.py

from db import engine, schema
import pandas as pd

def fetch_one(tbname):
    result = engine.execute("""select * from {} limit 1;""".format(tbname)).fetchall()
    return pd.DataFrame(result, columns=schema.describe_table(tbname)['name'])

Writing ./db/queries.py
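
Both modules format the table name straight into the SQL text. Table names generally cannot be sent as bound query parameters, so one common precaution is to check the name against a list of known tables before building the statement. A minimal sketch, in which `build_select_one` and `known_tables` are hypothetical; inside the package the list could come from `schema.list_tables()`:

```python
import re

def build_select_one(tbname, known_tables):
    """Return the query text only for a known, well-formed table name."""
    well_formed = re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", tbname)
    if not well_formed or tbname not in known_tables:
        raise ValueError("unknown table: {!r}".format(tbname))
    return "select * from {} limit 1;".format(tbname)

tables = ["product", "customer"]
print(build_select_one("product", tables))

try:
    build_select_one("product; drop table customer", tables)
except ValueError as err:
    print("rejected:", err)
```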


## $\Delta$ 3.3 Use the Package

In [101]:
import sys
print(sys.path)
sys.path.append('./db')  # optional: 'import db' already works because the current directory ('') is on sys.path

['', '/usr/lib64/python35.zip', '/usr/lib64/python3.5', '/usr/lib64/python3.5/plat-linux', '/usr/lib64/python3.5/lib-dynload', '/folders/myfolders/.local/lib/python3.5/site-packages', '/usr/lib64/python3.5/site-packages', '/usr/lib64/python3.5/site-packages/IPython/extensions', '/folders/myfolders/.ipython', '../test', './db']


In [102]:
import db

mysql+pymysql://sunlife:Noisybutter764@sunlife-mysql.c0h2bhc51r9d.us-east-1.rds.amazonaws.com:3306/sunlife
Engine(mysql+pymysql://sunlife:***@sunlife-mysql.c0h2bhc51r9d.us-east-1.rds.amazonaws.com:3306/sunlife)


In [103]:
import importlib, db
from db import queries
from db import schema

importlib.reload(db)
importlib.reload(schema)
importlib.reload(queries)


mysql+pymysql://sunlife:Noisybutter764@sunlife-mysql.c0h2bhc51r9d.us-east-1.rds.amazonaws.com:3306/sunlife
Engine(mysql+pymysql://sunlife:***@sunlife-mysql.c0h2bhc51r9d.us-east-1.rds.amazonaws.com:3306/sunlife)


<module 'db.queries' from '/folders/myfolders/sunlife/db/queries.py'>

In [104]:
schema.describe_table('product')

| | name | type | nullable | pkey | other | na |
|---|---|---|---|---|---|---|
| 0 | ProductID | int(11) | YES | | | |
| 1 | ProductName | varchar(200) | YES | | | |
| 2 | ProductCategory | varchar(20) | YES | | | |
| 3 | ProductSubCategory | varchar(50) | YES | | | |
| 4 | ProductContainer | varchar(20) | YES | | | |
| 5 | ProductBaseMargin | decimal(4,2) | YES | | | |


In [105]:
from db import queries
from db import schema

queries.fetch_one('product')

| name | ProductID | ProductName | ProductCategory | ProductSubCategory | ProductContainer | ProductBaseMargin |
|---|---|---|---|---|---|---|
| 0 | 657768 | """While you Were Out"" Message Book, One Form... | Office Supplies | Paper | Wrap Bag | 0.35 |


In [107]:
schema.describe_table('customer')

| | name | type | nullable | pkey | other | na |
|---|---|---|---|---|---|---|
| 0 | CustomerID | int(11) | YES | | | |
| 1 | CustomerName | varchar(100) | YES | | | |
| 2 | Province | varchar(50) | YES | | | |
| 3 | Region | varchar(30) | YES | | | |
| 4 | CustomerSegment | varchar(20) | YES | | | |


In [108]:
queries.fetch_one('customer')

| name | CustomerID | CustomerName | Province | Region | CustomerSegment |
|---|---|---|---|---|---|
| 0 | 40732966 | Tamara Dahlen | Ontario | Ontario | Corporate |
