# 8. Imports and Modules

# Objectives
By the end of this notebook, you should be able to...

* Know the difference between the core built-in features, the standard library, and third party libraries
* Know how to import a module
* Know what a module is
* Access functionality within a module
* Create your own module
* Know how Python searches for modules
* Know how to alias names upon import


# Built-In Python Features

Whenever you first begin writing Python in a notebook, editor, or on the command line, you are using the **built-in** features of the language. These are the features that are always available to you. They consist of the following (numbers are approximate):

* 70 [functions and types][1]
* 30 [keywords][2]
* 40 [Exceptions][3]
* 50 [Operator and delimiter symbols][4]
* Whitespace for indentation

This list contains everything you can type from your keyboard that can be understood by the Python interpreter. This is what is meant by **built-in**. This core part of Python is what builds our programs.

Thus far, with just a few exceptions, we have used only features that are built directly into the language.

# The Standard Library
The [Python standard library][5], as its name suggests, is distributed with every Python installation. However, the standard library is not immediately available for you to use when you begin writing a Python program. You must use the **`import`** statement to access its functionality.
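As a small illustration of the difference, the sketch below uses the **`builtins`** module (itself part of the standard library) to check whether a name is built in. The variable names `has_len` and `has_sqrt` are only chosen for this example.

```python
import builtins  # standard-library module that holds the built-in names
import math      # standard-library module; must be imported before use

# len is built in, so it is always available without an import
has_len = hasattr(builtins, 'len')

# sqrt is not built in; it lives in the math module
has_sqrt = hasattr(builtins, 'sqrt')

print(has_len, has_sqrt)
print(math.sqrt(16))
```

The first line printed should be `True False`, and the second `4.0`.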

# Third-Party Libraries
Outside of the standard library are user-created libraries, sometimes referred to as **third-party** libraries. These can be found at the [Python Package Index][6], or PyPI. There are over 100k libraries, but only about 10-15 are commonly used by data scientists. These libraries can be installed with **pip**, Python's command-line package manager. However, we've already installed the Anaconda distribution, which comes with most of the data science libraries pre-installed. It also has its own package manager, **conda**.

# Importing a module
In Python, we use the **`import`** keyword to give us access to new functionality. We specify the module we want to import by typing its name directly after **`import`**.

### What is a module?
All Python code is stored in files on your machine. Files that contain Python code are called **modules**. When we run the import statement, we get access to all the functions, types, and other objects defined within that module.

### Importing the random module
We have already used the **`random`** module in previous notebooks. Let's import it again and examine it a bit closer.

[1]: https://docs.python.org/3/library/functions.html#built-in-functions
[2]: https://docs.python.org/3/reference/lexical_analysis.html#keywords
[3]: https://docs.python.org/3/library/exceptions.html#exception-hierarchy
[4]: https://docs.python.org/3/reference/lexical_analysis.html#operators
[5]: https://docs.python.org/3/library/index.html
[6]: https://pypi.org/

In [None]:
import random

### What is `random`?
We now have a name that refers to the random module. Let's find its type:

In [None]:
type(random)

### The module type?
Yes, that is correct. We have a name that refers to an entire module (a file with Python code). We can use this name to reference functions, types, and other objects defined inside of that file.
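To make the link between the name and the file concrete, here is a brief sketch that inspects two attributes every file-based module carries. The exact path printed depends on your installation.

```python
import random

# Every module object records its own name
print(random.__name__)

# Modules loaded from a file also record where that file lives
print(random.__file__)
```

The first line should print `random`; the second should be a path on your machine to the file that defines the module.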

### Display all the available names within `random`
We can display all the names within the **`random`** module with the **`dir`** function. Below, a list comprehension is used to create a list of all the names, filtering out the private names.

In [None]:
[name for name in dir(random) if not name.startswith('_')]

### Using functionality from a module
To use one of the names in the list above simply use dot notation with the **`random`** module. Remember to press tab after the dot to make the drop-down menu appear with a list of all the names. Also remember to press shift + tab + tab to pop out a help menu once you've chosen the name to understand how it works.

Let's use the **`randint`** function to produce a random integer between 50 and 100.

In [None]:
random.randint(50, 100)

### Problem 1

<span style="color:green">Read the documentation on the **`sample`** function to choose 5 random integers between 0 and 100.</span>

In [None]:
# your code here

## Importing a single name from a module
Instead of importing the module and then using dot notation to access the name, Python has syntax that allows you to import individual names within the module. The general syntax is the following:

```
from module import name
```

Let's import **`choice`** from the random module and then use it to select a single item at random from a sequence. Notice that you no longer have to use dot notation but can use the name directly.

In [None]:
from random import choice

In [None]:
states = ['TX', 'AZ', 'CA', 'FL']
choice(states)

## Importing multiple names from a module
In a similar fashion, you can import multiple names from a module in one statement. Separate each name by a comma during the import statement.

Below, we import three different names and use each one.

In [None]:
from random import randint, uniform, paretovariate

In [None]:
randint(100, 110)

In [None]:
uniform(5, 10)

In [None]:
paretovariate(.5)

## Using `as` to alias names when importing
It is possible to import a name and **alias** it as something else. An alias is the new name that you use as the reference. The generic form is as follows:

```
import module as new_name
```

You can also alias single names like:

```
from module import original_name as new_name
```

Let's import the **`statistics`** module as **`st`**:

In [None]:
import statistics as st

### `st` now refers to the statistics module
The alias **`st`** can now be used to access all the functionality of the **`statistics`** module. Let's use it to find the mean of a list of numbers.

In [None]:
a_list = [1, 5, 10, 3, 1]
st.mean(a_list)

### Alias a single name
It is also possible to alias a single name within a module. I should say that I typically do not do this and reserve my aliasing for entire modules as was accomplished in the previous code cell. 

The below creates an alias for the **`harmonic_mean`** function from the **`statistics`** module to something shorter like **`hm`**.

In [None]:
from statistics import harmonic_mean as hm

In [None]:
hm(a_list)

### Import all names from a module
This is not recommended at all, but you will see it crop up from time to time. You can import every single name from a module by doing **`from module import *`**. It is discouraged because it "pollutes" your namespace and makes it more difficult to determine which module a name originated from.
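As a rough illustration of the problem, the sketch below compares the public names of two standard library modules, **`math`** and **`cmath`**. It does not perform a star import itself; it only shows which names would collide if both modules were star-imported. The variable names are chosen for this example.

```python
import math
import cmath

# Collect the public names each module would export with a star import
math_names = {name for name in dir(math) if not name.startswith('_')}
cmath_names = {name for name in dir(cmath) if not name.startswith('_')}

# Names present in both modules: a second star import would silently
# replace the first module's version of each of these
shared_names = sorted(math_names & cmath_names)
print(shared_names)

# The two versions behave differently, so knowing the origin matters
print(cmath.sqrt(-1))   # complex result
# math.sqrt(-1) would raise a ValueError instead
```

With regular imports and dot notation, `math.sqrt` and `cmath.sqrt` stay clearly separated.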

### The Python Standard Library Reference
Please visit the [official documentation][1] for a listing of all the modules in the standard library. The built-in names are in sections 1-5 with the actual standard library beginning in section 6. You may find this section useful for the following questions.

[1]: https://docs.python.org/3/library/index.html

### Problem 2

<span style="color:green">Import the `math` module and take the log of a number.</span>

In [None]:
# your code here

### Problem 3

<span style="color:green">Use the `string` module to print out all the uppercase letters.</span>

In [None]:
# your code here

### Problem 4

<span style="color:green">Import the single name `getcwd` from the `os` module and execute the function. What does it do?</span>

In [None]:
# your code here

### Problem 5

<span style="color:green">Import and alias the `webbrowser` module as `wb`. Then open a new tab in your browser with the `open_new_tab` function.</span>

In [None]:
# your code here

### Problem 6

<span style="color:green">Take a look at the `fractions` module (find it in the standard library link above) and add the fractions 2/3 and 1/4. It will be helpful to read the documentation from the link.</span>

In [None]:
# your code here

# Creating your own modules
Jupyter Notebooks are great for writing short code snippets, making visualizations, and adding text and images together. However, professional code is typically stored in Python modules - files that end in **`.py`**. In this directory, there is a file named **`simple_module.py`**. Open the file in a text editor so that you can see its contents.

You should see 1 string, 1 integer, and 5 functions defined in the module. Let's import the module and alias it as **`sm`** and use some of its functionality.

In [None]:
import simple_module as sm

In [None]:
# output the value of the author string
sm.author

In [None]:
# output my favorite number
sm.favorite_number

In [None]:
# add two numbers together
sm.add(3, 8)

In [None]:
# count the number of vowels in a string
sm.count_vowels('Wheat Waffles')

## Troubleshooting
If the import statement failed above, then you are probably not working out of the correct working directory. The working directory is the path where you started the Jupyter Notebook. We need to ensure that you are working out of the **`precourse-assignment`** directory and not anywhere else.

### Finding the current working directory
You can find the current working directory with help from the **`getcwd`** function in the **`os`** module.

In [None]:
import os

In [None]:
os.getcwd()

### Changing the working directory
If you are not in the precourse-assignment directory you need to change the working directory by using the **`chdir`** function. Replace the contents of the string with the path to this notebook. The path should end in **`precourse-assignment`**.

For instance, the path on my machine is **`'/Users/Ted/Github Repos/Data Science Teaching/precourse-assignment'`**

In [None]:
os.chdir('DELETE AND PUT PATH HERE')

### List all content of working directory
Use the **`listdir`** function to list all the content in your current working directory. You should now see all the notebooks and other files and folders. Importantly, you should see the file **`simple_module.py`**. Rerun your import statement from above now.

In [None]:
os.listdir()
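If you would like to see the three functions working together without touching your real directories, the following sketch uses a throwaway temporary directory. The file name `placeholder_module.py` is hypothetical and exists only for this demonstration.

```python
import os
import tempfile

original_directory = os.getcwd()  # remember where we started

with tempfile.TemporaryDirectory() as temporary_directory:
    # Create an empty placeholder file inside the temporary directory
    placeholder_path = os.path.join(temporary_directory, 'placeholder_module.py')
    with open(placeholder_path, 'w') as placeholder_file:
        placeholder_file.write('')

    os.chdir(temporary_directory)   # change the working directory
    contents = os.listdir()         # list the new working directory
    os.chdir(original_directory)    # always change back afterwards

print(contents)
```

The printed list should contain only the placeholder file, and the working directory ends up where it started.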

## How does Python find modules?
When importing modules, Python looks at a few specific paths to find the name of the file that ends in **`.py`**. It does not search your entire computer for every single **`.py`** file. You can actually print the set of paths with the **`sys`** module. Let's do that now.

In [None]:
import sys
sys.path

### Python first checks the working directory
The **`path`** variable stores a list of all the paths that the Python interpreter checks for modules. It checks the paths in the specific order you see output above. The first entry in that list is an empty string which represents the current working directory. Python begins its search for modules there. It then continues down the list until it finds the module. If the module is not found, you will get an error.

We defined **`simple_module.py`** in our current directory so Python could find it. All the modules in the standard library are defined in the directory that ends with **`anaconda3/lib/python3.6`**. The third-party packages are found one level deeper in the **`site-packages`** directory. You can browse through your file system and view the raw source code of these files.
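One way to check where Python would find a particular module is the `find_spec` function from the standard library's **`importlib.util`** module. The sketch below is only an illustration; the name `hypothetical_missing_module` is made up to show what happens when nothing is found.

```python
import importlib.util

# find_spec searches the import paths without actually importing the module
spec = importlib.util.find_spec('statistics')

# origin is the location of the file Python would load
print(spec.origin)

# A name that cannot be found anywhere on the paths gives None
missing = importlib.util.find_spec('hypothetical_missing_module')
print(missing)
```

The first path printed will differ from machine to machine; the second line should print `None`.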

# Project: Creating your own array
We will now create a module that will be capable of completing many array operations. This small project will give you a taste of what it's like being a software engineer. The answer will be in a file called **`my_array_answers.py`**. Only look at the contents of that file when you have completed your module (or at least made an attempt). The file will be composed of several functions.

### One-dimensional array
In many programming languages, an **array** is a homogeneous container of data. Arrays are similar to Python lists, except that all elements are of the same type. Arrays also can't change size once created.
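For comparison only, the standard library has an **`array`** module whose containers enforce a single element type. It is not used in this project, and unlike the arrays described above, it can still grow in size. A minimal sketch of the type restriction:

```python
from array import array

# 'i' is a type code meaning signed integers
integers = array('i', [1, 2, 3])

try:
    integers.append(2.5)   # a float does not fit an integer array
    accepted = True
except TypeError:
    accepted = False

print(accepted)
print(integers)
```

The float is rejected, so `accepted` is `False` and the array still holds only 1, 2, and 3.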

### Use a list as our array
For this project, we will use the built-in Python list as a substitute for an array. Nearly all of our functions that we define in our module will take Python lists as parameters.

### Create the file

Let's get started by creating a file named **`my_array.py`** in the **`precourse-assignment`** directory.

### Getting your code to work
You might want to first write your code in the notebook and then transfer it over to the module.

### Testing your code
To test your code, import your module into the notebook and run the function. I will use functions from the answers module to show you what the expected output should be.

### Restarting the notebook
When you edit your module, your new code will not be immediately available in this notebook if you have already imported your module here. Restart the notebook by clicking **Kernel -> Restart** in the menu above and then re-import your module.
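An alternative to restarting, offered here only as a sketch, is the `reload` function from the standard library's **`importlib`** module. The example writes a throwaway module named `hypothetical_module.py` to a temporary directory so that it can run anywhere; with your own module you would simply call `importlib.reload` on it after saving your edits.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as folder:
    module_path = os.path.join(folder, 'hypothetical_module.py')

    # First version of the module
    with open(module_path, 'w') as module_file:
        module_file.write('greeting = "hello"\n')

    sys.path.insert(0, folder)       # let Python find the new file
    importlib.invalidate_caches()
    import hypothetical_module
    before = hypothetical_module.greeting

    # Edit the module on disk (the new text has a different length,
    # so previously cached bytecode is not reused)
    with open(module_path, 'w') as module_file:
        module_file.write('greeting = "hello again"\n')

    importlib.reload(hypothetical_module)   # re-run the edited file
    after = hypothetical_module.greeting

    sys.path.remove(folder)          # clean up the search path

print(before, '->', after)
```

Note that names brought in with `from module import name` are not refreshed by `reload`, so restarting the kernel remains the most reliable option.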

### Structure of project
Each new section will represent a new function that you need to add to your module. The section will follow with an example of how the new function should work.

# Start Creating Functions
Let the fun begin! Read the section and create your function.
### 1. `create_random_array`
Open your new file and define a function **`create_random_array`** with a single parameter **`n`**. Return a list with **`n`** elements of random integers between 1 and 100. You will need to import the **`random`** module at the top of the file. Save this returned array to a variable named **`array`** and use it to test the remaining functions.

In [None]:
import my_array_answers as maa

In [None]:
array = maa.create_random_array(10)
array

### 2. `add_constant`
Create the function **`add_constant`** with two parameters, **`array`** and **`x`**. Return a new array with **`x`** added to each element.

In [None]:
maa.add_constant(array, 8)

### 3. `sub_constant`, `mul_constant`, `div_constant`, `pow_constant`
Create the above functions for subtracting, multiplying, dividing, and raising to a power of a single number. The signature is the same as **`add_constant`**.

In [None]:
maa.sub_constant(array, 23)

In [None]:
maa.mul_constant(array, 4)

In [None]:
maa.div_constant(array, 88)

In [None]:
maa.pow_constant(array, .5)

### 4. `add`, `sub`, `mul`, `div`, `pow_`
Create the five functions above. Each function has two parameters **`array1`** and **`array2`**. These functions will do element by element operations. This means that the first element in **`array1`** will be added to the first element of **`array2`** and so on. Return a single new array. Make sure the arrays that you pass to the functions are the same length.

In [None]:
array2 = maa.create_random_array(10)
array2

In [None]:
maa.add(array, array2)

In [None]:
maa.sub(array, array2)

In [None]:
maa.mul(array, array2)

In [None]:
maa.div(array, array2)

In [None]:
maa.pow_(array, array2)

### 5. `min_`, `max_`
Define functions that find the min and max of the array. The functions will take a single parameter **`array`** and return a single number.

In [None]:
maa.max_(array)

### 6. `mean`
Define a function to take the mean of the array. It takes a single parameter.

In [None]:
maa.mean(array)

### 7. `median`
Define a function to find the median of an array. The median is the middle element of an array when sorted. If there are an even number of elements then it is the average of the two middle elements. It takes a single parameter.

In [None]:
maa.median(array)
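If you want an independent check for your own function, the standard library's **`statistics`** module has a `median` function. The two small lists below are made up to show the odd-length and even-length cases; this is a way to verify results, not a substitute for writing the function yourself.

```python
import statistics

odd_length = [7, 1, 5]         # sorted: 1, 5, 7 -> middle element is 5
even_length = [7, 1, 5, 3]     # sorted: 1, 3, 5, 7 -> average of 3 and 5

print(statistics.median(odd_length))
print(statistics.median(even_length))
```

The first call should print `5` and the second `4.0`.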

### 8. `dot_product`
The dot product between two arrays is a two-step process. First, conduct element-by-element multiplication and then sum up each of these products. This function takes two arrays as arguments.

In [None]:
a1 = maa.create_random_array(3)
a2 = maa.create_random_array(3)

maa.dot_product(a1, a2)

In [None]:
a1, a2
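Here is the two-step process written out by hand for one small pair of made-up arrays. It is a worked example to compare your function against, not a general implementation.

```python
first = [1, 2, 3]
second = [4, 5, 6]

# Step 1: element-by-element products, written out one at a time
products = [
    first[0] * second[0],   # 1 * 4 = 4
    first[1] * second[1],   # 2 * 5 = 10
    first[2] * second[2],   # 3 * 6 = 18
]

# Step 2: sum the products
expected_dot_product = sum(products)
print(expected_dot_product)
```

Your `dot_product` function should return `32` for these two arrays.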

### 9. `distance`
The distance between two arrays is a four-step process. First, conduct element-by-element subtraction. Then square the results, sum them to get a single number, and take the square root of that sum.

In [None]:
a1 = maa.create_random_array(2)
a2 = maa.create_random_array(2)

In [None]:
maa.distance(a1, a2)
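The four steps can be traced by hand for one small pair of made-up arrays. This is a worked example to compare your function against, using `sqrt` from the **`math`** module for the final step.

```python
import math

first = [1, 2]
second = [4, 6]

# Step 1: element-by-element subtraction
differences = [first[0] - second[0], first[1] - second[1]]   # [-3, -4]

# Step 2: square each difference
squares = [differences[0] ** 2, differences[1] ** 2]         # [9, 16]

# Step 3: sum the squares
total = sum(squares)                                         # 25

# Step 4: take the square root
expected_distance = math.sqrt(total)
print(expected_distance)
```

Your `distance` function should return `5.0` for these two arrays.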

# Congrats on finishing notebook 8!
This completes the mandatory material needed before the start of the course. Important material still awaits. We will cover Python classes next and build a few games.