# Imports


We **import** libraries or modules into our current Python session in order to use the functions, methods, or variables within them. We can also import specific functions, methods, or variables directly, as opposed to the entire module.

_______________________


## Goals

Following this lesson you should be able to:

1. Understand the differences between Modules, Libraries & Packages. 

2. Install a new package, taking adequate safety measures. 

3. Import a base Python or PyPI module, with and without an alias, and run a function that exists in that module. 

4. Import a module you created, and run a function that exists in that module. 

5. Import a function from a module, with and without an alias, and run that function. 

________________________

## Agenda

1. Understanding the terminology: Modules, Libraries & Packages

2. Installing packages

3. Importing 

4. Exercises

_______________________


## Terminology

Before going into imports, let's discuss some terminology: Modules, Libraries & Packages

A **Module** is a Python file that contains collections of functions and global variables. It is an executable file with a .py extension. A module can exist in: 

- Base python (https://docs.python.org/3/library/). 

- Community developed packages (https://pypi.org/). 

- Your working directory as a file that you have created. For example, Darden student Bethany Thompson has a repository named 'Predicting-Diabetes-Onset' (https://github.com/ThompsonBethany01/Predicting-Diabetes-Onset). It contains a file named Prepare.py, which is a module she created that she imports in her 'Data_Analysis.ipynb' Notebook. 

A **Library** is "a collection of related functionality of codes that allows you to perform many tasks without writing your code. It is a reusable chunk of code that we can use by importing it in our program." [source](https://www.geeksforgeeks.org/what-is-the-difference-between-pythons-module-package-and-library/) Next week, we will introduce you to the most awesome library ever ;) ... Pandas (https://pandas.pydata.org/). In fact, Pandas is a module, a library AND a package.


A **Package** is a directory having collections of modules. It can also consist of other packages, or 'sub-packages'. It is a way to distribute one or more modules. We *install* packages in order to be able to *import* modules or libraries for use. 

- Python Package Index (PyPI, https://pypi.org/) is a repository of community developed Python packages. 

- Anaconda's **Conda** product is a package manager. It helps you find and install packages from Anaconda's own repositories (called channels), which are separate from PyPI. 
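
To make the idea of a package as "a directory having collections of modules" concrete, here is a minimal sketch that builds a throwaway package in a temporary directory and imports a module from it. The names `demo_pkg`, `greetings`, and `say_hello` are hypothetical and exist only for this illustration; the empty `__init__.py` file is the conventional marker for a regular package.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk (all names here are hypothetical):
#   demo_pkg/
#       __init__.py    <- marks the directory as a regular package
#       greetings.py   <- a module inside the package
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_pkg"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "greetings.py").write_text(
    "def say_hello(name):\n"
    "    return f'Hello, {name}!'\n"
)

# Make the parent directory importable, then import the module from the package.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

from demo_pkg import greetings

message = greetings.say_hello("Darden")
print(message)             # Hello, Darden!
print(greetings.__name__)  # demo_pkg.greetings
```

Real packages are installed for you by `pip` or `conda`, so you would not normally touch `sys.path` yourself; it is done here only so the sketch is self-contained.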


______________________


## Installing Packages


- We can only **import** the **libraries and modules** of **packages** that have already been **installed**! In other words, in order to import a third-party library or module, we first have to install a package that contains it. 

There are 3 main sources for imports:

1. The [Python standard library](https://docs.python.org/3/library/). This comes with the Python language, and no special installation is needed in order to use it. 

2. Third-party packages. These require installation through either `conda` or `pip`. You can use `pip install` (*Pip Installs Packages*) to install packages directly from [PyPI](https://pypi.org/). You can use `conda install` to install packages through Anaconda's Conda package manager. 
    
3. Our own code. We can break our code into separate files and use imports to use code from one file in another. These files are stored in the same directory as the files that import them. *These do not require installation.*
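
One way to check whether a module from any of these sources is available before importing it is to ask Python's import machinery to locate it. The sketch below uses `importlib.util.find_spec` from the standard library; the helper name `is_importable` and the made-up module name are assumptions for illustration only.

```python
from importlib.util import find_spec


def is_importable(module_name):
    """Return True if Python can locate a top-level module with this name."""
    return find_spec(module_name) is not None


# Standard library modules are always available.
print(is_importable("math"))  # True
print(is_importable("json"))  # True

# A made-up name that has not been installed cannot be found.
print(is_importable("surely_not_a_real_module_xyz"))  # False
```

If a third-party package such as `pandas` reports as not importable, that is a sign it still needs to be installed.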

!!!note "About Anaconda"
    The base installation of Anaconda includes a set of packages containing additional libraries and modules (beyond base Python). The list of all packages available and installed through Anaconda can be found at [docs.anaconda.com](https://docs.anaconda.com/anaconda/packages/py3.7_osx-64/). 

!!!tip "Why use Conda?"
    There are many reasons. One involves versions: Conda helps ensure that the packages you install all work together. Another is security. Anaconda vets the packages it makes available for safety and reliability. PyPI, on the other hand, is community developed, so it is easy for malicious code to be contributed and to exist for a while before it is caught and pulled down. Because of that, it is important to pay close attention to what you are installing. Check a package's followers, supporters, and downloads before you install it. You don't want to install a package that is new or not well known unless you have fully vetted the code and know what it is doing. To see the code, you can visit the GitHub repo for each package.

!!!tip "Conda vs pip Commands"
    To compare conda and pip commands, check out [this table](https://conda.io/projects/conda/en/latest/commands.html#conda-vs-pip-vs-virtualenv-commands).

You will want to **install** packages from your terminal, *outside of a Python session*, using `pip install` or `conda install`. You can also run updates here. Often the install command will ask whether you want to update a package that is already installed.  
You will see in the next section that the **import** happens *within our Python session.*

!!!tip "Try it out" 
    1. Install the bokeh package using `conda install bokeh`. 
    2. Install Zach's Python Utilities package using `pip install zgulde` https://pypi.org/project/zgulde/
    3. Install pydataset. Try using `conda install pydataset`. What happens? So, how can you install it? Go for it. 

## Importing

We may import an entire module (or library) or just pieces of it, such as a function or variable contained within it. 
When importing your own code, simply reference the name of the file without the `.py` extension. 

!!!warning "Naming Conventions"
    In order to import from another file, that file's name (i.e. everything before the `.py` extension) must be a valid Python identifier; that is, you could use it as a variable name.
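
As a quick way to test a file name against this rule, the sketch below uses the built-in `str.isidentifier` method together with the standard library's `keyword` module. The helper name `has_importable_name` and the sample file names are hypothetical. The keyword check is included because a reserved word such as `class` passes `isidentifier()` but still cannot be used as a variable name.

```python
import keyword
from pathlib import Path


def has_importable_name(filename):
    """Check whether a .py file's name could be used in an import statement."""
    stem = Path(filename).stem  # everything before the .py extension
    return stem.isidentifier() and not keyword.iskeyword(stem)


for name in ["prepare.py", "my_utils.py", "my-utils.py", "2nd_try.py", "class.py"]:
    print(name, has_importable_name(name))
```

Only `prepare.py` and `my_utils.py` pass: hyphens, a leading digit, and reserved words all make a name unusable in an `import` statement.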

To import the entire module: 

- **import** module_name
- **import** module_name **as** alias

To import parts of it: 

- **from** module_name **import** function_name
- **from** module_name **import** function_name **as** alias
- **from** module_name **import** function1, function2

**Example 1: A Base Python Module**

I want to use the `sqrt` function that exists in the `math` module (in base Python): https://docs.python.org/3/library/math.html. `math.sqrt(x)` returns the square root of x. 

When I import the module, I need to precede any function I call from that module with the module name and a `.`. 

```python
import math

x = 4
math.sqrt(x)
```

    2.0

When I import the module with an alias, I precede any function I call from that module with the alias and a `.`. 

```python
import math as m

x = 16
m.sqrt(x)
```

    4.0

When I import the entire module I have access to all the functions within that module, such as `ceil(x)`, which returns the ceiling of x. That is, it rounds x UP. 

```python
import math as m

# return the ceiling of x, i.e. round UP!
x = 4.3

m.ceil(x)
```

    5

When I import the function directly, I don't need to precede the function with the module name; however, I also don't have access to the other functions within that module. 

```python
from math import sqrt

sqrt(4)
```

    2.0

I can also import multiple functions from the same module. 

```python
from math import sqrt, pow

print(sqrt(2))

print(pow(3.14, 2))
```

    1.4142135623730951
    9.8596
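
One thing to keep in mind with direct imports: the imported name is bound in your current namespace, so it replaces anything that already had that name. Python has a built-in `pow`, and `from math import pow` shadows it. A minimal sketch of the difference, using the standard library's `builtins` module to reach the original:

```python
import builtins
from math import pow  # this name now shadows the built-in pow

# math.pow always converts its arguments to floats and returns a float.
print(pow(2, 3))  # 8.0

# The built-in pow is still reachable through the builtins module
# and keeps integers as integers.
print(builtins.pow(2, 3))  # 8
```

Importing the module itself (`import math`) and calling `math.pow` avoids the name clash entirely.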


**Example 2: A Third Party Library**

For this example, we will use Pandas, a third party library that was included in our installation of Anaconda. 

`pd` is the well-accepted alias for Pandas. I recommend keeping to those conventions when they exist. You will discover others as we move into the lessons covering the data science libraries. 

First, we will import all of pandas. We will use the `Series` constructor to create a pandas series. A series is a one-dimensional array with axis (row) labels.

```python
import pandas as pd

pd.Series(["a", 1, True])
```

    0       a
    1       1
    2    True
    dtype: object

!!!tip "Try it out"
    Go to [data.world](https://data.world/datagov-uk/5cda2945-bec9-4f4d-a1dc-dfaf7ecc8105/workspace/file?filename=primary-academy-pupils-eligible-for-free-school-meals-fsm-january-2016-15.csv), select the table, and copy the contents to your clipboard. From Pandas, import the function `read_clipboard`, and run it to read the data from your clipboard. 

**Example 3: A Local File**

Let's say we have a file, `util.py`, in the same directory as the script we are running. 
`util.py` has the following contents:

```python
def sayhello():
    print("Hello, World!")
```

We can import our file like this:

```python
import util
```

And when we run the function like this:

```python
util.sayhello()
```

We get the following: 

    Hello, World!



**Example 4: A local file that contains imports**

For example 4, we will import a local module whose code reads in a dataset that ships with a Python package known as [`pydataset`](https://pypi.org/project/pydataset/). When we import this file, we are also importing the modules that are imported within the file. 

My file name is `imports_example_file.py`. 

The code it contains is: 

```python
from pydataset import data
import pandas as pd

def get_data(dataset):
    '''
    this function reads the first 5 rows of a dataset into a dataframe
    '''
    df = data(dataset).head()
    return df
```


```python
import imports_example_file as i

# read the iris dataset, which contains information about different iris species. 
dataset = 'iris'
i.get_data(dataset)
```

       Sepal.Length  Sepal.Width  Petal.Length  Petal.Width Species
    1           5.1          3.5           1.4          0.2  setosa
    2           4.9          3.0           1.4          0.2  setosa
    3           4.7          3.2           1.3          0.2  setosa
    4           4.6          3.1           1.5          0.2  setosa
    5           5.0          3.6           1.4          0.2  setosa
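
The point that a file's own imports come along with it can be seen without `pydataset` installed. The sketch below writes a hypothetical module, `circle_tools.py`, that imports `math`, and then shows that `math` is reachable as an attribute of the imported module. The file is created in a temporary directory only to keep the example self-contained.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Write a hypothetical module, circle_tools.py, that itself imports math.
module_dir = Path(tempfile.mkdtemp())
(module_dir / "circle_tools.py").write_text(
    "import math\n"
    "\n"
    "def area(radius):\n"
    "    return math.pi * radius ** 2\n"
)

# Make the directory importable, then import the module with an alias.
sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()

import circle_tools as ct

print(ct.area(1))          # 3.141592653589793 (area uses math internally)
print(hasattr(ct, "math")) # True: math came along with the module
print(ct.math.sqrt(9))     # 3.0
```

In the same way, after `import imports_example_file as i`, the names `i.data` and `i.pd` refer to what that file imported.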


_______________________


When a module is imported, all the code in that file is executed. Sometimes this
can produce some undesired side effects, so a best practice is to only have
*definitions* inside of a module. If the module contains *procedural code*, that
is, code that does something, we can place it inside of an `if` statement like
this:

```python
if __name__ == '__main__':
    print('Hello, World!')
```

    Hello, World!


The `__name__` variable is a special variable that is set by Python. Its value
will be `'__main__'` when the module is being run directly, but *not* when the
module is being imported. This way you can write files that do something when
you run them directly, but can also be imported without producing side
effects.
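
To see both behaviors side by side, here is a minimal sketch that writes a hypothetical file, `guard_demo.py`, then imports it and also runs it as a script. It uses `runpy.run_path` from the standard library with `run_name="__main__"` to stand in for running `python guard_demo.py` from the terminal, and captures anything printed in each case.

```python
import contextlib
import importlib
import io
import runpy
import sys
import tempfile
from pathlib import Path

# Write a hypothetical module with a definition and a guarded print.
module_dir = Path(tempfile.mkdtemp())
script = module_dir / "guard_demo.py"
script.write_text(
    "def sayhello():\n"
    "    return 'Hello, World!'\n"
    "\n"
    "if __name__ == '__main__':\n"
    "    print(sayhello())\n"
)

sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()

# 1. Importing: __name__ is 'guard_demo', so the guarded block is skipped.
imported_output = io.StringIO()
with contextlib.redirect_stdout(imported_output):
    import guard_demo

# 2. Running directly: __name__ is '__main__', so the guarded block runs.
direct_output = io.StringIO()
with contextlib.redirect_stdout(direct_output):
    runpy.run_path(str(script), run_name="__main__")

print(repr(imported_output.getvalue()))  # ''
print(repr(direct_output.getvalue()))    # 'Hello, World!\n'
print(guard_demo.__name__)               # guard_demo
```

The import prints nothing, yet `guard_demo.sayhello` is still available to call afterwards.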


## Exercises

Create a file named `import_exercises.py` or a notebook named `import_exercises.ipynb` to do your work in.

1. Import and test 3 of the functions from your functions exercise file. Import each function in a different way:

    - import the module and refer to the function with the `.` syntax
    
    - use `from` to import the function directly
    
    - use `from` and give the function a different name


2. For this exercise, read about and use the [`itertools` module](https://docs.python.org/3/library/itertools.html) from the standard library to help you solve the problem.

    - How many different ways can you combine the letters from "abc" with the numbers 1, 2, and 3?

    - How many different ways can you combine two of the letters from "abcd"?


3. Save [this file](https://gist.githubusercontent.com/ryanorsinger/f77e5ec94dbe14e21771/raw/d4a1f916723ca69ac99fdcab48746c6682bf4530/profiles.json) as `profiles.json` inside of your exercises directory. Use the `load` function from the `json` module to read this file; it will produce a list of dictionaries. Using this data, write some code that calculates and outputs the following information:

    - Total number of users

    - Number of active users
    
    - Number of inactive users
    
    - Grand total of balances for all users
    
    - Average balance per user
    
    - User with the lowest balance
    
    - User with the highest balance
    
    - Most common favorite fruit
    
    - Least common favorite fruit
    
    - Total number of unread messages for all users