# Imports Lesson

We **import** libraries or modules into our current Python session in order to use the functions, methods, or variables within them. We can also directly import specific functions, methods, or variables, as opposed to the entire module.

## Goals

Following this lesson you should be able to:

1. Understand the differences between Modules, Libraries & Packages.
1. Install a new package, taking adequate safety measures.
1. Import a Python standard library or PyPI module, with and without an alias, and run a function that exists in that module.
1. Import a module you created, and run a function that exists in that module.
1. Import a function from a module, with and without an alias, and run that function.

## Terminology

Before going into imports, let's discuss some terminology:

- A **Module** is a Python file with a `.py` extension.

    Modules contain functions and variables. A module can exist in:

    - The python standard library
    - Community developed packages
    - Your working directory as a file that you have created. For example, [this capstone project from the Darden cohort](https://github.com/SpotiScryers/SpotiScry) has modules named `acquire`, `prepare`, `explore`, `preprocessing`, and `model` that are imported into their jupyter notebook.


- A **Package** is a directory that contains modules.

     - It can also consist of other packages, or 'sub-packages'. Packages are a way to distribute one or more modules. We install packages in order to be able to import modules or libraries for use.
    

- A **Library** is a collection of code, data, documentation, and configuration, usually purpose-built for specific tasks.

    - Libraries can be very large in scope, like [numpy](https://numpy.org), a library that forms the base for most other scientific packages in Python, or [matplotlib](https://matplotlib.org), the library we'll use for data visualization. Other libraries are smaller in scope, like [requests](https://docs.python-requests.org/en/master/), a library for sending HTTP requests.


- The **Python Package Index**, PyPI, https://pypi.org/, is a repository of community developed Python packages.


- Anaconda's **Conda** product is a package manager. It helps you find and install 3rd party packages.
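
To make the module/package distinction concrete, here is a small sketch using only the standard library. It assumes CPython, where `json` is shipped as a package (a directory of modules) and `math` is a single module. Packages expose a `__path__` attribute and plain modules do not.

```python
import json          # a package: a directory containing several modules
import json.decoder  # one module inside the json package
import math          # a single module

# Packages have a __path__ attribute; plain modules do not.
json_is_package = hasattr(json, "__path__")
math_is_package = hasattr(math, "__path__")

print(json_is_package)         # True
print(math_is_package)         # False
print(json.decoder.__name__)   # json.decoder
```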

## Import Sources

There are 3 main sources of imports:

1. The [Python standard library](https://docs.python.org/3/library/)

- This comes with the Python language, and no special installation is needed in order to use it.

2. 3rd party packages

- 3rd party packages are typically installed with a package manager, usually either `conda` or `pip`.

    - `pip` is the package manager that comes with the python language. You can use `pip install` to install packages from the Python Package Index.

    - `conda` is an alternative package management tool used by Anaconda. You can use `conda install` to install packages published through Anaconda.

    - `conda` is generally preferred as it ensures that the versions of all installed libraries are compatible with each other, and the packages it makes available are all vetted by Anaconda. While very rare, packages on PyPI can contain malicious code, as anyone can publish a package.

    - In general, you should research the libraries you are considering using before installing them. Visiting the project's github page, looking over the documentation, and seeing how active the community is are all good ideas.

3. Our own code

- We can break our code into separate files and use imports to use code from one file in another python file, or in a jupyter notebook. For this course, we will store imported modules in the same directory as the file that is importing them.

## Installing Packages

We can only import the libraries and modules of packages that have been installed. In other words, in order to import a library or a module we have to install a package that contains it. Because we installed python with anaconda, we already have many 3rd party packages commonly used in data science work installed.

To install additional packages, we run commands in the shell in our terminal application (**not** within a python session). For example:

> `conda install somepackage`

or 

> `pip install somepackage`
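
Before installing anything, it can help to check whether a module is already available. The sketch below uses `importlib.util.find_spec` from the standard library, which looks a module up without importing it. The helper name `is_installed` and the made-up package name are illustrative only.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# math ships with Python, so it is always available.
math_available = is_installed("math")

# A made-up name that should not exist in any environment.
fake_available = is_installed("a_package_that_is_not_installed_xyz")

print(math_available)  # True
print(fake_available)  # False
```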

## Importing

We can either import an entire module (or library) or just pieces of it, such as a specific function or variable. We can give **aliases** to anything we import.

To import an entire module:

> `import somemodule`
   
Later:

> `somemodule.somefunction()`

In [1]:
import math

In [2]:
# sqrt() is a function in the math module, but we cannot use it without a prefix
y = sqrt(10)

NameError: name 'sqrt' is not defined

In [3]:
# To reference variables or functions within the module,
# we prefix the variable or function name with the name of the module
# and a period.
x = math.sqrt(10)
print(x)

3.1622776601683795


***
To import a module with an **alias**:

> `import somemodule as sm`

Later:

> `sm.somefunction()`

In [4]:
# This is being imported under the alias np
import numpy as np

In [5]:
# Usually aliases are used to shorten longer module names.
# To reference variables and functions within the module,
# we prefix them with the alias and a period.
x = np.repeat(3, 5)
x

array([3, 3, 3, 3, 3])

In [6]:
# If we try to use the full name of the module when we imported it under an alias, it will not work
z = numpy.repeat(3, 5)
z

NameError: name 'numpy' is not defined

In [7]:
# If we import it again without the alias, then both versions will work
import numpy

In [8]:
z = numpy.repeat(3, 5)
z

array([3, 3, 3, 3, 3])

In [9]:
x2 = np.repeat(3, 5)
x2

array([3, 3, 3, 3, 3])
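
Both names work because an alias is just a second name for the same module object. Python loads a module once and caches it. A minimal sketch using the standard library `statistics` module (chosen here only so the example runs without any installs):

```python
import sys

import statistics as st
import statistics

# Both names point at the same module object, which Python caches in sys.modules.
same_object = st is statistics
cached = sys.modules["statistics"] is st

print(same_object)              # True
print(cached)                   # True
print(st.mean([1, 2, 3, 4]))    # 2.5
```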

***

To import **specific parts** of a module:

> `from somemodule import somefunction`

or

> `from somemodule import anotherfunction, yetanotherfunction`

Later:

> `somefunction()`

> `anotherfunction()`

> `yetanotherfunction()`

In [10]:
from itertools import accumulate

In [11]:
# accumulate() doesn't return a list; it returns an iterator that produces the running totals one at a time
accumulate([1,2,3,4,5])

<itertools.accumulate at 0x7f9fa132e080>

In [12]:
# We can collect every value the iterator produces by passing it to a function that accepts an iterable, like list()
list(accumulate([1,2,3,4,5]))

[1, 3, 6, 10, 15]

Like modules, we can give aliases to the specific pieces we import:

> `from somemodule import somefunction as some_func`

> `from anothermodule import anotherfunction as another_func, yetanotherfunction as yaf`

Later...

> `some_func()`

> `another_func()`

> `yaf()`
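
As a concrete version of the pattern above, here is a short sketch that imports two names from the standard library `math` module under aliases. The alias names `square_root` and `circle_constant` are arbitrary choices for illustration.

```python
from math import sqrt as square_root, pi as circle_constant

# The aliases are used directly, with no module prefix.
root = square_root(16)
area = round(circle_constant * 2 ** 2, 2)  # area of a circle with radius 2

print(root)  # 4.0
print(area)  # 12.57
```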

### Importing Your Own Code

When importing your own code, reference the name of the file without the .py extension.

In order to import from another file, that file's name (everything before the .py file extension) must be a valid python identifier, that is, you could use it as a variable name.
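
One way to check a file name before trying to import it is sketched below. The helper `is_importable_name` is hypothetical; it relies on `str.isidentifier()` and `keyword.iskeyword()` from the standard library, since reserved words such as `class` pass the identifier check but still cannot follow `import`.

```python
import keyword
from pathlib import Path


def is_importable_name(filename):
    """Return True if the file name (minus .py) could be used in an import statement."""
    stem = Path(filename).stem
    return stem.isidentifier() and not keyword.iskeyword(stem)


results = {
    name: is_importable_name(name)
    for name in ["sample_module.py", "my-module.py", "2nd_module.py", "class.py"]
}
print(results)
# {'sample_module.py': True, 'my-module.py': False, '2nd_module.py': False, 'class.py': False}
```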

In [13]:
# In VSCode we created the sample_module.py file. This file contains its own imports, functions, and variables
import sample_module

In [14]:
# We used a function from our module in this notebook. 
# This can also help to keep this notebook looking "cleaner" by storing long code blocks in our module
num = sample_module.add_two(2)

In [15]:
num

4

In [16]:
# If we update our module, rerunning the import sample_module line will not pick up the changes, because Python caches imported modules.
# Restart the kernel, or call importlib.reload(sample_module), to load the updated code.
new_num = sample_module.subtract_two(num)

In [17]:
new_num

2
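
The caching behavior is easier to see in a self-contained sketch. The example below writes a throwaway module (the name `reload_demo_module` is made up for this demonstration) to a temporary directory, edits it on disk, and shows that a second `import` keeps the old code while `importlib.reload` picks up the change.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Write a throwaway module to a temporary directory and make it importable.
tmp_dir = tempfile.mkdtemp()
module_path = Path(tmp_dir) / "reload_demo_module.py"
module_path.write_text("def add_two(n):\n    return n + 2\n")

sys.path.insert(0, tmp_dir)
sys.dont_write_bytecode = True  # demo only: skip .pyc files so edits are always re-read
importlib.invalidate_caches()

import reload_demo_module

before = reload_demo_module.add_two(2)

# Change the file on disk, as if we had edited it in our editor.
module_path.write_text("def add_two(n):\n    return n + 2000\n")

# A second import statement does nothing new: the module is already cached.
import reload_demo_module

still_old = reload_demo_module.add_two(2)

# reload() re-executes the module's source and updates the existing module object.
importlib.reload(reload_demo_module)
after = reload_demo_module.add_two(2)

print(before, still_old, after)  # 4 4 2002
```

Note that names brought in with `from module import name` keep pointing at the old objects after a reload, so restarting the kernel remains the most reliable option.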

In [18]:
# We can import a specific function from our own module
from sample_module import double_the_root as dtr

In [19]:
x = dtr(16)

In [20]:
x

8.0

In [21]:
# We can call upon variables stored in our module by using a prefix
print(sample_module.bootcamp_name)

Codeup


In [22]:
# We can also use a variable without a prefix if we import it by name, and we can give it an alias
from sample_module import super_secret_password_never_share_this_ever as pwd

In [23]:
# Here we use a variable, and anyone viewing our code does not know what our underlying password is
x = pwd.upper()

In [24]:
# We often store passwords or other sensitive credentials in an env.py file
# If that file is listed in our .gitignore, then we won't accidentally push our credentials to GitHub
# This is the farthest someone will get using our code without our env.py file
from env import password as pwd

ModuleNotFoundError: No module named 'env'
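
If we would rather show a helpful message than a traceback when `env.py` is missing, one possible approach is to catch the error. This is only a sketch: the fallback environment variable name `DB_PASSWORD` is an assumption for illustration, not something the lesson's code defines.

```python
import os

try:
    # env.py is the git-ignored credentials file described above.
    from env import password as pwd
except ImportError:
    # ModuleNotFoundError is a subclass of ImportError, so a missing env.py lands here.
    # Fall back to a (hypothetical) environment variable, or None if it is not set.
    pwd = os.environ.get("DB_PASSWORD")

if pwd is None:
    print("No credentials found: create env.py or set DB_PASSWORD")
```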

In [25]:
import sample_module2 as sm2

In [26]:
sm2.bootcamp_name

'CodeupCodeup'

In [27]:
import sample_module as sm

In [28]:
sm.bootcamp_name

'Codeup'

In the above, sample_module2 was a module that imported sample_module and made use of the variables of sample_module. Both modules have variables with the same name, but using prefixes of the module name/alias can identify which variable we are using.

In [29]:
# If we use the same alias more than once, the last use of that alias will be what is stored in our current python instance
from sample_module import bootcamp_name as bn
from sample_module2 import bootcamp_name as bn

In [30]:
bn

'CodeupCodeup'

## Running a module as a script vs. importing

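When a module is imported, all of its top-level code runs, including lines that produce output or change variables. Python sets the special variable `__name__` to `'__main__'` when a file is run directly as a script, and to the module's name when it is imported. If we want to use a module both as a script and as a resource for imports, we can "hide" the code that should not run on import behind this check: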

In [None]:
if __name__ == '__main__':
    main()
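
A fuller sketch of what such a file might look like is shown below. The function names `add_two` and `main` are illustrative. When the file is run directly, `main()` executes and prints a result; when it is imported, only the function definitions are loaded and nothing is printed.

```python
def add_two(n):
    """Return n plus two."""
    return n + 2


def main():
    # Code that should only run when this file is executed as a script.
    print(add_two(2))


if __name__ == "__main__":
    main()
```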