# Modules and scripts

Pitfalls of Jupyter-based development:
- no separation between the implementation of the code (e.g. a class) and its use (e.g. an analysis using that class);
- mix-ups between global and local variables can lead to unintended consequences!
- Jupyter allows cells to be executed in ***any order***, which makes it hard to keep track of what the program is doing: everything may appear to work, but once you restart the kernel and execute the cells in a **linear** fashion, it breaks!

### Modules versus scripts
- Scripts give a sequence of commands to execute
- Modules have code that is designed to be imported and used by another file
- Both are stored as plain text files

In short
- script = code to execute (it does something!)
- module = code to import (class and function definitions)

### Scripts

Let's try making a simple script and running it with `python helloworld.py`, `./helloworld.py` and `python -m helloworld`. The last option treats the script as a module and searches for a module of this name in the module search path.

Notice the sequence `#!/usr/bin/env python3` at the beginning of the script. This is called a shebang, and it makes it possible to run the script as `./helloworld.py` on the command line (on Mac or Linux). You must first make the script executable with `chmod +x helloworld.py`.

In [None]:
import helloworld

Treating this as a module doesn't make too much sense - this is code to be executed, so it is better treated as a script. Note that the execution will only happen on the first import.

In [None]:
import helloworld
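The "only on the first import" behaviour can be checked without relying on `helloworld.py`. The sketch below writes a hypothetical one-line module called `hello_demo` into a temporary folder (both the name and its contents are assumptions made for this illustration), imports it twice, and counts how often its `print` ran. It then uses `importlib.reload`, which explicitly asks Python to execute the module again.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical stand-in for helloworld.py: a single line that prints.
    Path(tmp_dir, "hello_demo.py").write_text('print("Hello, world!")\n')
    sys.path.insert(0, tmp_dir)  # make the temporary folder importable
    captured = io.StringIO()
    try:
        with contextlib.redirect_stdout(captured):
            import hello_demo  # first import: the file is executed
            import hello_demo  # second import: taken from sys.modules, nothing runs
        runs_after_two_imports = captured.getvalue().count("Hello, world!")

        with contextlib.redirect_stdout(captured):
            importlib.reload(hello_demo)  # explicit request to execute the file again
        runs_after_reload = captured.getvalue().count("Hello, world!")
    finally:
        # Clean up so the demonstration leaves no trace in the session.
        sys.path.remove(tmp_dir)
        sys.modules.pop("hello_demo", None)

print(runs_after_two_imports, runs_after_reload)  # 1 2
```

Python keeps every imported module in the `sys.modules` cache, which is why the second `import` statement does not execute the file again.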

We can also invoke the script in Jupyter, in the following way.

In [None]:
!python helloworld.py

But you may have noticed that normally when you import something, nothing is executed.

In [None]:
import math

Let's look at a more complicated script that prints out the Fibonacci sequence. This isn't actually a great script: it mixes code that is executed with a function that would fit better in a module, but for such a simple program this is fine.
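The script itself is not reproduced in this text. A minimal sketch of what such a script might look like is given here; the function name `fibonacci` and the choice of printing ten numbers are assumptions made for illustration, not the course's actual file.

```python
#!/usr/bin/env python3
"""Print the first few Fibonacci numbers (hypothetical sketch)."""


def fibonacci(count):
    """Return a list with the first `count` Fibonacci numbers."""
    sequence = []
    a, b = 0, 1
    for _ in range(count):
        sequence.append(a)
        a, b = b, a + b
    return sequence


# Executed code: this part is what makes the file a script.
for number in fibonacci(10):
    print(number)
```

The function definition is the part that could live in a module, while the loop at the bottom is the part that "does something".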

### Modules

Modules are loaded with the `import` statement. They don't necessarily have to be written in Python (in CPython, for example, the `math` module is implemented in C)! They are helpful for organizing code and making it more reusable.

Scope is an important aspect of modules, which we will see shortly. **Namespace** is an important concept here. Recall the definition (from Wikipedia): a namespace is an abstract container or environment created to hold a logical grouping of unique identifiers or symbols. A namespace groups items (functions/classes/data) together and helps avoid conflicts from repeated names.
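As a small illustration of how namespaces avoid name conflicts, the standard library modules `math` and `cmath` both define a function called `sqrt`. Because each lives in its own module namespace, the two can be used side by side. The variable names below are chosen for this sketch only.

```python
import cmath
import math

# Same identifier, two namespaces: the module name tells them apart.
real_root = math.sqrt(4)       # real-valued square root
complex_root = cmath.sqrt(-4)  # complex-valued square root

print(real_root)                # 2.0
print(complex_root)             # 2j
print(math.sqrt is cmath.sqrt)  # False: two different functions
```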

Modules can be organized into packages, which we won't cover in depth.

In [None]:
import galaxy_mod

We have seen the built-in function `dir()` before - we will use it more today to:

- Check what is in the module
- Check what is in the **local symbol table** (sort of like the local namespace).

In [None]:
dir(galaxy_mod)

In [None]:
dir()

In [None]:
galaxy_mod

Items from the module can be accessed with the dot operator.

In [None]:
print(galaxy_mod.name)

In [None]:
print(galaxy_mod.galaxy_list)

In [None]:
galaxy_mod.print_my_galaxy('NGC 1275')

In [None]:
galaxy_mod.MyGalaxyClass()

But this will not work:

In [None]:
print(galaxy_list)
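The same behaviour can be reproduced with a standard library module, which may help if `galaxy_mod` is not at hand. In the sketch below (assuming a fresh session in which the bare name `pi` has not been defined), the qualified name `math.pi` works, while the bare name raises a `NameError` that is caught and stored in the illustrative variable `error_message`.

```python
import math

print(math.pi)  # qualified name: looked up inside the math namespace

try:
    print(pi)  # bare name: not present in the local symbol table
    error_message = None
except NameError as error:
    error_message = str(error)

print(error_message)
```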

You can import items from the module individually; we have seen this before.

In [None]:
from math import pi

In [None]:
print(pi)

In [None]:
from galaxy_mod import galaxy_list

In [None]:
dir()

In [None]:
print(galaxy_list)

You can also import to alternative names. This can be particularly useful for avoiding overwriting a name in your local symbol table.

In [None]:
name = "Python for Physicists"

In [None]:
dir()

In [None]:
from galaxy_mod import name as name_galaxy_cat

In [None]:
print(name)
print(name_galaxy_cat)

In [None]:
dir()

You can import everything from a module at once, but it is usually not a good idea. Why not?

In [None]:
from galaxy_mod import *

In [None]:
dir()
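One possible answer is that a star import can silently overwrite names you already have. The sketch below star-imports `math` and then `cmath` into a plain dictionary that stands in for a fresh notebook namespace (using `exec` here is only a device to keep the demonstration from cluttering the real namespace; in a notebook you would simply type the two import lines).

```python
import cmath
import math

# A plain dict stands in for a fresh notebook namespace.
namespace = {}

exec("from math import *", namespace)
sqrt_after_math = namespace["sqrt"]

exec("from cmath import *", namespace)  # silently rebinds sqrt, pi, e, ...
sqrt_after_cmath = namespace["sqrt"]

print(sqrt_after_math is math.sqrt)    # True
print(sqrt_after_cmath is cmath.sqrt)  # True: sqrt no longer refers to math.sqrt
print(sqrt_after_cmath(4))             # (2+0j)
```

No warning is issued when the second star import replaces `sqrt`, so code that worked before can start returning complex numbers without any visible change at the call site.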

## Packages

A **package** is a collection of modules. We will work with several packages throughout the rest of the course. Next week we will see the package `numpy`. This package provides the `numpy` module, which offers some basic functionality. Other features of `numpy` are accessible through other modules provided by the same package.

There are several options for importing from a package.
1. `import <packagename>`
2. `import <packagename> as <alias>`
3. `from <packagename> import <modulename>`
4. `from <packagename> import <modulename> as <alias>`
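The four forms listed above can be tried with a package from the standard library, so that nothing needs to be installed. The sketch uses `json`, which is a package containing a `decoder` module; the aliases `js` and `json_decoder` are arbitrary names chosen for this example.

```python
import json                               # 1. import <packagename>
import json as js                         # 2. import <packagename> as <alias>
from json import decoder                  # 3. from <packagename> import <modulename>
from json import decoder as json_decoder  # 4. from <packagename> import <modulename> as <alias>

print(json.dumps({"a": 1}))       # {"a": 1}
print(js is json)                 # True: the alias is just another name
print(decoder is json.decoder)    # True
print(json_decoder is decoder)    # True
print(hasattr(json, "__path__"))  # True: packages carry a __path__ attribute
```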

In [None]:
import numpy

# This is the main module of the package.
print(type(numpy))

In [None]:
# These are symbols contained in the main module.
print(type(numpy.ndarray))
print(type(numpy.array))

In [None]:
# This is a module accessible through the main module.
print(type(numpy.random))

In [None]:
# This is a symbol contained in the random module
print(type(numpy.random.random))

The way a package makes its functionality accessible through its main module is based on a chain of `import` statements.
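To make the idea of an import chain concrete, the sketch below builds a tiny hypothetical package called `shapes_demo` in a temporary folder. Its main module is the file `__init__.py`, which imports `area` from the submodule `circle`, so that users can write `shapes_demo.area` directly. All names here are invented for illustration; this is not how `numpy` is actually laid out, only the same mechanism on a small scale.

```python
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    package_dir = Path(tmp_dir, "shapes_demo")
    package_dir.mkdir()

    # Submodule holding the actual implementation.
    (package_dir / "circle.py").write_text(
        "import math\n"
        "def area(radius):\n"
        "    return math.pi * radius ** 2\n"
    )
    # __init__.py is the package's main module; this import re-exports area.
    (package_dir / "__init__.py").write_text("from .circle import area\n")

    sys.path.insert(0, tmp_dir)  # make the temporary folder importable
    try:
        import shapes_demo

        same_function = shapes_demo.area is shapes_demo.circle.area
        unit_area = shapes_demo.area(1.0)
    finally:
        # Clean up so the demonstration leaves no trace in the session.
        sys.path.remove(tmp_dir)
        for name in ("shapes_demo", "shapes_demo.circle"):
            sys.modules.pop(name, None)

print(same_function)  # True: both names refer to one function
print(unit_area)      # 3.141592653589793
```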

Packages that can be installed with `pip install package_name` are published on [PyPI](https://pypi.org/). You can also install packages with conda using `conda install package_name`. You may also learn how to write your own private package and install it locally. Let's look at a simple example.

### Best practices with modules and packages
- avoid using `import *`;
- if you plan to use only a few items from the module in specific places, use `from module import SomeClass as alias`;
- if you plan to use many features all the time, import the module with a short alias, e.g. `import numpy as np`;
- you may store "constants" in modules but try not to store variables!
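The last point can be illustrated with a short sketch. A module is loaded once and then shared by every part of the program that imports it, so a variable stored in a module behaves like hidden global state. The module `settings_demo` below is hypothetical and is built in memory so that the example needs no extra file; the two aliases stand in for imports made from two different files.

```python
import sys
import types

# Hypothetical module built in memory for this demonstration.
module = types.ModuleType("settings_demo")
module.SPEED_OF_LIGHT = 299_792_458  # a constant (m/s): safe to share
module.threshold = 0.5               # a variable: risky to share
sys.modules["settings_demo"] = module

import settings_demo as analysis_view  # stands in for an import in one file
import settings_demo as plotting_view  # stands in for an import in another file

analysis_view.threshold = 0.9   # one part of the program changes the value ...
print(plotting_view.threshold)  # 0.9: ... and every other part sees the change

del sys.modules["settings_demo"]  # clean up the demonstration module
```

Both aliases refer to the same module object, so there is only one `threshold`; a change made in one place is visible everywhere, which makes bugs hard to trace.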