# Modules and Packages

### Pip install and PyPI

* PyPI (the Python Package Index) is a repository for open-source third-party Python packages.
* It's similar to RubyGems for Ruby, the File Exchange for MATLAB, and so on.
* So far, we have only used libraries that come pre-installed with Python.
* However, there are many other open-source libraries available, and they are shared on PyPI.
* We can use the command <b>pip install</b> at the command line to install these packages.

To download and install a package, type "pip install [package name]" into a command line (such as cmd).
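Once pip has finished, it can be handy to confirm from inside Python that the package is actually available. The sketch below is one way to do that with the standard library only; the helper names `is_installed` and `installed_version` are made up for this example, and `installed_version` expects the name used on PyPI, which is not always the same as the name you import.

```python
import importlib.util
from importlib import metadata


def is_installed(module_name):
    """Return True if Python can locate a top-level module with this name."""
    return importlib.util.find_spec(module_name) is not None


def installed_version(distribution_name):
    """Return the installed version string, or None if the package is not installed."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


# "json" ships with Python, so it should always be found.
print(is_installed("json"))
# openpyxl is third-party; this prints None until it has been pip-installed.
print(installed_version("openpyxl"))
```

If the second line prints None, run the pip install command first and try again.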

### Googling to search for packages

For example, Google "python package for excel". One of the packages this search turns up is openpyxl. To install and try it:

* Type "pip install openpyxl"
* Type "python"
* Type "import openpyxl"
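As a rough illustration of what you can do once the import succeeds, the sketch below writes a couple of rows to a temporary .xlsx file and reads them back. It assumes openpyxl is installed and simply returns None if it is not; the helper name `round_trip`, the file name, and the sample values are all invented for this example.

```python
import os
import tempfile

try:
    from openpyxl import Workbook, load_workbook
except ImportError:  # openpyxl is third-party, so it may not be installed yet
    Workbook = load_workbook = None


def round_trip(rows):
    """Write rows to a temporary .xlsx file and read them back.

    Returns None when openpyxl is not installed.
    """
    if Workbook is None:
        return None
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "example.xlsx")  # hypothetical file name
        workbook = Workbook()
        sheet = workbook.active
        for row in rows:
            sheet.append(row)  # each list becomes one spreadsheet row
        workbook.save(path)

        loaded = load_workbook(path)
        result = [list(r) for r in loaded.active.iter_rows(values_only=True)]
        loaded.close()
        return result


# Made-up sample data, only for illustration.
print(round_trip([["depth_m", "temp_C"], [10, 12.5]]))
```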

### Creating unique modules and packages

* Modules are just .py scripts that you can call in another .py script
* Packages are just a collection of modules

So, the next steps of the tutorial are in Sublime Text (and the associated scripts can be found in the Modules and Packages section of the Bootcamp Jupyter Notebook). In essence, a user can create a series of scripts (saved as .py files) in Python and then a master script that <i>imports</i> these scripts as modules.

In [2]:
#I haven't done this, but say we create a script called "mymodule.py" which contains a function my_func
from mymodule import my_func

my_func()

ModuleNotFoundError: No module named 'mymodule'

This failed to run because no file named "mymodule.py" exists where Python looks for modules (I did not create one for this notebook). But this is the general syntax for importing a function from any module in Python.
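To see the import succeed without leaving Python, one option is to write the module file from code. This is only a sketch: it creates a throwaway folder, writes a hypothetical "mymodule.py" containing a `my_func` that returns a string (my own choice of body), and adds the folder to `sys.path` so Python can find it.

```python
import importlib
import os
import sys
import tempfile

# Create a throwaway folder and write a hypothetical mymodule.py into it.
folder = tempfile.mkdtemp()
with open(os.path.join(folder, "mymodule.py"), "w") as handle:
    handle.write(
        "def my_func():\n"
        "    return 'Hello from mymodule'\n"
    )

# Python only finds modules in folders listed on sys.path.
sys.path.insert(0, folder)
importlib.invalidate_caches()

from mymodule import my_func

message = my_func()
print(message)
```

In everyday use you would simply save "mymodule.py" in the same folder as the script that imports it, and no `sys.path` changes are needed.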

In [4]:
#To create a Package, you first create a folder structure. 
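#In each level of the structure, create a file called "__init__.py" (it can be empty). 
#This file is KEY because it marks the folder as a regular package. 
#(Python 3.3+ can also import folders without it as namespace packages, but including it is the conventional choice.)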
#Then, create modules in each level of the file structure. 

In [7]:
#To import from main and subdirectories, use the following syntax:
from MyMainPackage import some_main_script
from MyMainPackage.SubPackage import my_subscript

ModuleNotFoundError: No module named 'MyMainPackage'

You can foresee creating a folder of, say, oceanographic processing scripts (such as removing atmospheric pressure from a time series, computing sound speed in water, or computing wave statistics) with subfolders for each type or application of script. Then you would import the modules within those folders to run the processing in a main script.
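The package layout described above can be sketched entirely from Python, which makes it possible to try the imports in a notebook. The folder and module names match the import statements shown earlier; the functions `report_main` and `report_sub` are hypothetical placeholders, and the temporary folder is just a stand-in for wherever you would keep a real package.

```python
import importlib
import os
import sys
import tempfile

# Build the folder structure: MyMainPackage/ with a nested SubPackage/.
root = tempfile.mkdtemp()
main_pkg = os.path.join(root, "MyMainPackage")
sub_pkg = os.path.join(main_pkg, "SubPackage")
os.makedirs(sub_pkg)

# Each level gets an empty __init__.py plus one module.
files = {
    os.path.join(main_pkg, "__init__.py"): "",
    os.path.join(sub_pkg, "__init__.py"): "",
    os.path.join(main_pkg, "some_main_script.py"):
        "def report_main():\n    return 'main script'\n",
    os.path.join(sub_pkg, "my_subscript.py"):
        "def report_sub():\n    return 'sub script'\n",
}
for path, source in files.items():
    with open(path, "w") as handle:
        handle.write(source)

# Make the folder that CONTAINS the package visible to Python.
sys.path.insert(0, root)
importlib.invalidate_caches()

from MyMainPackage import some_main_script
from MyMainPackage.SubPackage import my_subscript

main_result = some_main_script.report_main()
sub_result = my_subscript.report_sub()
print(main_result, sub_result)
```

Note that `sys.path` points at the parent folder of MyMainPackage, not at the package folder itself.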

For what it's worth, I have been advised in the past not to create many nested scripts when doing data processing, because this gets very confusing to debug when something changes, such as a different data type or a software update.

### __name__ and "__main__"

* Sometimes you want to know whether a module's code is being imported by another script or whether the module's own .py file is being run directly
    * Be sure to check the "Explanation.txt" in this section of the Bootcamp's Jupyter-Notebook!
* In Python, all code at indentation level 0 runs whenever the file is executed, whether it is run directly or imported.
* There is a built-in variable called __name__. It is set to "__main__" when the file is run directly, and to the module's name when the file is imported.
* So, scripts can be constructed with a series of def func(): definitions, followed by an if statement that compares __name__ to "__main__".

In [8]:
#For example
def func():
    print("func() in script.py")

print("This is level 0 indentation")

if __name__ == '__main__': #This checks to see if the script has been run directly or has been imported.
    print("script.py is being run directly!")
else:
    print("script.py has been imported!")  
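To see both branches in action, the sketch below saves a similar script under the hypothetical name "name_demo.py", runs it once as a script in a separate Python process, and then imports it. It assumes Python 3.7 or later (for `capture_output`).

```python
import importlib
import os
import subprocess
import sys
import tempfile

folder = tempfile.mkdtemp()
script_path = os.path.join(folder, "name_demo.py")  # hypothetical file name
with open(script_path, "w") as handle:
    handle.write(
        "def func():\n"
        "    return 'func() in name_demo.py'\n"
        "\n"
        "if __name__ == '__main__':\n"
        "    print('run directly, __name__ is', __name__)\n"
        "else:\n"
        "    print('imported, __name__ is', __name__)\n"
    )

# Case 1: run the file as a script in a separate Python process.
completed = subprocess.run(
    [sys.executable, script_path],
    capture_output=True, text=True, check=True,
)
direct_output = completed.stdout.strip()
print(direct_output)

# Case 2: import the same file as a module (the else branch runs).
sys.path.insert(0, folder)
importlib.invalidate_caches()
import name_demo

imported_name = name_demo.__name__
print(imported_name)
```

The same file reports "__main__" when run directly and its own module name when imported, which is why the if statement is a common way to keep test or demo code from running on import.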