# Introduction to Python for hydrologists &mdash; namespaces, modules, and packages#

There are a variety of ways to import existing code into a Python script or interactive session.

There is a lot of flexibility in how this is done, but a few suggested practices will be covered here.

In [None]:
import this

In the above Easter egg, we can learn a couple of things. First, the last line highlights that namespaces are important!

Also, importing `this` actually executed some code (printing out the Zen of Python). This means Python knew where to find a module called `this` and ran it upon import.

## Namespaces##
There's a nice explanation of namespaces [here](http://bytebaker.com/2008/07/30/python-namespaces/). 

First, we need to understand what a _name_ is in Python. A name is a general label referencing something. Like in many languages, think of a variable:


In [None]:
a=5
a

But in Python, we can also use a name for a function.

In [None]:
def funky(description):
    print ('this {0} function is funky!'.format(description))
    

In [None]:
funky

In [None]:
funky('Town')

In [None]:
f = funky
f

So the name `f` now points, in a sense, to the same function as `funky`.

Names (and therefore variables) can point to objects of various types and can be reused without any declaration.
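
To make the "two names, one object" idea concrete, here is a minimal sketch. The function `funky_message` is a hypothetical variant of `funky` that returns its text instead of printing it, so the result is easy to compare; it is not part of the original lesson.

```python
def funky_message(description):
    """Hypothetical variant of funky() that returns the text instead of printing it."""
    return 'this {0} function is funky!'.format(description)

g = funky_message              # bind a second name to the same function object

print(g is funky_message)      # True: both names refer to one object
print(g.__name__)              # funky_message: the object keeps its original name
print(g('Town') == funky_message('Town'))  # True: calling either name runs the same code
```

The `is` operator checks object identity, which is exactly the question "do these two names point to the same thing?".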

In [None]:
a=5
print (a)
a = [12.3, 44.9]
print (a)
a = 'stuff in quotes'
print (a)

So, a namespace is just a space containing all the names in use during a Python session, together with the objects they point to. 
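
One way to look inside a namespace is through the built-in functions `globals()` and `locals()`, which return dictionaries mapping names to objects. The sketch below assumes it is run at the top level of a script or notebook; the names `discharge`, `double`, and `scratch` are made up for illustration.

```python
discharge = 12.5                      # hypothetical variable name, for illustration only

def double(x):
    scratch = 2 * x                   # 'scratch' exists only in this function's local namespace
    return scratch, sorted(locals())  # report the local names while we are still inside

result, local_names = double(discharge)

print('discharge' in globals())       # top-level names are keys in the globals() dictionary
print(globals()['discharge'])         # looking up the key returns the object the name points to
print(local_names)                    # ['scratch', 'x']
print('scratch' in globals())         # the local name never leaked into the top-level namespace
```

Each function call gets its own small namespace, which is why `scratch` does not collide with any variable of the same name elsewhere.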

__An important caution with names__
Since you can think of a name as a tag attached to an object, mutable objects such as lists behave in a way that can cause massive grief!

First, what happens when a single value is associated with a name (like a variable)?

In [None]:
a = 5
b = a
print (a)
print (b)
b = 6
print (a)
print (b)

This works as expected: pointing `b` at 6 leaves `a` alone. But now let `a` be a list and change an element through `b`...

In [None]:
a=[1.0, 2.0, 3.5, 4.9]
print ('a is {0}'.format(a))
b=a
print ('b is {0}'.format(b))
print ('_'*15)
b[2]=9999999
print ('a is {0}'.format(a))
print ('b is {0}'.format(b))

Oh no! Changing `b` also changed `a`! The reason is that `a` and `b` both point to the same location in memory, which stores the list (starting as `[1.0, 2.0, 3.5, 4.9]` and later becoming `[1.0, 2.0, 9999999, 4.9]`). The same behavior happens with `numpy` arrays.

The way around this is to make a copy of the information, so that the new name points to a separate object rather than to the same one. In plain Python, this means importing a module called `copy` and using either `copy.copy` (a shallow copy, which duplicates only the outer container) or `copy.deepcopy` (which also duplicates any nested objects). In `numpy`, arrays have a built-in `copy` method.
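
The difference between `copy.copy` and `copy.deepcopy` only shows up with nested containers. The following is a small sketch using a made-up nested list called `layers`; the values are arbitrary.

```python
import copy

layers = [[1.0, 2.0], [3.5, 4.9]]   # hypothetical nested list, e.g. values per model layer

shallow = copy.copy(layers)         # new outer list, but the inner lists are shared
deep = copy.deepcopy(layers)        # new outer list and new inner lists

layers[0][0] = -999.0               # modify an inner list through the original name

print(shallow[0][0])                # -999.0: the shallow copy still sees the change
print(deep[0][0])                   # 1.0: the deep copy is fully independent
```

For a flat list of numbers the two functions behave the same, so `copy.copy` is enough there.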

In [None]:
import copy
a = [1,2,3]
b = copy.copy(a)
b[2] = 99
print (a)
print (b)

In [None]:
import numpy as np
a = np.array([1,2,3])
print (a)
b = a
b[1]=99
print (a)
print (b)

In [None]:
b = a.copy()
b[0] = -9999
print (a)
print (b)

### Here's a brutally insidious example of how this can cause serious trouble!

In [None]:
import numpy as np
rech_list = []
nper = 3
rech = np.zeros((4, 4), dtype=float)  # the np.float alias was removed in NumPy 1.24; use the builtin float
for kper in range(nper):
    rech += np.random.random((4,4))
    rech_list.append(rech)
for crech in rech_list:
    print (crech)

The fix is a little ugly and kludgy, but all we have to do is make a copy to break the connection to the old memory location.

In [None]:
import numpy as np
rech_list = []
nper = 3
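rech = np.zeros((4, 4), dtype=float)  # the np.float alias was removed in NumPy 1.24; use the builtin float
for kper in range(nper):
    # copy first, so the in-place += below acts on a new array
    # and leaves the arrays already stored in rech_list untouched
    rech = rech.copy()
    rech += np.random.random((4,4))
    rech_list.append(rech)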
for crech in rech_list:
    print (crech)
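
The same trap can be reproduced without `numpy`, which makes it easier to check by hand. This is only a sketch with plain lists and made-up names (`running`, `aliased`, `snapshots`); it mirrors the recharge loop above rather than replacing it.

```python
# Version 1: append the same list object every time (the trap)
running = [0.0, 0.0]
aliased = []
for kper in range(3):
    for i in range(len(running)):
        running[i] += 1.0            # modifies the one existing list in place
    aliased.append(running)          # stores another reference to that same list

# Version 2: append an independent copy every time (the fix)
running = [0.0, 0.0]
snapshots = []
for kper in range(3):
    for i in range(len(running)):
        running[i] += 1.0
    snapshots.append(list(running))  # list(...) builds a new list with the current values

print(aliased)      # [[3.0, 3.0], [3.0, 3.0], [3.0, 3.0]]
print(snapshots)    # [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
print(all(item is aliased[0] for item in aliased))  # True: every entry is the same object
```

Checking entries with `is` is a quick diagnostic whenever every element of a list looks suspiciously identical.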

## Modules, Packages, and the Standard Python Library##
The [Standard Python Library](https://docs.python.org/3/library/) is the collection of modules that ships with Python by default.

More technically, names point to "objects". A "module" is a file (with extension `.py`) that contains Python code. If there are functions in that code, they can be accessed using the name of the module and a dot (`.`). 
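
To see that a module really is just a `.py` file, the sketch below writes a tiny one into a temporary directory and imports it. The module name `hydro_demo`, the function `cfs_to_cms`, and the constant `FT3_TO_M3` are all invented for this illustration, and the temporary directory is added to the search path only for the duration of the import.

```python
import importlib
import os
import sys
import tempfile

# Source code for a hypothetical module; in practice this would be a file you write by hand
source = (
    "FT3_TO_M3 = 0.0283168\n"
    "\n"
    "def cfs_to_cms(q_cfs):\n"
    "    return q_cfs * FT3_TO_M3\n"
)

with tempfile.TemporaryDirectory() as module_dir:
    with open(os.path.join(module_dir, "hydro_demo.py"), "w") as f:
        f.write(source)
    sys.path.insert(0, module_dir)       # let Python find the new file
    importlib.invalidate_caches()        # make sure the import system notices it
    try:
        import hydro_demo
    finally:
        sys.path.remove(module_dir)      # restore the search path

# Functions and variables are reached with the module name and a dot
print(hydro_demo.cfs_to_cms(100.0))
print(hydro_demo.FT3_TO_M3)
```

Once imported, the module object stays available under its name even though the temporary file is gone.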

Packages are nested modules and are often "installed" to be accessible to Python from anywhere. More on that at the end of the lesson.

Let's import a module and find a function within it.

In [None]:
import numpy


In [None]:
import numpy
numpy.sqrt

In [None]:
numpy.sqrt(3)

## Importing code and handling namespaces##
There are several main ways to import a module. 

The most straightforward way is to just use `import numpy` as we did above.

In [None]:
import numpy
numpy

This then shows that `numpy` is a module. Whenever you want to use a function from numpy, you just use the dot like `numpy.sqrt`.

The main advantage to this approach is you always know the provenance of any function. Also, you could (bad idea!) make your own function called `sqrt`. 

In [None]:
def sqrt(numb):
    print ('my dumb function called sqrt actually just ' + \
    'prints the number you provided--> {0}!'.format(numb))
sqrt


In [None]:
numpy.sqrt

In [None]:
sqrt(3)

In [None]:
numpy.sqrt(3)

Another option is to import only the functions you need from a module, like `from numpy import sqrt`. The problem here is that we no longer see where `sqrt` came from at the point where it is used. Whichever `sqrt` was imported or created most recently gets that name in the namespace. DANGER!
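
The "most recent wins" rule can be checked in a few lines. This sketch uses `sqrt` from the standard-library `math` module as a stand-in for `numpy.sqrt`, purely so that it runs without extra packages; the replacement function is deliberately silly.

```python
from math import sqrt            # the name 'sqrt' now points to math.sqrt

first = sqrt(9)

def sqrt(numb):                  # defining a function rebinds the name 'sqrt'
    return 'not a square root: {0}'.format(numb)

second = sqrt(9)

from math import sqrt            # importing again rebinds the name once more

third = sqrt(9)

print(first)     # 3.0
print(second)    # not a square root: 9
print(third)     # 3.0
```

Nothing warns you when a name is rebound, which is why the bare name `sqrt` tells a reader so little about what will actually run.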

In [None]:
from numpy import sqrt
sqrt

You can also use an alias to import a specific function, like `from numpy import sqrt as npsquare_root`. In this case, and in the case above, you can get the provenance from the import statements at the top of the code, but if the code gets really long, this can be hard to keep track of.

In [None]:
from numpy import sqrt as npsquare_root
npsquare_root

This is like pointing the name `f` at `funky` above.

Living really dangerously, you can import all functions from a module like `from numpy import *`.

In [None]:
import numpy
numpy.sqrt

In [None]:
from numpy import *
sqrt

The problem here is, you now have access to all these functions, but you also don't know provenance at all. Some modules, like `numpy`, are large and have many functions (many of which may have common names that you might use yourself and that you might not be aware of).

So, really, the safest way is the first one, but that can get long. For example, if you use `import matplotlib`, then every time you use a function from the module you have to type `matplotlib.<some function>`, and that gets verbose. A compromise is importing an entire module but assigning it an alias, like `import numpy as np`.
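
To get a feel for how much a star import drags in, the sketch below runs `from math import *` inside a scratch dictionary instead of the real namespace, so nothing here is actually clobbered. It uses the standard-library `math` module as a small stand-in for `numpy`; the variable names are made up.

```python
import builtins

scratch = {}                                   # a throwaway namespace
exec("from math import *", scratch)            # star import lands in 'scratch', not in our own names

imported = sorted(n for n in scratch if not n.startswith("__"))
shadowed = [n for n in imported if hasattr(builtins, n)]

print(len(imported))              # how many names one line brought in
print(shadowed)                   # names that would hide a Python built-in
print(pow(2, 3))                  # 8: the built-in pow keeps integers
print(scratch["pow"](2, 3))       # 8.0: math.pow always returns a float
```

Even the comparatively small `math` module silently replaces the built-in `pow`, and a larger module offers many more chances for this kind of collision.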

In [None]:
import numpy as np
np.sqrt

There is a commonly accepted set of aliases for some common scientific computing modules that we recommend: 

* `import matplotlib.pyplot as plt`
* `import numpy as np`
* `import matplotlib as mpl`
* `import pandas as pd`

In addition to keeping the provenance straight, adopting this protocol helps make your code more readable by other people (see Zen above!).

In [None]:
import this

## More on Packages##

Packages are nested modules, each level of which is accessed by using a dot (`.`). In some cases, submodules from a package must be imported separately like `matplotlib.pyplot`. This behavior is at the discretion of the programmer of the package. In the case of `matplotlib`, importing *everything* would be big and sometimes unnecessary. So, importing the highest level named package just gets you some basic parameters associated with the package overall, but functional capabilities need to be imported individually.
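
The need to import a submodule separately can be reproduced with a throwaway package. This is a sketch only: the package `demo_pkg`, its submodule `routing`, and the function `lag` are invented for illustration and built in a temporary directory.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as root:
    # A package is a directory containing an __init__.py file
    pkg_dir = os.path.join(root, "demo_pkg")
    os.mkdir(pkg_dir)
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
        f.write("VERSION = '0.1'\n")           # the top level holds only a basic parameter
    with open(os.path.join(pkg_dir, "routing.py"), "w") as f:
        f.write("def lag(q):\n    return [0.0] + list(q[:-1])\n")

    sys.path.insert(0, root)
    importlib.invalidate_caches()
    try:
        import demo_pkg
        before = hasattr(demo_pkg, "routing")  # submodule not loaded yet
        import demo_pkg.routing
        after = hasattr(demo_pkg, "routing")   # now reachable through the dot
    finally:
        sys.path.remove(root)

print(before, after)                           # False True
print(demo_pkg.VERSION)                        # 0.1
print(demo_pkg.routing.lag([1.0, 2.0, 3.0]))   # [0.0, 1.0, 2.0]
```

Because this `__init__.py` does not import `routing` itself, the submodule only appears after it is imported explicitly; a package author could choose otherwise.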

Here's an example using matplotlib.

In [None]:
import matplotlib as mpl
mpl

In [None]:
try:
    mpl.pyplot
except Exception as e:
    print(e)

In [None]:
import matplotlib.pyplot as plt


# Paths for importing and installation#

From the official [documentation](https://docs.python.org/2/tutorial/modules.html), the order in which Python searches for modules and packages is:

* the directory containing the input script (or the current directory).
* PYTHONPATH (a list of directory names, with the same syntax as the shell variable PATH).
* the installation-dependent default.

`PYTHONPATH` is an environment variable. On Windows, changing it at the system level requires an administrative account, although it can also be set for a single user account. You can see your search path using the built-in `sys` module.

In [None]:
import sys
sys.path

The entry ending in `site-packages` is the location where many packages get installed.
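
To find out where a particular module was loaded from, you can inspect its `__file__` attribute, and the standard-library `sysconfig` module reports the installation's `site-packages` directory. This is a small sketch using the standard-library `json` package as the example; the paths printed will differ from machine to machine.

```python
import json
import os
import sys
import sysconfig

module_file = json.__file__                        # the file Python actually imported
site_packages = sysconfig.get_paths()["purelib"]   # where pure-Python packages get installed

print(module_file)
print(site_packages)
print(len(sys.path))                               # number of directories on the search path

# Show which search-path entry (if any) contains the json package
package_parent = os.path.dirname(os.path.dirname(module_file))
print(package_parent in sys.path)
```

Checking `__file__` is a handy first step when Python seems to be importing a different copy of a module than the one you expected.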



## Where does all the code live?

[PyPI](https://pypi.python.org/pypi) is the Python Package Index.

[GitHub](http://github.com) is a huge interactive repository of code in many languages.

[Anaconda](http://docs.continuum.io/anaconda/pkgs.html) lists the packages available through Anaconda.

To install new packages with Anaconda, you can use `conda install <package name>`. 

You can also use `pip install <package name>`.
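
Before installing anything, it can be useful to check whether a package is already importable. The sketch below uses `importlib.util.find_spec` from the standard library; the helper name `is_installed` and the deliberately fake package name are made up for illustration.

```python
import importlib.util

def is_installed(package_name):
    """Return True if Python can find a top-level module or package with this name."""
    return importlib.util.find_spec(package_name) is not None

for name in ("json", "numpy", "not_a_real_package_xyz"):
    status = "found" if is_installed(name) else "missing"
    print("{0}: {1}".format(name, status))
```

A "missing" result means the name is not on the current search path, which is the case that `conda install` or `pip install` is meant to fix.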

More about this later.