In [1]:
# For interactive plots, comment the next line
%pylab inline
# For interactive plots, uncomment the next line
# %pylab ipympl
import warnings
warnings.filterwarnings('ignore')

Populating the interactive namespace from numpy and matplotlib


# Introduction
For instructions on using Jupyter notebooks, see the [README.md](../../README.md) file. 

This notebook describes some Python basics, along with specifics about the PODPAC library. We will go over:

* How to import libraries in Python
* The structure of the PODPAC library
* Basic Python language features such as indexing and class inheritance
* Creating a MATLAB-like environment in Python using the `Numpy` and `Matplotlib` libraries

# Importing modules
* Unlike MATLAB, Python libraries need to be `imported` before they can be used
* Imported libraries usually have a namespace
* Portions of libraries can be imported

## Examples

In [2]:
import podpac                     # Import PODPAC with the namespace 'podpac'
import podpac as pc               # Import PODPAC with the namespace 'pc'
from podpac import Coordinates    # Import Coordinates from PODPAC into the main namespace

# The following is not generally recommended because it is difficult to know where functions/classes/etc
# come from without the namespace (makes code harder to read). Also, if importing multiple packages using
# this approach, any functions/classes/etc that have the same name will be over-written by the last 
# imported package. 
from podpac import *              # Import everything made available in the public API
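
The overwriting problem described in the comments above can be reproduced with two standard-library modules that both define `sqrt`. This is a minimal sketch; `math` and `cmath` are used only for illustration and are not part of PODPAC.

```python
import math
import cmath

from math import sqrt    # 'sqrt' now refers to math.sqrt
from cmath import sqrt   # 'sqrt' is silently rebound to cmath.sqrt

# Without a namespace, only the last imported 'sqrt' is reachable
print(sqrt is cmath.sqrt)

# With namespaces, both functions remain available and unambiguous
real_root = math.sqrt(4.0)
complex_root = cmath.sqrt(-4.0)
print(real_root, complex_root)
```

A wildcard import does the same rebinding for every public name in the module, which is why the namespaced forms are usually preferred.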

# PODPAC Library Structure
PODPAC is composed of multiple sub-modules/sub-libraries. The major ones, from a user's perspective, are shown below. 
<img src='Images/podpac-user-api.png' style='width:80%; margin-left:auto;margin-right:auto;' />


We can examine what's in the PODPAC library by using the `dir` function.

In [3]:
dir(podpac)

['Coordinates',
 'Node',
 'NodeException',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__path__',
 '__spec__',
 '__version__',
 'algorithm',
 'authentication',
 'clinspace',
 'compositor',
 'coordinates',
 'core',
 'crange',
 'data',
 'datalib',
 'interpolators',
 'pipeline',
 'settings',
 'version',
 'version_info']

Anything that starts and ends with a double underscore (a "dunder" name, `__<attr>__`) is an internal Python attribute and can be ignored. 
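
To hide these internal names when exploring a module, the output of `dir` can be filtered. The sketch below uses the standard-library `json` module as a stand-in; the same list comprehension works for any imported module.

```python
import json  # stand-in module for illustration; any imported module works

# Keep only the names that do not start with an underscore
public_names = [name for name in dir(json) if not name.startswith('_')]
print(public_names)
```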

In PODPAC, the top-level classes and functions are frequently used and include:
* `Coordinates`: class for defining coordinates
* `Node`: Base class for defining a PODPAC compute pipeline
* `NodeException`: The error type thrown by Nodes
* `clinspace`: A helper function used to create uniformly spaced coordinates based on the number of points
* `crange`: Another helper function used to create uniformly spaced coordinates based on step size
* `settings`: A module with various settings that define caching behavior, login credentials, etc.
* `version_info`: Python dictionary giving the version of the PODPAC library
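
The difference between spacing by number of points and spacing by step size can be sketched in plain Python. The helpers `spaced_by_count` and `spaced_by_step` are hypothetical names written for this illustration; they are not the PODPAC functions, and their handling of the end point is an assumption of this sketch.

```python
def spaced_by_count(start, stop, count):
    """Return `count` evenly spaced values from start to stop (inclusive)."""
    if count == 1:
        return [float(start)]
    step = (stop - start) / (count - 1)
    return [start + i * step for i in range(count)]


def spaced_by_step(start, stop, step):
    """Return values from start towards stop, `step` apart."""
    # The small tolerance keeps the end point when it lies on the grid
    n = int((stop - start) / step + 1e-9) + 1
    return [start + i * step for i in range(n)]


by_count = spaced_by_count(0.0, 1.0, 5)   # the number of points is fixed
by_step = spaced_by_step(0.0, 1.0, 0.25)  # the spacing is fixed
print(by_count)
print(by_step)
```

Both calls produce the same five values here; they differ in which quantity stays fixed when the start or stop changes.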

The top-level modules or sub-packages (or sub libraries) include: 
* `algorithm`: here you can find generic `Algorithm` nodes to do different types of computations
* `authentication`: this contains utilities to help authenticate users to download data
* `compositor`: here you can find nodes that help to combine multiple data sources into a single node
* `coordinates`: this module contains additional utilities related to creating coordinates
* `core`: this is where the core library is implemented, and follows the directory structure of the code
* `data`: here you can find generic `DataSource` nodes for reading and interpreting data sources
* `datalib`: here you can find domain-specific `DataSource` nodes for reading data from specific instruments, studies, and programs
* `interpolators`: this contains classes for dealing with automatic interpolation
* `pipeline`: this contains generic `Pipeline` nodes which can be used to share and re-create PODPAC processing routines

Let's look at what's available in some of these submodules:

In [4]:
# Generic Algorithm nodes
dir(podpac.algorithm)

['Algorithm',
 'Arange',
 'Arithmetic',
 'Convolution',
 'CoordData',
 'Count',
 'DayOfYear',
 'ExpandCoordinates',
 'GroupReduce',
 'Kurtosis',
 'Max',
 'Mean',
 'Median',
 'Min',
 'Reduce',
 'Reduce2',
 'SelectCoordinates',
 'SinCoords',
 'Skew',
 'SpatialConvolution',
 'StandardDeviation',
 'Sum',
 'TimeConvolution',
 'Variance',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__']

In [5]:
# Generic DataSource nodes
dir(podpac.data)

['Array',
 'CSV',
 'DataSource',
 'H5PY',
 'INTERPOLATION_DEFAULT',
 'INTERPOLATION_METHODS',
 'INTERPOLATION_SHORTCUTS',
 'Interpolation',
 'InterpolationException',
 'PyDAP',
 'Rasterio',
 'ReprojectedSource',
 'S3',
 'WCS',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 'interpolation_trait']

In [6]:
# Specific datasources
dir(podpac.datalib)

['SMAP',
 'SMAPBestAvailable',
 'SMAPPorosity',
 'SMAPProperties',
 'SMAPSource',
 'SMAPWilt',
 'SMAP_PRODUCT_MAP',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__path__',
 '__spec__',
 'smap']

In [7]:
# Nothing here yet
# dir(podpac.alglib)

# Basic Python language features
* Python uses zero indexing

In [8]:
alist = [1, 2, 3, 4]
alist[0] 

1
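
Zero-based indexing also affects slicing and counting from the end of a list. A short sketch with the same list:

```python
alist = [1, 2, 3, 4]

first = alist[0]       # first element
last = alist[-1]       # negative indices count from the end
middle = alist[1:3]    # slices include the start index and exclude the stop index
count = len(alist)     # valid indices run from 0 to count - 1

print(first, last, middle, count)
```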

* Python is dynamically typed: a variable name can be rebound to a value of a different type

In [9]:
mytype = 'is now a string'  # variable mytype is a string
mytype = 154147             # variable mytype is now an integer
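
A variable name can be rebound to objects of different types, but each object keeps a definite type that Python checks at run time. A minimal sketch:

```python
value = 'is now a string'
print(type(value).__name__)

value = 154147
print(type(value).__name__)

# Types are still enforced at run time: adding a str and an int raises TypeError
try:
    result = 'abc' + 1
except TypeError as error:
    result = None
    print('TypeError:', error)
```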

* Python is object oriented, supporting class inheritance

In [10]:
# define a class
class MyClass(object):  # Inherits from the standard Python object (new-style class)
    my_class_integer = 0  # Class attribute (immutable): assigning to it on an instance creates an instance-level value
    my_class_list = [1]   # Class attribute (mutable): in-place changes are shared among all instances
    
    # This is the class constructor
    def __init__(self, my_class_instance_list=None):
        self.my_class_instance_list = my_class_instance_list # This is an instance variable

# Define a child class that inherits from MyClass
class MyChildClass(MyClass): 
    my_child_class_str = 'A string'  # Add a new attribute
    my_class_integer = 1  # Overwrite the value from the base class
    
# Create an instance of each class
my_class = MyClass()
my_child_class = MyChildClass()

# Demonstrate the inheritance
print("The child has the parent's attributes (and methods):")
print("\t my_child_class.my_class_integer=", my_child_class.my_class_integer)
print("\t my_child_class.my_class_list=", my_child_class.my_class_list)
print("\t my_child_class.my_class_instance_list=", my_child_class.my_class_instance_list)
print("\t my_child_class.my_child_class_str=", my_child_class.my_child_class_str)

The child has the parent's attributes (and methods):
	 my_child_class.my_class_integer= 1
	 my_child_class.my_class_list= [1]
	 my_child_class.my_class_instance_list= None
	 my_child_class.my_child_class_str= A string
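
Methods are inherited in the same way as attributes, and a child class can override them. The sketch below uses hypothetical `Shape` and `Square` classes that are not part of PODPAC.

```python
class Shape(object):  # hypothetical base class
    name = 'shape'

    def __init__(self, size):
        self.size = size

    def area(self):
        return 0.0

    def describe(self):
        # Inherited unchanged by child classes; calls the child's area()
        return '{} with area {}'.format(self.name, self.area())


class Square(Shape):  # hypothetical child class
    name = 'square'   # overrides the class attribute

    def area(self):   # overrides the parent method
        return float(self.size ** 2)


square = Square(3)
print(square.describe())
print(isinstance(square, Shape))
```

`Square` defines neither `__init__` nor `describe`, so both come from `Shape`, while `area` and `name` resolve to the child's versions.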


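* Python passes references to objects; whether a change is visible elsewhere depends on mutability
    * Immutable types (int, float, str, tuple) behave as if copied, because "modifying" them rebinds the name to a new object
    * Mutable types (list, dict, most class instances) can be changed in place, and the change is visible through every reference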
    
To demonstrate this, we create two instances of our class defined above and change values in only one instance. The same behaviour can be observed for functions.

In [11]:
def print_attrs(MyClass, MyChildClass, my_class1, my_class2):
    print('\tMyClass.my_class_list: \t\t ', MyClass.my_class_list)
    print('\tMyChildClass.my_class_list: \t ', MyChildClass.my_class_list)
    print('\tmy_class1.my_class_list: \t ', my_class1.my_class_list)
    print('\tmy_class2.my_class_list: \t ', my_class2.my_class_list)
    print('\tMyChildClass.my_class_integer:\t', MyChildClass.my_class_integer)
    print('\tmy_class1.my_class_integer:\t', my_class1.my_class_integer)
    print('\tmy_class2.my_class_integer:\t', my_class2.my_class_integer)
    print('\tMyChildClass.my_child_class_str:', MyChildClass.my_child_class_str)
    print('\tmy_class1.my_child_class_str:\t', my_class1.my_child_class_str)
    print('\tmy_class2.my_child_class_str:\t', my_class2.my_child_class_str)
    print('\tmy_class1.my_class_instance_list:', my_class1.my_class_instance_list)
    print('\tmy_class2.my_class_instance_list:', my_class2.my_class_instance_list)

In [12]:
# Create two instances of the same class. Class attributes are shared; instance variables belong to each instance.
my_class1 = MyChildClass(my_class_instance_list=[4444])
my_class2 = MyChildClass(my_class_instance_list=[4444])
print('Before modifying values in my_class1')
print_attrs(MyClass, MyChildClass, my_class1, my_class2)

my_class1.my_class_list[0] += 1000; my_class1.my_class_instance_list[0] += 1000
my_class1.my_class_integer += 1000; my_class1.my_child_class_str += "modified "

print('After modifying values in my_class1')
print_attrs(MyClass, MyChildClass, my_class1, my_class2)

Before modifying values in my_class1
	MyClass.my_class_list: 		  [1]
	MyChildClass.my_class_list: 	  [1]
	my_class1.my_class_list: 	  [1]
	my_class2.my_class_list: 	  [1]
	MyChildClass.my_class_integer:	 1
	my_class1.my_class_integer:	 1
	my_class2.my_class_integer:	 1
	MyChildClass.my_child_class_str: A string
	my_class1.my_child_class_str:	 A string
	my_class2.my_child_class_str:	 A string
	my_class1.my_class_instance_list: [4444]
	my_class2.my_class_instance_list: [4444]
After modifying values in my_class1
	MyClass.my_class_list: 		  [1001]
	MyChildClass.my_class_list: 	  [1001]
	my_class1.my_class_list: 	  [1001]
	my_class2.my_class_list: 	  [1001]
	MyChildClass.my_class_integer:	 1
	my_class1.my_class_integer:	 1001
	my_class2.my_class_integer:	 1
	MyChildClass.my_child_class_str: A string
	my_class1.my_child_class_str:	 A stringmodified 
	my_class2.my_child_class_str:	 A string
	my_class1.my_class_instance_list: [5444]
	my_class2.my_class_instance_list: [4444]
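
The same behaviour can be sketched for function arguments. The function `modify` below is a hypothetical helper written for this illustration.

```python
def modify(number, items):
    number += 1000       # rebinds the local name; the caller's int is untouched
    items[0] += 1000     # changes the list in place; the caller sees the change
    return number, items


my_number = 1
my_list = [1]
new_number, same_list = modify(my_number, my_list)

print(my_number, new_number)
print(my_list)
print(same_list is my_list)

# Pass a copy when the original list must stay unchanged
original = [1]
modify(0, list(original))
print(original)
```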


# Creating a MATLAB-like environment in Python using the Numpy and Matplotlib libraries
Unlike MATLAB, the Python standard library does not come with array-handling and plotting capabilities. 

* For array-handling, the `Numpy` Python package can be used, and is generally imported as follows:

In [13]:
import numpy as np
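
As a rough guide for MATLAB users, the sketch below pairs a few common array operations with their approximate MATLAB counterparts. The MATLAB expressions appear only in the comments.

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.eye(2)                 # 2x2 identity, like MATLAB's eye(2)

elementwise = a * a           # MATLAB: a .* a
matrix_product = a @ b        # MATLAB: a * b
first_row = a[0, :]           # MATLAB: a(1, :)  (zero-based vs one-based indexing)
x = np.linspace(0, 1, 5)      # MATLAB: linspace(0, 1, 5)

print(elementwise)
print(matrix_product)
print(first_row)
print(x)
```

Note that `*` is element-wise in NumPy, so matrix multiplication uses the `@` operator instead.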

* For plotting, the `Matplotlib` Python package can be used, and is generally imported as follows:

In [14]:
import matplotlib.pyplot as plt

* [Numpy for Matlab users](https://docs.scipy.org/doc/numpy-1.15.0/user/numpy-for-matlab-users.html) is a useful reference for new users.
* `Matplotlib` plotting routines use nearly the same interface as MATLAB plotting routines.
* Both `Numpy` and `Matplotlib` can be imported together into the main namespace as follows:

In [15]:
from matplotlib.pylab import *

* When using JupyterLab or an IPython console, the "IPython magic function" `%pylab` can be used.

In [16]:
%pylab

Using matplotlib backend: Qt5Agg
Populating the interactive namespace from numpy and matplotlib


* This magic function can be told to use different plotting backends, which affects how plots are displayed:
```python 
%pylab  # nothing specified will default to creating a new window for plots
%pylab inline  # this will create images (non-interactive) inside the console or JupyterLab notebook
%pylab ipympl  # this will create interactive plots inside JupyterLab notebooks
```