# PODPAC Introduction

*Author: Creare* <br>
*Date: April 01 2020* <br>

**Keywords**: podpac

## Overview

This notebook provides a high-level overview of the PODPAC library.

### Prerequisites

- Python 2.7 or above
- [`podpac`](https://podpac.org/install.html#install)
- *Review the [README.md](../README.md) and [jupyter-tutorial.ipynb](jupyter-tutorial.ipynb) for additional info on using jupyter notebooks*

### See Also

- [python/basic-python.ipynb](../python/basic-python.ipynb): Basic introduction to Python language features
- [python/matlab.ipynb](../python/matlab.ipynb): Introduction to Python for MATLAB users
- [xarray](xarray.ipynb): Short reference for the core [`xarray`](https://xarray.pydata.org/en/stable/) module.

# Importing modules

PODPAC has multiple modules, which can be imported all at once, or individually:

In [1]:
import podpac                     # Import PODPAC with the namespace 'podpac'
import podpac as pc               # Import PODPAC with the namespace 'pc'
from podpac import Coordinates    # Import Coordinates from PODPAC into the main namespace

# PODPAC library structure
PODPAC is composed of multiple sub-modules (sub-libraries). The major ones, from a user's perspective, are shown below.
<img src='../../images/podpac-user-api.png' style='width:80%; margin-left:auto;margin-right:auto;' />


We can examine what's in the PODPAC library by using the `dir` function:

In [2]:
dir(podpac)

['Coordinates',
 'Node',
 'NodeException',
 'UnitsDataArray',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__path__',
 '__spec__',
 '__version__',
 'algorithm',
 'authentication',
 'cached_property',
 'clinspace',
 'compositor',
 'coordinates',
 'core',
 'crange',
 'data',
 'interpolators',
 'managers',
 'settings',
 'style',
 'units',
 'utils',
 'version',
 'version_info']
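
The listing above mixes PODPAC's own names with Python's double-underscore bookkeeping attributes. As a minimal sketch, the hypothetical helper `public_names` below (not part of PODPAC) filters those out. It is demonstrated on the standard-library `json` module so that it runs without PODPAC installed; passing `podpac` instead should work the same way.

```python
import json  # stand-in module so the sketch runs without PODPAC installed


def public_names(module):
    """Return the names from dir(module) that do not start with an underscore."""
    return [name for name in dir(module) if not name.startswith("_")]


# Replace `json` with `podpac` to list PODPAC's public top-level names.
names = public_names(json)
print(names)
```

Filtering on the leading underscore follows the usual Python naming convention for private names; it is not a guarantee that every remaining name is meant for users.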

The most frequently used top-level classes and functions in PODPAC include:

* `Coordinates`: class for defining coordinates
* `Node`: Base class for defining a PODPAC compute pipeline
* `NodeException`: The error type thrown by Nodes
* `clinspace`: A helper function used to create uniformly spaced coordinates based on the number of points
* `crange`: Another helper function used to create uniformly spaced coordinates based on step size
* `settings`: A module with various settings that define caching behavior, login credentials, etc.
* `version_info`: Python dictionary giving the version of the PODPAC library
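
The difference between spacing by number of points (`clinspace`) and spacing by step size (`crange`) can be illustrated without PODPAC. The sketch below is a plain-Python analogy, not PODPAC's implementation: `linspace_like` and `range_like` are hypothetical names, and including the stop value when it falls exactly on a step is an assumption of this sketch.

```python
def linspace_like(start, stop, size):
    """Return `size` evenly spaced values from start to stop (both included)."""
    if size < 1:
        raise ValueError("size must be at least 1")
    if size == 1:
        return [float(start)]
    step = (stop - start) / (size - 1)
    return [start + i * step for i in range(size)]


def range_like(start, stop, step):
    """Return values from start toward stop in increments of `step`.

    Assumption of this sketch: stop is included when it falls on a step.
    """
    if step == 0:
        raise ValueError("step must be non-zero")
    n = (stop - start) / step
    if n < 0:
        return []  # the step points away from stop
    count = int(n + 1e-9) + 1  # small tolerance for floating-point round-off
    return [start + i * step for i in range(count)]


by_count = linspace_like(0, 10, 5)   # fix the number of points
by_step = range_like(0, 10, 2.5)     # fix the spacing between points
print(by_count)
print(by_step)
```

Fixing the count is convenient when the output grid size matters; fixing the step is convenient when the resolution matters.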

The top-level modules (also called sub-packages or sub-libraries) include:
* `algorithm`: here you can find generic `Algorithm` nodes to do different types of computations
* `authentication`: this contains utilities to help authenticate users to download data
* `compositor`: here you can find nodes that help to combine multiple data sources into a single node
* `coordinates`: this module contains additional utilities related to creating coordinates
* `core`: this is where the core library is implemented, and follows the directory structure of the code
* `data`: here you can find generic `DataSource` nodes for reading and interpreting data sources
* `datalib`: here you can find domain-specific `DataSource` nodes for reading data from specific instruments, studies, and programs
* `interpolators`: this contains classes for dealing with automatic interpolation
* `pipeline`: this contains generic `Pipeline` nodes which can be used to share and re-create PODPAC processing routines

Let's look more closely at what is available in some of these submodules:

In [3]:
# Generic Algorithm nodes
dir(podpac.algorithm)

['Algorithm',
 'Arange',
 'Arithmetic',
 'Convolution',
 'CoordData',
 'Count',
 'DayOfYear',
 'ExpandCoordinates',
 'Generic',
 'GroupReduce',
 'Kurtosis',
 'Mask',
 'Max',
 'Mean',
 'Median',
 'Min',
 'SelectCoordinates',
 'SinCoords',
 'Skew',
 'SpatialConvolution',
 'StandardDeviation',
 'Sum',
 'TimeConvolution',
 'UnaryAlgorithm',
 'Variance',
 'YearSubstituteCoordinates',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__']

In [4]:
# Generic DataSource nodes
dir(podpac.data)

['Array',
 'CSV',
 'DataSource',
 'Dataset',
 'H5PY',
 'INTERPOLATION_DEFAULT',
 'INTERPOLATION_METHODS',
 'INTERPOLATION_METHODS_DICT',
 'INTERPOLATORS',
 'INTERPOLATORS_DICT',
 'Interpolation',
 'InterpolationException',
 'InterpolationTrait',
 'PyDAP',
 'Rasterio',
 'ReprojectedSource',
 'WCS',
 'Zarr',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__']

In [5]:
# Specific data libraries built into podpac
import podpac.datalib   # not loaded by default
dir(podpac.datalib)

['EGI',
 'GFS',
 'GFSLatest',
 'IntakeCatalog',
 'SMAP',
 'SMAPBestAvailable',
 'SMAPPorosity',
 'SMAPProperties',
 'SMAPSource',
 'SMAPWilt',
 'SMAP_PRODUCT_MAP',
 'TerrainTiles',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__path__',
 '__spec__',
 'drought_monitor',
 'egi',
 'gfs',
 'intake_catalog',
 'nasaCMR',
 'smap',
 'smap_egi',
 'sys',
 'terraintiles']
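
Because `podpac.datalib` is not loaded by default, it has to be imported explicitly before `dir` can inspect it. The same step can be done by name with the standard-library `importlib`. The sketch below uses a hypothetical helper, `load_submodule`, and demonstrates it on the standard-library `xml.dom` package so that it runs without PODPAC; calling `load_submodule("podpac", "datalib")` is assumed to behave like the `import podpac.datalib` statement above.

```python
import importlib
import xml  # stand-in package so the sketch runs without PODPAC installed


def load_submodule(package_name, submodule_name):
    """Import `package_name.submodule_name` and return the submodule object."""
    return importlib.import_module("{}.{}".format(package_name, submodule_name))


dom = load_submodule("xml", "dom")

# Once imported, the submodule is reachable as an attribute of its parent package.
print(hasattr(xml, "dom"))
print(dom.__name__)
```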

In [6]:
# Nothing here yet
# dir(podpac.alglib)