Merge branch 'develop' into main
mpu-creare committed Apr 14, 2021
2 parents 7749f0d + 85e2f4a commit c65b5cb
Showing 122 changed files with 8,590 additions and 7,013 deletions.
4 changes: 2 additions & 2 deletions dist/local_Windows_install/run_podpac_jupyterlab.bat
@@ -4,6 +4,6 @@ call bin\set_local_conda_path.bat
call bin\fix_hardcoded_absolute_paths.bat
call bin\activate_podpac_conda_env.bat

cd podpac_examples
cd podpac-examples
jupyter lab
cd ..
cd ..
3 changes: 2 additions & 1 deletion doc/source/api.rst
@@ -100,10 +100,11 @@ Classes to manage interpolation
:toctree: api/
:template: class.rst

podpac.interpolation.Interpolation
podpac.interpolators.Interpolator
podpac.interpolators.NearestNeighbor
podpac.interpolators.NearestPreview
podpac.interpolators.Rasterio
podpac.interpolators.RasterioInterpolator
podpac.interpolators.ScipyGrid
podpac.interpolators.ScipyPoint

6 changes: 4 additions & 2 deletions doc/source/deploy-notes.md
@@ -68,7 +68,7 @@ $ conda create -n podpac python=3
$ bin\activate_podpac_conda_env.bat

# Install core dependencies
$ conda install matplotlib>=2.1 numpy>=1.14 scipy>=1.0 traitlets>=4.3 xarray>=0.10 ipython psutil requests>=2.18
$ conda install matplotlib>=2.1 numpy>=1.14 scipy>=1.0 traitlets>=4.3 xarray>=0.10 ipython psutil requests>=2.18 owslib
$ conda install pyproj>=2.2 rasterio>=1.0 -c conda-forge
$ pip install pint>=0.8 lazy-import>=0.2.2

@@ -94,8 +94,10 @@ $ jupyter labextension install jupyter-leaflet
$ jupyter labextension install jupyter-matplotlib
$ jupyter nbextension enable --py widgetsnbextension
$ jupyter lab build
$ ~~python -m ipykernel install --user~~
```
~~$ python -m ipykernel install --user~~

```bash
# clean conda environment
$ conda clean -a -y
# Also delete miniconda/pkgs/.trash for a smaller installation
105 changes: 104 additions & 1 deletion doc/source/interpolation.md
@@ -55,8 +55,111 @@ interpolation = [
* **Description**: The dimensions listed in the `'dims'` list will use the specified method. These dictionaries can also specify the same fields shown in the previous section.
* **Details**: PODPAC loops through the `interpolation` list, using the settings specified for each dimension independently.

**NOTE! Specifying the interpolation as a list also controls the ORDER of interpolation.**
The first item in the list will be interpolated first. In this case, `lat`/`lon` will be bilinearly interpolated BEFORE `time` is nearest-neighbor interpolated.
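
As a small illustration of the ordering rule, the sketch below walks a list shaped like the one described above and records the order in which each group of dimensions would be handled. The `"bilinear"` and `"nearest"` method names mirror the prose; the loop itself is only an assumption-laden illustration and is not PODPAC's internal code.

```python
# Illustrative only: this mimics the ordering rule, not PODPAC's internals.
interpolation = [
    {"method": "bilinear", "dims": ["lat", "lon"]},  # listed first, handled first
    {"method": "nearest", "dims": ["time"]},         # listed second, handled second
]

# Record (step, method, dims) in list order.
order = []
for step, entry in enumerate(interpolation, start=1):
    order.append((step, entry["method"], tuple(entry["dims"])))

for step, method, dims in order:
    print(f"step {step}: {method} over {', '.join(dims)}")
```

Swapping the two dictionaries in the list would swap the order in which the dimensions are processed.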

## Interpolators
The available interpolators are as follows:
* `NearestNeighbor`: A custom implementation based on `scipy.spatial.cKDTree`, which handles nearly any combination of source and destination coordinates
* `XarrayInterpolator`: A lightweight wrapper around `xarray`'s `DataArray.interp` method, which is itself a wrapper around `scipy` interpolation functions, but with a clean `xarray` interface
* `RasterioInterpolator`: A wrapper around `rasterio`'s interpolation/reprojection routines. Appropriate for grid-to-grid interpolation.
* `ScipyGrid`: An optimized implementation for `grid` sources that uses `scipy`'s `RegularGridInterpolator` or `RectBivariateSpline` interpolator, depending on the method.
* `ScipyPoint`: An implementation based on `scipy.spatial.KDTree` capable of `nearest` interpolation for `point` sources
* `NearestPreview`: An approximate nearest-neighbor interpolator useful for rapidly viewing large files

The default order for these interpolators can be found in `podpac.data.INTERPOLATORS`.

### NearestNeighbor
Since this is the most general of the interpolators, this section deals with the available parameters and settings for the `NearestNeighbor` interpolator.

#### Parameters
The following parameters can be set by specifying the interpolation as a dictionary or a list, as described above.

* `respect_bounds` : `bool`
    * Default is `True`. If `True`, any requested coordinate OUTSIDE of the source bounds will be interpolated as `nan`.
      Otherwise, any point outside the bounds will take the value of the nearest neighboring point.
* `remove_nan` : `bool`
    * Default is `False`. If `True`, `nan` values in the source dataset will NOT be interpolated; requested points take the values of non-`nan` neighbors instead. This can be used if a value for the function
      is needed at every point of the request. It is not helpful when computing statistics, where `nan` values would otherwise be explicitly
      ignored. In that case, if `remove_nan` is `True`, `nan` values will take on the values of neighbors, skewing the statistical result.
* `*_tolerance` : `float`, where `*` in ["spatial", "time", "alt"]
    * Default is `inf`. Maximum distance to the nearest coordinate to be interpolated,
      in the units of the `*` dimension.
* `*_scale` : `float`, where `*` in ["spatial", "time", "alt"]
    * Default is `1`. This only applies when the source has stacked dimensions with different units. The `*_scale`
      defines the factor that the coordinates will be scaled by (coordinates are divided by `*_scale`) to output
      a valid distance for the combined set of dimensions.
      For example, when the "lat", "lon", and "alt" dimensions are stacked, with ["lat", "lon"] in degrees
      and "alt" in feet, the `*_scale` parameters should be set so that
      `|| [dlat / spatial_scale, dlon / spatial_scale, dalt / alt_scale] ||` results in a reasonable distance.
* `use_selector` : `bool`
    * Default is `True`. If `True`, a subset of the coordinates will be selected BEFORE the data of a dataset is retrieved. This
      reduces the number of data retrievals needed for large datasets. In cases where `remove_nan` = `True`, the selector may select
      only `nan` points, in which case the interpolation fails to produce non-`nan` data. This usually happens when requesting a single
      point from a dataset that contains `nan`s. In these cases, set `use_selector` = `False` to get a non-`nan` value.
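
To make the `*_scale` formula concrete, here is a minimal sketch of the combined distance `|| [dlat / spatial_scale, dlon / spatial_scale, dalt / alt_scale] ||`. The helper name `scaled_distance` and the sample offsets are hypothetical and chosen only for illustration; this is not the function PODPAC uses internally.

```python
import math


def scaled_distance(dlat, dlon, dalt, spatial_scale=1.0, alt_scale=1.0):
    """Euclidean norm of the offsets after dividing each by its scale (hypothetical helper)."""
    return math.sqrt(
        (dlat / spatial_scale) ** 2
        + (dlon / spatial_scale) ** 2
        + (dalt / alt_scale) ** 2
    )


# Offsets of 0.3 degrees lat, 0.4 degrees lon, and 50 feet in altitude.
unscaled = scaled_distance(0.3, 0.4, 50.0)
scaled = scaled_distance(0.3, 0.4, 50.0, spatial_scale=1.0, alt_scale=100.0)

print(f"unscaled: {unscaled:.4f}")  # dominated by the altitude offset
print(f"scaled:   {scaled:.4f}")    # lat/lon and alt contribute equally
```

With the default scales the altitude term swamps the horizontal terms; dividing altitude by `100` puts a 50 ft offset on the same footing as a 0.5 degree horizontal offset.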

#### Advanced NearestNeighbor Interpolation Examples
* Only interpolate points that are within a distance of `1` (in the units of the lat/lon coordinates) of the source data lat/lon locations
```python
interpolation={"method": "nearest", "params": {"spatial_tolerance": 1}}
```
* When interpolating with mixed time/space, use `1` day as equivalent to `1` degree for determining the distance
```python
interpolation={
    "method": "nearest",
    "params": {
        "spatial_scale": 1,
        "time_scale": "1,D",
        "alt_scale": 10,
    }
}
```
* Remove `nan` values from the source dataset -- in some cases a `nan` may still be interpolated
```python
interpolation={
    "method": "nearest",
    "params": {
        "remove_nan": True,
    }
}
```
* Remove `nan` values from the source dataset in all cases, even for single-point requests located directly at `nan` values in the source.
```python
interpolation={
    "method": "nearest",
    "params": {
        "remove_nan": True,
        "use_selector": False,
    }
}
```
* Do nearest-neighbor extrapolation outside of the bounds of the source dataset
```python
interpolation={
    "method": "nearest",
    "params": {
        "respect_bounds": False,
    }
}
```
* Do nearest-neighbor interpolation of time with `nan` removal followed by spatial interpolation
```python
interpolation = [
    {
        "method": "nearest",
        "params": {
            "remove_nan": True,
        },
        "dims": ["time"]
    },
    {
        "method": "nearest",
        "dims": ["lat", "lon", "alt"]
    },
]
```
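
The effect of `respect_bounds` and the `*_tolerance` parameters can be sketched with a tiny one-dimensional nearest-neighbor lookup. The function `nearest_1d` below is a hypothetical stand-in written in plain Python; it only approximates the documented behavior, and PODPAC's own bounds handling may differ in detail.

```python
import math


def nearest_1d(source_coords, source_values, request, tolerance=math.inf, respect_bounds=True):
    """Hypothetical 1-D nearest-neighbor lookup illustrating the documented parameters."""
    lo, hi = min(source_coords), max(source_coords)
    results = []
    for x in request:
        # respect_bounds=True: requests outside the source bounds become nan.
        if respect_bounds and not (lo <= x <= hi):
            results.append(math.nan)
            continue
        # Index of the closest source coordinate.
        i = min(range(len(source_coords)), key=lambda k: abs(source_coords[k] - x))
        # Requests farther than `tolerance` from any source coordinate become nan.
        if abs(source_coords[i] - x) > tolerance:
            results.append(math.nan)
        else:
            results.append(source_values[i])
    return results


coords = [0.0, 1.0, 2.0]
values = [10.0, 20.0, 30.0]

inside = nearest_1d(coords, values, [0.4, 1.6])
outside_nan = nearest_1d(coords, values, [2.5])
outside_extrapolated = nearest_1d(coords, values, [2.5], respect_bounds=False)
too_far = nearest_1d(coords, values, [0.4], tolerance=0.25)
```

Here `inside` picks the closest source values, `outside_nan` is `nan` because the request lies beyond the source bounds, `outside_extrapolated` falls back to the edge value, and `too_far` is `nan` because the nearest source coordinate is farther away than the tolerance.
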
## Notes and Caveats
While the API is well developed, not all conceivable functionality is implemented. For example, while we can interpolate gridded data to point data, point data to grid data interpolation is not as well supported, and there may be errors or unexpected results. Advanced users can develop their own interpolators, but this is not currently well-documented.

**Gotcha**: Parameters for a specific interpolator may silently be ignored if a different interpolator is automatically selected.
**Gotcha**: Parameters for a specific interpolator may be ignored if a different interpolator is automatically selected. These ignored parameters are logged as warnings.

1 change: 1 addition & 0 deletions podpac/algorithm.py
@@ -29,3 +29,4 @@
TransformTimeUnits,
)
from podpac.core.algorithm.signal import Convolution
from podpac.core.algorithm.reprojection import Reproject
2 changes: 1 addition & 1 deletion podpac/compositor.py
@@ -5,4 +5,4 @@
# REMINDER: update api docs (doc/source/user/api.rst) to reflect changes to this file

from podpac.core.compositor.ordered_compositor import OrderedCompositor
from podpac.core.compositor.tile_compositor import UniformTileCompositor, UniformTileMixin, TileCompositor
from podpac.core.compositor.tile_compositor import TileCompositor, TileCompositorRaw
2 changes: 1 addition & 1 deletion podpac/coordinates.py
@@ -8,6 +8,6 @@
from podpac.core.coordinates import Coordinates
from podpac.core.coordinates import crange, clinspace
from podpac.core.coordinates import Coordinates1d, ArrayCoordinates1d, UniformCoordinates1d
from podpac.core.coordinates import StackedCoordinates, DependentCoordinates, RotatedCoordinates
from podpac.core.coordinates import StackedCoordinates, RotatedCoordinates
from podpac.core.coordinates import merge_dims, concat, union
from podpac.core.coordinates import GroupCoordinates
71 changes: 33 additions & 38 deletions podpac/core/algorithm/algorithm.py
@@ -14,7 +14,7 @@
# Internal dependencies
from podpac.core.coordinates import Coordinates, union
from podpac.core.units import UnitsDataArray
from podpac.core.node import Node, NodeException, node_eval, COMMON_NODE_DOC
from podpac.core.node import Node, NodeException, COMMON_NODE_DOC
from podpac.core.utils import common_doc, NodeTrait
from podpac.core.settings import settings
from podpac.core.managers.multi_threading import thread_manager
@@ -58,23 +58,21 @@ class Algorithm(BaseAlgorithm):
Developers of new Algorithm nodes need to implement the `algorithm` method.
"""

def algorithm(self, inputs):
def algorithm(self, inputs, coordinates):
"""
Arguments
----------
inputs : dict
Evaluated outputs of the input nodes. The keys are the attribute names.
Raises
------
NotImplementedError
Description
Evaluated outputs of the input nodes. The keys are the attribute names. Each item is a `UnitsDataArray`.
coordinates : podpac.Coordinates
Requested coordinates.
Note that the ``inputs`` may contain different coordinates than the requested coordinates
"""

raise NotImplementedError

@common_doc(COMMON_DOC)
@node_eval
def eval(self, coordinates, output=None):
def _eval(self, coordinates, output=None, _selector=None):
"""Evalutes this nodes using the supplied coordinates.
Parameters
@@ -83,6 +81,8 @@ def eval(self, coordinates, output=None):
{requested_coordinates}
output : podpac.UnitsDataArray, optional
{eval_output}
_selector: callable(coordinates, request_coordinates)
{eval_selector}
Returns
-------
@@ -103,7 +103,7 @@ def eval(self, coordinates, output=None):
if settings["MULTITHREADING"] and n_threads > 1:
# Create a function for each thread to execute asynchronously
def f(node):
return node.eval(coordinates)
return node.eval(coordinates, _selector=_selector)

# Create pool of size n_threads, note, this may be created from a sub-thread (i.e. not the main thread)
pool = thread_manager.get_thread_pool(processes=n_threads)
@@ -124,36 +124,31 @@ def f(node):
else:
# Evaluate nodes in serial
for key, node in self.inputs.items():
inputs[key] = node.eval(coordinates)
inputs[key] = node.eval(coordinates, output=output, _selector=_selector)
self._multi_threaded = False

# accumulate output coordinates
coords_list = [Coordinates.from_xarray(a.coords, crs=a.attrs.get("crs")) for a in inputs.values()]
output_coordinates = union([coordinates] + coords_list)

result = self.algorithm(inputs)
if isinstance(result, UnitsDataArray):
if output is None:
output = result
else:
output[:] = result.data[:]
elif isinstance(result, xr.DataArray):
if output is None:
output = self.create_output_array(
Coordinates.from_xarray(result.coords, crs=result.attrs.get("crs")), data=result.data
)
else:
output[:] = result.data
elif isinstance(result, np.ndarray):
if output is None:
output = self.create_output_array(output_coordinates, data=result)
else:
output.data[:] = result
else:
raise NodeException
result = self.algorithm(inputs, coordinates)

if "output" in output.dims and self.output is not None:
output = output.sel(output=self.output)
if not isinstance(result, xr.DataArray):
raise NodeException("algorithm returned unsupported type '%s'" % type(result))

if "output" in result.dims and self.output is not None:
result = result.sel(output=self.output)

if output is not None:
missing = [dim for dim in result.dims if dim not in output.dims]
if any(missing):
raise NodeException("provided output is missing dims %s" % missing)

output_dims = output.dims
output = output.transpose(..., *result.dims)
output[:] = result.data
output = output.transpose(*output_dims)
elif isinstance(result, UnitsDataArray):
output = result
else:
output_coordinates = Coordinates.from_xarray(result)
output = self.create_output_array(output_coordinates, data=result.data)

return output

10 changes: 6 additions & 4 deletions podpac/core/algorithm/coord_select.py
@@ -46,7 +46,7 @@ def _default_coordinates_source(self):
return self.source

@common_doc(COMMON_DOC)
def eval(self, coordinates, output=None):
def _eval(self, coordinates, output=None, _selector=None):
"""Evaluates this nodes using the supplied coordinates.
Parameters
@@ -55,6 +55,8 @@ def eval(self, coordinates, output=None):
{requested_coordinates}
output : podpac.UnitsDataArray, optional
{eval_output}
_selector: callable(coordinates, request_coordinates)
{eval_selector}
Returns
-------
@@ -77,15 +79,15 @@ def eval(self, coordinates, output=None):
raise ValueError("Modified coordinates do not intersect with source data (dim '%s')" % dim)

outputs = {}
outputs["source"] = self.source.eval(self._modified_coordinates, output=output)
outputs["source"] = self.source.eval(self._modified_coordinates, output=output, _selector=_selector)

if self.substitute_eval_coords:
dims = outputs["source"].dims
coords = self._requested_coordinates
extra_dims = [d for d in coords.dims if d not in dims]
coords = coords.drop(extra_dims).coords
coords = coords.drop(extra_dims)

outputs["source"] = outputs["source"].assign_coords(**coords)
outputs["source"] = outputs["source"].assign_coords(**coords.xcoords)

if output is None:
output = outputs["source"]
