YAML notebook #191

Merged
merged 18 commits into from Apr 26, 2023
Changes from all commits
2 changes: 2 additions & 0 deletions HISTORY.rst
Expand Up @@ -25,6 +25,7 @@ New features and enhancements
* New masking feature in ``extract_dataset``. (:issue:`180`, :pull:`182`).
* New function ``xs.spatial.subset`` to replace ``xs.extract.clisops_subset`` and add method "sel". (:issue:`180`, :pull:`182`).
* Add long_name attribute to diagnostics. (:pull:`189`).
* Added a new YAML-centric notebook (:issue:`8`, :pull:`191`).

Breaking changes
^^^^^^^^^^^^^^^^
Expand Down Expand Up @@ -56,6 +57,7 @@ Internal changes
* The top-level Makefile now includes a `linkcheck` recipe, and the ReadTheDocs configuration no longer reinstalls the `llvmlite` compiler library. (:pull:`173`).
* The checkups on coverage and duplicates can now be skipped in `subset_file_coverage`. (:pull:`170`).
* Changed the `ProjectCatalog` docstrings to make it more obvious that it needs to be created empty. (:issue:`99`, :pull:`184`).
* Added parse_config to `creep_fill`, `creep_weights`, and `reduce_ensemble` (:pull:`191`).

v0.5.0 (2023-02-28)
-------------------
Expand Down
1 change: 1 addition & 0 deletions docs/index.rst
Expand Up @@ -33,6 +33,7 @@ Features
notebooks/3_diagnostics
notebooks/4_ensemble_reduction
notebooks/5_warminglevels
notebooks/6_config
columns
api
contributing
Expand Down
30 changes: 23 additions & 7 deletions docs/notebooks/1_catalog.ipynb
Expand Up @@ -7,7 +7,10 @@
"source": [
"# Using and understanding Catalogs\n",
"\n",
"<div class=\"alert alert-info\"> <b>NOTE:</b> Catalogs in `xscen` are built upon Datastores in `intake_esm`. For more information on basic usage, such as the `search()` function, please consult their documentation: <a href=\"https://intake-esm.readthedocs.io/en/stable/\">https://intake-esm.readthedocs.io/en/stable/</a>.</div>\n",
"<div class=\"alert alert-info\"> <b>INFO</b>\n",
"\n",
"Catalogs in `xscen` are built upon Datastores in `intake_esm`. For more information on basic usage, such as the `search()` function, [please consult their documentation](https://intake-esm.readthedocs.io/en/stable/).\n",
"</div>\n",
"\n",
"Catalogs are made of two files:\n",
"\n",
Expand Down Expand Up @@ -441,7 +444,10 @@
"\n",
"`.derivedcat` can be called on a catalog to obtain the list of DerivedVariable and the function associated to them. In addition, `._requested_variables` will display the list of variables that will be opened by the `to_dataset_dict()` function, including *DerivedVariables*.\n",
"\n",
"**NOTE:** `_requested_variables` should NOT be modified under any circumstance, as it is likely to make `to_dataset_dict()` fail. To add some transparency on which variables have been *requested* and which are the *dependent* ones, `xscen` has added `_requested_variables_true` and `_dependent_variables`. This is very likely to be changed in the future."
"<div class=\"alert alert-warning\"> <b>WARNING</b>\n",
" \n",
"`_requested_variables` should NOT be modified under any circumstance, as it is likely to make `to_dataset_dict()` fail. To add some transparency on which variables have been **requested** and which are the **dependent** ones, `xscen` has added `_requested_variables_true` and `_dependent_variables`. This is very likely to be changed in the future.\n",
"</div>"
]
},
{
Expand Down Expand Up @@ -477,7 +483,9 @@
"id": "aec5c3fc",
"metadata": {},
"source": [
"<div class=\"alert alert-warning\"> <b>WARNING:</b> Note that `allow_conversion` currently fails if:\n",
"<div class=\"alert alert-info\"> <b>INFO</b>\n",
" \n",
"`allow_conversion` currently fails if:\n",
"<ul>\n",
"<li>The requested DerivedVariable also requires a DerivedVariable itself.</li>\n",
"<li>The dependent variables exist at different frequencies (e.g. 'pr @1hr' & 'tas @3hr')</li>\n",
Expand Down Expand Up @@ -542,7 +550,9 @@
{
"cell_type": "markdown",
"id": "9513fc05",
"metadata": {},
"metadata": {
"tags": []
},
"source": [
"### Appending new data to a ProjectCatalog\n",
"\n",
Expand All @@ -556,7 +566,10 @@
"\n",
"#### Parsing a directory \n",
"\n",
"<div class=\"alert alert-info\"> <b>NOTE:</b> If you are an Ouranos employee, this section should be of limited use (unless you need to retroactively parse a directory containing exiting datasets). Please consult the existing Ouranos catalogs using xs.search_data_catalogs instead.</div>\n",
"<div class=\"alert alert-info\"> <b>INFO</b> \n",
" \n",
"If you are an Ouranos employee, this section should be of limited use (unless you need to retroactively parse a directory containing existing datasets). Please consult the existing Ouranos catalogs using `xs.search_data_catalogs` instead.\n",
"</div>\n",
"\n",
"The `parse_directory` function relies on analyzing patterns to adequately decode the filenames to store that information in the catalog. \n",
"\n",
Expand Down Expand Up @@ -606,7 +619,10 @@
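The pattern-based decoding mentioned above can be sketched in plain Python. The `{field}`-style pattern below is illustrative only; `parse_directory` has its own pattern syntax and many more options:

```python
import re


def parse_filename(filename, pattern):
    """Decode a filename into catalog fields using {field}-style placeholders.

    Illustrative sketch only; this is not xscen's actual parse_directory
    implementation. Literal characters in the pattern (underscores, the
    escaped extension) must match exactly.
    """
    # Turn each "{field}" placeholder into a named capture group that
    # matches anything up to the next underscore or dot.
    regex = re.sub(r"\{(\w+)\}", r"(?P<\g<1>>[^_.]+)", pattern)
    match = re.fullmatch(regex, filename)
    return match.groupdict() if match else None


fields = parse_filename(
    "tasmax_day_ModelA_ssp585.nc",
    r"{variable}_{frequency}_{source}_{experiment}\.nc",
)
```

Running this yields a dictionary mapping each placeholder name to the corresponding filename segment, which is the kind of information stored in the catalog.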
"\n",
"This utility can also be called by itself through `xs.catalog.generate_id()`.\n",
"\n",
"**NOTE:** Note that when constructing IDs, empty columns will be skipped."
"<div class=\"alert alert-info\"> <b>INFO</b> \n",
"\n",
"When constructing IDs, empty columns will be skipped.\n",
"</div>"
]
},
{
Expand Down Expand Up @@ -662,7 +678,7 @@
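The note above about empty columns can be illustrated with a small sketch (hypothetical; `xs.catalog.generate_id` itself operates on catalog DataFrames):

```python
def build_id(entry, columns):
    """Join the given catalog columns with underscores, skipping empty ones.

    Hypothetical illustration of ID construction, not xscen's real
    generate_id implementation.
    """
    parts = [str(entry[col]) for col in columns if entry.get(col)]
    return "_".join(parts)


# The empty "institution" column is skipped when building the ID:
entry = {"activity": "CMIP", "institution": "", "source": "ModelA", "experiment": "ssp585"}
dataset_id = build_id(entry, ["activity", "institution", "source", "experiment"])
```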
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
"version": "3.10.10"
}
},
"nbformat": 4,
Expand Down
57 changes: 42 additions & 15 deletions docs/notebooks/2_getting_started.ipynb
Expand Up @@ -60,7 +60,10 @@
"source": [
"### Searching a subset of datasets within *DataCatalogs*\n",
"\n",
"<div class=\"alert alert-info\"> <b>NOTE</b>: At this stage, the search criteria should be for variables that will be <i>bias corrected</i>, not necessarily the variables required for the final product. For example, if <i>sfcWindfromdir</i> is the final product, then <i>uas</i> and <i>vas</i> should be searched for since these are the variables that will be bias corrected.</div>\n",
"<div class=\"alert alert-info\"> <b>INFO</b>\n",
" \n",
"At this stage, the search criteria should be for variables that will be **bias corrected**, not necessarily the variables required for the final product. For example, if `sfcWindfromdir` is the final product, then `uas` and `vas` should be searched for since these are the variables that will be bias corrected.\n",
"</div>\n",
"\n",
"`xs.search_data_catalogs` is used to consult a list of *DataCatalogs* and find a subset of datasets that match given search parameters. More details on that function and possible usage are given in the [Understanding Catalogs](1_catalog.ipynb#Advanced-search:-xs.search_data_catalogs) Notebook.\n",
"\n",
Expand Down Expand Up @@ -127,7 +130,10 @@
"source": [
"## Extracting data\n",
"\n",
"<div class=\"alert alert-warning\"> <b>WARNING:</b> It is heavily recommended to stop and analyse the results of <i>search_data_catalogs</i> before proceeding to the extraction function.</div>\n",
"<div class=\"alert alert-warning\"> <b>WARNING</b> \n",
" \n",
"It is strongly recommended to stop and analyse the results of `search_data_catalogs` before proceeding to the extraction function.\n",
"</div>\n",
"\n",
"### Defining the region\n",
"\n",
Expand Down Expand Up @@ -215,7 +221,10 @@
"\n",
"**NOTE:** Calling the extraction function without going through `search_data_catalogs` beforehand is possible, but will not support *DerivedVariables*.\n",
"\n",
"<div class=\"alert alert-warning\"> <b>Note:</b> `extract_dataset` currently only accepts a single ID at a time.</div>"
"<div class=\"alert alert-info\"> <b>NOTE</b> \n",
" \n",
"`extract_dataset` currently only accepts a single unique ID at a time.\n",
"</div>"
]
},
{
Expand Down Expand Up @@ -367,9 +376,12 @@
"source": [
"## Regridding data\n",
"\n",
"<div class=\"alert alert-info\"> <b>NOTE:</b> Regridding in `xscen` is built upon `xESMF`. For more information on basic usage and available regridding methods, please consult their documentation: <a href=\"https://xesmf.readthedocs.io/en/latest/\">https://xesmf.readthedocs.io/en/latest/</a>. Their <a href=\"https://xesmf.readthedocs.io/en/latest/notebooks/Masking.html\">masking and extrapolation tutorial</a> is of particular interest.\n",
"<div class=\"alert alert-info\"> <b>NOTE</b> \n",
"\n",
"More details on the regridding functions themselves can be found within the <a href=\"https://earthsystemmodeling.org/esmpy/\">ESMPy</a> and <a href=\"https://earthsystemmodeling.org/\">ESMF</a> documentation.</div>\n",
"Regridding in `xscen` is built upon `xESMF`. For more information on basic usage and available regridding methods, [please consult their documentation](https://xesmf.readthedocs.io/en/latest/). Their [masking and extrapolation tutorial](https://xesmf.readthedocs.io/en/latest/notebooks/Masking.html) is of particular interest.\n",
"\n",
"More details on the regridding functions themselves can be found within the [ESMPy](https://earthsystemmodeling.org/esmpy/) and [ESMF](https://earthsystemmodeling.org/) documentation.\n",
"</div>\n",
"\n",
"The only requirement for using datasets in `xESMF` is that they contain *lon* and *lat*, with *mask* as an optional data variable. Using these, the package can manage both regular and rotated grids. The main advantage of `xESMF` over other tools such as *scipy*'s *griddata*, beyond its climate science-based methods, is that the transformation weights are calculated once and broadcast along the *time* dimension.\n",
"\n",
Expand Down Expand Up @@ -469,7 +481,10 @@
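The advantage described above (weights computed once, then reused for every time step) can be sketched with a toy example; this is a plain-Python stand-in, not xESMF's actual sparse-matrix machinery:

```python
# Toy conservative-style remapping: each output cell is a weighted sum of
# input cells. The weights depend only on the two grids, so they are
# computed once and reused for every time step. The values below are
# made up for illustration.
weights = {  # (output_cell, input_cell) -> weight
    (0, 0): 0.75, (0, 1): 0.25,
    (1, 1): 0.50, (1, 2): 0.50,
}


def apply_weights(field, weights, n_out):
    """Regrid one 1-D field (a single time step) with precomputed weights."""
    out = [0.0] * n_out
    for (i_out, i_in), w in weights.items():
        out[i_out] += w * field[i_in]
    return out


# Three "time steps" on a 3-cell input grid, remapped to 2 output cells,
# all reusing the same weights:
series = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
regridded = [apply_weights(t, weights, n_out=2) for t in series]
```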
"source": [
"### Preparing arguments for xESMF.Regridder\n",
"\n",
"<div class=\"alert alert-info\"> <b>NOTE: </b> xESMF's API appears to be broken on their ReadTheDocs. For a list of available arguments and options in <i>Regridder()</i>, please consult their <a href=\"https://github.com/pangeo-data/xESMF/blob/master/xesmf/frontend.py\">Github page</a> directly.</div>\n",
"<div class=\"alert alert-info\"> <b>NOTE</b> \n",
" \n",
"xESMF's API appears to be broken on their ReadTheDocs. For a list of available arguments and options in `Regridder()`, please consult their [GitHub page](https://github.com/pangeo-data/xESMF/blob/master/xesmf/frontend.py) directly.\n",
"</div>\n",
"\n",
"`xESMF.Regridder` is the main utility that computes the transformation weights and performs the regridding. It is supported by many optional arguments and methods, which can be called in `xscen` through `regridder_kwargs`.\n",
"\n",
Expand Down Expand Up @@ -506,7 +521,10 @@
"\n",
"Other options exist in `ESMF/ESMPy`, but not `xESMF`. As they get implemented, they should automatically get supported by `xscen`.\n",
"\n",
"<div class=\"alert alert-block alert-warning\"> <b>NOTE: </b>Some utilities that exist in `xESMF` have not yet been explicitely added to `xscen`. If <i>conservative</i> regridding is desired, for instance, some additional scripts might be required on the User's side to create the lon/lat boundaries.</div>"
"<div class=\"alert alert-info\"> <b>NOTE</b>\n",
" \n",
"Some utilities that exist in `xESMF` have not yet been explicitly added to `xscen`. If *conservative* regridding is desired, for instance, some additional scripts might be required on the user's side to create the lon/lat boundaries.\n",
"</div>"
]
},
{
Expand Down Expand Up @@ -624,7 +642,10 @@
"source": [
"## Bias adjusting data\n",
"\n",
"<div class=\"alert alert-info\"> <b>NOTE:</b> Bias adjustment in `xscen` is built upon `xclim.sdba`. For more information on basic usage and available methods, please consult their documentation: <a href=\"https://xclim.readthedocs.io/en/stable/sdba.html\">https://xclim.readthedocs.io/en/stable/sdba.html</a>.</div>\n",
"<div class=\"alert alert-info\"> <b>NOTE</b> \n",
" \n",
"Bias adjustment in `xscen` is built upon `xclim.sdba`. For more information on basic usage and available methods, [please consult their documentation](https://xclim.readthedocs.io/en/stable/sdba.html).\n",
"</div>\n",
"\n",
"### Preparing arguments for xclim.sdba\n",
"\n",
Expand Down Expand Up @@ -673,7 +694,10 @@
"- `simulation_period` defines the period to bias adjust.\n",
"- `xclim_adjust_kwargs` is described above.\n",
"\n",
"<div class=\"alert alert-warning\"> <b>NOTE: </b> These functions currently do not support multiple variables due to the fact that train and adjust arguments might vary. The function must be called separately for every variable. </div>"
"<div class=\"alert alert-info\"> <b>NOTE</b> \n",
" \n",
"These functions currently do not support multiple variables, because the train and adjust arguments might differ between them. The function must be called separately for every variable.\n",
"</div>"
]
},
{
Expand Down Expand Up @@ -778,12 +802,12 @@
"id": "65cd14ef",
"metadata": {},
"source": [
"\n",
"\n",
"\n",
"## Computing indicators\n",
"\n",
"<div class=\"alert alert-info\"> <b>NOTE:</b> `xscen` relies heavily on `xclim`'s YAML support for calculating indicators. For more information on how to build the YAML file, consult : <a href=\"https://xclim.readthedocs.io/en/latest/notebooks/extendxclim.html?highlight=yaml#YAML-file\">https://xclim.readthedocs.io/en/latest/notebooks/extendxclim.html?highlight=yaml#YAML-file</a>.</div>\n",
"<div class=\"alert alert-info\"> <b>NOTE</b> \n",
" \n",
"`xscen` relies heavily on `xclim`'s YAML support for calculating indicators. For more information on how to build the YAML file, consult [this Notebook](https://xclim.readthedocs.io/en/latest/notebooks/extendxclim.html?highlight=yaml#YAML-file).\n",
"</div>\n",
"\n",
"`xs.compute_indicators` makes use of *xclim*'s indicator modules functionalities to compute a given list of indicators. It is called by either using:\n",
"\n",
Expand Down Expand Up @@ -1053,7 +1077,10 @@
"\n",
"Usually, we would create an ensemble out of different models, but for this toy example, it will be made out of the 2 experiments of the same model that we have available.\n",
"\n",
"<div class=\"alert alert-warning\"> <b>WARNING: </b> If given a set of paths, `xclim.ensembles.create_ensemble` will ignore the chunking on disk and open the datasets with only a chunking along the new `realization` dimension. Thus, for large datasets, `create_kwargs` should be used to explicitely specify chunks.</div>"
"<div class=\"alert alert-warning\"> <b>WARNING</b> \n",
"\n",
"If given a set of paths, `xclim.ensembles.create_ensemble` will ignore the chunking on disk and open the datasets with only a chunking along the new `realization` dimension. Thus, for large datasets, `create_kwargs` should be used to explicitly specify chunks.\n",
"</div>"
]
},
{
Expand Down Expand Up @@ -1243,7 +1270,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
"version": "3.10.10"
}
},
"nbformat": 4,
Expand Down
7 changes: 5 additions & 2 deletions docs/notebooks/3_diagnostics.ipynb
Expand Up @@ -251,7 +251,10 @@
"\n",
"Note that it is possible to add many rows to `measures_heatmap`.\n",
"\n",
"<div class=\"alert alert-info\"> <b>NOTE:</b> The bias correction performed in the Getting Started tutorial was adjusted for speed rather than performance, using only a few quantiles. The performance results below are thus quite poor, but that was expected.</div>"
"<div class=\"alert alert-info\"> <b>NOTE</b> \n",
" \n",
"The bias correction performed in the Getting Started tutorial was adjusted for speed rather than performance, using only a few quantiles. The performance results below are thus quite poor, but that was expected.\n",
"</div>"
]
},
{
Expand Down Expand Up @@ -396,7 +399,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
"version": "3.10.10"
}
},
"nbformat": 4,
Expand Down
9 changes: 6 additions & 3 deletions docs/notebooks/4_ensemble_reduction.ipynb
Expand Up @@ -89,7 +89,10 @@
"source": [
"## Selecting a reduced ensemble\n",
"\n",
"<div class=\"alert alert-info\"> <b>NOTE:</b> Ensemble reduction in `xscen` is built upon `xclim.ensembles`. For more information on basic usage and available methods, please consult their documentation: <a href=\"https://xclim.readthedocs.io/en/stable/notebooks/ensembles-advanced.html\">https://xclim.readthedocs.io/en/stable/notebooks/ensembles-advanced.html</a>.</div>\n",
"<div class=\"alert alert-info\"> <b>NOTE</b>\n",
" \n",
"Ensemble reduction in `xscen` is built upon `xclim.ensembles`. For more information on basic usage and available methods, [please consult their documentation](https://xclim.readthedocs.io/en/stable/notebooks/ensembles-advanced.html).\n",
"</div>\n",
"\n",
"Ensemble reduction through `xscen.reduce_ensemble` consists of a simple call to `xclim`. The arguments are:\n",
"- `data`, which is the 2D DataArray that is created by using `xs.build_reduction_data`.\n",
Expand Down Expand Up @@ -168,9 +171,9 @@
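As a toy illustration of the idea behind ensemble reduction: given a 2-D (realization x criteria) array, one could keep the member closest to the ensemble mean. This heuristic is only for illustration; xclim's actual methods (K-means, KKZ) are more sophisticated:

```python
def closest_to_mean(data):
    """Return the index of the row (realization) nearest the column-wise mean.

    Toy reduction heuristic; not one of xclim.ensembles' actual algorithms.
    """
    n_rows, n_cols = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n_rows for j in range(n_cols)]
    # Squared Euclidean distance of each realization to the ensemble mean:
    dists = [sum((row[j] - mean[j]) ** 2 for j in range(n_cols)) for row in data]
    return dists.index(min(dists))


# 3 realizations evaluated on 2 criteria (e.g. delta tas, delta pr):
data = [[1.0, 0.2], [2.0, 0.4], [9.0, 3.0]]
selected = closest_to_mean(data)
```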
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.8"
"version": "3.10.10"
}
},
"nbformat": 4,
"nbformat_minor": 1
"nbformat_minor": 4
}
6 changes: 4 additions & 2 deletions docs/notebooks/5_warminglevels.ipynb
Expand Up @@ -335,7 +335,9 @@
"\n",
"Even more than with time-based horizons, the first step of ensemble statistics should be to generate the weights. Indeed, if a model has 3 experiments reaching a given warming level, we want it to have the same weight as a model with only 2 experiments reaching that warming.\n",
"\n",
"<div class=\"alert alert-warning\"> <b>Warning</b>: `xs.ensembles.generate_weights` is currently purely based on metadata, and thus cannot distinguish subtleties about which realization reaches which warming level if multiple experiments are concatenated together before passing them to the function. The results are likely to be wrong, which is why each warming level needs to be computed individually.\n",
"<div class=\"alert alert-warning\"> <b>WARNING</b>\n",
" \n",
"`xs.ensembles.generate_weights` is currently purely based on metadata, and thus cannot distinguish subtleties about which realization reaches which warming level if multiple experiments are concatenated together before passing them to the function. The results are likely to be wrong, which is why each warming level needs to be computed individually.\n",
"</div>\n",
"\n",
"Next, the weights and the datasets can be passed to `xs.ensemble_stats` to calculate the ensemble statistics."
Expand Down Expand Up @@ -407,7 +409,7 @@
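The weighting rationale above (each model gets the same total weight, however many of its experiments reach the level) can be sketched as follows; this is a hypothetical simplification, not the real logic of `xs.ensembles.generate_weights`:

```python
from collections import Counter


def member_weights(models):
    """Give each model a total weight of 1, split equally among its members.

    Hypothetical sketch: ``models`` lists the source model of each ensemble
    member, e.g. one entry per (model, experiment) pair reaching the
    warming level.
    """
    counts = Counter(models)
    return [1.0 / counts[m] for m in models]


# ModelA reaches the warming level in 3 experiments, ModelB in 2, yet each
# model contributes the same total weight to the ensemble statistics:
weights = member_weights(["ModelA", "ModelA", "ModelA", "ModelB", "ModelB"])
```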
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
"version": "3.10.10"
}
},
"nbformat": 4,
Expand Down