11 changes: 11 additions & 0 deletions docs/guide/experiment_config.rst
@@ -49,6 +49,8 @@ Configuration files are written in ``YAML`` format and are divided into differen
- Poisson S-test:
func: poisson_evaluations.spatial_test
plot_func: plot_poisson_consistency_test
run_mode: parallel
force_rerun: true
postprocess:
plot_forecasts:
cmap: magma
@@ -278,3 +280,12 @@ The seismicity catalog can be defined with the ``catalog`` parameter. It represe

.. important::
The main catalog will be stored, and consecutively filtered to the extent of each testing time-window, as well as to the experiment's spatial domain, and magnitude- and depth- ranges.


Run Configuration
-----------------

The ``run_mode`` parameter sets whether the experiment tasks are executed in ``sequential`` (default) or ``parallel`` mode.
The former is appropriate for staging the experiment and testing that it works, whereas ``parallel`` is better suited to heavy computations once the experiment is in production (e.g., real conditions).

The ``force_rerun`` parameter makes the experiment recompute every forecast. The default is ``false``, which lets the experiment discover existing forecasts when it is instantiated instead of recomputing them.
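
The behaviour described for ``force_rerun`` can be pictured with a minimal Python sketch. It is not floatCSEP code: the helper ``forecasts_to_compute`` and the file names are hypothetical, and it assumes that "self-discovery" simply means checking whether a forecast file already exists on disk.

```python
from pathlib import Path
import tempfile


def forecasts_to_compute(forecast_paths, force_rerun=False):
    """Return the forecast files that still need to be computed.

    Hypothetical helper: with force_rerun=False, files already on disk are
    treated as discovered and skipped; with force_rerun=True every forecast
    is scheduled again.
    """
    if force_rerun:
        return list(forecast_paths)
    return [path for path in forecast_paths if not Path(path).exists()]


with tempfile.TemporaryDirectory() as tmp:
    existing = Path(tmp) / "forecast_2020.csv"  # illustrative file names
    missing = Path(tmp) / "forecast_2021.csv"
    existing.write_text("placeholder")

    # Default behaviour: only the forecast that is not on disk is scheduled.
    default_todo = forecasts_to_compute([existing, missing])
    # Forced rerun: both forecasts are scheduled, existing or not.
    forced_todo = forecasts_to_compute([existing, missing], force_rerun=True)
    n_default, n_forced = len(default_todo), len(forced_todo)
```

With the default, one forecast is scheduled; with ``force_rerun=True``, both are.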
8 changes: 4 additions & 4 deletions docs/guide/postprocess_config.rst
@@ -422,19 +422,19 @@ Here are some basic functionalities from **floatCSEP** to access catalogs, forec
timewindow_str = timewindow2str(timewindow)
model.get_forecast(timewindow_str)

* - :attr:`Model.registry.path <floatcsep.infrastructure.registries.ForecastRegistry>`
* - :attr:`Model.registry.path <floatcsep.infrastructure.registries.ModelFileRegistry>`
- Directory of the model file or source code.
* - :attr:`Model.registry.database <floatcsep.infrastructure.registries.ForecastRegistry>`
* - :attr:`Model.registry.database <floatcsep.infrastructure.registries.ModelFileRegistry>`
- Database path where forecasts are stored.
* - :attr:`TimeIndependentModel.forecast_unit <floatcsep.model.TimeIndependentModel>`
- The forecast unit for a time independent model.
* - :meth:`TimeDependentModel.func <floatcsep.model.TimeDependentModel>`
- The function command to execute a time dependent source code.
* - :meth:`TimeDependentModel.func_kwargs`
- The keyword arguments of the model, passed to the arguments file.
* - :meth:`TimeDependentModel.registry.args_file <floatcsep.infrastructure.registries.ForecastRegistry>`
* - :meth:`TimeDependentModel.registry.args_file <floatcsep.infrastructure.registries.ModelFileRegistry>`
- The path of the arguments file. Default is ``args.txt``.
* - :meth:`TimeDependentModel.registry.input_cat <floatcsep.infrastructure.registries.ForecastRegistry>`
* - :meth:`TimeDependentModel.registry.input_cat <floatcsep.infrastructure.registries.ModelFileRegistry>`
- The path of the input catalog for the model execution.


67 changes: 40 additions & 27 deletions docs/reference/api_reference.rst
@@ -52,8 +52,6 @@ instances onto an experimental workflow. The class and its main methods are:
Experiment.set_models
Experiment.set_tests
Experiment.stage_models
Experiment.set_input_cat
Experiment.set_test_cat
Experiment.set_tasks
Experiment.run
Experiment.read_results
@@ -74,7 +72,6 @@ reading. The abstract and concrete classes, and their main methods are:
Model.factory

TimeIndependentModel
TimeIndependentModel.init_db
TimeIndependentModel.get_forecast

TimeDependentModel.stage
@@ -128,8 +125,8 @@ These are the helper functions of ``floatCSEP``
parse_timedelta_string
read_time_cfg
read_region_cfg
timewindows_ti
timewindows_td
time_windows_ti
time_windows_td


Some additional plotting functions to pyCSEP are:
@@ -148,16 +145,26 @@ Some additional plotting functions to pyCSEP are:

A small wrapper for ``pyCSEP`` readers

.. currentmodule:: floatcsep.utils.readers
.. currentmodule:: floatcsep.utils.file_io

.. autosummary::
:nosignatures:

ForecastParsers.dat
ForecastParsers.xml
ForecastParsers.quadtree
ForecastParsers.csv
ForecastParsers.hdf5
CatalogParser.ascii
CatalogParser.json

CatalogSerializer.ascii
CatalogSerializer.json

GriddedForecastParsers.dat
GriddedForecastParsers.xml
GriddedForecastParsers.quadtree
GriddedForecastParsers.csv
GriddedForecastParsers.hdf5

CatalogForecastParsers.csv
CatalogForecastParsers.load_hermes_catalog

HDF5Serializer.grid2hdf5
serialize

@@ -194,20 +201,25 @@ components (e.g., forecasts, catalogs, results, etc.), and allows to be aware of
.. autosummary::
:nosignatures:

ForecastRegistry
ForecastRegistry.get_forecast
ForecastRegistry.fmt
ForecastRegistry.forecast_exists
ForecastRegistry.build_tree

ExperimentRegistry
ExperimentRegistry.add_forecast_registry
ExperimentRegistry.get_forecast_registry
ExperimentRegistry.get_result
ExperimentRegistry.get_test_catalog
ExperimentRegistry.get_figure
ExperimentRegistry.result_exist
ExperimentRegistry.build_tree
ModelFileRegistry
ModelFileRegistry.fmt
ModelFileRegistry.get_input_catalog_key
ModelFileRegistry.get_forecast_key
ModelFileRegistry.get_args_key
ModelFileRegistry.get_input_dir
ModelFileRegistry.get_forecast_dir
ModelFileRegistry.get_args_template_path
ModelFileRegistry.forecast_exists
ModelFileRegistry.build_tree

ExperimentFileRegistry
ExperimentFileRegistry.add_model_registry
ExperimentFileRegistry.get_model_registry
ExperimentFileRegistry.get_result_key
ExperimentFileRegistry.get_test_catalog_key
ExperimentFileRegistry.get_figure_key
ExperimentFileRegistry.result_exist
ExperimentFileRegistry.build_tree


**Repositories**
@@ -225,8 +237,9 @@ catalogs, forecasts), abstracting the experiment logic from the pyCSEP io functi
CatalogRepository.set_main_catalog
CatalogRepository.catalog
CatalogRepository.get_test_cat
CatalogRepository.set_test_cat
CatalogRepository.set_input_cat
CatalogRepository.set_test_cats
CatalogRepository.set_input_cats
CatalogRepository.filter_catalog

GriddedForecastRepository
GriddedForecastRepository.load_forecast
4 changes: 2 additions & 2 deletions docs/reference/infrastructure.rst
@@ -7,12 +7,12 @@ and the required workflow to run an Experiment.
Registries
----------

.. autoclass:: floatcsep.infrastructure.registries.ForecastRegistry
.. autoclass:: floatcsep.infrastructure.registries.ModelFileRegistry
:members:
:undoc-members:
:show-inheritance:

.. autoclass:: floatcsep.infrastructure.registries.ExperimentRegistry
.. autoclass:: floatcsep.infrastructure.registries.ExperimentFileRegistry
:members:
:undoc-members:
:show-inheritance:
8 changes: 4 additions & 4 deletions docs/reference/utilities.rst
@@ -27,11 +27,11 @@ This section documents the `accessors` module.
:inherited-members:


Readers
-------
This section documents the `readers` module.
Readers and Parsers
-------------------
This section documents the `file_io` module.

.. automodule:: floatcsep.utils.readers
.. automodule:: floatcsep.utils.file_io
:members:
:undoc-members:
:show-inheritance:
2 changes: 1 addition & 1 deletion docs/tutorials/case_a.rst
@@ -46,7 +46,7 @@ The source code can be found in the ``tutorials/case_a`` folder or in `GitHub <
.. literalinclude:: ../../tutorials/case_a/catalog.csep
:caption: tutorials/case_a/catalog.csep

* The forecast ``best_model.dat`` to be evaluated is written in the ``.dat`` format (see :doc:`pycsep:concepts/forecasts`). Forecast formats are detected automatically (see :mod:`floatcsep.utils.readers.ForecastParsers`)
* The forecast ``best_model.dat`` to be evaluated is written in the ``.dat`` format (see :doc:`pycsep:concepts/forecasts`). Forecast formats are detected automatically (see :mod:`floatcsep.utils.file_io.GriddedForecastParsers`)

.. literalinclude:: ../../tutorials/case_a/best_model.dat
:caption: tutorials/case_a/best_model.dat
81 changes: 34 additions & 47 deletions floatcsep/experiment.py
@@ -77,7 +77,7 @@ class Experiment:

model_config (str): Path to the models' configuration file
test_config (str): Path to the evaluations' configuration file
run_mode (str): 'serial' or 'parallel'
run_mode (str): 'sequential' or 'parallel'
default_test_kwargs (dict): Default values for the testing
(seed, number of simulations, etc.)
postprocess (dict): Contains the instruction for postprocessing
@@ -99,10 +99,11 @@ def __init__(
catalog: str = None,
models: str = None,
tests: str = None,
exp_class: str = "ti",
postprocess: str = None,
default_test_kwargs: dict = None,
run_dir: str = "results",
run_mode: str = "serial",
run_mode: str = "sequential",
stage_dir: ... = "results",
report_hook: dict = None,
**kwargs,
@@ -155,14 +156,19 @@ def __init__(
f"\tMagnitude range: [{numpy.min(self.magnitudes)},"
f" {numpy.max(self.magnitudes)}]"
)
exp_class_str = (
"Time-Dependent"
if self.exp_class in ("td", "time-dependent")
else "Time-Independent"
)
log.info(f"\tExperiment class: {exp_class_str}")

self.catalog = None
self.models = []
self.tests = []

self.postprocess = postprocess if postprocess else {}
self.default_test_kwargs = default_test_kwargs

self.catalog_repo.set_main_catalog(catalog, self.time_config, self.region_config)

self.models = self.set_models(
@@ -347,35 +353,12 @@ def set_tests(self, test_config: Union[str, Dict, List]) -> list:

return tests

def set_test_cat(self, tstring: str) -> None:
"""
Filters the complete experiment catalog to a test sub-catalog bounded by the test
time-window. Writes it to filepath defined in :attr:`Experiment.registry`

Args:
tstring (str): Time window string
"""

self.catalog_repo.set_test_cat(tstring)

def set_input_cat(self, tstring: str, model: Model) -> None:
"""
Filters the complete experiment catalog to an input sub-catalog filtered to the
beginning of the test time-window.

Args:
tstring (str): Time window string
model (:class:`~floatcsep.model.Model`): Model to give the input
catalog
"""
self.catalog_repo.set_input_cat(tstring, model)

def set_tasks(self) -> None:
"""
Lazy definition of the experiment core tasks by wrapping instances,
methods and arguments. Creates a graph with task nodes, while assigning
task-parents to each node, depending on each Evaluation signature.
The tasks can then be run in serial as a list or asynchronous
The tasks can then be run sequentially as a list or asynchronously
using the graph's node dependencies.
For instance:

@@ -403,38 +386,39 @@ def set_tasks(self) -> None:
# Prepare the testing catalogs
task_graph = TaskGraph()
for time_i in tw_strings:
# The method call Experiment.set_test_cat(time_i) is created lazily
task_i = Task(instance=self, method="set_test_cat", tstring=time_i)
# The task is added to the task graph
task_i = Task(instance=self.catalog_repo, method="set_test_cats", tstring=time_i)
task_graph.add(task_i)
# the task will be executed later with Experiment.run()
# once all the tasks are defined
if self.exp_class in ["td", "time_dependent"]:
task_j = Task(
instance=self.catalog_repo,
method="set_input_cats",
tstring=time_i,
models=self.models,
)

task_graph.add(task=task_j)

# Set up the Forecasts creation
for time_i in tw_strings:
for model_j in self.models:
if isinstance(model_j, TimeDependentModel):
task_tj = Task(
instance=self, method="set_input_cat", tstring=time_i, model=model_j
)

task_graph.add(task=task_tj)
# A catalog needs to have been filtered

task_ij = Task(
instance=model_j,
method="create_forecast",
tstring=time_i,
force=self.force_rerun,
)

task_graph.add(task=task_ij)
# A catalog needs to have been filtered
if isinstance(model_j, TimeDependentModel):
task_graph.add_dependency(
task_ij, dep_inst=self, dep_meth="set_input_cat", dkw=(time_i, model_j)
task_ij,
dep_inst=self.catalog_repo,
dep_meth="set_input_cats",
dkw=(time_i, model_j),
)
task_graph.add_dependency(
task_ij, dep_inst=self, dep_meth="set_test_cat", dkw=time_i
task_ij, dep_inst=self.catalog_repo, dep_meth="set_test_cats", dkw=time_i
)

# Set up the Consistency Tests
@@ -450,6 +434,7 @@ def set_tasks(self) -> None:
region=self.region,
)
task_graph.add(task_ijk)

# the forecast needs to have been created
task_graph.add_dependency(
task_ijk, dep_inst=model_j, dep_meth="create_forecast", dkw=time_i
@@ -531,7 +516,6 @@ def set_tasks(self) -> None:
task_graph.add_dependency(
task_k, dep_inst=m_j, dep_meth="create_forecast", dkw=time_str
)

self.task_graph = task_graph

def run(self) -> None:
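
The lazy tasks and parent dependencies built in ``set_tasks`` can be illustrated with a small standalone sketch. The classes ``LazyTask``, ``MiniTaskGraph`` and ``CatalogRecorder`` below are hypothetical stand-ins, not the floatCSEP ``Task`` and ``TaskGraph`` classes, whose real signatures (e.g. ``dep_inst``, ``dep_meth``, ``dkw``) differ. The sketch only shows the idea: a task stores an instance, a method name and keyword arguments, and the graph runs each task after its parents.

```python
class LazyTask:
    """Stores an instance, a method name and keyword arguments for a later call."""

    def __init__(self, instance, method, **kwargs):
        self.instance = instance
        self.method = method
        self.kwargs = kwargs

    def run(self):
        # The method is only looked up and called here, not at definition time.
        return getattr(self.instance, self.method)(**self.kwargs)


class MiniTaskGraph:
    """Hypothetical graph: maps each task to the parent tasks it depends on."""

    def __init__(self):
        self.tasks = {}

    def add(self, task):
        self.tasks.setdefault(task, [])

    def add_dependency(self, task, parent):
        self.tasks[task].append(parent)

    def run(self):
        done = set()

        def visit(task):
            if task in done:
                return
            for parent in self.tasks.get(task, []):
                visit(parent)  # parents always run first
            task.run()
            done.add(task)

        for task in self.tasks:
            visit(task)


class CatalogRecorder:
    """Stand-in object that records the order in which its methods are called."""

    def __init__(self):
        self.calls = []

    def set_test_cats(self, tstring):
        self.calls.append(("set_test_cats", tstring))

    def create_forecast(self, tstring):
        self.calls.append(("create_forecast", tstring))


recorder = CatalogRecorder()
graph = MiniTaskGraph()
forecast_task = LazyTask(recorder, "create_forecast", tstring="2020_2021")
catalog_task = LazyTask(recorder, "set_test_cats", tstring="2020_2021")
graph.add(forecast_task)  # added first, but depends on the catalog task
graph.add(catalog_task)
graph.add_dependency(forecast_task, catalog_task)
graph.run()
call_order = [name for name, _ in recorder.calls]
```

Even though the forecast task is registered first, the catalog task runs before it, because the dependency is resolved at run time.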
@@ -544,13 +528,16 @@ def run(self) -> None:
- Memory monitor?
- Queuer?
"""
log.info(f"Running {self.task_graph.ntasks} tasks")

if self.seed:
numpy.random.seed(self.seed)

self.task_graph.run()
log.info("Calculation completed")
if self.run_mode == "parallel":
cpu_count = os.cpu_count() or 4
workers = getattr(self, "concurrent_tasks", None) or min(cpu_count, 32)
self.task_graph.run_parallel(max_workers=workers)
else:
self.task_graph.run()
log.debug("Post-run forecast registry")
log_models_tree(log, self.registry, self.time_windows)
log.debug("Post-run result summary")
@@ -677,7 +664,7 @@ def from_yml(cls, config_yml: str, repr_dir=None, **kwargs):
Returns:
An :class:`~floatcsep.experiment.Experiment` class instance
"""
log.info("Initializing experiment from .yml file")
log.info(f"Initializing experiment from {config_yml} file")
with open(config_yml, "r") as yml:

# experiment configuration file