Second set of data flow and data service pre-work #144

Merged
christophertubbs merged 30 commits into NOAA-OWP:master from launch_jobs/forcings_2
Feb 24, 2022

@robertbartel robertbartel commented Feb 24, 2022

Continuation of work started in #142:

  • Extending JobExecStep values
  • Adopting use of DataRequirement for modeling job worker data requirements
  • Redesign of JobStatus to be composite type, rather than composite enum
  • Addition of abstract Dataset and DatasetManager types
  • Further work on modeldata.data metadata types
    • In particular, adjusting DataRequirement class so it can be made concrete, removing obsoleted CatchmentDataRequirement class, and changing test_catchment_data_requirement.py to test_data_requirement.py
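
As a rough illustration of the JobStatus redesign listed above (a composite type rather than a composite enum), here is a minimal sketch. The enum members and the `to_dict`/`from_dict` method names are assumptions made for this example only; they are not taken from the DMOD code.

```python
from dataclasses import dataclass
from enum import Enum


class JobExecPhase(Enum):
    # Hypothetical members; the real values are not listed in this PR.
    INIT = 1
    MODEL_EXEC = 2
    CLOSED = 3


class JobExecStep(Enum):
    # Hypothetical members, for illustration only.
    DEFAULT = 0
    AWAITING_DATA = 1
    RUNNING = 2
    COMPLETED = 3


@dataclass(frozen=True)
class JobStatus:
    """Plain composite of a phase and a step; not itself an enum."""

    phase: JobExecPhase
    step: JobExecStep

    def to_dict(self) -> dict:
        # Serialize by enum member name so the result is JSON-friendly.
        return {"phase": self.phase.name, "step": self.step.name}

    @classmethod
    def from_dict(cls, data: dict) -> "JobStatus":
        # Look the members back up by name when deserializing.
        return cls(JobExecPhase[data["phase"]], JobExecStep[data["step"]])


status = JobStatus(JobExecPhase.MODEL_EXEC, JobExecStep.RUNNING)
restored = JobStatus.from_dict(status.to_dict())
```

With this shape, adding a new JobExecStep value does not require enumerating every phase/step pairing in a third enum.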

Fixing bug resulting from extending MaaSRequest to ModelExecRequest but
not updating all usage (here, in job.py).

Adding new type in job.py to encapsulate the definition of data input or
output needs for a scheduled worker.

Redesigning JobStatus class to be a serializable composite of
JobExecPhase and JobExecStep, which it was before anyway, but without
itself redundantly being another enum type.

Replacing internal WorkerDataRequirement type with DataRequirement type
from dmod.modeldata, and adopting its use within Job.

Extending DataFormat metadata type to initialize with indices and (if
needed) implied indices (i.e., not also data fields).

Adding new ContinuousRestriction, DiscreteRestriction, and DataDomain
types for defining and filtering data domains.

Updating DataDomain type to also contain a DataFormat property, to let
it encapsulate the data fields in the domain effectively.

Updating DataRequirement type to use the DataDomain to define its domain
instead of a generic type variable; with this done, making it a concrete
type and removing deprecated CatchmentDataRequirement subtype.

Renaming to test_data_requirement.py after making DataRequirement type
concrete and removing CatchmentDataRequirement type.

Updating tests after recent redesign that involved removing the
CatchmentDataRequirement subtype.

Removing URI in favor of simple string access location property, and
adding UUID properties to Dataset and DatasetManager.

Adding new DatasetUser type and functions within DatasetManager for
managing known users of a Dataset.
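
To make the domain and restriction types described above more concrete, the following is a hedged sketch of how a DataDomain built from continuous and discrete restrictions could be used to check whether available data covers a DataRequirement. All field names, the `contains` methods, the `is_input` flag, and the use of a plain string in place of DataFormat are assumptions for this example, not the actual dmod.modeldata API.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class ContinuousRestriction:
    variable: str
    begin: Any
    end: Any

    def contains(self, other: "ContinuousRestriction") -> bool:
        # Covered when the other range lies fully inside this one.
        return (self.variable == other.variable
                and self.begin <= other.begin
                and other.end <= self.end)


@dataclass
class DiscreteRestriction:
    variable: str
    values: List[Any]

    def contains(self, other: "DiscreteRestriction") -> bool:
        # Covered when every requested value is present here.
        return (self.variable == other.variable
                and set(other.values) <= set(self.values))


@dataclass
class DataDomain:
    data_format: str  # stand-in for the DataFormat type (assumption)
    continuous: List[ContinuousRestriction] = field(default_factory=list)
    discrete: List[DiscreteRestriction] = field(default_factory=list)

    def contains(self, other: "DataDomain") -> bool:
        # Assumption: each restriction in `other` must be covered by a
        # restriction on the same variable in this domain.
        if self.data_format != other.data_format:
            return False
        return (all(any(r.contains(n) for r in self.continuous)
                    for n in other.continuous)
                and all(any(r.contains(n) for r in self.discrete)
                        for n in other.discrete))


@dataclass
class DataRequirement:
    # Concrete type: the domain is a DataDomain, not a generic type variable.
    domain: DataDomain
    is_input: bool


available = DataDomain(
    "csv",
    [ContinuousRestriction("time", 0, 100)],
    [DiscreteRestriction("catchment_id", ["cat-1", "cat-2", "cat-3"])],
)
requirement = DataRequirement(
    DataDomain(
        "csv",
        [ContinuousRestriction("time", 10, 20)],
        [DiscreteRestriction("catchment_id", ["cat-1"])],
    ),
    is_input=True,
)
satisfied = available.contains(requirement.domain)
```

Because the domain carries both the format and the restrictions, a separate subtype per kind of data (such as the removed CatchmentDataRequirement) is not needed in this sketch.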
@robertbartel robertbartel added the enhancement (New feature or request) and maas (MaaS Workstream) labels Feb 24, 2022
@robertbartel robertbartel added this to the 1.0.0 (AGU FIH) milestone Feb 24, 2022
@robertbartel robertbartel self-assigned this Feb 24, 2022

@christophertubbs christophertubbs left a comment

I like what I see; I look forward to using it.

@christophertubbs christophertubbs merged commit 432374e into NOAA-OWP:master Feb 24, 2022
@robertbartel

Relates to #128.

@robertbartel robertbartel deleted the launch_jobs/forcings_2 branch March 16, 2022 14:02
donaldwj pushed a commit to donaldwj/DMOD that referenced this pull request Jun 21, 2022
* Fix bug related to job types and ModelExecRequest.

Fixing bug resulting from extending MaaSRequest to ModelExecRequest but
not updating all usage (here, in job.py).

* Add uri as dependency for scheduler package.

* New WorkerDataRequirement helper type for Jobs.

Adding new type in job.py to encapsulate the definition of data input or
output needs for a scheduled worker.

* Add new JobExecStep values and update docstring.

* Redesign JobStatus type.

Redesigning JobStatus class to be a serializable composite of
JobExecPhase and JobExecStep, which it was before anyway, but without
itself redundantly being another enum type.

* Update tests to account for redesigned JobStatus.

* Adopt DataRequirement type from modeldata.

Replacing internal WorkerDataRequirement type with DataRequirement type
from dmod.modeldata, and adopting its use within Job.

* Update scheduler package to depend on modeldata.

* Update DataFormat with is_time_series and fields.

* Add TimeRange importable from dmod.modeldata.data.

* Add abstract Dataset and DatasetManager types.

* Extending features of DataFormat.

Extending DataFormat metadata type to initialize with indices and (if
needed) implied indices (i.e., not also data fields).

* Add data domain and restriction metadata types.

Adding new ContinuousRestriction, DiscreteRestriction, and DataDomain
types for defining and filtering data domains.

* Adjust TimeRange to extend ContinuousRestriction.

* Refactor meta_data.py var/property names and doc.

* Refactor class order in meta_data.py.

* Update DataDomain to contain a data format.

Updating DataDomain type to also contain a DataFormat property, to let
it encapsulate the data fields in the domain effectively.

* Update DataRequirement to use DataDomain directly.

Updating DataRequirement type to use the DataDomain to define its domain
instead of a generic type variable; with this done, making it a concrete
type and removing deprecated CatchmentDataRequirement subtype.

* Update modeldata.data init for new/removed types.

* Rename test_catchment_data_requirement.py file.

Renaming to test_data_requirement.py after making DataRequirement type
concrete and removing CatchmentDataRequirement type.

* Add DataDomain to modeldata.data package init.

* Fix typo in type hints for DataFormat.__init__.

* Adjust modeldata.data init for circular imports.

* Update DataRequirement unit tests.

Updating tests after recent redesign that involved removing the
CatchmentDataRequirement subtype.

* Remove URI and add UUIDs in dataset and manager.

Removing URI in favor of simple string access location property, and
adding UUID properties to Dataset and DatasetManager.

* Add DatasetManager uuid property function.

* Add DatasetUser and manager funcs to link.

Adding new DatasetUser type and functions within DatasetManager for
managing known users of a Dataset.

* Fix dataset.py imports after changes to package.

* Update some Dataset docstring and func signatures.

* Add abstract Dataset list_files function.
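
The dataset-related commits above (string access location, UUID properties, DatasetUser linking, abstract `list_files`) could fit together roughly as in the sketch below. The method names `link_user`, `unlink_user`, and `get_user_ids`, the `InMemoryDataset` and `JobUser` helper classes, and the non-abstract DatasetManager are all hypothetical choices for this illustration and should not be read as the DMOD implementation.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Set
from uuid import UUID, uuid4


class DatasetUser(ABC):
    """Something (e.g., a job) that uses a dataset."""

    @property
    @abstractmethod
    def uuid(self) -> UUID:
        ...


class Dataset(ABC):
    def __init__(self, name: str, access_location: str):
        self.name = name
        # Simple string location rather than a URI object.
        self.access_location = access_location
        self.uuid: UUID = uuid4()

    @abstractmethod
    def list_files(self) -> List[str]:
        ...


class DatasetManager:
    # Kept concrete here for brevity; the PR describes an abstract type.
    def __init__(self):
        self.uuid: UUID = uuid4()
        # Maps a dataset's UUID to the UUIDs of its known users.
        self._users: Dict[UUID, Set[UUID]] = {}

    def link_user(self, user: DatasetUser, dataset: Dataset) -> None:
        self._users.setdefault(dataset.uuid, set()).add(user.uuid)

    def unlink_user(self, user: DatasetUser, dataset: Dataset) -> None:
        self._users.get(dataset.uuid, set()).discard(user.uuid)

    def get_user_ids(self, dataset: Dataset) -> Set[UUID]:
        # Return a copy so callers cannot mutate internal state.
        return set(self._users.get(dataset.uuid, set()))


class InMemoryDataset(Dataset):
    """Hypothetical concrete dataset used only for this example."""

    def __init__(self, name: str, access_location: str, files: List[str]):
        super().__init__(name, access_location)
        self._files = list(files)

    def list_files(self) -> List[str]:
        return list(self._files)


class JobUser(DatasetUser):
    """Hypothetical dataset user used only for this example."""

    def __init__(self):
        self._uuid = uuid4()

    @property
    def uuid(self) -> UUID:
        return self._uuid


manager = DatasetManager()
dataset = InMemoryDataset("forcings", "/data/forcings", ["cat-1.csv"])
user = JobUser()
manager.link_user(user, dataset)
```

Tracking users by UUID lets a manager tell whether a dataset is still in use without holding references to the user objects themselves.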
christophertubbs pushed a commit to christophertubbs/DMOD that referenced this pull request Jan 24, 2023