Summary
Currently, every GeoDataFrame-based executor (LUCCVectorExecutor, future
raster/vector variants) reimplements the same boilerplate for load(),
validate(), and save(). The only genuinely domain-specific method is
run(). This issue proposes extracting a shared GeoDataFrameExecutor base
class to eliminate that duplication.
Motivation
LUCCVectorExecutor.load() and save() are already nearly identical to what
any future GDF executor would need:
- load(): calls load_dataset(), captures the checksum, applies column_map
- validate(): checks column_map keys against the model spec
- save(): calls save_dataset(), records output_sha256, sets status
- _check_columns(): post-load column presence check
These four concerns are infrastructure, not domain logic. A second executor
(e.g., a raster variant or a new LUCC model) would duplicate them verbatim —
which is the signal that a base class is warranted.
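The two column checks listed above can be sketched with the standard library alone. This is only an illustration under stated assumptions: the function names, the error messages, and the reading of column_map as "model column name -> dataset column name" are hypothetical and not taken from the dissmodel code.

```python
from typing import Iterable, Mapping


def validate_column_map(column_map: Mapping[str, str], spec_columns: Iterable[str]) -> None:
    """Pre-load check: every column_map key must be a column the model spec declares.

    Assumption: keys are model column names, values are dataset column names.
    """
    unknown = sorted(set(column_map) - set(spec_columns))
    if unknown:
        raise ValueError(f"column_map has keys not in the model spec: {unknown}")


def check_columns(present: Iterable[str], required: Iterable[str]) -> None:
    """Post-load check: every required column must exist once renaming is done."""
    missing = sorted(set(required) - set(present))
    if missing:
        raise ValueError(f"dataset is missing required columns: {missing}")


spec = ["land_use", "slope"]  # hypothetical model spec columns
validate_column_map({"land_use": "uso_2020"}, spec)  # passes silently
check_columns(["land_use", "slope", "geometry"], spec)  # passes silently
```

Neither function knows anything about LUCC, which is the argument for hosting this logic in a shared base class.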
Proposed Design
New file: dissmodel/executor/geodataframe.py
    class GeoDataFrameExecutor(ModelExecutor):
        """
        Base class for executors that operate on GeoDataFrames.

        Provides default implementations of load(), validate(), and save()
        that cover the standard GDF contract (load_dataset, column_map,
        checksum, save_dataset). Subclasses only need to implement run().
        """

        output_ext: str = "gpkg"  # subclasses may override (e.g. "geojson")

        def load(self, record: ExperimentRecord) -> gpd.GeoDataFrame: ...

        def validate(self, record: ExperimentRecord) -> None: ...

        def save(self, result: gpd.GeoDataFrame, record: ExperimentRecord) -> ExperimentRecord: ...

        def run(self, data: gpd.GeoDataFrame, record: ExperimentRecord) -> gpd.GeoDataFrame:
            raise NotImplementedError
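The docstring mentions a checksum and an overridable output_ext. The sketch below shows one way a default save() might derive the output path and the value stored in output_sha256. The helper names, the experiment_id parameter, and the file layout are assumptions made for illustration; the actual save_dataset() contract may differ.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large outputs are never read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def output_path(out_dir: Path, experiment_id: str, output_ext: str = "gpkg") -> Path:
    """Build the output file name from a (hypothetical) experiment id and the class-level extension."""
    return out_dir / f"{experiment_id}.{output_ext}"
```

With helpers like these, a subclass that sets output_ext = "geojson" changes only the file suffix; the hashing and bookkeeping stay in the base class.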
Reduced executor definition
After this change, LUCCVectorExecutor becomes:
    class LUCCVectorExecutor(GeoDataFrameExecutor):
        name = "lucc_vector"

        def run(self, data: gpd.GeoDataFrame, record: ExperimentRecord) -> gpd.GeoDataFrame:
            # only domain logic: Environment, Demand, Potential, Allocation
            ...
The three infrastructure overrides (load(), validate(), save()) disappear from the subclass; run() is the only method left to implement.
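The division of labour can be tried out without geopandas. In this stdlib-only sketch a list of dicts stands in for a GeoDataFrame, and Record, FrameExecutor, DoubleArea, spec_columns, execute(), and the "completed" status value are all hypothetical stand-ins rather than dissmodel names. The reading of column_map as "model name -> source name" is likewise an assumption.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Dict, List

Frame = List[Dict[str, float]]  # stdlib stand-in for gpd.GeoDataFrame


@dataclass
class Record:  # hypothetical stand-in for ExperimentRecord
    rows: Frame
    column_map: Dict[str, str] = field(default_factory=dict)  # assumed: model name -> source name
    status: str = "pending"
    output_sha256: str = ""


class FrameExecutor:  # plays the role of GeoDataFrameExecutor
    output_ext = "gpkg"
    spec_columns = ()  # hypothetical: column names the model spec declares

    def validate(self, record: Record) -> None:
        unknown = sorted(set(record.column_map) - set(self.spec_columns))
        if unknown:
            raise ValueError(f"column_map keys not in model spec: {unknown}")

    def load(self, record: Record) -> Frame:
        # Rename source columns to the names the model expects.
        rename = {src: name for name, src in record.column_map.items()}
        return [{rename.get(k, k): v for k, v in row.items()} for row in record.rows]

    def run(self, data: Frame, record: Record) -> Frame:
        raise NotImplementedError

    def save(self, result: Frame, record: Record) -> Record:
        # A real save() would write a file with the output_ext suffix; here we only hash.
        payload = json.dumps(result, sort_keys=True).encode("utf-8")
        record.output_sha256 = hashlib.sha256(payload).hexdigest()
        record.status = "completed"  # assumed status value
        return record

    def execute(self, record: Record) -> Record:  # hypothetical driver
        self.validate(record)
        return self.save(self.run(self.load(record), record), record)


class DoubleArea(FrameExecutor):
    """A subclass supplies only run(), plus an optional output_ext override."""

    output_ext = "geojson"
    spec_columns = ("area",)

    def run(self, data: Frame, record: Record) -> Frame:
        return [{**row, "area": row["area"] * 2} for row in data]


record = DoubleArea().execute(Record(rows=[{"a_ha": 2.0}], column_map={"area": "a_ha"}))
print(record.status, record.output_sha256[:8])
```

The subclass body contains no loading, validation, or bookkeeping code, which is the shape the proposal aims for in LUCCVectorExecutor.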
Scope
In scope:
- Extract GeoDataFrameExecutor into dissmodel/executor/geodataframe.py
- Migrate LUCCVectorExecutor (in dissluc) to inherit from it
- Export GeoDataFrameExecutor from dissmodel.executor
- Update docstrings and type hints
Out of scope (tracked separately):
- Declarative pipeline via TOML (model.pipeline driving component instantiation): more powerful, but trades flexibility for convention; worth evaluating post-JOSS
- Raster base class (XarrayExecutor or similar): follow-up issue once a second raster executor exists
Acceptance Criteria
- GeoDataFrameExecutor exists in dissmodel/executor/geodataframe.py
- LUCCVectorExecutor inherits from it with only run() overridden
- GeoDataFrameExecutor is exported from dissmodel.executor.__init__
- output_ext override pattern
Notes
The right moment to implement this is when a second GDF executor is
introduced: at that point the duplication cost becomes concrete and the
abstraction boundary is validated by two real use cases. This issue can be
picked up speculatively before then if bandwidth allows.