Skip to content

ComparerCollection: clarify the collection abstraction #651

@ecomodeller

Description

@ecomodeller

While reviewing ComparerCollection, a few things make the "collection" part less clear than it could be. None are urgent; filing for later.

1. __iter__ violates the Mapping contract

ComparerCollection inherits from collections.abc.Mapping, but:

def __iter__(self) -> Iterator[Comparer]:
    return iter(self._comparers.values())

Mapping.__iter__ should yield keys. Because the Mapping mixin builds keys(), items(), and __contains__ on top of __iter__, the consequences are:

  • list(cc) → list of Comparer, not names
  • cc.keys() → yields Comparer (KeysView delegates to __iter__)
  • cc.items()(Comparer, Comparer) pairs
  • dict(cc) is broken

Either drop the Mapping base (it's a custom name-keyed container that happens to be dict-like) or fix __iter__ to yield names and update callers.

2. Container vs. aggregator are conflated

The class fuses two responsibilities:

  • Container: storage, indexing, iteration, sel/query/rename fan-out, merge, __getitem__ overloads, zip-based save/load.
  • Aggregator: skill(observed=...), mean_skill, weighted score, gridded_skill, plus the supporting machinery (_append_xy_to_res, _attrs_keys_in_by, _add_as_col_if_not_in_index, _mean_skill_by).

They don't need to be separate classes, but splitting them in the file (or factoring the aggregator into a module of free functions on a collection) would make the responsibilities legible.

3. __getitem__ returns different types based on argument shape

cc["alti"]Comparer; cc[["alti"]]ComparerCollection. Overloads document it, but a single-element list looks like it should be symmetric with the scalar form. Worth at least a docstring callout.

4. Unenforced invariants leak into helpers

  • _unit_text returns self[0]._unit_text with a comment "it should be the same for all" — the check is commented out.
  • _name is hardcoded to "Observations".

Small, but they signal that the collection isn't sure what it represents when children disagree.

5. Empty-drop is implicit and inconsistent across selection methods

  • sel drops children with n_points == 0.
  • query filters then drops empties.
  • rename / filter_by_attrs don't drop.

A reader has to inspect each method to know whether obs_names is preserved.

Suggested first step

Fix the Mapping contract — smallest change, clearest payoff, crispest answer to "what is the collection part."

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions