26 changes: 26 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,31 @@
# Change log

### 0.2.12

#### Added
- Per-geometry schema support: auto-picks Point/LineString/Polygon schemas with sensible defaults.
- Structured per-feature **issues** output (former “fixme”): one best, human-friendly message per feature.
- Friendly error formatter:
  - Compacts `Enum` errors.
  - Summarizes `anyOf` by unioning required keys → “must include one of: …”.
- `_feature_index_from_error()` to reliably extract `feature_index` from `jsonschema_rs` error paths.
- `_get_colset()` utility for safe set extraction with diagnostics for missing columns.
- Unit tests covering helpers, schema selection, and issues aggregation.

#### Changed
- `validate()` now **streams** `jsonschema_rs` errors; legacy `errors` list remains but is capped by `max_errors`.
- `ValidationResult` now includes `issues`.
- Schema selection prefers geometry from the first feature; falls back to filename heuristics (`nodes/points`, `edges/lines`, `zones/polygons`).
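
The selection order described in the last item can be sketched roughly as follows. This is only an illustration: the real logic lives in `__init__.py`, whose diff is not rendered here, so `pick_schema_kind`, the two lookup tables, and the substring matching on the filename are all assumptions.

```python
from typing import Optional

# Hypothetical lookup tables; the names and return values are illustrative only.
GEOMETRY_TO_KIND = {"Point": "point", "LineString": "linestring", "Polygon": "polygon"}
FILENAME_HINTS = (
    (("nodes", "points"), "point"),
    (("edges", "lines"), "linestring"),
    (("zones", "polygons"), "polygon"),
)


def pick_schema_kind(geojson: dict, filename: str) -> Optional[str]:
    """Prefer the first feature's geometry type, then fall back to the filename."""
    features = geojson.get("features") or []
    if features:
        geometry = (features[0] or {}).get("geometry") or {}
        kind = GEOMETRY_TO_KIND.get(geometry.get("type"))
        if kind:
            return kind
    # Fallback: assumed to be a case-insensitive substring match on the filename.
    lowered = filename.lower()
    for hints, kind in FILENAME_HINTS:
        if any(hint in lowered for hint in hints):
            return kind
    return None


by_geometry = pick_schema_kind(
    {"features": [{"geometry": {"type": "LineString"}}]}, "data.geojson"
)
by_filename = pick_schema_kind({"features": []}, "wa.seattle.nodes.geojson")
no_match = pick_schema_kind({}, "unknown.geojson")
```

Here `by_geometry` is chosen from the feature itself, `by_filename` only because the collection is empty, and `no_match` is `None` since neither source gives a hint.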

#### Fixed
- Robust GeoJSON/extension handling:
  - Safe fallback to index when `_id` is missing.
  - Non-serializable property detection in extensions (with clear messages).
  - Safer flattening of `_w_id` (list-like) for zone validations.

#### Migration Notes
- Prefer consuming `ValidationResult.issues` for per-feature UX and tooling.

### 0.2.11

- Fixed [BUG-2065](https://dev.azure.com/TDEI-UW/TDEI/_workitems/edit/2065/)
445 changes: 333 additions & 112 deletions src/python_osw_validation/__init__.py

Large diffs are not rendered by default.

97 changes: 97 additions & 0 deletions src/python_osw_validation/helpers.py
@@ -0,0 +1,97 @@
from typing import Optional
import re

def _feature_index_from_error(err) -> Optional[int]:
    """
    Return the index after 'features' in the instance path, else None.
    Works with jsonschema_rs errors.
    """
    path = list(getattr(err, "instance_path", []))
    for i, seg in enumerate(path):
        if seg == "features" and i + 1 < len(path) and isinstance(path[i + 1], int):
            return path[i + 1]
    return None
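
A quick way to see the behaviour of this helper without running a validator is to feed it stand-in objects. In the sketch below the function body is copied from the diff; the `SimpleNamespace` errors and their paths are made up for illustration and are not real `jsonschema_rs` errors.

```python
from types import SimpleNamespace
from typing import Optional


def _feature_index_from_error(err) -> Optional[int]:
    # Copied from the diff so the example runs on its own.
    path = list(getattr(err, "instance_path", []))
    for i, seg in enumerate(path):
        if seg == "features" and i + 1 < len(path) and isinstance(path[i + 1], int):
            return path[i + 1]
    return None


# Stand-in errors (assumed shape: only an `instance_path` attribute).
inside_feature = SimpleNamespace(instance_path=["features", 3, "properties", "highway"])
top_level = SimpleNamespace(instance_path=["type"])
no_path = SimpleNamespace()

idx_inside = _feature_index_from_error(inside_feature)
idx_top = _feature_index_from_error(top_level)
idx_missing = _feature_index_from_error(no_path)
```

Only the first error sits under `features`, so only it yields an index; the other two fall through to `None`.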

def _err_kind(err) -> str:
    """
    Best-effort classification of error kind.
    Prefers jsonschema_rs 'kind', falls back to 'validator', then message.
    """
    kobj = getattr(err, "kind", None)
    if kobj is not None:
        return type(kobj).__name__.split("_")[-1]  # e.g. 'AnyOf', 'Enum', 'Required'
    v = getattr(err, "validator", None)
    if isinstance(v, str):
        return v[0].upper() + v[1:]  # 'anyOf' -> 'AnyOf'
    msg = getattr(err, "message", "") or ""
    return "AnyOf" if "anyOf" in msg else ""
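
The three fallbacks can be exercised with stand-in objects. The function is copied from the diff; `FakeKind_Enum` is a made-up class that merely imitates a `Prefix_Variant` class name, which is what the `split("_")` call appears to expect.

```python
from types import SimpleNamespace


def _err_kind(err) -> str:
    # Copied from the diff so the example runs on its own.
    kobj = getattr(err, "kind", None)
    if kobj is not None:
        return type(kobj).__name__.split("_")[-1]
    v = getattr(err, "validator", None)
    if isinstance(v, str):
        return v[0].upper() + v[1:]
    msg = getattr(err, "message", "") or ""
    return "AnyOf" if "anyOf" in msg else ""


class FakeKind_Enum:
    """Hypothetical stand-in for a kind object; not a real library class."""


from_kind = _err_kind(SimpleNamespace(kind=FakeKind_Enum()))
from_validator = _err_kind(SimpleNamespace(validator="anyOf"))
from_message = _err_kind(SimpleNamespace(message="not valid under anyOf"))
unknown = _err_kind(SimpleNamespace(message="something else"))
```

Each call hits a different branch: the class name, the `validator` string, the message text, and finally the empty-string default.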


def _clean_enum_message(err) -> str:
    """Compact enum error (strip ‘…or N other candidates’)."""
    msg = getattr(err, "message", "") or ""
    msg = re.sub(r"\s*or\s+\d+\s+other candidates", "", msg)
    return msg.split("\n")[0]


def _pretty_message(err, schema) -> str:
    """
    Convert a jsonschema_rs error to a concise, user-friendly string.

    Special handling:
      - Enum  → compact message
      - AnyOf → summarize the union of 'required' fields across branches:
                "must include one of: <fields>"
    """
    kind = _err_kind(err)

    if kind == "Enum":
        return _clean_enum_message(err)

    if kind == "AnyOf":
        # Follow schema_path to the anyOf node; union of 'required' keys in branches.
        sub = schema
        try:
            for seg in getattr(err, "schema_path", []):
                sub = sub[seg]

            required = set()

            def crawl(node):
                if isinstance(node, dict):
                    if isinstance(node.get("required"), list):
                        required.update(node["required"])
                    for key in ("allOf", "anyOf", "oneOf"):
                        if isinstance(node.get(key), list):
                            for child in node[key]:
                                crawl(child)
                elif isinstance(node, list):
                    for child in node:
                        crawl(child)

            crawl(sub)

            if required:
                props = ", ".join(sorted(required))
                return f"must include one of: {props}"
        except Exception:
            pass

    # Default: first line from library message
    return (getattr(err, "message", "") or "").split("\n")[0]
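
The `anyOf` branch is the least obvious part, so here is a small standalone sketch of the same idea: walk the schema along a path, then collect every `required` key found in the branches. `summarize_any_of`, the toy schema, and the path are hypothetical and exist only for this illustration.

```python
def summarize_any_of(schema: dict, schema_path: list) -> str:
    """Union the 'required' keys under the node that schema_path points at."""
    node = schema
    for seg in schema_path:
        node = node[seg]

    required = set()

    def crawl(n):
        if isinstance(n, dict):
            if isinstance(n.get("required"), list):
                required.update(n["required"])
            for key in ("allOf", "anyOf", "oneOf"):
                if isinstance(n.get(key), list):
                    for child in n[key]:
                        crawl(child)
        elif isinstance(n, list):
            for child in n:
                crawl(child)

    crawl(node)
    if not required:
        return ""
    return "must include one of: " + ", ".join(sorted(required))


# Toy schema: three branches, one of them nested under allOf.
toy_schema = {
    "properties": {
        "features": {
            "items": {
                "anyOf": [
                    {"required": ["highway"]},
                    {"allOf": [{"required": ["footway"]}, {"required": ["highway"]}]},
                    {"required": ["barrier"]},
                ]
            }
        }
    }
}
summary = summarize_any_of(toy_schema, ["properties", "features", "items", "anyOf"])
print(summary)
```

Because the keys are collected into a set and sorted, duplicates across branches collapse and the output order is stable. Note that the union also flattens `allOf` groups, so the message lists candidate keys rather than exact valid combinations.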


def _rank_for(err) -> tuple:
    """
    Ranking for 'best' error per feature.
    Prefer Enum > (Type/Required/Const) > (Pattern/Minimum/Maximum) > others.
    """
    kind = _err_kind(err)
    order = (
        0 if kind == "Enum" else
        1 if kind in {"Type", "Required", "Const"} else
        2 if kind in {"Pattern", "Minimum", "Maximum"} else
        3
    )
    length = len(getattr(err, "message", "") or "")
    return (order, length)