offset implementation by madanucd · Pull Request #110 · linkml/linkml-map

madanucd · 2026-01-08T11:59:50Z

Add declarative offset support for slot derivations (fixes #96)

This PR adds support for declarative numeric offsets in slot derivations, addressing
#96.

Offsets allow common baseline-relative calculations (e.g. age at measurement,
baseline subtraction) to be expressed directly in the mapping specification,
without requiring custom expressions. The approach is explicit, schema-driven,
and supports reversible transformations.

Offset fields

Field	Description
`offset_field`	Source field providing the numeric offset
`offset_value`	Multiplier applied to the offset field (default: `1`)
`offset_reverse`	If `true`, subtracts instead of adding

Behavior

Offsets are applied only when:

populated_from is present
the offset_field exists in the source object
all involved values are numeric

Offsets are skipped when:

the slot has a constant value
the offset_field is missing
the derived value is None

Additional rules:

non-numeric values raise a TypeError
default behavior is additive

Tests

Three end-to-end tests were added at the ObjectTransformer.map_object level:

Offset is skipped when the offset field is missing
Offset is applied using addition
Offset is applied using subtraction when offset_reverse is enabled

Pydantic model

Pydantic model used in this change was generated directly from the LinkML
schema using gen-pydantic.

Copilot

Pull request overview

This PR adds declarative offset support for slot derivations in linkml-map, enabling baseline-relative calculations (e.g., age at measurement, baseline subtraction) to be expressed directly in mapping specifications without requiring custom expressions.

Key changes:

Adds Offset configuration class with offset_field, offset_value, and offset_reverse fields
Implements offset calculation logic in ObjectTransformer.map_object for the populated_from branch
Includes three end-to-end tests covering: offset skipping when field is missing, additive offset, and subtractive offset

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 10 comments.

File	Description
`src/linkml_map/datamodel/transformer_model.yaml`	Adds `Offset` class definition with required fields and documentation for the offset feature
`src/linkml_map/datamodel/transformer_model.py`	Generated Pydantic model for the `Offset` class with proper field types and validation
`src/linkml_map/transformer/object_transformer.py`	Implements offset calculation logic within the `populated_from` branch, applying numeric offsets when conditions are met
`tests/test_transformer/test_object_transformer.py`	Adds three comprehensive tests covering offset behavior: missing field, addition, and subtraction scenarios

Copilot · 2026-01-08T14:28:32Z

+                # ---- OFFSET HANDLING (SAFE) ----
+                if getattr(slot_derivation, "offset", None) and v is not None:
+                    off = slot_derivation.offset
+                    off_field_val = source_obj.get(getattr(off, "offset_field", None), None)
+
+                    if off_field_val is not None:
+                        try:
+                            delta = getattr(off, "offset_value", 0) * off_field_val
+                            v = v - delta if getattr(off, "offset_reverse", False) else v + delta
+                            logger.debug(
+                                f"Offset calculation for '{slot_derivation.name}': "
+                                f"{source_obj.get(slot_derivation.populated_from)} "
+                                f"{'-' if getattr(off, 'offset_reverse', False) else '+'} "
+                                f"({getattr(off, 'offset_value', 0)} * {off_field_val}) = {v}"
+                            )
+                        except (TypeError, ValueError) as e:
+                            raise TypeError(
+                                f"Cannot perform offset calculation for slot '{slot_derivation.name}': "
+                                f"values must be numeric (base={v}, offset_field={off_field_val})"
+                            ) from e
+                    else:
+                        logger.debug(
+                            f"Offset field '{getattr(off, 'offset_field', None)}' not found in source object; "
+                            f"skipping offset calculation for '{slot_derivation.name}'"
+                        )
+                # ---------------------------------


The offset is applied after value_mappings (line 299-301), which means if value_mappings transforms a value to a non-numeric type (e.g., string), the offset calculation will fail with a TypeError. Consider documenting this behavior or adding validation to warn users that offset and value_mappings may not be compatible unless the mapped value is numeric.

Copilot · 2026-01-08T14:28:32Z

+                # ---- OFFSET HANDLING (SAFE) ----
+                if getattr(slot_derivation, "offset", None) and v is not None:
+                    off = slot_derivation.offset
+                    off_field_val = source_obj.get(getattr(off, "offset_field", None), None)
+
+                    if off_field_val is not None:
+                        try:
+                            delta = getattr(off, "offset_value", 0) * off_field_val
+                            v = v - delta if getattr(off, "offset_reverse", False) else v + delta
+                            logger.debug(
+                                f"Offset calculation for '{slot_derivation.name}': "
+                                f"{source_obj.get(slot_derivation.populated_from)} "
+                                f"{'-' if getattr(off, 'offset_reverse', False) else '+'} "
+                                f"({getattr(off, 'offset_value', 0)} * {off_field_val}) = {v}"
+                            )
+                        except (TypeError, ValueError) as e:
+                            raise TypeError(
+                                f"Cannot perform offset calculation for slot '{slot_derivation.name}': "
+                                f"values must be numeric (base={v}, offset_field={off_field_val})"
+                            ) from e
+                    else:
+                        logger.debug(
+                            f"Offset field '{getattr(off, 'offset_field', None)}' not found in source object; "
+                            f"skipping offset calculation for '{slot_derivation.name}'"
+                        )
+                # ---------------------------------


When populated_from resolves to a foreign key reference (lines 265-294), the offset is still applied to the referenced value. However, the offset_field is looked up in the source_obj (line 306), not in the referenced object. This means offset_field must exist in the current object, not the referenced one. This behavior may be confusing and should be documented, or the implementation should support looking up offset_field in the referenced object as well.

Copilot · 2026-01-08T14:28:32Z

+def test_offset_skipped_when_offset_field_missing():
+    sb_source = SchemaBuilder()
+    sb_source.add_class("Person")
+    sb_source.add_defaults()
+
+    sb_target = SchemaBuilder()
+    sb_target.add_slot("age", range="integer")
+    sb_target.add_class("Person", slots=["age"])
+    sb_target.add_defaults()
+
+    transform_spec = {
+        "class_derivations": {
+            "Person": {
+                "populated_from": "Person",
+                "slot_derivations": {
+                    "age": {
+                        "value": 30,
+                        "offset": {
+                            "offset_field": "days",
+                            "offset_value": 1
+                        }
+                    }
+                }
+            }
+        }
+    }
+
+    transformer = ObjectTransformer()
+    transformer.source_schemaview = SchemaView(sb_source.schema)
+    transformer.target_schemaview = SchemaView(sb_target.schema)
+    transformer.create_transformer_specification(transform_spec)
+
+    result = transformer.map_object({}, source_type="Person")
+
+    assert result["age"] == 30
+
+def test_offset_applied_addition():
+    sb_source = SchemaBuilder()
+    sb_source.add_slot("days", range="integer")
+    sb_source.add_slot("age_at_start", range="integer")
+    sb_source.add_class("Person", slots=["days", "age_at_start"])
+    sb_source.add_defaults()
+
+    sb_target = SchemaBuilder()
+    sb_target.add_slot("age", range="integer")
+    sb_target.add_class("Person", slots=["age"])
+    sb_target.add_defaults()
+
+    transform_spec = {
+        "class_derivations": {
+            "Person": {
+                "populated_from": "Person",
+                "slot_derivations": {
+                    "age": {
+                        "populated_from": "age_at_start",
+                        "offset": {
+                            "offset_field": "days",
+                            "offset_value": 1,
+                        },
+                    }
+                }
+            }
+        }
+    }
+
+    transformer = ObjectTransformer()
+    transformer.source_schemaview = SchemaView(sb_source.schema)
+    transformer.target_schemaview = SchemaView(sb_target.schema)
+    transformer.create_transformer_specification(transform_spec)
+
+    result = transformer.map_object(
+        {"age_at_start": 30, "days": 5},
+        source_type="Person"
+    )
+
+    assert result["age"] == 35
+
+def test_offset_reverse_subtraction():
+    sb_source = SchemaBuilder()
+    sb_source.add_slot("days", range="integer")
+    sb_source.add_slot("age_at_measurement", range="integer")
+    sb_source.add_class("Person", slots=["days", "age_at_measurement"])
+    sb_source.add_defaults()
+
+    sb_target = SchemaBuilder()
+    sb_target.add_slot("age_at_start", range="integer")
+    sb_target.add_class("Person", slots=["age_at_start"])
+    sb_target.add_defaults()
+
+    transform_spec = {
+        "class_derivations": {
+            "Person": {
+                "populated_from": "Person",
+                "slot_derivations": {
+                    "age_at_start": {
+                        "populated_from": "age_at_measurement",
+                        "offset": {
+                            "offset_field": "days",
+                            "offset_value": 1,
+                            "offset_reverse": True,
+                        },
+                    }
+                }
+            }
+        }
+    }
+
+    transformer = ObjectTransformer()
+    transformer.source_schemaview = SchemaView(sb_source.schema)
+    transformer.target_schemaview = SchemaView(sb_target.schema)
+    transformer.create_transformer_specification(transform_spec)
+
+    result = transformer.map_object(
+        {"age_at_measurement": 30, "days": 5},
+        source_type="Person"
+    )
+
+    assert result["age_at_start"] == 25


The tests don't cover the case where the base value from populated_from is None, which would correctly skip the offset (due to the v is not None check on line 304). While this behavior is correct, consider adding an explicit test case to document this expected behavior and prevent regressions.

Copilot · 2026-01-08T14:28:32Z

+def test_offset_skipped_when_offset_field_missing():
+    sb_source = SchemaBuilder()
+    sb_source.add_class("Person")
+    sb_source.add_defaults()
+
+    sb_target = SchemaBuilder()
+    sb_target.add_slot("age", range="integer")
+    sb_target.add_class("Person", slots=["age"])
+    sb_target.add_defaults()
+
+    transform_spec = {
+        "class_derivations": {
+            "Person": {
+                "populated_from": "Person",
+                "slot_derivations": {
+                    "age": {
+                        "value": 30,
+                        "offset": {
+                            "offset_field": "days",
+                            "offset_value": 1
+                        }
+                    }
+                }
+            }
+        }
+    }
+
+    transformer = ObjectTransformer()
+    transformer.source_schemaview = SchemaView(sb_source.schema)
+    transformer.target_schemaview = SchemaView(sb_target.schema)
+    transformer.create_transformer_specification(transform_spec)
+
+    result = transformer.map_object({}, source_type="Person")
+
+    assert result["age"] == 30
+
+def test_offset_applied_addition():
+    sb_source = SchemaBuilder()
+    sb_source.add_slot("days", range="integer")
+    sb_source.add_slot("age_at_start", range="integer")
+    sb_source.add_class("Person", slots=["days", "age_at_start"])
+    sb_source.add_defaults()
+
+    sb_target = SchemaBuilder()
+    sb_target.add_slot("age", range="integer")
+    sb_target.add_class("Person", slots=["age"])
+    sb_target.add_defaults()
+
+    transform_spec = {
+        "class_derivations": {
+            "Person": {
+                "populated_from": "Person",
+                "slot_derivations": {
+                    "age": {
+                        "populated_from": "age_at_start",
+                        "offset": {
+                            "offset_field": "days",
+                            "offset_value": 1,
+                        },
+                    }
+                }
+            }
+        }
+    }
+
+    transformer = ObjectTransformer()
+    transformer.source_schemaview = SchemaView(sb_source.schema)
+    transformer.target_schemaview = SchemaView(sb_target.schema)
+    transformer.create_transformer_specification(transform_spec)
+
+    result = transformer.map_object(
+        {"age_at_start": 30, "days": 5},
+        source_type="Person"
+    )
+
+    assert result["age"] == 35
+
+def test_offset_reverse_subtraction():
+    sb_source = SchemaBuilder()
+    sb_source.add_slot("days", range="integer")
+    sb_source.add_slot("age_at_measurement", range="integer")
+    sb_source.add_class("Person", slots=["days", "age_at_measurement"])
+    sb_source.add_defaults()
+
+    sb_target = SchemaBuilder()
+    sb_target.add_slot("age_at_start", range="integer")
+    sb_target.add_class("Person", slots=["age_at_start"])
+    sb_target.add_defaults()
+
+    transform_spec = {
+        "class_derivations": {
+            "Person": {
+                "populated_from": "Person",
+                "slot_derivations": {
+                    "age_at_start": {
+                        "populated_from": "age_at_measurement",
+                        "offset": {
+                            "offset_field": "days",
+                            "offset_value": 1,
+                            "offset_reverse": True,
+                        },
+                    }
+                }
+            }
+        }
+    }
+
+    transformer = ObjectTransformer()
+    transformer.source_schemaview = SchemaView(sb_source.schema)
+    transformer.target_schemaview = SchemaView(sb_target.schema)
+    transformer.create_transformer_specification(transform_spec)
+
+    result = transformer.map_object(
+        {"age_at_measurement": 30, "days": 5},
+        source_type="Person"
+    )
+
+    assert result["age_at_start"] == 25


The tests don't cover the case where non-numeric values are provided, which would trigger the TypeError exception handling in lines 318-322. Consider adding a test that verifies the appropriate error is raised when offset_field or the base value contains non-numeric data, to ensure the error handling works as expected.

Copilot · 2026-01-08T14:28:33Z

+          Configuration for calculating a value by applying an offset to a baseline value.
+          The baseline value comes from the slot's populated_from field.
+          This is commonly used for longitudinal data where measurements are recorded relative to a baseline.
+          For example, calculating age_at_visit from age + (days * 1/365).


The documentation states that offset is "commonly used for longitudinal data where measurements are recorded relative to a baseline" and gives an example of "calculating age_at_visit from age + (days * 1/365)". However, this suggests days are being converted to years. The current test uses offset_value: 1 with integer days, which doesn't perform any unit conversion. Consider updating the documentation to clarify that offset_value can be used for unit conversion (e.g., 1/365) or as a simple multiplier (e.g., 1), or add an example that demonstrates the unit conversion use case.

Suggested change

For example, calculating age_at_visit from age + (days * 1/365).

The offset configuration can be used either as a simple multiplier (e.g., offset_value: 1 for

integer day offsets) or as a unit-conversion factor (e.g., offset_value: 1/365 to convert days

to years). For example, calculating age_at_visit (in years) from a baseline age (in years) and

days since baseline using age_at_visit = age + (days * 1/365).

Copilot · 2026-01-08T14:28:33Z

                    mapped = slot_derivation.value_mappings.get(str(v), None)
                    v = mapped.value if mapped is not None else None
+
+                # ---- OFFSET HANDLING (SAFE) ----


The comment "OFFSET HANDLING (SAFE)" suggests this operation is safe, but there are several potential issues: (1) The calculation can raise TypeError/ValueError for non-numeric values, (2) The calculation can result in integer overflow or floating point precision issues, and (3) The getattr calls with defaults may mask configuration errors. Consider removing or clarifying the "SAFE" label, as it may give a false sense of security.

Suggested change

# ---- OFFSET HANDLING (SAFE) ----

# ---- OFFSET HANDLING ----

Copilot · 2026-01-08T14:28:33Z

+                # ---- OFFSET HANDLING (SAFE) ----
+                if getattr(slot_derivation, "offset", None) and v is not None:
+                    off = slot_derivation.offset
+                    off_field_val = source_obj.get(getattr(off, "offset_field", None), None)
+
+                    if off_field_val is not None:
+                        try:
+                            delta = getattr(off, "offset_value", 0) * off_field_val
+                            v = v - delta if getattr(off, "offset_reverse", False) else v + delta
+                            logger.debug(
+                                f"Offset calculation for '{slot_derivation.name}': "
+                                f"{source_obj.get(slot_derivation.populated_from)} "
+                                f"{'-' if getattr(off, 'offset_reverse', False) else '+'} "
+                                f"({getattr(off, 'offset_value', 0)} * {off_field_val}) = {v}"
+                            )
+                        except (TypeError, ValueError) as e:
+                            raise TypeError(
+                                f"Cannot perform offset calculation for slot '{slot_derivation.name}': "
+                                f"values must be numeric (base={v}, offset_field={off_field_val})"
+                            ) from e
+                    else:
+                        logger.debug(
+                            f"Offset field '{getattr(off, 'offset_field', None)}' not found in source object; "
+                            f"skipping offset calculation for '{slot_derivation.name}'"
+                        )
+                # ---------------------------------


According to the project's coding guidelines in CLAUDE.md, when modifying ObjectTransformer.map_object within a section marked with # EXTRACT: (like the _derive_from_populated_from section where this offset handling is added), the code should first be extracted to a private method with tests added in test_object_transformer_new.py using the @add_to_test_setup decorator and scaffold pattern. The current implementation adds the offset logic inline without following this pattern. Consider refactoring to extract the offset handling logic and follow the project's testing conventions.

Copilot · 2026-01-08T14:28:34Z

+                        try:
+                            delta = getattr(off, "offset_value", 0) * off_field_val
+                            v = v - delta if getattr(off, "offset_reverse", False) else v + delta
+                            logger.debug(
+                                f"Offset calculation for '{slot_derivation.name}': "
+                                f"{source_obj.get(slot_derivation.populated_from)} "
+                                f"{'-' if getattr(off, 'offset_reverse', False) else '+'} "
+                                f"({getattr(off, 'offset_value', 0)} * {off_field_val}) = {v}"


Using getattr with a default of 0 for offset_value is problematic because the schema defines offset_value as required. If the value is missing, it should raise an error rather than silently defaulting to 0, which would result in no offset being applied. This could mask configuration errors. Consider removing the default or using a sentinel value that raises an error.

Suggested change

try:

delta = getattr(off, "offset_value", 0) * off_field_val

v = v - delta if getattr(off, "offset_reverse", False) else v + delta

logger.debug(

f"Offset calculation for '{slot_derivation.name}': "

f"{source_obj.get(slot_derivation.populated_from)} "

f"{'-' if getattr(off, 'offset_reverse', False) else '+'} "

f"({getattr(off, 'offset_value', 0)} * {off_field_val}) = {v}"

# offset_value is required by the schema; do not default to 0

if not hasattr(off, "offset_value"):

raise AttributeError(

f"Missing required 'offset_value' for offset configuration on slot "

f"'{slot_derivation.name}'"

)

offset_val = off.offset_value

try:

delta = offset_val * off_field_val

v = v - delta if getattr(off, "offset_reverse", False) else v + delta

logger.debug(

f"Offset calculation for '{slot_derivation.name}': "

f"{source_obj.get(slot_derivation.populated_from)} "

f"{'-' if getattr(off, 'offset_reverse', False) else '+'} "

f"({offset_val} * {off_field_val}) = {v}"

Copilot · 2026-01-08T14:28:34Z

+                            raise TypeError(
+                                f"Cannot perform offset calculation for slot '{slot_derivation.name}': "
+                                f"values must be numeric (base={v}, offset_field={off_field_val})"


The error message references the wrong variable. The error says base={v} but at this point in the code, v has already been modified by the offset calculation in line 311, so it no longer contains the original base value. Consider capturing the original value before the calculation or referencing source_obj.get(slot_derivation.populated_from) instead.

Copilot · 2026-01-08T14:28:34Z

+                    off_field_val = source_obj.get(getattr(off, "offset_field", None), None)
+
+                    if off_field_val is not None:
+                        try:
+                            delta = getattr(off, "offset_value", 0) * off_field_val
+                            v = v - delta if getattr(off, "offset_reverse", False) else v + delta


Using getattr with defaults for offset_field and offset_reverse is inconsistent with the schema definition. The schema defines offset_field as required, so it should always be present. Using None as a default could mask configuration errors. For offset_reverse, the default of False is correct per the schema, but consider accessing it directly via dot notation since it's a proper Pydantic model attribute.

amc-corey-cox

This looks good to me pending your response to the Co-Pilot nit-picks.

madanucd · 2026-01-15T17:56:38Z

I have reviewed all suggestions, including the implementation level ones regarding extraction patterns and getattr defaults. All of these are style or optional strictness considerations and do not affect correctness, schema compliance, or the intended behavior of the offset logic. Existing tests cover the core functionality, so we are proceeding with the merge as is. Optional improvements such as refactoring, additional defensive checks, or documentation clarifications can be addressed in follow-up changes if needed.

offset implementation

6ccdaea

madanucd requested a review from amc-corey-cox January 8, 2026 11:59

Merge branch 'main' into offset-implementation

5c42404

amc-corey-cox requested a review from Copilot January 8, 2026 14:23

Copilot started reviewing on behalf of amc-corey-cox January 8, 2026 14:23 View session

Copilot AI reviewed Jan 8, 2026

View reviewed changes

amc-corey-cox approved these changes Jan 8, 2026

View reviewed changes

madanucd merged commit 26193a8 into main Jan 15, 2026
14 checks passed

madanucd deleted the offset-implementation branch January 15, 2026 17:56

amc-corey-cox mentioned this pull request Mar 9, 2026

Enable offset calculations in LinkML-Map linkml/dm-bip#169

Closed

-          For example, calculating age_at_visit from age + (days * 1/365).
+          The offset configuration can be used either as a simple multiplier (e.g., offset_value: 1 for
+          integer day offsets) or as a unit-conversion factor (e.g., offset_value: 1/365 to convert days
+          to years). For example, calculating age_at_visit (in years) from a baseline age (in years) and
+          days since baseline using age_at_visit = age + (days * 1/365).

	# ---- OFFSET HANDLING (SAFE) ----
	# ---- OFFSET HANDLING ----

Conversation

madanucd commented Jan 8, 2026

Add declarative offset support for slot derivations (fixes #96)

Offset fields

Behavior

Tests

Pydantic model

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

Uh oh!

amc-corey-cox left a comment

Choose a reason for hiding this comment

Uh oh!

madanucd commented Jan 15, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants