Skip to content

Local AI bug fixing pipeline test #10 normalize IPHost/IPNetwork/MAC address values at input time (closes #8896)#8904

Open
polmichel wants to merge 16 commits intostablefrom
ai-bug-pipeline-8896-iphost-hfid-consistency
Open

Local AI bug fixing pipeline test #10 normalize IPHost/IPNetwork/MAC address values at input time (closes #8896)#8904
polmichel wants to merge 16 commits intostablefrom
ai-bug-pipeline-8896-iphost-hfid-consistency

Conversation

@polmichel
Copy link
Copy Markdown
Contributor

@polmichel polmichel commented Apr 14, 2026

Why

IPHost attribute values are inconsistent across Infrahub's interfaces when no netmask is provided at creation time. The HFID returned by GraphQL cannot be used for subsequent lookups, display labels disagree with stored values, and the Python SDK returns a different representation than the UI.

This PR is about unifying this value display across the different layers of the application.

This was only impacting after attribute creation. Once the attribute is reloaded, bug should have disappeared.

Goals

Ensure IPHost/IPNetwork values are normalized to their canonical with_prefixlen form immediately on input, so all downstream consumers (HFID, display labels, GraphQL, SDK) see a consistent representation.

Out-of-scope

MAC address attributes had the same issue, and this PR also fixes it.

Closes #8896

What changed

  • Behavioral change: IPHost and IPNetwork attr.value is now always in canonical with_prefixlen form (e.g., "192.0.2.1" becomes "192.0.2.1/32") from the moment the attribute is initialized. This makes HFIDs, display labels, GraphQL responses, and SDK values all consistent.

  • Implementation: Added _canonicalize_value() hook to BaseAttribute (no-op by default), overridden in IPHost and IPNetwork and MACAddress. Called in BaseAttribute.__init__() after validation. For these classes, serialize_value has been removed.

How to test

uv run pytest backend/tests/component/core/test_manager_node.py::test_hfid_round_trip_with_iphost_attribute -x -v

Impact & rollout

  • Backward compatibility: IPHost values now always include the prefix in attr.value. Code that previously relied on the raw un-prefixed form (e.g., "192.0.2.1") from attr.value will now see "192.0.2.1/32". Display labels using {{ ip_address__value }} will include the mask; users wanting mask-free display can use {{ ip_address__ip }}.
  • Deployment notes: Safe to deploy. Existing nodes with stale HFIDs (stored without mask) will be corrected when their HFID is next recomputed (e.g., on update). A separate data migration may be desirable for bulk correction.

Checklist

  • Tests added/updated
  • Changelog entry added (uv run towncrier create ...)
  • External docs updated (if user-facing or ops-facing change)
  • Internal .md docs updated (internal knowledge and AI code tools knowledge)

@github-actions github-actions Bot added group/backend Issue related to the backend (API Server, Git Agent) group/ci Issue related to the CI pipeline labels Apr 14, 2026
@polmichel polmichel changed the title test: failing test for #8896 -- IPHost HFID round-trip inconsistency fix: canonicalize IPHost/IPNetwork values at input time (closes #8896) Apr 14, 2026
@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented Apr 14, 2026

Merging this PR will not alter performance

✅ 12 untouched benchmarks


Comparing ai-bug-pipeline-8896-iphost-hfid-consistency (dd9bf40) with stable (90ea093)

Open in CodSpeed

@polmichel polmichel changed the title fix: canonicalize IPHost/IPNetwork values at input time (closes #8896) Local AI bug fixing pipeline test #10 canonicalize IPHost/IPNetwork values at input time (closes #8896) Apr 14, 2026
@polmichel

This comment was marked as outdated.

polmichel and others added 2 commits April 28, 2026 13:50
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
IPHost and IPNetwork attribute values were only normalized to their
canonical with_prefixlen form during DB serialization (serialize_value).
HFID and display labels are computed before serialization, so they used
the raw user input (e.g. "192.0.2.1") while the DB stored the
normalized form ("192.0.2.1/32"), breaking HFID round-trip lookups.

Add _canonicalize_value() hook to BaseAttribute (no-op by default),
overridden in IPHost and IPNetwork to normalize on input. This ensures
attr.value is always canonical from the moment it is set.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@polmichel polmichel force-pushed the ai-bug-pipeline-8896-iphost-hfid-consistency branch from 4dc7717 to d0fcb28 Compare April 28, 2026 11:50
@github-actions github-actions Bot removed the group/ci Issue related to the CI pipeline label Apr 28, 2026
polmichel and others added 5 commits April 28, 2026 14:00
MacAddress.serialize_value() normalizes values to standard EUI-48 form
(e.g. "aa:bb:cc:dd:ee:ff" -> "AA-BB-CC-DD-EE-FF") only at DB write time.
Like IPHost/IPNetwork before this fix, HFID and display labels are
computed from attr.value before serialization, causing the same
round-trip mismatch when a MacAddress participates in an HFID.

Override _canonicalize_value() on MacAddress so attr.value matches the
form produced by serialize_value() from the moment it is set.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The IPHost, IPNetwork, and MacAddress classes defined the same canonical
form in two places: _canonicalize_value() (called from __init__) and
serialize_value() (called from to_db()). Drift between them would
re-introduce the HFID/DB mismatch fixed in #8896.

Have serialize_value() delegate to _canonicalize_value(self.value) so
each class has a single source of truth for its canonical form.
serialize_value remains the abstract DB-representation hook used by
HashedPassword/List/JSON, which need transforms that are destructive or
type-changing and therefore cannot share the canonicalization hook.

Also update the existing test_validate_mac_address_returns assertion to
reflect that attr.value is now canonical at input time.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add a "Value Canonicalization" subsection to the HFID knowledge doc
explaining the _canonicalize_value() hook, why it is distinct from
serialize_value(), and which kinds override it. This is load-bearing
context for future maintainers adding a new attribute kind that needs
input-time normalization.

Update the user-facing schema topic to note that IPHost, IPNetwork, and
MacAddress values are stored in canonical form, so users querying these
fields see normalized output (e.g. AA-BB-CC-DD-EE-FF for MAC).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Drop inline examples to match the density of surrounding bullets.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
"with-prefixlen" is internal Python ipaddress module terminology.
CIDR notation is the industry-standard term users will recognize.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added the type/documentation Improvements or additions to documentation label Apr 28, 2026
@polmichel polmichel marked this pull request as ready for review April 28, 2026 13:24
@polmichel polmichel requested review from a team as code owners April 28, 2026 13:24
@polmichel polmichel changed the title Local AI bug fixing pipeline test #10 canonicalize IPHost/IPNetwork values at input time (closes #8896) Local AI bug fixing pipeline test #10 canonicalize IPHost/IPNetwork/MAC address values at input time (closes #8896) Apr 28, 2026
Comment on lines +283 to +346
async def test_hfid_round_trip_with_iphost_attribute(db: InfrahubDatabase, default_branch: Branch) -> None:
"""HFID containing an IPHost value can be used to look up the node after creation."""
SCHEMA = {
"nodes": [
{
"name": "HostAddress",
"namespace": "Test",
"human_friendly_id": ["address__value"],
"attributes": [
{"name": "address", "kind": "IPHost"},
],
}
]
}
schema_root = SchemaRoot(**SCHEMA)
registry.schema.register_schema(schema=schema_root, branch=default_branch.name)

node = await Node.init(db=db, schema="TestHostAddress", branch=default_branch)
await node.new(db=db, address="192.0.2.10")
await node.save(db=db)

hfid = await node.get_hfid(db=db)

# The HFID must use the canonical with_prefixlen form so it matches the
# DB-stored attribute value and can round-trip through get_one_by_hfid.
assert hfid == ["192.0.2.10/32"]

found = await NodeManager.get_one_by_hfid(db=db, hfid=hfid, kind="TestHostAddress")
assert found is not None
assert found.id == node.id


async def test_hfid_round_trip_with_macaddress_attribute(db: InfrahubDatabase, default_branch: Branch) -> None:
"""HFID containing a MacAddress value can be used to look up the node after creation."""
SCHEMA = {
"nodes": [
{
"name": "Interface",
"namespace": "Test",
"human_friendly_id": ["mac__value"],
"attributes": [
{"name": "mac", "kind": "MacAddress"},
],
}
]
}
schema_root = SchemaRoot(**SCHEMA)
registry.schema.register_schema(schema=schema_root, branch=default_branch.name)

node = await Node.init(db=db, schema="TestInterface", branch=default_branch)
await node.new(db=db, mac="aa:bb:cc:dd:ee:ff")
await node.save(db=db)

hfid = await node.get_hfid(db=db)

# The HFID must use the canonical EUI-48 form so it matches the DB-stored
# attribute value and can round-trip through get_one_by_hfid.
assert hfid == ["AA-BB-CC-DD-EE-FF"]

found = await NodeManager.get_one_by_hfid(db=db, hfid=hfid, kind="TestInterface")
assert found is not None
assert found.id == node.id


Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure we need HFID tests for this issue. I think tests validating that no matter the format of an IP address/network or a MAC address these are stored in a consistent way would be enough

Copy link
Copy Markdown
Contributor Author

@polmichel polmichel Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, this covers the issue scenario but is weird from a system expectations perspective. Will just check the attribute value on the Attribute class after Node.save() method, thank you

Copy link
Copy Markdown
Contributor

@ajtmccarty ajtmccarty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this need a migration for existing non-normalized IP networks/hosts and MAC addresses?

Comment thread backend/infrahub/core/attribute.py Outdated

if self.value is not None:
self.validate(value=self.value, name=self.name, schema=self.schema)
self.value = self._canonicalize_value(self.value)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is canonicalize a networking term? normalize seems more common

Copy link
Copy Markdown
Contributor Author

@polmichel polmichel Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have heard it in mathematics, but normalize seems more common, thanks

Comment thread backend/infrahub/core/attribute.py Outdated

return ipaddress.ip_network(self.value).with_prefixlen
"""Serialize the value before storing it in the database."""
return self._canonicalize_value(self.value)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it looks like this should be handled by the update to BaseAttribute.__init__, so maybe serialize_value can be removed from IPNetwork and IPHost and MacAddress, but maybe I don't understand the order of operations

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, initially I spotted that serialize_value was containing the same logic but the refactor was not good. This is logic should not be in database serialization step but value normalization step.

assert test_mac.value == mac_address
# MacAddress canonicalizes the input to the standard EUI-48 form so the
# in-memory value matches the DB-stored representation.
assert test_mac.value == test_mac._canonicalize_value(mac_address)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this isn't really testing that the MAC address is in the expected format. it's just testing that the mac address is in the format returned by _canonicalize_value

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, I have left the comme and remove the call to _canonicalize_value. Value is now hardcoded

@polmichel
Copy link
Copy Markdown
Contributor Author

polmichel commented Apr 28, 2026

does this need a migration for existing non-normalized IP networks/hosts and MAC addresses?

I do not think so. Because the value was serialized before being saved into the database, so the real value in the database should be correct. However, the initial value displayed in display_label or hfid fields was using the Attribute.value before refetching it from the database. This value was not serialized (yet).

I have updated the PR description with an additional element about this precision.

@polmichel polmichel changed the title Local AI bug fixing pipeline test #10 canonicalize IPHost/IPNetwork/MAC address values at input time (closes #8896) Local AI bug fixing pipeline test #10 normalize IPHost/IPNetwork/MAC address values at input time (closes #8896) Apr 28, 2026
| Kind | Normalized form |
|------|-----------------|
| `IPHost` | `ipaddress.ip_interface(value).with_prefixlen` (e.g. `192.0.2.1` → `192.0.2.1/32`) |
| `IPNetwork` | `ipaddress.ip_network(value).with_prefixlen` |
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no example for this one?

reloaded = await NodeManager.get_one(db=db, id=node.id, branch=default_branch)
assert reloaded is not None
assert reloaded.mac.value == "AA-BB-CC-DD-EE-FF"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should there be a test for retrieving a Node that was saved before this change was made? i.e. an object with an IP address of 192.168.0.0 in the database is retrieved as 192.168.0.0/32. I believe it would require a cypher query to set the data to be incorrect

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

group/backend Issue related to the backend (API Server, Git Agent) type/documentation Improvements or additions to documentation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug: IPHost attribute values vary in when they require/return their netmask from API and UI

3 participants