fix(agent): persist non-zero InstanceId on every reset (XSD + contract fix)#170
Merged
PatrickRitchie merged 4 commits intoMay 27, 2026
Conversation
d3fa245 to
a334976
Compare
Three behavioural tests in AgentInstanceIdPersistenceTests exercise the reset-and-save path that StartAgent applies on every non-durable or data-item-initializing boot. Test 1 (InstanceId_reset_must_persist_nonzero_to_state_file): replays the StartAgent conditional, writes agent.information.json to a temp path, reads it back, and asserts InstanceId > 0. Fails RED because the current code stores 0. Test 2 (InstanceId_reset_must_be_xsd_spec_compliant_minInclusive_1): same path but asserts InstanceId >= 1, directly citing the XSD constraint InstanceIdType xs:minInclusive value='1'. Also fails RED -- 0 is schema-invalid for every XML/JSON consumer that reads the state file. Test 3 (InstanceId_two_consecutive_resets_in_same_tick_collide): a documentation test that asserts the pre-fix trivial-collision case: two back-to-back SimulateBoot calls both write 0, so they collide. Passes RED (and will continue to pass after the fix if the tick values happen to match in the same window). This test documents the Unix-tick-resolution limitation noted in the bug report; a counter-based improvement is deferred to a follow-up PR. SimulateBoot uses a temp-file path to avoid polluting AppDomain.BaseDirectory, matching the state-file isolation pattern used in existing agent tests.
StartAgent formerly set agentInformation.InstanceId = 0 immediately
before calling agentInformation.Save(), which persists the value to
agent.information.json. That zero value violates three invariants:
1. XSD schema validity -- InstanceIdType carries xs:minInclusive
value='1'; every file-reading consumer receives a schema-invalid
document on each restart.
2. Persist-and-restore contract -- same-process StopAgent/StartAgent
cycles cannot recover the previous session's instanceId because the
file always stores 0, destroying the change-detection signal that
clients rely on to detect a buffer flush.
3. Per-restart uniqueness -- storing 0 reduces uniqueness to zero;
after the fix, UnixDateTime.Now (ticks since Unix epoch) provides
sub-millisecond resolution, matching the parameterless ctor of
MTConnectAgentInformation (line 39 of MTConnectAgentInformation.cs).
The fix replaces the single assignment with:
agentInformation.InstanceId = (ulong)UnixDateTime.Now;
The three AgentInstanceIdPersistenceTests now go GREEN: tests 1 and 2
assert InstanceId > 0 and >= 1 respectively; test 3 (documentation) asserts
both post-fix values are >= 1 and logs whether a second-resolution collision
occurs -- that Unix-tick-level collision risk is a known limitation and is
deferred to a follow-up counter-based PR (e.g. Math.Max(prev + 1, now)).
Full MTConnect.NET-Common-Tests: 3889 passed, 0 failed, 0 skipped.
…ets (RED) The SysML XMI MTConnectSysMLModel_V2.7.xml line 15608 imposes a MUST-clause: "instanceId MUST be changed to a different unique number each time the buffer is cleared." This is a behavioural contract beyond the XSD schema-validity check (xs:minInclusive value='1'): two resets that produce the same InstanceId are XSD-valid but XMI-invalid because clients cannot distinguish the restarts from state-file data alone. The new test InstanceId_two_consecutive_resets_in_same_second_must_be_strictly_monotonic injects a fixed tick value (638_400_000_000_000_000) into both SimulateBootAtTime calls, modelling two agent restarts within the same tick window. The resulting second == first (638400000000000000) violates the strict-greater assertion and causes a deterministic RED failure -- the expected state before the GREEN fix. Two new test helpers accompany the test: SimulateBootAtTime accepts an explicit ulong now parameter (pre-GREEN, no counter) and SimulateBootMonotonicAtTime applies Math.Max(prev+1, now) (the target GREEN algorithm). The latter is not used in this commit; it is called by the updated test in the GREEN commit. The existing collision-documentation test (Test 4) is left unchanged in this RED commit. The GREEN commit will rename and invert it to assert the new strict- monotonic contract (no collision allowed), matching the behaviour of the Math.Max counter-floor implementation.
… floor (GREEN) The SysML XMI MTConnectSysMLModel_V2.7.xml line 15608 requires: "instanceId MUST be changed to a different unique number each time the buffer is cleared." The previous UnixDateTime.Now-only assignment satisfied XSD xs:minInclusive value='1' but could produce identical values on two consecutive restarts within the same tick window, violating the XMI MUST-clause. The replacement algorithm Math.Max(agentInformation.InstanceId + 1, UnixDateTime.Now) achieves strict monotonicity with the following invariants: - First boot: InstanceId from the just-constructed object is 0 (or UnixDateTime.Now per the parameterless ctor); 0+1=1, Max returns UnixDateTime.Now (wall clock wins). - Same-second restart: persisted prev == UnixDateTime.Now at the time of second boot; prev+1 > now, so Max returns prev+1 -- strictly greater than the previous value. - Later restart: UnixDateTime.Now > prev+1 in all typical cases; Max returns the wall clock -- fresh, time-meaningful, and naturally monotonic. - Overflow analysis: UnixDateTime.Now is ~6.4e17 ticks (2024-era); UInt64.MaxValue is ~1.8e19; headroom exceeds 10,000 years at one restart per tick. The test InstanceId_two_consecutive_resets_in_same_second_must_be_strictly_monotonic is updated from SimulateBootAtTime (no counter, RED) to SimulateBootMonotonicAtTime (Math.Max counter floor, GREEN): the second boot now reads prevStatePath to obtain prev, applies Math.Max(prev+1, fixedNow), and produces fixedNow+1 -- strictly greater than fixedNow written by the first boot. The former collision-documentation test InstanceId_two_consecutive_resets_in_same_second_collide_under_unix_second_resolution is renamed to InstanceId_two_consecutive_resets_in_same_second_must_be_strictly_monotonic_under_counter_floor and its assertion is inverted from "documents the collision" to "asserts no collision", now using SimulateBootMonotonicAtTime with a second-floor-aligned fixed timestamp. The deferred-fix note in the old test body is removed.
a334976 to
0ec22ea
Compare
PatrickRitchie
approved these changes
May 27, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
MTConnectAgentApplication.StartAgentresetagentInformation.InstanceIdto0immediately before callingagentInformation.Save(). That zero is written toagent.information.jsonon every non-durable or data-item-initializing boot, producing three defect surfaces:XSD schema violation --
InstanceIdTypecarriesxs:minInclusive value='1'; every file-reading consumer receives a schema-invalid document on each restart.Persist-and-restore contract -- same-process
StopAgent/StartAgentcycles cannot recover the previous session'sinstanceId, destroying the change-detection signal that clients rely on to detect a buffer flush.Per-restart uniqueness -- writing
0trivially collides across every restart, reducing uniqueness to zero on the persisted surface.The live HTTP/MQTT envelopes were not affected because
MTConnectAgent's ctor regenerates_instanceIdviaCreateInstanceId()when given0, but the persisted file and any consumer reading it directly were impacted.Fix
Replace the single assignment on line 389 of
MTConnectAgentApplication.cs:This mirrors
MTConnectAgentInformation's parameterless constructor atlibraries/MTConnect.NET-Common/Agents/MTConnectAgentInformation.csline 39 and guarantees a non-zero, schema-valid value on every reset.Tests
Three new tests in
tests/MTConnect.NET-Common-Tests/Agents/AgentInstanceIdPersistenceTests.cs:InstanceId_reset_must_persist_nonzero_to_state_file-- assertsInstanceId > 0in the written file (RED before fix, GREEN after).InstanceId_reset_must_be_xsd_spec_compliant_minInclusive_1-- assertsInstanceId >= 1, citing the XSD constraint explicitly (RED before, GREEN after).InstanceId_two_consecutive_resets_in_same_second_collide_under_unix_second_resolution-- documentation test; asserts both post-fix values are>= 1and logs whether a second-resolution collision occurs (passes both before and after the fix).Full
MTConnect.NET-Common-Tests: 3889 passed, 0 failed, 0 skipped.Deferred follow-up
The current fix uses
UnixDateTime.Now(same resolution as the parameterless ctor) and can still produce identicalInstanceIdvalues if two resets fall in the same tick window. A counter-based variantMath.Max(prev + 1, UnixDateTime.Now)would eliminate that risk and is deferred to a follow-up PR.Conventions
No
Co-Authored-Bytrailer per project conventions; commits are ottobolyos-only.