Skip to content

Conversation

@mbrandonw
Copy link
Member

If a write is made to the database while a batch of records is being prepared and sent to iCloud, it is possible for us to accidentally cache an old timestamp on the the records. That could prevent other devices from updating their local data with fresh server data.

line: UInt = #line,
column: UInt = #column
) async throws {
) async throws -> SendRecordsCallback {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need this bit of infrastructure in the mocks to wiggle in between the moment a batch is sent out to iCloud and the moment the sentRecordZoneChanges delegate method is called.

Comment on lines +2428 to +2430
self.userModificationTime = #sql("""
max(\(self.userModificationTime), \(lastKnownServerRecord.userModificationTime))
""")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the actual fix.

Comment on lines +315 to +316
title🗓️: 2,
🗓️: 2
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we can write a failing test that shows data not properly syncing (we need to better support multiple sync engines in a test), but this at least shows the edit was made at the right time. Without the changes in this PR this timestamp was 1.


extension CKRecord {
@TaskLocal static var printTimestamps = false
@TaskLocal static var printRecordChangeTag = false
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted a way to get visibility into record change tags.

@lukaskubanek
Copy link
Contributor

lukaskubanek commented Jan 24, 2026

Hi @mbrandonw, reading the description mentioning the step of preparing changes for upload and seeing the fix in setLastKnownServerRecord(_:) I wonder how exactly this is being called. Are you referring to this call stack?

  • nextRecordZoneChangeBatch(reason:options:syncEngine:)
  • refreshLastKnownServerRecord(_:) (after update)
  • setLastKnownServerRecord(_:)

If so, I’d like to point out that this very spot exhibits different behavior between production and tests. I reported it in #356. The short version is that in tests, the record is always written, as the mock database doesn’t set CloudKit’s internal modificationDate, whereas in production this only happens for newly created records (if I’m reading this correctly).

However, even when aligning the behavior between test and production environments, I wonder why we would ever want to save an in-flight record as the last-known server record before it’s confirmed by the server. I point it out in step 4 of scenario 2.1 in my document on the current behavior. It breaks the model for a potential three-way merge conflict resolution, as it would prevent using the last-known server record as the ancestor.

@mbrandonw
Copy link
Member Author

Are you referring to this call stack?

No, the problematic call stack actually originates from sentRecordZoneChanges, not nextRecordZoneChangeBatch.

If so, I’d like to point out that this very spot exhibits different behavior between production and tests. I reported it in #356. The short version is that in tests, the record is always written, as the mock database doesn’t set CloudKit’s internal modificationDate, whereas in production this only happens for newly created records (if I’m reading this correctly).

Yeah, it does seem like we should better emulate modificationDate as we do with recordChangeTag. It should be powered off of @Dependency(\.currentTime) too. Will think about this a bit more.

However, even when aligning the behavior between test and production environments, I wonder why we would ever want to save an in-flight record as the last-known server record before it’s confirmed by the server. I point it out in step 4 of scenario 2.1 in my document on the current behavior. It breaks the model for a potential three-way merge conflict resolution, as it would prevent using the last-known server record as the ancestor.

Yeah, to be honest I don't remember why we do that. And removing that call to setLastKnownServerRecord from nextRecordZoneChangeBatch makes only one single test fail (merge_clientRecordUpdatedBeforeServerRecord):

-        isCompleted🗓️: 30,
+        isCompleted🗓️: 60,

And I'm trying to think which is correct. The isCompleted field is edited at time 30, but the merge conflict is resolved at time 60. Will have to think about why we do that a bit more, but I do think this change is still good for us to make.

@mbrandonw mbrandonw merged commit f3f7f5f into main Jan 28, 2026
5 checks passed
@mbrandonw mbrandonw deleted the fix-user-modification-time branch January 28, 2026 00:23
@mbrandonw
Copy link
Member Author

However, even when aligning the behavior between test and production environments, I wonder why we would ever want to save an in-flight record as the last-known server record before it’s confirmed by the server. I point it out in step 4 of scenario 2.1 in my document on the current behavior. It breaks the model for a potential three-way merge conflict resolution, as it would prevent using the last-known server record as the ancestor.

Hey @lukaskubanek, we discussed this at length yesterday, and our intention is to indeed eagerly store the updated CKRecord in SyncMetadata as we prepare the batch. This is how we can capture timestamps for later use in merge resolution.

The name of the field (lastKnownServerRecord) may be throwing you off, but that name may not really be correct. It's just the most up-to-date record we have, including before we've even sent the record to iCloud.

And further, our understanding is that this does not interfere with 3-way merging. It's our understanding that to perform 3-way merging we shouldn't even need anything from the locally cached records, and instead should only be using the server/client/ancestor records sent back from iCloud when .serverRecordChanged error is emitted.

I have just recently opened a PR (#389) to support ancestor records in the MockCloudDatabase, and we will investigate soon whether or not that data is coming back correctly from iCloud when using SQLiteData.

@lukaskubanek
Copy link
Contributor

@mbrandonw I’m glad to see an initial discussion around 3-way merge support. That said, based on my earlier concept work, I have a few points I’d like to comment on.

It's our understanding that to perform 3-way merging we shouldn't even need anything from the locally cached records, and instead should only be using the server/client/ancestor records sent back from iCloud when .serverRecordChanged error is emitted.

I don’t think it’s sufficient to rely solely on CloudKit’s .serverRecordChanged error, as that only covers the conflict-on-send scenario. I initially made this same mistake in my own implementation and only later realized I was ignoring the conflict-on-fetch scenario, which led to data loss. Based on my experience and a few discussions with Drew McCormack (Forked, Ensembles) this is something that has to be handled manually, as CloudKit doesn’t provide any tooling for it. I tried to explain this in my original thread in #272, but here’s a short version.

Basically, when there are concurrent changes to a single record on both the client and the server, it’s up to CKSyncEngine whether it first sends the client record (triggering .serverRecordChanged) or first fetches the server record and attempts to apply it locally. In the former case (conflict-on-send), the conflict is clearly indicated and the CKError provides all the information needed for a 3-way merge. In the latter case (conflict-on-fetch), the conflict isn’t indicated at all and is treated as a regular upsert. With the current implementation in SQLiteData, this would drop the server values for the conflicting fields.

In #272 (reply in thread), I’ve shown a proof of the conflict-on-fetch scenario. It’s based on client-side custom conflict detection and demonstrates that it’s possible to observe such conflicts from the fetch path (handleFetchedRecordZoneChanges), which should also trigger 3-way merge to resolve the conflict.

Based on my research, I don’t see a way to support proper 3-way merge conflict resolution without keeping track of an ancestor record locally.

The name of the field (lastKnownServerRecord) may be throwing you off, but that name may not really be correct. It's just the most up-to-date record we have, including before we've even sent the record to iCloud.

Yes, I might have been shoehorning it into the ancestor role too much. That said, the name was the only thing I had to work with, as there is no documentation for these internal bits and it was hard to infer the intention. (And I can imagine it may have evolved while you were building the library.)

our intention is to indeed eagerly store the updated CKRecord in SyncMetadata as we prepare the batch. This is how we can capture timestamps for later use in merge resolution.

OK, that might be related to the didSet flag in the upsert logic that I wasn’t able to fully understand.

I’m still not sure I’m getting it but it seems that you’re tracking the in-flight record to support the upsert logic, which then works with the server record, the in-flight record, and the database row to derive the field updates. Since there’s no ancestor, it’s more like a two-way merge with the client version split into two representations. Is that a good summary?

In my concept work, I also ran into the question of whether this merge/consolidation logic really needs to run on every upsert, or only when a conflict is detected, i.e. when there are concurrent server and client changes. Server changes can be detected via change tags between the ancestor and server records. For client changes, I opted to use differences in modification timestamps between the ancestor record and SyncMetadata. Not sure if this is sufficient but it’d make the model much simpler. See again this comment.


I feel this comment is already getting quite long, so I’ll wrap it up here.

I think the key point to discuss before diving in is the conflict-on-fetch scenario I described above and whether it’s something you’d consider addressing in SQLiteData. It rules out relying solely on CloudKit’s .serverRecordChanged hook provided by CloudKit and requires custom infrastructure, namely tracking the ancestor record.

I’d be very interested to hear your thoughts and happy to discuss further.

@lukaskubanek
Copy link
Contributor

And I'm trying to think which is correct. The isCompleted field is edited at time 30, but the merge conflict is resolved at time 60.

One more note on this behavior.

I’ve noticed that since the mock database doesn’t set modificationDate on records (reported in #356), the merge conflict test results aren’t credible, as they don’t reflect production behavior.

When I apply my local changes to have the mock database set modificationDate, I get the same test diff as the one you posted above. Without modificationDate, the in-flight record isn’t being stored, which leads me to the same outcome you’re seeing when you comment out the step for storing the in-flight record.

I think the first step here would be to address this issue so that we can work with the tests reliably.

@mbrandonw
Copy link
Member Author

I’ve noticed that since the mock database doesn’t set modificationDate on records (reported in #356), the merge conflict test results aren’t credible, as they don’t reflect production behavior.

Yeah, we absolutely want to fix this in the mock database, and we should do it soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants