kespinola changed the title from "[Bug] saving cl_audit records are not idempotent" to "[Bug] saving cl_audit records is not idempotent" on Dec 21, 2023
Issue
The insertion of cl_audit records has no on_conflict clause. As a result, reprocessing a transaction that has already been indexed can insert duplicate cl_audit records. The relevant code is:
digital-asset-rpc-infrastructure/nft_ingester/src/program_transformers/bubblegum/db.rs, lines 97 to 100 at commit ec33003
This is a problem because the ingester is expected to reprocess transactions idempotently: reprocessing should neither change the state of the active index snapshot nor introduce redundant rows. Duplicate entries can skew query results and contribute to index bloat, a real concern given how large the index already is.
Proposed Fix
To fix this, add an on_conflict clause to the insert so that it does nothing when a uniqueness constraint on the combination of tree, node_id, and seq is violated. That constraint does not exist yet, so it must be added as part of the change.
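The intended behavior can be sketched outside the ingester. The example below is a minimal illustration, not the project's code: it uses Python's built-in sqlite3 module in place of the real database, and the table layout (an `id` key and a `tx` column next to tree, node_id, and seq), the index name, and the helper names `make_db` and `save_audit` are all assumptions made for the sketch.

```python
import sqlite3


def make_db():
    """Create an in-memory stand-in for the cl_audits table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE cl_audits (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            tree TEXT NOT NULL,
            node_id INTEGER NOT NULL,
            seq INTEGER NOT NULL,
            tx TEXT NOT NULL
        )
        """
    )
    # The uniqueness constraint the proposed fix relies on.
    conn.execute(
        "CREATE UNIQUE INDEX cl_audits_tree_node_seq "
        "ON cl_audits (tree, node_id, seq)"
    )
    return conn


def save_audit(conn, tree, node_id, seq, tx):
    """Insert one audit row; a row with the same (tree, node_id, seq) is skipped."""
    conn.execute(
        "INSERT INTO cl_audits (tree, node_id, seq, tx) VALUES (?, ?, ?, ?) "
        "ON CONFLICT (tree, node_id, seq) DO NOTHING",
        (tree, node_id, seq, tx),
    )
    conn.commit()


def count_audits(conn):
    """Return the number of rows currently stored."""
    return conn.execute("SELECT COUNT(*) FROM cl_audits").fetchone()[0]
```

Calling `save_audit` twice with the same tree, node_id, and seq leaves a single row, which is the idempotent behavior reprocessing needs. The actual change would express the same clause in the ingester's Rust insert.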
However, teams that have been using cl_audits for a significant period may already have a considerable number of duplicate cl_audit records. Those records must be deduplicated before the uniqueness constraint can be created and the patch applied.
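One possible shape for that cleanup is sketched below, again with Python's sqlite3 module standing in for the real database. The `id` column, the index name, the helper name `dedupe_and_constrain`, and the choice to keep the row with the lowest `id` in each duplicate group are assumptions for illustration. A production migration on a large table would likely need batching and would use the project's own migration tooling.

```python
import sqlite3


def dedupe_and_constrain(conn):
    """Delete duplicate cl_audits rows, then add the unique index.

    Keeps the row with the smallest id for each (tree, node_id, seq)
    group and returns the number of rows removed.
    """
    cur = conn.execute(
        """
        DELETE FROM cl_audits
        WHERE id NOT IN (
            SELECT MIN(id) FROM cl_audits GROUP BY tree, node_id, seq
        )
        """
    )
    removed = cur.rowcount
    # Creating the index only succeeds once no duplicates remain.
    conn.execute(
        "CREATE UNIQUE INDEX IF NOT EXISTS cl_audits_tree_node_seq "
        "ON cl_audits (tree, node_id, seq)"
    )
    conn.commit()
    return removed
```

Running the delete and the index creation together means the constraint is in place as soon as the duplicates are gone, so reprocessing cannot reintroduce them in between.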