Make some DataONE APIs into transactions in Metacat #1642

Closed
taojing2002 opened this issue Jun 14, 2023 · 2 comments

@taojing2002
Contributor

Recently we received reports that Metacat instances held only partial records for some objects. This is a warning sign that uploading objects to Metacat is not a transaction.

Create, update, and delete are the methods that should be transactions: they should either fully succeed or roll back properly.

As we move Metacat instances to k8s, we should expect interruptions during those procedures more frequently.

So we need to design a mechanism that makes these operations transactional.

taojing2002 added this to the 3.0.0 milestone Jun 14, 2023
@mbjones
Member

mbjones commented Nov 3, 2023

@artntek @taojing2002 one way to determine whether this will be a real issue on k8s deployments would be to develop a test for CRUD API operations that take some time, and force a pod eviction at key points in the middle of what should be an atomic operation. For that to happen, we might need to test with a long-running operation, such as a large data upload. During that "transaction", we could then use the k8s Eviction API to evict the metacat pod and make k8s spin it up on another node. This is what would happen if, for example, a hardware failure occurs or if Nick cordons and drains a node for maintenance. The example API call given in the k8s docs is:

curl -v -H 'Content-type: application/json' https://your-cluster-api-endpoint.example/api/v1/namespaces/default/pods/quux/eviction -d @eviction.json

We should be able to automate that in a test, albeit with some complications due to authentication against a k8s cluster; it's a complicated integration test. Maybe it's sufficient for us to just run this test manually at first to gauge whether there are any issues here to be addressed.
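As a rough sketch of how the eviction call above might be scripted for such a test, the following Python builds the same request as the curl example without sending it. The request body follows the `policy/v1` Eviction object from the k8s docs; the function name `build_eviction_request`, the bearer-token authentication, and the token value are assumptions for illustration only, and a real cluster may require different credentials.

```python
import json
import urllib.request


def build_eviction_request(api_server, namespace, pod, token):
    """Build (but do not send) a POST to the pod's eviction subresource."""
    # Same content the curl example reads from eviction.json.
    body = {
        "apiVersion": "policy/v1",
        "kind": "Eviction",
        "metadata": {"name": pod, "namespace": namespace},
    }
    url = f"{api_server}/api/v1/namespaces/{namespace}/pods/{pod}/eviction"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth; a cluster may need client certs instead.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


req = build_eviction_request(
    "https://your-cluster-api-endpoint.example",
    "default",
    "quux",
    "HYPOTHETICAL_TOKEN",  # placeholder, not a real credential
)
# urllib.request.urlopen(req) would actually evict the pod; it is omitted here
# so the sketch has no side effects.
```

In a test, the request would be sent partway through a large upload, and the object's system metadata and data would then be checked for consistency once the pod is rescheduled.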

taojing2002 self-assigned this Jan 30, 2024
artntek changed the title from "Make some DataONE APIs transactions in Metacat" to "Make some DataONE APIs into transactions in Metacat" Feb 8, 2024
@taojing2002
Contributor Author

taojing2002 commented Feb 8, 2024

In this ticket, we found:

  • The Metacat code saves the metadata to Hazelcast (which, in turn, saves it to the database systemmetadata table) BEFORE checking the accessPolicy fields.
  • A Chrome bug (see "Cannot submit or edit datasets from Chrome version 120.0.6099.71", metacatui#2235 (comment)) corrupts the accessPolicy section, which in turn causes Metacat to throw an exception at this point (...InvalidSystemMetadata: The Permission shouldn't be null....) and fail to save the data object.
  • Because the above steps are not part of a transaction, we now have the metadata saved, but not the data.
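The ordering problem in the list above can be illustrated with a minimal sketch. This is not Metacat's code: the in-memory dictionaries stand in for the systemmetadata table and the object store, and every name here (`validate`, `create`, `sysmeta_store`, `object_store`) is hypothetical. It only shows the general idea of validating before any write, and undoing the metadata write if the data write fails.

```python
class InvalidSystemMetadata(Exception):
    """Stand-in for the exception mentioned in the list above."""


def validate(sysmeta):
    # Reject an access rule with a missing permission, as in the reported failure.
    for rule in sysmeta.get("accessPolicy", []):
        if rule.get("permission") is None:
            raise InvalidSystemMetadata("The Permission shouldn't be null")


def create(pid, sysmeta, data, sysmeta_store, object_store):
    """Validate first, then write both records or neither."""
    validate(sysmeta)  # if this raises, nothing has been written yet
    sysmeta_store[pid] = sysmeta
    try:
        object_store[pid] = data
    except Exception:
        sysmeta_store.pop(pid, None)  # roll back the metadata write
        raise


sysmeta_store, object_store = {}, {}

bad = {"accessPolicy": [{"subject": "public", "permission": None}]}
try:
    create("pid.1", bad, b"bytes", sysmeta_store, object_store)
except InvalidSystemMetadata:
    pass  # rejected before any write, so no partial record is left behind

good = {"accessPolicy": [{"subject": "public", "permission": "read"}]}
create("pid.2", good, b"bytes", sysmeta_store, object_store)
```

With validation moved ahead of the first write, the invalid request leaves neither store modified, while the valid request ends up in both.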

Matthew commented:
actually, on develop, this code needing a transaction has now moved to SystemMetadataManager.java -> updateSystemMetadata()

Matthew and I discussed the issue; the code is now fine.
