Corruption Recovery #15

Closed
ahankinson opened this issue Feb 27, 2018 · 6 comments
Labels
Component: Client Behaviors Component: Validation Confirmed: Out-of-scope Use case will not be included in the upcoming version of the spec or implementation notes.

Comments

@ahankinson
Contributor

A power outage occurred when a software component was in the middle of writing an OCFL object, leaving the object in an ambiguous state. There should be mechanisms for recovering from various failure modes.

@ahankinson ahankinson added the Proposed: In-Scope Use case is up for discussion and may change the spec, implementation notes, or become an extension. label Feb 27, 2018
@zimeon
Contributor

zimeon commented Mar 8, 2018

Perhaps the first question is how one can determine the state of the OCFL object. Then, what mechanisms might prevent the corruption from affecting the integrity of versions earlier than the one being added? And how could one revert the partial update to get back to a clean state in order to re-run the update?

@ahankinson
Contributor Author

One of the necessary tools for OCFL will be a validator, and so the state of an OCFL object would ultimately be one that is valid according to the spec. Of course, when writing a bunch of files to disk the possible failure states can range from "Connection to the NFS / S3 store failed" (i.e., relatively high-level) to "A disk array lost power and no battery was available to let it finish writing" (i.e., low-level).

It may be that this is where the spec could specify a recommended order of operations for OCFL filesystems, e.g., take checksum, write file to disk, record checksum. This would let a validator know whether a) a file was written completely (matches recorded checksum); b) a checksum was recorded correctly (a file exists with a matching checksum recorded).
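The ordering described above (take checksum, write file, record checksum) can be sketched as follows. This is only an illustration, not anything from the spec: the `checksums.json` ledger and the function names `write_with_checksum` and `check` are hypothetical, and SHA-512 is an arbitrary choice of digest. The check reports problems in an fsck-like way and does not attempt repairs.

```python
import hashlib
import json
import os


def sha512_of(data: bytes) -> str:
    return hashlib.sha512(data).hexdigest()


def _load_ledger(ledger_path):
    # The ledger is a hypothetical name -> digest map, not an OCFL structure.
    if os.path.exists(ledger_path):
        with open(ledger_path) as fh:
            return json.load(fh)
    return {}


def write_with_checksum(directory, name, data, ledger_name="checksums.json"):
    digest = sha512_of(data)                      # 1. take checksum
    with open(os.path.join(directory, name), "wb") as fh:
        fh.write(data)                            # 2. write file to disk
        fh.flush()
        os.fsync(fh.fileno())
    ledger_path = os.path.join(directory, ledger_name)
    ledger = _load_ledger(ledger_path)
    ledger[name] = digest                         # 3. record checksum
    # Note: this ledger write is itself not atomic in this sketch.
    with open(ledger_path, "w") as fh:
        json.dump(ledger, fh)
    return digest


def check(directory, ledger_name="checksums.json"):
    """Report inconsistencies as sorted (name, problem) pairs; repair nothing."""
    ledger = _load_ledger(os.path.join(directory, ledger_name))
    problems = []
    for name, digest in ledger.items():
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            problems.append((name, "recorded but missing"))
            continue
        with open(path, "rb") as fh:
            if sha512_of(fh.read()) != digest:
                # Case (a): the file was not written completely.
                problems.append((name, "checksum mismatch"))
    for name in os.listdir(directory):
        if name != ledger_name and name not in ledger:
            # Case (b): a file exists but its checksum was never recorded.
            problems.append((name, "present but unrecorded"))
    return sorted(problems)
```

An interrupted write then shows up either as a mismatch (the file is shorter than what was checksummed) or as an unrecorded file (the crash happened between steps 2 and 3), and the maintainer decides what to do about it.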

I don't think we could enumerate all the possible failure states, but perhaps we could view the validation process as a bit like "fsck", where it could detect and alert the maintainer that something was wrong, giving them the ability to fix it.

@zimeon
Contributor

zimeon commented Mar 9, 2018

Yes, I think that given the critical place of the manifest/versions.jsonld file in being able to reconstruct the state of the object from the blobs, rules for update might include rules about writing the new manifest to some agreed temporary file and then switching them over in a controlled way (which might be different in filesystem vs. cloud stores).
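For the filesystem case, the "write to a temporary file, then switch over" idea could look roughly like the sketch below. The function name `replace_atomically` and the `.tmp-` prefix are hypothetical, and the sketch assumes the temporary file and the target live on the same filesystem so that the rename is a single step. Cloud object stores generally have different semantics and would need a different approach.

```python
import json
import os
import tempfile


def replace_atomically(target_path, document):
    """Write `document` as JSON next to `target_path`, then rename over it."""
    directory = os.path.dirname(os.path.abspath(target_path))
    # Create the temporary file in the same directory as the target so the
    # final rename does not cross filesystems.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-", suffix=".json")
    try:
        with os.fdopen(fd, "w") as fh:
            json.dump(document, fh, indent=2)
            fh.flush()
            os.fsync(fh.fileno())   # push the bytes to disk before switching
        # Readers see either the old file or the new one, never a partial write.
        os.replace(tmp_path, target_path)
    except BaseException:
        # On failure, remove the temporary file and leave the target untouched.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

If the process dies before `os.replace`, the previous file is still intact and the only leftover is a stray temporary file, which a cleanup pass can recognise by its prefix.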

@ahankinson
Contributor Author

ahankinson commented Sep 5, 2018

F2F 2018.09.05: An object with a version directory that has no record in the inventory is invalid and, per #14, is not permitted. Specific automated or manual interventions are not prescribed and are not in scope.
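Detecting the condition described here (a version directory with no record in the inventory) might be sketched as below. The assumptions are loud ones: the inventory is taken to be a JSON file named `inventory.json` in the object root with a `versions` map keyed by version directory name, version directories are taken to match `v` followed by digits, and `unrecorded_version_dirs` is a hypothetical helper. It only reports; it does not intervene.

```python
import json
import os
import re

# Assumed naming pattern for version directories (e.g. v1, v2, v0003).
VERSION_DIR = re.compile(r"^v\d+$")


def unrecorded_version_dirs(object_root, inventory_name="inventory.json"):
    """Return version directories on disk that the inventory does not mention."""
    with open(os.path.join(object_root, inventory_name)) as fh:
        inventory = json.load(fh)
    recorded = set(inventory.get("versions", {}))
    on_disk = {
        entry
        for entry in os.listdir(object_root)
        if VERSION_DIR.match(entry)
        and os.path.isdir(os.path.join(object_root, entry))
    }
    return sorted(on_disk - recorded)
```

A non-empty result means the object is invalid in the sense above, for example because a write was interrupted after the version directory was created but before the inventory was updated.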

@ahankinson ahankinson added Confirmed: In-scope Use case will be included in the upcoming version of the spec or implementation notes. Component: Client Behaviors and removed Proposed: In-Scope Use case is up for discussion and may change the spec, implementation notes, or become an extension. labels Sep 5, 2018
@neilsjefferies
Member

A lot of this is now covered in the Implementation Notes on writing new versions.

@rosy1280 rosy1280 added Proposed: In-Scope Use case is up for discussion and may change the spec, implementation notes, or become an extension. Confirmed: Out-of-scope Use case will not be included in the upcoming version of the spec or implementation notes. and removed Confirmed: In-scope Use case will be included in the upcoming version of the spec or implementation notes. Proposed: In-Scope Use case is up for discussion and may change the spec, implementation notes, or become an extension. labels Sep 22, 2023
@zimeon
Contributor

zimeon commented Sep 22, 2023

Editors' meeting 2023-09-22: OCFL, by its nature, cannot provide a strong notion of transaction. An application writing an OCFL object must manage that process and ensure that it completes creation of a valid object. On failure, there must be some cleanup, and some ideas are documented in the Implementation Notes - Clean up. Closing as out-of-scope.

@zimeon zimeon closed this as completed Sep 22, 2023
4 participants