More compact inventory.json format #642

Open
srerickson opened this issue Sep 29, 2023 · 0 comments

One issue with OCFL v1.x is that inventory.json can get quite large, especially when an object has many versions, many files per version, and fixity data on top. A more compact format is possible by removing the duplication of digests and file paths. The structure below, based on the spec-ex-full fixture, illustrates this. The original inventory is 3773 bytes; the alternative structure is 2217 bytes, about 41% smaller, despite carrying the same information.

{
    "type": "https://ocfl.io/2.0-draft/spec/#inventory",
    "id": "ark:/12345/bcd987",
    "digestAlgorithm": "sha512",
    "head": "v3",
    "manifest": {
	"4d27c86b026ff709b02b05d126cfef7ec3aed5f83f5e98df7d7592f7a44bd1dc7f29509cff06b884158baa36a2bbeda11ab8a64b56585a70f5ce1fa96e26eb53": {
			"content": ["v2/content/foo/bar.xml"],
			"v1": [],
			"v2": ["foo/bar.xml"],
			"v3": ["foo/bar.xml"],
			"fixity":{
				"md5": "2673a7b11a70bc7ff960ad8127b4adeb",
				"sha1": "a6357c99ecc5752931e133227581e914968f3b9c"
			}
		},
	"7dcc352f96c56dc5b094b2492c2866afeb12136a78f0143431ae247d02f02497bbd733e0536d34ec9703eba14c6017ea9f5738322c1d43169f8c77785947ac31": {
			"content": ["v1/content/foo/bar.xml"],
			"v1": ["foo/bar.xml"],
			"v2": [],
			"v3": [],
			"fixity":{
				"md5": "184f84e28cbe75e050e9c25ea7f2e939",
				"sha1": "66709b068a2faead97113559db78ccd44712cbf2"
			}
		}, 
	"cf83e1357eefb8bdf1542850d66d8007d620e4050b5715dc83f4a921d36ce9ce47d0d13c5d85f2b0ff8318d2877eec2f63b931bd47417a81a538327af927da3e": {
			"content": ["v1/content/empty.txt"],
			"v1": ["empty.txt"],
			"v2": ["empty.txt","empty2.txt"],
			"v3": ["empty2.txt"],
			"fixity":{
				"md5": "d41d8cd98f00b204e9800998ecf8427e",
				"sha1": "da39a3ee5e6b4b0d3255bfef95601890afd80709"
			}
		},
	"ffccf6baa21809716f31563fafb9f333c09c336bb7400088f17e4ff307f98fc9b14a577f92f3285913b7f53a6d5cf004503cf839aada1c885ac69336cbfb862e": {
			"content": ["v1/content/image.tiff"],
			"v1": ["image.tiff"],
			"v2": [],
			"v3": ["image.tiff"],
			"fixity":{
				"md5": "c289c8ccd4bab6e385f5afdd89b5bda2",
				"sha1": "b9c7ccc6154974288132b63c15db8d2750716b49"
			}
		}
    },
	"versions": {
		"v1": {
			"created": "2018-01-01T01:01:01Z",
			"message": "Initial import",
			"user": {
				"address": "mailto:alice@example.com",
				"name": "Alice"
			}
		},
		"v2": {
			"created": "2018-02-02T02:02:02Z",
			"message": "Fix bar.xml, remove image.tiff, add empty2.txt",
			"user": {
				"address": "mailto:bob@example.com",
				"name": "Bob"
			}
		},
		"v3": {
			"created": "2018-03-03T03:03:03Z",
			"message": "Reinstate image.tiff, delete empty.txt",
			"user": {
				"address": "mailto:cecilia@example.com",
				"name": "Cecilia"
			}
		}
	}
}
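To make the mapping concrete, here is a rough Python sketch of how a v1.x inventory could be converted into the structure above. It is only an illustration, not part of any spec or existing library: the function name `compact_inventory` is made up, and it assumes that digests in each version's `state` use the same letter case as the manifest keys and that every content path sharing a digest also shares its fixity values.

```python
def compact_inventory(inv):
    """Convert an OCFL v1.x inventory dict into the proposed compact layout.

    Hypothetical helper for illustration only.
    """
    versions = inv["versions"]

    # Invert the v1.x fixity block: content path -> {algorithm: digest}.
    fixity_by_path = {}
    for alg, digests in inv.get("fixity", {}).items():
        for fx_digest, paths in digests.items():
            for path in paths:
                fixity_by_path.setdefault(path, {})[alg] = fx_digest

    manifest = {}
    for digest, content_paths in inv["manifest"].items():
        entry = {"content": list(content_paths)}
        # One key per version, listing the logical paths with this digest
        # (an empty list when the digest is not part of that version).
        for vnum, version in versions.items():
            entry[vnum] = list(version["state"].get(digest, []))
        # Assumption: all content paths for a digest have the same fixity.
        fixity = {}
        for path in content_paths:
            fixity.update(fixity_by_path.get(path, {}))
        if fixity:
            entry["fixity"] = fixity
        manifest[digest] = entry

    # Copy the remaining top-level keys, dropping the blocks that were merged.
    out = {k: v for k, v in inv.items()
           if k not in ("manifest", "versions", "fixity")}
    out["type"] = "https://ocfl.io/2.0-draft/spec/#inventory"
    out["manifest"] = manifest
    out["versions"] = {
        vnum: {k: v for k, v in version.items() if k != "state"}
        for vnum, version in versions.items()
    }
    return out
```

Each digest then appears once as a manifest key, instead of once in the manifest, once per version state, and once per fixity algorithm.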

Edit:

Perhaps a cleaner structure for each manifest entry would group the per-version paths under a "state" key:

{ 
  "content": ["v2/content/foo/bar.xml"],
  "state":{
    "v1": [],
    "v2": ["foo/bar.xml"],
    "v3": ["foo/bar.xml"]
  },
  "fixity":{
    "md5": "2673a7b11a70bc7ff960ad8127b4adeb",
    "sha1": "a6357c99ecc5752931e133227581e914968f3b9c"
  }
}
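
With this variant, rebuilding the logical state of a single version means scanning every manifest entry. A minimal Python sketch, assuming each entry has the "state" layout shown above (the function name `logical_state` is hypothetical):

```python
def logical_state(manifest, version):
    """Return {logical path: digest} for one version of the object.

    Assumes each manifest entry looks like
    {"content": [...], "state": {"v1": [...], ...}, "fixity": {...}}.
    """
    state = {}
    for digest, entry in manifest.items():
        # A missing or empty list means the digest is not in this version.
        for logical_path in entry.get("state", {}).get(version, []):
            state[logical_path] = digest
    return state
```

The trade-off is that reading one version's state costs a pass over the whole manifest, whereas in v1.x it is a single lookup under `versions`.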