What is an OCFL digital object? #9

Closed
zimeon opened this issue Apr 4, 2018 · 13 comments

@zimeon
Contributor

zimeon commented Apr 4, 2018

We need a good definition for the glossary (https://github.com/OCFL/spec/blame/master/GLOSSARY.md#L2).

@zimeon
Contributor Author

zimeon commented Apr 4, 2018

I think we should not try to describe an object in the abstract but instead do it in terms of the OCFL properties. As a first stab, it could be something along the lines of:

An OCFL digital object is a collection of zero or more datastreams and administrative information that together have a URI identifier. An object may contain a sequence of versions of the datastreams.

@ahankinson
Contributor

Do you think the definition in the discussion paper provides a good starting point?

@zimeon
Copy link
Contributor Author

zimeon commented Apr 10, 2018

My comment was motivated by thinking the definition in the paper is not adequate:

An OCFL digital object is a collection of files and metadata that can have a notional but largely implicitly-understood boundary: “This is the thing, and this is not the thing.” ... The object should also contain a record of the metadata that describes the origin, character, and purpose of the collection of files. ...

because:

  • I think an OCFL digital object has a very well defined boundary and, like BagIt, a notion of testable completeness (how that maps to boundaries in other models is arbitrary, but out of scope)
  • It omits the idea of having an identifier
  • I think the description of what the metadata describes is an overreach (it should enable all of those things to be described, but they are not required).

Looking at it now, I think my #9 (comment) was excessively reductionist and omits both the metadata/data distinction and log files.

@julianmorley
Contributor

Is OCFL actually an AIP? Or is it a set of standards that other AIP formats can conform to, to gain compatibility and extended features? For example, what does an OCFL-compliant Moab object look like? What does an OCFL-compliant BagIt object look like? What benefits (aside from cross-compatibility) do they gain from compliance?

I know that I am putting myself on the hook to answer at least some of these questions.

@ahankinson
Contributor

I believe yes, it is a standardised form of an AIP.

@rosy1280
Contributor

rosy1280 commented Jun 4, 2018

@julianmorley it's a specification that AIPs can conform to.

From @rotated8: it's a grouping of files that represent an intellectual endeavor.

@ahankinson
Contributor

ahankinson commented Jun 4, 2018

An OCFL Object is a group of one or more bitstreams and their administrative information that together have a URI. An object may contain a sequence of versions of the bitstreams that together represent an intellectual endeavour.
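To make the shape of this draft definition concrete, here is a minimal sketch of it as an in-memory data structure. It is an illustration only, not the OCFL specification or any existing library: the class names `OCFLObject` and `Version`, the method names, and the free-form `admin` dictionary are all assumptions made for this example. It follows the "one or more bitstreams" wording of this draft (the earlier draft in this thread said "zero or more").

```python
from dataclasses import dataclass, field


@dataclass
class Version:
    """One state of the object (hypothetical name, not from the spec)."""
    # Logical path -> bitstream content.
    bitstreams: dict[str, bytes]
    # Administrative information; the keys are illustrative assumptions.
    admin: dict[str, str] = field(default_factory=dict)


@dataclass
class OCFLObject:
    """A group of bitstreams plus administrative information under one URI."""
    uri: str
    # Ordered sequence of versions, oldest first.
    versions: list[Version] = field(default_factory=list)

    def add_version(self, bitstreams, admin=None):
        """Append a new version and return its 1-based version number."""
        # This draft of the definition asks for one or more bitstreams.
        if not bitstreams:
            raise ValueError("a version needs at least one bitstream")
        self.versions.append(Version(dict(bitstreams), dict(admin or {})))
        return len(self.versions)

    def head(self):
        """Return the most recent version."""
        return self.versions[-1]
```

Earlier versions stay in the list untouched, so the sequence records how the object changed over time without saying why it changed.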

@ahankinson
Contributor

ahankinson commented Jun 4, 2018

OCFL is an application-independent specification that describes the storage of OCFL Objects in a structured, transparent, and predictable manner.

@ahankinson assigned ahankinson and unassigned rosy1280 Jun 4, 2018
@awoods
Member

awoods commented Jun 18, 2018

@ahankinson: Your suggested OCFL Object definition and OCFL effort specification definition seem like great starting points.

I would be inclined to move them into the specification document.

@zimeon
Contributor Author

zimeon commented Jun 18, 2018

Re. the OCFL Object definition, I'd suggest a stronger "together are identified by a URI" rather than just "have". I'm not convinced about the "intellectual endeavour" part for versioning; there might be many reasons for changes in the set of bitstreams comprising an object. I don't think a format conversion is an intellectual endeavour, for example, and even the accretion of data in a dataset would be questionable. Is it just that the versions represent the evolution of the object?

@awoods
Member

awoods commented Jun 18, 2018

Would it be easier to wordsmith against a PR?

@zimeon
Contributor Author

zimeon commented Jun 18, 2018

If I make the PR it might be biased towards my (changing - noting that I just objected to the "have" I had earlier suggested) views ;-)

@awoods added the "Ready for Review" label (ready for review by editorial group) Jun 21, 2018
@awoods
Member

awoods commented Jun 22, 2018

Resolved with: #25

@awoods closed this as completed Jun 22, 2018
@zimeon added this to the Alpha milestone Oct 12, 2018

5 participants