@RichardWallis
Contributor

Implement Issue #1758 - Archives and their collections
Implement Issue #1759 - MaterialExtent & CollectionSize

New Types: Archive, ArchiveComponent
New Properties: accessConditions, archiveHeld, holdingArchive, collectionSize, materialExtent

@anarchivist

Hi @RichardWallis - happy to see progress on this. Just curious if there were any potential status updates regarding what this might need to move forward.

@RichardWallis
Contributor Author

The current status is that this PR is awaiting the assembly of a list of candidates to go into the next release.

Hopefully it will not be too long before an umbrella issue is created for candidates to go into what will probably be version 3.5

@danbri
Contributor

danbri commented Apr 20, 2018

ping @azaroth42 to take a looksee over this.

@anarchivist

Per Richard's request for use case and implementation interest, I'm reproducing my message from the public-architypes list:

I can speak to our motivation [...] Stanford University Libraries have worked on ArcLight, which is an open source discovery platform for archives. We're hoping to 1) roll it out as a product locally, ideally over the next year, and 2) add Schema.org markup to improve discovery. It's also intended to be a broader community solution, and potentially suitable for consortia to adopt.

In addition, I'm part of the (North American) Archives and Linked Data working group that wrote and presented a poster at DCMI 2017 regarding the proposed extensions and the gap analysis we did regarding existing descriptive standards and the extension. I'm hoping that my colleagues from the group will be able to share some additional discussion on potential implementations/applications that they're working on.

@sfolsom

sfolsom commented May 23, 2018

Since Archive is being proposed as a new type, I wonder whether the properties archiveHeld and holdingArchive could be generalized to "holds" and "heldBy". This would make the property pair useful for libraries, museums, historical societies, and other non-archive institutions.

@anarchivist

@sfolsom - that's an interesting idea. I also wonder about the potential alignment (or not) with offeredBy, which is used in examples for library collections. (cc @dbs)

@RichardWallis
Contributor Author

@sfolsom An interesting idea which would, as you say, be useful for libraries, museums, historical societies, and other non-archive institutions. My question would be: does the understanding of the terms "holds" and "heldBy" remain consistent across other domains such as legal, life science, etc.?

@thadguidry
Contributor

@RichardWallis Yes, set theory and containerization concepts apply here. One thing I like about just using "holds" is that you're not explicitly calling out any expected Type like a Collection, or a single Member, Item, Node, or Thing: "holdsMembers" or "holdsItems" versus just "holds". This allows for greater flexibility for any individual item OR sets of collections.

Thad's Archive:
"holds" : [CollectionStarWars, CollectionVietnamWar, Commodore64Games, StolenMyPreciousRing]
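The inverse-pair idea being discussed can be sketched in a few lines of Python, using plain dictionaries as stand-ins for JSON-LD nodes. This is only an illustration: "holds" and "heldBy" are the names suggested in this thread, not Schema.org terms, and the function name, the "#..." identifiers, and the sample data are hypothetical.

```python
import json


def add_inverse_links(holder, items, forward="holds", inverse="heldBy"):
    """Link a holder to its items and each item back to the holder.

    `forward` and `inverse` default to the generic names suggested in this
    thread; pass "archiveHeld" / "holdingArchive" to use the PR's terms.
    """
    linked_holder = dict(holder)  # shallow copy so the input is not mutated
    linked_holder[forward] = [{"@id": item["@id"]} for item in items]
    linked_items = [{**item, inverse: {"@id": holder["@id"]}} for item in items]
    return linked_holder, linked_items


# Hypothetical identifiers, loosely following the example above.
archive = {"@id": "#thads-archive", "name": "Thad's Archive"}
collections = [
    {"@id": "#CollectionStarWars"},
    {"@id": "#Commodore64Games"},
]

holder, items = add_inverse_links(archive, collections)
print(json.dumps(holder, indent=2))
print(json.dumps(items, indent=2))
```

Because the property names are parameters, the same helper covers both the generic pair and the archive-specific pair, which is the trade-off the thread is weighing.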

@sfolsom

sfolsom commented May 24, 2018

@RichardWallis I imagine heldBy and holds, or generally "holds, keeps, maintains" and "held by, kept by, maintained by" would be true for any domain.

To @anarchivist's point about http://schema.org/offers, I'm curious if part of the proposal was to reflect the fact that an archive may not be offering the collection, fond, item (even for viewing or circulation), and yet it's still important to know the holding information. FWIW, I think this is a useful distinction. Said differently, one who offers something likely holds the thing (or at least brokers it), but one who holds a thing may not offer it.

Similarly, one who holds something, may not own it, making this proposal subtly different from (and more simple than) http://schema.org/OwnershipInfo.

[I hope this discussion is only helping the proposal; I would like to see it pass.]

@ruthtillman

This would support some of the work we're doing at Penn State Libraries where we're trying to publish more linked data from our Special Collections Library and Archives. We're getting our data ready for Spotlight, as mentioned by @anarchivist above, but we're exploring other opportunities for linked data publishing.

I appreciate the discussion of holding here -- with some of the concerns about domain specificity as discussed above...what holding means in an archival sense vs. what it means to offer things or own them. Short form is that I support the proposal as is. Longer-term, I think that there's more room for conversation about holding...

anarchivist pushed a commit to archival/schema-org that referenced this pull request Jul 12, 2018
@Dataliberate
Contributor

Since this proposal failed to make the cut for version 3.4 of Schema.org, I have received several statements of support and intentions to use these terms once adopted, via the Schema Architypes Community Group and GitHub comments. I reproduce them below so that the level of support and intention from significant individuals and significant archives organisations can be made apparent.

Based upon this, plus positive responses in the individual issues that this PR enacts, I would strongly request that this PR is included in the 3.5 release.

/cc @danbri

  • Adrian Stevenson - Archives Hub - JISC. (May 2018)
    For our part, the Archives Hub is intending to publish schema.org markup within our online archival descriptions. We are looking to do this somewhere around July to Sept 2018. We would add the archive enhancements when they are released. The consumption use case is a more difficult one for us to see at this stage, but we would be looking to integrate data from other archival schema.org sources as and when they appear if this looks like it would enhance discovery and improve the functionality of the Archives Hub service.

  • Mark Matienzo - Stanford University Libraries
    I can speak to our motivation as well. Stanford University Libraries have worked on ArcLight, which is an open source discovery platform for archives. We're hoping to 1) roll it out as a product locally, ideally over the next year, and 2) add Schema.org markup to improve discovery. It's also intended to be a broader community solution, and potentially suitable for consortia to adopt.
    In addition, I'm part of the (North American) Archives and Linked Data working group that wrote and presented a poster at DCMI regarding the proposed extensions and the gap analysis we did regarding existing descriptive standards and the extension. I'm hoping that my colleagues from the group will be able to share some additional discussion on potential implementations/applications that they're working on.

  • Adrian Stevenson - Archives Hub - JISC. (July 2018)
    We should have some basic schema.org markup available in the Archives Hub descriptions very soon now, as it’s one of the things in our current work package. I’ll let the list know when this is available. We’ll of course add the archive extensions when we can once they’re accepted.

  • Ruth - Pennsylvania State University Libraries
    This would support some of the work we're doing at Penn State Libraries where we're trying to publish more linked data from our Special Collections Library and Archives. We're getting our data ready for Spotlight, as mentioned by @anarchivist above, but we're exploring other opportunities for linked data publishing. I appreciate the discussion of holding here -- with some of the concerns about domain specificity as discussed above...what holding means in an archival sense vs. what it means to offer things or own them. Short form is that I support the proposal as is. Longer-term, I think that there's more room for conversation about holding...

  • Alec Mulinder - The National Archives (UK)
    We at The National Archives of the United Kingdom are already implementing the sharing of Schema.org data across our website and online catalogue.  Based upon the materials we hold we consider the proposed types in the Archives proposal to extend Schema.org to be key to helping us satisfactorily describe our resources.  As such we strongly support the adoption of this proposal.

  • State Archives and Records Authority of NSW, Australia
    State Archives NSW is currently using schema.org terms to describe government digital archives and would welcome the inclusion of more domain-specific terms to the vocabulary. We would adopt these terms in our descriptive practices and would seek to make them available for online discovery as well.
    Terry Jolliffe | Project Officer, Digital Archives

@anarchivist

I'll also add this Twitter thread between myself, @danbri, and @informaticmonad that adds some more context of other areas of work we're considering: https://twitter.com/InformaticMonad/status/1030175595949260800

@btwashburn

From a recent post to the Archives and Linked Data Interest Group discussion (https://groups.google.com/forum/#!topic/archives-and-linked-data):

ArchiveGrid is an open discovery system that OCLC's Research division maintains, to provide access to descriptions of archival materials, based (mostly) on MARC records from WorldCat, along with finding aids harvested from contributing institutions.

All of the 5.4M ArchiveGrid records that are based on MARC records in WorldCat include JSON-LD schema.org data in their HTML source.

I've recently updated the JSON-LD to incorporate the proposed ArchiveComponent, holdingArchive, and Archive extensions.

Here's an ArchiveGrid search to return MARC records:

https://researchworks.oclc.org/archivegrid/?p=1&q=type%3Amarc

And here's a URL to view one of those records in the Google Structured Data testing tool, which should detect and display the embedded JSON-LD:

https://search.google.com/structured-data/testing-tool#url=https%3A%2F%2Fresearchworks.oclc.org%2Farchivegrid%2Fcollection%2Fdata%2F950912120

The testing tool throws errors for the (currently) unrecognized Schema.org classes and predicates, as expected.

I have questions about rendering information about the institution that holds the item.

My previous practice had been to make use of the "offers" relationship and the "offeredBy" structure to link to the associated institution:

"offers": {
  "@type": "Offer",
  "offeredBy": {
    "url": "https://researchworks.oclc.org/archivegrid/archive/123",
    "@type": [
      "Organization",
      "Archive"
    ],
    "name": "American Antiquarian Society"
  }
}

I've retained that structure, and have added a parallel "holdingArchive" relationship:

"holdingArchive": {
  "@type": "Archive",
  "url": "https://researchworks.oclc.org/archivegrid/archive/123",
  "name": "American Antiquarian Society"
}
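To see how the two parallel relationships sit together in one record, here is a minimal Python sketch that builds both fragments from the same institution data and serializes the result as JSON-LD. It is not ArchiveGrid's actual code: the helper name, the top-level "ArchiveComponent" type, and the record title are assumptions made for illustration, while the institution name and URL are taken from the fragments above.

```python
import json


def holding_institution_markup(name, url):
    """Return the parallel 'offers' and 'holdingArchive' fragments."""
    offer = {
        "@type": "Offer",
        "offeredBy": {
            "@type": ["Organization", "Archive"],
            "name": name,
            "url": url,
        },
    }
    holding = {"@type": "Archive", "name": name, "url": url}
    return {"offers": offer, "holdingArchive": holding}


record = {
    "@context": "http://schema.org",
    "@type": "ArchiveComponent",      # assumed top-level type
    "name": "Example collection",     # hypothetical title
}
record.update(
    holding_institution_markup(
        "American Antiquarian Society",
        "https://researchworks.oclc.org/archivegrid/archive/123",
    )
)
print(json.dumps(record, indent=2))
```

Generating both fragments from one source keeps the institution name and URL from drifting apart between the Offer and the holdingArchive statement.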

I noted that ArchivesSpace is now embedding JSON-LD on collection pages (on at least some sites), where the "provider" relationship is used to indicate the responsible institution.

For example, in https://archives.etsu.edu/repositories/2/resources/2:

"provider": {
"@id": null,
"url": "https://archives.etsu.edu/repositories/2",
"@type": "Organization",
"name": "Archives of Appalachia"
}

So I'm wondering what the recommended best practice is, and whether the parallel rendering I've adopted would be acceptable, at least for now.

@RichardWallis
Contributor Author

This is indeed awesome Bruce!

Thanks Mark for forwarding to the relevant lists, and again Bruce for adding to the pull request comments on the Schema.org Github.

This will hopefully help push the PR over the edge into the next Schema.org release, especially with Mark viewing it from the standpoint of potential data consumption.

Hopefully sometime soon after that merge, Google's testing tool will stop throwing errors for these terms.

As to the questions raised...

'offeredBy', 'provider' & 'holdingArchive' on the surface are similar, but in detail have significant differences.

offeredBy: A pointer to the organization or person making the offer
This Schema property is used on the Offer type, which describes an offer to make some Thing available: "An offer to transfer some rights to an item or to provide a service — for example, an offer to sell tickets to an event, to rent the DVD of a movie, to stream a TV show over the internet, to repair a motorcycle, or to loan a book."
It therefore would not be expected to be used directly as a property of a CreativeWork, Collection, etc., but rather, as Bruce has used it, within an Offer that relates a thing to the organization making it available.

provider: The service provider, service operator, or service performer; the goods producer. Another party (a seller) may offer those services or goods on behalf of the provider. A provider may also serve as the seller.
Despite the Schema leaning towards commercial relationships in its descriptions, the provider of a thing is considered to be the producer (originator?) of a thing. As indicated, this does not prevent the provider also being referenced via offeredBy in an Offer (and, in our case, via the holdingArchive of an ArchiveComponent).

holdingArchive: Archive that holds, keeps or maintains the ArchiveComponent
The specific relationship between a Thing (defined as an ArchiveComponent) and the Archive that holds it. An Archive is defined as an "Institution with archival holdings - An Archive, or Archives, is an organization which keeps and preserves archival material and potentially makes it accessible to the public."
In the examples from ArchiveGrid Bruce has, I believe validly, added a parallel relationship to that supplied via Offer/offeredBy, utilising holdingArchive, between the item and the Archive.

I would comment that the Offer description could benefit from a few extra properties identifying the parameters of the offer such as businessFunction (Loan - http://purl.org/goodrelations/v1#LeaseOut - practice defined with the library community), areaServed, availableAtOrFrom, price (0 - free), etc.
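As a rough illustration of that suggestion, the following Python sketch adds the mentioned properties to an existing Offer dictionary. The function name and the example values for areaServed and availableAtOrFrom are hypothetical; the businessFunction URI and the zero price are the ones named in the comment above.

```python
import json

# GoodRelations business function cited above for library-style loans.
LEASE_OUT = "http://purl.org/goodrelations/v1#LeaseOut"


def describe_loan_offer(offer, area_served=None, available_from=None):
    """Return a copy of `offer` with loan-style parameters added."""
    enriched = dict(offer)  # shallow copy; the input dict is left unchanged
    enriched["businessFunction"] = LEASE_OUT
    enriched["price"] = 0  # free of charge, as suggested above
    if area_served is not None:
        enriched["areaServed"] = area_served
    if available_from is not None:
        enriched["availableAtOrFrom"] = available_from
    return enriched


offer = {
    "@type": "Offer",
    "offeredBy": {
        "@type": ["Organization", "Archive"],
        "name": "American Antiquarian Society",
    },
}
loan = describe_loan_offer(
    offer,
    area_served="US",                                            # hypothetical value
    available_from={"@type": "Place", "name": "Reading room"},   # hypothetical value
)
print(json.dumps(loan, indent=2))
```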

@danbri
Contributor

danbri commented Mar 8, 2019

Looking at this, lots of good things about it, but let's discuss a couple of things.

  1. the name "Archive" pulls in a couple of directions even within the intended use of it here: as a kind of organization (which is what I see in the definitions) but also the repository or repositories of stuff that is managed/curated/held by those organizations.

Further, there is the growing sense in which "archive" is used for a wider range of archival activity. Archiving my digital photos, for example. Or an archive within an institution. Your current definition assumes archives are made available to the public.

How about "ArchiveOrganization" instead of "Archive", and maybe sprinkle on a "typically" or "usually" to allow for non-public cases.

  2. accessConditions

This is too specific to add purely for this sense of archive. Please postpone it for now.

"Details of conditions of access and use of an archive or item" ...

... even for the very nearby usecase of libraries, museums, this property doesn't fit. We already have opening hours, I suggest whatever we do ought to apply to all visitable organizations/places i.e. local businesses. The case of modeling access details to a specific item is interesting but again very usecase specific. Can you post a few examples of this property so we can look for commonalities with other areas of schema markup?

/cc @mkanzaki in case this is relevant to Japan Search too

@RichardWallis
Contributor Author

@danbri - let's discuss a couple of things...

  1. the name "Archive" ...

How about "ArchiveOrganization" instead of "Archive", and maybe sprinkle on a "typically" or "usually" to allow for non-public cases.

Seems a sensible suggested adjustment.

  2. accessConditions
    I agree it is a bit specific to archives - where on an individual item basis they want to say things like "Make an appointment to view via the County Archivist", "Academic researchers can apply to..", "Please check with the Theatre and Performance enquiry team regarding access arrangements before making an appointment to listen to this item.", etc. It is different to opening hours and other access conditions for an organisation or place.

The intention of the property does have wider application not just in Archives, Libraries etc. It could be applicable to many CreativeWork subtypes. I am struggling to come up with another property name that means "what you have to do to get hold of this thing".

I agree it is probably a good idea to put this particular property on hold for later proposal when more generic use cases and naming has been explored.

I will update the proposal to reflect these two points.
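For anyone who had already published markup against the earlier draft, the two agreed changes could be applied with something like the Python sketch below. It is an assumption-laden illustration, not part of the PR: the function name is hypothetical, the sample record is made up from values quoted in this thread, and simply dropping accessConditions is only one option (the text could equally be kept under a local, non-Schema.org key).

```python
import json


def apply_agreed_changes(node):
    """Recursively rename the Archive type and drop accessConditions."""
    if isinstance(node, list):
        return [apply_agreed_changes(child) for child in node]
    if not isinstance(node, dict):
        return node  # strings, numbers, None: nothing to change
    updated = {}
    for key, value in node.items():
        if key == "accessConditions":
            continue  # postponed property: omit it from the output
        if key == "@type":
            types = value if isinstance(value, list) else [value]
            renamed = ["ArchiveOrganization" if t == "Archive" else t for t in types]
            updated[key] = renamed if isinstance(value, list) else renamed[0]
        else:
            updated[key] = apply_agreed_changes(value)
    return updated


before = {
    "@type": "ArchiveComponent",
    "accessConditions": "Use of original papers requires an appointment.",
    "holdingArchive": {
        "@type": ["Organization", "Archive"],
        "name": "American Antiquarian Society",
    },
}
after = apply_agreed_changes(before)
print(json.dumps(after, indent=2))
```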

@anarchivist

@danbri 💬

  1. the name "Archive" ...

How about "ArchiveOrganization" instead of "Archive", and maybe sprinkle on a "typically" or "usually" to allow for non-public cases.

Sounds good to me, and from what I see @RichardWallis used "potentially" as the word here. FWIW, the International Standard for Description of Institutions with Archival Holdings (ISDIAH) specifies that such institutions make materials available to the public, but I don't think we need to litigate this too much further.

@danbri 💬

  2. accessConditions ... This is too specific to add purely for this sense of archive. Please postpone it for now. "Details of conditions of access and use of an archive or item" ... The case of modeling access details to a specific item is interesting but again very usecase specific. Can you post a few examples of this property so we can look for commonalities with other areas of schema markup?

I will note that access restrictions often apply to an entire collection or a major section of it (e.g. a series). This may overlap with the need to express access restrictions on datasets.

This is a messy list of documentation and examples, so apologies - hope it's helpful though.

Archives

Datasets

(Perhaps an interesting aside...)

More cultural heritage examples

  • US National Archives
  • British Library digital collections: e.g. "In-Library access only"
  • More examples
    • Open for public research.

    • Privacy restricted until 2025; permission to use materials must be obtained from the Supervisor of Reference Services.

    • Use of archival audiovisual recordings with no duplicate access copy requires advance notice.

    • Use of original papers requires an appointment.

    • University records are public records and once fully processed are generally open to research use. Records that contain personally identifiable information will be closed to protect individual privacy. The closure of university records is subject to compliance with applicable laws.

    • Only the photocopies (housed in Box 105) of these fragile materials may be used.

    • Ban Ki-moon's papers (those under AG-069-003) were screened for immediate disclosure in 2017-18, following the completion of Mr. Ban's final term. Digitized versions of all those archives identified for disclosure are available online.

@RichardWallis
Contributor Author

@anarchivist Thanks for all the links & examples.

@danbri 'accessRights' is a good candidate for a more generic property that would be helpful for most CreativeWork sub-types.

@RichardWallis
Contributor Author

@danbri Suggested changes applied.
See:

@anarchivist

Added issue #2173 to discuss the accessRights proposal.

Contributor

@danbri danbri left a comment

#application: schemaorgae
in app.yaml - intended?

@RichardWallis
Contributor Author

#application: schemaorgae
in app.yaml - intended?

Yes - it is needed (in the version of the code the PR branch is based upon) to run in the current dev environment where the application variable has been superseded by a deployment command line value.

It now matches that setting in the master branch.

/cc @danbri

@danbri danbri merged commit ce5f0b9 into master Mar 13, 2019
danbri pushed a commit that referenced this pull request Mar 20, 2019
* matched fix in main branch

* Updated definitions to reflect agreed changes to proposal
Made changes visible
@RichardWallis RichardWallis deleted the archivespr branch June 13, 2019 09:46
9 participants