Getty Vocabularies as a Taxonomy Source #2057

Closed
Jegelewicz opened this issue Apr 24, 2019 · 7 comments
Labels
Function-Taxonomy/Identification Priority-Normal (Not urgent) Normal because this needs to get done but not immediately.

Comments

@Jegelewicz
Member

Not sure of the best practice for linking between issues but just wanted to make a note that I'd like to consider the open source data for Getty Vocabularies (http://www.getty.edu/research/tools/vocabularies/lod/index.html) in this discussion as well because the Nomenclature 4.0 isn't a great fit for art collections and Dusty has commented that our current "taxon" structure for categorizing our objects is leading to difficulties with searches.

Originally posted by @marecaguthrie in #1732 (comment)

@Jegelewicz Jegelewicz added Function-Taxonomy/Identification Priority-Normal (Not urgent) Normal because this needs to get done but not immediately. labels Apr 24, 2019
@marecaguthrie

marecaguthrie commented Apr 24, 2019 via email

@marecaguthrie

Maybe we could just start with "visual works by material and technique" and see how that goes... that would be a huge benefit to the collection.
Meanwhile, I'm going to try and understand all of this a bit better!

@AJLinn

AJLinn commented Apr 25, 2019

I just found a 2010 publication (put out by the Getty, FWIW) that looks at how these two systems differ. Here's the relevant section. Short answer: we should use both if we can. The AAT is focused on art collections, while the Nomenclature was developed for historical collections. It sounds like we should be able to use them in combination.

4.3.3. Nomenclature for Museum Cataloging vs. the AAT
Users of vocabularies often ask how Chenhall’s Nomenclature differs from the AAT. There is some overlap, but the two vocabularies differ in several ways; thus, catalogers often need to use both.
• Nomenclature is more generalist, with shallow coverage of more disparate types of cultural artifacts, and it has headings in addition to terms. For art and architecture, the AAT has broader and deeper coverage.
• The only overlap between Nomenclature and the AAT is in the AAT Objects Facet.
• The AAT has incorporated all of Nomenclature that is within scope for the AAT.
• Much of Nomenclature is out of scope for the AAT (e.g., medical and surgical equipment) because the AAT focuses on art and cultural heritage.
• The AAT is a polyhierarchical thesaurus, compliant with national and international standards for thesaurus construction. The first two editions of Nomenclature were categorized authority lists. The third edition more closely approaches the model of a monohierarchical thesaurus. Accepted usage practice of the third edition of Nomenclature allows for objects to be cataloged with more than one term for cross-indexing purposes. By contrast, in the first two editions, standard practice was to assign only one term to an object, which discouraged and complicated cross-indexing of objects with multiple functional contexts.
• Nomenclature has fewer used for terms than the AAT. In Nomenclature, nonpreferred terms do not appear in the hierarchical list of terms but in the alphabetical list of terms in the back of the book, with the preferred term noted.
• Nomenclature has no qualifiers, while the AAT has qualifiers.
• Nomenclature is in English. The base language of the AAT is English; however, terms may exist in multiple languages.
• Nomenclature includes some compound terms (headings) that AAT users would construct for themselves.
• The third edition of Nomenclature will have definitions for broad terms at the category, classification, and subclassification level. Object terms will not have definitions, although some terms will be accompanied by helpful hints about usage. The AAT has scope notes for most terms at all levels.
• At the time of this writing, the draft revision of Nomenclature prefers capitalized and inverted terms, while the AAT prefers terms in lowercase and expressed in natural order.
• Nomenclature does not include the published warrant for each term. The AAT cites published sources and institutional contributors for most terms.

@Jegelewicz
Member Author

If I can help, you guys let me know. I agree that this would add superpowers to Arctos!

@marecaguthrie

marecaguthrie commented Apr 25, 2019 via email

@dustymc
Contributor

dustymc commented Apr 25, 2019

On the surface this looks fairly trivial, albeit cumbersome. (I don't think their model was designed by anyone who intended to use it!)

Some obvious issues:

We need to find a way to let in names that don't look like Linnean taxonomy without introducing garbage. If we can limit non-Linnean names to machine input, I don't think that's much of a problem; we'll just need to rebuild some stuff to accommodate. If there's some need to enter as taxonomy things that aren't in Getty (or any other source of 'cultural taxonomy'), then this has some potential to blow up.

"At the time of this writing, the draft revision of Nomenclature prefers capitalized and inverted terms, while the AAT prefers terms in lowercase and expressed in natural order."

Given two ways of saying "thing", about 99.9999% of users will find some of the things and then happily leave with only part of what they were looking for and no clue that they've missed some (or most) of what's available. We can deal with this by creating relationships (until/unless something evil happens here: #1755 (comment)), but we have to find the "synonyms" to do so. I can probably catch some (most??) with the maintenance scripts; what I miss will rely on users noticing and flagging. Perhaps Getty could even be persuaded to create synonyms to other LOD data? (That's sort of the point of LOD after all!)
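A rough sketch of the kind of matching a maintenance script could try for the capitalized-and-inverted versus lowercase-natural-order problem. This is only an illustration, not the actual Arctos scripts: the function names and the sample terms are made up, and it assumes inversion is always marked with commas.

```python
def normalize_term(term: str) -> str:
    """Fold an inverted, capitalized heading into lowercase natural order.

    Assumption: inverted terms put the head noun first and separate the
    modifiers with commas, e.g. "Chair, Rocking".
    """
    parts = [p.strip() for p in term.split(",") if p.strip()]
    # Reverse the comma-separated parts, then lowercase the result:
    # "Chair, Rocking" -> ["Chair", "Rocking"] -> "rocking chair"
    return " ".join(reversed(parts)).lower()


def candidate_synonyms(inverted_terms, natural_terms):
    """Pair up terms from two lists that normalize to the same string."""
    natural_by_key = {normalize_term(t): t for t in natural_terms}
    pairs = []
    for term in inverted_terms:
        match = natural_by_key.get(normalize_term(term))
        if match is not None and match != term:
            pairs.append((term, match))
    return pairs


# Hypothetical sample terms, not taken from either vocabulary.
pairs = candidate_synonyms(
    ["Chair, Rocking", "Basket"],
    ["rocking chair", "easel"],
)
print(pairs)
```

Anything this simple will miss singular/plural differences and real synonyms that share no words, so it would only produce candidates for a human to confirm; the rest still depends on users noticing and flagging.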

Just from browsing around various docs on Getty, I get the impression that the terms are not particularly stable. Neither is Linnean taxonomy, but changes there are supported by publication rather than what looks to me like arbitrary closed-door decisions. That's likely to cause some sort of complications, but we can find a way to deal with it.

Getty includes (a LOT of) "GuideTerm" values - http://vocab.getty.edu/aat/300191091 / http://www.getty.edu/vow/AATFullDisplay?find=&logic=AND&note=&page=1&subjectid=300191091. "Natural history" people do the same ("incertae sedis" is popular), but those usually aren't presented as nodes in a hierarchy. Assuming those are in fact not things you'd want to use in cataloging, avoiding introducing them as names is going to be fairly weird. They are perfectly acceptable as classification terms.
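One way to picture the split between "usable as a name" and "classification only", as a minimal sketch. The `Term` record, the `record_type` values, and the sample data are all hypothetical; this assumes whatever loads the Getty data can tell us which records are guide terms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    term_id: str
    label: str
    record_type: str  # hypothetical values: "Concept" or "GuideTerm"


def split_terms(terms):
    """Separate terms usable as names from classification-only guide terms."""
    names = [t for t in terms if t.record_type != "GuideTerm"]
    classification_only = [t for t in terms if t.record_type == "GuideTerm"]
    return names, classification_only


# Hypothetical records for illustration only.
sample = [
    Term("example-1", "example concept", "Concept"),
    Term("example-2", "<example guide term>", "GuideTerm"),
]
names, classification_only = split_terms(sample)
print([t.term_id for t in names])
print([t.term_id for t in classification_only])
```

Guide terms would still show up as classification terms in the hierarchy; they just would not be offered as names for identifications.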

I'm sure there will be more stumbling blocks, but I can't find anything that looks fatal.

@marecaguthrie is there some reason you wouldn't just use the title as the identification? I think this is essentially a perfect use case for why we have a formal separation between identification and taxonomy. From #1755 (comment) the specimen would display "whatever the artist called the thing" here:

Screen Shot 2019-04-25 at 11 57 14 AM

and here

Screen Shot 2019-04-25 at 11 58 17 AM

and could be located by searching "whatever the artist called the thing" here and here:

Screen Shot 2019-04-25 at 11 59 08 AM

and, barring any huge steps backwards from #1755, by anything on http://www.getty.edu/vow/AATFullDisplay?find=&logic=AND&note=&subjectid=300033618 here:

Screen Shot 2019-04-25 at 12 00 30 PM

@Jegelewicz
Copy link
Member Author

Probably covered by #2499 and #2478; closing as duplicate.

4 participants