# Getty AAT

## What is the AAT? 

The **Getty Arts & Architecture Thesaurus** is a multilingual, semantically structured thesaurus, or structured vocabulary. It encompasses terms, descriptions, and other information for generic concepts related to art, architecture, archaeology, and other cultural heritage ([Cobb, 2015](https://doi.org/10.1080/19386389.2015.1103081)).


- The full published data set contains around 57,390 records (`subject_id`) and 503,230 terms (`term_id`), as of 25 August 2022.
  - Each `subject_id` covers one or more terms (plurals, translations, spelling variations, etc.).
- It has been fully translated into Dutch by the Netherlands Institute for Art History (more info needed).
- It's also released as Linked Open Data, described using RDF. 
- A minimum record in AAT contains a numeric ID, a term, and a position in the hierarchy.



--------> TODO: use .ttl instead of .xml <-------- \
See the file `AAT.xml` for a sample of AAT. It shows the structure of the vocabulary. 
You can see that the file starts with Subject 300015646, which is a child of the Styles and Periods Facet (lines 6 and 9). 

### Terminology
- `Facet`: The major subdivision of AAT's hierarchical structure, containing classes of concepts.  
- `Hierarchy`: Groupings of terminology that are arranged within the eight facets.  
- `Record`: A single entry that contains information about a specific concept.  
- `Term`: A word or phrase that represents the concept in the record. Includes singular/plural forms, spelling variations, and translations. A record can contain multiple terms.   
- `Descriptor` or preferred term: The term that is used by default to refer to a concept. 
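
To make the terminology a bit more concrete, here is a minimal sketch of how a record, its terms, and its descriptor could be modelled. The class and field names (`Record`, `Term`, `preferred`, `parent_id`) and all sample values are assumptions made for illustration; they are not the actual AAT schema or real AAT data.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Term:
    term_id: int
    text: str
    language: str
    preferred: bool = False  # True marks the descriptor for that language


@dataclass
class Record:
    subject_id: int
    parent_id: Optional[int]  # position in the hierarchy; None for the root
    terms: List[Term] = field(default_factory=list)

    def descriptor(self, language: str) -> Optional[Term]:
        """Return the preferred term for a language, if one is marked."""
        for term in self.terms:
            if term.preferred and term.language == language:
                return term
        return None


# Hypothetical values, for illustration only (not real AAT data).
record = Record(
    subject_id=1,
    parent_id=None,
    terms=[
        Term(term_id=10, text="example concept", language="en", preferred=True),
        Term(term_id=11, text="example concepts", language="en"),
        Term(term_id=12, text="voorbeeldconcept", language="nl", preferred=True),
    ],
)
print(record.descriptor("nl").text)
```

One record holds several terms, and the descriptor is simply the term flagged as preferred for a given language.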

Here's an example of a record:

<div>
<img src="img/example_record.png" width="500"/>
</div>


[Image source](https://www.getty.edu/research/tools/vocabularies/aat/AAT-Users-Manual.pdf)

## What does it contain?

The [AAT](https://www.getty.edu/vow/AATHierarchy?find=beauty&logic=AND&note=&page=1&subjectid=300000000) is a faceted, hierarchical database of concepts. The root is called "Top of the AAT hierarchies". It contains eight ***facets*** that form the major subdivisions, each containing a class of concepts. Each facet contains one or more ***hierarchies***. 

- `Root`: Top of the AAT hierarchies 
	- `ASSOCIATED CONCEPT` FACET
		- Associated Concepts
	- `PHYSICAL ATTRIBUTES` FACET
		- Attributes and Properties
		- Conditions and Effects
		- Design Elements
		- Color
	- `STYLES AND PERIODS` FACET
		- Styles and Periods
	- `AGENTS` FACET
		- People
		- Organizations
		- Living Organisms
	- `ACTIVITIES` FACET
		- Disciplines
		- Functions
		- Events
		- Physical and Mental Activities
		- Processes and Techniques
	- `MATERIALS` FACET
		- Materials
	- `OBJECTS` FACET
		- Object Groupings and Systems
		- Object Genres 
		- Components
		- Built Environment
			- Settlements and Landscapes
			- Built Complexes and Districts
			- Single Built Works
			- Open Spaces and Site Elements
		- Furnishings and Equipment
			- Furnishings
			- Costume
			- Tools and Equipment
			- Weapons and Ammunition
			- Measuring Devices
			- Containers
			- Sound Devices
			- Recreational Artifacts
			- Transportation Vehicles
		- Visual and Verbal Communication
			- Visual Works
			- Exchange Media
			- Information Forms
	- `BRAND NAMES` FACET
		- Brand Names


- There are different `Record_Types` for nodes:
    - Facet
    - Hierarchy
    - Concept
    - GuideTerm (placeholder to create a level in the hierarchy)
    - (ScopeNote)
    - (ObsoleteSubject)


![example of the use of different terms](img/008-complex-hierarchy.png)

[image source](https://vocab.getty.edu/doc/#Term)

## How can it be used? 


#### [SPARQL Endpoint](https://vocab.getty.edu/)

- Concepts are identified by their `Subject_ID`

- The AAT is available in the following formats: XML, relational tables, N-Triples (LOD)
- Individual records can be downloaded in JSON, RDF, N3/Turtle, N-Triples
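
As a rough sketch of how a concept could be looked up by its `Subject_ID` through SPARQL, the snippet below builds a query for the labels of one concept in a given language. The endpoint URL, the AAT namespace, and the use of plain `skos:prefLabel`/`skos:altLabel` are assumptions here; check the Getty LOD documentation before relying on them. The code only builds the request and does not send it.

```python
from urllib.parse import urlencode

# Assumed values; verify against the Getty LOD documentation.
SPARQL_ENDPOINT = "https://vocab.getty.edu/sparql"
AAT_NAMESPACE = "http://vocab.getty.edu/aat/"


def build_label_query(subject_id: int, language: str = "nl") -> str:
    """Build a SPARQL query for the SKOS labels of one AAT concept."""
    return f"""
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?label WHERE {{
  <{AAT_NAMESPACE}{subject_id}> skos:prefLabel|skos:altLabel ?label .
  FILTER (langMatches(lang(?label), "{language}"))
}}
""".strip()


def build_request_url(query: str) -> str:
    """Encode the query as a GET parameter for the (assumed) endpoint."""
    return SPARQL_ENDPOINT + "?" + urlencode({"query": query})


query = build_label_query(300015646, language="nl")
request_url = build_request_url(query)
print(query)
```

The resulting `request_url` could then be fetched with any HTTP client, typically with an `Accept` header asking for JSON results.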


#### [API](http://vocabsservices.getty.edu/Docs/Getty_Vocabularies_Web_Services_Documentation_v2.pdf)
- [AAT Programming guidelines](http://vocabsservices.getty.edu/AATService.asmx)
  - a set of 11 operations is supported, such as GetChildren, GetParents, and GetSubjectTerms
  - examples included in SOAP 1.1 and 1.2, HTTP GET, and HTTP POST protocols
- Try out the GetChildren operation with HTTP POST protocol [here](http://vocabsservices.getty.edu/AATService.asmx?op=AATGetChildren) (subjectID: 300015646)

- Or run HTTP GET requests for some operations (see [GetParentLabel](http/getparentlabel.http) and [getSubjectTerms](http/getsubjectterms.http))
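
A small sketch of what such an HTTP GET request could look like from Python. The operation name `AATGetChildren` and the `subjectID` parameter are taken from the links above; the `/Operation?param=value` URL layout is the usual convention for `.asmx` services and is assumed here rather than confirmed, and the helper names are made up for this example.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://vocabsservices.getty.edu/AATService.asmx"


def build_operation_url(operation: str, **params) -> str:
    """Build a GET URL for one web service operation (assumed URL layout)."""
    return f"{BASE_URL}/{operation}?{urlencode(params)}"


def fetch(url: str, timeout: float = 30.0) -> str:
    """Send the request and return the raw XML response as text."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


url = build_operation_url("AATGetChildren", subjectID=300015646)
print(url)
# Call fetch(url) to actually send the request; it is not called here.
```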


## What parts should be used in dataverse-nl/SSH? 

[Good example: termennetwerk](https://termennetwerk.netwerkdigitaalerfgoed.nl/)


What would be useful properties to keep? 
- `Subject_ID` allows for a link back to the source 
- `Term_Language` (possibly only keep Dutch entries)
- `Term_ID`
- `Term_Text`
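
A minimal sketch of that selection, assuming the terms have already been parsed into dictionaries keyed by the property names above. The sample rows, the extra `Display_Order` field, and the exact language values are hypothetical; the real export may encode `Term_Language` differently (for example with a numeric code in front), which is why the match is done on a substring.

```python
from typing import Dict, Iterable, List

KEPT_FIELDS = ("Subject_ID", "Term_ID", "Term_Text", "Term_Language")


def select_terms(rows: Iterable[Dict[str, str]], language: str = "Dutch") -> List[Dict[str, str]]:
    """Keep only terms in the given language, reduced to the kept fields."""
    selected = []
    for row in rows:
        # Substring match, in case the language value carries a code prefix.
        if language.lower() in row["Term_Language"].lower():
            selected.append({name: row[name] for name in KEPT_FIELDS})
    return selected


# Hypothetical rows, for illustration only (not real AAT data).
rows = [
    {"Subject_ID": "1", "Term_ID": "10", "Term_Text": "example concept",
     "Term_Language": "English", "Display_Order": "1"},
    {"Subject_ID": "1", "Term_ID": "12", "Term_Text": "voorbeeldconcept",
     "Term_Language": "Dutch", "Display_Order": "2"},
]
dutch_terms = select_terms(rows)
print(dutch_terms)
```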




### Let's look at the facets 

[`Associated concepts`](https://www.getty.edu/vow/AATHierarchy?find=beauty&logic=AND&note=&subjectid=300179462)
- e.g. philosophical concepts, scientific concepts etc
+ leaf nodes useful
± hierarchy names sometimes useful


[`Physical attributes`](https://www.getty.edu/vow/AATHierarchy?find=beauty&logic=AND&note=&page=1&subjectid=300264087)
- leaf nodes not that useful (too specific)
± hierarchy names somewhat useful


[`Styles and Periods`](https://www.getty.edu/vow/AATHierarchy?find=beauty&logic=AND&note=&subjectid=300015646)
+ nodes under the hierarchy nodes useful
- hierarchy nodes and guide terms not useful



[`Agents`](https://www.getty.edu/vow/AATHierarchy?find=beauty&logic=AND&note=&page=1&subjectid=300264089)
- categories of people/organizations/animals/plants
- too detailed?
- leaf nodes, or the level just above them (leaf nodes -1), might be useful 


[`Activities`](https://www.getty.edu/vow/AATHierarchy?find=beauty&logic=AND&note=&subjectid=300264090)
- does not seem really useful to me 
- a lot of verbs that you probably wouldn't use to describe a dataset 

[`Materials`]()
- hierarchy/guide terms nodes are not useful 
- not sure if the leaf nodes are interesting 

[`Objects`](https://www.getty.edu/vow/AATHierarchy?find=beauty&logic=AND&note=&subjectid=300264092)
- largest category 
- hierarchy nodes useful
- leaf nodes useful (when directly under hierarchy nodes)
- deeper leaf nodes may be too specific (e.g. types of comic operas)


[`Brand Names`]()
- not useful for our purpose; it was introduced by the conservation community for materials/processes/objects that don't have a generic name but rather a name under trademark protection. 



--> I'd say `Associated concepts`, `Styles and Periods`, and `Objects` are the facets with the concepts that could be most useful as keywords for finding/describing datasets.  

## Notes
- Entailment? Do we want to keep the hierarchical relationships in place? If you search for "painting", do you want to match all terms below it in the hierarchy, or only that exact concept? 
- Overall, hierarchy names seem like good categories to me, but so do the leaf nodes. Facets and guide terms don't seem that useful for this use. The usefulness of concepts depends on the facet (sometimes too fine-grained).
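
To illustrate the entailment question, here is a small sketch that contrasts exact matching with matching that also includes everything below a term. The mini-hierarchy is hypothetical and does not reflect actual AAT relationships; `expand` and `matches` are names made up for this example.

```python
from typing import Dict, List, Set

# Hypothetical mini-hierarchy: parent label -> child labels (not real AAT data).
CHILDREN: Dict[str, List[str]] = {
    "paintings": ["panel paintings", "miniatures"],
    "panel paintings": ["altarpieces"],
}


def expand(term: str, children: Dict[str, List[str]]) -> Set[str]:
    """Return the term plus all of its descendants in the hierarchy."""
    found = {term}
    stack = [term]
    while stack:
        current = stack.pop()
        for child in children.get(current, []):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def matches(query: str, keyword: str, children: Dict[str, List[str]], entail: bool) -> bool:
    """Check whether a dataset keyword matches a query, with or without entailment."""
    if entail:
        return keyword in expand(query, children)
    return keyword == query


print(matches("paintings", "altarpieces", CHILDREN, entail=True))
print(matches("paintings", "altarpieces", CHILDREN, entail=False))
```

With entailment, a search for a broad term also finds datasets tagged with narrower terms; without it, only datasets tagged with exactly that term are found.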