# [The Getty Arts & Architecture Thesaurus](https://www.getty.edu/research/tools/vocabularies/aat/) (AAT) 

## Important Links or *How to find your way around in the Getty Maze*


* [Vocabularies Downloads](https://www.getty.edu/research/tools/vocabularies/obtain/download.html)
* [AAT Hierarchy](https://www.getty.edu/vow/AATHierarchy?find=&logic=AND&note=&english=N&subjectid=300000000)
* [Technical Vocabularies Documentation](http://vocab.getty.edu/doc/)
* [SPARQL UI](https://vocab.getty.edu/queries#Finding_Subjects)
* [Semantic view](https://vocab.getty.edu/aat/) (Top of hierarchy)
* [ODC-By license](https://opendatacommons.org/licenses/by/1-0/)


## What is the AAT? 

The [**Getty Arts & Architecture Thesaurus**](https://www.getty.edu/research/tools/vocabularies/aat/) is a multilingual, semantically structured **controlled vocabulary**. It encompasses terms, descriptions, and other information for generic concepts related to visual art, architecture, archaeology, and other cultural heritage. Importantly, the AAT contains generic terms, not iconographic subjects or proper names. In other words: ["each concept is a case of many (a generic thing), not a case of one (a specific thing)"](https://www.getty.edu/research/tools/vocabularies/aat/about.html#purpose). The full AAT database contains around 74,460 concept records (subject_id) and 503,230 terms (term_id). Each concept record contains one or more terms (e.g., singular/plural forms, spelling variations, translations). At a minimum, a record contains a unique numeric ID, a term, and an indication of its position in the structured hierarchy. Often it also contains a description of the term, a list of associated or equivalent terms, and temporal information. The AAT has been translated into Dutch by the Netherlands Institute for Art History. Note that the AAT is a compiled resource, so it is not a complete, definitive collection of concepts. It grows through [contributions](https://www.getty.edu/research/tools/vocabularies/contributors.html) from domain experts.


### Terminology
Below is a list of the most important terminology related to the AAT. The terms correspond to the different layers of information in the thesaurus.

- `Facet`: The major subdivision or upper layer of AAT's hierarchical structure, each containing classes of concepts. In total, there are 8 facets.  
- `Hierarchy`: Groupings of terminology that are arranged within the facets.  
- `Guide Term`: A placeholder for categories within the hierarchies. They serve to organize the hierarchies into logical segments. 
- `Record`: A single entry that contains information about a specific concept.  
- `Term`: A word or phrase that represents the concept in the record. Includes singular/plural forms, spelling variations, and translations. A record can contain multiple terms.   



This example illustrates how the different levels work: 
- `Facet`: [Agents](https://www.getty.edu/vow/AATHierarchy?find=beauty&logic=AND&note=&subjectid=300264089)
    - `Hierarchy`: [People (hierarchy name)](https://www.getty.edu/vow/AATHierarchy?find=beauty&logic=AND&note=&subjectid=300024978)
        - `Guide Term`: [ < people by occupation > ](https://www.getty.edu/vow/AATHierarchy?find=beauty&logic=AND&note=&subjectid=300024980)
            - `Record`: [taxonomists](https://www.getty.edu/vow/AATFullDisplay?find=beauty&logic=AND&note=&subjectid=300237758)
                - `Term`: [taxonoom (Dutch)](https://www.getty.edu/vow/AATFullDisplay?find=beauty&logic=AND&note=&subjectid=300237758)

Take a look at this example record as seen by an end user, and see some of the terminology in practice.

<div>
<img src="img/example_record.png" width="600"/>
</div>


[Image source](https://www.getty.edu/research/tools/vocabularies/aat/AAT-Users-Manual.pdf), p. 32.

## What does the AAT contain?

The AAT is structured as a hierarchical database of concepts, containing eight **facets** that each host a number of **hierarchies**. Below you can see an outline of the facets and hierarchies, including example terms. See [here](https://www.getty.edu/vow/AATHierarchy?find=beauty&logic=AND&note=&page=1&subjectid=300000000) for a collapsible hierarchy display on the Getty website. It allows you to browse through the structure and inspect the leaf nodes. You may notice that each facet has its own internal logic for ordering the nodes (e.g. by function, or by geographical, temporal or cultural information).

The root of the structure is called *Top of the AAT hierarchies*. Each facet in the outline below contains a clickable link to its hierarchy on the Getty website.

Most facets are relatively straightforward in their meaning. The final facet, *Brand Names*, allows for necessary additions by the conservation community, particularly where a material, process, or object does not have a generic name and the names are under trademark protection. The largest facet is the *Objects* facet. 

- [`Root`](https://www.getty.edu/vow/AATHierarchy?find=beauty&logic=AND&note=&subjectid=300000000): **Top of the AAT hierarchies**
	- [`ASSOCIATED CONCEPT`](https://www.getty.edu/vow/AATHierarchy?find=beauty&logic=AND&note=&subjectid=300264086) FACET \
		(e.g. *beauty*, *socialism*, *cultural pluralism*)
		- Associated Concepts 
	- [`PHYSICAL ATTRIBUTES`](https://www.getty.edu/vow/AATHierarchy?find=beauty&logic=AND&note=&page=1&subjectid=300264087) FACET \
		(e.g. *borders, round, waterlogged*)
		- Attributes and Properties
		- Conditions and Effects
		- Design Elements
		- Color
	- [`STYLES AND PERIODS`](https://www.getty.edu/vow/AATHierarchy?find=beauty&logic=AND&note=&subjectid=300264088) FACET \
		(e.g. *Abstract Expressionist, Yoruba*)
		- Styles and Periods
	- [`AGENTS`](https://www.getty.edu/vow/AATHierarchy?find=beauty&logic=AND&note=&page=1&subjectid=300264089) FACET \
		(e.g. *printmakers, landscape architects*)
		- People
		- Organizations
		- Living Organisms
	- [`ACTIVITIES`](https://www.getty.edu/vow/AATHierarchy?find=beauty&logic=AND&note=&subjectid=300264090) FACET \
		(e.g. *archaeology, engineering, analyzing*)
		- Disciplines
		- Functions
		- Events
		- Physical and Mental Activities
		- Processes and Techniques
	- [`MATERIALS`](https://www.getty.edu/vow/AATHierarchy?find=beauty&logic=AND&note=&subjectid=300264091) FACET \
		(e.g. *iron, clay, artificial ivory*)
		- Materials
	- [`OBJECTS`](https://www.getty.edu/vow/AATHierarchy?find=beauty&logic=AND&note=&subjectid=300264092) FACET \
		(e.g. *paintings, facades, cathedrals, chairs*)
		- Object Groupings and Systems
		- Object Genres 
		- Components
		- Built Environment
			- Settlements and Landscapes
			- Built Complexes and Districts
			- Single Built Works
			- Open Spaces and Site Elements
		- Furnishings and Equipment
			- Furnishings
			- Costume
			- Tools and Equipment
			- Weapons and Ammunition
			- Measuring Devices
			- Containers
			- Sound Devices
			- Recreational Artifacts
			- Transportation Vehicles
		- Visual and Verbal Communication
			- Visual Works
			- Exchange Media
			- Information Forms
	- [`BRAND NAMES`](https://www.getty.edu/vow/AATHierarchy?find=beauty&logic=AND&note=&subjectid=300343372) FACET \
		(e.g. *Agfacolor, Arches paper*)
		- Brand Names


### Take a look under the hood: a Turtle file
The AAT was originally released in XML and Relational Table formats. More recently, it has been made available as [Linked Open Data (LOD)](https://www.getty.edu/research/tools/vocabularies/lod/index.html) in JSON, RDF, N3/Turtle, and N-Triples formats. We recommend not working with the XML or Relational Table formats because they may become obsolete in the future. The LOD options are the more robust choice.

Let's take a look at what a single record can look like in [Turtle](https://en.wikipedia.org/wiki/Turtle_(syntax)) (Terse RDF Triple Language, or .ttl) format. Turtle is a syntax for expressing data in the Resource Description Framework (RDF), which is used to represent information about resources on the web and for data interchange. Turtle is designed to be more human-readable than other RDF syntaxes, such as RDF/XML. Turtle supports the use of URIs to uniquely identify concepts, improving data retrieval and linking related information.


Open the [aat_300123559.ttl file](sparql/aat_300123559.ttl) in a new window and try to identify the following information: 
- The record identifier 
- The label name 
- The date of the most recent modification
- The parent label
- The Dutch translations



You may have found that the record identifier is *300123559*, the label name is *Attributes and Properties (hierarchy name)*, the date of the most recent modification is *2015-07-03*, and the parent label is *Physical Attributes Facet*. The Dutch translations of the terms and the descriptions can be recognized by their `@nl` language tags.

## AAT License: ODC-By
> All Getty Vocabulary data found on this site are made available openly and free from fees under the  [Open Data Commons Attribution License (ODC-By) 1.0.](https://opendatacommons.org/licenses/by/1-0/).

Source: https://www.getty.edu/research/tools/vocabularies/obtain/index.html#citing_vocab

According to the [ODC-By license summary](https://opendatacommons.org/licenses/by/summary/):
    
    You are free:

    * To share: To copy, distribute and use the database.
    * To create: To produce works from the database.
    * To adapt: To modify, transform and build upon the database.

    As long as you:

    Attribute: You must attribute any public use of the database, or works produced from the database, in the manner specified in the license. For any use or redistribution of the database, or works produced from it, you must make clear to others the license of the database and keep intact any notices on the original database.

## How to cite AAT

>  To attribute the Getty Vocabularies under the Open Data Commons Attribution License (ODC-By) 1.0, please use the following language:
For AAT: *Contains information from the J. Paul Getty Trust, Getty Research Institute, the Art & Architecture Thesaurus, which is made available under the [ODC Attribution License](http://opendatacommons.org/licenses/by/1-0/).*

## How can the AAT be used? 
The AAT can be downloaded in various formats for local use, but more integrated options are available, such as the SPARQL Endpoint and the API. See the sections below with information on how to use them. 


### [SPARQL Endpoint](https://vocab.getty.edu/)
With the SPARQL Endpoint you can take advantage of the linked data principles that are integrated in the AAT, and it allows you to perform complex queries. 

Below you find a Python implementation of a SPARQL query. It contains a simple query that retrieves all English and Dutch term labels that correspond to a given record. You can change the output by specifying different `subjectID`s.

```python
from SPARQLWrapper import SPARQLWrapper, JSON

# Public SPARQL endpoint of the Getty Vocabularies
ENDPOINT = "http://vocab.getty.edu/sparql"


def query_getty_aat(subjectID):
    """Print the English and Dutch labels of the AAT record with the given ID."""
    # Define the SPARQL query with the input subjectID
    query = f"""
    PREFIX aat: <http://vocab.getty.edu/aat/>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT * {{
        aat:{subjectID} rdfs:label ?label .
        FILTER (LANG(?label) = 'nl' || LANG(?label) = 'en')
    }}"""

    # Initialize the SPARQL wrapper
    sparql = SPARQLWrapper(ENDPOINT)
    sparql.setQuery(query)
    sparql.setReturnFormat(JSON)

    # Execute the query and convert the response to a Python dict
    results = sparql.query().convert()

    # Print the results
    print("Subject ID: ", subjectID)
    for result in results["results"]["bindings"]:
        print(" " * 3, "label: ", result["label"]["value"])
    print()


######## Change the subjectIDs here ########
subjectIDs = ["300008458", "300386018", "300055821"]

for subjectID in subjectIDs:
    query_getty_aat(subjectID)
```



```txt
Subject ID:  300008458
    label:  cathedral cities
    label:  cathedral city
    label:  cities, cathedral
    label:  domsteden
    label:  domstad

Subject ID:  300386018
    label:  Costa Rican
    label:  Costa Ricaans

Subject ID:  300055821
    label:  beauty
    label:  the Beautiful
    label:  schoonheid
    label:  schoon
```
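
The printed output above does not show which label belongs to which language. As a minimal sketch, the helper below groups the `?label` bindings of a SPARQL JSON result by language tag. The function name `labels_by_language` and the `sample` dictionary are illustrative assumptions; the sample is written by hand in the standard SPARQL JSON results layout and is not fetched from the endpoint.

```python
from collections import defaultdict


def labels_by_language(results):
    """Group the ?label bindings of a SPARQL JSON result by language tag."""
    grouped = defaultdict(list)
    for binding in results["results"]["bindings"]:
        label = binding["label"]
        # Literals without a language tag are collected under the empty string.
        grouped[label.get("xml:lang", "")].append(label["value"])
    return dict(grouped)


# Hand-written sample in the SPARQL JSON results layout (not fetched live).
sample = {
    "head": {"vars": ["label"]},
    "results": {
        "bindings": [
            {"label": {"type": "literal", "xml:lang": "en", "value": "cathedral cities"}},
            {"label": {"type": "literal", "xml:lang": "nl", "value": "domsteden"}},
            {"label": {"type": "literal", "xml:lang": "nl", "value": "domstad"}},
        ]
    },
}

print(labels_by_language(sample))
```

The same helper should work on the `results` dictionary returned by `sparql.query().convert()` in the function above, since it relies only on the `results`/`bindings` structure already used there.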



### [API](http://vocabsservices.getty.edu/Docs/Getty_Vocabularies_Web_Services_Documentation_v2.pdf)

API Documentation: http://vocabsservices.getty.edu/AATService.asmx

The AAT also offers an **[API](http://vocabsservices.getty.edu/AATService.asmx)** (Application Programming Interface), a tool that allows developers to access and manipulate data programmatically. This makes it easier to integrate into applications and websites.

The API provides a straightforward way to retrieve specific information, such as related terms or hierarchical structures. It supports [various operations](http://vocabsservices.getty.edu/AATService.asmx), including **GetChildren** and **GetParents**, which allow you to explore the relationships between concepts. The API supports different protocols like **SOAP** and **HTTP**. 

To get started with the API, you can try out the [**GetChildren**](http://vocabsservices.getty.edu/AATService.asmx?op=AATGetChildren) operation on the Getty website. You can enter any subject ID, for example `300015646`.

Alternatively, you can run **HTTP GET** requests for the operations that are included in the current repository (see the files [GetParentLabel](http/getparentlabel.http) and [getSubjectTerms](http/getsubjectterms.http)).

##### Negative aspects of Getty API?

* responses are in XML (a bit antiquated and a pain to parse)
* limited number of API methods
* soon obsolete: according to a statement on the [Getty LOD pages](https://www.getty.edu/research/tools/vocabularies/lod/index.html) (as of Dec. 2024), Getty will be updating the Getty Vocabularies tech infrastructure and retiring the XML (aka this API) at the end of 2025
* the information in the API response is rather poor compared to what the SPARQL endpoint and the RDF files/dump return
    * no Dutch (NL) label is returned
    * no (immediately visible) information on the class of the term: is it a skos:Concept or a skos:Collection? ¯\_(ツ)_/¯

Note: to see an example response of the API method `AATGetSubject`, enter an AAT concept ID, e.g. [300444999](http://vocab.getty.edu/page/aat/300444999) (label: `'Post-Colonial'@en`), at http://vocabsservices.getty.edu/AATService.asmx/AATGetSubject. Compare the response with the information available from the SPARQL endpoint/RDF ([JSON](https://vocab.getty.edu/aat/300444999.json), [Semantic View](https://vocab.getty.edu/aat/300444999)) and you will see what I mean.
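
For HTTP GET requests like the ones mentioned above, the request URL can be assembled as in the sketch below. The base URL and the operation name come from the links in this section. The helper name `build_service_url` is hypothetical, and `subjectID` as the query parameter name is an assumption that should be checked against the web services documentation.

```python
from urllib.parse import urlencode

# Base URL taken from the links in this section.
SERVICE_BASE = "http://vocabsservices.getty.edu/AATService.asmx"


def build_service_url(operation, **params):
    """Return the HTTP GET URL for one operation of the AAT web service."""
    query = urlencode(params)
    url = f"{SERVICE_BASE}/{operation}"
    return f"{url}?{query}" if query else url


# `subjectID` as the parameter name is an assumption; verify it in the docs.
url = build_service_url("AATGetSubject", subjectID="300444999")
print(url)
```

The response is XML, so it could then be fetched with `urllib.request.urlopen` and parsed with `xml.etree.ElementTree`. The element names in the response are not shown here because they are not documented in this README.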

## N-Triples (Knowledge Graph dumps)

Under *Datasets* in the *Documentation and Downloads* section of [The Getty Vocabularies](https://vocab.getty.edu/)
you can find the N-Triples zip files for the AAT: [full.zip](http://aatdownloads.getty.edu/VocabData/full.zip) (all statements) and [explicit.zip](http://aatdownloads.getty.edu/VocabData/explicit.zip) (only explicit statements).

The explicit.zip archive contains several .nt files:
``` 
AATOut_1Subjects.nt
AATOut_2Terms.nt
AATOut_AssociativeRels.nt
AATOut_ContribRels.nt
AATOut_Contribs.nt
AATOut_HierarchicalRels.nt
AATOut_Lang_sameAs.nt
AATOut_LCSHAlignment.nt
AATOut_Notations.nt
AATOut_ObsoleteSubjects.nt
AATOut_OrderedCollections.nt
AATOut_RevisionHistory.nt
AATOut_RevisionHistorySource.nt
AATOut_ScopeNotes.nt
AATOut_SemanticLinks.nt
AATOut_SourceRels.nt
AATOut_Sources.nt
AATOut_TermsTest.nt
``` 
Of these, I think the relevant information will be in:
* AATOut_1Subjects.nt - the subjects(concepts)
* AATOut_2Terms.nt - labels for the subjects
* AATOut_Lang_sameAs.nt (maybe)

TODO: WHAT IS THE CONTENT OF THESE FILES???

(TODO: clarify whether the AAT full.zip includes `skos:broader` and `skos:narrower` relations)
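
One way to start answering the TODOs above is to count which predicates occur in each `.nt` file. The sketch below does this with the standard library only. The regular expression is a simplified reading of the N-Triples line format, not a full parser, and the names `parse_triple` and `count_predicates` are illustrative. The sample lines are rewritten from the statements shown later in this document.

```python
import re
from collections import Counter

# Simplified N-Triples pattern: the subject is an IRI or blank node, the
# predicate is an IRI, and the object is whatever remains before the final " .".
TRIPLE_RE = re.compile(r'^(<[^>]*>|_:\S+)\s+(<[^>]*>)\s+(.+?)\s*\.\s*$')


def parse_triple(line):
    """Return (subject, predicate, object) for one line, or None if it is not a triple."""
    match = TRIPLE_RE.match(line)
    return match.groups() if match else None


def count_predicates(lines):
    """Count how often each predicate occurs in an iterable of N-Triples lines."""
    counts = Counter()
    for line in lines:
        triple = parse_triple(line)
        if triple:
            counts[triple[1]] += 1
    return counts


sample_lines = [
    '<http://vocab.getty.edu/aat/300008458> <http://purl.org/dc/elements/1.1/identifier> "300008458" .',
    '<http://vocab.getty.edu/aat/300008458> <http://www.w3.org/2004/02/skos/core#inScheme> <http://vocab.getty.edu/aat/> .',
    '# comment lines and blank lines are skipped',
]
for predicate, n in count_predicates(sample_lines).most_common():
    print(n, predicate)
```

To run it on a real file, pass an open file object instead of the sample list, e.g. `count_predicates(open("AATOut_2Terms.nt", encoding="utf-8"))`, assuming the archive has been extracted into the working directory.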

# Hierarchical Representations in AAT 

**The Getty AAT does not explicitly use the common SKOS properties `skos:broader` and `skos:narrower` to denote the hierarchical relations** between its concepts. Instead it uses `gvp:narrower` and `gvp:broader`. The SKOS hierarchy can still be queried via the [Getty SPARQL web UI](https://vocab.getty.edu/sparql) with inference enabled, because these SKOS relations are inferred.


It is possible to query the Getty AAT SPARQL endpoint (via the [web UI](https://vocab.getty.edu/sparql)) for these relations if inference is enabled ([see query](https://vocab.getty.edu/queries?toc=&query=%23+find+subject+with+skos%3Anarrower+relationships+to+other+subject+%0D%0A%0D%0APREFIX+skos%3A+%3Chttp%3A%2F%2Fwww.w3.org%2F2004%2F02%2Fskos%2Fcore%23%3E%0D%0A%0D%0ASELECT+*%0D%0AWHERE+%7B%0D%0A++++%3Fs+skos%3Anarrower+%3Fo+.%0D%0A%7D%0D%0ALIMIT+10&implicit=true&equivalent=false#Finding_Subjects)). When querying the Getty AAT SPARQL endpoint remotely, as we try in [sparql/aat_narrower.ttl](sparql/aat_narrower.ttl), we cannot rely on inferred statements, hence that query returns zero results.


![skos relations inferred](img/inference-skos.png) source: https://www.getty.edu/research/tools/vocabularies/lod/aat_semantic_representation.pdf / https://vocab.getty.edu/doc/
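
As a sketch of how a remote query could target the Getty-specific property instead of the inferred SKOS one, the helper below only builds the query text. It assumes the `gvp:broader` property named above and the `http://vocab.getty.edu/ontology#` namespace that appears in the statements elsewhere in this document. The function name `build_broader_query` is hypothetical, and whether the query returns rows without inference has not been verified here.

```python
def build_broader_query(subject_id, limit=10):
    """Build a SPARQL query listing the gvp:broader parents of one AAT concept."""
    if not str(subject_id).isdigit():
        raise ValueError("AAT subject IDs are numeric, e.g. 300008458")
    return f"""
PREFIX aat: <http://vocab.getty.edu/aat/>
PREFIX gvp: <http://vocab.getty.edu/ontology#>
SELECT ?parent WHERE {{
    aat:{subject_id} gvp:broader ?parent .
}}
LIMIT {int(limit)}
"""


print(build_broader_query("300008458"))
```

The resulting string can be passed to `sparql.setQuery(...)` in the `SPARQLWrapper` example from the SPARQL Endpoint section.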


## Do we need SKOS?

For the SSHOC use case, where a user is offered terms as Dataverse keywords, we can disregard the hierarchical relations of the AAT. Instead we focus on **which terms we want to offer** and on how to capture them.


# Which AAT terms do we want to offer?

## Subject_Hierarchy AAT
https://vocab.getty.edu/doc/#Subject_Hierarchy

## Produce a submodule of AAT and include it in SKOSMOS




# Language Labels in AAT

Take as an example the concept http://vocab.getty.edu/aat/300008458 (cathedral cities).

In AATOut_1Subjects.ttl we have the following statements about <http://vocab.getty.edu/aat/300008458>:

```txt
<http://vocab.getty.edu/aat/300008458>
        <http://vocab.getty.edu/ontology#displayOrder>  "1"^^<http://www.w3.org/2001/XMLSchema#positiveInteger>;
        <http://vocab.getty.edu/ontology#parentStringAbbrev>  "cathedral centers, <settlements by function: administrative>, ... Objects Facet";
        <http://vocab.getty.edu/ontology#parentString>  "cathedral centers, <settlements by function: administrative>, <settlements by function>, inhabited places, Settlements and Landscapes (hierarchy name), Built Environment (hierarchy name), Objects Facet";
        <http://purl.org/dc/elements/1.1/identifier>  "300008458";
        <http://creativecommons.org/ns#license>  <http://opendatacommons.org/licenses/by/1.0/>;
        <http://www.w3.org/2004/02/skos/core#inScheme>  <http://vocab.getty.edu/aat/>;
        <http://rdfs.org/ns/void#inDataset>  <http://vocab.getty.edu/dataset/aat>;
        <http://purl.org/dc/terms/license>  <http://opendatacommons.org/licenses/by/1.0/>;
        a       <http://vocab.getty.edu/ontology#Concept> .

```

No labels there.

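If we then check what statements are made in AATOut_2Terms.ttl, we see statements about the different labels of a concept. Note that the query below uses aat:300008475 (the guide term `<settlements by location: topographical>`), not aat:300008458.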

Query
```sparql
SELECT * WHERE { <http://vocab.getty.edu/aat/300008475> ?p ?v}
```

Results: 
```txt
-------------------------------------------------------------------------------------------------------------------------
| p                                              | v                                                                    |
=========================================================================================================================
| <http://www.w3.org/2008/05/skos-xl#prefLabel>  | <http://vocab.getty.edu/aat/term/1000754126-zh-Latn-pinyin-x-notone> |
| <http://www.w3.org/2008/05/skos-xl#prefLabel>  | <http://vocab.getty.edu/aat/term/1000754124-zh-Hant>                 |
| <http://www.w3.org/2008/05/skos-xl#prefLabel>  | <http://vocab.getty.edu/aat/term/1000468934-nl>                      |
| <http://www.w3.org/2008/05/skos-xl#prefLabel>  | <http://vocab.getty.edu/aat/term/1000008475-en>                      |
| <http://www.w3.org/2008/05/skos-xl#prefLabel>  | <http://vocab.getty.edu/aat/term/1000754128-zh-Latn-wadegile>        |
| <http://www.w3.org/2008/05/skos-xl#prefLabel>  | <http://vocab.getty.edu/aat/term/1000416350-es>                      |
| <http://www.w3.org/2008/05/skos-xl#prefLabel>  | <http://vocab.getty.edu/aat/term/1000754127-zh-Latn-pinyin-x-hanyu>  |
| <http://vocab.getty.edu/ontology#prefLabelGVP> | <http://vocab.getty.edu/aat/term/1000008475-en>                      |
| <http://www.w3.org/2008/05/skos-xl#altLabel>   | <http://vocab.getty.edu/aat/term/1000754125-zh-Hant>                 |
| <http://www.w3.org/2008/05/skos-xl#altLabel>   | <http://vocab.getty.edu/aat/term/1000468935-nl>                      |
-------------------------------------------------------------------------------------------------------------------------
```

This shows that three properties are used to indicate labels: skosXL:prefLabel, gvp:prefLabelGVP and skosXL:altLabel. I believe we should focus on skosXL:prefLabel (preferred label) and ignore the rest. Note that for the concept in question there is one skosXL:prefLabel in English (http://vocab.getty.edu/aat/term/1000008475-en) and another in Dutch (http://vocab.getty.edu/aat/term/1000468934-nl).

If we query the statements made in AATOut_2Terms.ttl about that term `SELECT * WHERE {<http://vocab.getty.edu/aat/term/1000468934-nl> ?p ?v}` we get the following results

```txt
| p                                                 | v                                                       |
===============================================================================================================
| <http://purl.org/dc/elements/1.1/identifier>      | "1000468934"                                            |
| <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> | <http://www.w3.org/2008/05/skos-xl#Label>               |
| <http://www.w3.org/2008/05/skos-xl#literalForm>   | "<nederzettingen naar locatie: topografisch>"@nl        |
| <http://vocab.getty.edu/ontology#term>            | "nederzettingen naar locatie: topografisch"@nl          |
| <http://vocab.getty.edu/ontology#termType>        | <http://vocab.getty.edu/term/type/Descriptor>           |
| <http://purl.org/dc/terms/language>               | <http://vocab.getty.edu/language/nl>                    |
| <http://vocab.getty.edu/ontology#displayOrder>    | "7"^^<http://www.w3.org/2001/XMLSchema#positiveInteger> |
---------------------------------------------------------------------------------------------------------------
```
Similarly, for http://vocab.getty.edu/aat/term/1000008475-en the query `SELECT * WHERE {<http://vocab.getty.edu/aat/term/1000008475-en> ?p ?v}` gives the following results


```txt
===============================================================================================================
| <http://purl.org/dc/elements/1.1/identifier>      | "1000008475"                                            |
| <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> | <http://www.w3.org/2008/05/skos-xl#Label>               |
| <http://vocab.getty.edu/ontology#termPOS>         | <http://vocab.getty.edu/term/POS/NeuterAdjectival>      |
| <http://www.w3.org/2008/05/skos-xl#literalForm>   | "<settlements by location: topographical>"@en           |
| <http://vocab.getty.edu/ontology#term>            | "settlements by location: topographical"@en             |
| <http://vocab.getty.edu/ontology#termType>        | <http://vocab.getty.edu/term/type/Descriptor>           |
| <http://purl.org/dc/terms/language>               | <http://vocab.getty.edu/language/en>                    |
| <http://vocab.getty.edu/ontology#displayOrder>    | "1"^^<http://www.w3.org/2001/XMLSchema#positiveInteger> |
---------------------------------------------------------------------------------------------------------------
```

This means that for each concept we need:
* the skos-xl#prefLabel values (term IRIs) that end in `-en` or `-nl`
* for each of those skos-xl#prefLabel values, the `gvp:term` value
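
The two steps above can be sketched as follows. This is a minimal illustration, not the project's implementation. It assumes the triples have already been parsed into `(subject, predicate, object)` tuples, where IRIs are plain strings and literals are `(text, language)` tuples. The function name `preferred_terms` is hypothetical, and the sample values are copied from the query results above.

```python
SKOSXL_PREF = "http://www.w3.org/2008/05/skos-xl#prefLabel"
GVP_TERM = "http://vocab.getty.edu/ontology#term"
AAT = "http://vocab.getty.edu/aat/"


def preferred_terms(triples, concept, languages=("en", "nl")):
    """Map language -> gvp:term text for the concept's skos-xl preferred labels."""
    triples = list(triples)
    # Step 1: preferred label IRIs of the concept that end in -en or -nl.
    wanted = {}
    for s, p, o in triples:
        if s == concept and p == SKOSXL_PREF and isinstance(o, str):
            for lang in languages:
                if o.endswith(f"-{lang}"):
                    wanted[o] = lang
    # Step 2: the gvp:term literal attached to each of those label IRIs.
    terms = {}
    for s, p, o in triples:
        if s in wanted and p == GVP_TERM:
            terms[wanted[s]] = o[0]
    return terms


sample = [
    (AAT + "300008475", SKOSXL_PREF, AAT + "term/1000468934-nl"),
    (AAT + "300008475", SKOSXL_PREF, AAT + "term/1000008475-en"),
    (AAT + "300008475", SKOSXL_PREF, AAT + "term/1000416350-es"),
    (AAT + "term/1000468934-nl", GVP_TERM, ("nederzettingen naar locatie: topografisch", "nl")),
    (AAT + "term/1000008475-en", GVP_TERM, ("settlements by location: topographical", "en")),
]
print(preferred_terms(sample, AAT + "300008475"))
```

Matching on the IRI suffix follows the rule stated above. An alternative would be to read the `dcterms:language` statement of each label, which is also present in the results.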

# How to transform into concept + en(label) + nl(label)?

concept aat:300008458
        a gvp:Concept -> a skos:Concept
        <http://purl.org/dc/terms/license>  <http://opendatacommons.org/licenses/by/1.0/>;
        <http://www.w3.org/2004/02/skos/core#inScheme>  <http://vocab.getty.edu/aat/>;
        skos:prefLabel ... @en
        skos:prefLabel ... @nl

<http://vocab.getty.edu/aat/300008458>
        <http://vocab.getty.edu/ontology#displayOrder>  "1"^^<http://www.w3.org/2001/XMLSchema#positiveInteger>;
        <http://vocab.getty.edu/ontology#parentStringAbbrev>  "cathedral centers, <settlements by function: administrative>, ... Objects Facet";
        <http://vocab.getty.edu/ontology#parentString>  "cathedral centers, <settlements by function: administrative>, <settlements by function>, inhabited places, Settlements and Landscapes (hierarchy name), Built Environment (hierarchy name), Objects Facet";
        <http://purl.org/dc/elements/1.1/identifier>  "300008458";
        <http://creativecommons.org/ns#license>  <http://opendatacommons.org/licenses/by/1.0/>;
        <http://www.w3.org/2004/02/skos/core#inScheme>  <http://vocab.getty.edu/aat/>;
        <http://rdfs.org/ns/void#inDataset>  <http://vocab.getty.edu/dataset/aat>;
        <http://purl.org/dc/terms/license>  <http://opendatacommons.org/licenses/by/1.0/>;
        a       <http://vocab.getty.edu/ontology#Concept> .
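
The target shape outlined at the start of this section can be produced with a small serializer like the sketch below. The function name `to_skos_turtle` is hypothetical. The example labels are taken from the SPARQL output earlier in this document, and treating them as the preferred labels is an assumption made for illustration only.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"


def to_skos_turtle(concept_iri, labels):
    """Serialize one concept and its language -> label mapping as SKOS Turtle."""
    lines = [
        f"<{concept_iri}>",
        f"        a <{SKOS}Concept> ;",
        "        <http://purl.org/dc/terms/license> <http://opendatacommons.org/licenses/by/1.0/> ;",
        f"        <{SKOS}inScheme> <http://vocab.getty.edu/aat/> ;",
    ]
    for lang, text in sorted(labels.items()):
        # Escape backslashes and double quotes so the literal stays valid Turtle.
        escaped = text.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'        <{SKOS}prefLabel> "{escaped}"@{lang} ;')
    # The final statement ends with "." instead of ";".
    lines[-1] = lines[-1][:-1] + "."
    return "\n".join(lines)


# Illustrative labels; whether they are the preferred ones is an assumption.
print(to_skos_turtle(
    "http://vocab.getty.edu/aat/300008458",
    {"en": "cathedral cities", "nl": "domsteden"},
))
```

Full IRIs are written out instead of prefixes so that the output of several calls can be concatenated into one file without a shared prefix header.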


EndPointInternalError: EndPointInternalError: The endpoint returned the HTTP status code 500.


```

No labels there

If we then check what statements are made in AATOut_2Terms.ttl we see statements made about different labels of aat:300008458

Query
```sparql
SELECT * WHERE { <http://vocab.getty.edu/aat/300008475> ?p ?v}
```

Results: 
```txt
-------------------------------------------------------------------------------------------------------------------------
| p                                              | v                                                                    |
=========================================================================================================================
| <http://www.w3.org/2008/05/skos-xl#prefLabel>  | <http://vocab.getty.edu/aat/term/1000754126-zh-Latn-pinyin-x-notone> |
| <http://www.w3.org/2008/05/skos-xl#prefLabel>  | <http://vocab.getty.edu/aat/term/1000754124-zh-Hant>                 |
| <http://www.w3.org/2008/05/skos-xl#prefLabel>  | <http://vocab.getty.edu/aat/term/1000468934-nl>                      |
| <http://www.w3.org/2008/05/skos-xl#prefLabel>  | <http://vocab.getty.edu/aat/term/1000008475-en>                      |
| <http://www.w3.org/2008/05/skos-xl#prefLabel>  | <http://vocab.getty.edu/aat/term/1000754128-zh-Latn-wadegile>        |
| <http://www.w3.org/2008/05/skos-xl#prefLabel>  | <http://vocab.getty.edu/aat/term/1000416350-es>                      |
| <http://www.w3.org/2008/05/skos-xl#prefLabel>  | <http://vocab.getty.edu/aat/term/1000754127-zh-Latn-pinyin-x-hanyu>  |
| <http://vocab.getty.edu/ontology#prefLabelGVP> | <http://vocab.getty.edu/aat/term/1000008475-en>                      |
| <http://www.w3.org/2008/05/skos-xl#altLabel>   | <http://vocab.getty.edu/aat/term/1000754125-zh-Hant>                 |
| <http://www.w3.org/2008/05/skos-xl#altLabel>   | <http://vocab.getty.edu/aat/term/1000468935-nl>                      |
-------------------------------------------------------------------------------------------------------------------------
```

which show us that 3 properties are used to indicate labels skosXL:prefLabel, gvp:prefLabelGVP and skosXL:altLabel. I believe we should focus on the skosXL:prefLabel (preferred label) and ignore the rest. Note that for the concept in question there is 1 skosXL:prefLabel in English (http://vocab.getty.edu/aat/term/1000008475-en) and another for Dutch (http://vocab.getty.edu/aat/term/1000468934-nl)

If we query the statements made in AATOut_2Terms.ttl about that term `SELECT * WHERE {<http://vocab.getty.edu/aat/term/1000468934-nl> ?p ?v}` we get the following results

```txt
| p                                                 | v                                                       |
===============================================================================================================
| <http://purl.org/dc/elements/1.1/identifier>      | "1000468934"                                            |
| <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> | <http://www.w3.org/2008/05/skos-xl#Label>               |
| <http://www.w3.org/2008/05/skos-xl#literalForm>   | "<nederzettingen naar locatie: topografisch>"@nl        |
| <http://vocab.getty.edu/ontology#term>            | "nederzettingen naar locatie: topografisch"@nl          |
| <http://vocab.getty.edu/ontology#termType>        | <http://vocab.getty.edu/term/type/Descriptor>           |
| <http://purl.org/dc/terms/language>               | <http://vocab.getty.edu/language/nl>                    |
| <http://vocab.getty.edu/ontology#displayOrder>    | "7"^^<http://www.w3.org/2001/XMLSchema#positiveInteger> |
---------------------------------------------------------------------------------------------------------------
```
Similar for http://vocab.getty.edu/aat/term/1000008475-en  `SELECT * WHERE {<http://vocab.getty.edu/aat/term/1000008475-en> ?p ?v}` we get the following results


```txt
===============================================================================================================
| <http://purl.org/dc/elements/1.1/identifier>      | "1000008475"                                            |
| <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> | <http://www.w3.org/2008/05/skos-xl#Label>               |
| <http://vocab.getty.edu/ontology#termPOS>         | <http://vocab.getty.edu/term/POS/NeuterAdjectival>      |
| <http://www.w3.org/2008/05/skos-xl#literalForm>   | "<settlements by location: topographical>"@en           |
| <http://vocab.getty.edu/ontology#term>            | "settlements by location: topographical"@en             |
| <http://vocab.getty.edu/ontology#termType>        | <http://vocab.getty.edu/term/type/Descriptor>           |
| <http://purl.org/dc/terms/language>               | <http://vocab.getty.edu/language/en>                    |
| <http://vocab.getty.edu/ontology#displayOrder>    | "1"^^<http://www.w3.org/2001/XMLSchema#positiveInteger> |
---------------------------------------------------------------------------------------------------------------
```

which means that for each concept, we need
* the skos-xl#prefLabel values (named entities) that terminate in `-en` or `-nl`
* for each of those skos-xl#prefLabel values extract the `gpv#term` value

# How to transform onto concept + en(label) + nl(label)?

concept aat:300008458
        a gvp:Concept -> a skos:Concept
        <http://purl.org/dc/terms/license>  <http://opendatacommons.org/licenses/by/1.0/>;
        <http://www.w3.org/2004/02/skos/core#inScheme>  <http://vocab.getty.edu/aat/>;
        skos:prefLabel ... @en
        skos:prefLabel ... @nl

<http://vocab.getty.edu/aat/300008458>
        <http://vocab.getty.edu/ontology#displayOrder>  "1"^^<http://www.w3.org/2001/XMLSchema#positiveInteger>;
        <http://vocab.getty.edu/ontology#parentStringAbbrev>  "cathedral centers, <settlements by function: administrative>, ... Objects Facet";
        <http://vocab.getty.edu/ontology#parentString>  "cathedral centers, <settlements by function: administrative>, <settlements by function>, inhabited places, Settlements and Landscapes (hierarchy name), Built Environment (hierarchy name), Objects Facet";
        <http://purl.org/dc/elements/1.1/identifier>  "300008458";
        <http://creativecommons.org/ns#license>  <http://opendatacommons.org/licenses/by/1.0/>;
        <http://www.w3.org/2004/02/skos/core#inScheme>  <http://vocab.getty.edu/aat/>;
        <http://rdfs.org/ns/void#inDataset>  <http://vocab.getty.edu/dataset/aat>;
        <http://purl.org/dc/terms/license>  <http://opendatacommons.org/licenses/by/1.0/>;
        a       <http://vocab.getty.edu/ontology#Concept> .
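
One way to produce the reduced record outlined above is to write it out as Turtle once the English and Dutch labels are known. The sketch below is an assumption about how that could look, not an established pipeline: the function names are hypothetical and the labels passed in are placeholders, not the actual AAT terms for this concept.

```python
AAT = "http://vocab.getty.edu/aat/"
LICENSE = "http://opendatacommons.org/licenses/by/1.0/"
SKOS = "http://www.w3.org/2004/02/skos/core#"


def escape_literal(text):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def simple_concept_turtle(subject_id, labels):
    """Render a concept as skos:Concept with license, scheme, and en/nl labels."""
    lines = [
        f"<{AAT}{subject_id}>",
        f"        a <{SKOS}Concept>;",
        f"        <http://purl.org/dc/terms/license>  <{LICENSE}>;",
        f"        <{SKOS}inScheme>  <{AAT}>;",
    ]
    for lang in ("en", "nl"):
        if lang in labels:
            literal = escape_literal(labels[lang])
            lines.append(f'        <{SKOS}prefLabel>  "{literal}"@{lang};')
    # Close the statement: replace the trailing ";" with " .".
    lines[-1] = lines[-1][:-1] + " ."
    return "\n".join(lines)


# Placeholder labels, for illustration only.
turtle = simple_concept_turtle(
    "300008458", {"en": "example label", "nl": "voorbeeldlabel"}
)
print(turtle)
```

For anything beyond a quick experiment, an RDF library would be a safer choice than string formatting, since it takes care of escaping and serialization.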


## The AAT in other projects

- **[Termennetwerk](https://termennetwerk.netwerkdigitaalerfgoed.nl/)**: This project presents a Dutch search engine for terms and links them to their URIs in various thesauri, including the AAT.

- **[Europeana](https://www.europeana.eu/en)**: An online information portal that provides access to millions of cultural heritage resources by aggregating metadata from museums, libraries, and archives across Europe. Some of its records are encoded with AAT URIs, which are used to retrieve translations of the records [(source)](https://doi.org/10.7152/nasko.v5i1.15179).

- **[A Methodology for Semantic Enrichment of Cultural Heritage Images Using Artificial Intelligence Technologies](https://doi.org/10.3390/jimaging7080121)**: This paper proposes a methodology for analyzing and enriching a collection of cultural images, specifically focusing on food in a cultural context. It uses ontologies, including the AAT, to represent concepts consistently, and combines them with computer vision tools to enhance image descriptions.

- **[Linking HBIM Graphical and Semantic Information Through the Getty AAT](https://doi.org/10.1088/1757-899X/364/1/012100)**: This study explores the integration of graphical and semantic information in Historic Building Information Modeling (HBIM) using the AAT. It develops a method that automatically links elements from a 3D model of a Spanish castle to concepts in the Getty AAT.




## Other Getty Vocabularies

### [ULAN](https://www.getty.edu/research/tools/vocabularies/ulan/about.html)
The AAT is one of the vocabularies developed by the Getty Research Institute. There are [more](https://www.getty.edu/research/tools/vocabularies/index.html), for example the Union List of Artist Names (ULAN). It contains records for artists and agents in the cultural landscape, specifically the visual arts. This includes [names, relationships, and biographical information for makers and other people and corporate bodies](https://www.getty.edu/research/tools/vocabularies/ulan/about.html#scope).

Similar to the AAT, the ULAN consists of *records*, each referring to a unique person, institute, or corporation. A record can contain multiple *terms* that may capture [given names, pseudonyms, variant spellings, names in multiple languages, and names that have changed over time](https://www.getty.edu/research/tools/vocabularies/ulan/about.html#scope). A minimal record contains the following fields: record type, name, name source, display biography, nationality, role, birth date, and death date.

See an overview of the ULAN Facets below. Information is taken from the [ULAN documentation](https://www.getty.edu/research/tools/vocabularies/ulan/about.html#scope).


- `PERSONS, ARTISTS` Facet: represents information about individuals involved in the creation or production of works of art or architecture (e.g., *Rembrandt van Rijn*)
- `CORPORATE BODIES` Facet: represents information about corporate bodies, defined as two or more people working together to create or produce art or architecture (e.g., *Adler and Sullivan*)
- `NON-ARTISTS` Facet: mostly represents patrons, who often had input in the creative process, and occasionally donors, sitters, and others whose names are required for indexing visual works but who are not themselves artists
- `UNKNOWN PEOPLE BY CULTURE` Facet: refers to the generic culture in which a work was created (e.g., *unknown Aztec*, or simply *Aztec*)
- `UNIDENTIFIED NAMED PEOPLE` Facet: people or corporate bodies whose identity is knowable, but has not yet been thoroughly researched

Records may be linked through associative relationships, including professional relationships (e.g., *assistant of*, *influenced*) and familial relationships (e.g., *child of*, *sibling of*). See [here](https://www.getty.edu/vow/ULANFullDisplay?find=van+gogh&role=&nation=&page=1&subjectid=500115588) for an example of a ULAN record.

The ULAN is also available as Linked Open Data and can be accessed through the SPARQL endpoint. Most ULAN records are in English, but notes and descriptions are sometimes translated into other languages, including Dutch.


### Iconography Authority

## AAT and Data Station SSH

In the context of Data Station SSH, the AAT can be useful as a source of keywords, using the record names. However, some form of curation is probably needed to select terms at a relevant granularity. The section below outlines challenges, recommendations, and some concrete properties that can help with this curation.

### Challenges
- **Selecting the right granularity**: The main challenge is to find a way to decide which concepts should be included or excluded.
  - **Size**: The AAT contains more than 70,000 records that all refer to unique concepts. If we used the Dutch terms for each concept, we would have 50,000 keywords to choose from. Many of these terms may be topically irrelevant, too specific, or too broad for our purposes.
  - **Lack of generalizability across facets**: Each facet has its own internal logic. A path that leads to useful terms in one facet might not translate to another facet.




### Recommendations 
- Curation per facet: establish, for each facet, a path to a layer that yields keywords at the desired level of granularity. For example, in the *Associated Concepts* facet, the [leaf nodes](https://www.getty.edu/vow/AATHierarchy?find=beauty&logic=AND&note=&subjectid=300055938) generally seem to be useful keywords, whereas the hierarchy names are quite generic. In contrast, in the *Physical Attributes* facet, some of the [leaf nodes](https://www.getty.edu/vow/AATHierarchy?find=beauty&logic=AND&note=&subjectid=300185404) seem too specific to be useful keywords (e.g., *halation*, *oxidative-reductive deterioration*).
- Some facets may be excluded entirely (e.g., *Brand Names*, possibly *Activities*).

The following properties are useful for the process of selecting the records: 
- `Subject_ID`: This is the unique identifier for a concept record.  
- `Parent_ID`: The identifier of a record's parent, which allows tracing where a record sits in the hierarchy
- `Term_Language`: Makes it possible to keep only Dutch and English entries
- `Term_ID`: The unique identifier for a term
- `Term_Text`: The text or description belonging to a term
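
As a rough illustration of how these properties could be combined, the sketch below keeps only Dutch and English terms and groups them per `Subject_ID`. It assumes the terms are available as a CSV table with the property names above as column headers and with language values spelled `Dutch` and `English`. Both are assumptions about the export format, and the rows are made-up sample data.

```python
import csv
import io
from collections import defaultdict

KEEP_LANGUAGES = {"Dutch", "English"}  # assumed spelling of the language values


def terms_by_subject(rows, languages=KEEP_LANGUAGES):
    """Group Term_Text values per Subject_ID, keeping only the given languages."""
    grouped = defaultdict(dict)
    for row in rows:
        language = row["Term_Language"]
        if language in languages:
            grouped[row["Subject_ID"]].setdefault(language, []).append(
                row["Term_Text"]
            )
    return dict(grouped)


# Hypothetical sample table; a real export would be read from a file instead.
sample = io.StringIO(
    "Subject_ID,Term_ID,Term_Language,Term_Text\n"
    "1,10,English,example term\n"
    "1,11,Dutch,voorbeeldterm\n"
    "1,12,Spanish,término de ejemplo\n"
    "2,20,English,another term\n"
)
terms = terms_by_subject(csv.DictReader(sample))
```

Concepts without a Dutch term (such as the second sample record) remain visible in the result, which makes it easy to count how many keywords would be missing a translation.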


Each node belongs to one of the following `Record_Type` categories: 
- Facet
- Hierarchy
- Concept
- GuideTerm (placeholder to create a level in the hierarchy)
- (ScopeNote)
- (ObsoleteSubject)

These labels can help filter out unnecessary records. For example, Facet names or GuideTerms will rarely, if ever, be useful as keywords. 
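
A filter on `Record_Type` could look like the sketch below. The record dictionaries and their identifiers are hypothetical, and keeping only `Concept` records is one possible choice rather than a fixed rule.

```python
from collections import Counter

KEYWORD_RECORD_TYPES = {"Concept"}  # assumed choice: only concepts become keywords


def keyword_candidates(records, keep=KEYWORD_RECORD_TYPES):
    """Keep only the records whose Record_Type is in `keep`."""
    return [record for record in records if record["Record_Type"] in keep]


def dropped_per_type(records, keep=KEYWORD_RECORD_TYPES):
    """Count how many records of each excluded type would be filtered out."""
    return Counter(
        record["Record_Type"]
        for record in records
        if record["Record_Type"] not in keep
    )


# Hypothetical sample records.
records = [
    {"Subject_ID": "1", "Record_Type": "Facet"},
    {"Subject_ID": "2", "Record_Type": "Hierarchy"},
    {"Subject_ID": "3", "Record_Type": "GuideTerm"},
    {"Subject_ID": "4", "Record_Type": "Concept"},
    {"Subject_ID": "5", "Record_Type": "Concept"},
]
candidates = keyword_candidates(records)
dropped = dropped_per_type(records)
```

Reporting the dropped counts alongside the candidates helps check that the filter removes what was intended before it is applied to the full vocabulary.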

Moreover, SKOS predicates such as `broader` can be used to define the desired paths.
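
To make the idea of a path concrete, the sketch below works on `(child, parent)` pairs, which could be derived from `broader` statements or from `Subject_ID`/`Parent_ID` columns. It selects either the layer a fixed number of steps below a chosen node or all leaf nodes below it. The function names and the single-letter identifiers are hypothetical.

```python
from collections import defaultdict


def children_index(broader_pairs):
    """Map each parent to the set of its direct children."""
    index = defaultdict(set)
    for child, parent in broader_pairs:
        index[parent].add(child)
    return index


def layer_below(broader_pairs, root, depth):
    """Return the nodes exactly `depth` broader-steps below `root`."""
    index = children_index(broader_pairs)
    layer = {root}
    for _ in range(depth):
        layer = {child for node in layer for child in index.get(node, ())}
    return layer


def leaves_below(broader_pairs, root):
    """Return all descendants of `root` that have no children themselves."""
    index = children_index(broader_pairs)
    leaves, seen, stack = set(), set(), [root]
    while stack:
        node = stack.pop()
        if node in seen:  # a node may be reachable through several parents
            continue
        seen.add(node)
        children = index.get(node)
        if children:
            stack.extend(children)
        elif node != root:
            leaves.add(node)
    return leaves


# Hypothetical hierarchy: a > (b, c), b > (d, e), d > f
pairs = [("b", "a"), ("c", "a"), ("d", "b"), ("e", "b"), ("f", "d")]
second_layer = layer_below(pairs, "a", 2)
leaves = leaves_below(pairs, "a")
```

A fixed depth suits facets where one layer has the right granularity, while the leaf selection suits facets where the most specific terms are the useful ones. The choice would have to be made per facet.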

![example of the use of different terms](img/008-complex-hierarchy.png)

[image source](https://vocab.getty.edu/doc/#Term)
