# Exploring Baroque Ceiling Painting Data in the NFDI4Culture Knowledge Graph

This notebook is a starting point for a data story about baroque art and ceiling paintings using the NFDI4Culture Knowledge Graph.

Focus:
- Work with **data portals** (especially CbDD and the Color Slide Archive of Wall and Ceiling Painting)
- Use **SPARQL** to query the KG
- Prepare results for visualisation (maps, timelines, comparisons)

You can adapt the queries step by step as you learn more about the concrete RDF schema of the datasets.

In [151]:
# Install dependencies (run once per environment)
!pip install SPARQLWrapper pandas matplotlib --quiet




In [152]:
from SPARQLWrapper import SPARQLWrapper, JSON
import pandas as pd
import matplotlib.pyplot as plt

pd.set_option("display.max_rows", 50)
pd.set_option("display.max_columns", 20)
pd.set_option("display.width", 120)

# NFDI4Culture SPARQL endpoint
ENDPOINT_URL = "https://nfdi4culture.de/sparql"

# Prefixes used in queries
# NOTE: The KG uses http://schema.org/ (not https://)
PREFIXES = """\
PREFIX fabio: <http://purl.org/spar/fabio/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX nfdicore: <https://nfdi.fiz-karlsruhe.de/ontology/>
PREFIX schema:  <http://schema.org/>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX dcat:    <http://www.w3.org/ns/dcat#>
PREFIX n4c:     <https://nfdi4culture.de/id/>
"""

def run_sparql(query: str) -> pd.DataFrame:
    """Run a SPARQL query against the NFDI4Culture endpoint and return a pandas DataFrame.

    The query body should *not* include prefixes; they are prepended automatically.
    The JSON result is read with `.get()`, so a missing key yields an empty DataFrame
    instead of a KeyError (this also avoids indexing complaints from static type checkers).
    """
    sparql = SPARQLWrapper(ENDPOINT_URL)
    sparql.setReturnFormat(JSON)
    sparql.setQuery(PREFIXES + "\n" + query)
    results = sparql.query().convert()

    # Be defensive: ensure results is a dict and extract bindings safely
    if not isinstance(results, dict):
        return pd.DataFrame()

    bindings = results.get("results", {}).get("bindings", [])
    rows = []
    for binding in bindings:
        # each binding is a dict of variable -> { "type": ..., "value": ... }
        row = {var: val.get("value") for var, val in binding.items()}
        rows.append(row)
    return pd.DataFrame(rows)
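
To make the flattening step concrete, here is a minimal offline sketch. The response below is hand-written in the SPARQL JSON results shape (it is not real endpoint output), and `flatten_bindings` is a hypothetical helper that mirrors the loop inside `run_sparql`.

```python
# Hand-written sample in the SPARQL JSON results format (illustrative, not real endpoint output)
sample_response = {
    "head": {"vars": ["p", "o"]},
    "results": {
        "bindings": [
            {
                "p": {"type": "uri", "value": "http://schema.org/hasPart"},
                "o": {"type": "uri", "value": "https://nfdi4culture.de/id/E6077"},
            },
            {
                # A variable that is unbound in a row is simply absent from the binding
                "p": {"type": "uri", "value": "http://schema.org/description"},
            },
        ]
    },
}


def flatten_bindings(response):
    """Hypothetical helper: reduce each binding to {variable: value}, as run_sparql does."""
    bindings = response.get("results", {}).get("bindings", [])
    return [{var: val.get("value") for var, val in b.items()} for b in bindings]


rows = flatten_bindings(sample_response)
print(rows)
```

Unbound variables are missing from a row, so pandas fills them with NaN once the rows become a DataFrame. This is why optional columns such as `lat` and `lon` are mostly empty in later outputs.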

## 1. Inspect the CbDD portal (Corpus of Baroque Ceiling Painting in Germany)

- Portal ID from the registry: `n4c:E4264`
- Goal: See which properties connect the portal to data feeds, homepages, subjects, etc.

Run this once and scan the property list. It tells you which predicates to use in later queries.

In [153]:
query_inspect_cbdd = """\
SELECT ?p ?o
WHERE {
  n4c:E4264 ?p ?o .
}
ORDER BY ?p
LIMIT 200
"""

df_cbdd_props = run_sparql(query_inspect_cbdd)
df_cbdd_props

Unnamed: 0,p,o
0,http://schema.org/contributor,nodeID://b694781
1,http://schema.org/contributor,nodeID://b696236
2,http://schema.org/contributor,nodeID://b698637
3,http://schema.org/contributor,nodeID://b700108
4,http://schema.org/description,\n The Corpus of Baroque Ceiling Painting i...
5,http://schema.org/hasPart,https://nfdi4culture.de/id/E6077
6,http://schema.org/image,https://nfdi4culture.de//fileadmin/user_upload...
7,http://schema.org/keywords,https://nfdi4culture.de/id/E3953
8,http://schema.org/keywords,https://nfdi4culture.de/id/E3959
9,http://schema.org/keywords,https://nfdi4culture.de/id/E3968


## 2. Find the data feeds / datasets that belong to CbDD

From the inspection above, identify the property that links the portal to its parts.
Typical options are:
- `schema:hasPart`
- `dcterms:hasPart`

The query below checks several candidate predicates at once. If the portal uses a different property, add it to the `FILTER` list or replace `?predicate` with the exact predicate.

In [154]:
# Try multiple approaches to find parts/feeds of the CbDD portal

# Approach 1: Portal has parts (portal -> part)
query_cbdd_parts_v1 = """
SELECT ?part ?partLabel ?partType ?predicate
WHERE {
  n4c:E4264 ?predicate ?part .
  FILTER(?predicate IN (schema:hasPart, dcterms:hasPart, dcat:dataset, dcat:distribution))

  OPTIONAL { ?part schema:name ?partLabel . }
  OPTIONAL { ?part rdf:type ?partType . }
}
ORDER BY ?partLabel
LIMIT 50
"""

df_cbdd_parts_v1 = run_sparql(query_cbdd_parts_v1)
print("Approach 1 - Portal hasPart/dataset:")
print(df_cbdd_parts_v1)
print("\n" + "="*60 + "\n")

# Approach 2: Parts point to portal (part -> portal via isPartOf)
query_cbdd_parts_v2 = """
SELECT ?part ?partLabel ?partType ?predicate
WHERE {
  ?part ?predicate n4c:E4264 .
  FILTER(?predicate IN (schema:isPartOf, dcterms:isPartOf, dcat:inCatalog))

  OPTIONAL { ?part schema:name ?partLabel . }
  OPTIONAL { ?part rdf:type ?partType . }
}
ORDER BY ?partLabel
LIMIT 50
"""

df_cbdd_parts_v2 = run_sparql(query_cbdd_parts_v2)
print("Approach 2 - Part isPartOf portal:")
print(df_cbdd_parts_v2)
print("\n" + "="*60 + "\n")

# Approach 3: Check all outgoing predicates from the portal to find the right one
query_cbdd_all_out = """
SELECT ?predicate (COUNT(?object) AS ?count) (SAMPLE(?object) AS ?sampleObject)
WHERE {
  n4c:E4264 ?predicate ?object .
}
GROUP BY ?predicate
ORDER BY DESC(?count)
LIMIT 30
"""

df_cbdd_predicates = run_sparql(query_cbdd_all_out)
print("All outgoing predicates from CbDD portal:")
print(df_cbdd_predicates)

# Use whichever approach returned results
df_cbdd_parts = df_cbdd_parts_v1 if not df_cbdd_parts_v1.empty else df_cbdd_parts_v2

Approach 1 - Portal hasPart/dataset:
                               part                                          partLabel  \
0  https://nfdi4culture.de/id/E6077  Metadata from the Corpus of Baroque Ceiling Pa...   
1  https://nfdi4culture.de/id/E6077  Metadata from the Corpus of Baroque Ceiling Pa...   
2  https://nfdi4culture.de/id/E6077  Metadata from the Corpus of Baroque Ceiling Pa...   
3  https://nfdi4culture.de/id/E6077  Metadata from the Corpus of Baroque Ceiling Pa...   

                                         partType                  predicate  
0      http://www.w3.org/ns/hydra/core#Collection  http://schema.org/hasPart  
1              http://purl.org/spar/fabio/Dataset  http://schema.org/hasPart  
2                      http://schema.org/DataFeed  http://schema.org/hasPart  
3  https://nfdi.fiz-karlsruhe.de/ontology/Dataset  http://schema.org/hasPart  


Approach 2 - Part isPartOf portal:
Empty DataFrame
Columns: []
Index: []


All outgoing predicates from CbDD portal

In [155]:
# Approach 4: Check all incoming predicates to the portal (things that reference n4c:E4264)
query_cbdd_all_in = """
SELECT ?predicate (COUNT(?subject) AS ?count) (SAMPLE(?subject) AS ?sampleSubject)
WHERE {
  ?subject ?predicate n4c:E4264 .
}
GROUP BY ?predicate
ORDER BY DESC(?count)
LIMIT 30
"""

df_cbdd_incoming = run_sparql(query_cbdd_all_in)
print("All incoming predicates TO CbDD portal (things pointing to it):")
print(df_cbdd_incoming)

# If we found incoming predicates, let's explore the subjects
if not df_cbdd_incoming.empty and 'sampleSubject' in df_cbdd_incoming.columns:
    sample_subj = df_cbdd_incoming.iloc[0]['sampleSubject']
    print(f"\nSample subject pointing to portal: {sample_subj}")
    
    # Inspect that sample subject
    query_sample_subj = f"""
    SELECT ?p ?o
    WHERE {{
      <{sample_subj}> ?p ?o .
    }}
    LIMIT 50
    """
    df_sample_subj = run_sparql(query_sample_subj)
    print("\nProperties of that sample subject:")
    print(df_sample_subj)

All incoming predicates TO CbDD portal (things pointing to it):
                                 predicate count                     sampleSubject
0              http://schema.org/subjectOf    13  https://nfdi4culture.de/id/E2971
1  http://schema.org/includedInDataCatalog     1  https://nfdi4culture.de/id/E6077

Sample subject pointing to portal: https://nfdi4culture.de/id/E2971

Properties of that sample subject:
                                                    p                                                  o
0     http://www.w3.org/1999/02/22-rdf-syntax-ns#type                      http://schema.org/DefinedTerm
1     http://www.w3.org/1999/02/22-rdf-syntax-ns#type  https://nfdi.fiz-karlsruhe.de/ontology/NFDI_00...
2          http://www.w3.org/2000/01/rdf-schema#label                                               JPEG
3        http://www.w3.org/2000/01/rdf-schema#seeAlso        https://nfdi4culture.de/resource/E2971.json
4        http://www.w3.org/2000/01/rdf-schema#seeAlso    

Look at `df_cbdd_parts`. Identify the URI of the **metadata feed** for CbDD (for example something like `n4c:E6077`).

Copy that feed URI into the variable below. This will be your **main entrypoint** into individual ceiling painting records.
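
Query results contain full URIs, while the queries in this notebook use the prefixed form. The sketch below shows one way to convert between them. `to_curie` and the `NAMESPACES` table are hypothetical helpers (not part of SPARQLWrapper), and only two of the prefixes declared earlier are listed.

```python
# Namespace table mirroring two of the PREFIX declarations used in this notebook
NAMESPACES = {
    "n4c": "https://nfdi4culture.de/id/",
    "schema": "http://schema.org/",
}


def to_curie(uri: str) -> str:
    """Hypothetical helper: shorten a full URI to prefix:local, or wrap it in <...>."""
    for prefix, base in NAMESPACES.items():
        local = uri[len(base):]
        # Only shorten when the remainder is a simple local name
        if uri.startswith(base) and local and local.replace("_", "").isalnum():
            return f"{prefix}:{local}"
    return f"<{uri}>"


print(to_curie("https://nfdi4culture.de/id/E6077"))
print(to_curie("https://www.deckenmalerei.eu/bf9a1cc3-1df5-4f19-b807-67df6c385c2f"))
```

URIs outside the known namespaces, such as the painting URIs at `deckenmalerei.eu`, come back wrapped in angle brackets, which is the form SPARQL expects for a full IRI.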

In [156]:
# CbDD feed URI found in df_cbdd_parts (the schema:hasPart target of the portal).
# Keep the prefixed form so it can be interpolated directly into the queries below.
CBDD_FEED_URI = "n4c:E6077"  # change this if the previous cell shows a different feed ID

In [157]:
# First, let's inspect the E6077 feed to understand its structure
query_inspect_feed = """
SELECT ?p ?o
WHERE {
  n4c:E6077 ?p ?o .
}
ORDER BY ?p
LIMIT 100
"""

df_feed_props = run_sparql(query_inspect_feed)
print("Properties of the CbDD feed (E6077):")
print(df_feed_props)
print("\n" + "="*60 + "\n")

# Also check what points TO the feed (incoming relations)
query_feed_incoming = """
SELECT ?predicate (COUNT(?s) AS ?count) (SAMPLE(?s) AS ?sampleSubject)
WHERE {
  ?s ?predicate n4c:E6077 .
}
GROUP BY ?predicate
ORDER BY DESC(?count)
LIMIT 20
"""

df_feed_incoming = run_sparql(query_feed_incoming)
print("Incoming predicates to the feed (what points to E6077):")
print(df_feed_incoming)

Properties of the CbDD feed (E6077):
                                    p                                                  o
0       http://schema.org/contributor                                   nodeID://b696914
1       http://schema.org/contributor                                   nodeID://b698482
2       http://schema.org/contributor                                   nodeID://b698789
3       http://schema.org/contributor                                   nodeID://b700071
4   http://schema.org/dataFeedElement  https://nfdi4culture.de/id/ark:/60538/E6077_00...
..                                ...                                                ...
95  http://schema.org/dataFeedElement  https://nfdi4culture.de/id/ark:/60538/E6077_0a...
96  http://schema.org/dataFeedElement  https://nfdi4culture.de/id/ark:/60538/E6077_0a...
97  http://schema.org/dataFeedElement  https://nfdi4culture.de/id/ark:/60538/E6077_0b...
98  http://schema.org/dataFeedElement  https://nfdi4culture.de/id/ark:/60

## 3. Sample ceiling-painting records from the CbDD feed

Pattern used here (adjust if inspection shows different properties):
- The feed lists its records: `CBDD_FEED_URI schema:dataFeedElement ?feedItem`, then `?feedItem schema:item ?painting`
- Each record has a title: `rdfs:label`
- Optional location (`schema:latitude` and `schema:longitude` on the record)
- Optional date (`CTO_0001073`, stored as free text)

You can extend this with more properties after you inspect one of the `?item` URIs.
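
The feed inspection above lists `schema:dataFeedElement` links, and the following cells reach each record through `schema:item`. Since every later query repeats those two triple patterns, a small helper can generate them. This is only a sketch: `feed_items_pattern` is a hypothetical name, and the default feed ID is the one set in `CBDD_FEED_URI`.

```python
def feed_items_pattern(feed: str = "n4c:E6077", record_var: str = "painting") -> str:
    """Hypothetical helper: triple patterns leading from a data feed to its records."""
    return (
        f"  {feed} schema:dataFeedElement ?feedItem .\n"
        f"  ?feedItem schema:item ?{record_var} .\n"
    )


# Compose a counting query from the shared pattern
query_count = (
    "SELECT (COUNT(DISTINCT ?painting) AS ?n)\n"
    "WHERE {\n"
    + feed_items_pattern()
    + "}"
)
print(query_count)
```

The resulting string can be passed to `run_sparql` unchanged, because the prefixes are prepended there.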

In [158]:
# The feed uses schema:dataFeedElement -> DataFeedItem -> schema:item -> actual painting
# Let's first understand the structure of the DataFeedItems

query_item_predicates = """
SELECT ?p (COUNT(?o) AS ?count) (SAMPLE(?o) AS ?sampleValue)
WHERE {
  n4c:E6077 schema:dataFeedElement ?item .
  ?item ?p ?o .
}
GROUP BY ?p
ORDER BY DESC(?count)
LIMIT 30
"""

df_item_preds = run_sparql(query_item_predicates)
print("Predicates on DataFeedItem objects:")
print(df_item_preds)
print("\n" + "="*60 + "\n")

# Now let's explore the actual paintings (via schema:item)
query_painting_predicates = f"""
SELECT ?p (COUNT(?o) AS ?count) (SAMPLE(?o) AS ?sampleValue)
WHERE {{
  {CBDD_FEED_URI} schema:dataFeedElement ?feedItem .
  ?feedItem schema:item ?painting .
  ?painting ?p ?o .
}}
GROUP BY ?p
ORDER BY DESC(?count)
LIMIT 50
"""

df_painting_preds = run_sparql(query_painting_predicates)
print("All predicates used by actual paintings (via schema:item):")
df_painting_preds

Predicates on DataFeedItem objects:
                                                 p count                                        sampleValue
0                   http://schema.org/dateModified  6228                                         2025-09-08
1                           http://schema.org/item  6228  https://www.deckenmalerei.eu/f128e020-2dc2-4cf...
2  http://www.w3.org/1999/02/22-rdf-syntax-ns#type  6228                     http://schema.org/DataFeedItem
3                    http://schema.org/dateCreated  6228                                         2024-11-16


All predicates used by actual paintings (via schema:item):


Unnamed: 0,p,count,sampleValue
0,https://nfdi4culture.de/ontology/CTO_0001026,23359,http://vocab.getty.edu/aat/300411453
1,https://nfdi4culture.de/ontology/CTO_0001009,6672,nodeID://b2644059
2,https://nfdi4culture.de/ontology/CTO_0001025,6230,nodeID://b2652610
3,http://www.w3.org/2000/01/rdf-schema#label,6228,Fassadenmalerei: allegorische Darstellung
4,https://nfdi.fiz-karlsruhe.de/ontology/NFDI_00...,6228,https://nfdi4culture.de/id/E6404
5,https://nfdi.fiz-karlsruhe.de/ontology/NFDI_00...,6228,https://nfdi4culture.de/id/E2430
6,https://nfdi4culture.de/ontology/CTO_0001049,6228,https://nfdi4culture.de/ontology/CTO_0001047
7,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,6228,https://nfdi4culture.de/ontology/CTO_0001005
8,https://nfdi4culture.de/ontology/CTO_0001006,6228,https://nfdi4culture.de/id/E6077
9,https://nfdi.fiz-karlsruhe.de/ontology/NFDI_00...,6228,https://www.deckenmalerei.eu/11db6ad0-c4c3-11e...


In [159]:
# Let's get sample actual painting records with their key properties
# Based on the predicates discovered above:
# - rdfs:label = title/name
# - CTO_0001073 = year (5527 records have this)
# - schema:latitude/longitude = coordinates (1244 have geo)
# - CTO_0001026 = ICONCLASS/AAT subjects (23359 - multiple per painting)
# - schema:associatedMedia = images (4596)

query_sample_paintings = f"""
SELECT ?painting ?label ?year ?lat ?lon (SAMPLE(?iconclass) AS ?subject) (SAMPLE(?image) AS ?imageNode)
WHERE {{
  {CBDD_FEED_URI} schema:dataFeedElement ?feedItem .
  ?feedItem schema:item ?painting .
  
  # Title/label
  ?painting rdfs:label ?label .
  
  # Optional: Year
  OPTIONAL {{ ?painting <https://nfdi4culture.de/ontology/CTO_0001073> ?year . }}
  
  # Optional: Coordinates
  OPTIONAL {{
    ?painting schema:latitude ?lat .
    ?painting schema:longitude ?lon .
  }}
  
  # Optional: ICONCLASS subject
  OPTIONAL {{ ?painting <https://nfdi4culture.de/ontology/CTO_0001026> ?iconclass . }}
  
  # Optional: Associated media
  OPTIONAL {{ ?painting schema:associatedMedia ?image . }}
}}
GROUP BY ?painting ?label ?year ?lat ?lon
LIMIT 20
"""

df_sample_paintings = run_sparql(query_sample_paintings)
print(f"Sample of {len(df_sample_paintings)} ceiling paintings:")
df_sample_paintings

Sample of 20 ceiling paintings:


Unnamed: 0,painting,label,year,subject,imageNode,lat,lon
0,https://www.deckenmalerei.eu/dc95ca5c-cd0d-496...,Die Landschaften,1680-1685,http://vocab.getty.edu/aat/300411453,nodeID://b2642145,,
1,https://www.deckenmalerei.eu/402571b6-fcf6-46a...,Feld 17: Lucius Verus,um 1678,https://iconclass.org/41D26411,,,
2,https://www.deckenmalerei.eu/b7995ad0-cb4a-11e...,Bischof Konrad II. von Osnabrück,1656/57,http://vocab.getty.edu/aat/300411453,,,
3,https://www.deckenmalerei.eu/e94f0bcb-123d-453...,Morgendlicher Aufbruch zur Jagd,1726,http://vocab.getty.edu/aat/300411453,nodeID://b2654117,,
4,https://www.deckenmalerei.eu/83dc3b4d-3087-46f...,"Partenkirchen, Wallfahrts- u. Votivkirche St. ...",1704,http://vocab.getty.edu/aat/300004792,,47.49831805268119,11.111469254103664
5,https://www.deckenmalerei.eu/a34c0e10-45e1-438...,Deckengemälde: Die Kardinaltugenden,um 1735,https://iconclass.org/92D19164,nodeID://b2641736,,
6,https://www.deckenmalerei.eu/e97c28e5-3356-458...,Kartusche: „Magnus in vita“,1739,http://vocab.getty.edu/aat/300411453,nodeID://b2649569,,
7,https://www.deckenmalerei.eu/90c80383-22f8-4df...,Die Deckenmalerei im Großen Saal,1687-1702,https://iconclass.org/26C0,nodeID://b2643255,,
8,https://www.deckenmalerei.eu/294795ce-7b49-40f...,Den Hirsch verlassen die Kräfte,"1763, 1772",https://iconclass.org/43C11,nodeID://b2649228,,
9,https://www.deckenmalerei.eu/9bb75557-d137-457...,"Feldhase, Rotkelchen (?), Rosenstrauch und zwe...",1542,https://iconclass.org/25F26%28HARE%29,nodeID://b2642759,,
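
`run_sparql` keeps only the string `value` of each binding, so coordinates arrive as text and missing ones as NaN or `None`. Before plotting a map they need to be converted. A minimal sketch in plain Python follows. `to_float` and the sample rows are illustrative; the first coordinate pair is copied from the output above, with its label shortened.

```python
import math


def to_float(value):
    """Hypothetical helper: parse a coordinate, returning None for missing or bad input."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return None
    return None if math.isnan(number) else number


# Illustrative rows shaped like df_sample_paintings.to_dict("records")
records = [
    {"label": "Partenkirchen", "lat": "47.49831805268119", "lon": "11.111469254103664"},
    {"label": "Die Landschaften", "lat": None, "lon": None},
    {"label": "Broken row", "lat": float("nan"), "lon": "not a number"},
]

# Keep only rows where both coordinates parse cleanly
points = [
    (r["label"], to_float(r["lat"]), to_float(r["lon"]))
    for r in records
    if to_float(r["lat"]) is not None and to_float(r["lon"]) is not None
]
print(points)
```

Only about a fifth of the records carry coordinates, so filtering like this is needed before any map is drawn.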


### Data Summary

Based on the inspection above, here's what the CbDD dataset contains:

**Total: 6,228 ceiling paintings**

| Property | Count | Description | Example |
|----------|-------|-------------|----------|
| `rdfs:label` | 6,228 | Title/name of the painting | "Minerva, Apoll und die Musen" |
| `CTO_0001073` | 5,527 | Year/date of creation | "1720", "um 1730", "1720-1730" |
| `CTO_0001026` | 23,359 | ICONCLASS/AAT subject codes (avg. about 3.75 per painting) | `iconclass.org/92D1916` |
| `schema:associatedMedia` | 4,570 | Links to images (as ImageObject) | — |
| `schema:latitude/longitude` | 1,244 | Geographic coordinates | 48.57, 13.46 |
| `CTO_0001009` | 6,672 | Related buildings/locations (links to GND) | `gnd/118636960` |
| `CTO_0001019` | 5,363 | Part-of relationships (painting hierarchies) | — |
| `NFDI_000...` | 439 | GND identifiers for the painting itself | `gnd/7678538-5` |
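
The shares behind these counts are easy to recompute. The sketch below uses only numbers reported in the table; the variable names are illustrative, and the year count is assumed to refer to distinct paintings (the notebook confirms this only for images and coordinates).

```python
total_paintings = 6228

# Counts taken from the table above
counts = {
    "year (CTO_0001073)": 5527,
    "images (schema:associatedMedia)": 4570,
    "coordinates (schema:latitude/longitude)": 1244,
}

for name, count in counts.items():
    share = 100 * count / total_paintings
    print(f"{name}: {count} of {total_paintings} ({share:.1f}%)")

# Subject links per painting: 23,359 links spread over 6,228 paintings
subjects_per_painting = 23359 / total_paintings
print(f"Average subject codes per painting: {subjects_per_painting:.2f}")
```

Roughly nine in ten records have a date and about one in five has coordinates, which matters when choosing between a timeline and a map as the main visualisation.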

**Subject Classification:**
- Uses both **ICONCLASS** (iconographic classification) and **Getty AAT** (Art & Architecture Thesaurus)
- 4,831 unique subject codes across all paintings
- Most common: architectural elements (AAT), mythological scenes (ICONCLASS)

**Image Data:**
- Images are `schema:ImageObject` with:
  - `CTO_0001021`: Image URL (hosted at `deckenmalerei-bilder.badw.de`)
  - `CTO_0001007`: License (mostly CC BY 4.0)

**Linked Data Connections:**
- Paintings → GND (German National Library authority files)
- Paintings → ICONCLASS (iconographic subjects)
- Paintings → Getty AAT (art vocabulary)
- Paintings → Buildings via location relationships

In [160]:
# Get overall statistics about the CbDD dataset

# Count total paintings
query_total_count = f"""
SELECT (COUNT(DISTINCT ?painting) AS ?totalPaintings)
WHERE {{
  {CBDD_FEED_URI} schema:dataFeedElement ?feedItem .
  ?feedItem schema:item ?painting .
}}
"""
df_total = run_sparql(query_total_count)
print(f"Total paintings in CbDD: {df_total['totalPaintings'].iloc[0]}")

# Count paintings with coordinates
query_geo_count = f"""
SELECT (COUNT(DISTINCT ?painting) AS ?withGeo)
WHERE {{
  {CBDD_FEED_URI} schema:dataFeedElement ?feedItem .
  ?feedItem schema:item ?painting .
  ?painting schema:latitude ?lat .
  ?painting schema:longitude ?lon .
}}
"""
df_geo = run_sparql(query_geo_count)
print(f"Paintings with coordinates: {df_geo['withGeo'].iloc[0]}")

# Count paintings with images
query_image_count = f"""
SELECT (COUNT(DISTINCT ?painting) AS ?withImages)
WHERE {{
  {CBDD_FEED_URI} schema:dataFeedElement ?feedItem .
  ?feedItem schema:item ?painting .
  ?painting schema:associatedMedia ?img .
}}
"""
df_img = run_sparql(query_image_count)
print(f"Paintings with images: {df_img['withImages'].iloc[0]}")

# Count unique subject codes (CTO_0001026 holds both ICONCLASS and Getty AAT URIs)
query_iconclass_count = f"""
SELECT (COUNT(DISTINCT ?iconclass) AS ?uniqueSubjects)
WHERE {{
  {CBDD_FEED_URI} schema:dataFeedElement ?feedItem .
  ?feedItem schema:item ?painting .
  ?painting <https://nfdi4culture.de/ontology/CTO_0001026> ?iconclass .
}}
"""
df_iconclass = run_sparql(query_iconclass_count)
print(f"Unique ICONCLASS subjects: {df_iconclass['uniqueSubjects'].iloc[0]}")

Total paintings in CbDD: 6228
Paintings with coordinates: 1244
Paintings with images: 4570
Unique ICONCLASS subjects: 4831


In [161]:
# Explore the most common subjects (ICONCLASS and Getty AAT share CTO_0001026)
query_top_subjects = f"""
SELECT ?iconclass (COUNT(?painting) AS ?count)
WHERE {{
  {CBDD_FEED_URI} schema:dataFeedElement ?feedItem .
  ?feedItem schema:item ?painting .
  ?painting <https://nfdi4culture.de/ontology/CTO_0001026> ?iconclass .
}}
GROUP BY ?iconclass
ORDER BY DESC(?count)
LIMIT 20
"""

df_top_subjects = run_sparql(query_top_subjects)
print("Top 20 ICONCLASS subjects (motifs/themes):")
df_top_subjects

Top 20 ICONCLASS subjects (motifs/themes):


Unnamed: 0,iconclass,count
0,http://vocab.getty.edu/aat/300411453,4984
1,http://vocab.getty.edu/aat/300004792,1244
2,https://iconclass.org/92D1916,463
3,https://iconclass.org/26A,219
4,https://iconclass.org/25G4111,104
5,https://iconclass.org/25G3,99
6,https://iconclass.org/25G41,94
7,https://iconclass.org/48A9872,74
8,https://iconclass.org/45C22,71
9,https://iconclass.org/48C161,70
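
Since `CTO_0001026` mixes two vocabularies, it helps to separate them before ranking motifs. The sketch below classifies subject URIs by host and decodes percent-encoded ICONCLASS notations such as `25F26%28HARE%29` (seen in an earlier sample). `describe_subject` is a hypothetical helper; the URIs are copied from outputs in this notebook.

```python
from collections import Counter
from urllib.parse import unquote, urlparse


def describe_subject(uri: str) -> tuple:
    """Hypothetical helper: return (vocabulary, code) for a subject URI."""
    host = urlparse(uri).netloc
    # Last path segment, with percent-escapes such as %28 decoded
    code = unquote(uri.rstrip("/").rsplit("/", 1)[-1])
    if host == "iconclass.org":
        return ("ICONCLASS", code)
    if host == "vocab.getty.edu":
        return ("Getty AAT", code)
    return ("Other", code)


subject_uris = [
    "http://vocab.getty.edu/aat/300411453",
    "http://vocab.getty.edu/aat/300004792",
    "https://iconclass.org/92D1916",
    "https://iconclass.org/25F26%28HARE%29",
]

described = [describe_subject(u) for u in subject_uris]
print(described)
print(Counter(vocab for vocab, _ in described))
```

Applied to `df_top_subjects["iconclass"]`, the same function would make it possible to rank ICONCLASS motifs separately from the two very frequent AAT terms at the top of the list.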


In [162]:
# Explore the image data structure (schema:associatedMedia)
# The images are blank nodes, let's see what properties they have

query_image_props = f"""
SELECT ?p (COUNT(?o) AS ?count) (SAMPLE(?o) AS ?sampleValue)
WHERE {{
  {CBDD_FEED_URI} schema:dataFeedElement ?feedItem .
  ?feedItem schema:item ?painting .
  ?painting schema:associatedMedia ?image .
  ?image ?p ?o .
}}
GROUP BY ?p
ORDER BY DESC(?count)
LIMIT 20
"""

df_image_props = run_sparql(query_image_props)
print("Properties of image objects (schema:associatedMedia):")
df_image_props

Properties of image objects (schema:associatedMedia):


Unnamed: 0,p,count,sampleValue
0,https://nfdi4culture.de/ontology/CTO_0001021,4596,https://deckenmalerei-bilder.badw.de/eas/parti...
1,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,4596,http://schema.org/ImageObject
2,https://nfdi4culture.de/ontology/CTO_0001007,4511,CC BY 4.0


In [163]:
# Get sample paintings with their image URLs
# Images use CTO_0001021 for the URL (not schema:contentUrl)
query_paintings_with_images = f"""
SELECT ?painting ?label ?imageUrl ?license
WHERE {{
  {CBDD_FEED_URI} schema:dataFeedElement ?feedItem .
  ?feedItem schema:item ?painting .
  ?painting rdfs:label ?label .
  ?painting schema:associatedMedia ?image .
  ?image <https://nfdi4culture.de/ontology/CTO_0001021> ?imageUrl .
  OPTIONAL {{ ?image <https://nfdi4culture.de/ontology/CTO_0001007> ?license . }}
}}
LIMIT 15
"""

df_paintings_images = run_sparql(query_paintings_with_images)
print(f"Sample paintings with image URLs ({len(df_paintings_images)} records):")
df_paintings_images

Sample paintings with image URLs (15 records):


Unnamed: 0,painting,label,imageUrl,license
0,https://www.deckenmalerei.eu/11db6ad0-c4c3-11e...,Fassadenmalerei: allegorische Darstellung,https://deckenmalerei-bilder.badw.de/eas/parti...,© Bildarchiv Foto Marburg / CbDD / Angelika Dr...
1,https://www.deckenmalerei.eu/1e4fad40-ce49-4d0...,Die Decke des Marmosaals,https://deckenmalerei-bilder.badw.de/eas/parti...,Rechte vorbehalten
2,https://www.deckenmalerei.eu/24712313-0bed-4dd...,Landschaft mit Hasenjagd,https://deckenmalerei-bilder.badw.de/eas/parti...,© CbDD / Bayrische Schlösserverwaltung / Jan-E...
3,https://www.deckenmalerei.eu/271d168c-74ed-4ee...,Die vier Nebenbilder an der Decke des Roten Saals,https://previous.bildindex.de/bilder/fmd100297...,"Rechte vorbehalten | Originator: Scheidt, Thom..."
4,https://www.deckenmalerei.eu/284af00a-294b-4cc...,"Passau, Große Messergasse 6",https://previous.bildindex.de/bilder/fmd100448...,"CC BY-NC-ND 4.0 | Originator: Dietel, Theresa ..."
5,https://www.deckenmalerei.eu/2fff5a82-d436-48a...,Eckbilder: antikisierende Henkelvasen in Rahmu...,https://previous.bildindex.de/bilder/zi0300_00...,Rechte vorbehalten | Rights holder: Deutsches ...
6,https://www.deckenmalerei.eu/3d6fbf7c-df73-411...,"Römhild, Schloss Glücksburg",https://previous.bildindex.de/bilder/fmc445016...,"Rechte vorbehalten | Originator: Hildebrand, G..."
7,https://www.deckenmalerei.eu/4aadb214-c123-49b...,Der Kamin im Raum westlich des Saals,https://previous.bildindex.de/bilder/fmd100254...,"CC BY-NC-ND 4.0 | Originator: Lechtape, Andrea..."
8,https://www.deckenmalerei.eu/4f31c621-2435-4bf...,Decke aus Süddithmarschen aus der Nähe von Dingen,https://previous.bildindex.de/bilder/fmd100307...,"Rechte vorbehalten | Originator: Lechtape, And..."
9,https://www.deckenmalerei.eu/5014ae03-9edf-42f...,Akanthusornament,https://deckenmalerei-bilder.badw.de/eas/parti...,"Rechte vorbehalten | Rights holder: Dreyer, An..."
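
The license column is free text. In the samples above it often has the shape `<license> | Originator: ... | Rights holder: ...`, while other values are plain copyright lines. Assuming ` | ` is the separator and `Key: value` the field format (an inference from the truncated samples, not a documented schema), a hypothetical `parse_license` helper could split it as follows. The names in the example string are placeholders.

```python
def parse_license(text):
    """Hypothetical helper: split a rights statement into its labelled parts."""
    result = {"license": None}
    if not isinstance(text, str) or not text.strip():
        return result  # missing values (None, NaN) stay empty
    parts = [part.strip() for part in text.split("|")]
    result["license"] = parts[0]
    for part in parts[1:]:
        key, sep, value = part.partition(":")
        if sep:  # keep only "Key: value" fragments
            result[key.strip().lower()] = value.strip()
    return result


# Placeholder names; the real values are truncated in the output above
example = "CC BY-NC-ND 4.0 | Originator: Example, Photographer | Rights holder: Example Archive"
print(parse_license(example))
print(parse_license("Rechte vorbehalten"))
```

Separating the license from the credit line makes it easier to filter for openly licensed images before reusing them in a data story.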


In [164]:
# Explore temporal distribution - get year values
query_years = f"""
SELECT ?year (COUNT(?painting) AS ?count)
WHERE {{
  {CBDD_FEED_URI} schema:dataFeedElement ?feedItem .
  ?feedItem schema:item ?painting .
  ?painting <https://nfdi4culture.de/ontology/CTO_0001073> ?year .
}}
GROUP BY ?year
ORDER BY DESC(?count)
LIMIT 30
"""

df_years = run_sparql(query_years)
print("Most common date values (note: various formats like '1720', 'um 1730', '1720-1730'):")
df_years

Most common date values (note: various formats like '1720', 'um 1730', '1720-1730'):


Unnamed: 0,year,count
0,1542,169
1,1543,105
2,um 1542,86
3,1656/57,65
4,1751,65
5,um 1678,59
6,1703–1705,56
7,um 1732–1742,56
8,um 1750,55
9,1682,54
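
For a timeline, these free-text dates need to be reduced to numbers. The following sketch extracts a start and end year from the formats visible in the outputs of this notebook. `parse_year_range` is a hypothetical helper; it ignores qualifiers such as `um` (circa) and `bis` (until), so it is a rough normalisation rather than a faithful reading of each date.

```python
import re


def parse_year_range(text):
    """Hypothetical helper: return (start_year, end_year) or None if no year is found."""
    if not isinstance(text, str):
        return None
    # All standalone four-digit numbers are treated as years
    years = [int(y) for y in re.findall(r"(?<!\d)\d{4}(?!\d)", text)]
    if not years:
        return None
    start, end = min(years), max(years)
    # Abbreviated end years such as "1656/57" or "1727-31"
    short = re.search(r"(\d{4})\s*[/\-–]\s*(\d{2})(?!\d)", text)
    if short:
        base = int(short.group(1))
        candidate = base - base % 100 + int(short.group(2))
        if candidate < base:  # e.g. "1699/01" crosses a century boundary
            candidate += 100
        end = max(end, candidate)
    return start, end


samples = ["1542", "um 1678", "1656/57", "1703–1705", "1727-31",
           "1763, 1772", "1580–1583; 1651–1690", "bis 1726"]
for value in samples:
    print(f"{value!r} -> {parse_year_range(value)}")
```

Values with two separate campaigns, such as `1580–1583; 1651–1690`, collapse into one long span here. Whether that is acceptable depends on the visualisation.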


In [165]:
# Explore location/building relationships (CTO_0001009)
# This appears to link paintings to buildings

query_building_props = f"""
SELECT ?p (COUNT(?o) AS ?count) (SAMPLE(?o) AS ?sampleValue)
WHERE {{
  {CBDD_FEED_URI} schema:dataFeedElement ?feedItem .
  ?feedItem schema:item ?painting .
  ?painting <https://nfdi4culture.de/ontology/CTO_0001009> ?building .
  ?building ?p ?o .
}}
GROUP BY ?p
ORDER BY DESC(?count)
LIMIT 20
"""

df_building_props = run_sparql(query_building_props)
print("Properties of building/location objects (CTO_0001009):")
df_building_props

Properties of building/location objects (CTO_0001009):



Unnamed: 0,p,count,sampleValue
0,https://nfdi.fiz-karlsruhe.de/ontology/NFDI_00...,6672,https://d-nb.info/gnd/118636960


## 3.2 Detailed Analysis of Individual Paintings

Let's fetch a list of paintings with complete metadata and display their images.

In [166]:
# Fetch detailed painting records with all key properties
query_detailed_paintings = f"""
SELECT DISTINCT ?painting ?label ?year ?lat ?lon ?imageUrl ?license
       (GROUP_CONCAT(DISTINCT ?iconclass; separator=", ") AS ?subjects)
WHERE {{
  {CBDD_FEED_URI} schema:dataFeedElement ?feedItem .
  ?feedItem schema:item ?painting .
  
  # Required: Title and image
  ?painting rdfs:label ?label .
  ?painting schema:associatedMedia ?image .
  ?image <https://nfdi4culture.de/ontology/CTO_0001021> ?imageUrl .
  
  # Optional properties
  OPTIONAL {{ ?image <https://nfdi4culture.de/ontology/CTO_0001007> ?license . }}
  OPTIONAL {{ ?painting <https://nfdi4culture.de/ontology/CTO_0001073> ?year . }}
  OPTIONAL {{
    ?painting schema:latitude ?lat .
    ?painting schema:longitude ?lon .
  }}
  OPTIONAL {{ ?painting <https://nfdi4culture.de/ontology/CTO_0001026> ?iconclass . }}
}}
GROUP BY ?painting ?label ?year ?lat ?lon ?imageUrl ?license
LIMIT 10
"""

df_detailed = run_sparql(query_detailed_paintings)
print(f"Fetched {len(df_detailed)} paintings with images:")
df_detailed

Fetched 10 paintings with images:


Unnamed: 0,painting,label,year,lat,lon,imageUrl,license,subjects
0,https://www.deckenmalerei.eu/bf9a1cc3-1df5-4f1...,"Wolfegg, Schloss",1580–1583; 1651–1690,47.8232044572414,9.79170442039409,https://previous.bildindex.de/bilder/fmd100264...,"CC BY-NC-ND 4.0 | Originator: Bunz, Achim | Ri...",http://vocab.getty.edu/aat/300004792
1,https://www.deckenmalerei.eu/63c445dd-4e38-457...,Die großen Veduten,1710-1720,,,https://previous.bildindex.de/bilder/fmd100415...,"CC BY-NC-ND 4.0 | Originator: Lechtape, Andrea...","http://vocab.getty.edu/aat/300411453, https://..."
2,https://www.deckenmalerei.eu/e1417934-9cf2-49e...,Ferdinand III.,1727-31,,,https://previous.bildindex.de/bilder/fmd100348...,"CC BY-NC-ND 4.0 | Originator: Scheidt, Thomas ...","http://vocab.getty.edu/aat/300411453, https://..."
3,https://www.deckenmalerei.eu/82f2844f-dd01-445...,Die Fabel vom alten Hund und seinem Herrn,um 1542,,,https://deckenmalerei-bilder.badw.de/eas/parti...,© CbDD / Bayrische Schlösserverwaltung / Jan-E...,"http://vocab.getty.edu/aat/300411453, https://..."
4,https://www.deckenmalerei.eu/aad77f4d-0ea5-45d...,Ehemalige Wandmalerei,1640-1660,,,https://previous.bildindex.de/bilder/mi11084b1...,,"http://vocab.getty.edu/aat/300411453, https://..."
5,https://www.deckenmalerei.eu/8e2839a0-db7c-44c...,Die Deckenausmalung des gesamten Grottensaals,,,,https://previous.bildindex.de/bilder/fmd100083...,"Rechte vorbehalten | Originator: Lechtape, And...",http://vocab.getty.edu/aat/300411453
6,https://www.deckenmalerei.eu/a4df5cdd-945d-44a...,Triumph des Mordechai,1791,,,https://previous.bildindex.de/bilder/fmd100146...,"Rechte vorbehalten | Originator: Gaasch, Uwe |...","http://vocab.getty.edu/aat/300411453, https://..."
7,https://www.deckenmalerei.eu/5faf05f0-39df-403...,"Deckengemälde mit Apoll und Amor, umgeben von ...",um 1785,,,https://previous.bildindex.de/bilder/fmd100275...,"CC BY-NC-ND 4.0 | Originator: Bunz, Achim | Ri...","http://vocab.getty.edu/aat/300411453, https://..."
8,https://www.deckenmalerei.eu/ea4657d8-e933-406...,Decke des Raumes westlich des Treppenhauses,bis 1726,,,https://previous.bildindex.de/bilder/fmd100231...,"Rechte vorbehalten | Originator: Lechtape, And...","http://vocab.getty.edu/aat/300411453, https://..."
9,https://www.deckenmalerei.eu/171a8f17-e017-4b9...,AEQVARI PAVET ALTA MINOR: Hand mit Stecken sch...,1595-1605,,,https://previous.bildindex.de/bilder/fmd100053...,Rechte vorbehalten | Rights holder: Deutsches ...,"http://vocab.getty.edu/aat/300411453, https://..."


In [167]:
# Analyze the first painting in detail - get ALL its properties
if not df_detailed.empty:
    first_painting_uri = df_detailed.iloc[0]['painting']
    print(f"Detailed analysis of: {df_detailed.iloc[0]['label']}")
    print(f"URI: {first_painting_uri}\n")
    
    query_all_props = f"""
    SELECT ?property ?value
    WHERE {{
      <{first_painting_uri}> ?property ?value .
    }}
    ORDER BY ?property
    """
    
    df_all_props = run_sparql(query_all_props)
    
    # Group properties for better display
    print(f"This painting has {len(df_all_props)} property values:\n")
    df_all_props['property_short'] = df_all_props['property'].apply(lambda x: x.split('/')[-1] if '/' in x else x)
    
    # Show grouped summary
    prop_counts = df_all_props['property_short'].value_counts()
    print("Property summary:")
    for prop, count in prop_counts.items():
        sample_val = df_all_props[df_all_props['property_short'] == prop]['value'].iloc[0]
        # Truncate long values
        sample_val = str(sample_val)[:60] + '...' if len(str(sample_val)) > 60 else sample_val
        print(f"  • {prop}: {count} value(s) - e.g., {sample_val}")

Detailed analysis of: Wolfegg, Schloss
URI: https://www.deckenmalerei.eu/bf9a1cc3-1df5-4f19-b807-67df6c385c2f

This painting has 16 property values:

Property summary:
  • CTO_0001009: 2 value(s) - e.g., nodeID://b2650830
  • associatedMedia: 1 value(s) - e.g., nodeID://b2647534
  • latitude: 1 value(s) - e.g., 47.8232044572414
  • longitude: 1 value(s) - e.g., 9.79170442039409
  • 22-rdf-syntax-ns#type: 1 value(s) - e.g., https://nfdi4culture.de/ontology/CTO_0001005
  • rdf-schema#label: 1 value(s) - e.g., Wolfegg, Schloss
  • NFDI_0000142: 1 value(s) - e.g., https://nfdi4culture.de/id/E6404
  • NFDI_0000191: 1 value(s) - e.g., https://nfdi4culture.de/id/E2430
  • NFDI_0001008: 1 value(s) - e.g., https://www.deckenmalerei.eu/bf9a1cc3-1df5-4f19-b807-67df6c3...
  • CTO_0001006: 1 value(s) - e.g., https://nfdi4culture.de/id/E6077
  • CTO_0001010: 1 value(s) - e.g., nodeID://b2657420
  • CTO_0001025: 1 value(s) - e.g., nodeID://b2650373
  • CTO_0001026: 1 value(s) - e.g., http://vocab.get

In [168]:
# Explore the ICONCLASS subjects (iconographic themes) of the first painting
if not df_detailed.empty:
    first_painting_uri = df_detailed.iloc[0]['painting']
    
    query_subjects = f"""
    SELECT ?iconclass ?iconclassLabel
    WHERE {{
      <{first_painting_uri}> <https://nfdi4culture.de/ontology/CTO_0001026> ?iconclass .
      OPTIONAL {{ ?iconclass rdfs:label ?iconclassLabel . }}
    }}
    """
    
    df_subjects = run_sparql(query_subjects)
    
    print(f"ICONCLASS subjects for '{df_detailed.iloc[0]['label']}':")
    print("="*60)
    
    for idx, row in df_subjects.iterrows():
        iconclass_uri = row['iconclass']
        # Extract the code from the URI
        code = iconclass_uri.split('/')[-1] if '/' in iconclass_uri else iconclass_uri
        
        # Determine the source vocabulary (ICONCLASS or Getty AAT);
        # the link is always the subject URI itself
        if 'iconclass.org' in iconclass_uri:
            source = 'ICONCLASS'
        elif 'vocab.getty.edu' in iconclass_uri:
            source = 'Getty AAT'
        else:
            source = 'Other'
        link = iconclass_uri
            
        print(f"  • [{source}] {code}")
        print(f"    🔗 {link}")
        print()

ICONCLASS subjects for 'Wolfegg, Schloss':
  • [Getty AAT] 300004792
    🔗 http://vocab.getty.edu/aat/300004792



In [169]:
# Check if the ICONCLASS/AAT subjects have labels stored in the NFDI4Culture KG
# Let's see what properties these subject URIs have in the KG

query_subject_properties = f"""
SELECT ?p (COUNT(?o) AS ?count) (SAMPLE(?o) AS ?sampleValue)
WHERE {{
  {CBDD_FEED_URI} schema:dataFeedElement ?feedItem .
  ?feedItem schema:item ?painting .
  ?painting <https://nfdi4culture.de/ontology/CTO_0001026> ?subject .
  ?subject ?p ?o .
}}
GROUP BY ?p
ORDER BY DESC(?count)
LIMIT 20
"""

df_subject_props = run_sparql(query_subject_properties)
print("Properties of ICONCLASS/AAT subject URIs in the NFDI4Culture KG:")
df_subject_props

Properties of ICONCLASS/AAT subject URIs in the NFDI4Culture KG:


Unnamed: 0,p,count,sampleValue
0,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,23359,https://nfdi4culture.de/ontology/CTO_0001030


In [170]:
# The subject URIs don't have labels in the KG.
# Let's check sample subject URIs to understand their structure
query_sample_subjects = f"""
SELECT DISTINCT ?subject
WHERE {{
  {CBDD_FEED_URI} schema:dataFeedElement ?feedItem .
  ?feedItem schema:item ?painting .
  ?painting <https://nfdi4culture.de/ontology/CTO_0001026> ?subject .
}}
LIMIT 20
"""

df_sample_subjects = run_sparql(query_sample_subjects)
print("Sample subject URIs:")
for uri in df_sample_subjects['subject']:
    print(f"  {uri}")

Sample subject URIs:
  http://vocab.getty.edu/aat/300411453
  https://iconclass.org/5
  https://iconclass.org/43C11126
  https://iconclass.org/25HH
  https://iconclass.org/25B1
  https://iconclass.org/25B2
  https://iconclass.org/25B3
  https://iconclass.org/25B4
  http://vocab.getty.edu/aat/300004792
  https://iconclass.org/92D1521
  https://iconclass.org/48A9854
  https://iconclass.org/48C14%28SCHEINARCHITEKTUR%29
  https://iconclass.org/48C16
  https://iconclass.org/12A52113
  https://iconclass.org/71E135
  https://iconclass.org/11I62%28AARON%29
  https://iconclass.org/92D19213
  https://iconclass.org/48A983111
  https://iconclass.org/25G4%28HOP%29
  https://iconclass.org/25G4%28WHEAT%29
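
The sample shows two vocabularies and percent-encoded ICONCLASS notations such as `48C14%28SCHEINARCHITEKTUR%29`. A small sketch of how a subject URI could be split into its source and a readable code, using only the standard library; the function name `describe_subject` is an assumption made for illustration:

```python
from urllib.parse import unquote, urlparse


def describe_subject(uri: str) -> dict:
    """Split a subject URI into its vocabulary and a human-readable code."""
    parsed = urlparse(uri)
    # The last path segment carries the ICONCLASS notation or the AAT ID;
    # unquote turns %28 and %29 back into parentheses.
    code = unquote(parsed.path.rsplit("/", 1)[-1])
    if parsed.netloc == "iconclass.org":
        source = "ICONCLASS"
    elif parsed.netloc == "vocab.getty.edu":
        source = "Getty AAT"
    else:
        source = "Other"
    return {"source": source, "code": code}


print(describe_subject("https://iconclass.org/48C14%28SCHEINARCHITEKTUR%29"))
print(describe_subject("http://vocab.getty.edu/aat/300004792"))
```

Matching on the host name rather than on a substring avoids false positives if a URI merely mentions one of the vocabularies in its path.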


In [171]:
# Query external SPARQL endpoints for subject labels
import requests
import time
from functools import lru_cache
import urllib.parse

@lru_cache(maxsize=500)
def query_iconclass_sparql(notation):
    """Query ICONCLASS SPARQL endpoint for a label."""
    try:
        # URL-decode the notation (e.g., "48C14%28SCHEINARCHITEKTUR%29" -> "48C14(SCHEINARCHITEKTUR)")
        notation_decoded = urllib.parse.unquote(notation)
        
        endpoint = "https://iconclass.org/sparql"
        query = f"""
        PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
        
        SELECT ?label
        WHERE {{
          <https://iconclass.org/{notation_decoded}> skos:prefLabel ?label .
          FILTER(LANG(?label) = "en")
        }}
        LIMIT 1
        """
        
        resp = requests.get(
            endpoint,
            params={'query': query, 'format': 'json'},
            headers={'Accept': 'application/sparql-results+json'},
            timeout=10
        )
        if resp.ok:
            data = resp.json()
            bindings = data.get("results", {}).get("bindings", [])
            if bindings:
                return bindings[0].get("label", {}).get("value")
    except Exception:
        # Best-effort lookup: any network or parsing failure is treated as "no label"
        pass
    return None

@lru_cache(maxsize=500)
def query_getty_sparql(aat_id):
    """Query Getty AAT SPARQL endpoint for a label using rdfs:label."""
    try:
        endpoint = "http://vocab.getty.edu/sparql"
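        # Assumption: Getty exposes language-tagged rdfs:label values for AAT concepts.
        # Both test IDs below returned None, so the predicate, the language filter,
        # or the endpoint parameters may need adjusting.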
        query = f"""
        PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
        
        SELECT ?label
        WHERE {{
          <http://vocab.getty.edu/aat/{aat_id}> rdfs:label ?label .
          FILTER(LANG(?label) = "en")
        }}
        LIMIT 1
        """
        
        resp = requests.get(
            endpoint,
            params={'query': query, 'format': 'json'},
            headers={'Accept': 'application/sparql-results+json'},
            timeout=10
        )
        if resp.ok:
            data = resp.json()
            bindings = data.get("results", {}).get("bindings", [])
            if bindings:
                return bindings[0].get("label", {}).get("value")
    except Exception:
        # Best-effort lookup: any network or parsing failure is treated as "no label"
        pass
    return None

def resolve_subject_from_sparql(uri):
    """Resolve a subject URI to its label using external SPARQL endpoints."""
    code = uri.split('/')[-1]
    
    if 'iconclass.org' in uri:
        label = query_iconclass_sparql(code)
        source = 'ICONCLASS'
    elif 'vocab.getty.edu' in uri:
        label = query_getty_sparql(code)
        source = 'Getty AAT'
    else:
        label = None
        source = 'Unknown'
    
    return {
        'uri': uri,
        'code': code,
        'label': label or f'[{code}]',
        'source': source,
        'resolved': label is not None
    }

# Test with sample codes
print("Testing external SPARQL endpoints...")
print("="*60)

print("\n1. ICONCLASS tests:")
for code in ["92D1521", "25HH", "48C14%28SCHEINARCHITEKTUR%29", "5"]:
    label = query_iconclass_sparql(code)
    print(f"   {code}: {label}")

print("\n2. Getty AAT tests:")
for code in ["300004792", "300411453"]:
    label = query_getty_sparql(code)
    print(f"   {code}: {label}")

print("\n" + "="*60)
print("Functions defined: resolve_subject_from_sparql(uri)")

Testing external SPARQL endpoints...

1. ICONCLASS tests:
   92D1521: Cupid shooting a dart
   25HH: landscapes - HH - ideal landscapes
   48C14%28SCHEINARCHITEKTUR%29: None
   5: Abstract Ideas and Concepts

2. Getty AAT tests:
   300004792: None
   300411453: None

Functions defined: resolve_subject_from_sparql(uri)
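
The ICONCLASS query filters on `LANG(?label) = "en"`, so a `None` result can mean either that the request failed or that no English label exists; the output alone does not say which. One way to soften the second case is to fetch all labels and choose a language afterwards. The sketch below assumes the standard SPARQL JSON results layout, where each literal carries an optional `xml:lang` key. The name `pick_label`, the preference order, and the sample binding are invented for illustration:

```python
def pick_label(bindings, preferred=("en", "de")):
    """Pick a label from SPARQL JSON bindings, trying languages in order."""
    by_lang = {}
    for binding in bindings:
        term = binding.get("label", {})
        if "value" in term:
            # Keep the first label seen per language tag ("" means untagged).
            by_lang.setdefault(term.get("xml:lang", ""), term["value"])
    for lang in preferred:
        if lang in by_lang:
            return by_lang[lang]
    # Fall back to any available label, or None if nothing was returned.
    return next(iter(by_lang.values()), None)


# Invented binding, only to show the expected input shape.
sample = [{"label": {"type": "literal", "xml:lang": "de", "value": "Beispiel"}}]
print(pick_label(sample))
```

To use it, the query would need to drop the language `FILTER` and the `LIMIT 1`, and the function would receive `data["results"]["bindings"]`.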


In [172]:
# Test the SPARQL-based label resolver on subjects from our sample paintings
print("Resolving labels for subjects using external SPARQL endpoints...")
print("="*70)

# Collect unique subjects from df_detailed
all_subjects = set()
for subjects_str in df_detailed['subjects'].dropna():
    for s in subjects_str.split(', '):
        s = s.strip()
        if s:
            all_subjects.add(s)

print(f"\nFound {len(all_subjects)} unique subject codes in sample paintings\n")

# Resolve each subject (limit to first 15 to avoid too many API calls)
resolved = []
for uri in list(all_subjects)[:15]:
    code = uri.split('/')[-1]
    print(f"Resolving: {code[:30]}...", end=" ")
    result = resolve_subject_from_sparql(uri)  # Use new SPARQL function
    resolved.append(result)
    status = "✓" if result['resolved'] else "✗"
    label_display = result['label'][:50] + "..." if len(result['label']) > 50 else result['label']
    print(f"{status} [{result['source']}] {label_display}")
    time.sleep(0.2)  # Be nice to external endpoints

# Create a dataframe of resolved subjects
df_resolved = pd.DataFrame(resolved)
print("\n" + "="*70)
print(f"\nResolved {sum(df_resolved['resolved'])}/{len(df_resolved)} subjects successfully")
df_resolved

Resolving labels for subjects using external SPARQL endpoints...

Found 37 unique subject codes in sample paintings

Resolving: 300004792... ✗ [Getty AAT] [300004792]
Resolving: 25G41%28POPPY%29... ✓ [ICONCLASS] flowers: poppy
Resolving: 44B1132... ✗ [ICONCLASS] [44B1132]
Resolving: 46A126... ✓ [ICONCLASS] the nine worthies, 'les neuf preux'
Resolving: 98B%28OVID%29... ✓ [ICONCLASS] (story of) Ovid
Resolving: 96B11... ✓ [ICONCLASS] Aeneas leaves the country of Troy
Resolving: 96B113... ✓ [ICONCLASS] 'Pius Aeneas': Aeneas, leading Ascanius, escapes f...
Resolving: 41D222%28COLLAR%29... ✓ [ICONCLASS] neck-gear: collar

Unnamed: 0,uri,code,label,source,resolved
0,http://vocab.getty.edu/aat/300004792,300004792,[300004792],Getty AAT,False
1,https://iconclass.org/25G41%28POPPY%29,25G41%28POPPY%29,flowers: poppy,ICONCLASS,True
2,https://iconclass.org/44B1132,44B1132,[44B1132],ICONCLASS,False
3,https://iconclass.org/46A126,46A126,"the nine worthies, 'les neuf preux'",ICONCLASS,True
4,https://iconclass.org/98B%28OVID%29,98B%28OVID%29,(story of) Ovid,ICONCLASS,True
5,https://iconclass.org/96B11,96B11,Aeneas leaves the country of Troy,ICONCLASS,True
6,https://iconclass.org/96B113,96B113,"'Pius Aeneas': Aeneas, leading Ascanius, escap...",ICONCLASS,True
7,https://iconclass.org/41D222%28COLLAR%29,41D222%28COLLAR%29,neck-gear: collar,ICONCLASS,True
8,https://iconclass.org/44B113,44B113,king,ICONCLASS,True
9,https://iconclass.org/46A1266,46A1266,Julius Caesar (one of the nine worthies),ICONCLASS,True
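
The table suggests that resolution success differs by vocabulary. A short sketch that tallies resolved versus total lookups per source, written with the standard library so it runs without the notebook state. The function name `resolution_rate` is hypothetical, and the rows are the ten table entries above reduced to two fields:

```python
from collections import defaultdict


def resolution_rate(records):
    """Return {source: (resolved, total)} for a list of resolver results."""
    stats = defaultdict(lambda: [0, 0])
    for rec in records:
        stats[rec["source"]][0] += int(rec["resolved"])
        stats[rec["source"]][1] += 1
    return {source: tuple(pair) for source, pair in stats.items()}


# The ten rows shown in the table above: one Getty AAT miss,
# eight ICONCLASS hits, one ICONCLASS miss.
rows = (
    [{"source": "Getty AAT", "resolved": False}]
    + [{"source": "ICONCLASS", "resolved": True}] * 8
    + [{"source": "ICONCLASS", "resolved": False}]
)
print(resolution_rate(rows))
```

Inside the notebook, `resolution_rate(df_resolved.to_dict("records"))` should give the same tally; `df_resolved.groupby("source")["resolved"].mean()` is a pandas alternative that returns the share directly.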


In [173]:
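# Enhanced display function that shows resolved subject labels (using SPARQL endpoints)
from IPython.display import HTML, display  # required by display(HTML(...)); harmless if already imported

def display_painting_with_labels(row, max_width=500, resolve_subjects=True):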
    """Display a painting with its metadata and resolved subject labels."""
    label = row.get('label', 'Unknown')
    year = row.get('year', 'Unknown date')
    image_url = row.get('imageUrl', '')
    subjects = row.get('subjects', '')
    lat = row.get('lat', '')
    lon = row.get('lon', '')
    painting_uri = row.get('painting', '')
    
    # Create location string if coordinates exist
    location = f"📍 {lat}, {lon}" if lat and lon else ""
    
    # Resolve subject labels using SPARQL endpoints
    subject_html_items = []
    if subjects and resolve_subjects:
        subject_list = [s.strip() for s in subjects.split(',') if s.strip()]
        for uri in subject_list[:5]:  # Limit to 5 subjects
            resolved = resolve_subject_from_sparql(uri)  # Use SPARQL-based resolver
            badge_color = '#4CAF50' if 'iconclass' in uri.lower() else '#2196F3'
            subject_html_items.append(
                f'<span style="background: {badge_color}; color: white; padding: 2px 8px; '
                f'border-radius: 12px; font-size: 12px; margin: 2px; display: inline-block;" '
                f'title="{resolved["source"]}: {resolved["code"]}">{resolved["label"]}</span>'
            )
    subject_html = ''.join(subject_html_items) if subject_html_items else '<em>No subjects</em>'
    
    html = f"""
    <div style="border: 1px solid #ddd; padding: 15px; margin: 10px 0; border-radius: 8px; background: #fafafa;">
        <h3 style="margin-top: 0; color: #333;">{label}</h3>
        <p style="color: #000;"><strong>Date:</strong> {year}</p>
        <div style="margin: 10px 0;">
            <strong style="color: #000;">Subjects:</strong><br>
            <div style="margin-top: 5px;">{subject_html}</div>
        </div>
        {f'<p style="color: #000;">{location}</p>' if location else ''}
        <p><a href="{painting_uri}" target="_blank" style="color: #0066cc;">🔗 View in CbDD</a></p>
        <img src="{image_url}" style="max-width: {max_width}px; max-height: 500px; border-radius: 4px;" 
             onerror="this.onerror=null; this.src=''; this.alt='Image could not be loaded';">
    </div>
    """
    display(HTML(html))

print("✅ Enhanced display function defined: display_painting_with_labels()")
print("\nThis function resolves subject codes via SPARQL to external vocabularies.")
print("Subject sources are color-coded: 🟢 ICONCLASS | 🔵 Getty AAT")

✅ Enhanced display function defined: display_painting_with_labels()

This function resolves subject codes via SPARQL to external vocabularies.
Subject sources are color-coded: 🟢 ICONCLASS | 🔵 Getty AAT
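
The display function interpolates labels straight into HTML. Labels such as `'Pius Aeneas': ...` contain quotes, and other vocabularies could return characters like `<` or `&`, so escaping is a sensible precaution. A minimal sketch of a badge builder using `html.escape` from the standard library; `subject_badge` is a hypothetical helper, and unlike the cell above it picks the color from the `source` string rather than from the URI:

```python
from html import escape


def subject_badge(label: str, source: str, code: str) -> str:
    """Build one subject badge with all dynamic values HTML-escaped."""
    color = "#4CAF50" if source == "ICONCLASS" else "#2196F3"
    # escape() also converts quotes by default, which matters inside title="...".
    return (
        f'<span style="background: {color}; color: white; padding: 2px 8px; '
        f'border-radius: 12px;" title="{escape(source)}: {escape(code)}">'
        f"{escape(label)}</span>"
    )


print(subject_badge("the nine worthies, 'les neuf preux'", "ICONCLASS", "46A126"))
```

Be careful not to name a local variable `html` in the same cell if you import the `html` module itself, since the function above already uses that name for its markup string.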


In [174]:
# Display paintings with resolved subject labels (using SPARQL endpoints)
print("Displaying paintings with resolved subject labels:\n")
print("Fetching labels from ICONCLASS and Getty AAT SPARQL endpoints...")
print("(🟢 = ICONCLASS, 🔵 = Getty AAT)\n")

for idx, row in df_detailed.head(3).iterrows():
    print(f"\n--- Painting {idx+1}: {row.get('label', 'Unknown')[:50]} ---")
    display_painting_with_labels(row)
    time.sleep(0.3)  # Small delay between paintings

Displaying paintings with resolved subject labels:

Fetching labels from ICONCLASS and Getty AAT SPARQL endpoints...
(🟢 = ICONCLASS, 🔵 = Getty AAT)


--- Painting 1: Wolfegg, Schloss ---



--- Painting 2: Die großen Veduten ---



--- Painting 3: Ferdinand III. ---


## 4. Compare CbDD and Color Slide Archive of Wall and Ceiling Painting

Portal IDs from the registry:
- CbDD: `n4c:E4264`
- Color Slide Archive: `n4c:E4267`

Goal: Count how many records in the KG come from each of these portals.

We assume a pattern similar to:
- `?item schema:isPartOf ?feed`
- `?feed schema:isPartOf ?portal` or `?feed dcterms:isPartOf ?portal`

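These predicates are a guess. If the query returns no rows, inspect a feed node and adjust them; the earlier cells reach CbDD items through `schema:dataFeedElement` and `schema:item` rather than `schema:isPartOf`.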

In [175]:
query_ceiling_portal_counts = """\
SELECT ?portal ?portalLabel (COUNT(DISTINCT ?item) AS ?records)
WHERE {
  VALUES ?portal { n4c:E4264  n4c:E4267 }

  # feed belongs to one of the two portals
  ?feed ?isPartOfPortal ?portal .
  FILTER(?isPartOfPortal IN (schema:isPartOf, dcterms:isPartOf))

  # items belong to that feed
  ?item schema:isPartOf ?feed .

  ?portal schema:name ?portalLabel .
}
GROUP BY ?portal ?portalLabel
ORDER BY DESC(?records)
"""

df_ceiling_portal_counts = run_sparql(query_ceiling_portal_counts)
df_ceiling_portal_counts

In [176]:
# Simple bar chart of records per portal (CbDD vs Color Slide Archive)
if not df_ceiling_portal_counts.empty:
    plt.figure(figsize=(6, 4))
    plt.bar(df_ceiling_portal_counts["portalLabel"], df_ceiling_portal_counts["records"].astype(int))
    plt.xticks(rotation=20, ha="right")
    plt.ylabel("Number of records in KG")
    plt.title("Records from baroque wall & ceiling painting portals")
    plt.tight_layout()
    plt.show()
else:
    print("No results yet. Check if the intermediate predicate (?isPartOfPortal) is correct.")

No results yet. Check if the intermediate predicate (?isPartOfPortal) is correct.
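
Since the query came back empty, the `schema:isPartOf` pattern probably does not match how items are attached. The earlier cells reach paintings via `schema:dataFeedElement` and `schema:item`, so one option is to count items per feed directly. The sketch below only builds the query string. `build_feed_count_query` is a hypothetical helper, the example IRIs are placeholders, and whether the Color Slide Archive feed follows the same structure as the CbDD feed is an assumption to verify:

```python
def build_feed_count_query(feed_terms):
    """Build a SPARQL query counting distinct items per data feed.

    Each entry must already be a valid SPARQL term, e.g. "<https://...>"
    or a prefixed name, matching how CBDD_FEED_URI is interpolated above.
    """
    values = " ".join(feed_terms)
    return f"""\
SELECT ?feed (COUNT(DISTINCT ?item) AS ?records)
WHERE {{
  VALUES ?feed {{ {values} }}
  ?feed schema:dataFeedElement ?feedItem .
  ?feedItem schema:item ?item .
}}
GROUP BY ?feed
ORDER BY DESC(?records)
"""


# Placeholder IRIs for illustration only; substitute the real feed terms.
query = build_feed_count_query(
    ["<https://example.org/feed-a>", "<https://example.org/feed-b>"]
)
print(query)
```

In the notebook this would be run as `run_sparql(build_feed_count_query([CBDD_FEED_URI, ...]))`, adding the Color Slide Archive feed once you have identified it.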


## 5. Next steps for your data story

Ideas for how you can extend this notebook:

1. **Map of painting locations**  
   - From `df_cbdd_items_sample` or a larger query, extract `placeLabel` and, if available, coordinates.  
   - Use a mapping library (e.g. `folium`) to display points on a map.

2. **Timeline of creation dates**  
   - Inspect which property holds precise dates or centuries (e.g. `schema:temporalCoverage`, other date fields).  
   - Parse years to integers, bucket by decade or century, plot as a bar chart.

3. **Motif / subject comparison between portals**  
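   - CbDD items link their subjects via `CTO_0001026` to ICONCLASS and Getty AAT concepts; check whether the Color Slide Archive uses the same property (or `schema:about` / `dcterms:subject`), then count concept frequency per portal.  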
   - Visualise top motifs for CbDD vs Color Slide Archive in a grouped bar chart.

4. **Linked Data demonstration**  
   - Use the item inspection to find external identifiers (e.g. Wikidata, GND).  
   - Show a small RDF snippet or perform a federated query as part of your story.

You can keep all experiments you do here and later turn the most interesting figures and tables into your final data story.
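
For the timeline idea, the bucketing step can be sketched without knowing the final date property. The function below takes the first four-digit year it finds in each string and counts entries per decade. The name `bucket_by_decade`, the regular expression, and the sample strings are assumptions; real date values in the KG may need different parsing:

```python
import re
from collections import Counter


def bucket_by_decade(date_strings):
    """Count date strings per decade, using the first 4-digit year found."""
    counts = Counter()
    for text in date_strings:
        match = re.search(r"\b(1[0-9]{3}|20[0-9]{2})\b", str(text))
        if match:
            year = int(match.group(1))
            counts[year // 10 * 10] += 1
    # Sort by decade so the result can be plotted left to right.
    return dict(sorted(counts.items()))


# Invented sample values, only to show which shapes are handled.
decades = bucket_by_decade(["1691", "1695-1700", "um 1723", "unknown"])
print(decades)
```

The resulting dictionary can be passed to `plt.bar(list(decades.keys()), list(decades.values()), width=8)` for a simple decade histogram. Ranges are counted by their start year, which is a simplification worth stating in the data story.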