## Script for data entry from iDAI.chronontology to PeriodO

1. Go to PeriodO's website: https://perio.do/en/
2. Open the PeriodO client and allow the browser to store data permanently. The client should then open at https://client.perio.do/?page=open-backend.
3. Use "Add data source" to create a new data source.
4. Create an authority in the data source.
5. Create a test period that can be deleted later. Including a note such as "delete me" in the label is recommended.
6. Adjust and run the following code. Important: the web browser must be closed while the main loop runs.

GitHub Repo with this and additional files: https://github.com/Lukas-LaMass/iDAI.chronontology_to_PeriodO


In [1]:
#import necessary libraries

import pandas as pd
import requests
import json

import time
import selenium
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.firefox.firefox_profile import FirefoxProfile
import os
import shutil


In [2]:
#set variables

chronontology_url = "https://chronontology.dainst.org/data/period/"

#read in data as excel
#df_data = pd.read_excel("Chronontology_to_PeriodO_testdata2.xlsx", sheet_name="Bestand_2025_02_21")
#read in data as csv
df_data = pd.read_csv("SPP2143_zu_PeriodO_Testdata.csv", sep=",", encoding="utf-8")
df_chron_gaz_mapping = pd.read_excel("chronontology_gazetteer_mapping.xlsx", sheet_name="Sheet1")
df_gaz_wikidata_mapping = pd.read_excel("Gazetteer_To_Wikidata_Mapping_2025-07-11.xlsx", sheet_name="wikidata")

#PeriodO client URL (adjust backendID, authorityID and periodID to your local data source)

periodo_client_url = "https://client.perio.do/?page=period-view&backendID=local-15&authorityID=https%3A%2F%2Fclient.perio.do%2F.well-known%2Fgenid%2Fa72460e3d382e5a52fef9f6338c7eeaf&periodID=https%3A%2F%2Fclient.perio.do%2F.well-known%2Fgenid%2F6840edfa6052dda199b0a6d986b78841"

#Selenium Setup

def setup_periodo_profile():
    #Path of the Firefox profile that stores the PeriodO data (adjust to your system)
    source_profile = r"C:\Users\lukas\AppData\Roaming\Mozilla\Firefox\Profiles\166od4l5.default-release"
    selenium_profile = r"C:\temp\selenium_periodo_profile"
    
    if not os.path.exists(selenium_profile):
        shutil.copytree(source_profile, selenium_profile)
    
    return selenium_profile

print(df_data.head())
#print(df_chron_gaz_mapping.head())
#print(df_gaz_wikidata_mapping.head())


       ChronoID      importID                           name  \
0  2exnuQecx5iY  spp2143:0005       Sahara-Sudan Neolothikum   
1  kICzVpoTKUGa  spp2143:0006                     Leiterband   
2  zUGcXUwSQyoU  spp2143:0028                        Djara A   
3  DAd023qmv7Uf  spp2143:0055  Microlithic Late Stone Age C2   
4  2wX2Y7CG20k4  spp2143:0131                     Firgi Type   

                                                type provenance  
0                                   material_culture    SPP2143  
1                         material_culture / pottery    SPP2143  
2                          material_culture / lithic    SPP2143  
3                          material_culture / lithic    SPP2143  
4  cultural / material_culture / pottery / archae...    SPP2143  


In [3]:
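#Retrieve JSON from the ChronOntology API

def get_period_data(period_id):
    url = f"{chronontology_url}{period_id}"
    response = requests.get(url, timeout=30)
    #Stop early on HTTP errors instead of failing later on unexpected JSON
    response.raise_for_status()
    data = response.json()
    return data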
    

In [4]:
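#Original label: prefer the first English name, otherwise the first name in any language

def get_label(data):
    names = data['resource']['names']
    if 'en' in names:
        labels = names['en']
    else:
        #Each value is a list of names; take the list of the first language
        labels = next(iter(names.values()))
    return labels[0]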

chronontology_languages = {
    'ar':'http://lexvo.org/id/iso639-1/ar', #ara Arabic
    'de':'http://lexvo.org/id/iso639-1/de', #deu German
    'el':'http://lexvo.org/id/iso639-1/el', #ell Modern Greek
    'en':'http://lexvo.org/id/iso639-1/en', #eng English
    'es':'http://lexvo.org/id/iso639-1/es', #spa Spanish
    'eu':'http://lexvo.org/id/iso639-1/eu', #eus Basque
    'fr':'http://lexvo.org/id/iso639-1/fr', #fra French
    'it':'http://lexvo.org/id/iso639-1/it', #ita Italian
    'nl':'http://lexvo.org/id/iso639-1/nl', #nld Dutch
    'pl':'http://lexvo.org/id/iso639-1/pl', #pol Polish
    'pt':'http://lexvo.org/id/iso639-1/pt', #por Portuguese
    'ru':'http://lexvo.org/id/iso639-1/ru', #rus Russian
    'sq':'http://lexvo.org/id/iso639-1/sq', #sqi Albanian
    'sr':'http://lexvo.org/id/iso639-1/sr', #srp Serbian
    'tk':'http://lexvo.org/id/iso639-1/tr', #Turkish URI under key 'tk' (ISO 639-1 'tk' is Turkmen, 'tr' is Turkish; verify against the data)
    'vi':'http://lexvo.org/id/iso639-1/vi', #vie Vietnamese
    'zh':'http://lexvo.org/id/iso639-1/zh', #zho Chinese
    'la':'http://lexvo.org/id/iso639-1/la', #lat Latin
    'egy': 'http://lexvo.org/id/iso639-3/egy', #egy Egyptian (Ancient)
    'grc': 'http://lexvo.org/id/iso639-3/grc', #grc Ancient Greek
    'xmr': 'http://lexvo.org/id/iso639-3/xmr', #xmr Meroitic
    'cop': 'http://lexvo.org/id/iso639-3/cop'} #cop Coptic

def get_label_language(data):
    names = data['resource']['names']
    #Use the same language that get_label picks: English if available, otherwise the first key
    first_key = 'en' if 'en' in names else next(iter(names))
    language = chronontology_languages[first_key]
    return language

def get_language_tag(language):
    language_tag = language.split('/')[-1]
    return language_tag

    

In [5]:
#Alternative names with language indication

def get_all_names(data):
    names = data['resource']['names']
    return names

In [6]:
# Standardize years

def standardize_date(date_str):
    if date_str is None or date_str == "":
        return None
    date_str = str(date_str)
    #A minus sign anywhere in the string marks a negative (BCE) year
    negative = "-" in date_str
    #Remove any non-numeric characters
    standardized_date = ''.join(filter(str.isdigit, date_str))
    #Pad years with fewer than four digits with leading zeros
    standardized_date = standardized_date.zfill(4)
    if negative:
        standardized_date = "-" + standardized_date
    return standardized_date

#test_year = "-0"
#print(f"Standardized year: {standardize_date(test_year)}")
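
The effect of the year normalization is easiest to see on a few inputs. The following is a minimal, self-contained sketch: `normalize_year` is a hypothetical stand-in that follows the same rule (keep the sign, keep only digits, pad to four digits) using a regular expression. Unlike the function above, it returns `None` for strings without any digit, which is an assumption made for this illustration only.

```python
import re

def normalize_year(raw):
    """Hypothetical stand-in: keep the sign, keep only digits, pad to four digits."""
    if raw is None or raw == "":
        return None
    text = str(raw)
    digits = re.sub(r"\D", "", text)
    if not digits:
        # Assumption for this sketch: no digits means no usable year
        return None
    sign = "-" if "-" in text else ""
    return sign + digits.zfill(4)

for sample in ["-10000", "-500", "79", "-0"]:
    print(sample, "->", normalize_year(sample))
```

Four-digit padding matters for short years such as `-500`, which becomes `-0500`.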

In [7]:
#Start (original text and standardized)

def get_startdate(data):
    #'9999' is used as a placeholder when no usable date is available
    if 'hasTimespan' not in data['resource']:
        return '9999'
    timespan = data['resource']['hasTimespan'][0]
    begin = timespan.get('begin')
    if not begin:
        return '9999'
    #Order of preference: exact date, then earliest, then latest possible date
    for key in ('at', 'notBefore', 'notAfter'):
        if key in begin:
            return standardize_date(begin[key])
    return '9999'

def get_startdate_label(data):
    if 'hasTimespan' not in data['resource']:
        return '9999'
    timespan = data['resource']['hasTimespan'][0]
    #Return an empty label if the record has no original time expression
    return timespan.get('timeOriginal', "")
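
When choosing between `at`, `notBefore` and `notAfter`, a condition such as `'notBefore' and 'notAfter' in d` is easy to misread: Python evaluates it as `'notAfter' in d` only. The sketch below shows the difference on a boundary that has only `notAfter`. `pick_year` is a hypothetical helper, and the dictionaries only mimic the shape of the `begin`/`end` entries in a ChronOntology timespan.

```python
def pick_year(boundary):
    """Hypothetical helper: return the first available year of a begin/end entry."""
    for key in ("at", "notBefore", "notAfter"):
        if key in boundary:
            return boundary[key]
    return None

only_not_after = {"notAfter": "-2000"}

# Parsed as: 'notBefore' and ('notAfter' in only_not_after)
misleading = bool('notBefore' and 'notAfter' in only_not_after)
# Both keys are actually tested here
explicit = 'notBefore' in only_not_after and 'notAfter' in only_not_after

print(misleading, explicit)
print(pick_year(only_not_after))
print(pick_year({"at": "-10000", "isImprecise": True}))
```

With the misleading form, the branch would be entered and a lookup of `notBefore` would raise a `KeyError`. Iterating over the keys in order of preference avoids the problem.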


In [8]:
#End date (Original text and standardized)

def get_enddate(data):
    #'9999' is used as a placeholder when no usable date is available
    if 'hasTimespan' not in data['resource']:
        return '9999'
    timespan = data['resource']['hasTimespan'][0]
    end = timespan.get('end')
    if not end:
        return '9999'
    #Open-ended periods are closed with a fixed year
    if end.get('at') == "present":
        return "2025"
    #Same order of preference as for the start date
    for key in ('at', 'notBefore', 'notAfter'):
        if key in end:
            return standardize_date(end[key])
    return '9999'

def get_enddate_label(data):
    if 'hasTimespan' not in data['resource']:
        return '9999'
    timespan = data['resource']['hasTimespan'][0]
    #Return an empty label if the record has no original time expression
    return timespan.get('timeOriginal', "")

In [9]:
#Merge Description and Definition

def merge_desciption_definition(data):
    if 'description' in data['resource']:
        description = data['resource']['description']
    else:
        description = ""
    if 'definition' in data['resource']:
        definition = data['resource']['definition']
    else:
        definition = ""
    
    if description and definition:
        merged_text = f"{description} {definition}"
    elif description:
        merged_text = description
    elif definition:
        merged_text = definition
    else:
        merged_text = ""
    
    return merged_text

In [10]:
#Literature

def get_locator(data):
    if 'references' in data['resource']:
        literature = data['resource']['references']
        literature = literature[0]['reference']
        return literature
    else:
        literature = ""
        return literature

In [11]:
#Spatial Coverage with Wikidata

def get_spatial_description(chronoID):
    # Look up the spatial description (column 'Localization') in the mapping DataFrame
    spatial_description = df_chron_gaz_mapping.loc[df_chron_gaz_mapping['ChronoID'] == chronoID, 'Localization'].values
    if len(spatial_description) > 0:
        spatial_description = spatial_description[0]
    else:
        spatial_description = None
    return spatial_description

def get_spatial_id(chronoID):
    gaz_uri = df_chron_gaz_mapping.loc[df_chron_gaz_mapping['ChronoID'] == chronoID, 'GazetteerID'].values
    if len(gaz_uri) == 0:
        return None
    spatial_id = df_gaz_wikidata_mapping.loc[df_gaz_wikidata_mapping['GazetteerID'] == gaz_uri[0], 'Wikidata_URL'].values
    if len(spatial_id) > 0:
        spatial_id = spatial_id[0]
    else:
        spatial_id = None
    return spatial_id
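
The two-step lookup (ChronoID to GazetteerID, then GazetteerID to Wikidata URL) can also be expressed as a single join. The following sketch assumes the column names used above (`ChronoID`, `GazetteerID`, `Localization`, `Wikidata_URL`). The gazetteer IDs `gaz-1` and `gaz-2` are placeholders, and `spatial_for` is a hypothetical helper, not part of the notebook.

```python
import pandas as pd

# Toy mapping tables; the gazetteer IDs are placeholders
chron_to_gaz = pd.DataFrame({
    "ChronoID": ["2exnuQecx5iY", "zUGcXUwSQyoU"],
    "GazetteerID": ["gaz-1", "gaz-2"],
    "Localization": ["Sahara", "Djara"],
})
gaz_to_wikidata = pd.DataFrame({
    "GazetteerID": ["gaz-1"],
    "Wikidata_URL": ["https://www.wikidata.org/wiki/Q6583"],
})

# A left join keeps periods whose gazetteer entry has no Wikidata match
lookup = chron_to_gaz.merge(gaz_to_wikidata, on="GazetteerID", how="left").set_index("ChronoID")

def spatial_for(chrono_id):
    """Return (description, wikidata_url); missing values become None."""
    if chrono_id not in lookup.index:
        return None, None
    row = lookup.loc[chrono_id]
    url = row["Wikidata_URL"]
    return row["Localization"], (url if pd.notna(url) else None)

print(spatial_for("2exnuQecx5iY"))
print(spatial_for("zUGcXUwSQyoU"))
```

The join is computed once, so each lookup inside the main loop no longer has to filter both DataFrames. This assumes that every ChronoID occurs only once in the mapping.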


In [12]:
#Chronontology URL

def get_chronontology_url(period_id):
    url = f"https://chronontology.dainst.org/period/{period_id}"
    return url

In [13]:
#note

def get_note(data):
    note = data['resource']
    if 'note' in note:
        note = note['note']
    else:
        note = ""
    return note

In [14]:
#Add Parents

def get_parent(data):
    relations = data['resource'].get('relations', {})
    if 'isPartOf' in relations:
        relation = relations['isPartOf']
        parent_id = relation[0]
        #print(f"Parent ID: {parent_id}")
        # Add parent information to the period data
        data['parent'] = {"id": parent_id}
        parent_uri = f"https://chronontology.dainst.org/period/{parent_id}"
        return parent_uri
    #No parent found for this period
    return None

In [15]:
#Info: PeriodO JSON Format and Mapping
#This block shows an overview what will be mapped to the PeriodO JSON format.

'''
def format_json(label, language, language_tag, names, start_date, start_date_label, end_date, end_date_label, description, locator, spatial_description, spatial_id, web_page, note):
    json_data = {        
            "type":"Period",
            "url": web_page,
            "source":{
                "locator": locator
            },
            "label": label,
            "language": language,
            "language_tag": language_tag,
            "localizedLabels": names,
            "spatialCoverage": [{
                "id": spatial_id,
            }
            ],
            "spatialCoverageDescription": spatial_description,
            "start":{
                "in":{
                    "year": start_date
                },
                "label": start_date_label
            },
            "end":{
                "in":{
                    "year": end_date
                },
                "label": end_date_label
            },
            "note": description,
            "editorialNote": note,
        }
    
    return json_data
'''


In [16]:
#Start Selenium and load PeriodO Client

#Use the copied profile
profile_path = setup_periodo_profile()
options = Options()
options.add_argument("-profile")
options.add_argument(profile_path)

#Additional options intended to improve stability
#(the blink flag and useAutomationExtension come from Chrome setups and may have no effect in Firefox)
options.add_argument("--disable-blink-features=AutomationControlled")
options.set_preference("dom.webdriver.enabled", False)
options.set_preference("useAutomationExtension", False)

driver = webdriver.Firefox(options=options)

driver.get(periodo_client_url)

#Wait until PeriodO is loaded
from selenium.webdriver.support import expected_conditions as EC

#Wait for a specific element (adjust according to the PeriodO interface)
try:
    WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.TAG_NAME, "body"))
    )
    print("PeriodO Client successfully loaded.")
except Exception:
    print("Warning: PeriodO Client probably not fully loaded.")

PeriodO Client successfully loaded.


In [17]:
#Set up result DataFrame

chrono_url_list = []
locator_list = []
label_list = []
language_list = []
language_tag_list = []
localized_labels_list = []
spatial_coverage_list = []
spatial_coverage_description_list = []
start_date_list = []
start_date_label_list = []
end_date_list = []
end_date_label_list = []
description_list = []
note_list = []
parent_list = []

In [18]:
#Main Loop

for index, row in df_data.iterrows():
    data = get_period_data(row['ChronoID'])
    print(data)
    #Click the "Add period" button
    try:
        add_period_button = WebDriverWait(driver, 10).until(
            EC.element_to_be_clickable((By.XPATH, "/html/body/div/div/div/div[1]/div[3]/a[3]"))
        )
        add_period_button.click()
    except Exception as e:
        print(f"Error while clicking 'Add period': {e}")
        continue
    label = get_label(data)
    label_list.append(label)
    #Insert data into PeriodO
    time.sleep(1)
    try:
        label_input = WebDriverWait(driver, 10).until(
            EC.presence_of_element_located((By.XPATH, "//input[contains(@class, 'css-174ooto')]"))
        )
        label_input.send_keys(label)
    except Exception as e:
        print(f"Error while entering the label: {e}")
        continue
    spatial_description = get_spatial_description(row['ChronoID'])
    spatial_coverage_description_list.append(spatial_description)
    #Insert spatial description into PeriodO
    time.sleep(1)
    try:
        spatial_description_input = WebDriverWait(driver, 10).until(
            EC.presence_of_element_located((By.XPATH, "//input[contains(@name, 'description')]"))
        )
        spatial_description_input.send_keys(spatial_description)
    except Exception as e:
        print(f"Error while entering the spatial description: {e}")
        continue
    description = merge_desciption_definition(data)
    description_list.append(description)
    #Insert description into PeriodO
    time.sleep(1)
    try:
        description_input = WebDriverWait(driver, 30).until(
            EC.presence_of_element_located((By.XPATH, "//textarea[contains(@class, 'css-gv8lam')]"))
        )
        description_input.send_keys(description)
    except Exception as e:
        print(f"Error while entering the description: {e}")
        continue

    start_date = get_startdate(data)
    start_date_list.append(start_date)
    start_date_label_list.append(start_date)
    #Insert start date into PeriodO
    time.sleep(1)
    print(f"Start date: {start_date}")
    try:
        start_date_input = WebDriverWait(driver, 10).until(
            EC.presence_of_element_located((By.XPATH, "//label[contains(text(), 'Start label')]/following-sibling::input"))
        )
        start_date_input.send_keys(start_date)
    except Exception as e:
        print(f"Error while entering the start date: {e}")
        continue

    end_date = get_enddate(data)
    end_date_list.append(end_date)
    end_date_label_list.append(end_date)
    #Insert end date into PeriodO
    time.sleep(1)
    print(f"End date: {end_date}")
    try:
        end_date_input = WebDriverWait(driver, 10).until(
            EC.presence_of_element_located((By.XPATH, "//label[contains(text(), 'Stop label')]/following-sibling::input"))
        )
        end_date_input.send_keys(end_date)
    except Exception as e:
        print(f"Error while entering the end date: {e}")
        continue

    #Insert locator into PeriodO
    time.sleep(1)
    locator = get_locator(data)
    locator_list.append(locator)
    try:
        locator_input = WebDriverWait(driver, 10).until(
            EC.presence_of_element_located((By.XPATH, "//input[contains(@name, 'locator')]"))
        )
        locator_input.send_keys(locator)
    except Exception as e:
        print(f"Error while entering the locator: {e}")
        continue

    #Insert URL into PeriodO
    time.sleep(1)
    web_page = get_chronontology_url(row['ChronoID'])
    chrono_url_list.append(web_page)
    try:
        url_input = WebDriverWait(driver, 10).until(
            EC.presence_of_element_located((By.XPATH, "//input[contains(@name, 'url')]"))
        )
        url_input.send_keys(web_page)
    except Exception as e:
        print(f"Error while entering the URL: {e}")
        continue

    #Insert note into PeriodO
    time.sleep(1)
    note = get_note(data)
    note_list.append(note)
    try:
        note_input = WebDriverWait(driver, 10).until(
            EC.presence_of_element_located((By.XPATH, "//textarea[contains(@name, 'editorial-note')]"))
        )
        note_input.send_keys(note)
    except Exception as e:
        print(f"Error while entering the note: {e}")
        continue

    #Collect the values that cannot be entered with Selenium
    language = get_label_language(data)
    language_list.append(language)
    language_tag = get_language_tag(language)
    language_tag_list.append(language_tag)
    localized_labels = get_all_names(data)
    localized_labels_list.append(localized_labels)
    spatial_id = get_spatial_id(row['ChronoID'])
    spatial_coverage_list.append(spatial_id)
    parent_uri = get_parent(data)
    parent_list.append(parent_uri)

    #Click the save button
    time.sleep(2)
    try:
        save_button = WebDriverWait(driver, 50).until(
            EC.element_to_be_clickable((By.XPATH, "/html/body/div/div/div/div[2]/div/div[7]/div[3]/button"))
        )
        save_button.click()
    except Exception as e:
        print(f"Error while clicking the save button: {e}")
        continue
    #Wait for the save to complete
    time.sleep(3)

{'resource': {'externalId': 'spp2143:0005', 'names': {'en': ['Sahara-Sudan Neolithic'], 'de': ['Sahara-Sudan Neolithikum'], 'fr': ['Néolithique saharo-soudanais'], 'ar': ['صَحَارَى السودان النيوليتيكية']}, 'types': ['material_culture'], 'provenance': 'SPP2143', 'tags': ['aaarc'], 'hasTimespan': [{'begin': {'at': '-10000', 'isImprecise': True}, 'end': {'at': '-2000', 'isImprecise': True}}], 'spatiallyPartOfRegion': ['http://gazetteer.dainst.org/place/2042611'], 'relations': {'isPartOf': ['L0KyXWArtepv'], 'contains': ['iXvbSSD9NtP0'], 'hasPart': ['8rAT8oyeBX1q']}, 'id': '2exnuQecx5iY', 'type': 'period'}, 'dataset': 'none', 'version': 10, 'created': {'user': 'import', 'date': '2021-05-10T14:18:09.782+02:00'}, 'modified': [{'user': 'import', 'date': '2021-05-10T14:18:26.631+02:00'}, {'user': 'import', 'date': '2021-05-10T15:03:44.825+02:00'}, {'user': 'import', 'date': '2021-05-10T15:04:04.432+02:00'}, {'user': 'import', 'date': '2021-10-04T09:50:21.639+02:00'}, {'user': 'import', 'date': 
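
Because the loop appends to fifteen separate lists and skips a record with `continue` when a form field fails, the lists can end up with different lengths. One way to avoid this is to collect one dictionary per period and keep it only if every step succeeded. The sketch below illustrates the idea without Selenium: `collect_record` and `fake_fill_form` are hypothetical names, and `fake_fill_form` merely simulates a form that fails for one record.

```python
import pandas as pd

def collect_record(chrono_id, fill_form):
    """Return one complete record, or None if any step failed."""
    record = {"ChronoID": chrono_id}
    try:
        record.update(fill_form(chrono_id))
    except Exception as exc:
        print(f"Skipping {chrono_id}: {exc}")
        return None
    return record

def fake_fill_form(chrono_id):
    # Stand-in for the Selenium steps; fails for one record on purpose
    if chrono_id == "bad":
        raise RuntimeError("form field not found")
    return {"Label": f"Label for {chrono_id}", "Start Date": "-1000"}

records = []
for chrono_id in ["a", "bad", "b"]:
    record = collect_record(chrono_id, fake_fill_form)
    if record is not None:
        records.append(record)

result = pd.DataFrame(records)
print(result)
```

Every row of `result` is complete by construction, so no separate length check is needed before building the DataFrame.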

In [19]:
#Check that all lists have the same length before building the DataFrame
print(f'The list of Chronontology URLs has {len(chrono_url_list)} entries.')
print(f'The list of locators has {len(locator_list)} entries.')
print(f'The list of labels has {len(label_list)} entries.')
print(f'The list of languages has {len(language_list)} entries.')
print(f'The list of language tags has {len(language_tag_list)} entries.')
print(f'The list of localized labels has {len(localized_labels_list)} entries.')
print(f'The list of spatial coverages has {len(spatial_coverage_list)} entries.')
print(f'The list of spatial descriptions has {len(spatial_coverage_description_list)} entries.')
print(f'The list of start dates has {len(start_date_list)} entries.')
print(f'The list of start date labels has {len(start_date_label_list)} entries.')
print(f'The list of end dates has {len(end_date_list)} entries.')
print(f'The list of end date labels has {len(end_date_label_list)} entries.')
print(f'The list of descriptions has {len(description_list)} entries.')
print(f'The list of notes has {len(note_list)} entries.')
print(f'The list of parents has {len(parent_list)} entries.')

print(parent_list)

The list of Chronontology URLs has 15 entries.
The list of locators has 15 entries.
The list of labels has 15 entries.
The list of languages has 15 entries.
The list of language tags has 15 entries.
The list of localized labels has 15 entries.
The list of spatial coverages has 15 entries.
The list of spatial descriptions has 15 entries.
The list of start dates has 15 entries.
The list of start date labels has 15 entries.
The list of end dates has 15 entries.
The list of end date labels has 15 entries.
The list of descriptions has 15 entries.
The list of notes has 15 entries.
The list of parents has 15 entries.
['https://chronontology.dainst.org/period/L0KyXWArtepv', None, 'https://chronontology.dainst.org/period/pUiI8ZeBglKL', 'https://chronontology.dainst.org/period/zbinn8d7e3tL', 'https://chronontology.dainst.org/period/3yErKoDsXsub', None, 'https://chronontology.dainst.org/period/zyPxQJ0NMdLl', 'https://chronontology.dainst.org/period/v5zK0yJ1FBsS', None, 'https://chronontolog

In [20]:
#Save data as Backup file

#Click on Settings
try:
    settings_button = WebDriverWait(driver, 10).until(
        EC.element_to_be_clickable((By.XPATH, "/html/body/div/div/div/div[1]/div[2]/a[7]"))
    )
    settings_button.click()
except Exception as e:
    print(f"Error while clicking on Settings: {e}")

#Click on Backup
try:
    backup_button = WebDriverWait(driver, 10).until(
        EC.element_to_be_clickable((By.XPATH, "/html/body/div/div/div/div[2]/div[2]/button"))
    )
    backup_button.click()
except Exception as e:
    print(f"Error while clicking the backup button: {e}")

#Wait for the backup to complete
time.sleep(5)

#Close the driver
driver.quit()
#Delete the copied profile
if os.path.exists(profile_path):
    shutil.rmtree(profile_path)

#Create DataFrame with results
result_df = pd.DataFrame({
    'Chronontology URL': chrono_url_list,
    'Locator': locator_list,
    'Label': label_list,
    'Language': language_list,
    'Language Tag': language_tag_list,
    'Localized Labels': localized_labels_list,
    'Spatial Coverage': spatial_coverage_list,
    'Spatial Coverage Description': spatial_coverage_description_list,
    'Start Date': start_date_list,
    'Start Date Label': start_date_label_list,
    'End Date': end_date_list,
    'End Date Label': end_date_label_list,
    'Description': description_list,
    'Note': note_list,
    'Parent': parent_list
})

#Save the DataFrame as CSV
result_df.to_csv("Chronontology_to_PeriodO_results.csv", index=False)

In [21]:
#Get the most recent file from the Downloads folder
downloads_folder = os.path.join(os.path.expanduser("~"), "Downloads")
recent_file = max([os.path.join(downloads_folder, f) for f in os.listdir(downloads_folder)], key=os.path.getctime)

#get the current working directory in Jupyter
script_directory = os.getcwd()

#Copy the most recent file to the current directory
shutil.copy(recent_file, script_directory)

'z:\\FAIR.rdm AAArC 5\\Systeme\\PeriodO\\periodo-backup_Entangled Africa Test Data_2025-07-31.json'

In [22]:
#get the filename of the most recent file

print(recent_file)

recent_filename = os.path.basename(recent_file)

print(f"Most recent file: {recent_filename}")

C:\Users\lukas\Downloads\periodo-backup_Entangled Africa Test Data_2025-07-31.json
Most recent file: periodo-backup_Entangled Africa Test Data_2025-07-31.json


Not all information could be entered with Selenium. The backup file therefore has to be read in again so that the remaining data can be added.

To do this, the backup file is automatically copied from the downloads folder to the working directory. The file name is also carried over automatically.
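
Picking the most recent file in the downloads folder can select the wrong file if anything else was downloaded in the meantime. A more defensive variant is sketched below. It assumes that backups follow the name pattern `periodo-backup_*.json`, inferred from the file name shown above, and it compares modification times. `newest_backup` is a hypothetical helper, and the demo runs in a temporary directory rather than the real downloads folder.

```python
import os
import tempfile
import time
from pathlib import Path

def newest_backup(folder, pattern="periodo-backup_*.json"):
    """Return the most recently modified file matching the pattern, or None."""
    candidates = list(Path(folder).glob(pattern))
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)

with tempfile.TemporaryDirectory() as tmp:
    old = Path(tmp, "periodo-backup_old.json")
    new = Path(tmp, "periodo-backup_new.json")
    other = Path(tmp, "unrelated.pdf")
    for path in (old, new, other):
        path.write_text("{}")
    # Set explicit modification times so the order is unambiguous
    now = time.time()
    os.utime(old, (now - 100, now - 100))
    os.utime(new, (now - 10, now - 10))
    os.utime(other, (now, now))
    picked = newest_backup(tmp).name

print(picked)
```

The unrelated file is the newest one in the folder, but it is ignored because it does not match the pattern.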


In [23]:
#read in json data
with open(recent_filename, 'r', encoding='utf-8') as file:
    data = json.load(file)

#read in results of data mapping

df_additional_data = pd.read_csv("Chronontology_to_PeriodO_results.csv")

print(data)
print(df_additional_data)

{'backend': {'label': 'Entangled Africa Test Data', 'description': 'Test data from SPP 2143 "Entangled Africa" to be integrated into Canonical.', 'created': 1753962827656, 'modified': 1753965385744, 'accessed': 1753965385745}, 'patches': [{'forward': [{'op': 'add', 'path': '/authorities/https:~1~1client.perio.do~1.well-known~1genid~1a72460e3d382e5a52fef9f6338c7eeaf', 'value': {'id': 'https://client.perio.do/.well-known/genid/a72460e3d382e5a52fef9f6338c7eeaf', 'type': 'Authority', 'periods': {}, 'source': {'title': 'Entangeld Africa Period Collection', 'citation': 'Entangeld Africa Period Collection provided by FAIR.rdm', 'url': 'https://www.dainst.blog/entangled-africa/en/home/', 'yearPublished': '2025', 'creators': [{'name': 'Lukas Lammers'}, {'name': 'Eymard Fäder'}], 'contributors': [{'name': 'Florian Lukas'}, {'name': 'Johanna Sigl'}, {'name': 'Carlos Magnavita'}, {'name': 'Simon Kellers'}, {'name': 'Friederike Jesse'}, {'name': 'Ulrike Nowotnick'}, {'name': 'Fernanda da Silva Loza

In [24]:
#get the authority ID, which is necessary to access the periods

authorities = data['dataset']['authorities']
authority_id = next(iter(authorities))  # Get the first authority ID
print(authority_id)

https://client.perio.do/.well-known/genid/a72460e3d382e5a52fef9f6338c7eeaf


In [28]:
# Map each ChronOntology URL to the genId that PeriodO assigned to the period

chronoids = df_additional_data['Chronontology URL'].tolist()
periods = data['dataset']['authorities'][authority_id]['periods']

for period in periods.values():
    # Periods without a URL (for example a manually created test period) cannot be matched
    if 'url' not in period:
        continue
    if period['url'] in chronoids:
        chrono_id = period['url']
        gen_id = period['id']
        df_additional_data.loc[df_additional_data['Chronontology URL'] == chrono_id, 'genId'] = gen_id
        print(f"GenID for {chrono_id}: {gen_id}")
        
# Save the updated DataFrame with genId
df_additional_data.to_csv("Chronontology_to_PeriodO_results.csv", index=False)

#Create a dictionary with genId as Key and ChronoID as Value
genid_chronoid_mapping = dict(zip(df_additional_data['genId'], df_additional_data['Chronontology URL']))

print(genid_chronoid_mapping)


GenID for https://chronontology.dainst.org/period/2exnuQecx5iY: https://client.perio.do/.well-known/genid/8a4d8f67e383827bcf95214f51d164bd
GenID for https://chronontology.dainst.org/period/kICzVpoTKUGa: https://client.perio.do/.well-known/genid/d7355a9b8cd2cd6cabfcc6101139f6c8
GenID for https://chronontology.dainst.org/period/zUGcXUwSQyoU: https://client.perio.do/.well-known/genid/7ec8fc05f3a32f3a43f3abbb7506146a
GenID for https://chronontology.dainst.org/period/DAd023qmv7Uf: https://client.perio.do/.well-known/genid/70bd01f60b575bd71d2dfceee15b46c5
GenID for https://chronontology.dainst.org/period/2wX2Y7CG20k4: https://client.perio.do/.well-known/genid/17378eb57deb636e97d7c1162be22125
GenID for https://chronontology.dainst.org/period/l5m98QMLbLYN: https://client.perio.do/.well-known/genid/3d6d776803fcc2a4912b643ec4eb3d0f
GenID for https://chronontology.dainst.org/period/Hz85r7fk0MA5: https://client.perio.do/.well-known/genid/23fecb837ce1a6f8e133140e86f38b73
GenID for https://chrononto
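
The same mapping can be built in one pass with a dictionary and `Series.map`. The sketch below uses a toy backup structure that only mimics the nesting `dataset` / `authorities` / `periods` accessed above. The IDs `auth-1`, `genid/aaa` and `genid/bbb` are placeholders.

```python
import pandas as pd

# Toy backup with the same nesting as the PeriodO backup file
backup = {
    "dataset": {"authorities": {"auth-1": {"periods": {
        "genid/aaa": {"id": "genid/aaa",
                      "url": "https://chronontology.dainst.org/period/2exnuQecx5iY"},
        "genid/bbb": {"id": "genid/bbb"},  # period without a URL
    }}}}
}
results = pd.DataFrame({"Chronontology URL": [
    "https://chronontology.dainst.org/period/2exnuQecx5iY",
    "https://chronontology.dainst.org/period/kICzVpoTKUGa",
]})

authority_id = next(iter(backup["dataset"]["authorities"]))
periods = backup["dataset"]["authorities"][authority_id]["periods"]

# URL -> genId for every period that carries a URL
url_to_genid = {p["url"]: p["id"] for p in periods.values() if "url" in p}
results["genId"] = results["Chronontology URL"].map(url_to_genid)

# Rows without a genId were not found in the backup
unmatched = results[results["genId"].isna()]
print(results)
print(f"{len(unmatched)} period(s) without a match")
```

Listing the unmatched rows makes it visible when a period was not saved in the client, before a later lookup by `genId` fails.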

In [29]:
#Main Loop for adding additional data to the periods

import ast

for idx, row in df_additional_data.iterrows():
    period_id = row['genId']
    print(f"Processing period with ID: {period_id}")
    period_data = periods[period_id]

    #Adding multilingual names
    #Localized labels are stored as string, need to convert to dict
    localized_labels_str = row['Localized Labels']
    if isinstance(localized_labels_str, str):
        try:
            localized_labels = ast.literal_eval(localized_labels_str)
        except Exception as e:
            print(f"Error while parsing Localized Labels: {e}")
            localized_labels = {}
    else:
        localized_labels = localized_labels_str
    print(f"Localized labels: {localized_labels}")
    #print(type(localized_labels))
    period_data['localizedLabels'] = localized_labels

    #Adding spatial coverage (skipped if no Wikidata ID is available; missing values are read back from the CSV as NaN)
    wiki_id = row['Spatial Coverage']
    wiki_label = row['Spatial Coverage Description']
    if pd.notna(wiki_id):
        spatial_coverage = [{"id": f"{wiki_id}", "label": f"{wiki_label}"}]
        print(f"Spatial coverage: {spatial_coverage}")
        period_data['spatialCoverage'] = spatial_coverage

    #Adding parents (periods without a parent are read back from the CSV as NaN)
    parent_uri = row['Parent']
    if isinstance(parent_uri, str) and parent_uri != "Kein Parent.":
        #Look up the genId of the parent via its ChronOntology URL
        parent_genid = None
        for genid, chrono_url in genid_chronoid_mapping.items():
            if chrono_url == parent_uri:
                parent_genid = genid
                break
        #Only link parents that are part of the imported data
        if isinstance(parent_genid, str):
            print(f"Parent GenID: {parent_genid}")
            period_data['broader'] = parent_genid

Processing period with ID: https://client.perio.do/.well-known/genid/8a4d8f67e383827bcf95214f51d164bd
Localized labels: {'en': ['Sahara-Sudan Neolithic'], 'de': ['Sahara-Sudan Neolithikum'], 'fr': ['Néolithique saharo-soudanais'], 'ar': ['صَحَارَى السودان النيوليتيكية']}
Spatial coverage: [{'id': 'https://www.wikidata.org/wiki/Q6583', 'label': 'Sahara'}]
Processing period with ID: https://client.perio.do/.well-known/genid/d7355a9b8cd2cd6cabfcc6101139f6c8
Localized labels: {'de': ['Leiterband Komplex', 'Leiterband-Horizont'], 'en': ['Leiterband-Phase', 'Leiterband-Horizon']}
Spatial coverage: [{'id': 'https://www.wikidata.org/wiki/Q2538740', 'label': 'Wadi Howar (Region)'}]
Processing period with ID: https://client.perio.do/.well-known/genid/7ec8fc05f3a32f3a43f3abbb7506146a
Localized labels: {'de': ['Djara A'], 'ar': ['چارة A']}
Spatial coverage: [{'id': 'https://www.wikidata.org/wiki/Q14210840', 'label': 'Djara'}]
Processing period with ID: https://client.perio.do/.well-known/genid/70b
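
The localized labels have to be parsed because a dictionary written to CSV comes back as its string representation. The following self-contained sketch reproduces that round trip in memory. The label values are taken from the output above, and the in-memory buffer stands in for the results file.

```python
import ast
import io

import pandas as pd

labels = {"en": ["Sahara-Sudan Neolithic"], "de": ["Sahara-Sudan Neolithikum"]}
original = pd.DataFrame({"Localized Labels": [labels]})

# Write to CSV and read back, as happens with the results file
buffer = io.StringIO()
original.to_csv(buffer, index=False)
buffer.seek(0)
reloaded = pd.read_csv(buffer)

as_text = reloaded.loc[0, "Localized Labels"]
print(type(as_text).__name__)

# literal_eval only accepts Python literals, so it is safer than eval
parsed = ast.literal_eval(as_text)
print(parsed["de"])
```

An alternative design would be to store the column with `json.dumps` and read it with `json.loads`, which avoids depending on Python's dictionary representation.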

In [30]:
#write the modified data back to a new json file
with open('modified_periodo_backup.json', 'w', encoding='utf-8') as file:
    json.dump(data, file, ensure_ascii=False, indent=4)

The JSON file you just saved can be uploaded to the PeriodO client. On the main page, go to Restore from backup/Choose backup file.
