## Preparation
This notebook expects the following input files under data/KlosterDB Exports (exported from Microsoft Access):

- data/KlosterDB Exports/gs_monastery.xlsx — Monasteries to be uploaded (from table gs_monastery)
- data/KlosterDB Exports/gs_monastery_location.xlsx — Locations for the above monasteries (from gs_monastery_location)
- data/KlosterDB Exports/gs_places.xlsx — Place lookup (from gs_places)

### Expected directory layout
```txt
data/
  KlosterDB Exports/
    gs_monastery.xlsx
    gs_monastery_location.xlsx
    gs_places.xlsx
  reconciliation/
  results/
Klosterdb - Factgrid.ipynb
```

### Python Packages
Install the following Python packages:
- `pandas`
- `numpy`
- `openpyxl` (used by pandas to read and write `.xlsx` files)

In [1]:
import pandas as pd
import numpy as np
import os
import csv

<a id='section1'></a>
## 1. Creating Items for Monastery Buildings
In FactGrid, every entry from the monastery database is represented by two core items: one for the building complex where the monastery was located and one for the monastery itself as an organization.
In this step, the items for the monastery locations (building complexes) are created from the database exports.

### 1.1 Load and prepare data
The next cell of code will do the following:
1. Load monasteries and monastery locations from `data/KlosterDB Exports/gs_monastery.xlsx` and `data/KlosterDB Exports/gs_monastery_location.xlsx` as two dataframes
2. Merge both dataframes
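3. Filter for monasteries that have status `'Online'`
4. Drop columns that are not needed for upload
5. Remove monasteries that already exist in FactGrid (listed in `data/reconciliation/monasteries_in_factgrid.csv`)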
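
The merge, filter, and drop steps above can be sketched on toy data. The frames and values below are illustrative stand-ins, not the real exports (the status value `"Draft"` in particular is made up). The `validate` and `errors` arguments are optional safeguards assumed here; the notebook cell itself does not use them.

```python
import pandas as pd

# Toy stand-ins for the two Access exports (values are illustrative only)
locations = pd.DataFrame({
    "id_monastery_location": [1, 2, 3],
    "gsn_id": [100, 100, 200],
})
monasteries = pd.DataFrame({
    "id_gsn": [100, 200],
    "status": ["Online", "Draft"],  # "Draft" is a hypothetical status value
    "monastery_name": ["Kloster A", "Kloster B"],
})

# validate="many_to_one" raises a MergeError if id_gsn is not unique
merged = pd.merge(
    locations, monasteries,
    left_on="gsn_id", right_on="id_gsn",
    how="left", validate="many_to_one",
)

# The comparison is case-sensitive: the exports use "Online"
online = merged[merged["status"] == "Online"]

# errors="ignore" skips listed columns that are absent from this frame
prepared = online.drop(columns=["comment", "note"], errors="ignore")
```

With `validate="many_to_one"`, an accidental duplicate `id_gsn` in the monastery table surfaces as an error instead of silently multiplying location rows.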

### SPARQL query for monasteries_in_factgrid.csv
```sparql
SELECT ?item ?KlosterdatenbankID WHERE{
  ?item wdt:P471 ?KlosterdatenbankID 
}
```
### SPARQL query for building_complexes_in_factgrid.csv
```sparql
SELECT ?item ?GSVocabTerm WHERE{
  ?item wdt:P2 wd:Q635758 .
  ?item wdt:P1301 ?GSVocabTerm
}
```

In [2]:
# Load Access exports
export_files = {}
for export_file in os.listdir("data/KlosterDB Exports/sample"):
    export_files[export_file.split(".")[0]] = f"data/KlosterDB Exports/sample/{export_file}"

# Create dataframes for each table
dataframes = {key: pd.read_excel(value) for key, value in export_files.items()}

monasteries_in_factgrid = pd.read_csv("data/reconciliation/monasteries_in_factgrid.csv")

In [3]:
# Merge gs_monastery_location and gs_monastery
merged_df = pd.merge(dataframes["gs_monastery_location"], dataframes["gs_monastery"], left_on='gsn_id', right_on='id_gsn', how='left')
# Filter for status 'online'
online_df = merged_df[merged_df["status"] == "Online"]
# Define columns to drop
drop_columns = [
    "relocated", 
    "comment", 
    "main_location", 
    "diocese_id", 
    "id_monastery", 
    "date_created", 
    "created_by_user", 
    "note", 
    "patrocinium",
    "selection", 
    "processing_status", 
    "gs_persons", 
    "selection_criteria", 
    "last_change", 
    "changed_by_user", 
    "founder",
    "Unnamed: 0_x",
    "Unnamed: 0_y"
]
# Prepare dataframe by dropping unnecessary columns
prepared_df = online_df.drop(drop_columns, axis="columns")
prepared_df = prepared_df[~prepared_df["gsn_id"].isin(monasteries_in_factgrid["KlosterdatenbankID"])]
prepared_df

Unnamed: 0,id_monastery_location,place_id,gsn_id,location_begin_tpq,location_begin_taq,location_begin_note,location_end_tpq,location_end_taq,location_end_note,longitude,latitude,location_name,id_gsn,status,monastery_name
0,16698,46481629,11457,1250,1273.0,vor 1273,1782.0,,,12.368620,50.077752,Eger,11457,Online,"Klarissenkloster Eger (Cheb), Tschechien"
1,1752,46479255,3349,1288,,,1416.0,,,18.221140,50.095780,Ratibor (Burgkapelle),3349,Online,"Kollegiatstift Ratibor (Racibórz), Polen"
2,8714,14700,4502,1302,,,1533.0,,,13.315655,53.208969,,4502,Online,Johanniterkommende Lychen
3,13567,46483163,8372,1317,,,1603.0,1653.0,nach 1603,5.487612,51.960224,,8372,Online,"Johanniterkommende Ingen, Niederlande"
4,13569,46481587,8374,1327,,,1591.0,,,5.919402,52.554468,,8374,Online,"Beginenhaus Kampen, Niederlande"
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
109,13704,46483164,8481,1447,1457.0,ca. 1452,1593.0,,,5.484776,52.223056,,8481,Online,"Schwesterhaus St. Catharina, Nijkerk, Niederlande"
110,15151,46483955,19937,1648,,,1810.0,,,15.586389,51.109483,Löwenberg,19937,Online,Franziskanerminoritenkloster Löwenberg (Lwówek...
111,15690,46484546,1192,1222,1227.0,1222/1227,1559.0,,,4.955130,50.620600,,1192,Online,"Zisterzienserinnen-, später Zisterzienserabtei..."
112,16761,15586,11513,1768,,,1811.0,,,,,,11513,Online,Johanniterkommende Gorgast


### 1.2 Constructing the label and description for new items
New items need a label in FactGrid. German labels will be constructed using the name of the monastery in the database as well as a potentially existing name of the monastery location. The schema can be expressed as follows:

`Gebäudekomplex <monastery_name> [(<location_name>)]`

Here, `monastery_name` and `location_name` are both columns of the table created above.

The label is stored in a new column called `Lde`, following QuickStatements requirements.

The code in the next cell will perform the following operations:
1. Create new column with labels 
2. Delete empty brackets at the end of labels. These will occur if there is no dedicated `location_name`
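
As a minimal sketch of the labeling rule, the function below builds one label at a time and only appends the brackets when a location name exists, so no empty brackets need to be removed afterwards. `build_label` is a hypothetical helper for illustration and is not part of the notebook, which uses vectorized string operations instead.

```python
import pandas as pd

def build_label(monastery_name, location_name):
    """Return the quoted German label for one building complex (hypothetical helper)."""
    label = f"Gebäudekomplex {monastery_name}"
    # Append the location only if it is present and not blank
    if pd.notna(location_name) and str(location_name).strip():
        label += f" ({location_name})"
    # QuickStatements expects string values wrapped in double quotes
    return f'"{label}"'

with_location = build_label("Klarissenkloster Eger (Cheb), Tschechien", "Eger")
without_location = build_label("Johanniterkommende Lychen", None)
```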

In [4]:
# 1. Create new column with labels
prepared_df['Lde'] = "Gebäudekomplex " + prepared_df["monastery_name"].str.cat(prepared_df["location_name"].fillna(''), sep=" (") +")"
# 2. Delete empty brackets at end of labels
prepared_df['Lde'] = prepared_df["Lde"].str.replace(r'\(\)', '', regex=True).apply(lambda x: f'\"{x.strip()}\"')
prepared_df

Unnamed: 0,id_monastery_location,place_id,gsn_id,location_begin_tpq,location_begin_taq,location_begin_note,location_end_tpq,location_end_taq,location_end_note,longitude,latitude,location_name,id_gsn,status,monastery_name,Lde
0,16698,46481629,11457,1250,1273.0,vor 1273,1782.0,,,12.368620,50.077752,Eger,11457,Online,"Klarissenkloster Eger (Cheb), Tschechien","""Gebäudekomplex Klarissenkloster Eger (Cheb), ..."
1,1752,46479255,3349,1288,,,1416.0,,,18.221140,50.095780,Ratibor (Burgkapelle),3349,Online,"Kollegiatstift Ratibor (Racibórz), Polen","""Gebäudekomplex Kollegiatstift Ratibor (Racibó..."
2,8714,14700,4502,1302,,,1533.0,,,13.315655,53.208969,,4502,Online,Johanniterkommende Lychen,"""Gebäudekomplex Johanniterkommende Lychen"""
3,13567,46483163,8372,1317,,,1603.0,1653.0,nach 1603,5.487612,51.960224,,8372,Online,"Johanniterkommende Ingen, Niederlande","""Gebäudekomplex Johanniterkommende Ingen, Nied..."
4,13569,46481587,8374,1327,,,1591.0,,,5.919402,52.554468,,8374,Online,"Beginenhaus Kampen, Niederlande","""Gebäudekomplex Beginenhaus Kampen, Niederlande"""
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
109,13704,46483164,8481,1447,1457.0,ca. 1452,1593.0,,,5.484776,52.223056,,8481,Online,"Schwesterhaus St. Catharina, Nijkerk, Niederlande","""Gebäudekomplex Schwesterhaus St. Catharina, N..."
110,15151,46483955,19937,1648,,,1810.0,,,15.586389,51.109483,Löwenberg,19937,Online,Franziskanerminoritenkloster Löwenberg (Lwówek...,"""Gebäudekomplex Franziskanerminoritenkloster L..."
111,15690,46484546,1192,1222,1227.0,1222/1227,1559.0,,,4.955130,50.620600,,1192,Online,"Zisterzienserinnen-, später Zisterzienserabtei...","""Gebäudekomplex Zisterzienserinnen-, später Zi..."
112,16761,15586,11513,1768,,,1811.0,,,,,,11513,Online,Johanniterkommende Gorgast,"""Gebäudekomplex Johanniterkommende Gorgast"""


#### 1.2.1 (optional) looking at duplicate labels
Because the labels are constructed by rule, duplicates can occur. For example, if a monastery had several locations but none of them has a distinct name, the label will be `Gebäudekomplex <monastery_name>` for all of them. Since they can still be distinguished from one another by their identifier and coordinates, this is not necessarily a problem. However, the following cell creates a list of all duplicate labels so that they can be examined.

**In order to resolve the duplicates**

1. Open and inspect the table located at `data/results/duplicate_building_complex_labels.xlsx`
2. Add location names in the monastery database
3. Create new exports from the monastery database and replace `data/KlosterDB Exports/gs_monastery.xlsx` and `data/KlosterDB Exports/gs_monastery_location.xlsx` with the new files
4. Re-run the notebook. The cell below now should no longer contain the duplicates you resolved. 
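
To illustrate how the duplicates are found, here is a small sketch on toy data shaped like the table above. The `groupby` summary is an optional addition assumed here for inspection; it is not part of the notebook.

```python
import pandas as pd

labels = pd.DataFrame({
    "id_monastery_location": [13477, 13478, 8714],
    "Lde": [
        '"Gebäudekomplex Klarissenkloster Boxtel, Niederlande"',
        '"Gebäudekomplex Klarissenkloster Boxtel, Niederlande"',
        '"Gebäudekomplex Johanniterkommende Lychen"',
    ],
})

# keep=False marks every member of a duplicate group, not only the repeats
duplicates = labels[labels.duplicated(subset="Lde", keep=False)]

# Number of locations sharing each duplicated label
group_sizes = duplicates.groupby("Lde").size()
```

Using `keep=False` matters for manual review: with the default `keep="first"`, the first location of each group would be missing from the exported list.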

In [5]:
duplicated_building_complex_labels = prepared_df[prepared_df.duplicated(subset="Lde", keep=False)]
duplicated_building_complex_labels.to_excel('data/results/duplicate_building_complex_labels.xlsx')
duplicated_building_complex_labels

Unnamed: 0,id_monastery_location,place_id,gsn_id,location_begin_tpq,location_begin_taq,location_begin_note,location_end_tpq,location_end_taq,location_end_note,longitude,latitude,location_name,id_gsn,status,monastery_name,Lde
17,13477,46479171,8315,1580,,,1611.0,,,5.31,51.686944,,8315,Online,"Klarissenkloster Boxtel, Niederlande","""Gebäudekomplex Klarissenkloster Boxtel, Niede..."
18,13478,46479122,8315,1472,,,1580.0,,,5.325598,51.589158,,8315,Online,"Klarissenkloster Boxtel, Niederlande","""Gebäudekomplex Klarissenkloster Boxtel, Niede..."
19,13479,46479122,8315,1611,,,1717.0,,,5.325598,51.589158,,8315,Online,"Klarissenkloster Boxtel, Niederlande","""Gebäudekomplex Klarissenkloster Boxtel, Niede..."
20,13480,46483276,8315,1719,1725.0,zwischen 1719 und 1725,,,heute,5.561667,51.823333,,8315,Online,"Klarissenkloster Boxtel, Niederlande","""Gebäudekomplex Klarissenkloster Boxtel, Niede..."
21,13481,46483282,8315,1717,,,1719.0,1725.0,zwischen 1719 und 1725,,,,8315,Online,"Klarissenkloster Boxtel, Niederlande","""Gebäudekomplex Klarissenkloster Boxtel, Niede..."
30,16931,46479130,11589,1739,,,1745.0,,,11.656017,46.71663,Brixen,11589,Online,"Institut der Englischen Fräulein Brixen, Italien","""Gebäudekomplex Institut der Englischen Fräule..."
31,16933,46479130,11589,1745,,,2011.0,,,11.656389,46.717972,Brixen,11589,Online,"Institut der Englischen Fräulein Brixen, Italien","""Gebäudekomplex Institut der Englischen Fräule..."
71,17003,46484577,11658,1559,,,1796.0,,,,,,11658,Online,"Benediktinerinnenkloster Forest, Belgien","""Gebäudekomplex Benediktinerinnenkloster Fores..."
72,17004,46484576,11658,1105,,,1559.0,,,4.316597,50.810454,,11658,Online,"Benediktinerinnenkloster Forest, Belgien","""Gebäudekomplex Benediktinerinnenkloster Fores..."
74,17116,46484689,11744,1300,1370.0,vor 1370,1559.0,,,4.981675,51.001657,,11744,Online,Augustinerchorfrauenpriorat Notre Dame Ter Elz...,"""Gebäudekomplex Augustinerchorfrauenpriorat No..."


#### 1.2.2 Constructing Descriptions
The descriptions of the new items are generated using a rule-based approach:
`Gebäudekomplex [<location_name>] des <monastery_name> [von etwa <location_begin_taq>/<location_begin_tpq>] [bis etwa <location_end_taq>/<location_end_tpq>].`

In [6]:
from helper_functions import construct_description
for index, row in prepared_df.iterrows():
    prepared_df.loc[index, "Dde"] = construct_description(row["location_name"], row["monastery_name"], row["location_begin_taq"], row["location_begin_tpq"], row["location_end_taq"], row["location_end_tpq"])
prepared_df["Dde"] = prepared_df["Dde"].apply(lambda x:f'\"{x}\"')
prepared_df

Unnamed: 0,id_monastery_location,place_id,gsn_id,location_begin_tpq,location_begin_taq,location_begin_note,location_end_tpq,location_end_taq,location_end_note,longitude,latitude,location_name,id_gsn,status,monastery_name,Lde,Dde
0,16698,46481629,11457,1250,1273.0,vor 1273,1782.0,,,12.368620,50.077752,Eger,11457,Online,"Klarissenkloster Eger (Cheb), Tschechien","""Gebäudekomplex Klarissenkloster Eger (Cheb), ...","""Gebäudekomplex Eger des Klarissenklosters Ege..."
1,1752,46479255,3349,1288,,,1416.0,,,18.221140,50.095780,Ratibor (Burgkapelle),3349,Online,"Kollegiatstift Ratibor (Racibórz), Polen","""Gebäudekomplex Kollegiatstift Ratibor (Racibó...","""Gebäudekomplex Ratibor (Burgkapelle) des Koll..."
2,8714,14700,4502,1302,,,1533.0,,,13.315655,53.208969,,4502,Online,Johanniterkommende Lychen,"""Gebäudekomplex Johanniterkommende Lychen""","""Gebäudekomplex der Johanniterkommende Lychen"""
3,13567,46483163,8372,1317,,,1603.0,1653.0,nach 1603,5.487612,51.960224,,8372,Online,"Johanniterkommende Ingen, Niederlande","""Gebäudekomplex Johanniterkommende Ingen, Nied...","""Gebäudekomplex der Johanniterkommende Ingen, ..."
4,13569,46481587,8374,1327,,,1591.0,,,5.919402,52.554468,,8374,Online,"Beginenhaus Kampen, Niederlande","""Gebäudekomplex Beginenhaus Kampen, Niederlande""","""Gebäudekomplex des Beginenhauses Kampen, Nied..."
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
109,13704,46483164,8481,1447,1457.0,ca. 1452,1593.0,,,5.484776,52.223056,,8481,Online,"Schwesterhaus St. Catharina, Nijkerk, Niederlande","""Gebäudekomplex Schwesterhaus St. Catharina, N...","""Gebäudekomplex des Schwesterhauses St. Cathar..."
110,15151,46483955,19937,1648,,,1810.0,,,15.586389,51.109483,Löwenberg,19937,Online,Franziskanerminoritenkloster Löwenberg (Lwówek...,"""Gebäudekomplex Franziskanerminoritenkloster L...","""Gebäudekomplex Löwenberg des Franziskanermino..."
111,15690,46484546,1192,1222,1227.0,1222/1227,1559.0,,,4.955130,50.620600,,1192,Online,"Zisterzienserinnen-, später Zisterzienserabtei...","""Gebäudekomplex Zisterzienserinnen-, später Zi...","""Gebäudekomplex der Zisterzienserinnen-, späte..."
112,16761,15586,11513,1768,,,1811.0,,,,,,11513,Online,Johanniterkommende Gorgast,"""Gebäudekomplex Johanniterkommende Gorgast""","""Gebäudekomplex der Johanniterkommende Gorgast"""


#### 1.2.3 Translating labels
The notebook `1a - translate.ipynb` offers functionality to automatically translate item labels and descriptions into English using an LLM via the GWDG/KISSKI API. To perform the translation, go through the following steps:

1. The model needs a "dictionary" to translate the labels. We provide such a dictionary, covering the most frequent terms in our database, under `data/translation/translation_dictionary.csv`. You can adapt the file as needed.
2. Run the cell below to create a table that can be used as input for the translation process.
3. Execute the notebook `1a - translate.ipynb`. Afterwards, the translations will be saved to `data/translation/translated.csv`.
4. Continue with the next cell to add the translated labels to the table.

##### 1.2.3.1 Creating table for translation

In [7]:
to_translate = prepared_df[["monastery_name", 'Lde', 'Dde']].copy()
to_translate["note"] = ""
to_translate = to_translate.rename(columns={"Lde": "building_Lde", "Dde": "building_Dde", "monastery_name" : "monastery_Lde", "note": "monastery_Dde"})
to_translate.to_csv("data/translation/to_translate.csv")

In [8]:
translated = pd.read_csv("data/translation/translated.csv")
prepared_df["Len"] = translated["building_Len"].apply(lambda x:f'\"{x}\"')
prepared_df["Den"] = translated["building_Den"].apply(lambda x:f'\"{x}\"')
prepared_df

Unnamed: 0,id_monastery_location,place_id,gsn_id,location_begin_tpq,location_begin_taq,location_begin_note,location_end_tpq,location_end_taq,location_end_note,longitude,latitude,location_name,id_gsn,status,monastery_name,Lde,Dde,Len,Den
0,16698,46481629,11457,1250,1273.0,vor 1273,1782.0,,,12.368620,50.077752,Eger,11457,Online,"Klarissenkloster Eger (Cheb), Tschechien","""Gebäudekomplex Klarissenkloster Eger (Cheb), ...","""Gebäudekomplex Eger des Klarissenklosters Ege...","""Building complex Benedicts of St. Michael, Me...","""Building complex of the Benedicts of St. Mich..."
1,1752,46479255,3349,1288,,,1416.0,,,18.221140,50.095780,Ratibor (Burgkapelle),3349,Online,"Kollegiatstift Ratibor (Racibórz), Polen","""Gebäudekomplex Kollegiatstift Ratibor (Racibó...","""Gebäudekomplex Ratibor (Burgkapelle) des Koll...","""Building complex Benedictine monastery Vornba...","""Building complex Vornbach of the Benedictine ..."
2,8714,14700,4502,1302,,,1533.0,,,13.315655,53.208969,,4502,Online,Johanniterkommende Lychen,"""Gebäudekomplex Johanniterkommende Lychen""","""Gebäudekomplex der Johanniterkommende Lychen""","""Building complex Cistercian nunnery Ophoven (...","""Building complex Ophoven of the Cistercian nu..."
3,13567,46483163,8372,1317,,,1603.0,1653.0,nach 1603,5.487612,51.960224,,8372,Online,"Johanniterkommende Ingen, Niederlande","""Gebäudekomplex Johanniterkommende Ingen, Nied...","""Gebäudekomplex der Johanniterkommende Ingen, ...","""Building complex Capuchin friary Karlstadt""","""Building complex of the Capuchin friary Karls..."
4,13569,46481587,8374,1327,,,1591.0,,,5.919402,52.554468,,8374,Online,"Beginenhaus Kampen, Niederlande","""Gebäudekomplex Beginenhaus Kampen, Niederlande""","""Gebäudekomplex des Beginenhauses Kampen, Nied...","""Building complex Benedictine monastery of Bas...","""Building complex of the Benedictine monastery..."
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
109,13704,46483164,8481,1447,1457.0,ca. 1452,1593.0,,,5.484776,52.223056,,8481,Online,"Schwesterhaus St. Catharina, Nijkerk, Niederlande","""Gebäudekomplex Schwesterhaus St. Catharina, N...","""Gebäudekomplex des Schwesterhauses St. Cathar...",,
110,15151,46483955,19937,1648,,,1810.0,,,15.586389,51.109483,Löwenberg,19937,Online,Franziskanerminoritenkloster Löwenberg (Lwówek...,"""Gebäudekomplex Franziskanerminoritenkloster L...","""Gebäudekomplex Löwenberg des Franziskanermino...",,
111,15690,46484546,1192,1222,1227.0,1222/1227,1559.0,,,4.955130,50.620600,,1192,Online,"Zisterzienserinnen-, später Zisterzienserabtei...","""Gebäudekomplex Zisterzienserinnen-, später Zi...","""Gebäudekomplex der Zisterzienserinnen-, späte...",,
112,16761,15586,11513,1768,,,1811.0,,,,,,11513,Online,Johanniterkommende Gorgast,"""Gebäudekomplex Johanniterkommende Gorgast""","""Gebäudekomplex der Johanniterkommende Gorgast""",,
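
One point to watch when adding the translations: assigning a Series to a DataFrame column aligns on index labels, not on row position. If `translated.csv` is read back with a fresh `0..n-1` index while `prepared_df` keeps the index it had after filtering, rows can end up with the wrong translation or with none at all. The sketch below shows the effect on toy data. The positional variant assumes that the translated file preserves the row order of `to_translate.csv`, which should be checked before relying on it.

```python
import pandas as pd

# Toy stand-in for prepared_df with a non-contiguous index after filtering
prepared = pd.DataFrame({"monastery_name": ["A", "B", "C"]}, index=[0, 2, 5])
# Toy stand-in for translated.csv, re-read with a fresh 0..n-1 index
translated = pd.DataFrame({"building_Len": ["a-en", "b-en", "c-en"]})

# Label-based alignment: index 2 receives the third translation, index 5 gets NaN
by_label = prepared.copy()
by_label["Len"] = translated["building_Len"]

# Positional assignment: only valid if both tables share the same row order
by_position = prepared.copy()
by_position["Len"] = translated["building_Len"].to_numpy()
```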


### 1.3 Coordinates
The coordinates are currently stored in the columns `longitude` and `latitude`. For QuickStatements import, they need to be transformed into a single string following the template `@<latitude>/<longitude>`.

The next cell creates a new column called `P48`, the [FactGrid Property for coordinates](https://database.factgrid.de/wiki/Property:P48) and stores the constructed string in it.
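
The following sketch shows the string construction on toy data, together with one possible way to deduplicate by coordinates. pandas treats missing values as equal to each other in `drop_duplicates`, so deduplicating directly on `P48` keeps only the first row without coordinates. The variant below assumes that all rows without coordinates should be kept; `to_p48` is a hypothetical helper name.

```python
import numpy as np
import pandas as pd

def to_p48(latitude, longitude):
    """Build the QuickStatements coordinate string, or None if a value is missing."""
    if pd.isna(latitude) or pd.isna(longitude):
        return None
    return f"@{latitude}/{longitude}"

df = pd.DataFrame({
    "latitude": [50.077752, np.nan, 50.077752, np.nan],
    "longitude": [12.36862, np.nan, 12.36862, np.nan],
})
df["P48"] = [to_p48(lat, lon) for lat, lon in zip(df["latitude"], df["longitude"])]

# Deduplicate only among rows that actually have coordinates
with_coords = df[df["P48"].notna()].drop_duplicates(subset="P48")
without_coords = df[df["P48"].isna()]
deduplicated = pd.concat([with_coords, without_coords]).sort_index()
```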

In [9]:
for index, row in prepared_df.iterrows():
    if (not pd.isna(row["latitude"])) and (not pd.isna(row["longitude"])):
        prepared_df.loc[index, "P48"] = f'@{row["latitude"]}/{row["longitude"]}'
prepared_df.drop_duplicates(subset="P48", inplace=True)
prepared_df

Unnamed: 0,id_monastery_location,place_id,gsn_id,location_begin_tpq,location_begin_taq,location_begin_note,location_end_tpq,location_end_taq,location_end_note,longitude,latitude,location_name,id_gsn,status,monastery_name,Lde,Dde,Len,Den,P48
0,16698,46481629,11457,1250,1273.0,vor 1273,1782.0,,,12.368620,50.077752,Eger,11457,Online,"Klarissenkloster Eger (Cheb), Tschechien","""Gebäudekomplex Klarissenkloster Eger (Cheb), ...","""Gebäudekomplex Eger des Klarissenklosters Ege...","""Building complex Benedicts of St. Michael, Me...","""Building complex of the Benedicts of St. Mich...",@50.077752/12.36862
1,1752,46479255,3349,1288,,,1416.0,,,18.221140,50.095780,Ratibor (Burgkapelle),3349,Online,"Kollegiatstift Ratibor (Racibórz), Polen","""Gebäudekomplex Kollegiatstift Ratibor (Racibó...","""Gebäudekomplex Ratibor (Burgkapelle) des Koll...","""Building complex Benedictine monastery Vornba...","""Building complex Vornbach of the Benedictine ...",@50.09578/18.22114
2,8714,14700,4502,1302,,,1533.0,,,13.315655,53.208969,,4502,Online,Johanniterkommende Lychen,"""Gebäudekomplex Johanniterkommende Lychen""","""Gebäudekomplex der Johanniterkommende Lychen""","""Building complex Cistercian nunnery Ophoven (...","""Building complex Ophoven of the Cistercian nu...",@53.208969/13.315655
3,13567,46483163,8372,1317,,,1603.0,1653.0,nach 1603,5.487612,51.960224,,8372,Online,"Johanniterkommende Ingen, Niederlande","""Gebäudekomplex Johanniterkommende Ingen, Nied...","""Gebäudekomplex der Johanniterkommende Ingen, ...","""Building complex Capuchin friary Karlstadt""","""Building complex of the Capuchin friary Karls...",@51.9602244312605/5.48761203580791
4,13569,46481587,8374,1327,,,1591.0,,,5.919402,52.554468,,8374,Online,"Beginenhaus Kampen, Niederlande","""Gebäudekomplex Beginenhaus Kampen, Niederlande""","""Gebäudekomplex des Beginenhauses Kampen, Nied...","""Building complex Benedictine monastery of Bas...","""Building complex of the Benedictine monastery...",@52.5544680656827/5.91940200615705
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
107,5785,637,40089,1175,1225.0,vor 1225,1803.0,,,7.305031,50.516582,,40089,Online,"Templerkommende, dann Johanniterkommende Hönni...","""Gebäudekomplex Templerkommende, dann Johannit...","""Gebäudekomplex der Templerkommende, dann Joha...",,,@50.51658173801344/7.305030721604308
108,16147,46484263,11227,1142,1143.0,1142/1143,1783.0,,,15.290278,49.960000,Sedletz,11227,Online,"Zisterzienserkloster Sedletz (Sedlec), Kutná H...","""Gebäudekomplex Zisterzienserkloster Sedletz (...","""Gebäudekomplex Sedletz des Zisterzienserklost...",,,@49.96/15.290278
109,13704,46483164,8481,1447,1457.0,ca. 1452,1593.0,,,5.484776,52.223056,,8481,Online,"Schwesterhaus St. Catharina, Nijkerk, Niederlande","""Gebäudekomplex Schwesterhaus St. Catharina, N...","""Gebäudekomplex des Schwesterhauses St. Cathar...",,,@52.2230564143223/5.48477573772399
110,15151,46483955,19937,1648,,,1810.0,,,15.586389,51.109483,Löwenberg,19937,Online,Franziskanerminoritenkloster Löwenberg (Lwówek...,"""Gebäudekomplex Franziskanerminoritenkloster L...","""Gebäudekomplex Löwenberg des Franziskanermino...",,,@51.10948301890577/15.586388627441908


### 1.4 Connection to places
Besides the coordinates, building complexes should also be connected to the places they are located in using the property [P83](https://database.factgrid.de/wiki/Property:P83). To achieve this, the table `gs_places` from the monastery database must be matched against FactGrid. This can be done with a tool like OpenRefine, by coordinate matching in QGIS, or by matching the names manually if needed.

**If you have already reconciled all the places** for the monasteries you want to upload against FactGrid, you can place the resulting table in `data/reconciliation/reconciliated_places.xlsx`. Make sure it has at least these two columns:
- `place_id`: the ID from the `gs_places` table
- `factgrid_id`: the corresponding Q number from FactGrid

You can then continue with step [1.4.1](#section141).

**Otherwise**, the following cell will create a table containing the `place_id`, the place name, and some additional information that can be used for reconciling. It is stored at `data/reconciliation/reconcile_places.xlsx`. Fill in the column `factgrid_id` using your preferred method and then place the resulting table at `data/reconciliation/reconciliated_places.xlsx`.
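
Before continuing, it can help to check the reconciled table. The sketch below is one possible check, not part of the notebook: `find_unreconciled` is a hypothetical helper that verifies the two required columns and returns every row whose `factgrid_id` is not a Q number. This also catches the placeholder `0` written by the reconciliation cell, which a plain missing-value filter would not treat as missing.

```python
import re
import pandas as pd

REQUIRED_COLUMNS = {"place_id", "factgrid_id"}

def find_unreconciled(places):
    """Return rows whose factgrid_id is not of the form Q<digits> (hypothetical helper)."""
    missing = REQUIRED_COLUMNS - set(places.columns)
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    is_q_number = places["factgrid_id"].map(
        lambda value: bool(re.fullmatch(r"Q\d+", str(value)))
    )
    return places[~is_q_number]

# Toy table: one reconciled place, one placeholder 0, one empty cell
places = pd.DataFrame({
    "place_id": [46481629, 14700, 637],
    "factgrid_id": ["Q102019", 0, None],
})
todo = find_unreconciled(places)
```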

The cell below will do the following:
1. Load the table `gs_places`
2. Merge it with the prepared monastery locations (`prepared_df`) to obtain all places that are relevant to the current dataset
3. Create a new table with the columns that are relevant for reconciliation

In [10]:
# 2. Merge with monastery locations
monastery_locations_places_merged = pd.merge(prepared_df, dataframes["gs_places"], how='left', left_on='place_id', right_on='id_places')
monastery_locations_places_merged
# 3. Create and save new table with relevant columns
reconcile_places = monastery_locations_places_merged[["place_id", "longitude_y", "latitude_y", "place_name", "gemeinde", "kreis", "geonames_id"]]
reconcile_places= reconcile_places.drop_duplicates(subset="place_id")
reconcile_places['factgrid_id'] = 0
reconcile_places.to_excel('data/reconciliation/reconcile_places.xlsx')
reconcile_places

Unnamed: 0,place_id,longitude_y,latitude_y,place_name,gemeinde,kreis,geonames_id,factgrid_id
0,46481629,12.370556,50.079444,Cheb (Eger),Cheb,Böhmen,3077835.0,0
1,46479255,18.219722,50.092222,Racibórz (Ratibor),,,3087584.0,0
2,14700,13.317167,53.207167,Lychen,Lychen,Uckermark,2874695.0,0
3,46483163,5.484720,51.959170,Ingen,,,2753316.0,0
4,46481587,5.900000,52.550000,Kampen,Kampen,Provinz Overijssel,6559054.0,0
...,...,...,...,...,...,...,...,...
96,637,7.305167,50.517333,Bad Hönningen,Bad Hönningen,Neuwied,2953434.0,0
97,46484263,15.268160,49.948390,Kutná Hora (Kuttenberg),,,3072463.0,0
98,46483164,5.486110,52.220000,Nijkerk,,,2750065.0,0
99,46483955,15.585820,51.110740,Lwówek Śląski (Löwenberg),,Niederschlesien,3092638.0,0


<a id="section141"></a>
#### 1.4.1 Load reconciled places
Once the places have been reconciled, the completed table is loaded again and merged with the table from above. After this, **all entries that do not have a FactGrid ID are filtered out**. They are stored in a new file, `data/results/monasteries_without_factGrid.xlsx`, so that they can be taken care of.

The cell below will do the following:
1. Load the reconciled places
2. Merge them to the table with prepared monasteries
3. Filter out missing FactGrid Items and store them in a separate table

In [11]:
# 1. Load the reconciled places
places_reconciled = pd.read_excel("data/reconciliation/reconciliated_places.xlsx")[["place_id", "factgrid_id"]]
# 2. Merge them to the table with prepared monasteries
prepared_df = pd.merge(prepared_df, places_reconciled, how="left", on="place_id")
prepared_df = prepared_df.rename(columns={"factgrid_id":"P83"})
prepared_df
# 3. Filter out missing FactGrid Items and store them in a separate table
missing_factgrid_ids = prepared_df[prepared_df['P83'].isna()]
missing_factgrid_ids.to_excel('data/results/monasteries_without_factGrid.xlsx')
prepared_df = prepared_df.dropna(subset = 'P83')
prepared_df

Unnamed: 0,id_monastery_location,place_id,gsn_id,location_begin_tpq,location_begin_taq,location_begin_note,location_end_tpq,location_end_taq,location_end_note,longitude,...,location_name,id_gsn,status,monastery_name,Lde,Dde,Len,Den,P48,P83
0,16698,46481629,11457,1250,1273.0,vor 1273,1782.0,,,12.368620,...,Eger,11457,Online,"Klarissenkloster Eger (Cheb), Tschechien","""Gebäudekomplex Klarissenkloster Eger (Cheb), ...","""Gebäudekomplex Eger des Klarissenklosters Ege...","""Building complex Benedicts of St. Michael, Me...","""Building complex of the Benedicts of St. Mich...",@50.077752/12.36862,Q102019
1,1752,46479255,3349,1288,,,1416.0,,,18.221140,...,Ratibor (Burgkapelle),3349,Online,"Kollegiatstift Ratibor (Racibórz), Polen","""Gebäudekomplex Kollegiatstift Ratibor (Racibó...","""Gebäudekomplex Ratibor (Burgkapelle) des Koll...","""Building complex Benedictine monastery Vornba...","""Building complex Vornbach of the Benedictine ...",@50.09578/18.22114,Q21722
2,8714,14700,4502,1302,,,1533.0,,,13.315655,...,,4502,Online,Johanniterkommende Lychen,"""Gebäudekomplex Johanniterkommende Lychen""","""Gebäudekomplex der Johanniterkommende Lychen""","""Building complex Cistercian nunnery Ophoven (...","""Building complex Ophoven of the Cistercian nu...",@53.208969/13.315655,Q88340
3,13567,46483163,8372,1317,,,1603.0,1653.0,nach 1603,5.487612,...,,8372,Online,"Johanniterkommende Ingen, Niederlande","""Gebäudekomplex Johanniterkommende Ingen, Nied...","""Gebäudekomplex der Johanniterkommende Ingen, ...","""Building complex Capuchin friary Karlstadt""","""Building complex of the Capuchin friary Karls...",@51.9602244312605/5.48761203580791,Q1347411
4,13569,46481587,8374,1327,,,1591.0,,,5.919402,...,,8374,Online,"Beginenhaus Kampen, Niederlande","""Gebäudekomplex Beginenhaus Kampen, Niederlande""","""Gebäudekomplex des Beginenhauses Kampen, Nied...","""Building complex Benedictine monastery of Bas...","""Building complex of the Benedictine monastery...",@52.5544680656827/5.91940200615705,Q87014
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
96,5785,637,40089,1175,1225.0,vor 1225,1803.0,,,7.305031,...,,40089,Online,"Templerkommende, dann Johanniterkommende Hönni...","""Gebäudekomplex Templerkommende, dann Johannit...","""Gebäudekomplex der Templerkommende, dann Joha...",,,@50.51658173801344/7.305030721604308,Q82564
97,16147,46484263,11227,1142,1143.0,1142/1143,1783.0,,,15.290278,...,Sedletz,11227,Online,"Zisterzienserkloster Sedletz (Sedlec), Kutná H...","""Gebäudekomplex Zisterzienserkloster Sedletz (...","""Gebäudekomplex Sedletz des Zisterzienserklost...",,,@49.96/15.290278,Q627082
98,13704,46483164,8481,1447,1457.0,ca. 1452,1593.0,,,5.484776,...,,8481,Online,"Schwesterhaus St. Catharina, Nijkerk, Niederlande","""Gebäudekomplex Schwesterhaus St. Catharina, N...","""Gebäudekomplex des Schwesterhauses St. Cathar...",,,@52.2230564143223/5.48477573772399,Q1349336
99,15151,46483955,19937,1648,,,1810.0,,,15.586389,...,Löwenberg,19937,Online,Franziskanerminoritenkloster Löwenberg (Lwówek...,"""Gebäudekomplex Franziskanerminoritenkloster L...","""Gebäudekomplex Löwenberg des Franziskanermino...",,,@51.10948301890577/15.586388627441908,Q80833


### 1.5 Create "instance of" statement
To state that these items are building complexes, the item [Q635758](https://database.factgrid.de/wiki/Item:Q635758) (building complex) is connected to all entries using [P2](https://database.factgrid.de/wiki/Property:P2) (instance of).

In [12]:
prepared_df["P2"] = "Q635758"
prepared_df

Unnamed: 0,id_monastery_location,place_id,gsn_id,location_begin_tpq,location_begin_taq,location_begin_note,location_end_tpq,location_end_taq,location_end_note,longitude,...,id_gsn,status,monastery_name,Lde,Dde,Len,Den,P48,P83,P2
0,16698,46481629,11457,1250,1273.0,vor 1273,1782.0,,,12.368620,...,11457,Online,"Klarissenkloster Eger (Cheb), Tschechien","""Gebäudekomplex Klarissenkloster Eger (Cheb), ...","""Gebäudekomplex Eger des Klarissenklosters Ege...","""Building complex Benedicts of St. Michael, Me...","""Building complex of the Benedicts of St. Mich...",@50.077752/12.36862,Q102019,Q635758
1,1752,46479255,3349,1288,,,1416.0,,,18.221140,...,3349,Online,"Kollegiatstift Ratibor (Racibórz), Polen","""Gebäudekomplex Kollegiatstift Ratibor (Racibó...","""Gebäudekomplex Ratibor (Burgkapelle) des Koll...","""Building complex Benedictine monastery Vornba...","""Building complex Vornbach of the Benedictine ...",@50.09578/18.22114,Q21722,Q635758
2,8714,14700,4502,1302,,,1533.0,,,13.315655,...,4502,Online,Johanniterkommende Lychen,"""Gebäudekomplex Johanniterkommende Lychen""","""Gebäudekomplex der Johanniterkommende Lychen""","""Building complex Cistercian nunnery Ophoven (...","""Building complex Ophoven of the Cistercian nu...",@53.208969/13.315655,Q88340,Q635758
3,13567,46483163,8372,1317,,,1603.0,1653.0,nach 1603,5.487612,...,8372,Online,"Johanniterkommende Ingen, Niederlande","""Gebäudekomplex Johanniterkommende Ingen, Nied...","""Gebäudekomplex der Johanniterkommende Ingen, ...","""Building complex Capuchin friary Karlstadt""","""Building complex of the Capuchin friary Karls...",@51.9602244312605/5.48761203580791,Q1347411,Q635758
4,13569,46481587,8374,1327,,,1591.0,,,5.919402,...,8374,Online,"Beginenhaus Kampen, Niederlande","""Gebäudekomplex Beginenhaus Kampen, Niederlande""","""Gebäudekomplex des Beginenhauses Kampen, Nied...","""Building complex Benedictine monastery of Bas...","""Building complex of the Benedictine monastery...",@52.5544680656827/5.91940200615705,Q87014,Q635758
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
96,5785,637,40089,1175,1225.0,vor 1225,1803.0,,,7.305031,...,40089,Online,"Templerkommende, dann Johanniterkommende Hönni...","""Gebäudekomplex Templerkommende, dann Johannit...","""Gebäudekomplex der Templerkommende, dann Joha...",,,@50.51658173801344/7.305030721604308,Q82564,Q635758
97,16147,46484263,11227,1142,1143.0,1142/1143,1783.0,,,15.290278,...,11227,Online,"Zisterzienserkloster Sedletz (Sedlec), Kutná H...","""Gebäudekomplex Zisterzienserkloster Sedletz (...","""Gebäudekomplex Sedletz des Zisterzienserklost...",,,@49.96/15.290278,Q627082,Q635758
98,13704,46483164,8481,1447,1457.0,ca. 1452,1593.0,,,5.484776,...,8481,Online,"Schwesterhaus St. Catharina, Nijkerk, Niederlande","""Gebäudekomplex Schwesterhaus St. Catharina, N...","""Gebäudekomplex des Schwesterhauses St. Cathar...",,,@52.2230564143223/5.48477573772399,Q1349336,Q635758
99,15151,46483955,19937,1648,,,1810.0,,,15.586389,...,19937,Online,Franziskanerminoritenkloster Löwenberg (Lwówek...,"""Gebäudekomplex Franziskanerminoritenkloster L...","""Gebäudekomplex Löwenberg des Franziskanermino...",,,@51.10948301890577/15.586388627441908,Q80833,Q635758


### 1.6 GS Vocabulary Statement
In order to keep a mapping between the monastery database and FactGrid, every item will receive a distinct vocabulary term that is constructed using the `id_monastery_location` from the monastery database. The FactGrid Property to use is [P1301](https://database.factgrid.de/wiki/Property:P1301) (GS vocabulary term).

Vocabulary terms for building complex items are constructed as follows:

`GSMonasteryLocation<id_monastery_location>`

The cell below creates a new column for P1301 and fills it using the construction template above.


In [13]:
prepared_df['P1301'] = prepared_df['id_monastery_location'].apply(lambda x: f'\"GSMonasteryLocation{x}\"')
prepared_df

Unnamed: 0,id_monastery_location,place_id,gsn_id,location_begin_tpq,location_begin_taq,location_begin_note,location_end_tpq,location_end_taq,location_end_note,longitude,...,status,monastery_name,Lde,Dde,Len,Den,P48,P83,P2,P1301
0,16698,46481629,11457,1250,1273.0,vor 1273,1782.0,,,12.368620,...,Online,"Klarissenkloster Eger (Cheb), Tschechien","""Gebäudekomplex Klarissenkloster Eger (Cheb), ...","""Gebäudekomplex Eger des Klarissenklosters Ege...","""Building complex Benedicts of St. Michael, Me...","""Building complex of the Benedicts of St. Mich...",@50.077752/12.36862,Q102019,Q635758,"""GSMonasteryLocation16698"""
1,1752,46479255,3349,1288,,,1416.0,,,18.221140,...,Online,"Kollegiatstift Ratibor (Racibórz), Polen","""Gebäudekomplex Kollegiatstift Ratibor (Racibó...","""Gebäudekomplex Ratibor (Burgkapelle) des Koll...","""Building complex Benedictine monastery Vornba...","""Building complex Vornbach of the Benedictine ...",@50.09578/18.22114,Q21722,Q635758,"""GSMonasteryLocation1752"""
2,8714,14700,4502,1302,,,1533.0,,,13.315655,...,Online,Johanniterkommende Lychen,"""Gebäudekomplex Johanniterkommende Lychen""","""Gebäudekomplex der Johanniterkommende Lychen""","""Building complex Cistercian nunnery Ophoven (...","""Building complex Ophoven of the Cistercian nu...",@53.208969/13.315655,Q88340,Q635758,"""GSMonasteryLocation8714"""
3,13567,46483163,8372,1317,,,1603.0,1653.0,nach 1603,5.487612,...,Online,"Johanniterkommende Ingen, Niederlande","""Gebäudekomplex Johanniterkommende Ingen, Nied...","""Gebäudekomplex der Johanniterkommende Ingen, ...","""Building complex Capuchin friary Karlstadt""","""Building complex of the Capuchin friary Karls...",@51.9602244312605/5.48761203580791,Q1347411,Q635758,"""GSMonasteryLocation13567"""
4,13569,46481587,8374,1327,,,1591.0,,,5.919402,...,Online,"Beginenhaus Kampen, Niederlande","""Gebäudekomplex Beginenhaus Kampen, Niederlande""","""Gebäudekomplex des Beginenhauses Kampen, Nied...","""Building complex Benedictine monastery of Bas...","""Building complex of the Benedictine monastery...",@52.5544680656827/5.91940200615705,Q87014,Q635758,"""GSMonasteryLocation13569"""
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
96,5785,637,40089,1175,1225.0,vor 1225,1803.0,,,7.305031,...,Online,"Templerkommende, dann Johanniterkommende Hönni...","""Gebäudekomplex Templerkommende, dann Johannit...","""Gebäudekomplex der Templerkommende, dann Joha...",,,@50.51658173801344/7.305030721604308,Q82564,Q635758,"""GSMonasteryLocation5785"""
97,16147,46484263,11227,1142,1143.0,1142/1143,1783.0,,,15.290278,...,Online,"Zisterzienserkloster Sedletz (Sedlec), Kutná H...","""Gebäudekomplex Zisterzienserkloster Sedletz (...","""Gebäudekomplex Sedletz des Zisterzienserklost...",,,@49.96/15.290278,Q627082,Q635758,"""GSMonasteryLocation16147"""
98,13704,46483164,8481,1447,1457.0,ca. 1452,1593.0,,,5.484776,...,Online,"Schwesterhaus St. Catharina, Nijkerk, Niederlande","""Gebäudekomplex Schwesterhaus St. Catharina, N...","""Gebäudekomplex des Schwesterhauses St. Cathar...",,,@52.2230564143223/5.48477573772399,Q1349336,Q635758,"""GSMonasteryLocation13704"""
99,15151,46483955,19937,1648,,,1810.0,,,15.586389,...,Online,Franziskanerminoritenkloster Löwenberg (Lwówek...,"""Gebäudekomplex Franziskanerminoritenkloster L...","""Gebäudekomplex Löwenberg des Franziskanermino...",,,@51.10948301890577/15.586388627441908,Q80833,Q635758,"""GSMonasteryLocation15151"""


### 1.7 Begin and End Dates
Since we know when a building complex was used by a monastic community, we can derive a start and end date for the building's minimal timeframe of existence. The dates can be added to FactGrid either in a simple way ([Step 1.7.1](#section171)), using the values from the columns `location_begin_taq` and `location_end_tpq`, or in a more complex way ([Step 1.7.2](#section172)), using the columns `location_begin_note` and `location_end_note` and mapping them to a broader range of FactGrid properties.

<a id='section171'></a>
#### 1.7.1 Simple Variant 
In this variant, the following properties will be added to the table:

- P1124 (Begin date (terminus ante quem)) => `location_begin_taq`: We know that the building must have existed once the monastery started using it.
- P1125 (End date (terminus post quem)) => `location_end_tpq`: We know that the building must have existed at least until the monastery stopped using it.

Date strings are constructed according to QuickStatements syntax: `<year>-01-01T00:00:00Z/9`. The precision is set to year (`/9`). In cases where only the terminus post quem or only the terminus ante quem is given in the table, the missing value is filled in from the other one.

In addition, any existing notes from `location_begin_note` and `location_end_note` will be added as qualifiers [P787](https://database.factgrid.de/wiki/Property:P787) (Precision of Begin date) and [P788](https://database.factgrid.de/wiki/Property:P788) (Precision of End Date).

The cell below will modify the table as follows:
1. Add column P1124 and fill it with values from `location_begin_taq`
2. Add qualifier column `qal787` and fill it with notes from `location_begin_note`
3. Add column P1125 and fill it with values from `location_end_tpq`
4. Add qualifier column `qal788` and fill it with notes from `location_end_note`
5. Fill missing values in P1124 using `location_begin_tpq`
6. Fill missing values in P1125 using `location_end_taq`
7. Add the Julian calendar ending `/J` to the date string in P1124 and P1125 if the date is before 1582

> TODO: What to do if tpq begin date or taq end date exists (also for complex variant)? 
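
The steps above can be sketched on a tiny hand-made table. This is a minimal sketch, assuming the preference order described in the list (`location_begin_taq` before `location_begin_tpq`, `location_end_tpq` before `location_end_taq`) and the date template given above. The helper names `year_to_qs_date`, `quote_note` and `add_simple_dates` and the sample values are illustrative and not part of the notebook.

```python
import numpy as np
import pandas as pd


def year_to_qs_date(year):
    """Format a year as '<year>-01-01T00:00:00Z/9', adding '/J' before 1582."""
    if pd.isna(year):
        return np.nan
    year = int(year)
    date_string = f"{year}-01-01T00:00:00Z/9"
    if year < 1582:
        date_string += "/J"  # Julian calendar marker
    return date_string


def quote_note(note):
    """Wrap a note in triple double quotes, as the notebook does for qualifiers."""
    return np.nan if pd.isna(note) else f'"""{note}"""'


def add_simple_dates(df):
    """Return a copy of df with P1124/P1125 and their note qualifiers added."""
    out = df.copy()
    # Preferred column first, the other terminus as a fallback (steps 1, 3, 5, 6)
    begin_year = out["location_begin_taq"].fillna(out["location_begin_tpq"])
    end_year = out["location_end_tpq"].fillna(out["location_end_taq"])
    out["P1124"] = begin_year.apply(year_to_qs_date)
    out["qal787"] = out["location_begin_note"].apply(quote_note)
    out["P1125"] = end_year.apply(year_to_qs_date)
    out["qal788"] = out["location_end_note"].apply(quote_note)
    return out


# Illustrative sample rows (assumed values, not taken from a real export)
sample = pd.DataFrame({
    "location_begin_tpq": [1250, 1648],
    "location_begin_taq": [1273.0, np.nan],
    "location_begin_note": ["vor 1273", np.nan],
    "location_end_tpq": [1782.0, 1810.0],
    "location_end_taq": [np.nan, np.nan],
    "location_end_note": [np.nan, np.nan],
})
simple_dates = add_simple_dates(sample)
print(simple_dates[["P1124", "qal787", "P1125", "qal788"]])
```

Filling the years before formatting keeps the Julian check in one place, so the suffix cannot be appended twice.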

In [14]:
# monastery_locations_prepared_simple_dates = prepared_df.copy()
# # 1. Add and fill column P1124
# monastery_locations_prepared_simple_dates['P1124'] = monastery_locations_prepared_simple_dates['location_begin_tpq'].apply(lambda x: f'{int(x)}-01-01T00:00:00Z/9' if not pd.isnull(x) else np.nan)
# # 2. Add and fill qualifier column qual787
# monastery_locations_prepared_simple_dates['qal787'] = monastery_locations_prepared_simple_dates['location_begin_note'].apply(lambda x: f'\"\"\"{x}\"\"\"' if not pd.isna(x) else np.nan)
# # 3. Add and fill column P1125
# monastery_locations_prepared_simple_dates['P1125'] = monastery_locations_prepared_simple_dates['location_end_taq'].apply(lambda x: f'{int(x)}-01-01T00:00:00Z/9' if not pd.isnull(x) else np.nan)
# # 4. Add and fill qualifier column qual788
# monastery_locations_prepared_simple_dates['qal788'] = monastery_locations_prepared_simple_dates['location_end_note'].apply(lambda x: f'\"\"\"{x}\"\"\"' if not pd.isna(x) else np.nan)
# # 5. Fill missing values in P1124 (!)
# monastery_locations_prepared_simple_dates['P1124'] = monastery_locations_prepared_simple_dates['P1124'].fillna(prepared_df['location_begin_taq'].apply(lambda x: f'{int(x)}-01-01T00:00:00Z/9' if not pd.isnull(x) else np.nan))
# # 6. Fill missing values in P1125 (!)
# monastery_locations_prepared_simple_dates['P1125'] = monastery_locations_prepared_simple_dates['P1125'].fillna(prepared_df['location_end_tpq'].apply(lambda x: f'{int(x)}-01-01T00:00:00Z/9' if not pd.isnull(x) else np.nan))
# # 7. Add Julian Calendar Ending 
# monastery_locations_prepared_simple_dates['P1124'] = monastery_locations_prepared_simple_dates['P1124'].apply(lambda x: x + "/J" if not pd.isna(x) and int(x.split("-")[0]) < 1582 else x)
# monastery_locations_prepared_simple_dates['P1125'] = monastery_locations_prepared_simple_dates['P1125'].apply(lambda x: x + "/J" if not pd.isna(x) and int(x.split("-")[0]) < 1582 else x)
# monastery_locations_prepared_simple_dates

<a id="section172"></a>
#### 1.7.2 Complex Variant
In this variant, a template-matching algorithm is used to match the date notes to more precise FactGrid properties and qualifiers. The algorithm by Felix Schwart can be found in the [WIAG to FactGrid Notebook](https://github.com/WIAG-ADW-GOE/sync_notebooks/blob/main/wiag_to_factgrid.ipynb), section "Parse begin and end date from the wiag data". Make sure to update the code pasted in the next hidden cell when that notebook is improved or updated. The helper functions at the end of the cell below apply the matching algorithm to the DataFrame entries and change the DataFrame accordingly.
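
To illustrate the idea of template matching before the full implementation, here is a strongly simplified sketch that handles only notes of the form `[vor|nach|...] [ca.|um|...] <year>` for begin dates. The function `classify_begin_note` and the pattern are hypothetical and written for this illustration; they are not the WIAG algorithm, which covers many more templates (centuries, decades, year pairs, uncertainty markers).

```python
import re

# Keyword groups, mirroring the vocabulary used in the date notes
ANTE_WORDS = ("bis", "vor", "spätestens")
POST_WORDS = ("nach", "frühestens", "ab")
CIRCA_WORDS = ("etwa", "ca.", "um")


def _alternation(words):
    """Build a regex alternation from literal words."""
    return "|".join(re.escape(word) for word in words)


NOTE_PATTERN = re.compile(
    rf"(?:(?P<bound>{_alternation(ANTE_WORDS + POST_WORDS)}) )?"
    rf"(?:(?P<circa>{_alternation(CIRCA_WORDS)}) )?"
    r"(?P<year>\d{3,4})$"
)


def classify_begin_note(note):
    """Return (property, year, is_circa) for a begin-date note, or None."""
    match = NOTE_PATTERN.match(note)
    if match is None:
        return None
    bound = match.group("bound")
    if bound in ANTE_WORDS:
        prop = "P1124"  # begin date, terminus ante quem
    elif bound in POST_WORDS:
        prop = "P1126"  # begin date, terminus post quem
    else:
        prop = "P49"    # plain begin date
    return prop, int(match.group("year")), match.group("circa") is not None


for note in ["vor 1273", "ca. 1452", "nach 1603", "1142/1143"]:
    print(note, "->", classify_begin_note(note))
```

Notes that do not fit the single template, such as `1142/1143`, return `None` in this sketch; the full algorithm handles them with additional patterns.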

In [15]:
# Code from https://github.com/WIAG-ADW-GOE/sync_notebooks/blob/main/wiag_to_factgrid.ipynb, last updated 2025-07-01 11:58

from enum import Enum
from datetime import datetime, timedelta
import re
class DateType(Enum):
    ONLY_DATE = 0
    BEGIN_DATE = 1
    END_DATE = 2

#date precision and calendar declaration (see Time at https://www.wikidata.org/wiki/Help:QuickStatements#Add_simple_statement)
PRECISION_CENTURY = 7
PRECISION_DECADE = 8
PRECISION_YEAR = 9
PRECISION_MONTH = 10
PRECISION_DAY = 11
JULIAN_ENDING = '/J'

#defining some constants for better readability of the code:
#self defined:
JHS_GROUP = r'(Jhs\.|Jahrhunderts?)'
JH_GROUP = r'(Jh\.|Jahrhundert)'
EIGTH_OF_A_CENTURY = 13
QUARTER_OF_A_CENTURY = 25
TENTH_OF_A_CENTURY = 10

ANTE_GROUP = "bis|vor|spätestens"
POST_GROUP = "nach|frühestens|ab"
CIRCA_GROUP = r"etwa|ca\.|um"
#pre-compiling the most complex pattern to increase efficiency
MOST_COMPLEX_PATTERN = re.compile(r'(wohl )?((kurz )?(' + ANTE_GROUP + '|' + POST_GROUP + r') )?((' + CIRCA_GROUP +r') )?(\d{3,4})(\?)?')

#FactGrid properties:
    # simple date properties:
DATE = 'P106' 
BEGIN_DATE = 'P49'
END_DATE = 'P50'
    # when there is uncertainty / when all we know is the latest/earliest possible date:
DATE_AFTER = 'P41' # the earliest possible date for something
DATE_BEFORE = 'P43' # the latest possible date for something
END_TERMINUS_ANTE_QUEM = 'P1123' # latest possible date of the end of a period
BEGIN_TERMINUS_ANTE_QUEM  = 'P1124' # latest possible date of the begin of a period
END_TERMINUS_POST_QUEM = 'P1125' # earliest possible date of the end of a period
BEGIN_TERMINUS_POST_QUEM = 'P1126' # earliest possible date of the beginning of a period

NOTE = 'P73' # Field for free notes
PRECISION_DATE = 'P467' # FactGrid qualifier for the specific determination of the exactness of a date
PRECISION_BEGIN_DATE = 'P785'   # qualifier to specify a begin date
PRECISION_END_DATE = 'P786'
STRING_PRECISION_BEGIN_DATE = 'P787' # qualifier to specify a begin date; string alternate to P785
STRING_PRECISION_END_DATE = 'P788'

#qualifiers/options
SHORTLY_BEFORE = 'Q255211'
SHORTLY_AFTER = 'Q266009'
LIKELY = 'Q23356'
CIRCA = 'Q10'
OR_FOLLOWING_YEAR = 'Q912616'

def format_datetime(entry: datetime, precision: int):
    ret_val =  f"+{entry.isoformat()}Z/{precision}"

    if entry.year < 1582: # declaring that the julian calendar is being used by adding '/J' to the end
        ret_val +=  JULIAN_ENDING
    
    #on FactGrid, if the date is at most accurate to a year, the day and month are set to 0. The datetime type in Python does not allow you to set the day or month to 0 so we need to replace it manually
    if precision <= PRECISION_YEAR:
        ret_val = ret_val.replace(f"{entry.year}-01-01", f"{entry.year}-00-00", 1)
    elif precision == PRECISION_MONTH:
        # isoformat() zero-pads the month, so the search string has to be zero-padded as well
        ret_val = ret_val.replace(f"{entry.year}-{entry.month:02d}-01", f"{entry.year}-{entry.month:02d}-00", 1)

    return ret_val

# date_type states whether date_string is a standalone date (ONLY_DATE), a begin date or an end date
def date_parsing(date_string: str, date_type: DateType):
    qualifier = ""
    entry = None
    precision = PRECISION_CENTURY

    ante_property = (match := re.search(ANTE_GROUP, date_string))
    post_property = (match := re.search(POST_GROUP, date_string))
    assert(not ante_property or not post_property)
    
    match date_type:
        case DateType.ONLY_DATE:
            string_precision_qualifier_clause = NOTE
            exact_precision_qualifier = PRECISION_DATE
            if ante_property:
                return_property = DATE_BEFORE
            elif post_property:
                return_property = DATE_AFTER
            else:
                return_property = DATE
        case DateType.BEGIN_DATE:
            string_precision_qualifier_clause = STRING_PRECISION_BEGIN_DATE
            exact_precision_qualifier = PRECISION_BEGIN_DATE
            if ante_property:
                return_property = BEGIN_TERMINUS_ANTE_QUEM
            elif post_property:
                return_property = BEGIN_TERMINUS_POST_QUEM
            else:
                return_property = BEGIN_DATE
        case DateType.END_DATE:
            string_precision_qualifier_clause = STRING_PRECISION_END_DATE
            exact_precision_qualifier = PRECISION_END_DATE
            if ante_property:
                return_property = END_TERMINUS_ANTE_QUEM
            elif post_property:
                return_property = END_TERMINUS_POST_QUEM
            else:
                return_property = END_DATE    
        case _:
            assert False, "Unexpected DateType!"
        
    string_precision_qualifier_clause += f'\t"{date_string}"'

    if date_string == '?':
        return tuple()
            
    if matches := re.match(r'(\d{1,2})\. ' + JH_GROUP, date_string):
        centuries = int(matches.group(1))
        entry = datetime(100 * (centuries), 1, 1)
    
    elif matches := re.match(r'(\d)\. Hälfte (des )?(\d{1,2})\. ' + JHS_GROUP, date_string):
        half = int(matches.group(1)) - 1
        centuries = int(matches.group(3)) - 1
        year   = centuries * 100 + (half * 50) + QUARTER_OF_A_CENTURY
        entry = datetime(year, 1, 1)
        qualifier = string_precision_qualifier_clause
    
    elif matches := re.match(r'(\w+) Viertel des (\d{1,2})\. ' + JHS_GROUP, date_string):
        number_map = {
            "erstes":  0,
            "zweites": 1,
            "drittes": 2,
            "viertes": 3,
        }
        quarter = matches.group(1)
        centuries = int(matches.group(2))
        year = (centuries - 1) * 100 + (number_map[quarter] * 25) + EIGTH_OF_A_CENTURY
        entry = datetime(year, 1, 1)
        qualifier = string_precision_qualifier_clause

    elif matches := re.match(r'frühes (\d{1,2})\. ' + JH_GROUP, date_string):
        centuries = int(matches.group(1)) - 1
        year = centuries * 100 + TENTH_OF_A_CENTURY
        entry = datetime(year, 1, 1)
        qualifier = string_precision_qualifier_clause

    elif matches := re.match(r'spätes (\d{1,2})\. ' + JH_GROUP, date_string):
        centuries = int(matches.group(1))
        year = centuries * 100 - TENTH_OF_A_CENTURY
        entry = datetime(year, 1, 1)
        qualifier = string_precision_qualifier_clause

    elif matches := re.match(r'(Anfang|Mitte|Ende) (\d{1,2})\. ' + JH_GROUP, date_string):
        number_map = {
            "Anfang":  0,
            "Mitte": 1,
            "Ende": 2,
        }
        third = number_map[matches.group(1)]
        centuries = int(matches.group(2)) - 1
        year = centuries * 100 + (third * 33) + 17
        entry = datetime(year, 1, 1)
        qualifier = string_precision_qualifier_clause

    elif matches := re.match(r'(\d{3,4})er Jahre', date_string):
        entry = datetime(int(matches.group(1)), 1, 1)
        precision = PRECISION_DECADE
    
    elif matches := re.match(r'Wende zum (\d{1,2})\. ' + JH_GROUP, date_string):
        centuries = int(matches.group(1)) - 1
        entry = datetime(centuries * 100 - 10, 1, 1)
        qualifier = string_precision_qualifier_clause

    elif matches := re.match(r'Anfang der (\d{3,4})er Jahre', date_string):
        entry = datetime(int(matches.group(1)), 1, 1)
        qualifier = string_precision_qualifier_clause
        precision = PRECISION_DECADE

    elif matches := re.match(r'\((\d{3,4})\s?\?\) (\d{3,4})', date_string):
        entry = datetime(int(matches.group(2)), 1, 1) # ignoring the year in parantheses
        precision = PRECISION_YEAR
        qualifier = string_precision_qualifier_clause
    
    elif matches := re.match(r'(\d{3,4})/(\d{3,4})', date_string):
        year1 = int(matches.group(1))
        year2 = int(matches.group(2))

        if year2 - year1 == 1:
            # check for consecutive years
            qualifier = exact_precision_qualifier + '\t' + OR_FOLLOWING_YEAR
        entry = datetime(year1, 1, 1)
        precision = PRECISION_YEAR

    # this pattern is pre-compiled above, because it's rather complex and it's much more efficient to compile it just once, instead of on every function call
    elif matches := MOST_COMPLEX_PATTERN.match(date_string):
        if matches.group(1): # if 'wohl' was found
            qualifier = exact_precision_qualifier + '\t' + LIKELY
        if matches.group(5): # if 'etwa' , 'ca.' or 'um' were found
            if len(qualifier) != 0:
                qualifier += '\t'
            qualifier += exact_precision_qualifier + '\t' + CIRCA
                
        if matches.group(3): # if 'kurz' was found -- because of how the regex is defined, this can only happen when combined with 'nach', 'bis', etc.
            if len(qualifier) != 0:
                qualifier += '\t'

            if ante_property: # already checked above whether it's before or after
                qualifier += exact_precision_qualifier + '\t' + SHORTLY_BEFORE
            else: # post_property
                qualifier += exact_precision_qualifier + '\t' + SHORTLY_AFTER

        if matches.group(8): # if a question mark at the end were found
            # TODO is it correct, that on ? the other matches ('ca.' etc.) are ignored, because it's not exact enough?
            qualifier = string_precision_qualifier_clause
        
        entry = datetime(int(matches.group(7)), 1, 1)
        precision = PRECISION_YEAR

    else:
        raise Exception(f"Couldn't parse date '{date_string}'")
        
    return (return_property, format_datetime(entry, precision), qualifier)

def parse_date(date, date_type):
    try:
        result = date_parsing(date, date_type)
    except Exception:  # notes that cannot be parsed (including 'nan') yield NaN
        return np.nan
    return result

def process_individual_parsing_result(df, index, row, result_column):
    begin_date_parse_result = row[result_column]
    if(pd.isna(begin_date_parse_result)):
        return
    else:
        try:
            p_nr, date_string, qual = begin_date_parse_result
            if(p_nr == "P49"):
                df.loc[index, "P1124"] = date_string
            elif(p_nr == "P50"):
                df.loc[index, "P1125"] = date_string
            elif(p_nr == "P1126" or p_nr == "P1123"):
                return
            else:
                df.loc[index, p_nr] = date_string
        except:
            return

def process_date_parsing_results(df):
    for index, row in df.iterrows():
        process_individual_parsing_result(df, index, row, "begin_date_parse_result")
        process_individual_parsing_result(df, index, row, "end_date_parse_result")

The cell below does the following:
1. Create a copy of current table to work on
2. Parse entries from the column `location_begin_note` and save results in `begin_date_parse_result`
3. Parse entries from the column `location_end_note` and save results in `end_date_parse_result`
4. Process the parsing results using the function `process_date_parsing_results` from the cell above. This will create new columns for corresponding properties if necessary.
5. Fill missing values in `P1124` and `P1125` with values from `location_begin_taq` and `location_end_tpq` 
6. Fill missing values in `P1124` and `P1125` with values from `location_begin_tpq` and `location_end_taq` (!)
7. Add notes from `location_begin_note` and `location_end_note` as qualifiers `qal787` and `qal788`
8. Add the ending for the Julian calendar (`/J`) to the date string if the date is before 1582
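
Step 4 can be sketched in isolation as follows. This is a minimal sketch of how `(property, date string, qualifier)` tuples are routed into property columns, following the rules of `process_individual_parsing_result` above: `P49` and `P50` are stored as `P1124` and `P1125`, while `P1126` and `P1123` are skipped. The function `route_parse_results` and the sample rows are illustrative assumptions, not part of the notebook.

```python
import numpy as np
import pandas as pd

# Plain begin/end dates are stored as terminus ante/post quem columns
REMAP = {"P49": "P1124", "P50": "P1125"}
# Results for these properties are ignored
SKIPPED = {"P1126", "P1123"}


def route_parse_results(df, result_columns):
    """Return a copy of df with parsed dates written to property columns."""
    out = df.copy()
    for column in result_columns:
        for index, result in df[column].items():
            # Unparsed notes are stored as NaN, empty tuples mean "unknown"
            if not isinstance(result, tuple) or len(result) != 3:
                continue
            prop, date_string, _qualifier = result
            if prop in SKIPPED:
                continue
            # Creates the column on first use, other rows stay NaN
            out.loc[index, REMAP.get(prop, prop)] = date_string
    return out


# Illustrative parse results (assumed values)
parsed = pd.DataFrame({
    "begin_date_parse_result": [
        ("P1124", "+1273-00-00T00:00:00Z/9/J", ""),
        ("P1126", "+1603-00-00T00:00:00Z/9", ""),
        np.nan,
    ],
})
routed = route_parse_results(parsed, ["begin_date_parse_result"])
print(routed)
```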

In [16]:
# # 1. Create a copy of current table to work on
# monastery_locations_prepared_complex_dates = prepared_df.copy()
# # 2. Parse location_begin_note
# monastery_locations_prepared_complex_dates['begin_date_parse_result'] = monastery_locations_prepared_complex_dates['location_begin_note'].apply(lambda x: parse_date(str(x), DateType.BEGIN_DATE))
# # 3. Parse location_end_note
# monastery_locations_prepared_complex_dates['end_date_parse_result'] = monastery_locations_prepared_complex_dates['location_end_note'].apply(lambda x: parse_date(str(x), DateType.END_DATE))
# # 4. Process parsing results
# process_date_parsing_results(monastery_locations_prepared_complex_dates)
# # 5. + 6. Fill in missing values in columns
# monastery_locations_prepared_complex_dates['P1124'] = monastery_locations_prepared_complex_dates['P1124'].fillna(prepared_df['location_begin_tpq'].apply(lambda x: f'{int(x)}-01-01T00:00:00Z/9' if not pd.isnull(x) else np.nan))
# monastery_locations_prepared_complex_dates['P1125'] = monastery_locations_prepared_complex_dates['P1125'].fillna(prepared_df['location_end_taq'].apply(lambda x: f'{int(x)}-01-01T00:00:00Z/9' if not pd.isnull(x) else np.nan))
# monastery_locations_prepared_complex_dates['P1124'] = monastery_locations_prepared_complex_dates['P1124'].fillna(prepared_df['location_begin_taq'].apply(lambda x: f'{int(x)}-01-01T00:00:00Z/9' if not pd.isnull(x) else np.nan))
# monastery_locations_prepared_complex_dates['P1125'] = monastery_locations_prepared_complex_dates['P1125'].fillna(prepared_df['location_end_tpq'].apply(lambda x: f'{int(x)}-01-01T00:00:00Z/9' if not pd.isnull(x) else np.nan))
# # 7. Insert notes as qualifiers
# monastery_locations_prepared_complex_dates.insert(monastery_locations_prepared_complex_dates.columns.get_loc("P1124")+1, "qal787", monastery_locations_prepared_complex_dates['location_begin_note'].apply(lambda x: f'\"\"\"{x}\"\"\"' if not pd.isna(x) else np.nan))
# monastery_locations_prepared_complex_dates.insert(monastery_locations_prepared_complex_dates.columns.get_loc("P1125")+1, "qal788", monastery_locations_prepared_complex_dates['location_end_note'].apply(lambda x: f'\"\"\"{x}\"\"\"' if not pd.isna(x) else np.nan))
# # 8. Add Julian Calendar Ending 
# monastery_locations_prepared_complex_dates['P1124'] = monastery_locations_prepared_complex_dates['P1124'].apply(lambda x: x + "/J" if not pd.isna(x) and int(x.split("-")[0]) < 1582 and x.split("/")[-1] != 'J' else x)
# monastery_locations_prepared_complex_dates['P1125'] = monastery_locations_prepared_complex_dates['P1125'].apply(lambda x: x + "/J" if not pd.isna(x) and int(x.split("-")[0]) < 1582 and x.split("/")[-1] != 'J' else x)
# monastery_locations_prepared_complex_dates

### 1.8 Source Statements

Every statement in FactGrid should be supported by a source/reference. To achieve this, a source column `S471` is added after each relevant property (here `P48` and `P83`) to link to the monastery database entries using the property [P471](https://database.factgrid.de/wiki/Property:P471). The cell below also adds a `P131` column with the constant value `Q153178` to every row.

In [17]:
final_table = prepared_df.copy()
for colname in ["P48", "P83"]:
    final_table.insert(final_table.columns.get_loc(colname)+1, "S471", final_table["gsn_id"].apply(lambda x:f'\"{x}\"'), allow_duplicates=True)
final_table["P131"] = "Q153178"
final_table

Unnamed: 0,id_monastery_location,place_id,gsn_id,location_begin_tpq,location_begin_taq,location_begin_note,location_end_tpq,location_end_taq,location_end_note,longitude,...,Dde,Len,Den,P48,S471,P83,S471.1,P2,P1301,P131
0,16698,46481629,11457,1250,1273.0,vor 1273,1782.0,,,12.368620,...,"""Gebäudekomplex Eger des Klarissenklosters Ege...","""Building complex Benedicts of St. Michael, Me...","""Building complex of the Benedicts of St. Mich...",@50.077752/12.36862,"""11457""",Q102019,"""11457""",Q635758,"""GSMonasteryLocation16698""",Q153178
1,1752,46479255,3349,1288,,,1416.0,,,18.221140,...,"""Gebäudekomplex Ratibor (Burgkapelle) des Koll...","""Building complex Benedictine monastery Vornba...","""Building complex Vornbach of the Benedictine ...",@50.09578/18.22114,"""3349""",Q21722,"""3349""",Q635758,"""GSMonasteryLocation1752""",Q153178
2,8714,14700,4502,1302,,,1533.0,,,13.315655,...,"""Gebäudekomplex der Johanniterkommende Lychen""","""Building complex Cistercian nunnery Ophoven (...","""Building complex Ophoven of the Cistercian nu...",@53.208969/13.315655,"""4502""",Q88340,"""4502""",Q635758,"""GSMonasteryLocation8714""",Q153178
3,13567,46483163,8372,1317,,,1603.0,1653.0,nach 1603,5.487612,...,"""Gebäudekomplex der Johanniterkommende Ingen, ...","""Building complex Capuchin friary Karlstadt""","""Building complex of the Capuchin friary Karls...",@51.9602244312605/5.48761203580791,"""8372""",Q1347411,"""8372""",Q635758,"""GSMonasteryLocation13567""",Q153178
4,13569,46481587,8374,1327,,,1591.0,,,5.919402,...,"""Gebäudekomplex des Beginenhauses Kampen, Nied...","""Building complex Benedictine monastery of Bas...","""Building complex of the Benedictine monastery...",@52.5544680656827/5.91940200615705,"""8374""",Q87014,"""8374""",Q635758,"""GSMonasteryLocation13569""",Q153178
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
96,5785,637,40089,1175,1225.0,vor 1225,1803.0,,,7.305031,...,"""Gebäudekomplex der Templerkommende, dann Joha...",,,@50.51658173801344/7.305030721604308,"""40089""",Q82564,"""40089""",Q635758,"""GSMonasteryLocation5785""",Q153178
97,16147,46484263,11227,1142,1143.0,1142/1143,1783.0,,,15.290278,...,"""Gebäudekomplex Sedletz des Zisterzienserklost...",,,@49.96/15.290278,"""11227""",Q627082,"""11227""",Q635758,"""GSMonasteryLocation16147""",Q153178
98,13704,46483164,8481,1447,1457.0,ca. 1452,1593.0,,,5.484776,...,"""Gebäudekomplex des Schwesterhauses St. Cathar...",,,@52.2230564143223/5.48477573772399,"""8481""",Q1349336,"""8481""",Q635758,"""GSMonasteryLocation13704""",Q153178
99,15151,46483955,19937,1648,,,1810.0,,,15.586389,...,"""Gebäudekomplex Löwenberg des Franziskanermino...",,,@51.10948301890577/15.586388627441908,"""19937""",Q80833,"""19937""",Q635758,"""GSMonasteryLocation15151""",Q153178


### 1.9 Finalize Table

As a last step, all columns that are not needed for the import are dropped and an empty `qid` column is added, so that new items will be created. The resulting table is stored under:
- `data/results/import_building_complexes.xlsx`
- `data/results/import_building_complexes.csv`
- `data/results/import_building_complexes.tsv` (QuickStatements V1 commands generated by `df_to_qs_v1`)
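
To make the shape of the QuickStatements V1 commands more concrete, here is a minimal sketch that serializes a single new item. It is a simplified stand-in for `df_to_qs_v1` below, assuming that every row creates a new item (empty `qid`) and that source columns (`S...`) and qualifier columns (`qal...`) always follow the statement they belong to. The function name `row_to_qs_v1` is hypothetical.

```python
def row_to_qs_v1(row):
    """Serialize one new item as tab-separated QuickStatements V1 commands.

    row maps column names to values in table order; None marks a missing value.
    """
    lines = ["CREATE"]
    statement = None  # statement line that is currently being built
    for column, value in row.items():
        name = column.split(".")[0]  # 'S471.1' -> 'S471'
        is_reference = name.startswith("S")
        is_qualifier = name.startswith("qal")
        if is_reference or is_qualifier:
            # Attach to the preceding statement, if that statement exists
            if statement is not None and value is not None:
                prop = "P" + name[3:] if is_qualifier else name
                statement += f"\t{prop}\t{value}"
            continue
        if statement is not None:
            lines.append(statement)
        statement = None if value is None else f"LAST\t{column}\t{value}"
    if statement is not None:
        lines.append(statement)
    return "\n".join(lines) + "\n"


# One row with the columns of the final table (values taken from the table above)
example_row = {
    "Lde": '"Gebäudekomplex Johanniterkommende Lychen"',
    "P48": "@53.208969/13.315655",
    "S471": '"4502"',
    "P83": "Q88340",
    "S471.1": '"4502"',
    "P2": "Q635758",
}
print(row_to_qs_v1(example_row))
```

Each statement becomes one line starting with `LAST`, which refers to the item created by the preceding `CREATE` command.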

In [18]:
def df_to_qs_v1(df:pd.DataFrame):
    cols=pd.Series(df.columns)
    for dup in cols[cols.duplicated()].unique(): 
        cols[cols[cols == dup].index.values.tolist()] = [dup + '.' + str(i) if i != 0 else dup for i in range(sum(cols == dup))]
    df.columns=cols
    print(df)
    df_dict = df.to_dict(orient="records")
    output_collect = ""
    for item in df_dict:
        output = ""
        create_flag = False
        keys = list(item.keys())
        i = 0
        qid = None
        while i < len(keys):
            out_str = ""
            k = keys[i]
            if k.startswith("S") or k.startswith("qal"):
                i += 1
                continue
            if k == "qid":
                if pd.isna(item["qid"]):
                    create_flag = True
                    out_str += "CREATE\n"
                else:
                    qid = item["qid"]
            elif not pd.isna(item[k]):
                if create_flag:
                    out_str += "LAST"
                else:
                    out_str += qid
                out_str += f"\t{k}\t{item[k]}"
                while i+1 < len(keys) and (keys[i+1].startswith("S") or keys[i+1].startswith("qal")):
                    k = keys[i+1]
                    if k.startswith("qal"):
                        k_rep = k.replace("qal", "P")
                        if(not pd.isna(item[k])):
                            out_str += f'\t{k_rep.split(".")[0]}\t{item[k]}'
                        i += 1
                    else:
                        if(not pd.isna(item[k])):
                            out_str += f'\t{k.split(".")[0]}\t{item[k]}'
                        i += 1
                out_str += "\n"
            output += out_str
            i += 1
        output_collect += output
    return output_collect
                
                



In [19]:
final_table = final_table.drop(columns=["id_monastery_location", "place_id", "gsn_id", "location_begin_tpq", "location_begin_taq", "location_begin_note", "location_end_tpq", "location_end_taq", "location_end_note", "longitude", "latitude", "location_name", "id_gsn", "status", "monastery_name"])
final_table.insert(0, "qid", np.nan)
final_table.to_excel("data/results/import_building_complexes.xlsx", index=False)
final_table.to_csv("data/results/import_building_complexes.csv", index=False, doublequote=False, quoting=csv.QUOTE_NONE, escapechar="§")
with open("data/results/import_building_complexes.tsv", "w") as file:
    file.write(df_to_qs_v1(final_table))
final_table

     qid                                                Lde  \
0    NaN  "Gebäudekomplex Klarissenkloster Eger (Cheb), ...   
1    NaN  "Gebäudekomplex Kollegiatstift Ratibor (Racibó...   
2    NaN         "Gebäudekomplex Johanniterkommende Lychen"   
3    NaN  "Gebäudekomplex Johanniterkommende Ingen, Nied...   
4    NaN   "Gebäudekomplex Beginenhaus Kampen, Niederlande"   
..   ...                                                ...   
96   NaN  "Gebäudekomplex Templerkommende, dann Johannit...   
97   NaN  "Gebäudekomplex Zisterzienserkloster Sedletz (...   
98   NaN  "Gebäudekomplex Schwesterhaus St. Catharina, N...   
99   NaN  "Gebäudekomplex Franziskanerminoritenkloster L...   
100  NaN  "Gebäudekomplex Benediktinerpropstei Sankt Kat...   

                                                   Dde  \
0    "Gebäudekomplex Eger des Klarissenklosters Ege...   
1    "Gebäudekomplex Ratibor (Burgkapelle) des Koll...   
2       "Gebäudekomplex der Johanniterkommende Lychen"   
3    "Gebäu

|  | qid | Lde | Dde | Len | Den | P48 | S471 | P83 | S471.1 | P2 | P1301 | P131 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 |  | "Gebäudekomplex Klarissenkloster Eger (Cheb), ... | "Gebäudekomplex Eger des Klarissenklosters Ege... | "Building complex Benedicts of St. Michael, Me... | "Building complex of the Benedicts of St. Mich... | @50.077752/12.36862 | "11457" | Q102019 | "11457" | Q635758 | "GSMonasteryLocation16698" | Q153178 |
| 1 |  | "Gebäudekomplex Kollegiatstift Ratibor (Racibó... | "Gebäudekomplex Ratibor (Burgkapelle) des Koll... | "Building complex Benedictine monastery Vornba... | "Building complex Vornbach of the Benedictine ... | @50.09578/18.22114 | "3349" | Q21722 | "3349" | Q635758 | "GSMonasteryLocation1752" | Q153178 |
| 2 |  | "Gebäudekomplex Johanniterkommende Lychen" | "Gebäudekomplex der Johanniterkommende Lychen" | "Building complex Cistercian nunnery Ophoven (... | "Building complex Ophoven of the Cistercian nu... | @53.208969/13.315655 | "4502" | Q88340 | "4502" | Q635758 | "GSMonasteryLocation8714" | Q153178 |
| 3 |  | "Gebäudekomplex Johanniterkommende Ingen, Nied... | "Gebäudekomplex der Johanniterkommende Ingen, ... | "Building complex Capuchin friary Karlstadt" | "Building complex of the Capuchin friary Karls... | @51.9602244312605/5.48761203580791 | "8372" | Q1347411 | "8372" | Q635758 | "GSMonasteryLocation13567" | Q153178 |
| 4 |  | "Gebäudekomplex Beginenhaus Kampen, Niederlande" | "Gebäudekomplex des Beginenhauses Kampen, Nied... | "Building complex Benedictine monastery of Bas... | "Building complex of the Benedictine monastery... | @52.5544680656827/5.91940200615705 | "8374" | Q87014 | "8374" | Q635758 | "GSMonasteryLocation13569" | Q153178 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 96 |  | "Gebäudekomplex Templerkommende, dann Johannit... | "Gebäudekomplex der Templerkommende, dann Joha... |  |  | @50.51658173801344/7.305030721604308 | "40089" | Q82564 | "40089" | Q635758 | "GSMonasteryLocation5785" | Q153178 |
| 97 |  | "Gebäudekomplex Zisterzienserkloster Sedletz (... | "Gebäudekomplex Sedletz des Zisterzienserklost... |  |  | @49.96/15.290278 | "11227" | Q627082 | "11227" | Q635758 | "GSMonasteryLocation16147" | Q153178 |
| 98 |  | "Gebäudekomplex Schwesterhaus St. Catharina, N... | "Gebäudekomplex des Schwesterhauses St. Cathar... |  |  | @52.2230564143223/5.48477573772399 | "8481" | Q1349336 | "8481" | Q635758 | "GSMonasteryLocation13704" | Q153178 |
| 99 |  | "Gebäudekomplex Franziskanerminoritenkloster L... | "Gebäudekomplex Löwenberg des Franziskanermino... |  |  | @51.10948301890577/15.586388627441908 | "19937" | Q80833 | "19937" | Q635758 | "GSMonasteryLocation15151" | Q153178 |


In [20]:
print(df_to_qs_v1(final_table))

     qid                                                Lde  \
0    NaN  "Gebäudekomplex Klarissenkloster Eger (Cheb), ...   
1    NaN  "Gebäudekomplex Kollegiatstift Ratibor (Racibó...   
2    NaN         "Gebäudekomplex Johanniterkommende Lychen"   
3    NaN  "Gebäudekomplex Johanniterkommende Ingen, Nied...   
4    NaN   "Gebäudekomplex Beginenhaus Kampen, Niederlande"   
..   ...                                                ...   
96   NaN  "Gebäudekomplex Templerkommende, dann Johannit...   
97   NaN  "Gebäudekomplex Zisterzienserkloster Sedletz (...   
98   NaN  "Gebäudekomplex Schwesterhaus St. Catharina, N...   
99   NaN  "Gebäudekomplex Franziskanerminoritenkloster L...   
100  NaN  "Gebäudekomplex Benediktinerpropstei Sankt Kat...   

                                                   Dde  \
0    "Gebäudekomplex Eger des Klarissenklosters Ege...   
1    "Gebäudekomplex Ratibor (Burgkapelle) des Koll...   
2       "Gebäudekomplex der Johanniterkommende Lychen"   
3    "Gebäu

**Once you have uploaded the building complexes**, make sure to keep a table that maps each newly created Q-number to its GS Vocabulary Term (`P1301`). This mapping is needed in step 3.
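
One possible way to build such a mapping table is to re-run the building complex SPARQL query after the upload and convert its CSV result into a lookup from `P1301` to Q-number. The following is a minimal sketch, not part of the original workflow. It assumes the result has the columns `item` and `GSVocabTerm` (as in the query for `building_complexes_in_factgrid.csv`) and that `item` contains either an entity URI or a bare Q-number. The function name `build_qid_mapping`, the URI prefix, and the Q-numbers in the sample data are hypothetical placeholders.

```python
import csv
import io


def build_qid_mapping(query_result, out_file):
    """Read a SPARQL CSV result and write a P1301 -> Q-number lookup table."""
    mapping = {}
    for row in csv.DictReader(query_result):
        # Assumption: the item column holds an entity URI or a bare Q-number;
        # either way, the Q-number is the part after the last slash.
        qid = row["item"].rsplit("/", 1)[-1]
        mapping[row["GSVocabTerm"]] = qid

    # Write the lookup table so it can be reloaded in a later step.
    writer = csv.writer(out_file)
    writer.writerow(["P1301", "qid"])
    for term in sorted(mapping):
        writer.writerow([term, mapping[term]])
    return mapping


# Placeholder data: the Q-numbers below are invented for illustration only.
sample_result = io.StringIO(
    "item,GSVocabTerm\n"
    "https://database.factgrid.de/entity/Q9000001,GSMonasteryLocation16698\n"
    "https://database.factgrid.de/entity/Q9000002,GSMonasteryLocation1752\n"
)
out = io.StringIO()
mapping = build_qid_mapping(sample_result, out)
print(mapping["GSMonasteryLocation16698"])
```

In the notebook, the two `io.StringIO` objects would be replaced by open file handles for the downloaded query result and for the mapping file to be written. Note that the `P1301` values in the upload table are wrapped in double quotes for QuickStatements, so those quotes would need to be stripped before joining against this mapping.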