# Notebook 1: Creating Items for Building Complexes

This notebook implements the first step of the Klosterdatenbank-to-FactGrid workflow: creating items for the building complexes. It contains descriptive sections about the underlying data model and the workflow in general, as well as specific instructions for running the notebook. Markdown cells containing descriptive sections are marked as `#description`. Instructional sections are marked as `#instruction`.

Strictly speaking, the monastery database does not contain dedicated information on building complexes. Information on where a religious community had its place of operation is stored in the `gs_monastery_location` table. Each row of this table assigns a religious community (`gsn_id`) to a location (`place_id`) and, if known, to specific coordinates within this location (`longitude`, `latitude`). Such an assignment implies that the community lived or worked at this location at a certain point in time. Here we make the central assumption that a building complex of some kind, consisting of at least one building, must have existed at that location. Accordingly, each building complex created in this step represents two things: a row from the `gs_monastery_location` table, and thus the assignment of a monastery to a specific location, and the physical buildings in which the religious community worked. These buildings may have existed before or after this use and may have served other purposes.

## Preparations

The notebook requires the following libraries to run. If an error occurs, make sure the libraries are installed on your system.

In [109]:
import pandas as pd
import numpy as np
import os
import csv

First, the export files are loaded into [Dataframes](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html). The dataframes are stored in a dictionary with the keys being the filenames, for easier access.

In [110]:
# Load Access exports
from helper_functions import load_files_from_folder, query_factgrid

export_files = load_files_from_folder("data/exports_monasteryDB", "xlsx")

# Create dataframes for each table
dataframes = {key: pd.read_excel(value) for key, value in export_files.items()}

# Add dataframe for monasteries in factGrid (stored in a different directory)
dataframes["building_complexes_in_factgrid"] = query_factgrid("building_complexes")
dataframes["monasteries_in_factgrid"] = query_factgrid("monasteries")
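
The helper `load_files_from_folder` is imported from `helper_functions` and is not shown in this notebook. Judging from how its result is used above (a dictionary whose keys become table names and whose values are passed to `pd.read_excel`), it presumably maps file names without extension to file paths. The following is a minimal sketch under that assumption; the function name `load_files_from_folder_sketch` is hypothetical and the real helper may behave differently.

```python
import tempfile
from pathlib import Path


def load_files_from_folder_sketch(folder, extension):
    """Hypothetical stand-in: map each file stem in `folder` to its path."""
    folder_path = Path(folder)
    # Only files with the requested extension are picked up
    return {path.stem: str(path) for path in sorted(folder_path.glob(f"*.{extension}"))}


# Demonstration with empty placeholder files in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    for name in ("gs_monastery.xlsx", "gs_monastery_location.xlsx", "readme.txt"):
        (Path(tmp) / name).touch()
    files = load_files_from_folder_sketch(tmp, "xlsx")
    print(sorted(files))
```

With such a mapping, `dataframes["gs_monastery_location"]` resolves to the table exported as `gs_monastery_location.xlsx`.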

Since `gs_monastery_location` does not contain the names of the monasteries, the table is joined to `gs_monastery` to add the missing information. The resulting dataframe is filtered to contain only religious communities with the status "Online", meaning they are no longer being worked on, and is then reduced to the relevant columns. Finally, to make sure that no duplicate building complexes are created, the table is filtered against the existing building complexes in FactGrid.

In [111]:
# Merge gs_monastery_location and gs_monastery
merged_df = pd.merge(dataframes["gs_monastery_location"], dataframes["gs_monastery"], left_on='gsn_id', right_on='id_gsn', how='left')
# Filter for status 'online'
online_df = merged_df[merged_df["status"] == "Online"]
# Define columns to drop
drop_columns = [
    "relocated", 
    "comment", 
    "main_location", 
    "diocese_id", 
    "id_monastery", 
    "date_created", 
    "created_by_user", 
    "patrocinium",
    "selection", 
    "processing_status", 
    "gs_persons", 
    "selection_criteria", 
    "last_change", 
    "changed_by_user", 
    "founder"
]
# Prepare dataframe by dropping unnecessary columns
prepared_df = online_df.drop(drop_columns, axis="columns")
# Extract the numeric location IDs of the building complexes that already exist in FactGrid
existing_location_ids = dataframes["building_complexes_in_factgrid"]["GSVocabTerm"].str.split("Location").str[-1].astype(int)
already_in_factgrid = prepared_df["id_monastery_location"].isin(existing_location_ids)
# Count the matches before filtering; counting on the filtered dataframe would always yield 0
print(f"{already_in_factgrid.sum()} locations are already in FactGrid and filtered out")
prepared_df = prepared_df[~already_in_factgrid]
prepared_df

0 locations are already in FactGrid and filtered out


Unnamed: 0,Unnamed: 0_x,id_monastery_location,place_id,gsn_id,location_begin_tpq,location_begin_taq,location_begin_note,location_end_tpq,location_end_taq,location_end_note,longitude,latitude,location_name,Unnamed: 0_y,id_gsn,status,note,monastery_name
0,2,6053,11765,40358,1318,,,1412.0,,,7.300994,50.181608,Karden,1247,40358,Online,Die „untere Klause“ wurde 1318 als Klause für ...,Beginenhaus Karden (Untere Klause)
1,8,6059,11776,40364,1250,1298.0,zwischen 1250 und 1298,1312.0,,1312 oder kurz danach,6.641084,49.754683,Trier (Brotstraße),1253,40364,Online,1312 siedeln die Johanniter von ihrem ersten S...,Johanniterkommende Trier
2,29,7960,763,60036,1223,,,1806.0,,,10.886400,49.890400,,3395,60036,Online,Franziskaner-Observanten,Franziskanerkloster Bamberg
3,57,1926,13277,3514,1156,,erstmals erwähnt 1156,1802.0,,,6.444722,51.658333,,4773,3514,Online,auch: Frauenkloster vor dem Meertor. – Das Klo...,Benediktinerinnenkloster Hagenbusch (vor dem M...
4,80,16657,46479184,11426,1691,,,1784.0,,,14.393241,50.090289,"Prag, Hradschin",3332,11426,Online,Der Prager Bischof Johann Friedrich von Waldst...,Ursulinenkloster St. Johannes Nepomuk auf dem ...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
536,7959,17648,46479310,12033,1425,,,1796.0,1797.0,,5.241170,50.831850,,7231,12033,Online,,"Begardengemeinschaft Sint-Truiden, Belgien"
537,7971,17659,46484694,12042,1627,1629.0,1627/1629,1796.0,,,4.941530,50.811280,,7240,12042,Online,Von 1627 bis 1629 wurde ein Kloster für die Ge...,"Annunziatinnenkloster Tienen, Belgien"
538,8064,17768,46484885,12102,1372,,,1525.0,,,19.933330,54.466670,,7297,12102,Online,"1520 wurde das Gebäude zerstört, 1525 wurde da...",Augustinereremitenkloster Heiligenbeil (Mamono...
539,8137,17850,1594,12172,1332,,,1332.0,,,,,,7350,12172,Online,"Lediglich 1332 zweimal erwähnt, in einem Testa...","Schwesternsammlung ""im Haus der von Baldolzhei..."
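
The filter against existing FactGrid items can be illustrated on a toy example. This is only a sketch: the sample values are made up, and the `GSVocabTerm` format (a string ending in `Location<id>`) is an assumption inferred from the `str.split("Location")` call in the cell above.

```python
import pandas as pd

# Hypothetical sample data; the real GSVocabTerm values may look different
in_factgrid = pd.DataFrame({"GSVocabTerm": ["Location6053", "Location1926"]})
locations = pd.DataFrame({"id_monastery_location": [6053, 6059, 7960, 1926]})

# Take the part after "Location" and convert it to an integer ID
existing_ids = in_factgrid["GSVocabTerm"].str.split("Location").str[-1].astype(int)

# Boolean mask: True for locations that already have a FactGrid item
already_present = locations["id_monastery_location"].isin(existing_ids)

# Count first, then filter, so that the reported number reflects the removed rows
n_filtered = int(already_present.sum())
remaining = locations[~already_present]
print(f"{n_filtered} locations are already in FactGrid and filtered out")
```

Keeping the mask in a variable allows the count and the filter to use the same condition.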


To double-check for potential duplicates, the following cell finds building complexes that are connected to monasteries that already exist in FactGrid. If the resulting DataFrame is empty, all building complexes will be linked to newly created monastery items.

In [112]:
existing_monasteries = prepared_df[prepared_df["gsn_id"].isin(dataframes["monasteries_in_factgrid"]["KlosterdatenbankID"].astype(int))]
existing_monasteries

Unnamed: 0,Unnamed: 0_x,id_monastery_location,place_id,gsn_id,location_begin_tpq,location_begin_taq,location_begin_note,location_end_tpq,location_end_taq,location_end_note,longitude,latitude,location_name,Unnamed: 0_y,id_gsn,status,note,monastery_name
77,969,1859,46479312,3462,987,991.0,987 bis 991,1577.0,,,8.107778,48.895,,548,3462,Online,1481 von Papst Sixtus in ein Stift umgewandelt...,"Benediktinerkloster St. Peter und Paul, dann K..."
90,1186,3075,2177,20015,1128,,nach 1128,1535.0,,,9.315357,48.691892,,16,20015,Online,,Stift der Chorherren vom Heiligen Grab in Denk...
142,1791,1574,6305,3181,816,866.0,vor 866,1802.0,,,6.962556,50.94675,,3056,3181,Online,Nach einer Legende wurde das Stift durch Bisch...,"Kollegiatstift St. Kunibert, Köln"
156,2049,7795,13581,92445,1199,,um 1199,1535.0,,,13.452561,54.089309,,1281,92445,Online,Die Mönche kamen aus dem Zisterzienserkloster ...,Zisterzienserkloster Eldena (Hilda)
162,2134,1742,46479184,3339,1362,,,1503.0,1524.0,1503/1524,14.423622,50.070762,,4375,3339,Online,"Nach 1115 in Sadská, angeblich von Herzog Boři...","Kollegiatstift St. Apollinaris, Sadská, später..."
211,2823,1709,46479240,3309,668,678.0,um 673,1791.0,,,7.34158,48.54314,,562,3309,Online,Anfang des 18. Jahrhunderts nach Molsheim verl...,"Benediktinerabtei St. Trinitatis, dann Kollegi..."
213,2891,15952,46481709,10765,1201,1345.0,vor 1345,1558.0,,,26.716296,58.382192,Dorpat,5858,10765,Online,"Das Kloster ist 1345 erstmals belegt, vermutli...","Zisterzienserinnenkloster Dorpat (Tartu), Estland"
222,2989,912,46479089,844,1230,1240.0,ca. 1235,1530.0,1555.0,1530/55,10.405378,53.250724,,2638,844,Online,,Franziskanerkloster Lüneburg
230,3108,14028,46483264,3339,1115,1165.0,nach 1115,1362.0,,,,,,4375,3339,Online,"Nach 1115 in Sadská, angeblich von Herzog Boři...","Kollegiatstift St. Apollinaris, Sadská, später..."
251,3445,1328,13578,2061,1193,1243.0,vor 1243,1534.0,,,12.1444,54.09155,Rostock,4275,2061,Online,,Katharinenkloster Rostock (Franziskaner)


## Labels

It is expected that items in FactGrid have a label in at least one language. While the FactGrid ID (also referred to as the "Q-Number") uniquely identifies the item, the label serves to capture the name of the item in everyday language. The label is also indexed for text-based search. The naming of the items created in this project follows these rules:
- For the religious communities, the name from the monastery database is used as the label, for example "Zisterzienserkloster Georgenzell".
- For the building complexes, the labels are constructed according to the following schema: `Gebäudekomplex <monastery_name> [(<location_name>)]`. Here, `monastery_name` is again the name of the religious community from the `gs_monastery` table. `location_name` is a column of the `gs_monastery_location` table. In this column, if available, the specific name given to this location is stored. 

For example, the "Benediktinerinnenkloster Mielen, Sint-Truiden, Belgien" (GSN [11665](https://klosterdatenbank.adw-goe.de/gsn/11665)) has two locations in the Belgian town of Sint-Truiden, namely the location "Sint-Truiden" and the location "Metsteren" (see Figure 1). The constructed labels are then "Gebäudekomplex Benediktinerinnenkloster Mielen, Sint-Truiden, Belgien (Sint-Truiden)" and "Gebäudekomplex Benediktinerinnenkloster Mielen, Sint-Truiden, Belgien (Metsteren)". However, location names are not available in all such cases, which can lead to duplicate labels. These are displayed later in the workflow so that location names can be added to distinguish them.

<img src="documentation-images/Standorte GSN11665.png" alt="Monastery Locations of GSN 11665" width="500">

*Figure 1: Building complexes of the Benedictine nuns' monastery Mielen in Sint-Truiden, Belgium (GSN 11665). Base layer: OpenStreetMap.*

The following cell constructs the labels and descriptions and saves them in new columns called "Lde" and "Dde" (see [QuickStatements specification](https://www.wikidata.org/wiki/Help:QuickStatements#Adding_labels,_aliases,_descriptions_and_sitelinks)).

In [113]:
from helper_functions import construct_description
# 1. Create new column with labels
prepared_df['Lde'] = "Gebäudekomplex " + prepared_df["monastery_name"].str.cat(prepared_df["location_name"].fillna(''), sep=" (") +")"
for index, row in prepared_df.iterrows():
    prepared_df.loc[index, "Dde"] = construct_description(row["location_name"], row["monastery_name"], row["location_begin_taq"], row["location_begin_tpq"], row["location_end_taq"], row["location_end_tpq"])
# 2. If necessary, delete empty brackets at end of labels
prepared_df['Lde'] = prepared_df["Lde"].str.replace(r'\(\)', '', regex=True).apply(lambda x: f'\"{x.strip()}\"')
prepared_df["Dde"] = prepared_df["Dde"].apply(lambda x:f'\"{x}\"')
prepared_df

Unnamed: 0,Unnamed: 0_x,id_monastery_location,place_id,gsn_id,location_begin_tpq,location_begin_taq,location_begin_note,location_end_tpq,location_end_taq,location_end_note,longitude,latitude,location_name,Unnamed: 0_y,id_gsn,status,note,monastery_name,Lde,Dde
0,2,6053,11765,40358,1318,,,1412.0,,,7.300994,50.181608,Karden,1247,40358,Online,Die „untere Klause“ wurde 1318 als Klause für ...,Beginenhaus Karden (Untere Klause),"""Gebäudekomplex Beginenhaus Karden (Untere Kla...","""Gebäudekomplex Karden des Beginenhauses Karde..."
1,8,6059,11776,40364,1250,1298.0,zwischen 1250 und 1298,1312.0,,1312 oder kurz danach,6.641084,49.754683,Trier (Brotstraße),1253,40364,Online,1312 siedeln die Johanniter von ihrem ersten S...,Johanniterkommende Trier,"""Gebäudekomplex Johanniterkommende Trier (Trie...","""Gebäudekomplex Trier (Brotstraße) der Johanni..."
2,29,7960,763,60036,1223,,,1806.0,,,10.886400,49.890400,,3395,60036,Online,Franziskaner-Observanten,Franziskanerkloster Bamberg,"""Gebäudekomplex Franziskanerkloster Bamberg""","""Gebäudekomplex des Franziskanerklosters Bamberg"""
3,57,1926,13277,3514,1156,,erstmals erwähnt 1156,1802.0,,,6.444722,51.658333,,4773,3514,Online,auch: Frauenkloster vor dem Meertor. – Das Klo...,Benediktinerinnenkloster Hagenbusch (vor dem M...,"""Gebäudekomplex Benediktinerinnenkloster Hagen...","""Gebäudekomplex des Benediktinerinnenklosters ..."
4,80,16657,46479184,11426,1691,,,1784.0,,,14.393241,50.090289,"Prag, Hradschin",3332,11426,Online,Der Prager Bischof Johann Friedrich von Waldst...,Ursulinenkloster St. Johannes Nepomuk auf dem ...,"""Gebäudekomplex Ursulinenkloster St. Johannes ...","""Gebäudekomplex Prag, Hradschin des Ursulinenk..."
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
536,7959,17648,46479310,12033,1425,,,1796.0,1797.0,,5.241170,50.831850,,7231,12033,Online,,"Begardengemeinschaft Sint-Truiden, Belgien","""Gebäudekomplex Begardengemeinschaft Sint-Trui...","""Gebäudekomplex des Begardengemeinschaft Sint-..."
537,7971,17659,46484694,12042,1627,1629.0,1627/1629,1796.0,,,4.941530,50.811280,,7240,12042,Online,Von 1627 bis 1629 wurde ein Kloster für die Ge...,"Annunziatinnenkloster Tienen, Belgien","""Gebäudekomplex Annunziatinnenkloster Tienen, ...","""Gebäudekomplex des Annunziatinnenklosters Tie..."
538,8064,17768,46484885,12102,1372,,,1525.0,,,19.933330,54.466670,,7297,12102,Online,"1520 wurde das Gebäude zerstört, 1525 wurde da...",Augustinereremitenkloster Heiligenbeil (Mamono...,"""Gebäudekomplex Augustinereremitenkloster Heil...","""Gebäudekomplex des Augustinereremitenklosters..."
539,8137,17850,1594,12172,1332,,,1332.0,,,,,,7350,12172,Online,"Lediglich 1332 zweimal erwähnt, in einem Testa...","Schwesternsammlung ""im Haus der von Baldolzhei...","""Gebäudekomplex Schwesternsammlung ""im Haus de...","""Gebäudekomplex der Schwesternsammlung ""im Hau..."
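
The label schema `Gebäudekomplex <monastery_name> [(<location_name>)]` can also be written as a small function, which makes the optional bracket explicit. This is a sketch for illustration, not the notebook's implementation: the function name `build_label` is hypothetical, and the surrounding double quotes that the cell above adds for QuickStatements are left out.

```python
import pandas as pd


def build_label(monastery_name, location_name=None):
    """Build a German building-complex label; the bracket is added only if a location name exists."""
    label = f"Gebäudekomplex {monastery_name}"
    # Missing values (None or NaN) and empty strings produce no bracket
    if location_name is not None and not pd.isna(location_name) and str(location_name).strip():
        label += f" ({str(location_name).strip()})"
    return label


print(build_label("Benediktinerinnenkloster Mielen, Sint-Truiden, Belgien", "Metsteren"))
print(build_label("Franziskanerkloster Bamberg", float("nan")))
```

Handling the missing location name up front avoids having to remove empty brackets afterwards.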


As mentioned above, there might be duplicate labels in cases where locations don't have an explicit name. Since they can still be distinguished from one another by their identifier and coordinates, this is not necessarily a problem. However, the following cell creates a list of all duplicate labels so that they can be examined.

**In order to resolve the duplicates**

1. Open and inspect the table located at `data/intermediate_results/duplicate_building_complex_labels.xlsx`
2. Add location names in the monastery database
3. Create new exports from the monastery database and replace `data/exports_monasteryDB/gs_monastery.xlsx` and `data/exports_monasteryDB/gs_monastery_location.xlsx` with the new files
4. Re-run the notebook. The cell below now should no longer contain the duplicates you resolved. 

In [114]:
duplicated_building_complex_labels = prepared_df[prepared_df.duplicated(subset="Lde", keep=False)]
duplicated_building_complex_labels.to_excel('data/intermediate_results/duplicate_building_complex_labels.xlsx')
duplicated_building_complex_labels

Unnamed: 0,Unnamed: 0_x,id_monastery_location,place_id,gsn_id,location_begin_tpq,location_begin_taq,location_begin_note,location_end_tpq,location_end_taq,location_end_note,longitude,latitude,location_name,Unnamed: 0_y,id_gsn,status,note,monastery_name,Lde,Dde
38,529,17219,46483862,11776,1513,,,1523.0,,,,,Antwerpen,7018,11776,Online,Lokalisierung nach Ortsmittelpunkt.,"Augustinereremitenkloster Antwerpen (Anvers), ...","""Gebäudekomplex Augustinereremitenkloster Antw...","""Gebäudekomplex Antwerpen des Augustinereremit..."
39,530,17220,46484698,11776,1601,1700.0,17. Jahrhundert,1796.0,,,,,Antwerpen,7018,11776,Online,Lokalisierung nach Ortsmittelpunkt.,"Augustinereremitenkloster Antwerpen (Anvers), ...","""Gebäudekomplex Augustinereremitenkloster Antw...","""Gebäudekomplex Antwerpen des Augustinereremit..."
42,554,17246,46483268,11802,1146,,,1559.0,,,3.713445,51.054329,Gent,6928,11802,Online,Das Augustinerdoppelstift der Brüder und Schwe...,Stift der Brüder und Schwestern des gemeinsame...,"""Gebäudekomplex Stift der Brüder und Schwester...","""Gebäudekomplex Gent des Stift der Brüder und ..."
44,559,17253,46484718,11802,1626,1630.0,1626/1630,1796.0,,,3.713445,51.054329,Gent,6928,11802,Online,Das Augustinerdoppelstift der Brüder und Schwe...,Stift der Brüder und Schwestern des gemeinsame...,"""Gebäudekomplex Stift der Brüder und Schwester...","""Gebäudekomplex Gent des Stift der Brüder und ..."
45,560,17254,46484718,11802,1559,,,1626.0,1630.0,1626/1630,3.713445,51.054329,Gent,6928,11802,Online,Das Augustinerdoppelstift der Brüder und Schwe...,Stift der Brüder und Schwestern des gemeinsame...,"""Gebäudekomplex Stift der Brüder und Schwester...","""Gebäudekomplex Gent des Stift der Brüder und ..."
46,565,17259,46484728,11809,1228,,,1559.0,,,3.936796,50.793288,Grimminge,6935,11809,Online,,"Zisterzienserinnenkloster Beaupré, Grimminge, ...","""Gebäudekomplex Zisterzienserinnenkloster Beau...","""Gebäudekomplex Grimminge des Zisterzienserinn..."
47,584,17279,46484729,11809,1559,,,1795.0,,,3.936796,50.793288,Grimminge,6935,11809,Online,,"Zisterzienserinnenkloster Beaupré, Grimminge, ...","""Gebäudekomplex Zisterzienserinnenkloster Beau...","""Gebäudekomplex Grimminge des Zisterzienserinn..."
48,602,17301,46483837,8534,1559,,,1784.0,,,5.991389,51.195556,,5817,8534,Online,"Tertiarinnen 1344 (vermutlich Beginen), Transf...","Franziskanertertiarinnenkloster Roermond, Nied...","""Gebäudekomplex Franziskanertertiarinnenkloste...","""Gebäudekomplex des Franziskanertertiarinnenkl..."
105,1331,13477,46479171,8315,1580,,,1611.0,,,5.31,51.686944,,5764,8315,Online,Für den Standort in Hoogstraten Lokalisierung ...,"Klarissenkloster Boxtel, Niederlande","""Gebäudekomplex Klarissenkloster Boxtel, Niede...","""Gebäudekomplex des Klarissenklosters Boxtel, ..."
106,1332,13478,46479122,8315,1472,,,1580.0,,,5.325598,51.589158,,5764,8315,Online,Für den Standort in Hoogstraten Lokalisierung ...,"Klarissenkloster Boxtel, Niederlande","""Gebäudekomplex Klarissenkloster Boxtel, Niede...","""Gebäudekomplex des Klarissenklosters Boxtel, ..."
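
The cell above relies on `keep=False`, which marks every member of a duplicate group rather than only the repeats. A toy example with made-up rows shows the difference to the default behaviour:

```python
import pandas as pd

# Made-up sample: two locations share a label, one label is unique
labels = pd.DataFrame({
    "id_monastery_location": [13477, 13478, 7960],
    "Lde": ["Label A", "Label A", "Label B"],
})

# keep=False flags all rows of a duplicate group, so both locations can be inspected
all_members = labels[labels.duplicated(subset="Lde", keep=False)]

# The default (keep="first") flags only the second and later occurrences
repeats_only = labels[labels.duplicated(subset="Lde")]

print(all_members)
print(repeats_only)
```

For a manual review, seeing all members of a group is usually what is needed, since any of them may be the one that should receive a location name.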


### Translation of Labels

FactGrid is a multilingual platform. Therefore, the labels for the monasteries and building complexes should be created not only in German but also in English. Due to the heterogeneity of the monastery names in the database, a rule-based translation is difficult to implement. Instead, a large language model was used. The model, the prompting, and further details of the translation are described in the notebook "1a - Translation". We are using the [GWDG/KISSKI API](https://docs.hpc.gwdg.de/services/chat-ai/index.html), so a [SAIA API key](https://docs.hpc.gwdg.de/services/saia/index.html) is needed to execute that notebook. Since the translation process can take some time, it has been moved to a separate notebook.

In [115]:
to_translate = prepared_df[["monastery_name", 'Lde', 'Dde', "note"]].copy()
to_translate = to_translate.rename(columns={"Lde": "building_Lde", "Dde": "building_Dde", "monastery_name" : "monastery_Lde", "note": "monastery_Dde"})
to_translate.to_csv("data/translation/to_translate.csv")
to_translate

Unnamed: 0,monastery_Lde,building_Lde,building_Dde,monastery_Dde
0,Beginenhaus Karden (Untere Klause),"""Gebäudekomplex Beginenhaus Karden (Untere Kla...","""Gebäudekomplex Karden des Beginenhauses Karde...",Die „untere Klause“ wurde 1318 als Klause für ...
1,Johanniterkommende Trier,"""Gebäudekomplex Johanniterkommende Trier (Trie...","""Gebäudekomplex Trier (Brotstraße) der Johanni...",1312 siedeln die Johanniter von ihrem ersten S...
2,Franziskanerkloster Bamberg,"""Gebäudekomplex Franziskanerkloster Bamberg""","""Gebäudekomplex des Franziskanerklosters Bamberg""",Franziskaner-Observanten
3,Benediktinerinnenkloster Hagenbusch (vor dem M...,"""Gebäudekomplex Benediktinerinnenkloster Hagen...","""Gebäudekomplex des Benediktinerinnenklosters ...",auch: Frauenkloster vor dem Meertor. – Das Klo...
4,Ursulinenkloster St. Johannes Nepomuk auf dem ...,"""Gebäudekomplex Ursulinenkloster St. Johannes ...","""Gebäudekomplex Prag, Hradschin des Ursulinenk...",Der Prager Bischof Johann Friedrich von Waldst...
...,...,...,...,...
536,"Begardengemeinschaft Sint-Truiden, Belgien","""Gebäudekomplex Begardengemeinschaft Sint-Trui...","""Gebäudekomplex des Begardengemeinschaft Sint-...",
537,"Annunziatinnenkloster Tienen, Belgien","""Gebäudekomplex Annunziatinnenkloster Tienen, ...","""Gebäudekomplex des Annunziatinnenklosters Tie...",Von 1627 bis 1629 wurde ein Kloster für die Ge...
538,Augustinereremitenkloster Heiligenbeil (Mamono...,"""Gebäudekomplex Augustinereremitenkloster Heil...","""Gebäudekomplex des Augustinereremitenklosters...","1520 wurde das Gebäude zerstört, 1525 wurde da..."
539,"Schwesternsammlung ""im Haus der von Baldolzhei...","""Gebäudekomplex Schwesternsammlung ""im Haus de...","""Gebäudekomplex der Schwesternsammlung ""im Hau...","Lediglich 1332 zweimal erwähnt, in einem Testa..."


After executing the above cell, a table is generated in `data/translation` that contains all terms that should be translated: `to_translate.csv`. Execute Notebook 1a. Once the execution is completed, there should be a file named `translated.csv` that contains the translations within the `data/translation` folder. Once the file exists, you can run the next cell to load the translated labels.

In [116]:
translated = pd.read_csv("data/translation/translated.csv")
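# Normalize the German labels: strip whitespace and surrounding quotes, then wrap them in exactly one pair of double quotes
translated["building_Lde"] = translated["building_Lde"].str.strip().str.strip('"').apply(lambda x: f'"{x}"' if not pd.isna(x) else np.nan)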
prepared_df = pd.merge(prepared_df, translated[["building_Lde", "building_Len"]], how="left", left_on="Lde", right_on="building_Lde").drop_duplicates()
prepared_df.rename(columns={"building_Len":"Len"}, inplace=True)
prepared_df["Len"] = prepared_df["Len"].apply(lambda x:f'\"{x}\"' if not pd.isna(x) else np.nan)
prepared_df

Unnamed: 0,Unnamed: 0_x,id_monastery_location,place_id,gsn_id,location_begin_tpq,location_begin_taq,location_begin_note,location_end_tpq,location_end_taq,location_end_note,...,location_name,Unnamed: 0_y,id_gsn,status,note,monastery_name,Lde,Dde,building_Lde,Len
0,2,6053,11765,40358,1318,,,1412.0,,,...,Karden,1247,40358,Online,Die „untere Klause“ wurde 1318 als Klause für ...,Beginenhaus Karden (Untere Klause),"""Gebäudekomplex Beginenhaus Karden (Untere Kla...","""Gebäudekomplex Karden des Beginenhauses Karde...","""Gebäudekomplex Beginenhaus Karden (Untere Kla...","""Building complex of the Beguines Karden (Lowe..."
1,8,6059,11776,40364,1250,1298.0,zwischen 1250 und 1298,1312.0,,1312 oder kurz danach,...,Trier (Brotstraße),1253,40364,Online,1312 siedeln die Johanniter von ihrem ersten S...,Johanniterkommende Trier,"""Gebäudekomplex Johanniterkommende Trier (Trie...","""Gebäudekomplex Trier (Brotstraße) der Johanni...","""Gebäudekomplex Johanniterkommende Trier (Trie...","""Building complex of the Knights Hospitallers ..."
2,29,7960,763,60036,1223,,,1806.0,,,...,,3395,60036,Online,Franziskaner-Observanten,Franziskanerkloster Bamberg,"""Gebäudekomplex Franziskanerkloster Bamberg""","""Gebäudekomplex des Franziskanerklosters Bamberg""","""Gebäudekomplex Franziskanerkloster Bamberg""","""Building complex of the Franciscans Bamberg"""
3,57,1926,13277,3514,1156,,erstmals erwähnt 1156,1802.0,,,...,,4773,3514,Online,auch: Frauenkloster vor dem Meertor. – Das Klo...,Benediktinerinnenkloster Hagenbusch (vor dem M...,"""Gebäudekomplex Benediktinerinnenkloster Hagen...","""Gebäudekomplex des Benediktinerinnenklosters ...","""Gebäudekomplex Benediktinerinnenkloster Hagen...","""Building complex of the Benedictine nuns Hage..."
4,80,16657,46479184,11426,1691,,,1784.0,,,...,"Prag, Hradschin",3332,11426,Online,Der Prager Bischof Johann Friedrich von Waldst...,Ursulinenkloster St. Johannes Nepomuk auf dem ...,"""Gebäudekomplex Ursulinenkloster St. Johannes ...","""Gebäudekomplex Prag, Hradschin des Ursulinenk...","""Gebäudekomplex Ursulinenkloster St. Johannes ...","""Building complex of the Ursuline monastery of..."
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
604,7959,17648,46479310,12033,1425,,,1796.0,1797.0,,...,,7231,12033,Online,,"Begardengemeinschaft Sint-Truiden, Belgien","""Gebäudekomplex Begardengemeinschaft Sint-Trui...","""Gebäudekomplex des Begardengemeinschaft Sint-...","""Gebäudekomplex Begardengemeinschaft Sint-Trui...","""Building complex Beghard community of Sint-Tr..."
605,7971,17659,46484694,12042,1627,1629.0,1627/1629,1796.0,,,...,,7240,12042,Online,Von 1627 bis 1629 wurde ein Kloster für die Ge...,"Annunziatinnenkloster Tienen, Belgien","""Gebäudekomplex Annunziatinnenkloster Tienen, ...","""Gebäudekomplex des Annunziatinnenklosters Tie...","""Gebäudekomplex Annunziatinnenkloster Tienen, ...","""Building complex Sisters of the Annunciation ..."
606,8064,17768,46484885,12102,1372,,,1525.0,,,...,,7297,12102,Online,"1520 wurde das Gebäude zerstört, 1525 wurde da...",Augustinereremitenkloster Heiligenbeil (Mamono...,"""Gebäudekomplex Augustinereremitenkloster Heil...","""Gebäudekomplex des Augustinereremitenklosters...","""Gebäudekomplex Augustinereremitenkloster Heil...","""Building complex Austin Friars of Heiligenbei..."
607,8137,17850,1594,12172,1332,,,1332.0,,,...,,7350,12172,Online,"Lediglich 1332 zweimal erwähnt, in einem Testa...","Schwesternsammlung ""im Haus der von Baldolzhei...","""Gebäudekomplex Schwesternsammlung ""im Haus de...","""Gebäudekomplex der Schwesternsammlung ""im Hau...","""Gebäudekomplex Schwesternsammlung ""im Haus de...","""Building complex Women's convent ""in the hous..."
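
The index labels in the output above run to 607, while the earlier tables ended at 539. This indicates that the merge on the label column produced more rows than the input before `drop_duplicates` was applied, which can happen when the same German label occurs several times on both sides. The following toy example sketches that effect and one possible way to guard against it, namely deduplicating the translation table on the merge key and using the `validate` argument of `pd.merge`. The sample data are made up, and whether keeping the first translation is appropriate for this project is an assumption.

```python
import pandas as pd

# Made-up sample: label "A" occurs twice on both sides, with differing translations
left = pd.DataFrame({"id": [1, 2, 3], "Lde": ["A", "A", "B"]})
right = pd.DataFrame({
    "building_Lde": ["A", "A", "B"],
    "building_Len": ["A-en", "A-en2", "B-en"],
})

# Naive left merge: each "A" row on the left matches both "A" rows on the right
naive = pd.merge(left, right, how="left", left_on="Lde", right_on="building_Lde")
naive_deduplicated = naive.drop_duplicates()  # does not help, the rows differ in the translation

# Keep one translation per label, then let pandas verify that the right-hand keys are unique
unique_right = right.drop_duplicates(subset="building_Lde")
stable = pd.merge(
    left, unique_right, how="left",
    left_on="Lde", right_on="building_Lde",
    validate="many_to_one",
)

print(len(left), len(naive), len(naive_deduplicated), len(stable))
```

With `validate="many_to_one"`, pandas raises an error instead of silently multiplying rows if the right-hand key is not unique.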


### OPTIONAL: If working with a pre-translated file
If you are uploading a lot of monasteries at once, it may be useful to translate all of them in batches before you run this notebook. If you do so, the following cell double-checks for missing translations so that they can be added afterwards. A new file `to_translate.csv` will be created. Execute Notebook 1a to translate the missing labels, then append the rows of the resulting CSV to the end of `translated.csv`.

In [117]:
missing_label_translations = prepared_df[prepared_df["Len"].isna()]
to_translate = missing_label_translations[["monastery_name", 'Lde', 'Dde', "note"]].copy()
to_translate = to_translate.rename(columns={"Lde": "building_Lde", "Dde": "building_Dde", "monastery_name" : "monastery_Lde", "note": "monastery_Dde"})
to_translate.to_csv("data/translation/to_translate.csv")
to_translate

Unnamed: 0,monastery_Lde,building_Lde,building_Dde,monastery_Dde
465,Franziskanerminoritenkloster Pardubitz (Pardub...,"""Gebäudekomplex Franziskanerminoritenkloster P...","""Gebäudekomplex Pardubitz_x000d_\n\t_x000d_\nP...",Das Kloster wurde 1514/1515 bei der Kirche St....


Once the missing translations have been added, rerun the notebook.
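
Appending the newly translated rows can be done by hand or with pandas. The following is a minimal sketch under several assumptions: the file name `translated_missing.csv` is hypothetical, both files are assumed to share the same columns, and the files are written without an index column. It uses a temporary directory so that it does not touch the project data.

```python
import tempfile
from pathlib import Path

import pandas as pd

with tempfile.TemporaryDirectory() as tmp:
    translated_path = Path(tmp) / "translated.csv"
    missing_path = Path(tmp) / "translated_missing.csv"  # hypothetical file name

    # Made-up stand-ins for the existing and the newly translated rows
    pd.DataFrame({"building_Lde": ["A"], "building_Len": ["A-en"]}).to_csv(translated_path, index=False)
    pd.DataFrame({"building_Lde": ["B"], "building_Len": ["B-en"]}).to_csv(missing_path, index=False)

    # Append the new rows; if a label occurs twice, keep the most recent translation
    combined = pd.concat(
        [pd.read_csv(translated_path), pd.read_csv(missing_path)],
        ignore_index=True,
    ).drop_duplicates(subset="building_Lde", keep="last")
    combined.to_csv(translated_path, index=False)

    result = pd.read_csv(translated_path)
    print(result)
```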

## Geocoordinates

Our data model separates religious communities from the building complexes in which they lived and worked. In this model, the geocoordinates of a religious community's location are properties of the building complex. The monastery database localizes a monastery location at one of two levels of accuracy: the coordinates either represent the exact point where the building was located, or the central point of the place, e.g. a village, in which it was located. Note that a centroid-based location is only ever an approximation of the centroid of the modern place. In cases where the exact location of the building complex is unknown, the respective item will not be linked to any coordinates. Instead, the coordinates of the place where it is located should be queried. In all other cases, the coordinates are directly linked to the building complexes, using values from the `latitude` and `longitude` columns as [P48](https://database.factgrid.de/wiki/Property:P48).

In [118]:
for index, row in prepared_df.iterrows():
    if (not pd.isna(row["latitude"])) and (not pd.isna(row["longitude"])):
        prepared_df.loc[index, "P48"] = f'@{row["latitude"]}/{row["longitude"]}'
prepared_df

Unnamed: 0,Unnamed: 0_x,id_monastery_location,place_id,gsn_id,location_begin_tpq,location_begin_taq,location_begin_note,location_end_tpq,location_end_taq,location_end_note,...,Unnamed: 0_y,id_gsn,status,note,monastery_name,Lde,Dde,building_Lde,Len,P48
0,2,6053,11765,40358,1318,,,1412.0,,,...,1247,40358,Online,Die „untere Klause“ wurde 1318 als Klause für ...,Beginenhaus Karden (Untere Klause),"""Gebäudekomplex Beginenhaus Karden (Untere Kla...","""Gebäudekomplex Karden des Beginenhauses Karde...","""Gebäudekomplex Beginenhaus Karden (Untere Kla...","""Building complex of the Beguines Karden (Lowe...",@50.181608/7.300994
1,8,6059,11776,40364,1250,1298.0,zwischen 1250 und 1298,1312.0,,1312 oder kurz danach,...,1253,40364,Online,1312 siedeln die Johanniter von ihrem ersten S...,Johanniterkommende Trier,"""Gebäudekomplex Johanniterkommende Trier (Trie...","""Gebäudekomplex Trier (Brotstraße) der Johanni...","""Gebäudekomplex Johanniterkommende Trier (Trie...","""Building complex of the Knights Hospitallers ...",@49.7546831954651/6.64108406805845
2,29,7960,763,60036,1223,,,1806.0,,,...,3395,60036,Online,Franziskaner-Observanten,Franziskanerkloster Bamberg,"""Gebäudekomplex Franziskanerkloster Bamberg""","""Gebäudekomplex des Franziskanerklosters Bamberg""","""Gebäudekomplex Franziskanerkloster Bamberg""","""Building complex of the Franciscans Bamberg""",@49.8904/10.8864
3,57,1926,13277,3514,1156,,erstmals erwähnt 1156,1802.0,,,...,4773,3514,Online,auch: Frauenkloster vor dem Meertor. – Das Klo...,Benediktinerinnenkloster Hagenbusch (vor dem M...,"""Gebäudekomplex Benediktinerinnenkloster Hagen...","""Gebäudekomplex des Benediktinerinnenklosters ...","""Gebäudekomplex Benediktinerinnenkloster Hagen...","""Building complex of the Benedictine nuns Hage...",@51.658333/6.444722
4,80,16657,46479184,11426,1691,,,1784.0,,,...,3332,11426,Online,Der Prager Bischof Johann Friedrich von Waldst...,Ursulinenkloster St. Johannes Nepomuk auf dem ...,"""Gebäudekomplex Ursulinenkloster St. Johannes ...","""Gebäudekomplex Prag, Hradschin des Ursulinenk...","""Gebäudekomplex Ursulinenkloster St. Johannes ...","""Building complex of the Ursuline monastery of...",@50.090289/14.393241
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
604,7959,17648,46479310,12033,1425,,,1796.0,1797.0,,...,7231,12033,Online,,"Begardengemeinschaft Sint-Truiden, Belgien","""Gebäudekomplex Begardengemeinschaft Sint-Trui...","""Gebäudekomplex des Begardengemeinschaft Sint-...","""Gebäudekomplex Begardengemeinschaft Sint-Trui...","""Building complex Beghard community of Sint-Tr...",@50.83185/5.24117
605,7971,17659,46484694,12042,1627,1629.0,1627/1629,1796.0,,,...,7240,12042,Online,Von 1627 bis 1629 wurde ein Kloster für die Ge...,"Annunziatinnenkloster Tienen, Belgien","""Gebäudekomplex Annunziatinnenkloster Tienen, ...","""Gebäudekomplex des Annunziatinnenklosters Tie...","""Gebäudekomplex Annunziatinnenkloster Tienen, ...","""Building complex Sisters of the Annunciation ...",@50.81128/4.94153
606,8064,17768,46484885,12102,1372,,,1525.0,,,...,7297,12102,Online,"1520 wurde das Gebäude zerstört, 1525 wurde da...",Augustinereremitenkloster Heiligenbeil (Mamono...,"""Gebäudekomplex Augustinereremitenkloster Heil...","""Gebäudekomplex des Augustinereremitenklosters...","""Gebäudekomplex Augustinereremitenkloster Heil...","""Building complex Austin Friars of Heiligenbei...",@54.46667/19.93333
607,8137,17850,1594,12172,1332,,,1332.0,,,...,7350,12172,Online,"Lediglich 1332 zweimal erwähnt, in einem Testa...","Schwesternsammlung ""im Haus der von Baldolzhei...","""Gebäudekomplex Schwesternsammlung ""im Haus de...","""Gebäudekomplex der Schwesternsammlung ""im Hau...","""Gebäudekomplex Schwesternsammlung ""im Haus de...","""Building complex Women's convent ""in the hous...",
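
The row-wise loop above can also be expressed without iteration. The following sketch builds the same `@<latitude>/<longitude>` string for all rows at once and then blanks out the rows where a coordinate is missing. The sample values are taken from the table above, with one made-up row that lacks a latitude.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "latitude": [50.181608, np.nan, 49.8904],
    "longitude": [7.300994, 8.0, 10.8864],
})

# True only where both coordinates are present
has_coords = df["latitude"].notna() & df["longitude"].notna()

# Build the string for every row, then keep it only where both coordinates exist
coordinate_strings = "@" + df["latitude"].astype(str) + "/" + df["longitude"].astype(str)
df["P48"] = coordinate_strings.where(has_coords)

print(df)
```

Rows without complete coordinates end up with a missing value in `P48`, as in the loop version.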


### Duplicates of Coordinates

In some cases, the database contains duplicate coordinates. If the coordinates of two building complexes are exactly the same, we consider the building complexes to be identical, so only one item should be created. This can happen in two situations:

1. If a religious community returns to a previously inhabited building complex.
2. If the diocese in which the building complex was located changes. In this case, the change of diocese is represented as a new monastery location with identical coordinates in the monastery database.

In both cases, only one item should be created. In case 1, this item needs to be linked to the respective religious community two or more times. This is handled by Notebook 3. In case 2, it should only be linked once, but it should have two dioceses linked to reflect the change in diocese. This is handled in the "Dioceses" section of this notebook. In the next cell, a list of all coordinate duplicates is created for future use.

In [119]:
# Find occurrences of identical coordinates
coord_duplicates = (
    prepared_df[prepared_df.duplicated(subset="P48", keep=False)]
    .dropna(subset="P48")
    .drop_duplicates(subset="id_monastery_location", keep=False)
)
# Keep only the first row per coordinate, so that one item is created per building complex
prepared_df.drop_duplicates(subset="P48", inplace=True)
coord_duplicates

Unnamed: 0,Unnamed: 0_x,id_monastery_location,place_id,gsn_id,location_begin_tpq,location_begin_taq,location_begin_note,location_end_tpq,location_end_taq,location_end_note,...,Unnamed: 0_y,id_gsn,status,note,monastery_name,Lde,Dde,building_Lde,Len,P48
53,565,17259,46484728,11809,1228,,,1559.0,,,...,6935,11809,Online,,"Zisterzienserinnenkloster Beaupré, Grimminge, ...","""Gebäudekomplex Zisterzienserinnenkloster Beau...","""Gebäudekomplex Grimminge des Zisterzienserinn...","""Gebäudekomplex Zisterzienserinnenkloster Beau...","""Building complex Cistercian nunnery Beaupré, ...",@50.79328826061696/3.936796476940749
55,584,17279,46484729,11809,1559,,,1795.0,,,...,6935,11809,Online,,"Zisterzienserinnenkloster Beaupré, Grimminge, ...","""Gebäudekomplex Zisterzienserinnenkloster Beau...","""Gebäudekomplex Grimminge des Zisterzienserinn...","""Gebäudekomplex Zisterzienserinnenkloster Beau...","""Building complex Cistercian nunnery Beaupré, ...",@50.79328826061696/3.936796476940749
57,602,17301,46483837,8534,1559,,,1784.0,,,...,5817,8534,Online,"Tertiarinnen 1344 (vermutlich Beginen), Transf...","Franziskanertertiarinnenkloster Roermond, Nied...","""Gebäudekomplex Franziskanertertiarinnenkloste...","""Gebäudekomplex des Franziskanertertiarinnenkl...","""Gebäudekomplex Franziskanertertiarinnenkloste...","""Building complex of the Tertiaries Roermond, ...",@51.19555556/5.991388889
120,1332,13478,46479122,8315,1472,,,1580.0,,,...,5764,8315,Online,Für den Standort in Hoogstraten Lokalisierung ...,"Klarissenkloster Boxtel, Niederlande","""Gebäudekomplex Klarissenkloster Boxtel, Niede...","""Gebäudekomplex des Klarissenklosters Boxtel, ...","""Gebäudekomplex Klarissenkloster Boxtel, Niede...","""Building complex of the Clarissine nunnery Bo...",@51.5891579239149/5.32559784224639
124,1333,13479,46479122,8315,1611,,,1717.0,,,...,5764,8315,Online,Für den Standort in Hoogstraten Lokalisierung ...,"Klarissenkloster Boxtel, Niederlande","""Gebäudekomplex Klarissenkloster Boxtel, Niede...","""Gebäudekomplex des Klarissenklosters Boxtel, ...","""Gebäudekomplex Klarissenkloster Boxtel, Niede...","""Building complex of the Clarissine nunnery Bo...",@51.5891579239149/5.32559784224639
177,1971,5579,22164,30259,1612,,,1773.0,,,...,1046,30259,Online,Zuvor Brüder vom gemeinsamen Leben bzw. August...,Jesuitenniederlassung Marienthal,"""Gebäudekomplex Jesuitenniederlassung Marienthal""","""Gebäudekomplex der Jesuitenniederlassung Mari...","""Gebäudekomplex Jesuitenniederlassung Marienthal""","""Building complex Jesuits of Marienthal""",@50.01081337/7.947011758
187,2109,1655,22164,3255,1568,,,1587.0,,,...,1158,3255,Online,1568–1587 sind aus Pfaffenschwabenheim vertrie...,Augustinerpriorat Marienthal,"""Gebäudekomplex Augustinerpriorat Marienthal""","""Gebäudekomplex des Augustinerpriorats Marient...","""Gebäudekomplex Augustinerpriorat Marienthal""","""Building complex Augustinian priory Marienthal""",@50.01081337/7.947011758
231,2592,13775,46480085,8534,1483,,,1559.0,,,...,5817,8534,Online,"Tertiarinnen 1344 (vermutlich Beginen), Transf...","Franziskanertertiarinnenkloster Roermond, Nied...","""Gebäudekomplex Franziskanertertiarinnenkloste...","""Gebäudekomplex des Franziskanertertiarinnenkl...","""Gebäudekomplex Franziskanertertiarinnenkloste...","""Building complex of the Tertiaries Roermond, ...",@51.19555556/5.991388889
372,4702,17047,46484560,11688,1105,1115.0,um 1110,1559.0,,,...,6946,11688,Online,Priorat der Abtei Liessies in Sart-les-Moines....,"Benediktinerpriorat Sart-les-Moines, Charleroi...","""Gebäudekomplex Benediktinerpriorat Sart-les-M...","""Gebäudekomplex Roucy des Benediktinerpriorats...","""Gebäudekomplex Benediktinerpriorat Sart-les-M...","""Building complex of the Benedictine nuns Sart...",@50.45559110125386/4.404368194246168
374,4725,17070,46484561,11688,1559,,,1796.0,,,...,6946,11688,Online,Priorat der Abtei Liessies in Sart-les-Moines....,"Benediktinerpriorat Sart-les-Moines, Charleroi...","""Gebäudekomplex Benediktinerpriorat Sart-les-M...","""Gebäudekomplex Roucy des Benediktinerpriorats...","""Gebäudekomplex Benediktinerpriorat Sart-les-M...","""Building complex of the Benedictine nuns Sart...",@50.45559110125386/4.404368194246168
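
The duplicate detection can be illustrated on a small table. The following is only a sketch with made-up values; the names `toy`, `toy_duplicates` and `toy_items` are hypothetical and do not appear in the workflow. It drops rows without coordinates before checking for duplicates, which is an assumption of this sketch and differs slightly from the order of operations in the cell above.

```python
import pandas as pd

# Toy data (hypothetical values): rows 1 and 2 share the same coordinates,
# row 4 has no coordinates at all.
toy = pd.DataFrame(
    {
        "id_monastery_location": [1, 2, 3, 4],
        "gsn_id": [10, 10, 20, 30],
        "P48": ["@50.0/7.0", "@50.0/7.0", "@51.0/8.0", None],
    }
)

# Rows without coordinates are removed first, because pandas treats missing
# values as equal to each other when looking for duplicates.
with_coords = toy.dropna(subset=["P48"])

# keep=False marks every member of a duplicate group, not only the repeats.
toy_duplicates = with_coords[with_coords.duplicated(subset="P48", keep=False)]

# Only the first row per coordinate remains as the item to be created.
toy_items = with_coords.drop_duplicates(subset="P48")

print(toy_duplicates["id_monastery_location"].tolist())
print(toy_items["id_monastery_location"].tolist())
```

Here `toy_duplicates` lists both rows that share a coordinate, while `toy_items` keeps one row per coordinate.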


In [120]:
prepared_df

Unnamed: 0,Unnamed: 0_x,id_monastery_location,place_id,gsn_id,location_begin_tpq,location_begin_taq,location_begin_note,location_end_tpq,location_end_taq,location_end_note,...,Unnamed: 0_y,id_gsn,status,note,monastery_name,Lde,Dde,building_Lde,Len,P48
0,2,6053,11765,40358,1318,,,1412.0,,,...,1247,40358,Online,Die „untere Klause“ wurde 1318 als Klause für ...,Beginenhaus Karden (Untere Klause),"""Gebäudekomplex Beginenhaus Karden (Untere Kla...","""Gebäudekomplex Karden des Beginenhauses Karde...","""Gebäudekomplex Beginenhaus Karden (Untere Kla...","""Building complex of the Beguines Karden (Lowe...",@50.181608/7.300994
1,8,6059,11776,40364,1250,1298.0,zwischen 1250 und 1298,1312.0,,1312 oder kurz danach,...,1253,40364,Online,1312 siedeln die Johanniter von ihrem ersten S...,Johanniterkommende Trier,"""Gebäudekomplex Johanniterkommende Trier (Trie...","""Gebäudekomplex Trier (Brotstraße) der Johanni...","""Gebäudekomplex Johanniterkommende Trier (Trie...","""Building complex of the Knights Hospitallers ...",@49.7546831954651/6.64108406805845
2,29,7960,763,60036,1223,,,1806.0,,,...,3395,60036,Online,Franziskaner-Observanten,Franziskanerkloster Bamberg,"""Gebäudekomplex Franziskanerkloster Bamberg""","""Gebäudekomplex des Franziskanerklosters Bamberg""","""Gebäudekomplex Franziskanerkloster Bamberg""","""Building complex of the Franciscans Bamberg""",@49.8904/10.8864
3,57,1926,13277,3514,1156,,erstmals erwähnt 1156,1802.0,,,...,4773,3514,Online,auch: Frauenkloster vor dem Meertor. – Das Klo...,Benediktinerinnenkloster Hagenbusch (vor dem M...,"""Gebäudekomplex Benediktinerinnenkloster Hagen...","""Gebäudekomplex des Benediktinerinnenklosters ...","""Gebäudekomplex Benediktinerinnenkloster Hagen...","""Building complex of the Benedictine nuns Hage...",@51.658333/6.444722
4,80,16657,46479184,11426,1691,,,1784.0,,,...,3332,11426,Online,Der Prager Bischof Johann Friedrich von Waldst...,Ursulinenkloster St. Johannes Nepomuk auf dem ...,"""Gebäudekomplex Ursulinenkloster St. Johannes ...","""Gebäudekomplex Prag, Hradschin des Ursulinenk...","""Gebäudekomplex Ursulinenkloster St. Johannes ...","""Building complex of the Ursuline monastery of...",@50.090289/14.393241
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
603,7890,17577,46479212,11993,1626,,,1796.0,,,...,7191,11993,Online,Die Gemeinschaft hat ein Pesthaus betrieben.,"Kapuzinerkloster Maaseik, Belgien","""Gebäudekomplex Kapuzinerkloster Maaseik, Belg...","""Gebäudekomplex des Kapuzinerklosters Maaseik,...","""Gebäudekomplex Kapuzinerkloster Maaseik, Belg...","""Building complex of the Capuchin friary Maase...",@51.09603/5.79133
604,7959,17648,46479310,12033,1425,,,1796.0,1797.0,,...,7231,12033,Online,,"Begardengemeinschaft Sint-Truiden, Belgien","""Gebäudekomplex Begardengemeinschaft Sint-Trui...","""Gebäudekomplex des Begardengemeinschaft Sint-...","""Gebäudekomplex Begardengemeinschaft Sint-Trui...","""Building complex Beghard community of Sint-Tr...",@50.83185/5.24117
605,7971,17659,46484694,12042,1627,1629.0,1627/1629,1796.0,,,...,7240,12042,Online,Von 1627 bis 1629 wurde ein Kloster für die Ge...,"Annunziatinnenkloster Tienen, Belgien","""Gebäudekomplex Annunziatinnenkloster Tienen, ...","""Gebäudekomplex des Annunziatinnenklosters Tie...","""Gebäudekomplex Annunziatinnenkloster Tienen, ...","""Building complex Sisters of the Annunciation ...",@50.81128/4.94153
606,8064,17768,46484885,12102,1372,,,1525.0,,,...,7297,12102,Online,"1520 wurde das Gebäude zerstört, 1525 wurde da...",Augustinereremitenkloster Heiligenbeil (Mamono...,"""Gebäudekomplex Augustinereremitenkloster Heil...","""Gebäudekomplex des Augustinereremitenklosters...","""Gebäudekomplex Augustinereremitenkloster Heil...","""Building complex Austin Friars of Heiligenbei...",@54.46667/19.93333


## Connection to places

Connecting the building complexes with the locations in which they were found requires that items for these locations exist in FactGrid. The locality data in the monastery database was collected mainly with the open source service [GeoNames](https://www.geonames.org/), so the monastery database holds a GeoNames ID for each location. In FactGrid, there is also a qualifier (P418) for the GeoNames ID. This ID can therefore be used to match the location data of both databases and to subsequently fill in missing locations. The notebook 1b - Place Matching describes this process.

To connect all places, the place data from the monastery database must be matched with FactGrid. All information that is already available should be placed in a file called `places_reconciled.xlsx` in the `data/reconciliation` folder. Make sure that the table has at least a column called `place_id` and one called `factgrid_id`, which hold the id of the place in the table `gs_places` and in FactGrid respectively. The following cell loads the reconciled places and merges them with the data. If any places remain without a FactGrid id, they are saved in a new table called `places_without_factGrid.xlsx` in the `data/reconciliation` folder. Find or create the missing Items in FactGrid and add the information to the `places_reconciled.xlsx` table. Afterwards, re-run the workflow.
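
Before merging, it can help to check that the reconciliation table has the expected columns. The following is a minimal sketch on in-memory toy data, not part of the workflow itself; the function `split_by_reconciliation` and the toy tables are hypothetical names introduced for illustration, and the place id `999` is made up.

```python
import pandas as pd

REQUIRED_COLUMNS = {"place_id", "factgrid_id"}


def split_by_reconciliation(locations, places_reconciled):
    """Left-merge on place_id and return (rows with FactGrid id, rows without)."""
    missing_columns = REQUIRED_COLUMNS - set(places_reconciled.columns)
    if missing_columns:
        raise ValueError(f"places_reconciled lacks columns: {sorted(missing_columns)}")
    merged = locations.merge(
        places_reconciled[["place_id", "factgrid_id"]], how="left", on="place_id"
    )
    has_item = merged["factgrid_id"].notna()
    return merged[has_item].copy(), merged[~has_item].copy()


# Toy input: one place that is already reconciled and one that is not
toy_locations = pd.DataFrame(
    {"id_monastery_location": [6053, 1], "place_id": [11765, 999]}
)
toy_reconciled = pd.DataFrame({"place_id": [11765], "factgrid_id": ["Q92604"]})

matched, missing = split_by_reconciliation(toy_locations, toy_reconciled)
print(matched["factgrid_id"].tolist())
print(missing["place_id"].tolist())
```

The second return value corresponds to the rows that the workflow writes to the table of places without a FactGrid id.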

In [121]:
# 1. Load the reconciled places
places_reconciled = pd.read_excel("data/reconciliation/places_reconciled.xlsx")[["place_id", "factgrid_id"]]
# 2. Merge them to the table with prepared monasteries
prepared_df = pd.merge(prepared_df, places_reconciled, how="left", on="place_id")
prepared_df = prepared_df.rename(columns={"factgrid_id": "P83"})
# 3. Filter out missing FactGrid Items and store them in a separate table
missing_factgrid_ids = prepared_df[prepared_df["P83"].isna()]
missing_factgrid_ids.to_excel("data/reconciliation/places_without_factGrid.xlsx")
# Copy the filtered table so that later column assignments do not raise a SettingWithCopyWarning
prepared_df = prepared_df.dropna(subset="P83").copy()
prepared_df

Unnamed: 0,Unnamed: 0_x,id_monastery_location,place_id,gsn_id,location_begin_tpq,location_begin_taq,location_begin_note,location_end_tpq,location_end_taq,location_end_note,...,id_gsn,status,note,monastery_name,Lde,Dde,building_Lde,Len,P48,P83
0,2,6053,11765,40358,1318,,,1412.0,,,...,40358,Online,Die „untere Klause“ wurde 1318 als Klause für ...,Beginenhaus Karden (Untere Klause),"""Gebäudekomplex Beginenhaus Karden (Untere Kla...","""Gebäudekomplex Karden des Beginenhauses Karde...","""Gebäudekomplex Beginenhaus Karden (Untere Kla...","""Building complex of the Beguines Karden (Lowe...",@50.181608/7.300994,Q92604
1,8,6059,11776,40364,1250,1298.0,zwischen 1250 und 1298,1312.0,,1312 oder kurz danach,...,40364,Online,1312 siedeln die Johanniter von ihrem ersten S...,Johanniterkommende Trier,"""Gebäudekomplex Johanniterkommende Trier (Trie...","""Gebäudekomplex Trier (Brotstraße) der Johanni...","""Gebäudekomplex Johanniterkommende Trier (Trie...","""Building complex of the Knights Hospitallers ...",@49.7546831954651/6.64108406805845,Q10483
2,29,7960,763,60036,1223,,,1806.0,,,...,60036,Online,Franziskaner-Observanten,Franziskanerkloster Bamberg,"""Gebäudekomplex Franziskanerkloster Bamberg""","""Gebäudekomplex des Franziskanerklosters Bamberg""","""Gebäudekomplex Franziskanerkloster Bamberg""","""Building complex of the Franciscans Bamberg""",@49.8904/10.8864,Q10308
3,57,1926,13277,3514,1156,,erstmals erwähnt 1156,1802.0,,,...,3514,Online,auch: Frauenkloster vor dem Meertor. – Das Klo...,Benediktinerinnenkloster Hagenbusch (vor dem M...,"""Gebäudekomplex Benediktinerinnenkloster Hagen...","""Gebäudekomplex des Benediktinerinnenklosters ...","""Gebäudekomplex Benediktinerinnenkloster Hagen...","""Building complex of the Benedictine nuns Hage...",@51.658333/6.444722,Q93834
4,80,16657,46479184,11426,1691,,,1784.0,,,...,11426,Online,Der Prager Bischof Johann Friedrich von Waldst...,Ursulinenkloster St. Johannes Nepomuk auf dem ...,"""Gebäudekomplex Ursulinenkloster St. Johannes ...","""Gebäudekomplex Prag, Hradschin des Ursulinenk...","""Gebäudekomplex Ursulinenkloster St. Johannes ...","""Building complex of the Ursuline monastery of...",@50.090289/14.393241,Q10447
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
448,7890,17577,46479212,11993,1626,,,1796.0,,,...,11993,Online,Die Gemeinschaft hat ein Pesthaus betrieben.,"Kapuzinerkloster Maaseik, Belgien","""Gebäudekomplex Kapuzinerkloster Maaseik, Belg...","""Gebäudekomplex des Kapuzinerklosters Maaseik,...","""Gebäudekomplex Kapuzinerkloster Maaseik, Belg...","""Building complex of the Capuchin friary Maase...",@51.09603/5.79133,Q88345
449,7959,17648,46479310,12033,1425,,,1796.0,1797.0,,...,12033,Online,,"Begardengemeinschaft Sint-Truiden, Belgien","""Gebäudekomplex Begardengemeinschaft Sint-Trui...","""Gebäudekomplex des Begardengemeinschaft Sint-...","""Gebäudekomplex Begardengemeinschaft Sint-Trui...","""Building complex Beghard community of Sint-Tr...",@50.83185/5.24117,Q1381350
450,7971,17659,46484694,12042,1627,1629.0,1627/1629,1796.0,,,...,12042,Online,Von 1627 bis 1629 wurde ein Kloster für die Ge...,"Annunziatinnenkloster Tienen, Belgien","""Gebäudekomplex Annunziatinnenkloster Tienen, ...","""Gebäudekomplex des Annunziatinnenklosters Tie...","""Gebäudekomplex Annunziatinnenkloster Tienen, ...","""Building complex Sisters of the Annunciation ...",@50.81128/4.94153,Q1381356
451,8064,17768,46484885,12102,1372,,,1525.0,,,...,12102,Online,"1520 wurde das Gebäude zerstört, 1525 wurde da...",Augustinereremitenkloster Heiligenbeil (Mamono...,"""Gebäudekomplex Augustinereremitenkloster Heil...","""Gebäudekomplex des Augustinereremitenklosters...","""Gebäudekomplex Augustinereremitenkloster Heil...","""Building complex Austin Friars of Heiligenbei...",@54.46667/19.93333,Q539910


## Instance of statement

To state that these items are building complexes, the Item [Q635758](https://database.factgrid.de/wiki/Item:Q635758) (building complex) is connected to all entries using [P2](https://database.factgrid.de/wiki/Property:P2) (instance of).

In [122]:
prepared_df["P2"] = "Q635758"
prepared_df

Unnamed: 0,Unnamed: 0_x,id_monastery_location,place_id,gsn_id,location_begin_tpq,location_begin_taq,location_begin_note,location_end_tpq,location_end_taq,location_end_note,...,status,note,monastery_name,Lde,Dde,building_Lde,Len,P48,P83,P2
0,2,6053,11765,40358,1318,,,1412.0,,,...,Online,Die „untere Klause“ wurde 1318 als Klause für ...,Beginenhaus Karden (Untere Klause),"""Gebäudekomplex Beginenhaus Karden (Untere Kla...","""Gebäudekomplex Karden des Beginenhauses Karde...","""Gebäudekomplex Beginenhaus Karden (Untere Kla...","""Building complex of the Beguines Karden (Lowe...",@50.181608/7.300994,Q92604,Q635758
1,8,6059,11776,40364,1250,1298.0,zwischen 1250 und 1298,1312.0,,1312 oder kurz danach,...,Online,1312 siedeln die Johanniter von ihrem ersten S...,Johanniterkommende Trier,"""Gebäudekomplex Johanniterkommende Trier (Trie...","""Gebäudekomplex Trier (Brotstraße) der Johanni...","""Gebäudekomplex Johanniterkommende Trier (Trie...","""Building complex of the Knights Hospitallers ...",@49.7546831954651/6.64108406805845,Q10483,Q635758
2,29,7960,763,60036,1223,,,1806.0,,,...,Online,Franziskaner-Observanten,Franziskanerkloster Bamberg,"""Gebäudekomplex Franziskanerkloster Bamberg""","""Gebäudekomplex des Franziskanerklosters Bamberg""","""Gebäudekomplex Franziskanerkloster Bamberg""","""Building complex of the Franciscans Bamberg""",@49.8904/10.8864,Q10308,Q635758
3,57,1926,13277,3514,1156,,erstmals erwähnt 1156,1802.0,,,...,Online,auch: Frauenkloster vor dem Meertor. – Das Klo...,Benediktinerinnenkloster Hagenbusch (vor dem M...,"""Gebäudekomplex Benediktinerinnenkloster Hagen...","""Gebäudekomplex des Benediktinerinnenklosters ...","""Gebäudekomplex Benediktinerinnenkloster Hagen...","""Building complex of the Benedictine nuns Hage...",@51.658333/6.444722,Q93834,Q635758
4,80,16657,46479184,11426,1691,,,1784.0,,,...,Online,Der Prager Bischof Johann Friedrich von Waldst...,Ursulinenkloster St. Johannes Nepomuk auf dem ...,"""Gebäudekomplex Ursulinenkloster St. Johannes ...","""Gebäudekomplex Prag, Hradschin des Ursulinenk...","""Gebäudekomplex Ursulinenkloster St. Johannes ...","""Building complex of the Ursuline monastery of...",@50.090289/14.393241,Q10447,Q635758
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
448,7890,17577,46479212,11993,1626,,,1796.0,,,...,Online,Die Gemeinschaft hat ein Pesthaus betrieben.,"Kapuzinerkloster Maaseik, Belgien","""Gebäudekomplex Kapuzinerkloster Maaseik, Belg...","""Gebäudekomplex des Kapuzinerklosters Maaseik,...","""Gebäudekomplex Kapuzinerkloster Maaseik, Belg...","""Building complex of the Capuchin friary Maase...",@51.09603/5.79133,Q88345,Q635758
449,7959,17648,46479310,12033,1425,,,1796.0,1797.0,,...,Online,,"Begardengemeinschaft Sint-Truiden, Belgien","""Gebäudekomplex Begardengemeinschaft Sint-Trui...","""Gebäudekomplex des Begardengemeinschaft Sint-...","""Gebäudekomplex Begardengemeinschaft Sint-Trui...","""Building complex Beghard community of Sint-Tr...",@50.83185/5.24117,Q1381350,Q635758
450,7971,17659,46484694,12042,1627,1629.0,1627/1629,1796.0,,,...,Online,Von 1627 bis 1629 wurde ein Kloster für die Ge...,"Annunziatinnenkloster Tienen, Belgien","""Gebäudekomplex Annunziatinnenkloster Tienen, ...","""Gebäudekomplex des Annunziatinnenklosters Tie...","""Gebäudekomplex Annunziatinnenkloster Tienen, ...","""Building complex Sisters of the Annunciation ...",@50.81128/4.94153,Q1381356,Q635758
451,8064,17768,46484885,12102,1372,,,1525.0,,,...,Online,"1520 wurde das Gebäude zerstört, 1525 wurde da...",Augustinereremitenkloster Heiligenbeil (Mamono...,"""Gebäudekomplex Augustinereremitenkloster Heil...","""Gebäudekomplex des Augustinereremitenklosters...","""Gebäudekomplex Augustinereremitenkloster Heil...","""Building complex Austin Friars of Heiligenbei...",@54.46667/19.93333,Q539910,Q635758


## Vocabulary Terms

To keep a mapping between the monastery database and FactGrid, every item receives a distinct vocabulary term that is constructed from the `id_monastery_location` of the `gs_monastery_location` table. The FactGrid Property to use is [P1301](https://database.factgrid.de/wiki/Property:P1301) (GS vocabulary term). The term follows the pattern `GSMonasteryLocation<id_monastery_location>`.
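
The pattern can be sketched as a pair of small helper functions. Both `to_vocabulary_term` and `from_vocabulary_term` are hypothetical names used only for illustration. Note that the notebook cell additionally wraps each term in double quotes, which the second helper strips again.

```python
PREFIX = "GSMonasteryLocation"


def to_vocabulary_term(id_monastery_location):
    """Build the vocabulary term for one row of gs_monastery_location."""
    return f"{PREFIX}{int(id_monastery_location)}"


def from_vocabulary_term(term):
    """Recover the numeric id from a term, with or without surrounding quotes."""
    cleaned = term.strip('"')
    if not cleaned.startswith(PREFIX):
        raise ValueError(f"Not a monastery location term: {term!r}")
    return int(cleaned[len(PREFIX):])


term = to_vocabulary_term(6053)
print(term)
print(from_vocabulary_term(f'"{term}"'))
```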

In [123]:
prepared_df['P1301'] = prepared_df['id_monastery_location'].apply(lambda x: f'"GSMonasteryLocation{x}"')
# Handle Vocabulary Terms for duplicated coords
for _, row in prepared_df.iterrows():
    if row["id_monastery_location"] in coord_duplicates["id_monastery_location"].values:
        # All duplicate entries that belong to the same religious community
        same_community = coord_duplicates[coord_duplicates["gsn_id"] == row["gsn_id"]]
        for x, (_, dup_row) in enumerate(same_community.iterrows()):
            if x == 0:
                # Skip the first entry of each community
                continue
            # Store every further term in an additional column (P1301.1, P1301.2, ...)
            prepared_df.loc[prepared_df["P48"] == dup_row["P48"], f"P1301.{x}"] = f'"GSMonasteryLocation{dup_row["id_monastery_location"]}"'

prepared_df

Unnamed: 0,Unnamed: 0_x,id_monastery_location,place_id,gsn_id,location_begin_tpq,location_begin_taq,location_begin_note,location_end_tpq,location_end_taq,location_end_note,...,monastery_name,Lde,Dde,building_Lde,Len,P48,P83,P2,P1301,P1301.1
0,2,6053,11765,40358,1318,,,1412.0,,,...,Beginenhaus Karden (Untere Klause),"""Gebäudekomplex Beginenhaus Karden (Untere Kla...","""Gebäudekomplex Karden des Beginenhauses Karde...","""Gebäudekomplex Beginenhaus Karden (Untere Kla...","""Building complex of the Beguines Karden (Lowe...",@50.181608/7.300994,Q92604,Q635758,"""GSMonasteryLocation6053""",
1,8,6059,11776,40364,1250,1298.0,zwischen 1250 und 1298,1312.0,,1312 oder kurz danach,...,Johanniterkommende Trier,"""Gebäudekomplex Johanniterkommende Trier (Trie...","""Gebäudekomplex Trier (Brotstraße) der Johanni...","""Gebäudekomplex Johanniterkommende Trier (Trie...","""Building complex of the Knights Hospitallers ...",@49.7546831954651/6.64108406805845,Q10483,Q635758,"""GSMonasteryLocation6059""",
2,29,7960,763,60036,1223,,,1806.0,,,...,Franziskanerkloster Bamberg,"""Gebäudekomplex Franziskanerkloster Bamberg""","""Gebäudekomplex des Franziskanerklosters Bamberg""","""Gebäudekomplex Franziskanerkloster Bamberg""","""Building complex of the Franciscans Bamberg""",@49.8904/10.8864,Q10308,Q635758,"""GSMonasteryLocation7960""",
3,57,1926,13277,3514,1156,,erstmals erwähnt 1156,1802.0,,,...,Benediktinerinnenkloster Hagenbusch (vor dem M...,"""Gebäudekomplex Benediktinerinnenkloster Hagen...","""Gebäudekomplex des Benediktinerinnenklosters ...","""Gebäudekomplex Benediktinerinnenkloster Hagen...","""Building complex of the Benedictine nuns Hage...",@51.658333/6.444722,Q93834,Q635758,"""GSMonasteryLocation1926""",
4,80,16657,46479184,11426,1691,,,1784.0,,,...,Ursulinenkloster St. Johannes Nepomuk auf dem ...,"""Gebäudekomplex Ursulinenkloster St. Johannes ...","""Gebäudekomplex Prag, Hradschin des Ursulinenk...","""Gebäudekomplex Ursulinenkloster St. Johannes ...","""Building complex of the Ursuline monastery of...",@50.090289/14.393241,Q10447,Q635758,"""GSMonasteryLocation16657""",
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
448,7890,17577,46479212,11993,1626,,,1796.0,,,...,"Kapuzinerkloster Maaseik, Belgien","""Gebäudekomplex Kapuzinerkloster Maaseik, Belg...","""Gebäudekomplex des Kapuzinerklosters Maaseik,...","""Gebäudekomplex Kapuzinerkloster Maaseik, Belg...","""Building complex of the Capuchin friary Maase...",@51.09603/5.79133,Q88345,Q635758,"""GSMonasteryLocation17577""",
449,7959,17648,46479310,12033,1425,,,1796.0,1797.0,,...,"Begardengemeinschaft Sint-Truiden, Belgien","""Gebäudekomplex Begardengemeinschaft Sint-Trui...","""Gebäudekomplex des Begardengemeinschaft Sint-...","""Gebäudekomplex Begardengemeinschaft Sint-Trui...","""Building complex Beghard community of Sint-Tr...",@50.83185/5.24117,Q1381350,Q635758,"""GSMonasteryLocation17648""",
450,7971,17659,46484694,12042,1627,1629.0,1627/1629,1796.0,,,...,"Annunziatinnenkloster Tienen, Belgien","""Gebäudekomplex Annunziatinnenkloster Tienen, ...","""Gebäudekomplex des Annunziatinnenklosters Tie...","""Gebäudekomplex Annunziatinnenkloster Tienen, ...","""Building complex Sisters of the Annunciation ...",@50.81128/4.94153,Q1381356,Q635758,"""GSMonasteryLocation17659""",
451,8064,17768,46484885,12102,1372,,,1525.0,,,...,Augustinereremitenkloster Heiligenbeil (Mamono...,"""Gebäudekomplex Augustinereremitenkloster Heil...","""Gebäudekomplex des Augustinereremitenklosters...","""Gebäudekomplex Augustinereremitenkloster Heil...","""Building complex Austin Friars of Heiligenbei...",@54.46667/19.93333,Q539910,Q635758,"""GSMonasteryLocation17768""",


In [124]:
# Select and rename the columns for the coordinate export
building_complexes_with_coordinates = prepared_df[["gsn_id", "P1301", "P83", "P48", "location_begin_tpq", "location_begin_taq", "location_end_tpq", "location_end_taq", "place_id"]].rename(columns={"P1301":"id_monastery_location", "P83":"place_factgrid", "P48":"coordinates"})
# Recover the numeric id from the vocabulary term
building_complexes_with_coordinates["id_monastery_location"] = building_complexes_with_coordinates["id_monastery_location"].str.strip("\"").str.split("Location").str[-1].astype(int)
# Split "@<latitude>/<longitude>" into two float columns
building_complexes_with_coordinates["latitude"] = building_complexes_with_coordinates["coordinates"].str.split("/").str[0].str[1:].astype(float)
building_complexes_with_coordinates["longitude"] = building_complexes_with_coordinates["coordinates"].str.split("/").str[1].astype(float)
building_complexes_with_coordinates["place_id"] = building_complexes_with_coordinates["place_id"].astype(int)
building_complexes_with_coordinates.to_csv("data/intermediate_results/building_complexes_coordinates.csv")
building_complexes_with_coordinates

Unnamed: 0,gsn_id,id_monastery_location,place_factgrid,coordinates,location_begin_tpq,location_begin_taq,location_end_tpq,location_end_taq,place_id,latitude,longitude
0,40358,6053,Q92604,@50.181608/7.300994,1318,,1412.0,,11765,50.181608,7.300994
1,40364,6059,Q10483,@49.7546831954651/6.64108406805845,1250,1298.0,1312.0,,11776,49.754683,6.641084
2,60036,7960,Q10308,@49.8904/10.8864,1223,,1806.0,,763,49.890400,10.886400
3,3514,1926,Q93834,@51.658333/6.444722,1156,,1802.0,,13277,51.658333,6.444722
4,11426,16657,Q10447,@50.090289/14.393241,1691,,1784.0,,46479184,50.090289,14.393241
...,...,...,...,...,...,...,...,...,...,...,...
448,11993,17577,Q88345,@51.09603/5.79133,1626,,1796.0,,46479212,51.096030,5.791330
449,12033,17648,Q1381350,@50.83185/5.24117,1425,,1796.0,1797.0,46479310,50.831850,5.241170
450,12042,17659,Q1381356,@50.81128/4.94153,1627,1629.0,1796.0,,46484694,50.811280,4.941530
451,12102,17768,Q539910,@54.46667/19.93333,1372,,1525.0,,46484885,54.466670,19.933330
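
The `P48` values follow the form `@<latitude>/<longitude>`, as the table above shows. A minimal sketch of parsing a single value could look as follows; `parse_p48` is a hypothetical helper for illustration, not a function of the workflow.

```python
from typing import Tuple


def parse_p48(value: str) -> Tuple[float, float]:
    """Split an '@<latitude>/<longitude>' string into (latitude, longitude)."""
    if not value.startswith("@"):
        raise ValueError(f"Unexpected coordinate format: {value!r}")
    latitude, longitude = value[1:].split("/")
    return float(latitude), float(longitude)


latitude, longitude = parse_p48("@50.181608/7.300994")
print(latitude, longitude)
```

A per-value function like this makes malformed entries fail with a clear message, whereas the column-wise string operations in the cell above are faster on large tables.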


## Dioceses

Connecting the building complexes to modern municipalities shows in which territorial structures the (former) building complexes are located today. However, the monastery database also contains information about the historical diocese in which the building complexes were located. This information is stored in the table `gs_places` in the column `diocese_id`. In the monastery database, it is therefore the place in which a monastery location lies that is assigned to a diocese. In FactGrid, we connect the information about the dioceses directly to the building complexes. A building complex has a property [P1003](https://database.factgrid.de/wiki/Property:P1003) (Diocese), which connects to a diocese item, for example the Archdiocese of Mainz ([Q153230](https://database.factgrid.de/wiki/Item:Q153230)).

The historical affiliation of a location to a diocese is a complex phenomenon. On the one hand, this affiliation changed over time, especially in border areas. On the other hand, an area that we understand today as a contiguous location may not have been contiguous around 1500 and may have belonged only partially to a certain diocese. Therefore, we separate the modern territorial localization (statements about the current location of the address) from the historical localization (statements about the affiliation to a diocese).
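
The chain from monastery location to place to diocese item can be sketched on toy data. All tables and ids below are made up for illustration, and the pairing of the diocese ids `1` and `2` with the two Q-ids is hypothetical. The sketch collects the distinct dioceses per coordinate with a group-by, which is an alternative to the loop used in the workflow, not a description of it.

```python
import pandas as pd

# Toy rows: one building complex (same P48) recorded under two places
locations = pd.DataFrame(
    {
        "id_monastery_location": [1, 2, 3],
        "place_id": [100, 101, 102],
        "P48": ["@50.0/7.0", "@50.0/7.0", "@51.0/8.0"],
    }
)
places = pd.DataFrame({"id_places": [100, 101, 102], "diocese_id": [1, 2, 1]})
diocese_items = pd.DataFrame(
    {"diocese_id": [1, 2], "url_value": ["Q153230", "Q153244"]}
)

# Location -> place -> diocese item
merged = locations.merge(
    places, how="left", left_on="place_id", right_on="id_places"
).merge(diocese_items, how="left", on="diocese_id")

# One list of distinct diocese items per coordinate, in order of appearance
dioceses_per_complex = (
    merged.dropna(subset=["url_value"])
    .drop_duplicates(subset=["P48", "url_value"])
    .groupby("P48")["url_value"]
    .agg(list)
)
print(dioceses_per_complex.to_dict())
```

A building complex whose diocese changed ends up with two entries in its list, which corresponds to the columns `P1003` and `P1003.1` in the workflow.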

In [125]:
# Merge gs_places["diocese_id"] to the existing table
places_selection = dataframes["gs_places"][["id_places", "diocese_id"]]
# Select the diocese URL values with url_type_id 42; they become the P1003 values
diocese_urls_selection = dataframes["gs_id_external_urls_diocese"][dataframes["gs_id_external_urls_diocese"]["url_type_id"]==42][["diocese_id", "url_value"]]
prepared_df = pd.merge(prepared_df, places_selection, how="left", left_on="place_id", right_on="id_places").drop(columns="id_places")
prepared_df = pd.merge(prepared_df, diocese_urls_selection, how="left", on="diocese_id").drop(columns="diocese_id").rename(columns={"url_value":"P1003"})

# Handle dioceses for coordinate duplicates
coord_duplicates = pd.merge(coord_duplicates, places_selection, how="left", left_on="place_id", right_on="id_places").drop(columns="id_places")
coord_duplicates = pd.merge(coord_duplicates, diocese_urls_selection, how="left", on="diocese_id").drop(columns="diocese_id").rename(columns={"url_value":"P1003"})
for _, row in prepared_df.iterrows():
    if row["id_monastery_location"] in coord_duplicates["id_monastery_location"].values:
        # All duplicate entries that belong to the same religious community
        same_community = coord_duplicates[coord_duplicates["gsn_id"] == row["gsn_id"]]
        for x, (_, dup_row) in enumerate(same_community.iterrows()):
            if x == 0:
                # Skip the first entry of each community
                continue
            # Store every further diocese in an additional column (P1003.1, P1003.2, ...)
            prepared_df.loc[prepared_df["P48"] == dup_row["P48"], f"P1003.{x}"] = dup_row["P1003"]
prepared_df


Unnamed: 0,Unnamed: 0_x,id_monastery_location,place_id,gsn_id,location_begin_tpq,location_begin_taq,location_begin_note,location_end_tpq,location_end_taq,location_end_note,...,Dde,building_Lde,Len,P48,P83,P2,P1301,P1301.1,P1003,P1003.1
0,2,6053,11765,40358,1318,,,1412.0,,,...,"""Gebäudekomplex Karden des Beginenhauses Karde...","""Gebäudekomplex Beginenhaus Karden (Untere Kla...","""Building complex of the Beguines Karden (Lowe...",@50.181608/7.300994,Q92604,Q635758,"""GSMonasteryLocation6053""",,Q153244,
1,8,6059,11776,40364,1250,1298.0,zwischen 1250 und 1298,1312.0,,1312 oder kurz danach,...,"""Gebäudekomplex Trier (Brotstraße) der Johanni...","""Gebäudekomplex Johanniterkommende Trier (Trie...","""Building complex of the Knights Hospitallers ...",@49.7546831954651/6.64108406805845,Q10483,Q635758,"""GSMonasteryLocation6059""",,Q153244,
2,29,7960,763,60036,1223,,,1806.0,,,...,"""Gebäudekomplex des Franziskanerklosters Bamberg""","""Gebäudekomplex Franziskanerkloster Bamberg""","""Building complex of the Franciscans Bamberg""",@49.8904/10.8864,Q10308,Q635758,"""GSMonasteryLocation7960""",,Q153216,
3,57,1926,13277,3514,1156,,erstmals erwähnt 1156,1802.0,,,...,"""Gebäudekomplex des Benediktinerinnenklosters ...","""Gebäudekomplex Benediktinerinnenkloster Hagen...","""Building complex of the Benedictine nuns Hage...",@51.658333/6.444722,Q93834,Q635758,"""GSMonasteryLocation1926""",,Q153225,
4,80,16657,46479184,11426,1691,,,1784.0,,,...,"""Gebäudekomplex Prag, Hradschin des Ursulinenk...","""Gebäudekomplex Ursulinenkloster St. Johannes ...","""Building complex of the Ursuline monastery of...",@50.090289/14.393241,Q10447,Q635758,"""GSMonasteryLocation16657""",,Q153263,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
449,7890,17577,46479212,11993,1626,,,1796.0,,,...,"""Gebäudekomplex des Kapuzinerklosters Maaseik,...","""Gebäudekomplex Kapuzinerkloster Maaseik, Belg...","""Building complex of the Capuchin friary Maase...",@51.09603/5.79133,Q88345,Q635758,"""GSMonasteryLocation17577""",,Q153250,
450,7959,17648,46479310,12033,1425,,,1796.0,1797.0,,...,"""Gebäudekomplex des Begardengemeinschaft Sint-...","""Gebäudekomplex Begardengemeinschaft Sint-Trui...","""Building complex Beghard community of Sint-Tr...",@50.83185/5.24117,Q1381350,Q635758,"""GSMonasteryLocation17648""",,Q153250,
451,7971,17659,46484694,12042,1627,1629.0,1627/1629,1796.0,,,...,"""Gebäudekomplex des Annunziatinnenklosters Tie...","""Gebäudekomplex Annunziatinnenkloster Tienen, ...","""Building complex Sisters of the Annunciation ...",@50.81128/4.94153,Q1381356,Q635758,"""GSMonasteryLocation17659""",,,
452,8064,17768,46484885,12102,1372,,,1525.0,,,...,"""Gebäudekomplex des Augustinereremitenklosters...","""Gebäudekomplex Augustinereremitenkloster Heil...","""Building complex Austin Friars of Heiligenbei...",@54.46667/19.93333,Q539910,Q635758,"""GSMonasteryLocation17768""",,Q153272,


In [129]:
prepared_df[prepared_df["P1003"].isna()]

Unnamed: 0,Unnamed: 0_x,id_monastery_location,place_id,gsn_id,location_begin_tpq,location_begin_taq,location_begin_note,location_end_tpq,location_end_taq,location_end_note,...,P1301.1,P1003,P1003.1,Sdewiki,Sitwiki,Splwiki,Sfrwiki,Slvwiki,Scswiki,Snlwiki
34,534,17224,46480776,11780,1135,,,1560.0,1610.0,nach 1560,...,,,,,,,,,,
35,553,17245,46484709,11801,1624,,,1796.0,,,...,,,,,,,,,,"""Hunnegem"""
37,565,17259,46484728,11809,1228,,,1559.0,,,...,"""GSMonasteryLocation17279""",,,,,,,,,
38,602,17301,46483837,8534,1559,,,1784.0,,,...,"""GSMonasteryLocation13775""",,Q153250,,,,,,,
174,2716,14554,46480117,10075,1481,1483.0,zwischen 1481 und 1483,1848.0,,,...,,,,,"""Chiesa_di_Santa_Maria_delle_Grazie_(Bellinzona)""",,,,,
175,2717,14555,46480136,10076,1472,1490.0,zwischen 1472 und 1490,1602.0,,,...,,,,,"""Chiesa_di_Santa_Maria_degli_Angeli_(Lugano)""",,,,,
241,3895,14850,46480203,10398,1765,,,,,heute,...,,,,,,,,,,
242,3934,7848,46483921,92498,1231,1238.0,1231/1238,1835.0,1836.0,1835/1836,...,,,,,,,,,,
245,4010,13812,46483837,8561,1667,,,1787.0,,,...,,,,,,,,,,
280,4694,17039,46480086,11680,1081,,,1796.0,,,...,,,,,,,,,,


## External Identifiers
In some instances, the monastery database lists Wikipedia articles that are written specifically about the building complex of a monastery. Where such an article exists, it should be linked to the building complex item.

In [126]:
# Keep only URL types that are mapped to a FactGrid property
gs_external_url_type_with_factgrid = dataframes["gs_external_url_type_with_factgrid"].dropna(subset="factgrid_property")
# Attach the FactGrid property and the URL type name to every external URL of a monastery
url_factgrid = pd.merge(dataframes["gs_external_urls_monastery"], gs_external_url_type_with_factgrid, how="left", left_on="url_type_id", right_on="id_url_type")[["gsn_id", "url_value", "factgrid_property", "url_name_formatter"]].dropna(subset="factgrid_property")
# Write each "Wikipedia-Artikel zum Baudenkmal" URL as a quoted string into the column named after its FactGrid property
for index, row in url_factgrid.iterrows():
    if row["gsn_id"] in prepared_df["id_gsn"].values and "Wikipedia-Artikel zum Baudenkmal" in row["url_name_formatter"]:
        prepared_df.loc[prepared_df["id_gsn"] == row["gsn_id"], row["factgrid_property"]] = f'\"{row["url_value"]}\"'
prepared_df

Unnamed: 0,Unnamed: 0_x,id_monastery_location,place_id,gsn_id,location_begin_tpq,location_begin_taq,location_begin_note,location_end_tpq,location_end_taq,location_end_note,...,P1301.1,P1003,P1003.1,Sdewiki,Sitwiki,Splwiki,Sfrwiki,Slvwiki,Scswiki,Snlwiki
0,2,6053,11765,40358,1318,,,1412.0,,,...,,Q153244,,,,,,,,
1,8,6059,11776,40364,1250,1298.0,zwischen 1250 und 1298,1312.0,,1312 oder kurz danach,...,,Q153244,,,,,,,,
2,29,7960,763,60036,1223,,,1806.0,,,...,,Q153216,,,,,,,,
3,57,1926,13277,3514,1156,,erstmals erwähnt 1156,1802.0,,,...,,Q153225,,,,,,,,
4,80,16657,46479184,11426,1691,,,1784.0,,,...,,Q153263,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
449,7890,17577,46479212,11993,1626,,,1796.0,,,...,,Q153250,,,,,,,,"""Kapucijnenkerk_(Maaseik)"""
450,7959,17648,46479310,12033,1425,,,1796.0,1797.0,,...,,Q153250,,,,,,,,
451,7971,17659,46484694,12042,1627,1629.0,1627/1629,1796.0,,,...,,,,,,,,,,
452,8064,17768,46484885,12102,1372,,,1525.0,,,...,,Q153272,,,,,,,,
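
The row-by-row loop above can also be written per target column. The following is a minimal sketch of that alternative, not the notebook's own method: the frames `prepared` and `urls` and all values in them are invented stand-ins for `prepared_df` and `url_factgrid`. Unlike the loop, it overwrites the whole target column, so it assumes the sitelink columns hold no earlier values.

```python
import pandas as pd

# Toy stand-ins for prepared_df and url_factgrid; every value is invented.
prepared = pd.DataFrame({"id_gsn": [1, 2, 3]})
urls = pd.DataFrame({
    "gsn_id": [1, 3, 2],
    "url_value": ["Example_Abbey", "Exempelkerk", "Other_Page"],
    "factgrid_property": ["Sdewiki", "Snlwiki", "Sdewiki"],
    "url_name_formatter": [
        "Wikipedia-Artikel zum Baudenkmal (de)",
        "Wikipedia-Artikel zum Baudenkmal (nl)",
        None,  # some other URL type, must be ignored
    ],
})

# Keep only URLs whose type name marks an article about the building itself.
is_building_article = urls["url_name_formatter"].str.contains(
    "Wikipedia-Artikel zum Baudenkmal", na=False, regex=False
)
building_urls = urls[is_building_article]

# One pass per sitelink column instead of one pass per URL row.
for prop, group in building_urls.groupby("factgrid_property"):
    # If a community has several URLs of one type, the last one wins (as in the loop).
    lookup = group.drop_duplicates("gsn_id", keep="last").set_index("gsn_id")["url_value"]
    matched = prepared["id_gsn"].map(lookup)
    # Wrap matches in double quotes; leave missing values untouched.
    prepared[prop] = matched.map(lambda v: f'"{v}"' if pd.notna(v) else v)

print(prepared)
```

Filtering with `str.contains(..., na=False)` also tolerates URL types without a name, which the `in` test of the loop would not.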


## Sources / References

Every statement in FactGrid should be supported by a source or reference. To achieve this, a source column `S471` is added directly after each relevant property column. It links the statement to the corresponding monastery database entry using the property [P471](https://database.factgrid.de/wiki/Property:P471).

In [127]:
final_table = prepared_df.copy()
for colname in ["P48", "P83", "P1003"] + [c for c in final_table.columns.tolist() if c.startswith("P1003.")]:
    final_table.insert(final_table.columns.get_loc(colname)+1, "S471", final_table["gsn_id"].apply(lambda x:f'\"{x}\"'), allow_duplicates=True)
final_table["P131"] = "Q153178"
final_table

Unnamed: 0,Unnamed: 0_x,id_monastery_location,place_id,gsn_id,location_begin_tpq,location_begin_taq,location_begin_note,location_end_tpq,location_end_taq,location_end_note,...,P1003.1,S471,Sdewiki,Sitwiki,Splwiki,Sfrwiki,Slvwiki,Scswiki,Snlwiki,P131
0,2,6053,11765,40358,1318,,,1412.0,,,...,,"""40358""",,,,,,,,Q153178
1,8,6059,11776,40364,1250,1298.0,zwischen 1250 und 1298,1312.0,,1312 oder kurz danach,...,,"""40364""",,,,,,,,Q153178
2,29,7960,763,60036,1223,,,1806.0,,,...,,"""60036""",,,,,,,,Q153178
3,57,1926,13277,3514,1156,,erstmals erwähnt 1156,1802.0,,,...,,"""3514""",,,,,,,,Q153178
4,80,16657,46479184,11426,1691,,,1784.0,,,...,,"""11426""",,,,,,,,Q153178
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
449,7890,17577,46479212,11993,1626,,,1796.0,,,...,,"""11993""",,,,,,,"""Kapucijnenkerk_(Maaseik)""",Q153178
450,7959,17648,46479310,12033,1425,,,1796.0,1797.0,,...,,"""12033""",,,,,,,,Q153178
451,7971,17659,46484694,12042,1627,1629.0,1627/1629,1796.0,,,...,,"""12042""",,,,,,,,Q153178
452,8064,17768,46484885,12102,1372,,,1525.0,,,...,,"""12102""",,,,,,,,Q153178
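
To make the column placement easier to follow, here is a minimal sketch of the same `DataFrame.insert` pattern on a toy table. The frame `table` is a stand-in for `final_table`; its values are copied from the first rows shown above, and only two property columns are used.

```python
import pandas as pd

# Toy stand-in for final_table with two property columns.
table = pd.DataFrame({
    "gsn_id": [40358, 60036],
    "P48": ["@50.181608/7.300994", "@49.8904/10.8864"],
    "P83": ["Q92604", "Q10308"],
})

# The source value is the quoted gsn_id of the row.
source = table["gsn_id"].map(lambda x: f'"{x}"')

for prop in ["P48", "P83"]:
    # get_loc returns the current position of the (unique) property column;
    # allow_duplicates=True permits several columns that are all named S471.
    table.insert(table.columns.get_loc(prop) + 1, "S471", source, allow_duplicates=True)

print(list(table.columns))
```

Because the position is looked up again on every pass, columns inserted earlier do not shift later insert positions to the wrong place.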


## Finalizing

To finalize, the table is cleaned up and exported in several formats. Most importantly, you will find the V1 statements that create the new building complex items under `data/results/building_complexes/import_building_complexes.tsv`.

In [128]:
from helper_functions import df_to_qs_v1

final_table["id_monastery_location"].to_csv("data/intermediate_results/new_building_complex_locations_ids.csv")

final_table = final_table.drop(columns=["Dde", "Unnamed: 0_x", "Unnamed: 0_y", "note", "building_Lde", "id_monastery_location", "place_id", "gsn_id", "location_begin_tpq", "location_begin_taq", "location_begin_note", "location_end_tpq", "location_end_taq", "location_end_note", "longitude", "latitude", "location_name", "id_gsn", "status", "monastery_name"])
final_table.insert(0, "qid", np.nan)
final_table.to_excel("data/results/building_complexes/import_building_complexes.xlsx", index=False)
# Hack to save in a QuickStatements-applicable format: write unquoted CSV with "§" as escape character, then strip every "§" again
final_table.to_csv("data/results/building_complexes/import_building_complexes.csv", index=False, doublequote=False, quoting=csv.QUOTE_NONE, escapechar="§")
# Read and write with an explicit encoding so that "§" and umlauts survive on every platform
with open("data/results/building_complexes/import_building_complexes.csv", "r", encoding="utf-8") as file:
    s = file.read()
with open("data/results/building_complexes/import_building_complexes.csv", "w", encoding="utf-8") as file:
    file.write(s.replace("§", ""))
# Write the V1 statements
with open("data/results/building_complexes/import_building_complexes.tsv", "w", encoding="utf-8") as file:
    file.write(df_to_qs_v1(final_table))


final_table

Unnamed: 0,qid,Lde,Len,P48,S471,P83,S471.1,P2,P1301,P1301.1,...,P1003.1,S471.3,Sdewiki,Sitwiki,Splwiki,Sfrwiki,Slvwiki,Scswiki,Snlwiki,P131
0,,"""Gebäudekomplex Beginenhaus Karden (Untere Kla...","""Building complex of the Beguines Karden (Lowe...",@50.181608/7.300994,"""40358""",Q92604,"""40358""",Q635758,"""GSMonasteryLocation6053""",,...,,"""40358""",,,,,,,,Q153178
1,,"""Gebäudekomplex Johanniterkommende Trier (Trie...","""Building complex of the Knights Hospitallers ...",@49.7546831954651/6.64108406805845,"""40364""",Q10483,"""40364""",Q635758,"""GSMonasteryLocation6059""",,...,,"""40364""",,,,,,,,Q153178
2,,"""Gebäudekomplex Franziskanerkloster Bamberg""","""Building complex of the Franciscans Bamberg""",@49.8904/10.8864,"""60036""",Q10308,"""60036""",Q635758,"""GSMonasteryLocation7960""",,...,,"""60036""",,,,,,,,Q153178
3,,"""Gebäudekomplex Benediktinerinnenkloster Hagen...","""Building complex of the Benedictine nuns Hage...",@51.658333/6.444722,"""3514""",Q93834,"""3514""",Q635758,"""GSMonasteryLocation1926""",,...,,"""3514""",,,,,,,,Q153178
4,,"""Gebäudekomplex Ursulinenkloster St. Johannes ...","""Building complex of the Ursuline monastery of...",@50.090289/14.393241,"""11426""",Q10447,"""11426""",Q635758,"""GSMonasteryLocation16657""",,...,,"""11426""",,,,,,,,Q153178
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
449,,"""Gebäudekomplex Kapuzinerkloster Maaseik, Belg...","""Building complex of the Capuchin friary Maase...",@51.09603/5.79133,"""11993""",Q88345,"""11993""",Q635758,"""GSMonasteryLocation17577""",,...,,"""11993""",,,,,,,"""Kapucijnenkerk_(Maaseik)""",Q153178
450,,"""Gebäudekomplex Begardengemeinschaft Sint-Trui...","""Building complex Beghard community of Sint-Tr...",@50.83185/5.24117,"""12033""",Q1381350,"""12033""",Q635758,"""GSMonasteryLocation17648""",,...,,"""12033""",,,,,,,,Q153178
451,,"""Gebäudekomplex Annunziatinnenkloster Tienen, ...","""Building complex Sisters of the Annunciation ...",@50.81128/4.94153,"""12042""",Q1381356,"""12042""",Q635758,"""GSMonasteryLocation17659""",,...,,"""12042""",,,,,,,,Q153178
452,,"""Gebäudekomplex Augustinereremitenkloster Heil...","""Building complex Austin Friars of Heiligenbei...",@54.46667/19.93333,"""12102""",Q539910,"""12102""",Q635758,"""GSMonasteryLocation17768""",,...,,"""12102""",,,,,,,,Q153178
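
For orientation, the sketch below shows how one row of such a table can be turned into tab-separated QuickStatements V1 commands. It is a hypothetical illustration and not the implementation of `df_to_qs_v1`, which lives in `helper_functions` and is not shown here. The function name `to_v1_sketch` is invented, and the sketch assumes that every row creates a new item (empty `qid`) and that a source column always follows the property it supports.

```python
import re
import pandas as pd

# Source columns look like S471 or S471.1; sitelinks such as Sdewiki must not match.
SOURCE_COLUMN = re.compile(r"^S\d+$")

def to_v1_sketch(df: pd.DataFrame) -> str:
    """Hypothetical converter for new items only; not the notebook's df_to_qs_v1."""
    lines = []
    for row in df.itertuples(index=False, name=None):
        lines.append("CREATE")
        statement_open = False  # True while the last line is a claim that can take a source
        for column, value in zip(df.columns, row):
            if column == "qid":
                continue  # assumption: qid is empty because the item does not exist yet
            name = str(column).split(".")[0]  # "P1003.1" -> "P1003", "S471.1" -> "S471"
            if SOURCE_COLUMN.match(name):
                if statement_open and pd.notna(value):
                    lines[-1] += f"\t{name}\t{value}"  # append the source to its claim
                continue
            if pd.isna(value):
                statement_open = False  # no claim, so a following source is dropped too
                continue
            lines.append(f"LAST\t{name}\t{value}")
            statement_open = name.startswith("P")  # labels and sitelinks take no source
    return "\n".join(lines)

# One toy row with values taken from the Bamberg entry shown above.
toy = pd.DataFrame({
    "qid": [float("nan")],
    "Len": ['"Building complex of the Franciscans Bamberg"'],
    "P48": ["@49.8904/10.8864"],
    "S471": ['"60036"'],
    "P1003": [float("nan")],
    "S471.1": ['"60036"'],
})
print(to_v1_sketch(toy))
```

The empty `P1003` cell produces no command, and its source column is skipped with it, so no statement is left with a reference but no value.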


## Next steps
As a next step, run notebook 2 - Monasteries to create the religious community items that belong to the building complexes. Afterwards, you can copy the V1 statements from both `data/results/building_complexes/import_building_complexes.csv` and `data/results/monasteries/import_monasteries.csv` into QuickStatements and upload them.