# Notebook 0 - Utilities
This notebook is **not** part of the monastery database to FactGrid workflow. It collects utility routines for tasks related to uploading data to FactGrid.

## FactGrid -> Klosterdatenbank
After new monasteries have been imported, the newly created FactGrid items for each religious community and building complex should be added back to the monastery database. Execute the following cell to query FactGrid for monasteries and building complexes associated with the Germania Sacra or the monastery database. The results can then be imported into the Access database. The files are written to `data/factgrid_data/` in both CSV and Excel format.

In [11]:
from helper_functions import query_factgrid

# Get FactGrid Data
monasteries_in_factgrid = query_factgrid("monasteries")
building_complexes_in_factgrid = query_factgrid("building_complexes")

# Cleanup data
monasteries_in_factgrid["item"] = monasteries_in_factgrid["item"].str.split("/").str[-1]
monasteries_in_factgrid.rename(columns={"item":"url_value", "KlosterdatenbankID":"gsn_id"}, inplace=True)
monasteries_in_factgrid["url_type_id"] = 42

building_complexes_in_factgrid["item"] = building_complexes_in_factgrid["item"].str.split("/").str[-1]
building_complexes_in_factgrid["GSVocabTerm"] = building_complexes_in_factgrid["GSVocabTerm"].str.split("Location").str[-1]
building_complexes_in_factgrid.rename(columns={"item":"factgrid_id", "GSVocabTerm":"id_monastery_location"}, inplace=True)

# Save data
monasteries_in_factgrid.to_csv("data/factgrid_data/monasteries_in_factgrid.csv")
monasteries_in_factgrid.to_excel("data/factgrid_data/monasteries_in_factgrid.xlsx")
building_complexes_in_factgrid.to_csv("data/factgrid_data/building_complexes_in_factgrid.csv")
building_complexes_in_factgrid.to_excel("data/factgrid_data/building_complexes_in_factgrid.xlsx")
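
Before importing the exported files into the Access database, it can help to check that the cleanup step really left a bare item ID in each row. The following is a minimal sketch in plain Python, assuming the `item` column holds entity URLs of the form `https://database.factgrid.de/entity/Q…`. The helpers `extract_item_id` and `is_valid_item_id` are hypothetical and not part of `helper_functions.py`.

```python
import re

# Hypothetical helpers for illustration; not part of helper_functions.py.
ITEM_ID_PATTERN = re.compile(r"Q\d+")


def extract_item_id(entity_url):
    """Return the last path segment of an entity URL (the Q-number)."""
    return entity_url.rsplit("/", 1)[-1]


def is_valid_item_id(item_id):
    """Return True if the value is 'Q' followed by digits only."""
    return ITEM_ID_PATTERN.fullmatch(item_id) is not None


# Illustrative input (assumed URL shape). The second value is deliberately
# malformed to show what the check catches.
urls = [
    "https://database.factgrid.de/entity/Q141472",
    "https://database.factgrid.de/entity/",
]

item_ids = [extract_item_id(url) for url in urls]
invalid = [item_id for item_id in item_ids if not is_valid_item_id(item_id)]

print(item_ids)
print(invalid)
```

The same split on `/` is what the cell above applies to the whole column. Any value that ends up in `invalid` would point to an unexpected URL format in the query result.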

## Query FactGrid using API
The function `query_factgrid` in `helper_functions.py` runs specific predefined queries against FactGrid's SPARQL endpoint. If you plan to use a query frequently, it makes sense to add it to the function so that it can be called directly. However, if you just want to run a quick one-off query, you can execute the following cell. Make sure to change the lines marked with `ENTER YOUR ...` comments to fit your use case. Results will be stored in `data/factgrid_data/`.

*This method was mainly written by Luana Moares Coasta.*

In [12]:
from SPARQLWrapper import SPARQLWrapper, JSON
import pandas as pd

ENDPOINT = "https://database.factgrid.de/sparql"
FILE_NAME = "query_results" # ENTER YOUR DESIRED OUTPUT-FILE-NAME HERE

# ENTER YOUR QUERY HERE
query = """
SELECT ?monastery ?monasteryLabel
       ?klosterdatenbankID
       ?coord
WHERE {
  ?monastery wdt:P2   wd:Q141472 .        # instance of → monastery
  ?monastery wdt:P471 ?klosterdatenbankID .  # Klosterdatenbank ID (P471)

  OPTIONAL { ?monastery wdt:P48 ?coord }  # coordinates, if present

  SERVICE wikibase:label { bd:serviceParam wikibase:language "en,de". }

}

LIMIT 100
"""

sparql = SPARQLWrapper(ENDPOINT)
sparql.setQuery(query)
sparql.setReturnFormat(JSON)

results = sparql.query().convert()

rows = [
    {key: val["value"] for key, val in binding.items()}
    for binding in results["results"]["bindings"]
]

df = pd.DataFrame(rows)

df.to_csv(f"data/factgrid_data/{FILE_NAME}.csv")
df.to_excel(f"data/factgrid_data/{FILE_NAME}.xlsx")
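
One detail of the cell above is worth illustrating: in the SPARQL JSON result format, a variable from an `OPTIONAL` clause (here `?coord`) is simply absent from a binding when it is unbound. The sketch below shows one possible way to flatten such a result so that every row carries every column. It uses a hand-written sample result with made-up values, and `bindings_to_rows` is a hypothetical helper rather than something the notebook or `helper_functions.py` provides.

```python
def bindings_to_rows(results):
    """Flatten a SPARQL JSON result into one dict per row.

    Column names come from head.vars, so unbound OPTIONAL variables
    appear as None instead of being missing from the row.
    """
    variables = results["head"]["vars"]
    rows = []
    for binding in results["results"]["bindings"]:
        row = {}
        for var in variables:
            # An unbound variable has no entry in the binding at all.
            row[var] = binding[var]["value"] if var in binding else None
        rows.append(row)
    return rows


# Made-up sample in the SPARQL JSON result layout; the IDs and the
# coordinate are placeholders, not real FactGrid data.
sample_results = {
    "head": {"vars": ["monastery", "klosterdatenbankID", "coord"]},
    "results": {
        "bindings": [
            {
                "monastery": {"type": "uri", "value": "https://example.org/entity/Q1"},
                "klosterdatenbankID": {"type": "literal", "value": "1"},
                "coord": {"type": "literal", "value": "Point(10.0 51.0)"},
            },
            {
                "monastery": {"type": "uri", "value": "https://example.org/entity/Q2"},
                "klosterdatenbankID": {"type": "literal", "value": "2"},
                # no "coord" key: the OPTIONAL pattern did not match
            },
        ]
    },
}

rows = bindings_to_rows(sample_results)
for row in rows:
    print(row)
```

Passing `rows` to `pd.DataFrame` would then yield the same columns in the same order on every run, even if no row in a particular result has coordinates.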