# Test Case: 
### _A Sunday walk_

First, we have to import and parse the text so that we can work with it in Python. <br> In theory, there are two options here: 
- Use NER: packages like spaCy can analyze text on various levels, including Named Entity Recognition. With NER, you can not only split a text into sentences and tokens or assign POS tags, but also detect places / place names automatically. 
- Use XML: By manually annotating the places of interest in our source, we can simply export them as a list.

Since NER alone performed very poorly in my specific case -
<br> it matched many places that were not relevant and missed others - I decided to go for manual annotation.
<br> Nonetheless, it might be helpful for your project, so if you would still like to try it, I will provide the code here: 

In [1]:
## I used os in my project to organize my files better, but it's not mandatory.
from os import path
xml_path = path.relpath("../Data/pages_8_47.xml")

## I used the python-module BeautifulSoup for XML-Parsing.


from bs4 import BeautifulSoup   #import the module
with open(xml_path, encoding='utf8') as file:   #load & parse the whole text and turn it into a BeautifulSoup object for further analysis.
    contents = file.read()
    soup = BeautifulSoup(contents, 'xml')

I exported the whole text as an XML file from Transkribus (you can, of course, also export single pages only). <br>
But since I only want to extract the places of interest for Sunday, I have to split the text beforehand. <br>
The following code uses a combination of tags and strings to split the text correctly. <br>
<br>
Every chapter of the book corresponds to a weekday, and so do the headings (i.e. "Sunday" is a heading). <br>
I decided to use this structural particularity to my advantage: by filtering on structural tags and strings at the same time, you can split the text efficiently without having to worry about recurring Sundays, Mondays etc. in the running text.

In [2]:
## first, we have to define which day we want to analyze:
## of course, this can be changed accordingly - e.g. from Sunday to Monday, etc.

import re

start_tag = 'hi'    #hi corresponds to the tag <hi> in XML, which marks a word as highlighted, e.g. bold or italic - as it is in a heading
start_string = 'Sonntag'    #the start of split

end_tag = 'hi'     
end_string = 'Montag.'     # the end of split 
place_tag = 'placeName'     # specific tags to extract - you could always also extract the whole text, or other marked entities like Person Names or Titles
should_start_parsing = False    #has to be False in order for the function to work

In [3]:
Sonntag = []        #here, the whole sunday text will be placed - in case we need it later on
Places = []     #here, the places for our route will be stored
for line in soup.find_all('l'):
    if line.string == start_string and line.string.parent.name == start_tag:
        should_start_parsing = True  

    if should_start_parsing:
        text = line.get_text()
        Sonntag.append(text)
        for child in line.descendants:
            if child.name == place_tag:
                for text in child.strings:
                    #room for improvement here! -> the goal is that the strings of placeName tags that are interrupted by other tags, 
                    #e.g. <placeName> Schloss <persName> Esterhazy </persName> </placeName>
                    #are stored as one string instead of as separate strings. 
                    Places.append(text)          

    if line.string == end_string and line.string.parent.name == end_tag:
        should_start_parsing = False  
        break
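
The open point noted in the comment inside the loop above (a place name that is interrupted by a nested tag should end up as one string) could be handled by collecting all text below the `placeName` element at once instead of appending its individual strings. Below is a minimal sketch using the standard library's `xml.etree.ElementTree`; the sample XML and the helper name `place_names` are made up for illustration and are not part of the original workflow. With BeautifulSoup, `child.get_text()` should play the same role as `itertext()` does here.

```python
import xml.etree.ElementTree as ET

# made-up sample line: a placeName that contains a nested persName
sample = (
    "<l>Das <placeName>Schloss <persName>Esterhazy</persName></placeName> "
    "liegt am <placeName>Burgplatz</placeName>.</l>"
)

def place_names(line_xml, place_tag="placeName"):
    """Return one whitespace-normalised string per place_tag element."""
    root = ET.fromstring(line_xml)
    names = []
    for element in root.iter(place_tag):
        # itertext() yields the text of the element and of all nested tags
        full_text = "".join(element.itertext())
        # collapse runs of whitespace and strip the ends
        names.append(" ".join(full_text.split()))
    return names

print(place_names(sample))
```

Because every `placeName` yields exactly one entry, the fragments no longer have to be glued together by index afterwards. Words split with `=` at a line break would still need the cleaning step that follows.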

Cleaning and structuring the list of places:

In [4]:
while(" " in Places):
    Places.remove(" ")
print(Places) 

['k. k. Burg', 'Burgplatz', 'Burgthor', 'kaiserlichen Stallungen', 'Ge=', 'bäude der ungarischen Garde', 'Palais des ', 'Fürsten Auersperg', 'Gebäude des ', 'Geographi=', 'schen (Militär=)Jnstituts', 'Kriminal=', 'Gebäude', 'Kahlengebirge', 'Mariahilfer Hauptstraße', 'Schön=', 'brunn', 'Grena=', 'dierkaserne', 'Getreide=', 'markt', 'artesischen Brunnen', 'Akazien=Alleen', 'Wien', 'Kettenstege', 'der steinernen Brücke', 'Polytechnische Jnstitut', 'Karlskirche', 'Burg', 'Ritter=', 'Ceremoniensaal', 'Appartements ', 'Sr Majestät des Kaisers', 'Rittersaal', 'Antiken=', 'Mineralienkabinet', 'astronomische Thurm', 'Burg', 'Hofbiblio=', 'thek=Gebäudes', 'Josephsplatze', 'Palais des ', 'Erzherzogs Karl', 'Burg=', 'platze', 'Volksgartens', 'Burg=', 'gartens', 'Burggarten', 'Gewächshäuser', 'Wall', 'Palais des ', 'Erzherzogs Karl', 'Spitalplatze', 'Hofbau=', 'amte', 'Kärnthnerthor=Theater', 'Wall', 'Kärnthnerthore', 'Kärnthnerstraße', 'Wieden', 'Baden', 'Starhembergische ', 'Freihaus', 'Polytech

In [5]:
clean_places = []
for e in Places:
    if clean_places and clean_places[-1].endswith('='):
        clean_places[-1] = clean_places[-1][:-1] + e
    elif clean_places and (clean_places[-1].endswith('des ') or clean_places[-1].endswith('des')):
        clean_places[-1] = clean_places[-1] + ' ' + e
    elif clean_places and (clean_places[-1].endswith('der ') or clean_places[-1].endswith('der')): 
        clean_places[-1] = clean_places[-1] + ' ' + e     
    elif clean_places and clean_places[-1].endswith('den '): 
        clean_places[-1] = clean_places[-1] + ' ' + e 
    elif clean_places and (clean_places[-1].endswith('Fürsten ') or clean_places[-1].endswith('.')):
        clean_places[-1] = clean_places[-1] + ' ' + e         
    else:
        clean_places.append(e)


Some manual refinements to remove comments that were exported as well: 

In [6]:
clean_places.remove('ist Wien = Donau?')
clean_places.remove('Vorstadt = Bezirk')

Places that were separated and have to be combined for further analysis: <br>
große Steinbrücke, Starhembergisches Freihaus, Appartements Sr. Majestät, fürstlich Esterhazische sogenannte rothe Haus (exported as the three fragments 'fürstlich', 'Esterhazische sogenannte ', 'rothe Haus') <br>
<br>
Places that were combined and have to be separated: <br>
Hofbibliotheksgebäude - Josephplatz; Wienerwaldes - Schneeberg

In [7]:
## the corresponding code: first, merge the fragments that belong together using reduce and write the result back into the list as a single item
## the order here is important! Otherwise, the indices may change.
# I looked up the indices for the corresponding places beforehand using list.index('string')

import functools

clean_places[73:75] = [functools.reduce(lambda x, y: x + ' ' + y, clean_places[73:75])]
#print(clean_places[73])

clean_places[45:47] = [functools.reduce(lambda x, y: x + y, clean_places[45:47])]
#print(clean_places[45])

clean_places[22:24] = [functools.reduce(lambda x, y: x + y, clean_places[22:24])]
#print(clean_places[22])

clean_places[154:157] = [functools.reduce(lambda x, y: x + y, clean_places[154:157])]
#print(clean_places[154])

clean_places.insert(27, 'Hofbibliothek=Gebäudes')
clean_places.insert(28, 'Josephsplatze')
clean_places.remove('Hofbibliothek=Gebäudes Josephsplatze')

clean_places.insert(49, 'Wienerwaldes')
clean_places.insert(50, 'Schneeberg')
clean_places.remove('Wienerwaldes Schneeberg')

clean_places[85:87] = [functools.reduce(lambda x, y: x + ' ' + y, clean_places[85:87])]
#print(clean_places[85])
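
Hard-coded indices shift with every merge, which is why the order of the statements above matters. One way around this, sketched below under the assumption that the first fragment of each pair occurs only once in the list, is to look the fragment up by value and merge it with its right-hand neighbour. The helper `merge_adjacent` and the shortened sample list are hypothetical.

```python
def merge_adjacent(items, first, sep=""):
    """Merge the item equal to `first` with the item directly after it (in place)."""
    i = items.index(first)  # position of the first occurrence of the fragment
    items[i:i + 2] = [items[i] + sep + items[i + 1]]
    return items

# shortened, partly made-up excerpt of the list of places
places = ['Starhembergische ', 'Freihaus', 'Wall', 'große', 'Steinbrücke']

merge_adjacent(places, 'große', sep=' ')
merge_adjacent(places, 'Starhembergische ')
print(places)
```

Since each call searches the list again, the merges can be written in any order. For fragments that occur several times (such as 'Palais des '), the index-based approach is still needed.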


After cleaning the list of places, it might be smart to save them - e.g. in a text file.

In [8]:
with open('../Data/Sonntag/clean_places_text.txt', 'w', encoding='utf8') as clean_places_text:   #the file is closed automatically at the end of the block
    for item in clean_places:
        clean_places_text.write(item + "\n")

Nice - we have the places for our walk! <br>
But in order to put them on the map, we have to combine them with matching coordinates. <br>
Also, it would be nicer to have only one spelling of each place in our list - right now, some of them appear with different endings as well as different cases, which might make string matching harder later on. <br>
<br>
There are several ways to normalize text - one idea is lemmatization, using the spaCy package. <br> 
The problem here is that spaCy splits words that belong together, such as multi-word names, and the lemmatization results may not look very intuitive (e.g. "Stallunge" instead of "Stallung" or "Stallungen"). <br>
I still included the code in my workbook - in other cases it might work better.

In [9]:
#import spacy
#nlp = spacy.load("de_core_news_sm")

#places_text = ' '.join(clean_places)
#nlp_places = nlp(places_text)

#lemma_list = []
#for word in nlp_places: 
    #lemma = word.lemma_
    #lemma_list.append(lemma)    
#print(lemma_list)

Another option is fuzzy string matching (sequence matching) with difflib, a module from Python's standard library: <br>
difflib compares sequences such as lists and strings and scores their similarity with a ratio between 0 and 1, which allows us to find and combine similarly spelled words. <br>
It is important to remember that, in order to match in a sensible way, we have to define a good threshold. <br>
In my case, a fairly strict threshold of 0.9 worked quite well. 

In [10]:
import difflib as dl

Similiar_words = []
for word in clean_places:
    close_words = dl.get_close_matches(word, clean_places, cutoff= 0.9)     #find matches of similar written places
    Similiar_words.append(close_words)                                      #gather them in a list

place_list = []
for sublist in Similiar_words:     #only the first word or version of every similar-places-combination gets chosen as 'standard'
    sublist = sorted(sublist)
    place = sublist[0]
    place_list.append(place)  

with open('../Data/Sonntag/place_list_text.txt', 'w', encoding='utf8') as place_list_text:      #save the new list of places in a file
    for item in place_list:
        place_list_text.write(item + "\n")
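
To get a feeling for the threshold, it can help to run the same logic on a handful of names. The sketch below uses a made-up four-item sample and a hypothetical helper `normalise` that mirrors the two loops above. `SequenceMatcher.ratio()` is the similarity measure that `get_close_matches` compares against `cutoff`.

```python
import difflib

sample_places = ['Volksgarten', 'Volksgartens', 'Burg', 'Burgplatz']

# ratio = 2 * number of matching characters / total length of both strings
ratio_genitive = difflib.SequenceMatcher(None, 'Volksgarten', 'Volksgartens').ratio()
ratio_compound = difflib.SequenceMatcher(None, 'Burg', 'Burgplatz').ratio()
print(round(ratio_genitive, 3), round(ratio_compound, 3))

def normalise(places, cutoff=0.9):
    """Replace every place by the alphabetically first of its close matches."""
    standard = []
    for word in places:
        variants = difflib.get_close_matches(word, places, cutoff=cutoff)
        standard.append(sorted(variants)[0])
    return standard

print(normalise(sample_places))
```

With a cutoff of 0.9, the genitive form 'Volksgartens' is folded into 'Volksgarten', while 'Burg' and 'Burgplatz' stay separate places. A much lower cutoff would start to merge them.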

The next step is to create a dataframe containing the coordinates for each place. <br>
For that, I chose to extract coordinates from ready-made, openly available datasets like KULTURWIKIOGD, a data collection from the Wien Geschichte Wiki (https://www.data.gv.at/katalog/dataset/wien-geschichte-wiki#resources), which contains GIS data for historic places and sights in Vienna. 
<br>
<br>
But to extract only the coordinates of the places of interest - i.e., the places on my place list - I have to compare the dataset with my list. <br>
I chose fuzzy string matching for this task as well, using Python's _thefuzz_ library.

In [11]:
import pandas as pd
csv_path = path.relpath("../Data/KULTURWIKIOGD.csv")    #import the downloaded dataset
usecolumns = ['OBJECTID', 'SHAPE', 'ADRESSE', 'SEITENNAME']    #since the dataset is pretty big, I decided to load only some of the columns for efficiency
coordinates = pd.read_csv(csv_path,  index_col="OBJECTID", usecols=usecolumns)

In [12]:
# a first look at the dataframe
coordinates.head()

| OBJECTID | SHAPE | ADRESSE | SEITENNAME |
|---|---|---|---|
| 294006 | POINT (16.24449608644173 48.20765995647323) | 14., Mondweg 91 | Franz-Sauer-Park |
| 294007 | POINT (16.258093352548578 48.19709893851195) | 13., Rußpekgasse 3 | Franz-Schimon-Park |
| 294008 | POINT (16.26067781801285 48.182308544859616) | 13., Schweizertalstraße 29D | Franz-Schmidt-Park |
| 294009 | POINT (16.338812997907947 48.188874975260035) | 15., U-Bahn-Bogen 1 | Franz-Schwarz-Park |
| 294010 | POINT (16.39077256132719 48.22313059046534) | 02., Nordbahnstraße 49 | Franziska-Löw-Park |


In [13]:
from thefuzz import process, fuzz 

coordinates['MATCH_SCORE'] = coordinates['SEITENNAME'].apply(lambda x: process.extractOne(x, place_list, scorer=fuzz.ratio)[1]) 
# create a new column called MATCH_SCORE, which stores the score of the comparison between each SEITENNAME and our place_list. 
# process.extractOne returns two things: the closest match to a word and the corresponding matching score
# since we only need the score for now, I only extract the second item
# This is important in order to check how well the string matching works - and in order to find a fitting threshold at which the least information is lost

coordinates = coordinates.sort_values('MATCH_SCORE', ascending=False)
coordinates_choice = coordinates.drop(coordinates[coordinates['MATCH_SCORE'] <= 84].index) #I found that 84 is a nice threshold in my case

coordinates_choice['NEUE_NAMEN'] = coordinates_choice['SEITENNAME'].apply(lambda x: process.extractOne(x, place_list, scorer=fuzz.ratio)[0]) 
# to make our dataframe later on better comparable, it would be good to use uniform names for our places - again, process.extractOne comes in handy. 
# this time, we extract the first item, the matching word, and put it into our new column, NEUE_NAMEN

coordinates_choice = coordinates_choice.drop_duplicates(subset=['NEUE_NAMEN']) 
place_coord  = coordinates_choice.drop(columns = ['SEITENNAME', 'MATCH_SCORE']) #drop unnecessary columns

In [14]:
#place_coord

One problem: the places are no longer in the correct order, which is important for our route. <br>
But since the column NEUE_NAMEN contains the same names as our correctly ordered place list, we can use some indexing magic to restore the order!

In [15]:
place_coord = place_coord.set_index('NEUE_NAMEN')  #set the names as the new index
place_coord = place_coord.reindex(index=place_list) #re-index with pandas' reindex function, this time with our place list: the records get ordered according to the place list
place_coord = place_coord.reset_index() #reset the index, because the names shouldn't stay our index
place_coord.to_csv('../Data/Sonntag/Sonntag_coord.csv', sep=',', index=False, encoding='utf-8') #save our new dataframe of places
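
The effect of the re-indexing step is easiest to see on a toy example. The sketch below uses made-up names and placeholder addresses. `route_order` stands in for `place_list`, and `matched` for the filtered dataframe.

```python
import pandas as pd

route_order = ['Burgplatz', 'Burgthor', 'Karlskirche', 'Burgplatz']  # order of the walk

matched = pd.DataFrame({
    'NEUE_NAMEN': ['Karlskirche', 'Burgplatz'],  # ordered by match score, not by route
    'ADRESSE': ['address A', 'address B'],       # placeholder values
})

ordered = (
    matched.set_index('NEUE_NAMEN')
           .reindex(route_order)       # one row per entry of route_order, in that order
           .rename_axis('NEUE_NAMEN')  # make the index name explicit
           .reset_index()              # turn the names back into a regular column
)
print(ordered)
```

Places without a match (here 'Burgthor') stay in the table as rows with missing values, so they can be filled in by hand later. A place that is visited twice appears twice. Note that `reindex` only works this way if the names in the index are unique, which is what the earlier `drop_duplicates` call ensures.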

We also have other resources: For example, the Gazetteer, based on the 1710 Steinhausen plan.

In [16]:
xlsx_path = path.relpath("../Data/Gazetteer_Steinhausenplan_V5.xlsx")
usecolumns = ['ID_neu', 'Toponym', 'Sicherheit', 'Longitude', 'Latitude']
Steinhausen_coordinates = pd.read_excel(xlsx_path, usecols=usecolumns, index_col=0)  

In [17]:
Steinhausen_coordinates.head()

| ID_neu | Toponym | Sicherheit | Longitude | Latitude |
|---|---|---|---|---|
| 1 | 3 Cron: | hoch | 16.376307 | 48.211565 |
| 2 | 3 Fisch | hoch | 16.37424 | 48.210826 |
| 3 | 3 Fisch. | hoch | 16.3741 | 48.210932 |
| 4 | 3 Rueben | mittel | 16.37082 | 48.208375 |
| 5 | 5. Cronen | hoch | 16.368902 | 48.211668 |


Here, we basically repeat our process: 

In [18]:
Steinhausen_coordinates['MATCH_SCORE'] = Steinhausen_coordinates['Toponym'].apply(lambda x: process.extractOne(x, place_list, scorer=fuzz.ratio)[1])
Steinhausen_coordinates['NEUE_NAMEN'] = Steinhausen_coordinates['Toponym'].apply(lambda x: process.extractOne(x, place_list, scorer=fuzz.ratio)[0])
Steinhausen_coordinates = Steinhausen_coordinates.sort_values('MATCH_SCORE', ascending=False)

Steinhausen_coordinates = Steinhausen_coordinates.drop(Steinhausen_coordinates[Steinhausen_coordinates['MATCH_SCORE'] <= 80].index)
Steinhausen_choice = Steinhausen_coordinates.drop_duplicates(subset=['NEUE_NAMEN'])

Steinhausen_choice = Steinhausen_choice.set_index('NEUE_NAMEN')
Steinhausen_choice = Steinhausen_choice.reindex(index=place_list)
Steinhausen_choice = Steinhausen_choice.reset_index()   #turn the names back into a regular column

Steinhausen_choice = Steinhausen_choice.drop(columns=['Sicherheit', 'MATCH_SCORE', 'Toponym'])
Steinhausen_choice.to_csv('../Data/Sonntag/Steinhausen_Sonntag_coord.csv', sep=',', index=False, encoding='utf-8')

Applied processor reduces input query to empty string, all comparisons will have score 0. [Query: '<...>']

We would surely get the best results - and by that, I mean the most coordinates - by combining the knowledge from both of our new dataframes. <br>
But the two dataframes store their coordinates in different columns and formats: 
- the Wien Geschichte Wiki dataset stores each location as a single point geometry string in the SHAPE column (`POINT (longitude latitude)`), 
- the Gazetteer uses separate Latitude and Longitude columns.

In order to combine them, we have to split the SHAPE column of the place_coord dataframe. 

In [19]:
place_coord['SHAPE'] = place_coord['SHAPE'].str.replace(r'POINT', '', regex=True)
place_coord['SHAPE'] = place_coord['SHAPE'].str.replace(r'\)', '', regex=True)
place_coord[['Longitude', 'Latitude']] = place_coord['SHAPE'].str.extract(r'(\d+\.\d+)\s(\d+\.\d+)', expand=True)  #raw string and escaped dots, so that '.' only matches the decimal point


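
The `SHAPE` values follow the pattern `POINT (longitude latitude)`, as the dataframe preview further up shows. As a small standard-library sketch of the same splitting step, the hypothetical helper `parse_point` below extracts both numbers with one regular expression and converts them to floats.

```python
import re

# longitude and latitude: optional minus sign, digits, optional decimal part
POINT_PATTERN = re.compile(
    r'POINT\s*\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)'
)

def parse_point(shape):
    """Split 'POINT (lon lat)' into a (longitude, latitude) tuple of floats."""
    match = POINT_PATTERN.search(shape)
    if match is None:
        return None  # not a point geometry
    return float(match.group(1)), float(match.group(2))

print(parse_point('POINT (16.24449608644173 48.20765995647323)'))
```

In pandas, `str.extract` returns the captured groups as strings. If numeric columns are needed, for example to compare them with the Gazetteer values, an additional `astype(float)` on the two new columns should do the conversion.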


After reshaping the place_coord, we can join both dataframes.

In [20]:
joined_places = pd.merge(place_coord, Steinhausen_choice, left_index=True, right_index=True)
joined_places = joined_places.drop(columns=["NEUE_NAMEN_y", "SHAPE"])
joined_places.to_excel('../Data/Sonntag/joined_Sonntag_coord.xlsx')
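
When both dataframes contain columns with the same name, `pd.merge` appends the default suffixes `_x` and `_y`, which is why the cell above drops `NEUE_NAMEN_y`. As a sketch with made-up values, the `suffixes` parameter can be used to make the origin of each coordinate visible in the column names. The frames `wiki` and `steinhausen` are stand-ins for `place_coord` and `Steinhausen_choice`.

```python
import pandas as pd

wiki = pd.DataFrame({
    'NEUE_NAMEN': ['Burgplatz', 'Karlskirche'],
    'Longitude': ['16.36', '16.37'],   # made-up values
    'Latitude': ['48.20', '48.19'],
})
steinhausen = pd.DataFrame({
    'NEUE_NAMEN': ['Burgplatz', 'Karlskirche'],
    'Longitude': [16.361, None],       # made-up values, second place has no match
    'Latitude': [48.201, None],
})

joined = pd.merge(
    wiki,
    steinhausen.drop(columns='NEUE_NAMEN'),  # the names are already in `wiki`
    left_index=True, right_index=True,       # join row by row on the position
    suffixes=('_wiki', '_steinhausen'),
)
print(joined.columns.tolist())
```

Joining on the row position only gives meaningful pairs because both dataframes were re-indexed with the same place list beforehand, so row n describes the same place in both.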

I wanted to keep the sometimes differing longitude and latitude values of KULTURWIKIOGD and the Steinhausen plan <br>
in order to check which of the coordinates fit better on the map.
<br>
Since some coordinates from other sources had to be added manually, I exported the result as an Excel file. 
To import the data into QGIS later, you have to convert it to CSV format. 

Looking at the data later, there is an interesting variation between route instructions and descriptions of the view at each waypoint. <br>
It would be good to distinguish between route instructions and viewpoint descriptions, ideally automatically. <br> <br>

One idea is a simple rule-based approach: only use places that are mentioned in the context of 'going somewhere' - this way, I thought, one could reduce the number of data points and put them into better context. <br>
1. For that, I extracted the verbs of the text using spaCy. 
2. Then I manually edited and filtered the verbs, so that only verbs like 'going', 'walking' etc. remained. 
3. I used this verb list to extract the places in the corresponding sentences, 
4. and then filtered my original dataframe accordingly.

<br> But in the end, not many places dropped out of the list - probably because my approach was a little too coarse. 
In this case, manual selection from the text might be the smartest way to go. 

In [4]:
Sonntag = ' '.join(Sonntag)

In [23]:
import spacy
nlp = spacy.load("de_core_news_sm")
spaced_sunday = nlp(Sonntag)

In [24]:
sentences = spaced_sunday.sents

In [25]:
verbs = []
for sentence in sentences: 
    for word in sentence: 
        if word.pos_ == "VERB":
            verbs.append(word)


In [26]:
with open("../Data/Sonntag/verbs_as_editable_text.txt", 'w', encoding='utf8') as output:
    for verb in verbs:
        output.write(str(verb) + '\n')

In [27]:
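with open("../Data/Sonntag/verbs_as_editable_text.txt", 'r', encoding='utf8') as data:   #read the manually edited verb list back in
    cleaned_verbs = data.read().split("\n")
cleaned_verbs = [verb for verb in cleaned_verbs if verb.strip()]   #skip empty lines: an empty string would be found 'in' every sentence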

In [28]:
selection = []
for verb in cleaned_verbs: 
    for sentence in Sonntag.split('.'):
        if verb in sentence:
            if sentence not in selection:
                selection.append(sentence)
selection = '.'.join(selection)                

In [29]:
print(selection)

Sonntag Vormittag, wo man nicht zu bestimmten Stunden in irgend eine Anstalt eilen muß. Die beliebten Or= chester von Strauß und Lanner spielen hier zuweilen. Wir beginnen von der k. Tritt man zur selben hinaus, so hat man den großen äuße= ren Burgplatz vor sich, welcher durch 2 sich kreu= zende Wege in 4 Rasenparterre abgetheilt ist. Rechts und links führen in allen Ecken Wege auf den Wall; gerade vor sich hat man das Burgthor, unter Kaiser Franz I. Zu beiden Seiten des Gebäudes führen auch Stiegen auf den Wall, die aber gewöhnlich verschlos= sen sind, so wie der Aufgang auf die Plattform des Thores. von Nobile 1822 erbaut. Gerade vor sich hat man die kaiserlichen Stallungen (von Fischer von Erlach erbaut), 600 Fuß lang, welche 400 Pferde= stände enthalten. Auf dem Ravelin steht links das neue Palais des Herzogs von Koburg, 1842 erbaut. Man geht hierauf etwas bergan zum Schot= tenthore, 1841 neu erbaut, und sieht die Schot= tengasse hinab, bis auf die Freiung und in die Herrngasse hin

In [30]:
selection_clean = re.sub(r'(\w)= ([^uo])',r'\g<1>\g<2>', selection)
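
The substitution above joins words that were split with `=` at a line break. By excluding `u` and `o` after the space, it presumably leaves constructions like "Antiken= und Mineralienkabinet" untouched. A side effect is that a split word whose second half starts with `u` or `o` stays split as well. The sketch below shows one possible variant that only skips the whole words "und" and "oder". The helper name `dehyphenate` and the third test string are made up for illustration.

```python
import re

# a word character, '=', a space, but not followed by the whole word 'und' or 'oder'
SPLIT_WORD = re.compile(r'(\w)= (?!(?:und|oder)\b)')

def dehyphenate(text):
    """Join words that were split with '=' at a line break."""
    return SPLIT_WORD.sub(r'\1', text)

print(dehyphenate("so hat man den großen äuße= ren Burgplatz vor sich"))
print(dehyphenate("Antiken= und Mineralienkabinet"))
print(dehyphenate("Bezirks= ordnung"))
```

The first and third strings are joined into single words, while the suspended compound in the second one is kept as it is.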

In [31]:
print(selection_clean)

Sonntag Vormittag, wo man nicht zu bestimmten Stunden in irgend eine Anstalt eilen muß. Die beliebten Orchester von Strauß und Lanner spielen hier zuweilen. Wir beginnen von der k. Tritt man zur selben hinaus, so hat man den großen äußeren Burgplatz vor sich, welcher durch 2 sich kreuzende Wege in 4 Rasenparterre abgetheilt ist. Rechts und links führen in allen Ecken Wege auf den Wall; gerade vor sich hat man das Burgthor, unter Kaiser Franz I. Zu beiden Seiten des Gebäudes führen auch Stiegen auf den Wall, die aber gewöhnlich verschlossen sind, so wie der Aufgang auf die Plattform des Thores. von Nobile 1822 erbaut. Gerade vor sich hat man die kaiserlichen Stallungen (von Fischer von Erlach erbaut), 600 Fuß lang, welche 400 Pferdestände enthalten. Auf dem Ravelin steht links das neue Palais des Herzogs von Koburg, 1842 erbaut. Man geht hierauf etwas bergan zum Schottenthore, 1841 neu erbaut, und sieht die Schottengasse hinab, bis auf die Freiung und in die Herrngasse hinein. Jn der St

In [32]:
new_place_list = []
#for place in place_list:
for sentence in selection_clean.split('.'):
    for place in place_list:
        if place in sentence: 
            new_place_list.append(place)

In [33]:
print(new_place_list)

['Burgplatz', 'Burg', 'Burg', 'Burgplatz', 'Burgplatz', 'Burgthor', 'Burg', 'Burg', 'Wall', 'Wall', 'Wall', 'Wall', 'Thore', 'kaiserlichen Stallungen', 'Palais', 'Palais des Herzogs von Koburg', 'Ravelin', 'Ravelin', 'Ravelin', 'Ravelin', 'Schottenthore', 'Schottengasse', 'Freiung', 'Herrngasse', 'Palais', 'Bastei', 'Schenkenstraße', 'Palais Liechtenstein', 'Volksgarten', 'Volksgarten', 'Volksgarten', 'Volksgarten', 'Theseustempel', 'Burg', 'Burg', 'Kahlengebirge', 'Kahlengebirge', 'Praters', 'Leopoldstadt', 'Ferdinandsbrücke', 'Rothenthurm=Thore', 'Rothenthurm=Thore', 'Leopoldstadt', 'Thore', 'Leopoldstadt', 'Kahlengebirge', 'Ravelin', 'Ravelin', 'Ravelin', 'Fischerthore', 'Ravelin', 'Bastei', 'Arsenales', 'Proviantbäckerei', 'Burgplatz', 'Burg', 'Burg', 'Burgplatz', 'Volksgarten', 'Volksgarten', 'Burgplatz', 'Volksgarten', 'Volksgarten', 'Gebäude der ungarischen Garde', 'Palais', 'KriminalGebäude', 'Kahlengebirge', 'Kahlengebirge', 'Kahlengebirge', 'Grenadierkaserne', 'Getreidemarkt'
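
In the output above, 'Burg' shows up next to 'Burgplatz' because `place in sentence` also matches parts of longer words. One possible refinement, sketched here with a hypothetical helper `places_in_sentence`, is to accept a place only if it is not directly surrounded by further word characters.

```python
import re

def places_in_sentence(sentence, places):
    """Return the places that occur in the sentence as whole words."""
    found = []
    for place in places:
        # no word character directly before or after the place name
        pattern = r'(?<!\w)' + re.escape(place) + r'(?!\w)'
        if re.search(pattern, sentence):
            found.append(place)
    return found

sentence = "so hat man den großen äußeren Burgplatz vor sich"
print(places_in_sentence(sentence, ['Burg', 'Burgplatz', 'Wall']))
```

The trade-off is that inflected forms in the text are no longer found unless they are on the place list themselves, so this is stricter than the substring test, not necessarily better.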

In [34]:
joined_places_nn = pd.read_excel('../Data/Sonntag/joined_Sonntag_coord_nn.xlsx')

In [35]:
Sonntags_Route = joined_places_nn.loc[joined_places_nn['NEUE_NAMEN'].isin(new_place_list)]

In [36]:
Sonntags_Route.to_csv('../Data/Sonntag/Sonntags_Route.csv', sep=',', index=False, encoding='utf-8')

Another idea would be to use LLMs - like ChatGPT - for this task. <br>
For that, I created a simple text-file of the Sunday-Chapter, and copied it into ChatGPT. <br>
Trying out different prompts might help to filter out a list of route instructions.

In [None]:
#create a text to use in ChatGPT
#print(Sonntag)
with open("../Data/Sonntag/Sonntag.txt", 'w', encoding='utf8') as text:
    text.write(str(Sonntag))

Some of my prompts: <br> <br>

"Imagine I want to take a stroll through vienna, following pre-written instructions from a text. Can you use NER to filter out only relevant route-instructions? The text goes as follows:" <br> <br>

"Given the following text: "..." What can I see from "example_place"?" <br> <br>

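" [text] Wo bin ich und was sehe ich von hier aus?" (English: "Where am I and what do I see from here?") <br> <br>

" [text] Wo soll man lang gehen und was sieht man jeweils?" (English: "Which way should one go, and what can be seen at each point?")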