# Documentation on spatial analysis
The main goal of the spatial analysis in this project was to visualize and compare the places in which the book and the screenplay are set. In our case, Traumnovelle and Eyes Wide Shut are set in two different cities, and in the movie in particular the places in which scenes take place are not specified. The tool is more useful when both texts specify their settings, because the differences between them then become more meaningful for a comparison. The script automates the creation of the map as far as possible: it automatically adds circles for city areas and points for other types of geographical elements. Unfortunately, the GeoJSON files for the regions, states, or countries that we want to display still have to be added manually.

The first step was to search the texts manually for all the places they mention and to tag them inside the **TEI header**. The places are listed in a ```<listPlace>``` element, and each geographical element is defined with a ```<place>``` tag that also carries its XML identifier and its type (e.g. city, country, building). 
The ```<placeName>``` is also specified and, when possible, linked to the corresponding GeoNames unique identifier. 
The ```<settlement>``` and ```<country>``` tags provide the city and country of the element, and ```<geo>``` holds its coordinates. 
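
To make the tagging scheme concrete, here is a minimal sketch that parses a single hypothetical ```<place>``` entry and reads back the fields described above. The sample entry, the `ref` attribute used for the GeoNames link, and the nesting of ```<settlement>``` and ```<country>``` inside ```<location>``` are assumptions made for illustration; the actual header of the project files may be organized differently. The helper name `describe_places` is also hypothetical.

```python
from xml.etree import ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_NS = "http://www.w3.org/XML/1998/namespace"  # namespace of xml:id

# Hypothetical sample entry; attribute names and nesting are assumptions.
SAMPLE_LIST_PLACE = f"""
<listPlace xmlns="{TEI_NS}">
  <place xml:id="vienna" type="city">
    <placeName ref="https://www.geonames.org/2761369/">Vienna</placeName>
    <location>
      <settlement>Vienna</settlement>
      <country>Austria</country>
      <geo>48.2110,16.3773</geo>
    </location>
  </place>
</listPlace>
"""


def describe_places(xml_text):
    """Return one dictionary per <place> with the fields tagged in the header."""
    root = ET.fromstring(xml_text)
    described = []
    for place in root.iter(f"{{{TEI_NS}}}place"):
        name = place.find(f"{{{TEI_NS}}}placeName")
        described.append({
            "id": place.get(f"{{{XML_NS}}}id"),
            "type": place.get("type"),
            "name": name.text if name is not None else None,
            "ref": name.get("ref") if name is not None else None,
            "settlement": place.findtext(f".//{{{TEI_NS}}}settlement"),
            "country": place.findtext(f".//{{{TEI_NS}}}country"),
            "geo": place.findtext(f".//{{{TEI_NS}}}geo"),
        })
    return described


for entry in describe_places(SAMPLE_LIST_PLACE):
    print(entry)
```

The coordinates are written as `latitude,longitude` separated by a comma, because that is the format the script in this notebook expects when it splits the content of ```<geo>```.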

The libraries used to extract information from the TEI document and to visualize it on a map are, respectively, [xml.etree](https://docs.python.org/3/library/xml.etree.elementtree.html#) and [folium](https://python-visualization.github.io/folium/).


**xml.etree** is used to walk the hierarchical structure of the XML-TEI document. Here we look inside the TEI header for every ```<place>``` element. We then build the dictionary `coordinates_dict`, in which each key is a tuple made of the content of ```<placeName>``` and the value of the `type` attribute, and each value is the pair of coordinates. 
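
The dictionary-building step can be sketched in isolation as follows. This is a compact variant rather than the exact code of the cells in this notebook: it assumes a small hypothetical TEI sample (with the list of places placed under `settingDesc`, which is only one possible location), uses a prefix-to-namespace mapping instead of spelling out the full namespace in every path, and silently skips entries whose coordinates cannot be parsed. The name `build_coordinates_dict` is hypothetical.

```python
from xml.etree import ElementTree as ET

# Prefix mapping so that paths can be written as "tei:place".
NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical sample document; the second entry has malformed coordinates.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <profileDesc><settingDesc><listPlace>
      <place type="city">
        <placeName>Vienna</placeName>
        <location><geo>48.2110,16.3773</geo></location>
      </place>
      <place type="building">
        <placeName>Sample house</placeName>
        <location><geo>not-a-coordinate</geo></location>
      </place>
    </listPlace></settingDesc></profileDesc>
  </teiHeader>
</TEI>"""


def build_coordinates_dict(root):
    """Map (place name, place type) to a (latitude, longitude) tuple."""
    coordinates = {}
    header = root.find("tei:teiHeader", NS)
    if header is None:
        return coordinates
    for place in header.iterfind(".//tei:place", NS):
        name = place.findtext("tei:placeName", namespaces=NS)
        geo = place.findtext("tei:location/tei:geo", namespaces=NS)
        if not name or not geo:
            continue
        try:
            lat, lon = (float(part) for part in geo.split(","))
        except ValueError:
            # Wrong number of parts or a non-numeric part: skip the entry.
            continue
        coordinates[(name.strip(), place.get("type"))] = (lat, lon)
    return coordinates


print(build_coordinates_dict(ET.fromstring(SAMPLE_TEI)))
```

Using the tuple `(name, type)` as the key keeps the type available at drawing time without a second lookup in the XML tree.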

**Folium** leverages the Leaflet.js JavaScript library to generate maps directly in a Python environment. It is used here to create the map, with the initial center set manually to the area from which we want the visualization to start. While iterating over the dictionary, if the `type` attribute is `city`, a circle is drawn around the coordinates defined in the XML-TEI document; otherwise, only a point is added at those coordinates. 
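
The circle-or-point decision can be separated from the drawing itself, as in the sketch below. The two helper names (`plan_layers`, `draw_layers`) and the intermediate list of dictionaries are hypothetical and not part of the project code; the folium calls (`folium.Map`, `folium.Circle`, `folium.Marker`) and their arguments are the ones already used in the cells of this notebook. folium is imported inside `draw_layers` so that the planning step can be tried without it.

```python
def plan_layers(coordinates_dict, city_radius_m=5000):
    """Decide, for each place, whether it becomes a circle or a marker."""
    layers = []
    for (name, place_type), location in coordinates_dict.items():
        if place_type == "city":
            layers.append({"kind": "circle", "name": name,
                           "location": location, "radius": city_radius_m})
        else:
            layers.append({"kind": "marker", "name": name,
                           "location": location})
    return layers


def draw_layers(layers, center, zoom_start=5):
    """Draw the planned layers on a new folium map and return the map."""
    import folium  # third-party dependency, only needed for drawing

    m = folium.Map(location=list(center), zoom_start=zoom_start)
    for layer in layers:
        if layer["kind"] == "circle":
            folium.Circle(
                radius=layer["radius"],
                location=layer["location"],
                popup=layer["name"],
                tooltip=layer["name"],
                color="#3186cc",
                fill=True,
                fill_color="#3186cc",
            ).add_to(m)
        else:
            folium.Marker(location=layer["location"],
                          popup=layer["name"]).add_to(m)
    return m


# Hypothetical sample data in the same shape as coordinates_dict.
sample = {("Vienna", "city"): (48.211, 16.3773),
          ("Sample house", "building"): (48.2, 16.37)}
print(plan_layers(sample))
```

With folium installed, `draw_layers(plan_layers(sample), center=(48.211, 16.3773)).save("map.html")` would write the map to a file; the output file name here is only an example.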

Since we also want to show the areas of regions, countries, and states, we need to find the corresponding GeoJSON files and load them manually, one by one, into the map. 
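
When several GeoJSON files are involved, it can help to load each one independently, so that a missing or malformed file does not prevent the others from being displayed. The following is a minimal sketch under that assumption; the helper name `load_geojson_layers` and the returned pair of dictionaries are hypothetical, and only the standard library is used.

```python
import json
from pathlib import Path


def load_geojson_layers(paths):
    """Load each GeoJSON file on its own and report the ones that failed.

    Returns (loaded, failed): `loaded` maps the file stem to the parsed
    JSON data, `failed` maps the file name to the error message.
    """
    loaded, failed = {}, {}
    for path in map(Path, paths):
        try:
            with path.open(encoding="utf-8") as handle:
                loaded[path.stem] = json.load(handle)
        except (OSError, ValueError) as error:
            # OSError: file missing or unreadable; ValueError: invalid JSON.
            failed[path.name] = str(error)
    return loaded, failed


loaded, failed = load_geojson_layers(["connecticut.geojson",
                                      "new jersey.geojson"])
for file_name, message in failed.items():
    print(f"Error loading GeoJSON {file_name}: {message}")
```

Each entry of `loaded` can then be added to the map with `folium.GeoJson(data, name=..., tooltip=...).add_to(m)`, as done in the cells of this notebook.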

Finally, we save the map as an HTML file that can be embedded in the main website for visualization.


In [1]:
from xml.etree import ElementTree as ET
import folium
import json


def generate_coordinates_dictionary(xml_file_path):
    tree = ET.parse(xml_file_path)
    root = tree.getroot()

    tei_header = root.find('{http://www.tei-c.org/ns/1.0}teiHeader')
    if tei_header is not None:
        places = tei_header.findall('.//{http://www.tei-c.org/ns/1.0}place')
        coordinates_dict = {}

        for place in places:
            place_name_element = place.find(
                '{http://www.tei-c.org/ns/1.0}placeName')
            if place_name_element is not None:
                place_name = place_name_element.text
                place_type = place.get('type')
                geo_element = place.find(
                    '{http://www.tei-c.org/ns/1.0}location/{http://www.tei-c.org/ns/1.0}geo')
                if geo_element is not None:
                    coordinates = geo_element.text
                    if coordinates:
                        try:
                            lat, lon = coordinates.split(",")
                            # Convert to float and create a tuple
                            coordinates_dict[(place_name, place_type)] = (
                                float(lat), float(lon))
                            
                        except ValueError:
                            print(
                                f"Invalid coordinates format for {place_name}: {coordinates}")
                            
        m = folium.Map(location=[48.21095900638111,16.377296146797722], zoom_start=5)
        for key, value in coordinates_dict.items():
            if key[1] == 'city':
                folium.Circle(
                    radius=5000,
                    location=value,
                    popup=key[0],
                    color="#3186cc",
                    fill=True,
                    fill_color="#3186cc",
                    tooltip=key[0],
                ).add_to(m)
            else:
                folium.Marker(location=value, popup=key[0]).add_to(m)
        
        # Trace the area of the country from its GeoJSON file
        try:
            with open('denmark-detailed-boundary_896.geojson', 'r', encoding='utf-8') as file:
                data = json.load(file)
                folium.GeoJson(data, name="denmark", tooltip="Denmark").add_to(m)
        except Exception as e:
            print(f"Error loading GeoJSON: {e}")

        return m.save("metascript_map.html")
        



generate_coordinates_dictionary("C:/Users/crosi/Documents/GitHub/metascript/Dream_Story.xml")

In [1]:
from xml.etree import ElementTree as ET
import folium
import json


def generate_coordinates_dictionary(xml_file_path):
    tree = ET.parse(xml_file_path)
    root = tree.getroot()

    tei_header = root.find('{http://www.tei-c.org/ns/1.0}teiHeader')
    if tei_header is not None:
        places = tei_header.findall('.//{http://www.tei-c.org/ns/1.0}place')
        coordinates_dict = {}

        for place in places:
            place_name_element = place.find(
                '{http://www.tei-c.org/ns/1.0}placeName')
            if place_name_element is not None:
                place_name = place_name_element.text
                place_type = place.get('type')
                geo_element = place.find(
                    '{http://www.tei-c.org/ns/1.0}location/{http://www.tei-c.org/ns/1.0}geo')
                if geo_element is not None:
                    coordinates = geo_element.text
                    if coordinates:
                        try:
                            lat, lon = coordinates.split(",")
                            # Convert to float and create a tuple
                            coordinates_dict[(place_name, place_type)] = (
                                float(lat), float(lon))
                            
                        except ValueError:
                            print(
                                f"Invalid coordinates format for {place_name}: {coordinates}")
                            
        m = folium.Map(location=[40.71258347482157, -74.0181331499278], zoom_start=6)
        for key, value in coordinates_dict.items():
            if key[1] == 'city':
                folium.Circle(
                    radius=5000,
                    location=value,
                    popup=key[0],
                    color="#3186cc",
                    fill=True,
                    fill_color="#3186cc",
                    tooltip=key[0],
                ).add_to(m)
            else:
                folium.Marker(location=value, popup=key[0]).add_to(m)
        
        # Trace the areas of the states from their GeoJSON files.
        # Each file is loaded in its own try block, so that a failure on
        # one file does not prevent the other from being displayed.
        for file_name, name, label in [
                ('connecticut.geojson', 'connecticut', 'Connecticut'),
                ('new jersey.geojson', 'newjersey', 'New Jersey')]:
            try:
                with open(file_name, 'r', encoding='utf-8') as file:
                    data = json.load(file)
                folium.GeoJson(data, name=name, tooltip=label).add_to(m)
            except Exception as e:
                print(f"Error loading GeoJSON {file_name}: {e}")

        return m.save("screenplay_map.html")
        



generate_coordinates_dictionary("C:/Users/crosi/Documents/GitHub/metascript/eyes-wide-shut-1996-screenplay.xml")