# Documentation on network analysis
The goal of the character network analysis is to count the interactions between the characters in each text and to compare them across the three versions: the story, the 1996 screenplay, and the final version of the screenplay (1999). 
For each XML document analyzed, we created a CSV file containing the interactions, a static graph, and an interactive graph.  

The first step was to manually find all the direct and indirect dialogues in the texts and tag them in the **XML-TEI**. In the book, dialogues are tagged with ```<said>```, and each one carries the attribute "who", which identifies the speaker, and the attribute "toWhom", which identifies the listener. In the screenplay, we manually added the tag ```<sp>```, which marks individual speeches in performance texts, with the same "who" and "toWhom" attributes. Multiple speakers or listeners are written as a single string, separated by commas. 
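
As an illustration of the markup described above, the following minimal sketch parses a small, made-up TEI fragment and prints the "who" and "toWhom" values of each ```<said>``` element. The sample text, the character identifiers (`#fridolin`, `#albertine`, `#daughter`) and the helper name `read_dialogue_attributes` are assumptions made for this example; they are not taken from the encoded files.

```python
from xml.etree import ElementTree as ET

TEI_NS = "{http://www.tei-c.org/ns/1.0}"

# Hypothetical fragment: identifiers and text are invented for illustration.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <p><said who="#albertine" toWhom="#fridolin">Example line.</said></p>
      <p><said who="#fridolin" toWhom="#albertine,#daughter">Another line.</said></p>
    </body>
  </text>
</TEI>"""


def read_dialogue_attributes(xml_text, tag="said"):
    """Return the (who, toWhom) attribute pair of every `tag` element."""
    root = ET.fromstring(xml_text)
    return [(el.get("who"), el.get("toWhom")) for el in root.iter(f"{TEI_NS}{tag}")]


for who, to_whom in read_dialogue_attributes(SAMPLE_TEI):
    print(who, "->", to_whom)
```

Passing `tag="sp"` would read the same attributes from a screenplay fragment encoded in the same way.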

The libraries needed to extract information from the TEI document and ... are: [xml.etree](https://docs.python.org/3/library/xml.etree.elementtree.html#).


**xml.etree** is used to walk the hierarchical structure of the XML-TEI document. Here we need to find every ```<said>``` or ```<sp>``` element in the TEI document and read its "who" and "toWhom" attributes. For each element, this information is added as a tuple to `content_list`. The screenplay script also checks whether there is more than one speaker or listener (that is, whether the value contains a comma), splits the value into a list (`who_tags` or `listener_tags`), and appends one tuple per name to `content_list`. `interactions` is a dictionary that stores how many times the characters interact with each other, obtained by counting how many times each tuple occurs in `content_list`. 
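
The splitting and counting steps described above can be sketched on in-memory data as follows. This is a minimal sketch, not the script used for the project: the helper names `expand_pairs` and `count_interactions` and the sample identifiers are hypothetical, and `collections.Counter` is swapped in for the hand-written dictionary count.

```python
from collections import Counter
from itertools import product


def expand_pairs(who, to_whom):
    """Split comma-separated speakers and listeners into individual pairs."""
    speakers = [s.strip() for s in who.split(",") if s.strip()]
    listeners = [l.strip() for l in to_whom.split(",") if l.strip()]
    return list(product(speakers, listeners))


def count_interactions(raw_pairs):
    """Count how often each (speaker, listener) pair occurs."""
    counts = Counter()
    for who, to_whom in raw_pairs:
        counts.update(expand_pairs(who, to_whom))
    return counts


# Hypothetical attribute values, invented for illustration.
raw = [
    ("#bill", "#alice"),
    ("#bill", "#alice,#helena"),
    ("#alice,#helena", "#bill"),
]
interactions = count_interactions(raw)
print(interactions[("#bill", "#alice")])  # 2
```

Taking the product of speakers and listeners means that a speech with several speakers and several listeners yields one pair for every combination.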

Then, we store the data into an external csv file. 



In [2]:
from xml.etree import ElementTree as ET
import csv

def generate_network_csv(xml_file_path):
    tree = ET.parse(xml_file_path)
    root = tree.getroot()

    content_list = []
    for said_tag in root.findall('.//{http://www.tei-c.org/ns/1.0}said'):
        who_tag = said_tag.get('who')
        listener_tag = said_tag.get('toWhom')
        if who_tag is not None:
            content_list.append((who_tag,listener_tag))

    interactions={}
    for i in content_list:
        if i not in interactions:
            interactions[i] = 1
        else:
            interactions[i] +=1
    
    # Write interactions to a CSV file
    with open('interactions_book.csv', 'w', newline='') as csvfile:
        csvwriter = csv.writer(csvfile)
        
        # Write header
        csvwriter.writerow(['Character', 'Listener', 'Interaction Count'])
        
        # Write data rows
        for (who, listener), count in interactions.items():
            listener_str = listener if listener else "unknown"
            csvwriter.writerow([who, listener_str, count])

    print("CSV file 'interactions_book.csv' created.")

generate_network_csv("C:/Users/crosi/Documents/GitHub/metascript/Arthur Schnitzler - Dream Story (2003, Green Integer).xml")

CSV file 'interactions_book.csv' created.


In [None]:
from xml.etree import ElementTree as ET
import csv

def generate_network_csv(xml_file_path):
    tree = ET.parse(xml_file_path)
    root = tree.getroot()

    content_list = []
    for said_tag in root.findall('.//{http://www.tei-c.org/ns/1.0}sp'):
        who_tag = said_tag.get('who')
        listener_tag = said_tag.get('toWhom')
        if listener_tag is not None and who_tag is not None:
            # Split comma-separated speakers and listeners, dropping stray spaces,
            # and record one tuple for every speaker/listener combination
            who_tags = [w.strip() for w in who_tag.split(',')]
            listener_tags = [l.strip() for l in listener_tag.split(',')]
            for who in who_tags:
                for listener in listener_tags:
                    content_list.append((who, listener))

    interactions={}
    for i in content_list:
        if i not in interactions:
            interactions[i] = 1
        else:
            interactions[i] +=1
    
    # Write interactions to a CSV file
    with open('interactions_screenplay96.csv', 'w', newline='') as csvfile:
        csvwriter = csv.writer(csvfile)
        
        # Write header
        csvwriter.writerow(['Character', 'Listener', 'Interaction Count'])
        
        # Write data rows
        for (who, listener), count in interactions.items():
            listener_str = listener if listener else "unknown"
            csvwriter.writerow([who, listener_str, count])

    print("CSV file 'interactions_screenplay96.csv' created.")


generate_network_csv("C:/Users/crosi/Documents/GitHub/metascript/eyes-wide-shut-1996-screenplay.xml")

CSV file 'interactions_screenplay96.csv' created.
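
To check a generated file, its rows can be read back with `csv.DictReader`. The sketch below rests on a few assumptions: it writes a small temporary file with the same three columns and invented rows, so that it runs without the project's data, and the helper name `load_interactions` is hypothetical.

```python
import csv
import os
import tempfile


def load_interactions(csv_path):
    """Read an interactions CSV into a {(speaker, listener): count} dictionary."""
    with open(csv_path, newline="", encoding="utf-8") as csvfile:
        return {
            (row["Character"], row["Listener"]): int(row["Interaction Count"])
            for row in csv.DictReader(csvfile)
        }


# Invented rows in the same column layout as the files written above.
sample_rows = [
    ["Character", "Listener", "Interaction Count"],
    ["#bill", "#alice", "12"],
    ["#alice", "#bill", "9"],
]

with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "interactions_sample.csv")
    with open(sample_path, "w", newline="", encoding="utf-8") as csvfile:
        csv.writer(csvfile).writerows(sample_rows)
    loaded = load_interactions(sample_path)

# Pair with the highest interaction count in the sample
top_pair = max(loaded, key=loaded.get)
print(top_pair, loaded[top_pair])
```

A dictionary keyed by (speaker, listener) keeps the direction of each interaction, which matters if the graphs are meant to be directed.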
