*dramavis* is a Python program dedicated to the network analysis of dramatic texts. It computes a variety of network measures and generates graph visualisations.
archive/v0.1
testoutput_chars
testoutput_metrics
tests
.gitignore
LICENSE
README.md
dramalyzer.py
dramaplotter.py
dramavis.yml
linacorpus.py
requirements.txt
superposter.py
workflow.py


dramavis

by Frank Fischer (@umblaetterer) & Christopher Kittel (@chris_kittel)

Purposes of this Python script:

  • reading character networks of dramatic pieces from lina-xml files;
  • plotting these networks into SVG graphs (and generating a superposter containing all graphs in chronological order, see our showcase poster);
  • writing drama network and character metrics values to CSV files.
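The three steps above can be illustrated with a minimal, standard-library-only sketch. It is not the dramavis implementation: the lina-xml parsing is omitted, the `segments` data and the function names `build_edges`, `degrees` and `metrics_csv` are hypothetical, and degree stands in for the richer set of metrics the tool computes.

```python
import csv
import io
from itertools import combinations

# Hypothetical input: each inner list holds the characters present in one segment.
# dramavis reads this information from lina-xml files; parsing is omitted here.
segments = [
    ["Faust", "Wagner"],
    ["Faust", "Mephisto"],
    ["Faust", "Mephisto", "Gretchen"],
]


def build_edges(segments):
    """Map each undirected character pair to its number of shared segments."""
    edges = {}
    for present in segments:
        for a, b in combinations(sorted(set(present)), 2):
            edges[(a, b)] = edges.get((a, b), 0) + 1
    return edges


def degrees(edges):
    """Return the number of distinct neighbours for every character."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    return {name: len(others) for name, others in neighbours.items()}


def metrics_csv(degree_by_name):
    """Serialise the character metrics as CSV text, one row per character."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["character", "degree"])
    for name in sorted(degree_by_name):
        writer.writerow([name, degree_by_name[name]])
    return buffer.getvalue()


edges = build_edges(segments)
print(metrics_csv(degrees(edges)))
```

Characters who share a segment are linked, and the edge weight counts how often they do so. This is one common way to model a character network and is assumed here only for illustration.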

Version history:

  • v0.0: spaghetti-code version written in August 2014 (never published);
  • v0.1: rewritten in June 2015 (archived here);
  • v0.2: major rewrite in February 2016;
  • v0.2.1: minor bugfixes and usability improvements;
  • v0.3: object-oriented restructuring of the code base, introduced measures for dynamic network changes (December 2016);
  • v0.4: rewritten in September 2017, reworked data model, added new metrics.

New in v0.4:

  • reworked composite ranking index, now based on 5 network metrics and 3 content metrics (character level);
  • introduced Kendall's tau as a measure of ranking stability (drama level);
  • reworked data model, now based on pandas (functions and workflow are now cleaner and simpler);
  • reworked package structure, separated into workflow, I/O, plotting, and analysis.
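To make the ranking-stability idea concrete, here is a small pure-Python sketch of Kendall's tau (the tau-a variant, assuming no ties). Which rankings dramavis actually compares is not stated here, so the two example rankings and the function name `kendall_tau` are assumptions. The project may well use a library routine such as `scipy.stats.kendalltau` instead.

```python
from itertools import combinations


def kendall_tau(rank_a, rank_b):
    """Kendall's tau-a for two rankings of the same items (no ties assumed).

    Both arguments map an item to its rank position (1 = top).
    """
    items = list(rank_a)
    if set(items) != set(rank_b) or len(items) < 2:
        raise ValueError("both rankings must cover the same two or more items")
    concordant = discordant = 0
    for x, y in combinations(items, 2):
        # A pair is concordant when both rankings order it the same way.
        product = (rank_a[x] - rank_a[y]) * (rank_b[x] - rank_b[y])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    n_pairs = len(items) * (len(items) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical character rankings produced by two different metrics.
by_degree = {"Faust": 1, "Mephisto": 2, "Gretchen": 3, "Wagner": 4}
by_speech = {"Faust": 1, "Mephisto": 3, "Gretchen": 2, "Wagner": 4}

tau = kendall_tau(by_degree, by_speech)
print(round(tau, 3))
```

A value of 1 means the two metrics rank the characters identically, and -1 means the order is fully reversed. In the example only one of the six pairs is ordered differently, so tau is (5 - 1) / 6.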

Installation

Depends on Anaconda for Python 3

Prepare:

conda env create -f dramavis.yml
source activate dramavis

Run:

python3 workflow.py --input /home/chris/data/dlina/data/zwischenformat --output charmetrics --action char_metrics --logpath all.log
  • alternative actions: corpus_metrics, both

  • additional flags

    • --debug prints a lot of internal variables when running
    • --randomization prints randomized graphs, takes longer to run
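The options listed above could be parsed with `argparse` roughly as follows. This is a sketch, not the parser in workflow.py: the function name `build_parser`, the help texts, the default action, and the assumption that `--debug` and `--randomization` are plain on/off switches are all hypothetical.

```python
import argparse

# Action names taken from the usage notes above.
ACTIONS = ("char_metrics", "corpus_metrics", "both")


def build_parser():
    """Build a command-line parser mirroring the documented options."""
    parser = argparse.ArgumentParser(description="Network analysis of dramatic texts")
    parser.add_argument("--input", required=True, help="directory with lina-xml files")
    parser.add_argument("--output", required=True, help="name of the output target")
    # The default below is an assumption; the real default is not documented here.
    parser.add_argument("--action", choices=ACTIONS, default="both")
    parser.add_argument("--logpath", default=None, help="path of the log file")
    # Assumed to be boolean switches that take no value.
    parser.add_argument("--debug", action="store_true")
    parser.add_argument("--randomization", action="store_true")
    return parser


args = build_parser().parse_args(
    ["--input", "zwischenformat", "--output", "charmetrics",
     "--action", "char_metrics", "--logpath", "all.log"]
)
print(args.action, args.debug, args.randomization)
```

Restricting `--action` with `choices` makes a mistyped action fail immediately with a usage message rather than partway through a run.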

Running dramavis takes 2.5 seconds per play on average and up to 4 seconds. This is mainly due to the network randomization used for statistics. plotsuperposter takes around 1 second per play.

Input data

An easy way to download the dlina 'zwischenformat' XMLs without additional repo information is this:

svn export https://github.com/dlina/project/trunk/data/zwischenformat

Then point the --input option at the exported folder (which should contain 465 XML files in its main directory).
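Before starting a long run, it can help to check the export and estimate the total runtime from the per-play timings given above. The following sketch is not part of dramavis, and the helper names `count_xml_files` and `estimate_minutes` are hypothetical.

```python
from pathlib import Path

# Per-play timings quoted in the usage notes above.
SECONDS_PER_PLAY_AVG = 2.5
SECONDS_PER_PLAY_MAX = 4.0
EXPECTED_PLAYS = 465


def count_xml_files(directory):
    """Count the XML files directly inside `directory` (subfolders are ignored)."""
    return sum(
        1
        for path in Path(directory).iterdir()
        if path.is_file() and path.suffix.lower() == ".xml"
    )


def estimate_minutes(n_plays, seconds_per_play):
    """Estimate the total runtime in minutes for `n_plays` plays."""
    return n_plays * seconds_per_play / 60


average = estimate_minutes(EXPECTED_PLAYS, SECONDS_PER_PLAY_AVG)
worst = estimate_minutes(EXPECTED_PLAYS, SECONDS_PER_PLAY_MAX)
print(f"average: {average:.1f} min, worst case: {worst:.1f} min")
```

For the full corpus of 465 plays this gives about 19 minutes on average and 31 minutes in the worst case. Calling `count_xml_files` on the exported folder should return 465 if the export is complete.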