Materials accompanying the paper "Challenging Stylometry: The Authorship of the Baroque Play La Segunda Celestina" by Laura Hernández Lorenzo (@lamusadecima) and Joanna Byszuk (@JoannaBy), presented at the Digital Humanities 2019 conference in Utrecht.
The materials we present here consist of:
- A corpus of Golden Age Spanish plays.
- A list of all the plays in the corpus.
- An evaluation of the OCR difficulties that led us to transcribe the texts.
- Slides of our presentation at DH2019 (English and Spanish).
Corpus of Golden Age Spanish Plays
We collected the plays from various sources. Most of them were taken from Canon-60: Oleza Simó, J. (2014). Canon 60. Valencia: Universitat de València.
In addition, we took some of Sor Juana's plays from the Biblioteca Virtual Miguel de Cervantes.
We transcribed and revised the plays by Salazar and Vera Tassis, and one play by Solís (Las amazonas), ourselves, using the digitised copies of these plays available on the National Library of Spain website.
The provenance of every play is detailed in the List of plays.
Some of the plays are incomplete or split across several files, because the author we are interested in wrote only part of the play. This is the case for:
- Sor Juana Inés de la Cruz, Amor es más laberinto. Our file does not contain the second act, because it was written by Fray Juan de Guevara.
- Más triunfa el amor rendido. The first act was written by Agustín de Salazar, whereas the second and third acts were written by Juan de Vera Tassis. It is therefore split into two files.
We thank José Calvo Tello (@morethanbooks) for sending us his clean version of the Canon-60 corpus.