Welcome! This motif-matrix repository represents the data component of the Digital Breadcrumbs of Brothers Grimm project.
WHAT DOES THIS REPOSITORY CONTAIN?
This repository collects the motifs of popular fairy tales in different languages (primarily German, English, Russian and Italian) in the form of a matrix. By 'motif' we mean the smallest thematic unit of text (following the definition of Prince's Dictionary of Narratology). For example, 'apple causes magic sleep' is a motif - can you guess which fairy tale it belongs to?
WHICH FAIRY TALES WILL I FIND IN THIS REPOSITORY?
Snow White, Puss in Boots, The Fisherman and His Wife.
WHAT IS THE OBJECTIVE OF THIS WORK?
The objective is to further research in the fields of Digital Humanities, Natural Language Processing and Folkloristics. We collect and order the motifs present in fairy tales and in their parallel versions in different languages, in order to build a training dataset for research on cross-lingual text reuse detection. By 'text reuse' we mean the written repetition of content from one text to another.
HOW IS THE INFORMATION GATHERED?
Starting from a fairy tale present in at least one of the seven editions of the Kinder- und Hausmärchen written by the Brothers Grimm, we search for parallel fairy tales in other languages and cultures. Using the Thompson motif index, we note which motifs are present in the chosen fairy tale and compare them with those in the parallel versions found. The annotations then populate our motif matrix.
WHAT DOES THIS REPOSITORY NOT CONTAIN?
We do not work with translations of tales, but only with original works.
HOW IS THE MOTIF MATRIX STRUCTURED?
The motif matrix is created and saved as an Excel spreadsheet. We create one spreadsheet per chosen tale.
- The first column on the far left lists the motifs from the Thompson motif index that are present in at least one version of the chosen tale.
- The other columns - progressing from left to right - correspond to the various editions (in different languages) in which we identified the chosen fairy tale, starting with the tales by the Brothers Grimm. The titles of parallel versions can look very different from each other.
- The first row contains the names of the authors of the tales and their VIAF numbers.
- The second row contains the titles of the tales and their VIAF numbers.
- The intersections will contain NULL if a particular listed motif was not found in the selected version of the tale and, by contrast, will contain a list of keywords in their base form if the particular motif was found in the tale. The keywords represent the motif and the words that compose it in the selected tale. They are recorded in the language they were found in.
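The layout described above can be sketched in Python. This is only an illustration, not code from the project: it assumes the spreadsheet has already been read into a list of rows of strings, and that keywords within a cell are separated by commas (the separator is our assumption). The function names, the second motif, the author and title labels, the VIAF placeholders and the keywords are all invented for the example.

```python
from typing import Dict, List, Optional, Tuple

NULL_MARKER = "NULL"  # cell value when a motif is absent from a version

Matrix = Dict[str, Dict[str, Optional[List[str]]]]


def parse_motif_matrix(rows: List[List[str]]) -> Tuple[List[str], List[str], Matrix]:
    """Split raw rows into author headers, title headers and motif cells.

    Row 0 holds authors, row 1 holds titles, and every later row starts
    with a motif followed by one cell per version of the tale.
    """
    authors = rows[0][1:]
    titles = rows[1][1:]
    matrix: Matrix = {}
    for row in rows[2:]:
        motif, cells = row[0], row[1:]
        matrix[motif] = {}
        for title, cell in zip(titles, cells):
            cell = cell.strip()
            if cell == NULL_MARKER:
                matrix[motif][title] = None
            else:
                # Assumed format: comma-separated keywords in base form.
                matrix[motif][title] = [k.strip() for k in cell.split(",")]
    return authors, titles, matrix


def shared_motifs(matrix: Matrix, title_a: str, title_b: str) -> List[str]:
    """Return the motifs annotated in both versions of the tale."""
    return [
        motif
        for motif, versions in matrix.items()
        if versions.get(title_a) is not None and versions.get(title_b) is not None
    ]


# Invented sample rows; real spreadsheets carry actual names and VIAF numbers.
sample_rows = [
    ["", "Author A (VIAF placeholder)", "Author B (VIAF placeholder)"],
    ["", "Title A (VIAF placeholder)", "Title B (VIAF placeholder)"],
    ["apple causes magic sleep", "Apfel, schlafen", "NULL"],
    ["hypothetical second motif", "Spiegel", "specchio"],
]

authors, titles, matrix = parse_motif_matrix(sample_rows)
common = shared_motifs(matrix, titles[0], titles[1])
print(common)
```

Motifs shared by two versions in different languages, together with their keyword lists, are the kind of pairs a cross-lingual text reuse dataset would be built from.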
WHY DO WE HAVE A VIAF NUMBER?
VIAF numbers are recorded for disambiguation - to make clear which author and edition the tales belong to.
WHO ARE WE?
We are a team of young researchers belonging to the eTRAP Early Career Research Group (Computer Science Department / Digital Humanities) at the University of Göttingen, Germany. Our names are Greta Franzini, Emily Franzini, Gabriela Rotari, Melina Jander and Marco Büchler.