
Deep Learning and Digital Humanities (20-00-1080-se Deep Learning und Digital Humanities)


The focus of the seminar is on humanities applications like poetry generation and analysis, metaphor and emotion identification, etc., and how to solve these problems with Deep Learning. Students will read papers and present them during the seminar.



The seminar will be held by Steffen Eger and Thomas Haider.


Biweekly, Thursdays from 18:15 to 19:45, in Darmstadt Stadtmitte S1 08/18.


The seminar will be held in English.

Course Plan

| Session | Date | Topics | Presenter |
|---|---|---|---|
| 1 | 24.10 | Kickoff & What is Digital Humanities | Thomas Haider |
| 2 | 31.10 | What is Deep Learning | Steffen Eger |
| 3 | 14.11 | Corpora (GLUE, Hatespeech) & Annotation (BW Scaling, Bayesian) | |
| 4 | 28.11 | Spelling Normalization (Comparison, Multi-Task) & OCR (Image-to-text, Post-Correction) | |
| 5 | 12.12 | Metaphors (Novel Metaphors, Similarity Network) & Emotions (Implicit Emotions, Twitter) | |
| 6 | 19.12 | Semantic Change (Laws of semantic change, Cultural vs. Linguistic Shift) & Semantic Drift (Phylogeny, Semantic Drift) | |
| 7 | 16.01 | Stylometry (Lyrics-based Music classification, Shakespearizing Modern English) & Variation (Regional Variation, Style-conditioned Generation) | |
| 8 | 23.01 | Fiction (Animacy, Social Networks) & Narration (Story Generation, Emotional arcs) | |
| 9 | 30.01 | Poetry (Hafez & Poem GAN) & Arts (Similarity for music & Art generation) | |


  • Corpora & Annotation

    • Ronja Laarmann-Quante, Stefanie Dipper and Eva Belke (2019). The Making of the Litkey Corpus, a richly annotated longitudinal corpus of German texts written by primary school children. In Proceedings of the ACL Linguistic Annotation Workshop (LAW 13), pp. 43-55. Florence, Italy.
    • Katrin Ortmann and Stefanie Dipper (2019). Variation between Different Discourse Types: Literate vs. Oral. In Proceedings of the NAACL-Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial), pp. 64-79. Minneapolis, MN.
    • Kim, E. and Klinger, R., 2018, August. Who feels what and why? annotation of a literature corpus with semantic roles of emotions. COLING
    • Asai, A., Evensen, S., Golshan, B., Halevy, A., Li, V., Lopatenko, A., Stepanov, D., Suhara, Y., Tan, W.C. and Xu, Y., 2018. HappyDB: A corpus of 100,000 crowdsourced happy moments. LREC 2018
    • Khandelwal, A., Swami, S., Akhtar, S.S. and Shrivastava, M., 2018. Humor detection in English-Hindi code-mixed social media content: Corpus and baseline system. LREC 2018
    • Troiano, E., Padó, S. and Klinger, R., 2019. Crowdsourcing and Validating Event-focused Emotion Corpora for German and English. ACL 2019
    • Benikova, D., Biemann, C. and Reznicek, M., 2014, May. NoSta-D Named Entity Annotation for German: Guidelines and Dataset. In LREC (pp. 2524-2531).
    • Kiritchenko and Mohammad. Capturing Reliable Fine-Grained Sentiment Associations by Crowdsourcing and Best-Worst Scaling. In NAACL 2016.
    • Kiritchenko and Mohammad. Best-Worst Scaling More Reliable than Rating Scales: A Case Study on Sentiment Intensity Annotation. In ACL 2017.
    • Simpson, Do Dinh, Miller, and Gurevych. Predicting Humorousness and Metaphor Novelty with Gaussian Process Preference Learning. In ACL 2019.
    • Simpson and Gurevych. Finding Convincing Arguments Using Scalable Bayesian Preference Learning. In TACL 2018.
    • Simpson and Gurevych. A Bayesian Approach for Sequence Tagging with Crowds. In EMNLP 2019.
  • OCR & Spelling Normalization

    • Bollmann. A Large-Scale Comparison of Historical Text Normalization Systems. In NAACL 2019.
    • Flachs, Bollmann, Søgaard. Historical Text Normalization with Delayed Rewards. In ACL 2019.
    • Bollmann, Bingel, Søgaard. Learning attention for historical text normalization by learning to pronounce. In ACL 2017.
    • Bollmann, Dipper, Petran. Evaluating Inter-Annotator Agreement on Historical Spelling Normalization. In LAW 10, 2016.
    • Schnober, Eger, Do Dinh, Gurevych. Still not there? Comparing traditional sequence-to-sequence models to encoder-decoder neural networks on monotone string translation tasks. In COLING 2016.
    • Amrhein, Clematide. Supervised OCR Error Detection and Correction Using Statistical and Neural Machine Translation Methods. In JLCL 2018.
  • Metaphors & Emotions

    • Buechel and Hahn. Readers vs. writers vs. texts: Coping with different perspectives of text understanding in emotion annotation. In LAW 2017. (Also Corpus)
    • Buechel and Hahn. 2018. Emotion Representation Mapping for Automatic Lexicon Construction (Mostly) Performs on Human Level. In COLING 2018
    • Buechel and Hahn. EmoBank: Studying the Impact of Annotation Perspective and Representation Format on Dimensional Emotion Analysis. In EACL 2017.
    • Buechel and Hahn. Word Emotion Induction for Multiple Languages as a Deep Multi-Task Learning Problem. In NAACL 2018.
    • Klinger, R., De Clercq, O., Mohammad, S.M. and Balahur, A., 2018. IEST: WASSA-2018 Implicit Emotions Shared Task. EMNLP 2018
    • Bostan and Klinger. A Survey and Experiments on Annotated Corpora for Emotion Classification in Text. In COLING 2018.
    • Do Dinh, Eger, and Gurevych. Killing Four Birds with Two Stones: Multi-Task Learning for Non-Literal Language Detection. In COLING 2018
    • Do Dinh, Wieland, and Gurevych. Weeding out Conventionalized Metaphors: A Corpus of Novel Metaphor Annotations. In EMNLP 2018.
    • Rei et al. Grasping the Finer Point: A Supervised Similarity Network for Metaphor Detection. In EMNLP 2017.
    • Zayed, O., McCrae, J.P. and Buitelaar, P., 2018, June. Phrase-Level Metaphor Identification using Distributed Representations of Word Meaning. In Proceedings of the Workshop on Figurative Language Processing (pp. 81-90).
    • Kim, E. and Klinger, R., 2019. An Analysis of Emotion Communication Channels in Fan Fiction: Towards Emotional Storytelling. ACL Storytelling Workshop.
    • Mohammad et al. Metaphor as a Medium for Emotion: An Empirical Study. In *SEM 2017.
    • Tsvetkov et al. Metaphor Detection with Cross-Lingual Model Transfer. In ACL 2014.
    • Steen et al. A method for linguistic metaphor identification: From MIP to MIPVU, volume 14. John Benjamins Publishing, 2010.
    • Leong et al. A Report on the 2018 VUA Metaphor Detection Shared Task. In Workshop on Figurative Language Processing
    • Felbo, B., Mislove, A., Søgaard, A., Rahwan, I. and Lehmann, S., 2017. Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm. EMNLP 2017, Copenhagen.
    • Bulat, L., Clark, S.C. and Shutova, E., 2017. Modelling metaphor with attribute-based semantics.
  • Semantic Change & Semantic Drift

    • Hamilton et al. 2016. Diachronic Word Embeddings Reveal Statistical Laws of Semantic Change. In ACL 2016.
    • Eger and Mehler 2016. On the Linearity of Semantic Change: Investigating Meaning Variation via Dynamic Graph Models. In ACL 2016.
    • Dubossarsky, Weinshall, and Grossman. 2017. Outta control: Laws of semantic change and inherent biases in word representation model. In EMNLP 2017.
    • Hamilton et al. Cultural Shift or Linguistic Drift? Comparing Two Computational Measures of Semantic Change. In EMNLP 2016.
    • Kutuzov et al. Diachronic word embeddings and semantic shifts: a survey. In COLING 2018.
    • Dubossarsky et al. Time-Out: Temporal Referencing for Robust Modeling of Lexical Semantic Change. In EMNLP 2019.
    • Xu and Kemp. A computational evaluation of two laws of semantic change. In CogSciSoc 2015.
    • Beinborn and Choenni. Semantic Drift in Multilingual Representations. Arxiv, 2019.
    • Eger, Hoenen, Mehler. Language classification from bilingual word embedding graphs. In COLING 2016.
    • Asgari and Mofrad. Comparing Fifty Natural Languages and Twelve Genetic Languages Using Word Embedding Language Divergence (WELD) as a Quantitative Measure of Language Distance. In: Workshop on Multilingual and Cross-lingual Methods in NLP, 2016.
    • Bamman et al. Distributed representations of geographically situated language. In ACL 2014.
    • Rama and List. An automated framework for fast cognate detection and Bayesian phylogenetic inference in computational historical linguistics. In ACL 2019.
    • Cysouw. Disentangling geography from genealogy. In Peter Auer, Martin Hilpert, Anja Stukenbrock, and Benedikt Szmrecsanyi, editors, Space in Language and Linguistics: Geographical, Interactional, and Cognitive Perspectives. 2013.
  • Stylometry & Variation

    • Field, A. and Tsvetkov, Y., 2019. Entity-centric contextual affective analysis. ACL 2019
    • Jhamtani, H., Gangal, V., Hovy, E. and Nyberg, E., 2017. Shakespearizing modern language using copy-enriched sequence-to-sequence models.
    • Fell, M. and Sporleder, C., 2014. Lyrics-based analysis and classification of music. In Proceedings of COLING 2014, Technical Papers (pp. 620-631).
    • Mike Kestemont et al. 2019. Overview of the cross-domain authorship attribution task at PAN 2019. In Working Notes CLEF.
    • Evert et al. Towards a better understanding of Burrows's Delta in literary authorship attribution. In Workshop on Computational Linguistics for Literature (CLfL), 2015.
    • Faruqui, M. and Pado, S., 2012. Towards a model of formal and informal address in English. In Proceedings of the 13th EACL.
    • Kao, J. and Jurafsky, D., 2012. A computational analysis of style, affect, and imagery in contemporary poetry. In Proceedings of the NAACL-HLT 2012 Workshop on Computational Linguistics for Literature (pp. 8-17).
    • Herbelot, A., 2014. The semantics of poetry: A distributional reading. Digital Scholarship in the Humanities, 30(4), pp.516-531.
    • Ousidhoum, N., Lin, Z., Zhang, H., Song, Y. and Yeung, D.Y., 2019. Multilingual and Multi-Aspect Hate Speech Analysis. arXiv preprint arXiv:1908.11049.
    • Ficler, J., & Goldberg, Y. 2017. Controlling linguistic style aspects in neural language generation. EMNLP Workshop on Stylistic Variation.
    • Smith, D.A., Cordell, R. and Dillon, E.M., 2013, October. Infectious texts: Modeling text reuse in nineteenth-century newspapers. In 2013 IEEE International Conference on Big Data (pp. 86-94). IEEE
    • Hovy, D. and Purschke, C., 2018. Capturing regional variation with distributed place representations and geographic retrofitting. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (pp. 4383-4394).
    • Dai, N., Liang, J., Qiu, X. and Huang, X., 2019. Style Transformer: Unpaired Text Style Transfer without Disentangled Latent Representation. ACL 2019
  • Literature: Fiction & Narration

    • Reagan, A.J., Mitchell, L., Kiley, D., Danforth, C.M. and Dodds, P.S., 2016. The emotional arcs of stories are dominated by six basic shapes. EPJ Data Science, 5(1), p.31.
    • Chambers, N. and Jurafsky, D., 2009, August. Unsupervised learning of narrative schemas and their participants. ACL AFNLP
    • Maharjan, S., Kar, S., Montes-y-Gómez, M., Gonzalez, F.A. and Solorio, T., 2018. Letting emotions flow: Success prediction by modeling the flow of emotions in books. NAACL.
    • Kim, E. and Klinger, R., 2019. Frowning Frodo, Wincing Leia, and a Seriously Great Friendship: Learning to Classify Emotional Relationships of Fictional Characters. NAACL-HLT
    • Xu, J., Ren, X., Zhang, Y., Zeng, Q., Cai, X. and Sun, X., 2018. A skeleton-based model for promoting coherence among sentences in narrative story generation. EMNLP
    • Rahimtoroghi, E., Wu, J., Wang, R., Anand, P. and Walker, M.A., 2017. Modelling protagonist goals and desires in first-person narrative. SIGDIAL
    • Doust, R. and Piwek, P., 2017, September. A model of suspense for narrative generation. In Proceedings of the 10th International Conference on Natural Language Generation (pp. 178-187).
    • Manjavacas, E., Karsdorp, F., Burtenshaw, B. and Kestemont, M., 2017, September. Synthetic literature: Writing science fiction in a co-creative process. CC-NLG 2017
    • Flekova, L. and Gurevych, I., 2015. Personality profiling of fictional characters using sense-level links between lexical resources. EMNLP
    • Papantoniou, K. and Konstantopoulos, S., 2016, August. Unravelling Names of Fictional Characters. ACL
    • Finlayson, M.A., 2015. ProppLearner: Deeply annotating a corpus of Russian folktales to enable the machine learning of a Russian formalist theory. DSH
    • Jahan, L., Chauhan, G. and Finlayson, M., 2018, August. A new approach to animacy detection. In Proceedings of the 27th International Conference on Computational Linguistics (pp. 1-12).
    • Karsdorp, F., van der Meulen, M., Meder, T. and van den Bosch, A., 2015. Animacy detection in stories. CMN 2015, Dagstuhl
    • Karsdorp, F., Kestemont, M., Schöch, C. and van den Bosch, A.P.J., 2015. The love equation: Computational modeling of romantic relationships in French classical drama. Dagstuhl
    • Kestemont, M., Karsdorp, F. and During, M., 2014. Mining the twentieth century's history from the Time magazine corpus. SIGHUM LaTeCH
  • Literature: Poetry & Arts

    • Zhang and Lapata. Chinese Poetry Generation with Recurrent Neural Networks. In EMNLP 2014.
    • Ghazvininejad et al. Hafez: an interactive Poetry Generation System. In ACL Demo papers 2017.
    • Hopkins and Kiela. Automatically Generating Rhythmic Verse with Neural Networks. In ACL 2017.
    • Lau et al. DeepSpeare: A joint neural model of poetic language, meter and rhyme. In ACL 2018.
    • Haider and Kuhn. Supervised Rhyme Detection with Siamese Recurrent Networks. In LaTeCH-CLfL 2018.
    • Fell, M. and Sporleder, C., 2014. Lyrics-based analysis and classification of music. COLING 2014, Technical Papers (pp. 620-631).
    • Manex Agirrezabal, Iñaki Alegria, and Mans Hulden. 2016. Machine learning for metrical analysis of English poetry. In Proceedings of COLING
    • Kao, J. and Jurafsky, D., 2012. A computational analysis of style, affect, and imagery in contemporary poetry. In Proceedings of the NAACL-HLT 2012 Workshop on Computational Linguistics for Literature (pp. 8-17).
    • Jhamtani, H., Mehta, S.V., Carbonell, J. and Berg-Kirkpatrick, T., 2019. Learning Rhyming Constraints using Structured Adversaries. EMNLP 2019
    • Alex Estes and Christopher Hench. 2016. Supervised machine learning for hybrid meter. CLfL
    • Folgert Karsdorp, Peter van Kranenburg, and Enrique Manjavacas (2019) Learning Similarity Metrics for Melody Retrieval.
    • Elgammal, A., Liu, B., Elhoseiny, M. and Mazzone, M., 2017. CAN: Creative adversarial networks, generating "art" by learning about styles and deviating from style norms. ICCC
    • Sabatelli, M., Kestemont, M., Daelemans, W. and Geurts, P., 2018. Deep transfer learning for art classification problems. ECCV