
This repository provides the compiled full-text corpus of the Allgemeine Literatur-Zeitung (General Literature Gazette) published from 1785 to 1849

JULIELab/alz

Compiling a Full-text Corpus of the Allgemeine Literatur-Zeitung (ALZ)

This repository provides the compiled full-text corpus of the Allgemeine Literatur-Zeitung (General Literature Gazette) published from 1785 to 1849.

Its current version (V2, May 2019) contains 26,612 pages of full text from 261 volumes, equivalent to 120,369,005 tokens. It covers the review volumes (the main part of the ALZ) as well as supplementary and intelligence notes. The folder v2_201905 contains an overview table and the whole corpus in XML and TXT format.
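As a rough illustration of working with the TXT version, the following sketch walks a folder of plain-text files and counts whitespace-separated tokens per file. The internal file layout of v2_201905, the `*.txt` pattern, and the helper names are assumptions made for this example, and whitespace splitting is a simplification, so the totals will not necessarily match the token count reported above.

```python
from pathlib import Path


def count_tokens(text):
    """Count whitespace-separated tokens (a simplification, not the authors' tokenizer)."""
    return len(text.split())


def corpus_token_counts(folder, pattern="*.txt"):
    """Map each matching file (path relative to `folder`) to its token count.

    `folder` and `pattern` are hypothetical defaults; adjust them to the
    actual layout of the downloaded corpus.
    """
    root = Path(folder)
    counts = {}
    for path in sorted(root.rglob(pattern)):
        text = path.read_text(encoding="utf-8", errors="replace")
        counts[str(path.relative_to(root))] = count_tokens(text)
    return counts


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    counts = corpus_token_counts("v2_201905")
    print(f"{len(counts)} files, {sum(counts.values())} tokens")
```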

The files v2_201905/romantik.tsv and v2_201905/musik.tsv contain search results for keywords relevant to Romanticism and music. For more details, see the scripts v2_201905/find_romantik.py and v2_201905/find_musik.py.
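To give an idea of how such a keyword search could be reproduced, here is a minimal sketch that scans texts line by line and writes the hits as tab-separated rows. It is not the code of find_romantik.py or find_musik.py; the function names, the column layout, and the sample keywords are assumptions chosen for illustration.

```python
import csv
import io
import re


def search_keywords(texts, keywords):
    """Return (source, line_number, keyword, line) rows for every keyword hit.

    `texts` maps a source name (e.g. a file name) to its full text.
    Matching is case-insensitive and substring-based, so "romant" also
    matches "Romantik" and "romantisch".
    """
    if not keywords:
        return []
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    rows = []
    for source, text in sorted(texts.items()):
        for line_number, line in enumerate(text.splitlines(), start=1):
            for match in pattern.finditer(line):
                rows.append((source, line_number, match.group(0).lower(), line.strip()))
    return rows


def rows_to_tsv(rows):
    """Serialise rows as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["source", "line", "keyword", "context"])
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical sample text; real input would be read from the corpus files.
    sample = {"sample.txt": "Die Romantik in der Musik.\nNichts hier."}
    print(rows_to_tsv(search_keywords(sample, ["romant", "musik"])))
```

Substring matching keeps the sketch short but also catches unrelated words that merely contain a keyword, so a real search would likely need a curated keyword list or word-boundary patterns.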

Acknowledgement

This work was conducted within the Graduate School “Romanticism as a Model” (http://modellromantik.uni-jena.de) supported by the German Research Foundation (DFG) under Grant No.: GRK 2041/1.

License

This work is licensed under CC-BY-SA 4.0: https://creativecommons.org/licenses/by-sa/4.0/

Citation

Please cite the following paper if you use our corpus:

Udo Hahn, Tinghui Duan. 2019. Corpus Assembly as Text Data Integration from Digital Libraries and the Web. In JCDL ’19: Proceedings of the 2019 ACM/IEEE Joint Conference on Digital Libraries, June 02–06, 2019, Urbana-Champaign, IL, USA. [Paper]
