Averell is a Python library and command line interface that facilitates working with existing repositories of annotated poetry. It downloads an annotated corpus and reconciles different TEI entities to provide a unified JSON output at the desired granularity. Depending on their investigation, researchers might need the entire poem, poems split line by line, or even word by word where that is available. Averell lets you specify the granularity of the final generated dataset, which is a combined JSON output with all the entities in the selected corpora. Each corpus in the catalog must specify the parser that produces the expected data format.
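To make the idea of granularity concrete, here is a minimal standalone sketch of how a single poem could be flattened into records at different levels. It does not use averell's API, and the record layout (`title`, `stanzas`, `text`, and the index fields) as well as the `flatten` function are hypothetical names chosen for illustration, not averell's actual output schema.

```python
import json

# Hypothetical poem record. The field names are illustrative assumptions,
# not the schema that averell actually emits.
poem = {
    "title": "Example",
    "stanzas": [
        ["First line of verse", "Second line of verse"],
        ["Third line of verse"],
    ],
}

GRANULARITIES = ("poem", "stanza", "line", "word")


def flatten(poem, granularity):
    """Return a list with one record per unit at the requested granularity."""
    if granularity not in GRANULARITIES:
        raise ValueError(f"unknown granularity: {granularity!r}")

    if granularity == "poem":
        # Whole poem as one record: lines joined by newlines,
        # stanzas separated by a blank line.
        text = "\n\n".join("\n".join(stanza) for stanza in poem["stanzas"])
        return [{"title": poem["title"], "text": text}]

    records = []
    for s_idx, stanza in enumerate(poem["stanzas"]):
        if granularity == "stanza":
            records.append({"stanza": s_idx, "text": "\n".join(stanza)})
            continue
        for l_idx, line in enumerate(stanza):
            if granularity == "line":
                records.append({"stanza": s_idx, "line": l_idx, "text": line})
            else:
                # "word": naive whitespace tokenization, for illustration only.
                for w_idx, word in enumerate(line.split()):
                    records.append(
                        {"stanza": s_idx, "line": l_idx, "word": w_idx, "text": word}
                    )
    return records


print(json.dumps(flatten(poem, "line"), indent=2))
```

The same poem yields one record at `poem` granularity, one per stanza at `stanza`, and so on down to `word`, which is why the finest available granularity differs between corpora depending on how they were annotated.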
- Free software: Apache Software License 2.0
id | name | lang | size | docs | words | granularity | license |
---|---|---|---|---|---|---|---|
1 | Disco V2.1 (disco2_1) | es | 22M | 4088 | 381539 | stanza line | CC-BY |
2 | Disco V3 (disco3) | es | 28M | 4080 | 377978 | stanza line | CC-BY |
3 | Sonetos Siglo de Oro (adso) | es | 6.8M | 5078 | 466012 | stanza line | CC-BY-NC 4.0 |
4 | ADSO 100 poems corpus (adso100) | es | 128K | 100 | 9208 | stanza line | CC-BY-NC 4.0 |
5 | Poesía Lírica Castellana Siglo de Oro (plc) | es | 3.8M | 475 | 299402 | stanza line word syllable | CC-BY-NC 4.0 |
6 | Gongocorpus (gongo) | es | 9.2M | 481 | 99079 | stanza line word syllable | CC-BY-NC-ND 3.0 FR |
7 | Eighteenth Century Poetry Archive (ecpa) | en | 2400M | 3084 | 2063668 | stanza line word | CC BY-SA 4.0 |
8 | For Better For Verse (4b4v) | en | 39.5M | 103 | 41749 | stanza line | Unknown |
9 | Métrique en Ligne (mel) | fr | 183M | 5081 | 1850222 | stanza line | Unknown |
10 | Biblioteca Italiana (bibit) | it | 242M | 25341 | 7121246 | stanza line word | Unknown |
11 | Corpus of Czech Verse (czverse) | cs | 4100M | 66428 | 12636867 | stanza line word | CC-BY-SA |
12 | Stichotheque (stichopt) | pt | 11.8M | 1702 | 168411 | stanza line | Unknown |
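Since not every corpus offers every granularity, it can help to check which ones support the level you need before combining them. The sketch below copies the language and granularity columns from the catalog table above into a plain dictionary. The `CATALOG` structure and the `corpora_supporting` helper are hypothetical and are not part of averell.

```python
# Language and granularity columns copied from the catalog table above.
# This dictionary is an illustrative stand-in, not an averell data structure.
CATALOG = {
    "disco2_1": ("es", {"stanza", "line"}),
    "disco3": ("es", {"stanza", "line"}),
    "adso": ("es", {"stanza", "line"}),
    "adso100": ("es", {"stanza", "line"}),
    "plc": ("es", {"stanza", "line", "word", "syllable"}),
    "gongo": ("es", {"stanza", "line", "word", "syllable"}),
    "ecpa": ("en", {"stanza", "line", "word"}),
    "4b4v": ("en", {"stanza", "line"}),
    "mel": ("fr", {"stanza", "line"}),
    "bibit": ("it", {"stanza", "line", "word"}),
    "czverse": ("cs", {"stanza", "line", "word"}),
    "stichopt": ("pt", {"stanza", "line"}),
}


def corpora_supporting(granularity, lang=None):
    """Return sorted corpus names offering `granularity`, optionally filtered by language."""
    return sorted(
        name
        for name, (corpus_lang, levels) in CATALOG.items()
        if granularity in levels and (lang is None or corpus_lang == lang)
    )


print(corpora_supporting("word"))
print(corpora_supporting("syllable", lang="es"))
```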
Documentation: https://averell.readthedocs.io/
To install averell, run this command in your terminal:
pip install averell
This is the preferred method to install averell, as it will always install the most recent stable release.
If you don't have pip installed, the Python installation guide can walk you through the process.
For usage instructions, see the usage page of the documentation.