Averell


Averell is a Python library and command line interface that facilitates working with existing repositories of annotated poetry. Averell can download an annotated corpus and reconcile different TEI entities to provide a unified JSON output at the desired granularity. Depending on the investigation, a researcher might need the entire poem, poems split line by line, or even word by word where that is available. Averell lets you specify the granularity of the final generated dataset, which is a combined JSON with all the entities in the selected corpora. Each corpus in the catalog must specify the parser that produces the expected data format.
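To make the idea of granularity concrete, here is a minimal sketch that splits one poem into units at different levels and serializes them as JSON. It is not averell's implementation or output schema: the `split_poem` helper and the `granularity`, `index` and `text` field names are assumptions made for illustration only.

```python
import json
import re


def split_poem(text, granularity="line"):
    """Split raw poem text into units at the requested granularity.

    Hypothetical helper for illustration; not part of averell.
    Stanzas are assumed to be separated by blank lines.
    """
    stanzas = [s.strip() for s in re.split(r"\n\s*\n", text.strip()) if s.strip()]
    if granularity == "poem":
        return [text.strip()]
    if granularity == "stanza":
        return stanzas
    lines = [ln.strip() for s in stanzas for ln in s.splitlines() if ln.strip()]
    if granularity == "line":
        return lines
    if granularity == "word":
        return [word for ln in lines for word in ln.split()]
    raise ValueError(f"unsupported granularity: {granularity!r}")


def to_records(text, granularity):
    """Wrap each unit in a small dict (field names are assumptions)."""
    return [
        {"granularity": granularity, "index": i, "text": unit}
        for i, unit in enumerate(split_poem(text, granularity))
    ]


poem = "Roses are red\nViolets are blue\n\nSugar is sweet\nAnd so are you"
print(json.dumps(to_records(poem, "stanza"), indent=2))
```

The same poem yields one record at poem level and progressively more records at stanza, line and word level, which is the trade-off a researcher makes when choosing a granularity.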

  • Free software: Apache Software License 2.0

Available corpora (version 1.1.0)

| id | name | lang | size | docs | words | granularity | license |
|----|------|------|------|------|-------|-------------|---------|
| 1 | Disco V2.1 (disco2_1) | es | 22M | 4088 | 381539 | stanza, line | CC-BY |
| 2 | Disco V3 (disco3) | es | 28M | 4080 | 377978 | stanza, line | CC-BY |
| 3 | Sonetos Siglo de Oro (adso) | es | 6.8M | 5078 | 466012 | stanza, line | CC-BY-NC 4.0 |
| 4 | ADSO 100 poems corpus (adso100) | es | 128K | 100 | 9208 | stanza, line | CC-BY-NC 4.0 |
| 5 | Poesía Lírica Castellana Siglo de Oro (plc) | es | 3.8M | 475 | 299402 | stanza, line, word, syllable | CC-BY-NC 4.0 |
| 6 | Gongocorpus (gongo) | es | 9.2M | 481 | 99079 | stanza, line, word, syllable | CC-BY-NC-ND 3.0 FR |
| 7 | Eighteenth Century Poetry Archive (ecpa) | en | 2400M | 3084 | 2063668 | stanza, line, word | CC BY-SA 4.0 |
| 8 | For Better For Verse (4b4v) | en | 39.5M | 103 | 41749 | stanza, line | Unknown |
| 9 | Métrique en Ligne (mel) | fr | 183M | 5081 | 1850222 | stanza, line | Unknown |
| 10 | Biblioteca Italiana (bibit) | it | 242M | 25341 | 7121246 | stanza, line, word | Unknown |
| 11 | Corpus of Czech Verse (czverse) | cs | 4100M | 66428 | 12636867 | stanza, line, word | CC-BY-SA |
| 12 | Stichotheque (stichopt) | pt | 11.8M | 1702 | 168411 | stanza, line | Unknown |
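Because not every corpus offers every granularity, it can help to filter the list before deciding what to download. The sketch below transcribes the id, short name, language and granularity columns of the corpora listed above into a plain Python list and filters it. The `CORPORA` structure and the `select_corpora` function are hypothetical and do not reflect averell's internal catalog or API.

```python
# Rows transcribed from the corpora listing: (id, short name, language, granularities).
CORPORA = [
    (1, "disco2_1", "es", {"stanza", "line"}),
    (2, "disco3", "es", {"stanza", "line"}),
    (3, "adso", "es", {"stanza", "line"}),
    (4, "adso100", "es", {"stanza", "line"}),
    (5, "plc", "es", {"stanza", "line", "word", "syllable"}),
    (6, "gongo", "es", {"stanza", "line", "word", "syllable"}),
    (7, "ecpa", "en", {"stanza", "line", "word"}),
    (8, "4b4v", "en", {"stanza", "line"}),
    (9, "mel", "fr", {"stanza", "line"}),
    (10, "bibit", "it", {"stanza", "line", "word"}),
    (11, "czverse", "cs", {"stanza", "line", "word"}),
    (12, "stichopt", "pt", {"stanza", "line"}),
]


def select_corpora(corpora, lang=None, granularity=None):
    """Return short names of corpora matching the optional filters.

    Hypothetical helper for illustration; not part of averell.
    """
    return [
        name
        for _id, name, corpus_lang, levels in corpora
        if (lang is None or corpus_lang == lang)
        and (granularity is None or granularity in levels)
    ]


# Spanish corpora that can be split word by word.
print(select_corpora(CORPORA, lang="es", granularity="word"))
```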

Documentation

https://averell.readthedocs.io/

Installation

To install averell, run this command in your terminal:

pip install averell

This is the preferred method to install averell, as it will always install the most recent stable release.

If you don't have pip installed, the Python installation guide can walk you through the process.
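One way to confirm that the installation succeeded is to ask Python which version of the package it can see. This sketch uses only the standard library (`importlib.metadata`, available from Python 3.8) and assumes the distribution name is `averell`, as in the pip command above. The `installed_version` helper is a name chosen for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("averell")
    if found is None:
        print("averell is not installed in this environment")
    else:
        print(f"averell {found} is installed")
```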

Usage

See the usage page in the documentation.
