ParlaMint-It is a collection of transcriptions of parliamentary sessions of the Italian Senate annotated in Universal Dependencies. The corpus is part of a larger multilingual collection of parliamentary transcripts built during the ParlaMint project (https://www.clarin.eu/parlamint).
ParlaMint-It is a subset of the Italian section of the ParlaMint corpus (Agnoloni et al., 2022). It consists of sentences that were automatically annotated according to the UD annotation scheme and then manually revised. Its internal composition mirrors that of the ParlaMint corpus, covering debates from two time periods: the COVID-19 pandemic period (November 2019 - November 2020) and an earlier period (March 2013 - October 2019) intended as a reference.
Each sentence id explicitly identifies the source of the sentence within the full ParlaMint corpus.
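The sentence ids can be read directly from the `# sent_id = ...` comment lines that CoNLL-U places before each sentence. The following is a minimal sketch, not part of the treebank tooling: the function name `read_sent_ids` and the sample ids (`example-1`, `example-2`) are placeholders invented for illustration and do not reflect the actual id format used in the corpus.

```python
import io


def read_sent_ids(lines):
    """Yield the value of every `# sent_id = ...` comment line."""
    for line in lines:
        line = line.strip()
        if line.startswith("# sent_id"):
            # Keep everything after the first "=" as the id value.
            _, _, value = line.partition("=")
            yield value.strip()


# Hypothetical two-sentence sample; the ids are placeholders, not real corpus ids.
sample = io.StringIO(
    "# sent_id = example-1\n"
    "# text = Grazie.\n"
    "1\tGrazie\tgrazie\tINTJ\t_\t_\t0\troot\t_\tSpaceAfter=No\n"
    "2\t.\t.\tPUNCT\t_\t_\t1\tpunct\t_\t_\n"
    "\n"
    "# sent_id = example-2\n"
    "# text = Bene\n"
    "1\tBene\tbene\tADV\t_\t_\t0\troot\t_\t_\n"
    "\n"
)

sent_ids = list(read_sent_ids(sample))
```

To use it on one of the released files, pass an open file object instead of the sample, for example `open("ParlaMint-It-train.conllu", encoding="utf-8")`.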
The corpus (701 sentences; 20460 tokens) has been randomly split as follows:
- ParlaMint-It-train.conllu: 10026 tokens (326 sentences)
- ParlaMint-It-dev.conllu: 10434 tokens (375 sentences)
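The figures above can be checked against the CoNLL-U files with a short script. This is a minimal sketch under stated assumptions: the function name `count_conllu` is hypothetical, and because the text does not say whether "tokens" refers to surface tokens or to syntactic words, the sketch reports both. Multiword-token ranges (ids such as `1-2`) count as one surface token, and empty nodes (ids such as `1.1`) are ignored.

```python
def count_conllu(lines):
    """Return (sentences, surface_tokens, syntactic_words) for CoNLL-U lines."""
    sentences = tokens = words = 0
    in_sentence = False
    skip_until = 0  # last word id covered by the current multiword token
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            # A blank line ends the current sentence.
            in_sentence = False
            skip_until = 0
            continue
        if line.startswith("#"):
            continue  # sentence-level comment (sent_id, text, ...)
        token_id = line.split("\t", 1)[0]
        if not in_sentence:
            sentences += 1
            in_sentence = True
        if "." in token_id:
            continue  # empty node: neither a token nor a word
        if "-" in token_id:
            # Multiword token range: one surface token covering several words.
            tokens += 1
            skip_until = int(token_id.split("-")[1])
            continue
        words += 1
        if int(token_id) > skip_until:
            tokens += 1  # word that is not part of a multiword token
    return sentences, tokens, words


# Hypothetical one-sentence sample containing the multiword token "della".
sample = [
    "# sent_id = example-1",
    "# text = della casa",
    "1-2\tdella\t_\t_\t_\t_\t_\t_\t_\t_",
    "1\tdi\tdi\tADP\t_\t_\t3\tcase\t_\t_",
    "2\tla\til\tDET\t_\t_\t3\tdet\t_\t_",
    "3\tcasa\tcasa\tNOUN\t_\t_\t0\troot\t_\t_",
    "",
]

counts = count_conllu(sample)
```

For a released file, pass an open file object, for example `count_conllu(open("ParlaMint-It-dev.conllu", encoding="utf-8"))`, and compare whichever of the two counts matches the figures listed above.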
Tommaso Agnoloni, Roberto Bartolini, Francesca Frontini, Carlo Marchetti, Simonetta Montemagni, Valeria Quochi, Manuela Ruisi, Giulia Venturi. 2022. Making Italian Parliamentary Records Machine-Actionable: The Construction of the ParlaMint-IT Corpus. In “Proceedings of LREC 2022, Workshop of ParlaCLARIN III”, Marseille, 20 June 2022, pp. 117-124.
Tomaž Erjavec, Maciej Ogrodniczuk, Petya Osenova, Nikola Ljubešić, Kiril Simov, Andrej Pančur, Michał Rudolf, Matyáš Kopp, Starkaður Barkarson, Steinþór Steingrímsson, Çağrı Çöltekin, Jesse de Does, Katrien Depuydt, Tommaso Agnoloni, Giulia Venturi, María Calzada Pérez, Luciana D. de Macedo, Costanza Navarretta, Giancarlo Luxardo, Matthew Coole, Paul Rayson, Vaidas Morkevičius, Tomas Krilavičius, Roberts Darģis, Orsolya Ring, Ruben van Heusden, Maarten Marx, and Darja Fišer. 2022. The ParlaMint corpora of parliamentary proceedings. In “Language Resources and Evaluation”, https://doi.org/10.1007/s10579-021-09574-0
- 2022-11-15 v2.11
  - Initial release in Universal Dependencies.
=== Machine-readable metadata (DO NOT REMOVE!) ================================
Data available since: UD v2.11
License: CC BY-SA 4.0
Includes text: yes
Genre: government legal
Lemmas: manual native
UPOS: manual native
XPOS: not available
Features: manual native
Relations: manual native
Contributors: Alzetta, Chiara; Sartor, Marta; Montemagni, Simonetta; Venturi, Giulia
Contributing: here
Contact: chiara.alzetta@ilc.cnr.it
===============================================================================