Spoken French data.

Summary

A Universal Dependencies corpus for spoken French.

Introduction

The corpus was converted automatically from the Rhapsodie treebank, and the result was then corrected manually.

XPOS tags and morphological features, which are not available in v2.2 of UD_French-Spoken, will be added in future versions of this treebank, since they are already encoded in the Rhapsodie treebank.

Structure

  • fr_spoken-ud-train.conllu: 1153 sentences, 14952 tokens
  • fr_spoken-ud-dev.conllu: 907 sentences, 10010 tokens
  • fr_spoken-ud-test.conllu: 726 sentences, 10010 tokens
  • total: 2786 sentences, 34972 tokens
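
The counts above can be checked against the files themselves. The following is a minimal sketch, not part of the treebank's tooling: the function name `count_conllu` is hypothetical, and it assumes the usual CoNLL-U conventions (blank lines separate sentences, `#` starts a comment, and only lines whose ID is a plain integer are counted as tokens, so multiword ranges such as `1-2` and empty nodes such as `1.1` are skipped). Whether the figures in this README were computed with exactly this convention is an assumption.

```python
from typing import Iterable, Tuple


def count_conllu(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (sentences, tokens) for an iterable of CoNLL-U lines."""
    sentences = 0
    tokens = 0
    in_sentence = False
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            # A blank line closes the current sentence, if one is open.
            if in_sentence:
                sentences += 1
                in_sentence = False
            continue
        if line.startswith("#"):
            # Sentence-level comment (sent_id, text, ...): not a token.
            continue
        token_id = line.split("\t", 1)[0]
        if token_id.isdigit():
            # Plain integer ID: a syntactic word. Ranges ("1-2") and
            # empty nodes ("1.1") fail isdigit() and are not counted.
            tokens += 1
            in_sentence = True
    if in_sentence:
        # The file ended without a trailing blank line.
        sentences += 1
    return sentences, tokens
```

To use it on one of the files, open the file with `open("fr_spoken-ud-train.conllu", encoding="utf-8")` and pass the file object to `count_conllu`; the function reads line by line, so it does not load the whole file into memory.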

Changelog

  • 2018-04-15 v2.2
    • Initial release

=== Machine-readable metadata (DO NOT REMOVE!) ================================
Data available since: UD v2.2
License: CC BY-SA 4.0
Includes text: yes
Genre: spoken
Lemmas: converted from manual
UPOS: converted with corrections
XPOS: not available
Features: not available
Relations: converted with corrections
Contributors: Gerdes, Kim; Kahane, Sylvain; Yan, Chunxiao; Etienne, Aline; Courtin, Marine
Contributing: here
Contact: kim@gerdes.fr