Analysing French Universal Dependencies

Assaf Urieli edited this page Dec 19, 2018 · 3 revisions

Starting at release 5.1.2, two options are available when analysing French Universal Dependencies.

First, download the latest release, language packs, and configuration files from: https://github.com/joliciel-informatique/talismane/releases

Pure universal dependencies

In this case, there are no compound postags. If "des" is analysed as ADP+DET, it will be split into two lines. If it is analysed as DET, it will remain a single line.

The configuration file you want is talismane-fr-ud-output-5.2.0.conf (replace 5.2.0 with the latest version of Talismane).

Use a command similar to:

java -Xmx2G -jar -Dconfig.file=talismane-fr-ud-output-5.2.0.conf talismane-core-5.2.0.jar --analyse --sessionId=fr --encoding=UTF8 --inFile=data/frTest.txt --outFile=data/frTest-ud.tal --logConfigFile=examples/conf/logback.xml

Universal dependencies + Compound PosTags

You may prefer compound postags, which are not part of the UD tagset: ADP+DET (for words like "du" and "aux") and ADP+PRON (for words like "duquel" and "auxquelles").

The configuration file you want is talismane-fr-ud-5.2.0.conf (replace 5.2.0 with the latest version of Talismane).

Use a command similar to:

java -Xmx2G -jar -Dconfig.file=talismane-fr-ud-5.2.0.conf talismane-core-5.2.0.jar --analyse --sessionId=fr --encoding=UTF8 --inFile=data/frTest.txt --outFile=data/frTest-ud.tal --logConfigFile=examples/conf/logback.xml
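
The two commands differ only in the configuration file, so a small wrapper can make it easier to switch versions or compare both outputs. The sketch below is not part of Talismane: the function name talismane_cmd, the VERSION variable, and the two output file names are hypothetical, chosen so that the second run does not overwrite the first. It only prints the commands, using the same flags as above, so you can check them before running anything.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Talismane): prints the analysis
# command for one of the two French UD configurations described on this page.
# VERSION defaults to 5.2.0; set it to the release you actually downloaded.
VERSION="${VERSION:-5.2.0}"

talismane_cmd() {
  # $1: configuration name: "ud-output" (pure UD) or "ud" (compound postags)
  # $2: input file
  # $3: output file
  printf 'java -Xmx2G -jar -Dconfig.file=talismane-fr-%s-%s.conf talismane-core-%s.jar --analyse --sessionId=fr --encoding=UTF8 --inFile=%s --outFile=%s --logConfigFile=examples/conf/logback.xml\n' \
    "$1" "$VERSION" "$VERSION" "$2" "$3"
}

# Output file names below are assumptions, kept distinct on purpose.
talismane_cmd ud-output data/frTest.txt data/frTest-ud-pure.tal
talismane_cmd ud data/frTest.txt data/frTest-ud-compound.tal
```

Once the printed commands look right, you can run one by copying it into your terminal from the directory that contains the jar and the configuration files.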