Releases: nlp-uoregon/trankit

Fixed loading error for customized pipelines and added a function for converting trankit outputs to CoNLL-U format

19 Jun 22:52
  • Issue #17, an error when loading customized pipelines, has been fixed in this release. Please check it out here.
  • This release adds conversion of Trankit outputs from JSON format to CoNLL-U format. The conversion is done by the new function trankit2conllu, which can be used as follows:
from trankit import Pipeline, trankit2conllu

p = Pipeline('english')

# document level
json_doc = p('''Hello! This is Trankit.''')
conllu_doc = trankit2conllu(json_doc)
print(conllu_doc)
#1       Hello   hello   INTJ    UH      _       0       root    _       _
#2       !       !       PUNCT   .       _       1       punct   _       _
#
#1       This    this    PRON    DT      Number=Sing|PronType=Dem        3       nsubj   _       _
#2       is      be      AUX     VBZ     Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin   3       cop     _       _
#3       Trankit Trankit PROPN   NNP     Number=Sing     0       root    _       _
#4       .       .       PUNCT   .       _       3       punct   _       _

# sentence level
json_sent = p('''This is Trankit.''', is_sent=True)
conllu_sent = trankit2conllu(json_sent)
print(conllu_sent)
#1       This    this    PRON    DT      Number=Sing|PronType=Dem        3       nsubj   _       _
#2       is      be      AUX     VBZ     Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin   3       cop     _       _
#3       Trankit Trankit PROPN   NNP     Number=Sing     0       root    _       _
#4       .       .       PUNCT   .       _       3       punct   _       _
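
The printed text can be read back into Python structures for downstream processing. The following is a minimal sketch, not part of Trankit: the helper name parse_conllu is hypothetical, and it assumes the returned string follows the standard CoNLL-U layout (ten tab-separated columns per token, blank lines between sentences). The sample string reproduces the document-level output above so that the snippet runs without loading a pipeline.

```python
# Column names defined by the CoNLL-U format (ten fields per token line).
CONLLU_FIELDS = ["id", "form", "lemma", "upos", "xpos",
                 "feats", "head", "deprel", "deps", "misc"]


def parse_conllu(text):
    """Hypothetical helper: split CoNLL-U text into sentences of token dicts."""
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            continue  # skip CoNLL-U comment lines, if any
        cols = line.split("\t")
        if len(cols) != len(CONLLU_FIELDS):
            raise ValueError(f"expected 10 tab-separated columns, got {len(cols)}")
        current.append(dict(zip(CONLLU_FIELDS, cols)))
    if current:
        sentences.append(current)
    return sentences


# Stand-in for trankit2conllu(json_doc); assumed to be tab-separated.
sample = "\n".join([
    "1\tHello\thello\tINTJ\tUH\t_\t0\troot\t_\t_",
    "2\t!\t!\tPUNCT\t.\t_\t1\tpunct\t_\t_",
    "",
    "1\tThis\tthis\tPRON\tDT\tNumber=Sing|PronType=Dem\t3\tnsubj\t_\t_",
    "2\tis\tbe\tAUX\tVBZ\tMood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin\t3\tcop\t_\t_",
    "3\tTrankit\tTrankit\tPROPN\tNNP\tNumber=Sing\t0\troot\t_\t_",
    "4\t.\t.\tPUNCT\t.\t_\t3\tpunct\t_\t_",
])

sentences = parse_conllu(sample)
# Collect the root token (head == "0") of each sentence.
roots = [tok["form"] for sent in sentences for tok in sent if tok["head"] == "0"]
print(roots)  # ['Hello', 'Trankit']
```

All column values stay as strings here; convert id and head to integers only if the input is known to contain no multiword-token ranges such as 1-2.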

Fixed the compatibility issue of adapter-transformers

03 Apr 18:20
v1.0.1

Fixed a compatibility issue with adapter-transformers.

Trankit v1.0.0

31 Mar 18:57

💥 💥 💥 Trankit v1.0.0 is out:

  • 90 new pretrained transformer-based pipelines for 56 languages. The new pipelines are trained with XLM-Roberta large, which further boosts the performance significantly over 90 treebanks of the Universal Dependencies v2.5 corpus. Check out the new performance here. This page shows you how to use the new pipelines.

  • Auto Mode for multilingual pipelines. In the Auto Mode, the language of the input will be automatically detected, enabling the multilingual pipelines to process the input without specifying its language. Check out how to turn on the Auto Mode here. Thank you loretoparisi for your suggestion on this.

  • A command-line interface is now available, so users who are not familiar with the Python programming language can use Trankit easily. Check out the tutorials on this page.