Indonesian Manually Tagged Corpus

idn-tagged-corpus

Manually Tagged Indonesian Corpus

Data Format

The corpus is distributed as tab-separated files (.tsv). Each line contains one token and its part-of-speech tag, separated by a single tab character (\t). Sentences are separated by one empty line.
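
The format described above can be read with a few lines of Python. This is a minimal sketch, not code shipped with the corpus: the function name `read_tagged_sentences` is hypothetical, and the sample tokens and tags are made up for illustration and may not match the corpus tagset.

```python
from typing import Iterable, Iterator, List, Tuple

# One sentence is a list of (token, tag) pairs.
Sentence = List[Tuple[str, str]]


def read_tagged_sentences(lines: Iterable[str]) -> Iterator[Sentence]:
    """Yield sentences from lines of the form 'token<TAB>tag'.

    An empty line marks the end of a sentence. A non-empty line without
    a tab raises ValueError, so malformed input is not silently skipped.
    """
    sentence: Sentence = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            # Blank line: close the current sentence, if any.
            if sentence:
                yield sentence
                sentence = []
            continue
        token, tag = line.split("\t", 1)
        sentence.append((token, tag))
    # The last sentence may not be followed by a blank line.
    if sentence:
        yield sentence


# Illustrative sample only; tokens and tags are assumptions.
sample = "Saya\tPRP\nmakan\tVB\nnasi\tNN\n\nDia\tPRP\ntidur\tVB\n"
sentences = list(read_tagged_sentences(sample.splitlines()))
```

To read from disk, an open file object can be passed in place of the sample lines, for example one obtained from `open(path, encoding="utf-8")`, where `path` points to a downloaded .tsv file.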

Authors

  • Ruli Manurung
  • Arawinda Dinakaramani
  • Fam Rashel
  • Andry Luthfi

Page

For publication and more details about this work, please visit http://bahasa.cs.ui.ac.id/postag/corpus

License

This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/4.0/.