idn-tagged-corpus
Manually Tagged Indonesian Corpus
Data Format
This corpus uses the tab-separated values (.tsv) file format. Each line contains a token and its part-of-speech tag, separated by a single tab character (\t). Sentences are separated by a single empty line.
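The format described above can be read with a few lines of Python. The following is only a minimal sketch, not code shipped with the corpus: the function name `read_tagged_sentences`, the sample text, and the tags in the sample are illustrative assumptions.

```python
from typing import Iterable, Iterator, List, Tuple

# One sentence is a list of (token, tag) pairs.
Sentence = List[Tuple[str, str]]


def read_tagged_sentences(lines: Iterable[str]) -> Iterator[Sentence]:
    """Yield sentences from lines of the form 'token<TAB>tag'.

    An empty line ends the current sentence. The function name is
    hypothetical; it is not part of the corpus distribution.
    """
    sentence: Sentence = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            # Empty line: sentence boundary.
            if sentence:
                yield sentence
                sentence = []
            continue
        # Split on the first tab only; a line without a tab raises ValueError.
        token, tag = line.split("\t", 1)
        sentence.append((token, tag))
    # Emit the last sentence if the input does not end with an empty line.
    if sentence:
        yield sentence


# Made-up sample in the described format; the tags are placeholders.
sample = "Saya\tPRP\nmakan\tVB\nnasi\tNN\n.\tZ\n\nDia\tPRP\ntidur\tVB\n.\tZ\n"
sentences = list(read_tagged_sentences(sample.splitlines()))
print(len(sentences))    # 2
print(sentences[0][:2])  # [('Saya', 'PRP'), ('makan', 'VB')]
```

To read from disk, pass an open file object instead of the sample lines, for example `open(path, encoding="utf-8")`, where `path` points to one of the corpus .tsv files. UTF-8 is an assumption here; adjust the encoding if the files use a different one.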
Authors
- Ruli Manurung
- Arawinda Dinakaramani
- Fam Rashel
- Andry Luthfi
Page
For the related publication and more details about this work, visit http://bahasa.cs.ui.ac.id/postag/corpus
License
This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/4.0/.