An R Package of Word Stemming for Bahasa Indonesia Using Nazief & Adriani's Algorithm

Note: This package might not work on the latest R version

katadasaR

Provides a function for word stemming (retrieving the word stem) in Bahasa Indonesia using Nazief and Adriani's algorithm. It can remove prefixes, suffixes, or both, but does not yet handle infixes. This package is ported from C# code provided by csharp-indonesia.com. Credit goes to the original author(s).

Install

This package is currently under development. You can install it with the devtools::install_github() function.

## install.packages("devtools")
library(devtools)
install_github("nurandi/katadasaR")

Usage

The katadasaR function (a.k.a. katadasar) checks whether a word is already a word stem and, if it is an affixed word, stems it. The function can only process one word per call, so use sapply() or a similar function to stem several words. See ?katadasaR for details.

library(katadasaR)

katadasar("makanan")

## output:
## [1] "makan"

words <- c("jakarta", "seminar", "penggunaan", "menggurui", "pelajaran", "dimana")
sapply(words, katadasaR)

## output
##    jakarta    seminar penggunaan  menggurui  pelajaran     dimana 
##  "jakarta"  "seminar"     "guna"     "guru"     "ajar"     "mana" 
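
Because the function handles a single word per call, stemming a whole sentence needs a small wrapper. The following is a minimal sketch, not part of the package: stem_text is a hypothetical helper name, and the lowercasing and letters-only tokenization are assumptions about how you may want to prepare the input.

```r
# Hypothetical helper (not provided by katadasaR): stem every word in a text.
# `stemmer` is any function that maps one word to its stem; the default
# assumes the katadasaR package is installed.
stem_text <- function(text, stemmer = katadasaR::katadasaR) {
  # Assumption: lowercase the input and split on anything that is not a-z
  tokens <- unlist(strsplit(tolower(text), "[^a-z]+"))
  # Drop empty tokens left over from leading punctuation or spaces
  tokens <- tokens[nzchar(tokens)]
  # The stemmer takes one word per call, so apply it token by token
  vapply(tokens, stemmer, character(1), USE.NAMES = FALSE)
}
```

Calling stem_text("Penggunaan makanan") would then pass "penggunaan" and "makanan" to katadasaR one at a time. Taking the stemmer as an argument keeps the wrapper usable with a different stemming function if the package does not load on your R version.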

Acknowledgement

Ported from: csharp-indonesia.com

Related Article

katadasaR: Stemming Bahasa Indonesia dengan R ("Stemming Bahasa Indonesia with R", article in Indonesian) - nurandi.net
