Skip to content
forked from simedw/mecab.js

Node.js bindings for mecab (japanese word boundaries)

Notifications You must be signed in to change notification settings

huwenshuo/mecab.js

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

7 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

mecab.js

Node.js bindings for mecab (japanese word boundaries) The original project: http://mecab.googlecode.com/svn/trunk/mecab/doc/index.html

Installation

If you read Japanese, follow the instructions on http://mecab.googlecode.com/svn/trunk/mecab/doc/index.html, do not forget to compile mecab-ipadic with unicode support.

For the rest of us:

Get the latest version of mecab from http://mecab.googlecode.com/svn/trunk/mecab/doc/index.html#download

tar zxfv mecab-X.X.tar.gz
cd mecab-X.X
./configure 
make
make check
sudo make install

Get the latest version of mecab-ipadic (a dictionary) from the same site. There are other dictinaries, but I haven't tried them yet.

tar zxfv mecab-ipadic-2.7.0-XXXX.tar.gz
mecab-ipadic-2.7.0-XXXX
./configure --with-charset=utf-8
make
sudo make install

Make sure you have a recent version of Node.js installed.

git clone https://github.com/simedw/mecab.js.git
cd mecab.js
npm install

Usage

mecab  = require 'mecab'
input  = '犬が猫を追いかけていました。'

parser = new mecab.MeCab()
parser.parse input, (error, result) ->
  throw error if error
  console.log result

Gives the following output

[ '', '', '', '', '追いかけ', '', '', 'まし', '', '' ]

Todo

  • Return information for each word segment (if it's an adjective, verb etc)
  • Make the result included the second best result as well

About

Node.js bindings for mecab (japanese word boundaries)

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • C++ 76.9%
  • CoffeeScript 17.7%
  • Python 5.4%