Adds the SmartChineseAnalyzer as a plugin.


Adds the SmartChineseAnalyzer (http://code.google.com/p/imdict-chinese-analyzer/) as an easy-to-install plugin.

1) Starting from a clean Elasticsearch installation, run the following from the bin directory to install the plugin:

./plugin -url https://github.com/downloads/thmttch/elasticsearch/elasticsearch-analysis-smartchinese-0.18.0-SNAPSHOT.zip -install analysis-smartchinese

2) Create a new index, and set the default analyzer:

curl -XPUT localhost:9200/test1 -d '
{
  "analysis": {
    "analyzer": {
      "default": {
        "type": "SmartChinese"
      }
    }
  }
}'
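
The same request can be issued from a script. The following is a minimal Python sketch, not part of the plugin: it assumes a node listening on localhost:9200 and the index name test1 from the step above, and the helper names default_analyzer_settings and create_index are hypothetical.

```python
import json
import urllib.request


def default_analyzer_settings(analyzer_type="SmartChinese"):
    """Build the settings body that makes `analyzer_type` the index default."""
    return {"analysis": {"analyzer": {"default": {"type": analyzer_type}}}}


def create_index(base_url, index_name, settings):
    """PUT the settings to <base_url>/<index_name> and return the response text."""
    request = urllib.request.Request(
        url=f"{base_url}/{index_name}",
        data=json.dumps(settings).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


# Needs a running node, so the call is left commented out (assumed host and index):
# print(create_index("http://localhost:9200", "test1", default_analyzer_settings()))
```

Building the body as a dictionary and serializing it with json.dumps avoids the shell quoting problems that hand-written JSON in a curl command is prone to.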

3) Analyze some text. Notice that the analyzer produces both single-character tokens (我, 说, 好) and two-character tokens (世界):

curl -XGET localhost:9200/test1/_analyze -d '{ "body" : "我说世界好!" }'
{
  "tokens": [
    {
      "end_offset": 7,
      "position": 3,
      "start_offset": 3,
      "token": "text",
      "type": "word"
    },
    {
      "end_offset": 12,
      "position": 7,
      "start_offset": 11,
      "token": "我",
      "type": "word"
    },
    {
      "end_offset": 13,
      "position": 8,
      "start_offset": 12,
      "token": "说",
      "type": "word"
    },
    {
      "end_offset": 15,
      "position": 9,
      "start_offset": 13,
      "token": "世界",
      "type": "word"
    },
    {
      "end_offset": 16,
      "position": 10,
      "start_offset": 15,
      "token": "好",
      "type": "word"
    }
  ]
}
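
To read the response programmatically, something like the following Python sketch may help. It is only an illustration: the sample data is copied from the output above, and the helper name token_spans is hypothetical. It lists each token with its character offsets, which makes the mix of one- and two-character tokens easy to see.

```python
import json

# Sample response copied from the analyze output shown above.
SAMPLE_RESPONSE = """
{"tokens": [
  {"token": "text", "start_offset": 3,  "end_offset": 7,  "position": 3,  "type": "word"},
  {"token": "我",   "start_offset": 11, "end_offset": 12, "position": 7,  "type": "word"},
  {"token": "说",   "start_offset": 12, "end_offset": 13, "position": 8,  "type": "word"},
  {"token": "世界", "start_offset": 13, "end_offset": 15, "position": 9,  "type": "word"},
  {"token": "好",   "start_offset": 15, "end_offset": 16, "position": 10, "type": "word"}
]}
"""


def token_spans(response_text):
    """Return (token, start_offset, end_offset) tuples from an analyze response."""
    tokens = json.loads(response_text)["tokens"]
    return [(t["token"], t["start_offset"], t["end_offset"]) for t in tokens]


spans = token_spans(SAMPLE_RESPONSE)
for token, start, end in spans:
    # The span width equals the token length in characters for this sample.
    print(f"{token}\t{start}-{end}\t({end - start} chars)")
```

In this sample every span width matches the token length, so the offsets can be treated as character positions in the analyzed text.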