NodeJieba 简体中文

Introduction

NodeJieba provides Chinese word segmentation for Node.js, based on CppJieba.

Install

npm install nodejieba

Or install through the npmmirror.com registry:

npm install nodejieba --registry=https://registry.npmmirror.com --nodejieba_binary_host_mirror=https://registry.npmmirror.com/-/binary/nodejieba/

Usage

import { cut } from "nodejieba";

const result = cut("南京市长江大桥");
console.log(result);
//["南京市","长江大桥"]

See the test cases for details.

Initialization

Initialization is optional: if you do not initialize explicitly, the default dictionaries are loaded automatically the first time cut is called.

To load the default dictionaries explicitly, call load:

import { load } from "nodejieba";

load();

If a dictionary parameter is missing, its default value will be used.

Dictionary description

dict: the main dictionary, with word weights and part-of-speech tags; using the default dictionary is recommended
hmmDict: the hidden Markov model (HMM) data; using the default is recommended
userDict: the user dictionary; adapting it to your use case is recommended
idfDict: IDF (inverse document frequency) values used for keyword extraction
stopWordDict: the list of stop words used for keyword extraction
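
The fallback rule for missing dictionary parameters can be sketched as a plain merge of user overrides onto defaults. This is only an illustration of the idea, not NodeJieba's implementation: the helper name resolveDictOptions and the placeholder paths in DEFAULT_DICTS are assumptions made for this example, and only the five option names come from the list above.

```javascript
// Hypothetical placeholder paths; the real defaults ship inside the nodejieba package.
const DEFAULT_DICTS = {
  dict: "<default>/jieba.dict.utf8",
  hmmDict: "<default>/hmm_model.utf8",
  userDict: "<default>/user.dict.utf8",
  idfDict: "<default>/idf.utf8",
  stopWordDict: "<default>/stop_words.utf8",
};

// Fill in every dictionary option the caller left out (or set to undefined).
function resolveDictOptions(overrides = {}) {
  const resolved = { ...DEFAULT_DICTS };
  for (const [name, path] of Object.entries(overrides)) {
    if (!(name in DEFAULT_DICTS)) {
      throw new Error(`Unknown dictionary option: ${name}`);
    }
    if (path !== undefined) resolved[name] = path;
  }
  return resolved;
}

// Only userDict is overridden; the other four keep their defaults.
const options = resolveDictOptions({ userDict: "./my-user-dict.utf8" });
console.log(options.userDict); // ./my-user-dict.utf8
console.log(options.dict);     // <default>/jieba.dict.utf8
```

In practice you would presumably pass only the options you want to change (for example a userDict path) to load and let the library supply the rest; check the test cases for the exact call shape.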

POS Tagging

import { tag } from "nodejieba";

console.log(tag("红掌拨清波"));
//[ { word: '红掌', tag: 'n' },
//  { word: '拨', tag: 'v' },
//  { word: '清波', tag: 'n' } ]

See the test cases for details.
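
Because tag returns an array of { word, tag } objects, as in the output above, it can be post-processed with ordinary array methods. The sketch below uses a copy of that sample output instead of calling the library, so it runs without NodeJieba installed. The helper wordsWithTag is a hypothetical name, and matching by prefix assumes that noun subtypes share the leading n (for example nr, ns), which may not hold for every tag.

```javascript
// Sample data copied from the tag() output shown above.
const tagged = [
  { word: "红掌", tag: "n" },
  { word: "拨", tag: "v" },
  { word: "清波", tag: "n" },
];

// Return the words whose tag starts with the given prefix (e.g. "n" for nouns).
function wordsWithTag(tokens, tagPrefix) {
  return tokens
    .filter((token) => token.tag.startsWith(tagPrefix))
    .map((token) => token.word);
}

console.log(wordsWithTag(tagged, "n")); // [ '红掌', '清波' ]
console.log(wordsWithTag(tagged, "v")); // [ '拨' ]
```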

Keyword Extractor

import { extract, textRankExtract } from "nodejieba";

const topN = 4;

console.log(extract("升职加薪，当上CEO，走上人生巅峰。", topN));
//[ { word: 'CEO', weight: 11.739204307083542 },
//  { word: '升职', weight: 10.8561552143 },
//  { word: '加薪', weight: 10.642581114 },
//  { word: '巅峰', weight: 9.49395840471 } ]

console.log(textRankExtract("升职加薪，当上CEO，走上人生巅峰。", topN));
//[ { word: '当上', weight: 1 },
//  { word: '不用', weight: 0.9898479330698993 },
//  { word: '多久', weight: 0.9851260595435759 },
//  { word: '加薪', weight: 0.9830464899847804 },
//  { word: '升职', weight: 0.9802777682279076 } ]

See the test cases for details.
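
To give a feel for how idfDict and stopWordDict feed into keyword extraction, here is a minimal TF-IDF-style sketch: each word is scored by its count in the text multiplied by an IDF value, stop words are skipped, and the top N words are returned. It is not NodeJieba's code and will not reproduce the weights above. The IDF values, the stop-word list, the DEFAULT_IDF fallback, and the name topKeywords are all invented for illustration.

```javascript
// Toy IDF table and stop-word list; the values are made up for illustration only.
const idf = new Map([
  ["升职", 3.0],
  ["加薪", 2.5],
  ["巅峰", 2.0],
  ["人生", 1.0],
]);
const DEFAULT_IDF = 1.5; // hypothetical fallback for words missing from the table
const stopWords = new Set(["走上", "当上"]);

// Score each distinct word by (term frequency × IDF), skip stop words, keep the top N.
function topKeywords(words, topN) {
  const tf = new Map();
  for (const w of words) {
    if (stopWords.has(w)) continue;
    tf.set(w, (tf.get(w) ?? 0) + 1);
  }
  return [...tf]
    .map(([word, count]) => ({ word, weight: count * (idf.get(word) ?? DEFAULT_IDF) }))
    .sort((a, b) => b.weight - a.weight)
    .slice(0, topN);
}

// A pre-segmented sentence, standing in for the output of a segmenter.
const words = ["升职", "加薪", "当上", "CEO", "走上", "人生", "巅峰"];
console.log(topKeywords(words, 2).map((k) => k.word)); // [ '升职', '加薪' ]
```

The input here is already segmented; with the library, the segmentation and scoring happen inside extract.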

Node.js Support

v16
v18
v20

Use Cases

gitbook-plugin-search-pro
pinyin

Similar projects

@node-rs/jieba

Performance

NodeJieba is intended to be the fastest of the available Node.js Chinese word segmentation modules. A benchmark write-up is available in Chinese: [Jieba 中文分词系列性能评测] (a performance evaluation of the Jieba Chinese word segmentation family).

Online Demo

http://cppjieba-webdemo.herokuapp.com/ (Chrome is recommended)

Contact

Email: i@yanyiwu.com

Author

YanyiWu
contributors

Contributors

Code Contributors

This project exists thanks to all the people who contribute.

Financial Contributors

Become a financial contributor and help us sustain our community. [Contribute]

Individuals

Organizations

Support this project with your organization. Your logo will show up here with a link to your website. Contribute
