
ja_core_news_sm-2.3.2

@explosion-bot explosion-bot released this 01 Jul 09:36
· 1222 commits to master since this release
1953ab5

Downloads

Details: https://spacy.io/models/ja#ja_core_news_sm

File checksum: ab346bf623f59965d4813a5b9c8d19d19ccbf37e5e0f3b99684f9b5990c1be11
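A downloaded archive can be checked against the checksum above before installing it. The sketch below assumes the checksum is a SHA-256 digest (its 64 hex characters are consistent with that, but the release does not name the algorithm); the helper names `sha256_of` and `matches_checksum` are illustrative and not part of spaCy.

```python
import hashlib

# Checksum published with this release (assumed here to be SHA-256).
EXPECTED_SHA256 = "ab346bf623f59965d4813a5b9c8d19d19ccbf37e5e0f3b99684f9b5990c1be11"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected=EXPECTED_SHA256):
    """Return True if the file's digest equals the expected hex string (case-insensitive)."""
    return sha256_of(path) == expected.lower()
```

For example, `matches_checksum("ja_core_news_sm-2.3.2.tar.gz")` would return `True` for an intact download, where the archive file name is an assumption about how the release asset was saved locally.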

Japanese multi-task CNN trained on UD_Japanese-GSD v2.6-NE. Assigns context-specific token vectors, POS tags, dependency parses and named entities.

Feature Description
Name ja_core_news_sm
Version 2.3.2
spaCy >=2.3.0,<2.4.0
Model size 7 MB
Pipeline parser, ner
Vectors 0 keys, 0 unique vectors (0 dimensions)
Sources UD_Japanese-GSD v2.6 (Omura, Mai; Miyao, Yusuke; Kanayama, Hiroshi; Matsuda, Hiroshi; Wakasa, Aya; Yamashita, Kayo; Asahara, Masayuki; Tanaka, Takaaki; Murawaki, Yugo; Matsumoto, Yuji; Mori, Shinsuke; Uematsu, Sumire; McDonald, Ryan; Nivre, Joakim; Zeman, Daniel)
UD_Japanese-GSD v2.6-NE (Megagon Labs Tokyo)
SudachiPy (Works Applications)
SudachiDict (Works Applications)
License CC BY-SA 4.0
Author Explosion and Megagon Labs Tokyo

Label Scheme

Component Labels
parser ROOT, acl, advcl, advmod, amod, aux, case, cc, ccomp, compound, cop, csubj, dep, det, dislocated, fixed, mark, nmod, nsubj, nummod, obj, obl, punct
ner CARDINAL, DATE, EVENT, FAC, GPE, LANGUAGE, LAW, LOC, MONEY, MOVEMENT, NORP, ORDINAL, ORG, PERCENT, PERSON, PET_NAME, PHONE, PRODUCT, QUANTITY, TIME, TITLE_AFFIX, WORK_OF_ART

Accuracy

Type Score
LAS 86.93
UAS 88.65
TOKEN_ACC 97.67
ENTS_F 61.04
ENTS_P 65.40
ENTS_R 57.22
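The three entity scores are related: assuming ENTS_F is the standard F1 score, it is the harmonic mean of ENTS_P (precision) and ENTS_R (recall). The short sketch below recomputes it from the reported values; the function name `f_score` is illustrative and not a spaCy API.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1); returns 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the accuracy table (percentages).
ents_p = 65.40
ents_r = 57.22

ents_f = f_score(ents_p, ents_r)
print(f"ENTS_F = {ents_f:.2f}")  # ENTS_F = 61.04
```

The recomputed value agrees with the reported ENTS_F of 61.04 to two decimal places.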

Installation

pip install "spacy>=2.3.0,<2.4.0"
python -m spacy download ja_core_news_sm