ja_core_news_md-2.3.2

@explosion-bot explosion-bot released this 01 Jul 09:36
· 1222 commits to master since this release
1953ab5

Downloads

Details: https://spacy.io/models/ja#ja_core_news_md

File checksum: d685f52ecb4d8ee008fc11373fa7474a0e7d6e90ffc819b63141a11cae8b9b43
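
The checksum above is 64 hexadecimal digits long, which is consistent with SHA-256, although the release notes do not name the algorithm. Under that assumption, a downloaded archive could be verified roughly as follows. The function names and the example archive path are illustrative and not part of the release.

```python
import hashlib

# Checksum as published in these release notes (assumed to be SHA-256).
EXPECTED_CHECKSUM = "d685f52ecb4d8ee008fc11373fa7474a0e7d6e90ffc819b63141a11cae8b9b43"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_release_checksum(path, expected=EXPECTED_CHECKSUM):
    """True if the file's digest equals the published checksum."""
    return sha256_of_file(path) == expected.lower()
```

For example, `matches_release_checksum("ja_core_news_md-2.3.2.tar.gz")` would check a locally downloaded archive, where the file name is a hypothetical local path.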

Japanese multi-task CNN trained on UD_Japanese-GSD v2.6-NE. Assigns word2vec token vectors, POS tags, dependency parses and named entities.

Feature Description
Name ja_core_news_md
Version 2.3.2
spaCy >=2.3.0,<2.4.0
Model size 37 MB
Pipeline parser, ner
Vectors 480443 keys, 20000 unique vectors (300 dimensions)
Sources UD_Japanese-GSD v2.6 (Omura, Mai; Miyao, Yusuke; Kanayama, Hiroshi; Matsuda, Hiroshi; Wakasa, Aya; Yamashita, Kayo; Asahara, Masayuki; Tanaka, Takaaki; Murawaki, Yugo; Matsumoto, Yuji; Mori, Shinsuke; Uematsu, Sumire; McDonald, Ryan; Nivre, Joakim; Zeman, Daniel)
UD_Japanese-GSD v2.6-NE (Megagon Labs Tokyo)
chiVe: Japanese Word Embedding with Sudachi & NWJC (chive-1.1-mc90-500k) (Works Applications)
SudachiPy (Works Applications)
SudachiDict (Works Applications)
License CC BY-SA 4.0
Author Explosion and Megagon Labs Tokyo
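
The spaCy requirement listed above (>=2.3.0,<2.4.0) means the model is intended for the 2.3.x series only. A minimal sketch of that range check is shown below. It is a simplified comparison on the numeric release segment, not a full PEP 440 parser, and the helper names are illustrative.

```python
def version_tuple(version):
    """Parse the leading numeric part of a version, e.g. "2.3.2" -> (2, 3, 2)."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at suffixes such as "dev0"
        parts.append(int(digits))
    # Pad short versions such as "2.3" to three components.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_compatible(spacy_version, lower=(2, 3, 0), upper=(2, 4, 0)):
    """True if lower <= spacy_version < upper (the range from the table above)."""
    return lower <= version_tuple(spacy_version) < upper
```

With spaCy installed, `is_compatible(spacy.__version__)` would report whether the running version falls inside the supported range.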

Label Scheme

Component Labels
parser ROOT, acl, advcl, advmod, amod, aux, case, cc, ccomp, compound, cop, csubj, dep, det, dislocated, fixed, mark, nmod, nsubj, nummod, obj, obl, punct
ner CARDINAL, DATE, EVENT, FAC, GPE, LANGUAGE, LAW, LOC, MONEY, MOVEMENT, NORP, ORDINAL, ORG, PERCENT, PERSON, PET_NAME, PHONE, PRODUCT, QUANTITY, TIME, TITLE_AFFIX, WORK_OF_ART

Accuracy

Type Score
LAS 87.43
UAS 89.03
TOKEN_ACC 97.67
ENTS_F 69.57
ENTS_P 71.84
ENTS_R 67.43
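
The three entity scores are related: assuming ENTS_F is the usual balanced F-score (the harmonic mean of precision and recall), it can be recomputed from ENTS_P and ENTS_R. The sketch below uses the rounded values from the table, so small differences in the last digit are possible in general.

```python
def f_score(precision, recall):
    """Balanced F-score (harmonic mean of precision and recall)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values taken from the accuracy table above.
ents_p = 71.84
ents_r = 67.43

ents_f = f_score(ents_p, ents_r)
print(round(ents_f, 2))  # 69.57, matching the reported ENTS_F
```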

Installation

pip install spacy
python -m spacy download ja_core_news_md
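
Once the model is installed, it can be loaded by name with `spacy.load`. The following is a minimal usage sketch, assuming a spaCy version in the supported 2.3.x range. The helper names `summarize` and `analyze` are illustrative and not part of spaCy.

```python
def summarize(doc):
    """Collect per-token annotations and entities from a spaCy Doc.

    Returns (tokens, entities), where each token is
    (text, POS tag, dependency label, head text) and each entity is
    (text, label).
    """
    tokens = [(t.text, t.pos_, t.dep_, t.head.text) for t in doc]
    entities = [(e.text, e.label_) for e in doc.ents]
    return tokens, entities


def analyze(text, model_name="ja_core_news_md"):
    """Load the model and return summarize() for the given text."""
    # Imported here so summarize() stays usable without spaCy installed.
    import spacy

    nlp = spacy.load(model_name)
    return summarize(nlp(text))
```

Calling `analyze` on a Japanese sentence returns the token annotations and named entities. The dependency and entity labels come from the label scheme listed above.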