ja_core_news_lg-2.3.2
explosion-bot released this on 01 Jul, 09:36
Details: https://spacy.io/models/ja#ja_core_news_lg
File checksum:
749b6484078f802544bc1c6ad15bb5f090bd40f20e4db3ff23c418804879ab5d
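A downloaded copy of the release file can be checked against the checksum above. The sketch below is a minimal example under two assumptions: the digest is SHA-256 (inferred only from its 64 hexadecimal characters, the release does not name the algorithm), and the path passed in points at whichever release file the checksum refers to. The helper names `sha256_of` and `matches_release` are illustrative, not part of spaCy.

```python
import hashlib

# Checksum copied from this release; assumed to be a SHA-256 digest.
EXPECTED_SHA256 = "749b6484078f802544bc1c6ad15bb5f090bd40f20e4db3ff23c418804879ab5d"


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so a 526 MB file is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_release(path):
    """True if the file at `path` has the checksum listed in this release."""
    return sha256_of(path) == EXPECTED_SHA256
```

Calling `matches_release(...)` with the local path of the downloaded file returns `False` if the file is corrupted or is not the file this checksum was computed for.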
Japanese multi-task CNN trained on UD_Japanese-GSD v2.6-NE. Assigns word2vec token vectors, POS tags, dependency parses and named entities.
Feature | Description |
---|---|
Name | ja_core_news_lg |
Version | 2.3.2 |
spaCy | >=2.3.0,<2.4.0 |
Model size | 526 MB |
Pipeline | parser, ner |
Vectors | 480443 keys, 480443 unique vectors (300 dimensions) |
Sources | UD_Japanese-GSD v2.6 (Omura, Mai; Miyao, Yusuke; Kanayama, Hiroshi; Matsuda, Hiroshi; Wakasa, Aya; Yamashita, Kayo; Asahara, Masayuki; Tanaka, Takaaki; Murawaki, Yugo; Matsumoto, Yuji; Mori, Shinsuke; Uematsu, Sumire; McDonald, Ryan; Nivre, Joakim; Zeman, Daniel)<br>UD_Japanese-GSD v2.6-NE (Megagon Labs Tokyo)<br>chiVe: Japanese Word Embedding with Sudachi & NWJC (chive-1.1-mc90-500k) (Works Applications)<br>SudachiPy (Works Applications)<br>SudachiDict (Works Applications) |
License | CC BY-SA 4.0 |
Author | Explosion and Megagon Labs Tokyo |
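The `spaCy` row in the table above restricts the model to spaCy versions from 2.3.0 up to, but not including, 2.4.0. As a rough illustration of what that range means, here is a minimal sketch that compares plain three-part version strings. The functions `parse_version` and `is_compatible` are hypothetical helpers, not spaCy APIs, and the sketch assumes purely numeric versions of the form `X.Y.Z` (no pre-release suffixes).

```python
def parse_version(text):
    """Turn '2.3.2' into (2, 3, 2). Assumes a plain numeric X.Y.Z version."""
    return tuple(int(part) for part in text.split("."))


def is_compatible(spacy_version, lower="2.3.0", upper="2.4.0"):
    """Check lower <= spacy_version < upper, i.e. the range >=2.3.0,<2.4.0."""
    return parse_version(lower) <= parse_version(spacy_version) < parse_version(upper)


print(is_compatible("2.3.2"))  # inside the supported range
print(is_compatible("2.4.0"))  # upper bound is exclusive
```

With an installed spaCy, the string to test would be `spacy.__version__`.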
Label Scheme
Component | Labels |
---|---|
parser | ROOT, acl, advcl, advmod, amod, aux, case, cc, ccomp, compound, cop, csubj, dep, det, dislocated, fixed, mark, nmod, nsubj, nummod, obj, obl, punct |
ner | CARDINAL, DATE, EVENT, FAC, GPE, LANGUAGE, LAW, LOC, MONEY, MOVEMENT, NORP, ORDINAL, ORG, PERCENT, PERSON, PET_NAME, PHONE, PRODUCT, QUANTITY, TIME, TITLE_AFFIX, WORK_OF_ART |
Accuracy
Type | Score |
---|---|
LAS | 87.72 |
UAS | 89.24 |
TOKEN_ACC | 97.67 |
ENTS_F | 68.91 |
ENTS_P | 71.31 |
ENTS_R | 66.67 |
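The three entity scores are related: assuming `ENTS_F` is the usual F1 score, it is the harmonic mean of precision (`ENTS_P`) and recall (`ENTS_R`). The short sketch below recomputes it from the two reported values; the function name `f_score` is illustrative only.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (the standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# ENTS_P and ENTS_R as reported in the accuracy table, in percent.
ents_f = f_score(71.31, 66.67)
print(round(ents_f, 2))  # → 68.91
```

The result agrees with the reported `ENTS_F` of 68.91.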
Installation
pip install "spacy>=2.3.0,<2.4.0"
python -m spacy download ja_core_news_lg
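Once installed, the model can be loaded by name. The following is a minimal sketch of reading out what the description says the model assigns: POS tags, dependency labels and named entities. It assumes a spaCy version in the supported range and a successfully downloaded model; `summarize` and `run_demo` are hypothetical helper names and the sample sentence is arbitrary.

```python
def summarize(doc):
    """Collect per-token annotations and named entities from a parsed Doc."""
    tokens = [(tok.text, tok.pos_, tok.dep_, tok.head.text) for tok in doc]
    entities = [(ent.text, ent.label_) for ent in doc.ents]
    return tokens, entities


def run_demo(text="東京は日本の首都です。"):
    """Load ja_core_news_lg and print its annotations for `text`."""
    # Imported here so the helper above stays usable without spaCy installed.
    import spacy

    nlp = spacy.load("ja_core_news_lg")
    tokens, entities = summarize(nlp(text))
    for text_, pos, dep, head in tokens:
        print(text_, pos, dep, head, sep="\t")
    for text_, label in entities:
        print(text_, label, sep="\t")
```

Calling `run_demo()` prints one line per token (text, POS tag, dependency label, head) followed by one line per entity, with labels drawn from the Label Scheme table.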