
Releases: explosion/spacy-models

en_core_web_trf-3.7.3

17 Nov 08:13
f15206b

Downloads Downloads (wheel)

Checksum .tar.gz: dae355f7f419bee53f2804a8e62a6473425e8680ac8ff8e8a7b30b7e2b8b0c4f
Checksum .whl: f72abb34bdf174876bd4267b29b2501677e605e0a251fdc56c163003182ed68b
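
The checksums above are 64 hexadecimal characters long, which is consistent with SHA-256. A minimal sketch for checking a downloaded archive before installing it, assuming SHA-256 is indeed the algorithm; the function names are illustrative and not part of spaCy:

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Read fixed-size chunks so large archives are not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare a file's digest with a published value, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Call `matches_checksum` with the local path of the downloaded file and the matching checksum from the release entry; a `False` result means the file should not be installed.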

Details: https://spacy.io/models/en#en_core_web_trf

English transformer pipeline (Transformer(name='roberta-base', piece_encoder='byte-bpe', stride=104, type='roberta', width=768, window=144, vocab_size=50265)). Components: transformer, tagger, parser, ner, attribute_ruler, lemmatizer.
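
The `window=144` and `stride=104` settings above suggest that long inputs are split into overlapping windows of word pieces, each starting 104 pieces after the previous one and sharing 40 pieces with it. The following is a simplified sketch of that idea, not the library's own code; `strided_windows` is a hypothetical helper:

```python
def strided_windows(n_pieces, window=144, stride=104):
    """Return (start, end) index pairs of overlapping windows over n_pieces."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    spans = []
    start = 0
    while start < n_pieces:
        end = min(start + window, n_pieces)
        spans.append((start, end))
        if end == n_pieces:
            # The last window reached the end of the input.
            break
        start += stride
    return spans


print(strided_windows(300))  # [(0, 144), (104, 248), (208, 300)]
```

With these defaults, consecutive windows overlap by `window - stride = 40` pieces, so pieces near a window boundary are seen with context on both sides.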

Feature Description
Name en_core_web_trf
Version 3.7.3
spaCy >=3.7.2,<3.8.0
Default Pipeline transformer, tagger, parser, attribute_ruler, lemmatizer, ner
Components transformer, tagger, parser, attribute_ruler, lemmatizer, ner
Vectors 0 keys, 0 unique vectors (0 dimensions)
Sources OntoNotes 5 (Ralph Weischedel, Martha Palmer, Mitchell Marcus, Eduard Hovy, Sameer Pradhan, Lance Ramshaw, Nianwen Xue, Ann Taylor, Jeff Kaufman, Michelle Franchini, Mohammed El-Bachouti, Robert Belvin, Ann Houston); ClearNLP Constituent-to-Dependency Conversion (Emory University); WordNet 3.0 (Princeton University); roberta-base (Yinhan Liu and Myle Ott and Naman Goyal and Jingfei Du and Mandar Joshi and Danqi Chen and Omer Levy and Mike Lewis and Luke Zettlemoyer and Veselin Stoyanov)
License MIT
Author Explosion
Model size 436 MB
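
The `spaCy` row gives the range of spaCy versions the package is built for (here `>=3.7.2,<3.8.0`). A minimal standard-library sketch of checking a version string against such a range, assuming plain numeric versions only; `satisfies` is an illustrative helper, and the `packaging` library's `SpecifierSet` is the more robust choice in real projects:

```python
import operator

_OPS = {">=": operator.ge, "<=": operator.le, "==": operator.eq,
        ">": operator.gt, "<": operator.lt}


def _as_tuple(version):
    """Turn '3.7.2' into (3, 7, 2), padding short versions with zeros."""
    parts = [int(part) for part in version.split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def satisfies(version, spec):
    """Check a version against a comma-separated range like '>=3.7.2,<3.8.0'."""
    installed = _as_tuple(version)
    for clause in spec.split(","):
        clause = clause.strip()
        # Try two-character operators before one-character ones.
        for symbol in (">=", "<=", "==", ">", "<"):
            if clause.startswith(symbol):
                bound = _as_tuple(clause[len(symbol):].strip())
                if not _OPS[symbol](installed, bound):
                    return False
                break
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return True


print(satisfies("3.7.2", ">=3.7.2,<3.8.0"))  # True
print(satisfies("3.8.0", ">=3.7.2,<3.8.0"))  # False
```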

Label Scheme

View label scheme (112 labels for 3 components)
Component Labels
tagger $, '', ,, -LRB-, -RRB-, ., :, ADD, AFX, CC, CD, DT, EX, FW, HYPH, IN, JJ, JJR, JJS, LS, MD, NFP, NN, NNP, NNPS, NNS, PDT, POS, PRP, PRP$, RB, RBR, RBS, RP, SYM, TO, UH, VB, VBD, VBG, VBN, VBP, VBZ, WDT, WP, WP$, WRB, XX, ``
parser ROOT, acl, acomp, advcl, advmod, agent, amod, appos, attr, aux, auxpass, case, cc, ccomp, compound, conj, csubj, csubjpass, dative, dep, det, dobj, expl, intj, mark, meta, neg, nmod, npadvmod, nsubj, nsubjpass, nummod, oprd, parataxis, pcomp, pobj, poss, preconj, predet, prep, prt, punct, quantmod, relcl, xcomp
ner CARDINAL, DATE, EVENT, FAC, GPE, LANGUAGE, LAW, LOC, MONEY, NORP, ORDINAL, ORG, PERCENT, PERSON, PRODUCT, QUANTITY, TIME, WORK_OF_ART

Accuracy

Type Score
TOKEN_ACC 99.86
TOKEN_P 99.57
TOKEN_R 99.58
TOKEN_F 99.57
TAG_ACC 98.13
SENTS_P 94.89
SENTS_R 85.79
SENTS_F 90.11
DEP_UAS 95.26
DEP_LAS 93.91
ENTS_P 90.08
ENTS_R 90.30
ENTS_F 90.19
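
In the table above, each `_F` row is consistent with the harmonic mean of the matching `_P` (precision) and `_R` (recall) rows. A small sketch of that arithmetic; the published scores were presumably computed from raw counts, so recomputing from rounded values may differ in the last digit for other rows:

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


print(round(f_score(90.08, 90.30), 2))  # ENTS_P, ENTS_R   -> 90.19
print(round(f_score(94.89, 85.79), 2))  # SENTS_P, SENTS_R -> 90.11
```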

Installation

pip install spacy
python -m spacy download en_core_web_trf

en_core_web_sm-3.7.1

17 Nov 08:13
f15206b

Downloads Downloads (wheel)

Checksum .tar.gz: 1075c2aa2bc2fee105ab6e90a01a5d1a428c9f5b20a1fa003dc2cb6a438d295e
Checksum .whl: 86cc141f63942d4b2c5fcee06630fd6f904788d2f0ab005cce45aadb8fb73889

Details: https://spacy.io/models/en#en_core_web_sm

English pipeline optimized for CPU. Components: tok2vec, tagger, parser, senter, ner, attribute_ruler, lemmatizer.

Feature Description
Name en_core_web_sm
Version 3.7.1
spaCy >=3.7.2,<3.8.0
Default Pipeline tok2vec, tagger, parser, attribute_ruler, lemmatizer, ner
Components tok2vec, tagger, parser, senter, attribute_ruler, lemmatizer, ner
Vectors 0 keys, 0 unique vectors (0 dimensions)
Sources OntoNotes 5 (Ralph Weischedel, Martha Palmer, Mitchell Marcus, Eduard Hovy, Sameer Pradhan, Lance Ramshaw, Nianwen Xue, Ann Taylor, Jeff Kaufman, Michelle Franchini, Mohammed El-Bachouti, Robert Belvin, Ann Houston); ClearNLP Constituent-to-Dependency Conversion (Emory University); WordNet 3.0 (Princeton University)
License MIT
Author Explosion
Model size 12 MB
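
In the table above, `senter` is listed under Components but not under Default Pipeline, so it ships with the package without being active after a plain load. A hedged sketch of how it might be switched on in place of the parser for sentence splitting; `inactive_components` and `load_for_sentence_splitting` are illustrative names, and the second function needs spaCy and the model package installed:

```python
def inactive_components(components, default_pipeline):
    """Components shipped with the package but absent from the default pipeline."""
    active = set(default_pipeline)
    return [name for name in components if name not in active]


# Values copied from the Components and Default Pipeline rows.
COMPONENTS = ["tok2vec", "tagger", "parser", "senter",
              "attribute_ruler", "lemmatizer", "ner"]
DEFAULT_PIPELINE = ["tok2vec", "tagger", "parser",
                    "attribute_ruler", "lemmatizer", "ner"]

print(inactive_components(COMPONENTS, DEFAULT_PIPELINE))  # ['senter']


def load_for_sentence_splitting(name="en_core_web_sm"):
    """Load the package with the parser excluded and `senter` enabled."""
    # Imported lazily so the helper above works without spaCy installed.
    import spacy

    nlp = spacy.load(name, exclude=["parser"])
    nlp.enable_pipe("senter")
    return nlp
```

After `nlp = load_for_sentence_splitting()`, iterating over `nlp(text).sents` yields the sentences found by `senter` rather than by the dependency parser.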

Label Scheme

View label scheme (113 labels for 3 components)
Component Labels
tagger $, '', ,, -LRB-, -RRB-, ., :, ADD, AFX, CC, CD, DT, EX, FW, HYPH, IN, JJ, JJR, JJS, LS, MD, NFP, NN, NNP, NNPS, NNS, PDT, POS, PRP, PRP$, RB, RBR, RBS, RP, SYM, TO, UH, VB, VBD, VBG, VBN, VBP, VBZ, WDT, WP, WP$, WRB, XX, _SP, ``
parser ROOT, acl, acomp, advcl, advmod, agent, amod, appos, attr, aux, auxpass, case, cc, ccomp, compound, conj, csubj, csubjpass, dative, dep, det, dobj, expl, intj, mark, meta, neg, nmod, npadvmod, nsubj, nsubjpass, nummod, oprd, parataxis, pcomp, pobj, poss, preconj, predet, prep, prt, punct, quantmod, relcl, xcomp
ner CARDINAL, DATE, EVENT, FAC, GPE, LANGUAGE, LAW, LOC, MONEY, NORP, ORDINAL, ORG, PERCENT, PERSON, PRODUCT, QUANTITY, TIME, WORK_OF_ART

Accuracy

Type Score
TOKEN_ACC 99.86
TOKEN_P 99.57
TOKEN_R 99.58
TOKEN_F 99.57
TAG_ACC 97.25
SENTS_P 92.02
SENTS_R 89.21
SENTS_F 90.59
DEP_UAS 91.75
DEP_LAS 89.87
ENTS_P 84.55
ENTS_R 84.57
ENTS_F 84.56

Installation

pip install spacy
python -m spacy download en_core_web_sm

en_core_web_md-3.7.1

17 Nov 08:13
f15206b

Downloads Downloads (wheel)

Checksum .tar.gz: 3273a1335fcb688be09949c5cdb73e85eb584ec3dfc50d4338c17daf6ccd4628
Checksum .whl: 6a0f857a2b4d219c6fa17d455f82430b365bf53171a2d919b9376e5dc9be032e

Details: https://spacy.io/models/en#en_core_web_md

English pipeline optimized for CPU. Components: tok2vec, tagger, parser, senter, ner, attribute_ruler, lemmatizer.

Feature Description
Name en_core_web_md
Version 3.7.1
spaCy >=3.7.2,<3.8.0
Default Pipeline tok2vec, tagger, parser, attribute_ruler, lemmatizer, ner
Components tok2vec, tagger, parser, senter, attribute_ruler, lemmatizer, ner
Vectors 514157 keys, 20000 unique vectors (300 dimensions)
Sources OntoNotes 5 (Ralph Weischedel, Martha Palmer, Mitchell Marcus, Eduard Hovy, Sameer Pradhan, Lance Ramshaw, Nianwen Xue, Ann Taylor, Jeff Kaufman, Michelle Franchini, Mohammed El-Bachouti, Robert Belvin, Ann Houston); ClearNLP Constituent-to-Dependency Conversion (Emory University); WordNet 3.0 (Princeton University); Explosion Vectors (OSCAR 2109 + Wikipedia + OpenSubtitles + WMT News Crawl) (Explosion)
License MIT
Author Explosion
Model size 40 MB

Label Scheme

View label scheme (113 labels for 3 components)
Component Labels
tagger $, '', ,, -LRB-, -RRB-, ., :, ADD, AFX, CC, CD, DT, EX, FW, HYPH, IN, JJ, JJR, JJS, LS, MD, NFP, NN, NNP, NNPS, NNS, PDT, POS, PRP, PRP$, RB, RBR, RBS, RP, SYM, TO, UH, VB, VBD, VBG, VBN, VBP, VBZ, WDT, WP, WP$, WRB, XX, _SP, ``
parser ROOT, acl, acomp, advcl, advmod, agent, amod, appos, attr, aux, auxpass, case, cc, ccomp, compound, conj, csubj, csubjpass, dative, dep, det, dobj, expl, intj, mark, meta, neg, nmod, npadvmod, nsubj, nsubjpass, nummod, oprd, parataxis, pcomp, pobj, poss, preconj, predet, prep, prt, punct, quantmod, relcl, xcomp
ner CARDINAL, DATE, EVENT, FAC, GPE, LANGUAGE, LAW, LOC, MONEY, NORP, ORDINAL, ORG, PERCENT, PERSON, PRODUCT, QUANTITY, TIME, WORK_OF_ART

Accuracy

Type Score
TOKEN_ACC 99.86
TOKEN_P 99.57
TOKEN_R 99.58
TOKEN_F 99.57
TAG_ACC 97.33
SENTS_P 92.21
SENTS_R 89.37
SENTS_F 90.77
DEP_UAS 92.05
DEP_LAS 90.23
ENTS_P 84.94
ENTS_R 85.49
ENTS_F 85.22

Installation

pip install spacy
python -m spacy download en_core_web_md

en_core_web_lg-3.7.1

17 Nov 08:13
f15206b

Downloads Downloads (wheel)

Checksum .tar.gz: 4c8b2fd2572a5fb232c7b38345d301e7e092d1242b7184e14a86eff8ef6eb6d7
Checksum .whl: ab70aeb6172cde82508f7739f35ebc9918a3d07debeed637403c8f794ba3d3dc

Details: https://spacy.io/models/en#en_core_web_lg

English pipeline optimized for CPU. Components: tok2vec, tagger, parser, senter, ner, attribute_ruler, lemmatizer.

Feature Description
Name en_core_web_lg
Version 3.7.1
spaCy >=3.7.2,<3.8.0
Default Pipeline tok2vec, tagger, parser, attribute_ruler, lemmatizer, ner
Components tok2vec, tagger, parser, senter, attribute_ruler, lemmatizer, ner
Vectors 514157 keys, 514157 unique vectors (300 dimensions)
Sources OntoNotes 5 (Ralph Weischedel, Martha Palmer, Mitchell Marcus, Eduard Hovy, Sameer Pradhan, Lance Ramshaw, Nianwen Xue, Ann Taylor, Jeff Kaufman, Michelle Franchini, Mohammed El-Bachouti, Robert Belvin, Ann Houston); ClearNLP Constituent-to-Dependency Conversion (Emory University); WordNet 3.0 (Princeton University); Explosion Vectors (OSCAR 2109 + Wikipedia + OpenSubtitles + WMT News Crawl) (Explosion)
License MIT
Author Explosion
Model size 560 MB

Label Scheme

View label scheme (113 labels for 3 components)
Component Labels
tagger $, '', ,, -LRB-, -RRB-, ., :, ADD, AFX, CC, CD, DT, EX, FW, HYPH, IN, JJ, JJR, JJS, LS, MD, NFP, NN, NNP, NNPS, NNS, PDT, POS, PRP, PRP$, RB, RBR, RBS, RP, SYM, TO, UH, VB, VBD, VBG, VBN, VBP, VBZ, WDT, WP, WP$, WRB, XX, _SP, ``
parser ROOT, acl, acomp, advcl, advmod, agent, amod, appos, attr, aux, auxpass, case, cc, ccomp, compound, conj, csubj, csubjpass, dative, dep, det, dobj, expl, intj, mark, meta, neg, nmod, npadvmod, nsubj, nsubjpass, nummod, oprd, parataxis, pcomp, pobj, poss, preconj, predet, prep, prt, punct, quantmod, relcl, xcomp
ner CARDINAL, DATE, EVENT, FAC, GPE, LANGUAGE, LAW, LOC, MONEY, NORP, ORDINAL, ORG, PERCENT, PERSON, PRODUCT, QUANTITY, TIME, WORK_OF_ART

Accuracy

Type Score
TOKEN_ACC 99.86
TOKEN_P 99.57
TOKEN_R 99.58
TOKEN_F 99.57
TAG_ACC 97.35
SENTS_P 92.19
SENTS_R 89.27
SENTS_F 90.71
DEP_UAS 92.08
DEP_LAS 90.27
ENTS_P 85.16
ENTS_R 85.70
ENTS_F 85.43

Installation

pip install spacy
python -m spacy download en_core_web_lg

zh_core_web_trf-3.7.2

01 Oct 09:30
dbe9b97

Downloads Downloads (wheel)

Checksum .tar.gz: 38857a79f6754b9427619362843c84c18e6410e7ba1f05a1d7aa1c91f7b08904
Checksum .whl: 16b8d4bf23d20a04cfcbe676ae1be2be4437b40cf8101c9f3e7f6db4674ec91d

Details: https://spacy.io/models/zh#zh_core_web_trf

Chinese transformer pipeline (Transformer(name='bert-base-chinese', piece_encoder='bert-wordpiece', stride=152, type='bert', width=768, window=208, vocab_size=21128)). Components: transformer, tagger, parser, ner, attribute_ruler.

Feature Description
Name zh_core_web_trf
Version 3.7.2
spaCy >=3.7.0,<3.8.0
Default Pipeline transformer, tagger, parser, attribute_ruler, ner
Components transformer, tagger, parser, attribute_ruler, ner
Vectors 0 keys, 0 unique vectors (0 dimensions)
Sources OntoNotes 5 (Ralph Weischedel, Martha Palmer, Mitchell Marcus, Eduard Hovy, Sameer Pradhan, Lance Ramshaw, Nianwen Xue, Ann Taylor, Jeff Kaufman, Michelle Franchini, Mohammed El-Bachouti, Robert Belvin, Ann Houston); CoreNLP Universal Dependencies Converter (Stanford NLP Group); bert-base-chinese (Hugging Face)
License MIT
Author Explosion
Model size 396 MB

Label Scheme

View label scheme (99 labels for 3 components)
Component Labels
tagger AD, AS, BA, CC, CD, CS, DEC, DEG, DER, DEV, DT, ETC, FW, IJ, INF, JJ, LB, LC, M, MSP, NN, NR, NT, OD, ON, P, PN, PU, SB, SP, URL, VA, VC, VE, VV, X
parser ROOT, acl, advcl:loc, advmod, advmod:dvp, advmod:loc, advmod:rcomp, amod, amod:ordmod, appos, aux:asp, aux:ba, aux:modal, aux:prtmod, auxpass, case, cc, ccomp, compound:nn, compound:vc, conj, cop, dep, det, discourse, dobj, etc, mark, mark:clf, name, neg, nmod, nmod:assmod, nmod:poss, nmod:prep, nmod:range, nmod:tmod, nmod:topic, nsubj, nsubj:xsubj, nsubjpass, nummod, parataxis:prnmod, punct, xcomp
ner CARDINAL, DATE, EVENT, FAC, GPE, LANGUAGE, LAW, LOC, MONEY, NORP, ORDINAL, ORG, PERCENT, PERSON, PRODUCT, QUANTITY, TIME, WORK_OF_ART

Accuracy

Type Score
TOKEN_ACC 95.85
TOKEN_P 94.58
TOKEN_R 91.36
TOKEN_F 92.94
TAG_ACC 91.75
SENTS_P 70.92
SENTS_R 67.57
SENTS_F 69.21
DEP_UAS 75.72
DEP_LAS 71.45
ENTS_P 76.09
ENTS_R 72.18
ENTS_F 74.08

Installation

pip install spacy
python -m spacy download zh_core_web_trf

zh_core_web_sm-3.7.0

01 Oct 08:58
dbe9b97

Downloads Downloads (wheel)

Checksum .tar.gz: c22fe1cb9a0479a297d24d33641592436d1b68385c9bbd750ea20e84c4273ef5
Checksum .whl: f51075665749e07406d629d1055ce5a68635fae6ab3c34257ee798c62b4fc431

Details: https://spacy.io/models/zh#zh_core_web_sm

Chinese pipeline optimized for CPU. Components: tok2vec, tagger, parser, senter, ner, attribute_ruler.

Feature Description
Name zh_core_web_sm
Version 3.7.0
spaCy >=3.7.0,<3.8.0
Default Pipeline tok2vec, tagger, parser, attribute_ruler, ner
Components tok2vec, tagger, parser, senter, attribute_ruler, ner
Vectors 0 keys, 0 unique vectors (0 dimensions)
Sources OntoNotes 5 (Ralph Weischedel, Martha Palmer, Mitchell Marcus, Eduard Hovy, Sameer Pradhan, Lance Ramshaw, Nianwen Xue, Ann Taylor, Jeff Kaufman, Michelle Franchini, Mohammed El-Bachouti, Robert Belvin, Ann Houston); CoreNLP Universal Dependencies Converter (Stanford NLP Group)
License MIT
Author Explosion
Model size 46 MB

Label Scheme

View label scheme (100 labels for 3 components)
Component Labels
tagger AD, AS, BA, CC, CD, CS, DEC, DEG, DER, DEV, DT, ETC, FW, IJ, INF, JJ, LB, LC, M, MSP, NN, NR, NT, OD, ON, P, PN, PU, SB, SP, URL, VA, VC, VE, VV, X, _SP
parser ROOT, acl, advcl:loc, advmod, advmod:dvp, advmod:loc, advmod:rcomp, amod, amod:ordmod, appos, aux:asp, aux:ba, aux:modal, aux:prtmod, auxpass, case, cc, ccomp, compound:nn, compound:vc, conj, cop, dep, det, discourse, dobj, etc, mark, mark:clf, name, neg, nmod, nmod:assmod, nmod:poss, nmod:prep, nmod:range, nmod:tmod, nmod:topic, nsubj, nsubj:xsubj, nsubjpass, nummod, parataxis:prnmod, punct, xcomp
ner CARDINAL, DATE, EVENT, FAC, GPE, LANGUAGE, LAW, LOC, MONEY, NORP, ORDINAL, ORG, PERCENT, PERSON, PRODUCT, QUANTITY, TIME, WORK_OF_ART

Accuracy

Type Score
TOKEN_ACC 95.85
TOKEN_P 94.58
TOKEN_R 91.36
TOKEN_F 92.94
TAG_ACC 89.33
SENTS_P 77.85
SENTS_R 72.62
SENTS_F 75.14
DEP_UAS 69.60
DEP_LAS 64.08
ENTS_P 72.03
ENTS_R 64.93
ENTS_F 68.30

Installation

pip install spacy
python -m spacy download zh_core_web_sm

zh_core_web_md-3.7.0

01 Oct 08:58
dbe9b97

Downloads Downloads (wheel)

Checksum .tar.gz: 920cf2f7e8db666f22d52b763ff76cf9eeac2c7e6dbc00f5e99ed543ba7da50e
Checksum .whl: a528dbbcf7f323718be4b523559840dc850303046e25a62f9a1049b7ab9f9e68

Details: https://spacy.io/models/zh#zh_core_web_md

Chinese pipeline optimized for CPU. Components: tok2vec, tagger, parser, senter, ner, attribute_ruler.

Feature Description
Name zh_core_web_md
Version 3.7.0
spaCy >=3.7.0,<3.8.0
Default Pipeline tok2vec, tagger, parser, attribute_ruler, ner
Components tok2vec, tagger, parser, senter, attribute_ruler, ner
Vectors 500000 keys, 20000 unique vectors (300 dimensions)
Sources OntoNotes 5 (Ralph Weischedel, Martha Palmer, Mitchell Marcus, Eduard Hovy, Sameer Pradhan, Lance Ramshaw, Nianwen Xue, Ann Taylor, Jeff Kaufman, Michelle Franchini, Mohammed El-Bachouti, Robert Belvin, Ann Houston); CoreNLP Universal Dependencies Converter (Stanford NLP Group); Explosion fastText Vectors (cbow, OSCAR Common Crawl + Wikipedia) (Explosion)
License MIT
Author Explosion
Model size 74 MB

Label Scheme

View label scheme (100 labels for 3 components)
Component Labels
tagger AD, AS, BA, CC, CD, CS, DEC, DEG, DER, DEV, DT, ETC, FW, IJ, INF, JJ, LB, LC, M, MSP, NN, NR, NT, OD, ON, P, PN, PU, SB, SP, URL, VA, VC, VE, VV, X, _SP
parser ROOT, acl, advcl:loc, advmod, advmod:dvp, advmod:loc, advmod:rcomp, amod, amod:ordmod, appos, aux:asp, aux:ba, aux:modal, aux:prtmod, auxpass, case, cc, ccomp, compound:nn, compound:vc, conj, cop, dep, det, discourse, dobj, etc, mark, mark:clf, name, neg, nmod, nmod:assmod, nmod:poss, nmod:prep, nmod:range, nmod:tmod, nmod:topic, nsubj, nsubj:xsubj, nsubjpass, nummod, parataxis:prnmod, punct, xcomp
ner CARDINAL, DATE, EVENT, FAC, GPE, LANGUAGE, LAW, LOC, MONEY, NORP, ORDINAL, ORG, PERCENT, PERSON, PRODUCT, QUANTITY, TIME, WORK_OF_ART

Accuracy

Type Score
TOKEN_ACC 95.85
TOKEN_P 94.58
TOKEN_R 91.36
TOKEN_F 92.94
TAG_ACC 90.04
SENTS_P 78.89
SENTS_R 72.80
SENTS_F 75.72
DEP_UAS 70.50
DEP_LAS 65.22
ENTS_P 71.88
ENTS_R 67.90
ENTS_F 69.83

Installation

pip install spacy
python -m spacy download zh_core_web_md

zh_core_web_lg-3.7.0

01 Oct 08:58
dbe9b97

Downloads Downloads (wheel)

Checksum .tar.gz: 0a07048baf3e73f22b16a7edac47f97632772c7a05ebf1bcc51ab458f0670dcf
Checksum .whl: 6bfd1796788dc27c0f5e0cc43374eb96abe0b4f0ec1b29f19f5782051216c556

Details: https://spacy.io/models/zh#zh_core_web_lg

Chinese pipeline optimized for CPU. Components: tok2vec, tagger, parser, senter, ner, attribute_ruler.

Feature Description
Name zh_core_web_lg
Version 3.7.0
spaCy >=3.7.0,<3.8.0
Default Pipeline tok2vec, tagger, parser, attribute_ruler, ner
Components tok2vec, tagger, parser, senter, attribute_ruler, ner
Vectors 500000 keys, 500000 unique vectors (300 dimensions)
Sources OntoNotes 5 (Ralph Weischedel, Martha Palmer, Mitchell Marcus, Eduard Hovy, Sameer Pradhan, Lance Ramshaw, Nianwen Xue, Ann Taylor, Jeff Kaufman, Michelle Franchini, Mohammed El-Bachouti, Robert Belvin, Ann Houston); CoreNLP Universal Dependencies Converter (Stanford NLP Group); Explosion fastText Vectors (cbow, OSCAR Common Crawl + Wikipedia) (Explosion)
License MIT
Author Explosion
Model size 575 MB

Label Scheme

View label scheme (100 labels for 3 components)
Component Labels
tagger AD, AS, BA, CC, CD, CS, DEC, DEG, DER, DEV, DT, ETC, FW, IJ, INF, JJ, LB, LC, M, MSP, NN, NR, NT, OD, ON, P, PN, PU, SB, SP, URL, VA, VC, VE, VV, X, _SP
parser ROOT, acl, advcl:loc, advmod, advmod:dvp, advmod:loc, advmod:rcomp, amod, amod:ordmod, appos, aux:asp, aux:ba, aux:modal, aux:prtmod, auxpass, case, cc, ccomp, compound:nn, compound:vc, conj, cop, dep, det, discourse, dobj, etc, mark, mark:clf, name, neg, nmod, nmod:assmod, nmod:poss, nmod:prep, nmod:range, nmod:tmod, nmod:topic, nsubj, nsubj:xsubj, nsubjpass, nummod, parataxis:prnmod, punct, xcomp
ner CARDINAL, DATE, EVENT, FAC, GPE, LANGUAGE, LAW, LOC, MONEY, NORP, ORDINAL, ORG, PERCENT, PERSON, PRODUCT, QUANTITY, TIME, WORK_OF_ART

Accuracy

Type Score
TOKEN_ACC 95.85
TOKEN_P 94.58
TOKEN_R 91.36
TOKEN_F 92.94
TAG_ACC 90.33
SENTS_P 78.05
SENTS_R 72.63
SENTS_F 75.24
DEP_UAS 70.86
DEP_LAS 65.71
ENTS_P 73.55
ENTS_R 69.25
ENTS_F 71.34

Installation

pip install spacy
python -m spacy download zh_core_web_lg

xx_sent_ud_sm-3.7.0

01 Oct 08:58
dbe9b97

Downloads Downloads (wheel)

Checksum .tar.gz: fc769f274ad087e1ee3042d671a5487714a885d2a0fba5baea56cd5a6b23cc8d
Checksum .whl: aafb609d5a895a62ed9672fbef2aa8061106a4b164a700999a376f8529acc3ad

Details: https://spacy.io/models/xx#xx_sent_ud_sm

Multi-language pipeline optimized for CPU. Components: senter.

Feature Description
Name xx_sent_ud_sm
Version 3.7.0
spaCy >=3.7.0,<3.8.0
Default Pipeline senter
Components senter
Vectors 0 keys, 0 unique vectors (0 dimensions)
Sources Universal Dependencies v2.8 (UD_Afrikaans-AfriBooms, UD_Croatian-SET, UD_Czech-CAC, UD_Czech-CLTT, UD_Danish-DDT, UD_Dutch-Alpino, UD_Dutch-LassySmall, UD_English-EWT, UD_Finnish-FTB, UD_Finnish-TDT, UD_French-GSD, UD_French-Spoken, UD_German-GSD, UD_Indonesian-GSD, UD_Irish-IDT, UD_Italian-TWITTIRO, UD_Korean-GSD, UD_Korean-Kaist, UD_Latvian-LVTB, UD_Lithuanian-ALKSNIS, UD_Lithuanian-HSE, UD_Marathi-UFAL, UD_Norwegian-Bokmaal, UD_Norwegian-Nynorsk, UD_Norwegian-NynorskLIA, UD_Persian-Seraji, UD_Portuguese-Bosque, UD_Portuguese-GSD, UD_Romanian-Nonstandard, UD_Romanian-RRT, UD_Russian-GSD, UD_Russian-Taiga, UD_Serbian-SET, UD_Slovak-SNK, UD_Spanish-GSD, UD_Swedish-Talbanken, UD_Telugu-MTG, UD_Vietnamese-VTB) (Zeman, Daniel; Nivre, Joakim; Abrams, Mitchell; et al.)
License CC BY-SA 3.0
Author Explosion
Model size 4 MB

Label Scheme

Accuracy

Type Score
TOKEN_ACC 98.59
TOKEN_P 95.31
TOKEN_R 95.72
TOKEN_F 95.52
SENTS_P 90.66
SENTS_R 81.58
SENTS_F 85.88

Installation

pip install spacy
python -m spacy download xx_sent_ud_sm

xx_ent_wiki_sm-3.7.0

01 Oct 08:58
dbe9b97

Downloads Downloads (wheel)

Checksum .tar.gz: 96e9c622429d34c08127aca1689fb5c5c557bbd3027c4a5a655874dd915206cc
Checksum .whl: 66c227a793f8a79814d6ca1da7c0ae633172e2fb0a94737bc8bd2e517479e73c

Details: https://spacy.io/models/xx#xx_ent_wiki_sm

Multi-language pipeline optimized for CPU. Components: ner.

Feature Description
Name xx_ent_wiki_sm
Version 3.7.0
spaCy >=3.7.0,<3.8.0
Default Pipeline ner
Components ner
Vectors 0 keys, 0 unique vectors (0 dimensions)
Sources WikiNER (Joel Nothman, Nicky Ringland, Will Radford, Tara Murphy, James R Curran)
License MIT
Author Explosion
Model size 10 MB

Label Scheme

View label scheme (4 labels for 1 component)
Component Labels
ner LOC, MISC, ORG, PER

Accuracy

Type Score
ENTS_P 83.53
ENTS_R 82.65
ENTS_F 83.08

Installation

pip install spacy
python -m spacy download xx_ent_wiki_sm