
CLI benchmark accuracy doesn't save rendered displacy htmls #12566

Closed
jamnicki opened this issue Apr 23, 2023 · 3 comments · Fixed by #12575
Labels: bug (Bugs and behaviour differing from documentation), feat / cli (Feature: Command-line interface), feat / spancat (Feature: Span Categorizer)

Comments

jamnicki commented Apr 23, 2023

The accuracy benchmark of my model does not save the rendered displaCy HTML files as requested. The benchmark itself works, which is why I'm confused. The model contains only transformer and spancat components. Is spancat not yet supported? 😞

The DocBin does not contain any empty docs.
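As a possible workaround until this is fixed, span annotations can be rendered by hand with displaCy's "span" style in manual mode (available since spaCy 3.3), bypassing `--displacy-path` entirely. A sketch using the tokens and spans from the `doc.to_json()` dump further down in this report; the variable names are mine:

```python
from spacy import displacy

# Manual-mode input for style="span": raw text, token strings, and spans
# given as token indices (start_token inclusive, end_token exclusive).
parsed = {
    "text": ("Niedawno czytał em nową książkę znakomitego szkockiego "
             "medioznawcy , Briana McNaira - Cultural Chaos ."),
    "tokens": ["Niedawno", "czytał", "em", "nową", "książkę", "znakomitego",
               "szkockiego", "medioznawcy", ",", "Briana", "McNaira", "-",
               "Cultural", "Chaos", "."],
    "spans": [
        {"start_token": 6, "end_token": 7, "label": "nam_adj_country"},
        {"start_token": 9, "end_token": 11, "label": "nam_liv_person"},
        {"start_token": 12, "end_token": 14, "label": "nam_pro_title_book"},
    ],
}

# Returns an HTML string that can be written to any output path.
html = displacy.render(parsed, style="span", manual=True)
```

The returned string can then be saved with an ordinary `open(path, "w").write(html)`.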

CLI output:

$ python -m spacy benchmark accuracy data/models/pl_spancat_acc/model-best/ data/test.spacy --output results/spacy/metrics.json --gpu-id 0 --displacy-path results/spacy/benchmark_acc_test_displacy
ℹ Using GPU: 0

================================== Results ==================================

TOK      100.00
SPAN P   79.31
SPAN R   54.19
SPAN F   64.38
SPEED    3752


============================== SPANS (per type) ==============================

                                P        R       F
nam_loc_gpe_city            77.29    74.42   75.83
nam_pro_software            82.35    36.84   50.91
nam_org_institution         63.11    50.78   56.28
nam_liv_person              87.34    82.64   84.93
nam_loc_gpe_country         95.24    85.37   90.03
 . . .
nam_pro                      0.00     0.00    0.00

✔ Generated 25 parses as HTML
results/spacy/benchmark_acc_test_displacy
✔ Saved results to
results/spacy/benchmark_acc_test_metrics.json
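As a sanity check on the summary table, SPAN F is the harmonic mean of SPAN P and SPAN R. A stdlib-only recomputation (from the rounded P/R shown above, so it only agrees with the reported 64.38 to within rounding, since spaCy computes F from unrounded counts):

```python
# SPAN P and SPAN R as reported in the benchmark output above.
p, r = 79.31, 54.19

# Harmonic mean: F = 2PR / (P + R).
f = 2 * p * r / (p + r)
print(f"{f:.2f}")
```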

Random doc.to_json() from test DocBin:

{'ents': [{'end': 54, 'label': 'nam_adj_country', 'start': 44},
          {'end': 83, 'label': 'nam_liv_person', 'start': 69},
          {'end': 100, 'label': 'nam_pro_title_book', 'start': 86}],
 'spans': {'sc': [{'end': 54,
                   'kb_id': '',
                   'label': 'nam_adj_country',
                   'start': 44},
                  {'end': 83,
                   'kb_id': '',
                   'label': 'nam_liv_person',
                   'start': 69},
                  {'end': 100,
                   'kb_id': '',
                   'label': 'nam_pro_title_book',
                   'start': 86}]},
 'text': 'Niedawno czytał em nową książkę znakomitego szkockiego medioznawcy , '
         'Briana McNaira - Cultural Chaos .',
 'tokens': [{'end': 8, 'id': 0, 'start': 0},
            {'end': 15, 'id': 1, 'start': 9},
            {'end': 18, 'id': 2, 'start': 16},
            {'end': 23, 'id': 3, 'start': 19},
            {'end': 31, 'id': 4, 'start': 24},
            {'end': 43, 'id': 5, 'start': 32},
            {'end': 54, 'id': 6, 'start': 44},
            {'end': 66, 'id': 7, 'start': 55},
            {'end': 68, 'id': 8, 'start': 67},
            {'end': 75, 'id': 9, 'start': 69},
            {'end': 83, 'id': 10, 'start': 76},
            {'end': 85, 'id': 11, 'start': 84},
            {'end': 94, 'id': 12, 'start': 86},
            {'end': 100, 'id': 13, 'start': 95},
            {'end': 102, 'id': 14, 'start': 101}]}
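For reference, the `start`/`end` values in the dump above are character offsets into `doc.text`, not token indices. A stdlib-only check that the `"sc"` spans line up with their surface forms:

```python
# Text and span offsets copied from the doc.to_json() dump above.
text = ("Niedawno czytał em nową książkę znakomitego szkockiego medioznawcy , "
        "Briana McNaira - Cultural Chaos .")

spans = [
    (44, 54, "nam_adj_country"),
    (69, 83, "nam_liv_person"),
    (86, 100, "nam_pro_title_book"),
]

# Slicing the raw text with each (start, end) pair recovers the span surface.
for start, end, label in spans:
    print(f"{label}: {text[start:end]!r}")
```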
Model config

[paths]
train = null
dev = null
vectors = null
init_tok2vec = null

[system]
gpu_allocator = "pytorch"
seed = 0

[nlp]
lang = "pl"
pipeline = ["transformer","spancat"]
batch_size = 128
disabled = []
before_creation = null
after_creation = null
after_pipeline_creation = null
tokenizer = {"@tokenizers":"spacy.Tokenizer.v1"}

[components]

[components.spancat]
factory = "spancat"
max_positive = null
scorer = {"@scorers":"spacy.spancat_scorer.v1"}
spans_key = "sc"
threshold = 0.5

[components.spancat.model]
@architectures = "spacy.SpanCategorizer.v1"

[components.spancat.model.reducer]
@layers = "spacy.mean_max_reducer.v1"
hidden_size = 128

[components.spancat.model.scorer]
@layers = "spacy.LinearLogistic.v1"
nO = null
nI = null

[components.spancat.model.tok2vec]
@architectures = "spacy-transformers.TransformerListener.v1"
grad_factor = 1.0
pooling = {"@layers":"reduce_mean.v1"}
upstream = "*"

[components.spancat.suggester]
@misc = "spacy.ngram_suggester.v1"
sizes = [1,2,3]

[components.transformer]
factory = "transformer"
max_batch_items = 4096
set_extra_annotations = {"@annotation_setters":"spacy-transformers.null_annotation_setter.v1"}

[components.transformer.model]
@architectures = "spacy-transformers.TransformerModel.v3"
name = "dkleczek/bert-base-polish-cased-v1"
mixed_precision = false

[components.transformer.model.get_spans]
@span_getters = "spacy-transformers.strided_spans.v1"
window = 128
stride = 96

[components.transformer.model.grad_scaler_config]

[components.transformer.model.tokenizer_config]
use_fast = true

[components.transformer.model.transformer_config]

[corpora]

[corpora.dev]
@readers = "spacy.Corpus.v1"
path = ${paths.dev}
max_length = 0
gold_preproc = false
limit = 0
augmenter = null

[corpora.train]
@readers = "spacy.Corpus.v1"
path = ${paths.train}
max_length = 0
gold_preproc = false
limit = 0
augmenter = null

[training]
accumulate_gradient = 3
dev_corpus = "corpora.dev"
train_corpus = "corpora.train"
seed = ${system.seed}
gpu_allocator = ${system.gpu_allocator}
dropout = 0.1
patience = 1600
max_epochs = 0
max_steps = 20000
eval_frequency = 200
frozen_components = []
annotating_components = []
before_to_disk = null
before_update = null

[training.batcher]
@batchers = "spacy.batch_by_padded.v1"
discard_oversize = true
size = 2000
buffer = 256
get_length = null

[training.logger]
@loggers = "spacy.ConsoleLogger.v1"
progress_bar = false

[training.optimizer]
@optimizers = "Adam.v1"
beta1 = 0.9
beta2 = 0.999
L2_is_weight_decay = true
L2 = 0.01
grad_clip = 1.0
use_averages = false
eps = 0.00000001

[training.optimizer.learn_rate]
@schedules = "warmup_linear.v1"
warmup_steps = 250
total_steps = 20000
initial_rate = 0.00005

[training.score_weights]
spans_sc_f = 1.0
spans_sc_p = 0.0
spans_sc_r = 0.0

[pretraining]

[initialize]
vectors = ${paths.vectors}
init_tok2vec = ${paths.init_tok2vec}
vocab_data = null
lookups = null
before_init = null
after_init = null

[initialize.components]

[initialize.tokenizer]

rmitsch added the feat / cli (Feature: Command-line interface) and feat / spancat (Feature: Span Categorizer) labels Apr 24, 2023
rmitsch added the bug (Bugs and behaviour differing from documentation) label Apr 26, 2023
rmitsch (Contributor) commented Apr 26, 2023

Hi @jamnicki, thanks for bringing this up! We are working on a fix. We'll update here once this has been resolved.

svlandeg linked a pull request Apr 26, 2023 that will close this issue
rmitsch (Contributor) commented Apr 28, 2023

@jamnicki The fix has been merged, so the next spaCy version should include this.

github-actions (bot)

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

github-actions bot locked as resolved and limited conversation to collaborators May 29, 2023