refactor token classification by ArneBinder · Pull Request #35 · ArneBinder/pie-modules

ArneBinder · 2024-01-12T22:14:39Z

This PR introduces several changes to the token-classification-based labeled span extraction setups. In particular, metric setup now happens in the task module, and span-based metrics are logged during training.

Task Modules and Metrics

WrappedMetricWithPrepareFunction: newly added

wrapper around torchmetrics.Metric that pre-processes the predictions and targets with a prepare_function
prepare_function can unbatch the predictions / targets in which case the wrapped metric is updated multiple times
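
The wrapper idea described above can be illustrated with a minimal, dependency-free sketch. It is not the code from this PR: `CountingAccuracy` is a hypothetical stand-in for a `torchmetrics.Metric`, and the names `WrappedMetricSketch`, `unbatch` and `unbatch_rows` are assumptions made for illustration only.

```python
from typing import Any, Callable, List


class CountingAccuracy:
    """Hypothetical stand-in for a metric object with update() and compute()."""

    def __init__(self) -> None:
        self.correct = 0
        self.total = 0
        self.num_updates = 0

    def update(self, prediction: List[int], target: List[int]) -> None:
        self.correct += sum(p == t for p, t in zip(prediction, target))
        self.total += len(target)
        self.num_updates += 1

    def compute(self) -> float:
        return self.correct / self.total if self.total else 0.0


class WrappedMetricSketch:
    """Pre-processes predictions and targets before updating the wrapped metric."""

    def __init__(
        self,
        metric: Any,
        prepare_function: Callable[[Any], Any],
        unbatch: bool = False,  # assumed flag: prepare_function returns one item per example
    ) -> None:
        self.metric = metric
        self.prepare_function = prepare_function
        self.unbatch = unbatch

    def update(self, predictions: Any, targets: Any) -> None:
        prepared_predictions = self.prepare_function(predictions)
        prepared_targets = self.prepare_function(targets)
        if self.unbatch:
            # One update of the wrapped metric per example in the batch.
            for prediction, target in zip(prepared_predictions, prepared_targets):
                self.metric.update(prediction, target)
        else:
            self.metric.update(prepared_predictions, prepared_targets)

    def compute(self) -> Any:
        return self.metric.compute()


def unbatch_rows(batch: List[List[int]]) -> List[List[int]]:
    """Hypothetical prepare function: split a batch into per-example lists."""
    return [list(row) for row in batch]
```

With `unbatch=True`, a batch of two sequences leads to two updates of the wrapped metric, which matches the "updated multiple times" behaviour mentioned in the description.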

PrecisionRecallAndF1ForLabeledAnnotations: make it a real torchmetrics.Metric

save tp, fn, fp as tensors in the state instead of the correct / gold / predicted annotations
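
To make the change of state concrete, here is a rough sketch of how precision, recall and F1 can be derived from accumulated tp, fp and fn counts alone. It uses plain Python integers instead of tensors, and the class name `SpanPRF1Sketch` and the `(start, end, label)` tuple representation are assumptions for illustration, not the PR's implementation.

```python
from typing import Dict, Set, Tuple

Span = Tuple[int, int, str]  # assumed representation: (start, end, label)


class SpanPRF1Sketch:
    """Keeps only the counts tp, fp, fn as state, not the annotations themselves."""

    def __init__(self) -> None:
        self.tp = 0
        self.fp = 0
        self.fn = 0

    def update(self, predicted: Set[Span], gold: Set[Span]) -> None:
        self.tp += len(predicted & gold)  # predicted and correct
        self.fp += len(predicted - gold)  # predicted but not in gold
        self.fn += len(gold - predicted)  # in gold but not predicted

    def compute(self) -> Dict[str, float]:
        precision = self.tp / (self.tp + self.fp) if (self.tp + self.fp) else 0.0
        recall = self.tp / (self.tp + self.fn) if (self.tp + self.fn) else 0.0
        denominator = precision + recall
        f1 = 2 * precision * recall / denominator if denominator else 0.0
        return {"precision": precision, "recall": recall, "f1": f1}
```

Keeping only counts means the state has a fixed size regardless of how many documents are processed, so it can be summed across batches or processes.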

LabeledSpanExtractionByTokenClassificationTaskModule: formerly TokenClassificationTaskModule

rename label_token_pad_id to label_pad_id (default unchanged: -100); this is a breaking change
unbatch_output() expects a LongTensor representing the predicted label indices (and label_pad_id on pad positions)
include special_tokens_mask in input encodings
implement configure_model_metric() that creates the following metrics:
- token metrics: micro and macro F1, without ignoring any class
- span metrics: micro, macro, and per-class Precision, Recall, and F1
add parameter log_precision_recall_metrics (default: True); set it to False to disable logging of precision and recall metrics, which is useful to avoid being overwhelmed by too many logged metrics
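
The interplay of special_tokens_mask and label_pad_id can be sketched as follows. This is only an illustration under stated assumptions: it works on nested Python lists rather than a LongTensor, and the helper names `mask_predictions` and `drop_padded_positions` are hypothetical and do not appear in the PR.

```python
from typing import List

LABEL_PAD_ID = -100  # default value of label_pad_id named in the description


def mask_predictions(
    predicted_label_ids: List[List[int]],
    special_tokens_mask: List[List[int]],
    label_pad_id: int = LABEL_PAD_ID,
) -> List[List[int]]:
    """Overwrite predictions at special-token positions with label_pad_id."""
    return [
        [label_pad_id if is_special else label for label, is_special in zip(labels, mask)]
        for labels, mask in zip(predicted_label_ids, special_tokens_mask)
    ]


def drop_padded_positions(
    label_ids: List[List[int]], label_pad_id: int = LABEL_PAD_ID
) -> List[List[int]]:
    """Split a batch into per-example label sequences without padded positions."""
    return [[label for label in labels if label != label_pad_id] for labels in label_ids]
```

For example, with a mask of `[1, 0, 0, 1]` the first and last positions of a sequence are replaced by -100 and are then skipped when the batch is split into per-example label sequences.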

Models

WithMetricsFromTaskModule: new mixin

handles metric setup, update, and logging in the models
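
A simplified picture of such a mixin, assuming one metric per stage stored in the attributes metric_train, metric_val and metric_test (as the commit messages in this PR suggest), might look like the following. Everything here is a sketch: `MeanMetric`, `DummyTaskModule`, the exact method signatures and the log key format are assumptions, and real logging would go through the training framework rather than a returned dict.

```python
from typing import Dict, Iterable, List


class MeanMetric:
    """Hypothetical stand-in metric with update(), compute() and reset()."""

    def __init__(self) -> None:
        self.values: List[float] = []

    def update(self, value: float) -> None:
        self.values.append(value)

    def compute(self) -> float:
        return sum(self.values) / len(self.values) if self.values else 0.0

    def reset(self) -> None:
        self.values.clear()


class DummyTaskModule:
    """Hypothetical task module that knows which metric fits its task."""

    def configure_model_metric(self) -> MeanMetric:
        return MeanMetric()


class WithMetricsSketch:
    """Mixin sketch: the task module creates the metrics, the model only uses them."""

    def setup_metrics(self, taskmodule: DummyTaskModule, metric_stages: Iterable[str]) -> None:
        stages = set(metric_stages)
        for stage in ("train", "val", "test"):
            metric = taskmodule.configure_model_metric() if stage in stages else None
            setattr(self, f"metric_{stage}", metric)

    def update_metric(self, stage: str, value: float) -> None:
        metric = getattr(self, f"metric_{stage}")
        if metric is not None:
            metric.update(value)

    def log_metric(self, stage: str, reset: bool = True) -> Dict[str, float]:
        metric = getattr(self, f"metric_{stage}")
        if metric is None:
            return {}
        logged = {f"metric/{stage}": metric.compute()}  # key format is an assumption
        if reset:
            metric.reset()
        return logged
```

The point of this split is that a model no longer needs task-specific metric code: it asks the task module for a metric and only forwards updates and log calls per stage.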

Model: new abstract model

contains boilerplate code that is used in most of the models
uses WithMetricsFromTaskModule

SimpleTokenClassificationModel: newly added

taken from pytorch_ie.models.TransformerTokenClassificationModel, simple wrapper around transformers.AutoModelForTokenClassification

For both models, SimpleTokenClassificationModel and TokenClassificationModelWithSeq2SeqEncoderAndCrf:

rename label_pad_token_id to label_pad_id; this is a breaking change
remove any manual metric setup/updating/logging (now handled by taskmodule.configure_model_metric() in WithMetricsFromTaskModule)
move much of the code to Model

Requires

call post_prepare() in TaskModule._from_config() pytorch-ie#399, i.e. pytorch-ie >= 0.29.7
add decode() to PyTorchIEModel pytorch-ie#402, i.e. pytorch-ie >= 0.29.8

Follow-up

upgrade pie-modules>=0.10.2 and use its token classification taskmodule argumentation-structure-identification#193
improve simple_generative model #36
separate tests for WithMetricsFromTaskModule and for DefaultModel

TODOs:

metric:
- do not prepare input in step()
- add span F1 metric
- enable macro averaging
- fix metric log keys
adjust README.md
documentation
rename ~~to LabeledSpanExtraction(TaskModule|Model)~~ the taskmodule to LabeledSpanExtractionByTokenClassificationTaskModule

codecov-commenter · 2024-01-12T22:17:39Z

Codecov Report

Attention: 20 lines in your changes are missing coverage. Please review.

Comparison is base (5f96e5a) 95.62% compared to head (8a2b890) 95.20%.

Files	Patch %	Lines
.../pie_modules/models/simple_token_classification.py	86.48%	10 Missing ⚠️
...ken_classification_with_seq2seq_encoder_and_crf.py	81.48%	10 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #35      +/-   ##
==========================================
- Coverage   95.62%   95.20%   -0.43%     
==========================================
  Files          40       41       +1     
  Lines        3316     3417     +101     
==========================================
+ Hits         3171     3253      +82     
- Misses        145      164      +19


codecov-commenter · 2024-01-13T00:12:32Z

Codecov Report

Attention: 13 lines in your changes are missing coverage. Please review.

Comparison is base (5f96e5a) 95.62% compared to head (3fd3066) 95.79%.

Files	Patch %	Lines
...ules/models/mixins/with_metrics_from_taskmodule.py	87.71%	7 Missing ⚠️
...es/metrics/wrapped_metric_with_prepare_function.py	91.11%	4 Missing ⚠️
src/pie_modules/models/model.py	97.82%	1 Missing ⚠️
.../pie_modules/models/simple_token_classification.py	97.43%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #35      +/-   ##
==========================================
+ Coverage   95.62%   95.79%   +0.16%     
==========================================
  Files          40       45       +5     
  Lines        3316     3544     +228     
==========================================
+ Hits         3171     3395     +224     
- Misses        145      149       +4


ArneBinder added 2 commits January 12, 2024 23:13

refactor TokenClassificationModelWithSeq2SeqEncoderAndCrf

9df962c

refactor TokenClassificationTaskModule

3bc11cd

ArneBinder added the refactoring and breaking labels Jan 12, 2024

add SimpleTokenClassificationModel

8a2b890

ArneBinder added 2 commits January 13, 2024 01:10

add special_tokens_mask to input encoding and use it to mask the predictions

e3cf43b

remove mock model

3e1f7eb

ArneBinder added 6 commits January 13, 2024 01:31

log and reset metric in _on_epoch_end callbacks

99bffdf

derive TokenClassificationModelWithSeq2SeqEncoderAndCrf from RequiresTaskmoduleConfig

c0f478b

call self.taskmodule.post_prepare()

9d357d4

fix masking for token wise F1 metric

635aed0

fix metric log key

8686822

F1: dont exclude any class, but use macro average

aaef134

ArneBinder mentioned this pull request Jan 14, 2024

upgrade pie-modules>=0.10.2 and use its token classification taskmodule ArneBinder/argumentation-structure-identification#193

Merged

4 tasks

ArneBinder added 14 commits January 14, 2024 22:31

use ModuleDict for metrics to have them on the correct device

ed95a1d

use metric_val, metric_test, metric_train attributes (ModuleDict can not have "train" as key)

4b86820

fix: use metric_val, metric_test, metric_train attributes (ModuleDict can not have "train" as key)

be55454

directly log metric and remove _on_epoch_end()

a72a9ba

change logging key parts order for loss

3943fed

add taskmodule_config fixtures; cleanup

c5b0263

add parameter metric_stages to SimpleTokenClassificationModel and TokenClassificationModelWithSeq2SeqEncoderAndCrf; use taskmodule_config in model tests

f9f9f5b

don't skip test_taskmodule_config() and test_batch()

a57170c

cleanup

fa956a1

use #35

099d3f6

use pytorch-ie>=0.29.7 because of ArneBinder/pytorch-ie#399

001096a

test metric computation for models

6524db8

compare metric_state as dict

af44069

add WrappedMetricWithPrepareFunction and use it in TokenClassificationTaskModule

9b1645d

ArneBinder added 27 commits January 17, 2024 01:23

cleanup

206fc82

go back to use on_(train|validation|test)_epoch_end() to log metrics

fe325ee

go back to use on_(train|validation|test)_epoch_end() to log metrics also for TokenClassificationModelWithSeq2SeqEncoderAndCrf

8010279

add prefix and return_recall_and_precision parameters to PrecisionRecallAndF1ForLabeledAnnotations

090ce32

cleanup metrics and add token/micro/f1 to TokenClassificationTaskModule metrics

00aca06

add span/macro metrics to TokenClassificationTaskModule metrics

35e8a78

mitigate downstream UserWarning ("To copy construct from a tensor, it is recommended to use...")

4c76624

add log_precision_recall_metrics parameter to TokenClassificationTaskModule

22745d4

increase test coverage

fb0728f

modularize: add setup_metrics() to models

3fb6104

outsource WithMetricsFromTaskModule

86c5a51

move more logic to WithMetricsFromTaskModule

783d189

outsource boilerplate code to DefaultModel

ec88450

move TokenClassificationTaskModule to LabeledSpanExtractionByTokenClassificationTaskModule and remove deprecated parameters

0d8f2e2

add SimpleTokenClassificationModel to README.md

e82368b

add documentation to SimpleTokenClassificationModel

4186a73

minor

80f71c4

add documentation to TokenClassificationModelWithSeq2SeqEncoderAndCrf

eebb352

move predict() from DefaultModel to WithMetricsFromTaskModule

99f8211

add documentation to WithMetricsFromTaskModule

4582a11

revert: move predict() from DefaultModel to WithMetricsFromTaskModule

08cfecd

use pytorch-ie ^0.29.8

6e7c78c

rename WithMetricsFromTaskModule._on_epoch_end() to log_metric() and add reset parameter

9d88a1f

move on_(train|validation|test)_epoch_end() from WithMetricsFromTaskModule to DefaultModel

0db127d

move default_model.DefaultModel to model.Model

3fd3066

move models.model.Model to models.common.model

d80bf31

fix spelling

19e9a83

ArneBinder merged commit 8e6d5b2 into main Jan 19, 2024

ArneBinder deleted the refactor_token_classification branch January 19, 2024 15:00

ArneBinder mentioned this pull request Jan 22, 2024

reduce boilerplate code in models #31

Open

6 tasks

ArneBinder merged 66 commits into main from refactor_token_classification