Multi-class classification implementation #329

MortenHolmRep · 2022-10-27T17:53:09Z

Implementation of multi-class classification predictions as per #111, where the new operational structure would be
Raw Data -> [noise, muon, neutrino]

In the current implementation, this could also easily be expanded to
Raw Data -> [noise, muon, nu_e, nu_mu, nu_tau]
However, a two-step classification is probably preferable for neutrino classification: first classify on [noise, muon, neutrino], then train a second, neutrino-specific task on the identified neutrinos to separate [nu_e, nu_mu, nu_tau].
Comments on the approach, or suggested alterations to this multi-class classification implementation, are welcome.
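
To make the two-step idea concrete, here is a minimal sketch in plain Python. It is not code from this PR: `two_step_classify`, `coarse_model` and `flavour_model` are hypothetical names, and the two models are assumed to be callables that return one score per class.

```python
from typing import Callable, Sequence

# Class orderings are assumptions made for this illustration only.
COARSE_CLASSES = ["noise", "muon", "neutrino"]
FLAVOUR_CLASSES = ["nu_e", "nu_mu", "nu_tau"]


def argmax(scores: Sequence[float]) -> int:
    """Return the index of the largest score."""
    return max(range(len(scores)), key=lambda i: scores[i])


def two_step_classify(
    event: object,
    coarse_model: Callable[[object], Sequence[float]],
    flavour_model: Callable[[object], Sequence[float]],
) -> str:
    """Classify an event, refining only events first labelled as neutrinos."""
    coarse_label = COARSE_CLASSES[argmax(coarse_model(event))]
    if coarse_label != "neutrino":
        return coarse_label
    # Second step: only reached for events identified as neutrinos.
    return FLAVOUR_CLASSES[argmax(flavour_model(event))]


# Stand-in models returning fixed scores, purely for demonstration.
def demo_coarse(event: object) -> Sequence[float]:
    return [0.1, 0.2, 0.7]


def demo_flavour(event: object) -> Sequence[float]:
    return [0.2, 0.5, 0.3]


label = two_step_classify({"event_no": 1}, demo_coarse, demo_flavour)
```

With these stand-in scores the first step selects "neutrino" and the second step selects "nu_mu". In practice the second model would be trained only on events that the first step identified as neutrinos, as described above.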

Training with the task using examples/train_model.py works with the following alterations, where the target is pid:

task = MulticlassificationTask(
    hidden_size=gnn.nb_outputs,
    target_labels=config["target"],
    loss_function=MultiClassificationCrossEntropyLoss(),
)

and, for the results:

results = get_predictions(
    trainer,
    model,
    validation_dataloader,
    [
        config["target"] + "_noise_pred",
        config["target"] + "_muon_pred",
        config["target"] + "_neutrino_pred",
    ],
    additional_attributes=[config["target"], "event_no"],
)
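
The prediction column names above follow a fixed pattern, so they could also be built from a list of class names instead of being written out by hand. The sketch below assumes a hypothetical helper, `prediction_columns`, that is not part of the PR or of graphnet.

```python
from typing import List, Sequence


def prediction_columns(target: str, class_names: Sequence[str]) -> List[str]:
    """Build one '<target>_<class>_pred' column name per class."""
    return [f"{target}_{name}_pred" for name in class_names]


# Example with the three classes used in this PR description.
columns = prediction_columns("pid", ["noise", "muon", "neutrino"])
```

The resulting list could then be passed where the hand-written list appears in the `get_predictions` call, which keeps the column names in step with the class list if more classes are added.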

Implementation by @Peterandresen12 and me.

Peterandresen12

Looks good to me :-)

asogaard

Hi @MortenHolmRep and @Peterandresen12,

Thanks for this PR! It would make a nice addition to the repo. 🙂

I have added a few suggestions and comments that center on the fact that, currently, the proposed task and loss function hard-code assumptions about the number of classes and the PIDs of the particles being classified. We should try to be more general if we want to implement generic multi-class classification.
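
One way to avoid hard-coding the classes, sketched here in plain Python rather than taken from the PR, is to pass a user-supplied mapping from PID to class index and derive the number of classes from it. The name `class_options` is borrowed from a later commit message in this PR, but its structure here is an assumption, as are the PID values (PDG-style codes, with 1 standing in for noise) and the helpers `nb_classes` and `cross_entropy`.

```python
import math
from typing import Dict, Sequence

# Hypothetical mapping from PID to class index; values are assumptions.
class_options: Dict[int, int] = {
    1: 0, -1: 0,      # noise
    13: 1, -13: 1,    # muon
    12: 2, -12: 2,    # neutrinos, all flavours in one class
    14: 2, -14: 2,
    16: 2, -16: 2,
}


def nb_classes(options: Dict[int, int]) -> int:
    """Number of distinct classes implied by the mapping."""
    return len(set(options.values()))


def cross_entropy(
    logits: Sequence[float], pid: int, options: Dict[int, int]
) -> float:
    """Cross-entropy for one event, with the target class looked up by PID.

    Assumes class indices are contiguous integers starting at 0.
    """
    if len(logits) != nb_classes(options):
        raise ValueError("Expected one logit per class.")
    target = options[pid]
    # Numerically stable log-sum-exp for the softmax normalisation.
    largest = max(logits)
    log_norm = largest + math.log(sum(math.exp(x - largest) for x in logits))
    return log_norm - logits[target]


loss = cross_entropy([2.0, 0.5, -1.0], pid=13, options=class_options)
```

Changing the classification problem, for example splitting neutrinos into three flavours, then only requires a different mapping; the loss and the number of outputs follow from it.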

src/graphnet/models/task/reconstruction.py

src/graphnet/training/loss_functions.py

MortenHolmRep · 2022-10-28T10:32:54Z

I will look into a rework tonight, based on the suggestions 👍

asogaard · 2022-11-01T07:56:59Z

Hi @MortenHolmRep,

Thanks for the quick iteration! :) I have added comments to a few threads. Please let me know if there's anything else you'd like to discuss there. Otherwise, feel free to re-request a review once you think the code in the PR is ready for a second look. :)

MortenHolmRep · 2022-11-01T09:00:36Z

Thanks for your review @asogaard! I have added some comments for now. The suggestion you made I'll have to test later today.

asogaard · 2022-12-01T13:34:38Z

@asogaard Can you review the PR again? I have attempted to clean up the code and config and to remove unused packages. The generic classification now works and produces good results. The build fails for the LogCMK class in loss_functions, which is unrelated to my implementation; I do not know why I am getting this error.

Regarding the failing unit test: As far as I can tell, the fact that we change the assertion logic regarding transform_target/transform_inference will affect (some of) the unit test(s) that check for this. This is perfectly OK — we will just have to update the unit tests to reflect this new logic. :) I can have a look at the unit test results once you've had a chance to look at the comments above.

MortenHolmRep · 2022-12-04T14:58:44Z

I have added a script called "train_classification_model_without_configs.py" that follows the old structure and a new script called "train_classification_model.py" that should adhere to the new structure.
The new setup with configs is not finished, as I need input on how to set it up properly.

asogaard

Hi @MortenHolmRep,

I think this looks great! If it runs as expected, I am all for merging this. There is still the problem of the failing unit test that needs to be resolved. I think it is just a matter of deleting this block as it no longer represents expected behaviour. :)

Peterandresen12 · 2022-12-06T08:26:42Z

Well done!

Morten Holm added 2 commits October 27, 2022 13:05

Multiclass classification implementation

ade7d47

assigned tensor to device

f40ae53

MortenHolmRep added the "feature" (New feature or request) label Oct 27, 2022

MortenHolmRep added this to the v1.0.0 / Features milestone Oct 27, 2022

MortenHolmRep requested review from asogaard and Peterandresen12 October 27, 2022 17:53

Peterandresen12 approved these changes Oct 28, 2022

asogaard removed this from the v1.0.0 / Features milestone Oct 28, 2022

asogaard requested changes Oct 28, 2022

asogaard linked an issue Oct 28, 2022 that may be closed by this pull request

Implement multi-class classification #111

Closed

asogaard assigned MortenHolmRep Oct 28, 2022

MortenHolmRep and others added 6 commits October 28, 2022 19:39

Update src/graphnet/training/loss_functions.py

04a6cd4

device inheritance Co-authored-by: Andreas Søgaard <andreas.sogaard@gmail.com>

Update src/graphnet/training/loss_functions.py

afecd26

loss function renaming Co-authored-by: Andreas Søgaard <andreas.sogaard@gmail.com>

Update src/graphnet/training/loss_functions.py

b4725f4

description reformulation Co-authored-by: Andreas Søgaard <andreas.sogaard@gmail.com>

Update src/graphnet/training/loss_functions.py

62a0535

trimming description Co-authored-by: Andreas Søgaard <andreas.sogaard@gmail.com>

Update src/graphnet/training/loss_functions.py

5ed1076

description reformulation Co-authored-by: Andreas Søgaard <andreas.sogaard@gmail.com>

Update src/graphnet/models/task/reconstruction.py

36ea9e9

dynamical multiclass classification Co-authored-by: Andreas Søgaard <andreas.sogaard@gmail.com>

MortenHolmRep assigned asogaard Oct 29, 2022

Morten Holm added 4 commits October 31, 2022 09:37

move dependencies to import and line reductions

802e38e

save config import

54e4b48

Blacken

2c05171

rearrange

2c9511d

asogaard removed their assignment Nov 1, 2022

MortenHolmRep assigned asogaard Nov 1, 2022

asogaard assigned MortenHolmRep and unassigned asogaard Dec 1, 2022

MortenHolmRep and others added 3 commits December 1, 2022 14:14

Update src/graphnet/models/task/classification.py

22f7e01

Co-authored-by: Andreas Søgaard <andreas.sogaard@gmail.com>

Update src/graphnet/models/task/classification.py

2a04a7b

Co-authored-by: Andreas Søgaard <andreas.sogaard@gmail.com>

Update src/graphnet/training/utils.py

bc5fd6c

Co-authored-by: Andreas Søgaard <andreas.sogaard@gmail.com>

MortenHolmRep and others added 17 commits December 3, 2022 14:25

Update tests/training/test_loss_functions.py

2e9d998

Co-authored-by: Andreas Søgaard <andreas.sogaard@gmail.com>

Update src/graphnet/training/loss_functions.py

f4df979

unused import Co-authored-by: Andreas Søgaard <andreas.sogaard@gmail.com>

Update src/graphnet/training/loss_functions.py

d939b70

Co-authored-by: Andreas Søgaard <andreas.sogaard@gmail.com>

Update src/graphnet/models/task/task.py

2227cec

Co-authored-by: Andreas Søgaard <andreas.sogaard@gmail.com>

Update src/graphnet/models/task/classification.py

64f0306

Co-authored-by: Andreas Søgaard <andreas.sogaard@gmail.com>

fix indentation

b6812f5

Add IdentityTask to init

c204661

Fix imports

3a8f3d3

Restore accidentally deleted function.

00fab12

example without config

77fef84

fix import

669a199

reconfigure example classification yaml files

864a068

restructure to fit new reduced training example

ab7b3a0

reorder imports and fix variable bug

c73a7f4

fix variable input

66f9b2c

add class_options to model

e69c0bd

simplify user input

ba5c9e3

MortenHolmRep assigned asogaard Dec 5, 2022

asogaard approved these changes Dec 6, 2022

asogaard removed their assignment Dec 6, 2022

Peterandresen12 merged commit b77b25c into graphnet-team:main Dec 6, 2022

RasmusOrsoe pushed a commit to RasmusOrsoe/graphnet that referenced this pull request Oct 25, 2023

Merge pull request graphnet-team#329 from MortenHolmRep/multiclass

9d6537d

Multi-class classification implementation
