add metric mFscore #509

sshuair · 2021-04-23T15:07:12Z

This PR adds a new feature: support for the f-score, recall and precision evaluation metrics. Requested in #420.

There are three main modifications:

Add the mFscore metric, which contains three sub-metrics: f-score, precision and recall.
Refactor the return value of the functions in metrics.py from a tuple to a dict.
Change the evaluate method in datasets/custom.py to log with the prettytable package instead of terminaltables, because terminaltables is archived and no longer maintained.
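
As a rough illustration of the first two points, the sketch below computes an F-beta score from precision and recall and returns the three sub-metrics as a dict. The function names `f_score` and `class_metrics` and the explicit NaN handling are assumptions made for this example; they are not taken from the PR's code.

```python
def f_score(precision, recall, beta=1.0):
    """F-beta score from precision and recall, both given in [0, 1]."""
    denom = beta ** 2 * precision + recall
    if denom == 0:
        # Undefined when precision and recall are both zero.
        return float("nan")
    return (1 + beta ** 2) * precision * recall / denom


def class_metrics(precision, recall, beta=1.0):
    """Bundle the three sub-metrics in a dict instead of a tuple."""
    return {
        "Fscore": f_score(precision, recall, beta),
        "Precision": precision,
        "Recall": recall,
    }


# "cat" row of the first log below: Precision 29.06, Recall 63.27.
cat = class_metrics(0.2906, 0.6327)
print(round(100 * cat["Fscore"], 2))  # 39.83, matching the logged Fscore
```

Returning a dict keyed by metric name lets a caller pick out `"Fscore"` or `"Recall"` without relying on tuple position.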

And the logs look like this:

metrics: mFscore

2021-04-23 21:38:04,723 - mmseg - INFO - Loaded 1464 images
2021-04-23 21:38:06,282 - mmseg - INFO - Loaded 1449 images
2021-04-23 21:38:06,282 - mmseg - INFO - load checkpoint from fcn_hr18s_512x512_40k_voc12aug_20200614_000648-4f8d6e7f.pth
2021-04-23 21:38:06,282 - mmseg - INFO - Use load_from_local loader
2021-04-23 21:38:06,391 - mmseg - INFO - Start running, host: SENSETIME\wangshuai4@cn0214003864u, work_dir: /home/SENSETIME/wangshuai4/sensetime/mmlab-original/mmsegmentation/work_dirs/fcn_hr18s_512x512_20k_voc12aug
2021-04-23 21:38:06,391 - mmseg - INFO - workflow: [('train', 1)], max: 20000 iters
2021-04-23 21:39:47,464 - mmseg - INFO - per class results:
2021-04-23 21:39:47,465 - mmseg - INFO - 
+-------------+--------+-----------+--------+
|    Class    | Fscore | Precision | Recall |
+-------------+--------+-----------+--------+
|  background | 65.54  |   93.38   | 50.48  |
|  aeroplane  | 18.95  |   51.12   | 11.63  |
|   bicycle   |  0.37  |   29.35   |  0.18  |
|     bird    |  6.52  |   60.97   |  3.45  |
|     boat    | 10.62  |    16.4   |  7.85  |
|    bottle   |  9.14  |    46.5   |  5.07  |
|     bus     |  4.57  |   42.93   |  2.41  |
|     car     |  3.24  |    7.61   |  2.06  |
|     cat     | 39.83  |   29.06   | 63.27  |
|    chair    |  0.17  |    0.16   |  0.2   |
|     cow     | 18.06  |   27.51   | 13.44  |
| diningtable |  1.41  |   19.76   |  0.73  |
|     dog     |  39.4  |    29.2   | 60.55  |
|    horse    |  1.93  |   65.64   |  0.98  |
|  motorbike  |  1.08  |   23.88   |  0.55  |
|    person   | 32.29  |   20.72   | 73.17  |
| pottedplant |  22.2  |   22.54   | 21.87  |
|    sheep    | 25.17  |   29.76   |  21.8  |
|     sofa    |  8.28  |    4.36   | 81.81  |
|    train    | 11.72  |    11.5   | 11.96  |
|  tvmonitor  |  0.17  |    2.48   |  0.09  |
+-------------+--------+-----------+--------+
2021-04-23 21:39:47,465 - mmseg - INFO - Summary:
2021-04-23 21:39:47,465 - mmseg - INFO - 
+-------+--------+-----------+--------+
|  aAcc | Fscore | Precision | Recall |
+-------+--------+-----------+--------+
| 45.72 | 15.27  |   30.23   | 20.64  |
+-------+--------+-----------+--------+
2021-04-23 21:39:47,468 - mmseg - INFO - Iter(val) [10]	aAcc: 0.4572, mFscore: 0.1527, mPrecision: 0.3023, mRecall: 0.2064, Fscore.background: 0.6554, Fscore.aeroplane: 0.1895, Fscore.bicycle: 0.0037, Fscore.bird: 0.0652, Fscore.boat: 0.1062, Fscore.bottle: 0.0914, Fscore.bus: 0.0457, Fscore.car: 0.0324, Fscore.cat: 0.3983, Fscore.chair: 0.0017, Fscore.cow: 0.1806, Fscore.diningtable: 0.0141, Fscore.dog: 0.3940, Fscore.horse: 0.0193, Fscore.motorbike: 0.0108, Fscore.person: 0.3229, Fscore.pottedplant: 0.2220, Fscore.sheep: 0.2517, Fscore.sofa: 0.0828, Fscore.train: 0.1172, Fscore.tvmonitor: 0.0017, Precision.background: 0.9338, Precision.aeroplane: 0.5112, Precision.bicycle: 0.2935, Precision.bird: 0.6097, Precision.boat: 0.1640, Precision.bottle: 0.4650, Precision.bus: 0.4293, Precision.car: 0.0761, Precision.cat: 0.2906, Precision.chair: 0.0016, Precision.cow: 0.2751, Precision.diningtable: 0.1976, Precision.dog: 0.2920, Precision.horse: 0.6564, Precision.motorbike: 0.2388, Precision.person: 0.2072, Precision.pottedplant: 0.2254, Precision.sheep: 0.2976, Precision.sofa: 0.0436, Precision.train: 0.1150, Precision.tvmonitor: 0.0248, Recall.background: 0.5048, Recall.aeroplane: 0.1163, Recall.bicycle: 0.0018, Recall.bird: 0.0345, Recall.boat: 0.0785, Recall.bottle: 0.0507, Recall.bus: 0.0241, Recall.car: 0.0206, Recall.cat: 0.6327, Recall.chair: 0.0020, Recall.cow: 0.1344, Recall.diningtable: 0.0073, Recall.dog: 0.6055, Recall.horse: 0.0098, Recall.motorbike: 0.0055, Recall.person: 0.7317, Recall.pottedplant: 0.2187, Recall.sheep: 0.2180, Recall.sofa: 0.8181, Recall.train: 0.1196, Recall.tvmonitor: 0.0009

metrics: mIoU, mDice, mFscore

2021-04-23 21:34:50,731 - mmseg - INFO - Loaded 1464 images
2021-04-23 21:34:52,282 - mmseg - INFO - Loaded 1449 images
2021-04-23 21:34:52,282 - mmseg - INFO - load checkpoint from fcn_hr18s_512x512_40k_voc12aug_20200614_000648-4f8d6e7f.pth
2021-04-23 21:34:52,283 - mmseg - INFO - Use load_from_local loader
2021-04-23 21:34:52,369 - mmseg - INFO - Start running, host: SENSETIME\wangshuai4@cn0214003864u, work_dir: /home/SENSETIME/wangshuai4/sensetime/mmlab-original/mmsegmentation/work_dirs/fcn_hr18s_512x512_20k_voc12aug
2021-04-23 21:34:52,369 - mmseg - INFO - workflow: [('train', 1)], max: 20000 iters
2021-04-23 21:36:31,405 - mmseg - INFO - per class results:
2021-04-23 21:36:31,408 - mmseg - INFO - 
+-------------+-------+-------+-------+--------+-----------+--------+
|    Class    |  IoU  |  Acc  |  Dice | Fscore | Precision | Recall |
+-------------+-------+-------+-------+--------+-----------+--------+
|  background | 46.73 | 48.97 | 63.69 | 63.69  |   91.09   | 48.97  |
|  aeroplane  |  8.74 | 11.37 | 16.07 | 16.07  |   27.43   | 11.37  |
|   bicycle   |  2.5  |  4.81 |  4.88 |  4.88  |    4.95   |  4.81  |
|     bird    |  0.76 |  0.77 |  1.51 |  1.51  |    38.7   |  0.77  |
|     boat    |  0.0  |  0.0  |  0.0  |  nan   |    0.0    |  0.0   |
|    bottle   |  0.77 |  0.82 |  1.54 |  1.54  |   13.25   |  0.82  |
|     bus     |  1.2  |  1.24 |  2.37 |  2.37  |   29.58   |  1.24  |
|     car     |  9.13 | 21.37 | 16.73 | 16.73  |   13.74   | 21.37  |
|     cat     | 10.07 | 81.01 |  18.3 |  18.3  |   10.32   | 81.01  |
|    chair    |  0.0  |  0.0  |  0.0  |  0.0   |    0.18   |  0.0   |
|     cow     |  0.0  |  0.0  |  0.01 |  0.01  |    0.93   |  0.0   |
| diningtable |  0.0  |  0.0  |  0.0  |  nan   |    0.0    |  0.0   |
|     dog     |  4.54 | 15.24 |  8.69 |  8.69  |    6.08   | 15.24  |
|    horse    |  0.92 |  1.47 |  1.83 |  1.83  |    2.43   |  1.47  |
|  motorbike  |  5.97 | 81.26 | 11.26 | 11.26  |    6.05   | 81.26  |
|    person   |  21.2 | 43.11 | 34.98 | 34.98  |   29.43   | 43.11  |
| pottedplant |  0.06 |  0.06 |  0.11 |  0.11  |    33.1   |  0.06  |
|    sheep    |  3.19 | 34.18 |  6.19 |  6.19  |    3.4    | 34.18  |
|     sofa    |  0.87 |  1.12 |  1.73 |  1.73  |    3.82   |  1.12  |
|    train    | 11.56 | 21.46 | 20.73 | 20.73  |   20.05   | 21.46  |
|  tvmonitor  |  6.72 |  17.6 |  12.6 |  12.6  |    9.81   |  17.6  |
+-------------+-------+-------+-------+--------+-----------+--------+
2021-04-23 21:36:31,408 - mmseg - INFO - Summary:
2021-04-23 21:36:31,408 - mmseg - INFO - 
+-------+------+-------+-------+--------+-----------+--------+
|  aAcc | IoU  |  Acc  |  Dice | Fscore | Precision | Recall |
+-------+------+-------+-------+--------+-----------+--------+
| 42.48 | 6.43 | 18.37 | 10.63 | 11.75  |    16.4   | 18.37  |
+-------+------+-------+-------+--------+-----------+--------+
2021-04-23 21:36:31,410 - mmseg - INFO - Iter(val) [10]	aAcc: 0.4248, mIoU: 0.0643, mAcc: 0.1837, mDice: 0.1063, mFscore: 0.1175, mPrecision: 0.1640, mRecall: 0.1837, IoU.background: 0.4673, IoU.aeroplane: 0.0874, IoU.bicycle: 0.0250, IoU.bird: 0.0076, IoU.boat: 0.0000, IoU.bottle: 0.0077, IoU.bus: 0.0120, IoU.car: 0.0913, IoU.cat: 0.1007, IoU.chair: 0.0000, IoU.cow: 0.0000, IoU.diningtable: 0.0000, IoU.dog: 0.0454, IoU.horse: 0.0092, IoU.motorbike: 0.0597, IoU.person: 0.2120, IoU.pottedplant: 0.0006, IoU.sheep: 0.0319, IoU.sofa: 0.0087, IoU.train: 0.1156, IoU.tvmonitor: 0.0672, Acc.background: 0.4897, Acc.aeroplane: 0.1137, Acc.bicycle: 0.0481, Acc.bird: 0.0077, Acc.boat: 0.0000, Acc.bottle: 0.0082, Acc.bus: 0.0124, Acc.car: 0.2137, Acc.cat: 0.8101, Acc.chair: 0.0000, Acc.cow: 0.0000, Acc.diningtable: 0.0000, Acc.dog: 0.1524, Acc.horse: 0.0147, Acc.motorbike: 0.8126, Acc.person: 0.4311, Acc.pottedplant: 0.0006, Acc.sheep: 0.3418, Acc.sofa: 0.0112, Acc.train: 0.2146, Acc.tvmonitor: 0.1760, Dice.background: 0.6369, Dice.aeroplane: 0.1607, Dice.bicycle: 0.0488, Dice.bird: 0.0151, Dice.boat: 0.0000, Dice.bottle: 0.0154, Dice.bus: 0.0237, Dice.car: 0.1673, Dice.cat: 0.1830, Dice.chair: 0.0000, Dice.cow: 0.0001, Dice.diningtable: 0.0000, Dice.dog: 0.0869, Dice.horse: 0.0183, Dice.motorbike: 0.1126, Dice.person: 0.3498, Dice.pottedplant: 0.0011, Dice.sheep: 0.0619, Dice.sofa: 0.0173, Dice.train: 0.2073, Dice.tvmonitor: 0.1260, Fscore.background: 0.6369, Fscore.aeroplane: 0.1607, Fscore.bicycle: 0.0488, Fscore.bird: 0.0151, Fscore.boat: nan, Fscore.bottle: 0.0154, Fscore.bus: 0.0237, Fscore.car: 0.1673, Fscore.cat: 0.1830, Fscore.chair: 0.0000, Fscore.cow: 0.0001, Fscore.diningtable: nan, Fscore.dog: 0.0869, Fscore.horse: 0.0183, Fscore.motorbike: 0.1126, Fscore.person: 0.3498, Fscore.pottedplant: 0.0011, Fscore.sheep: 0.0619, Fscore.sofa: 0.0173, Fscore.train: 0.2073, Fscore.tvmonitor: 0.1260, Precision.background: 0.9109, Precision.aeroplane: 0.2743, Precision.bicycle: 0.0495, Precision.bird: 0.3870, Precision.boat: 0.0000, Precision.bottle: 0.1325, Precision.bus: 0.2958, Precision.car: 0.1374, Precision.cat: 0.1032, Precision.chair: 0.0018, Precision.cow: 0.0093, Precision.diningtable: 0.0000, Precision.dog: 0.0608, Precision.horse: 0.0243, Precision.motorbike: 0.0605, Precision.person: 0.2943, Precision.pottedplant: 0.3310, Precision.sheep: 0.0340, Precision.sofa: 0.0382, Precision.train: 0.2005, Precision.tvmonitor: 0.0981, Recall.background: 0.4897, Recall.aeroplane: 0.1137, Recall.bicycle: 0.0481, Recall.bird: 0.0077, Recall.boat: 0.0000, Recall.bottle: 0.0082, Recall.bus: 0.0124, Recall.car: 0.2137, Recall.cat: 0.8101, Recall.chair: 0.0000, Recall.cow: 0.0000, Recall.diningtable: 0.0000, Recall.dog: 0.1524, Recall.horse: 0.0147, Recall.motorbike: 0.8126, Recall.person: 0.4311, Recall.pottedplant: 0.0006, Recall.sheep: 0.3418, Recall.sofa: 0.0112, Recall.train: 0.2146, Recall.tvmonitor: 0.1760
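
In the second log, the per-class Fscore and Dice columns agree everywhere except boat and diningtable, where precision and recall are both 0 and Fscore is logged as nan. The summary values still differ (Dice 10.63, Fscore 11.75). A minimal sketch, assuming the summary row averages only the non-NaN classes, reproduces both numbers. This is an inference from the logged values, not a statement about how the code computes them, and `nan_mean` is a helper written for this example.

```python
import math

# Per-class Dice values (in %) copied from the second table, in row order.
dice = [63.69, 16.07, 4.88, 1.51, 0.0, 1.54, 2.37, 16.73, 18.3, 0.0, 0.01,
        0.0, 8.69, 1.83, 11.26, 34.98, 0.11, 6.19, 1.73, 20.73, 12.6]

# Fscore matches Dice except for boat (index 4) and diningtable (index 11),
# which the log reports as nan.
fscore = list(dice)
fscore[4] = float("nan")
fscore[11] = float("nan")


def nan_mean(values):
    """Mean over the entries that are not NaN."""
    kept = [v for v in values if not math.isnan(v)]
    return sum(kept) / len(kept)


print(round(nan_mean(dice), 2))    # 10.63, the Dice summary value
print(round(nan_mean(fscore), 2))  # 11.75, the Fscore summary value
```

Under this assumption, the two zero-valued classes count toward the Dice average (21 classes) but are skipped in the Fscore average (19 classes), which would explain the higher Fscore summary.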

codecov · 2021-04-23T15:19:12Z

Codecov Report

Merging #509 (df21986) into master (83d312e) will increase coverage by 0.20%.
The diff coverage is 98.14%.

@@            Coverage Diff             @@
##           master     #509      +/-   ##
==========================================
+ Coverage   86.48%   86.69%   +0.20%     
==========================================
  Files          97       99       +2     
  Lines        4974     5192     +218     
  Branches      807      838      +31     
==========================================
+ Hits         4302     4501     +199     
- Misses        519      533      +14     
- Partials      153      158       +5

Flag	Coverage Δ
unittests	`86.69% <98.14%> (+0.20%)`	⬆️

Impacted Files	Coverage Δ
mmseg/core/evaluation/metrics.py	`89.28% <96.42%> (+2.71%)`	⬆️
mmseg/core/evaluation/__init__.py	`100.00% <100.00%> (ø)`
mmseg/datasets/custom.py	`89.82% <100.00%> (+0.44%)`	⬆️
mmseg/datasets/pipelines/transforms.py	`97.12% <0.00%> (-0.87%)`	⬇️
mmseg/models/necks/__init__.py	`100.00% <0.00%> (ø)`
mmseg/models/backbones/__init__.py	`100.00% <0.00%> (ø)`
mmseg/models/backbones/vit.py	`87.97% <0.00%> (ø)`
mmseg/models/necks/multilevel_neck.py	`100.00% <0.00%> (ø)`
mmseg/models/losses/utils.py	`81.57% <0.00%> (+4.91%)`	⬆️
mmseg/models/builder.py	`91.30% <0.00%> (+9.82%)`	⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 83d312e...df21986.

xvjiarui · 2021-04-23T17:33:26Z

Nice PR! Thx!

sshuair · 2021-04-29T03:02:20Z

@xvjiarui @xiexinch Hi, is there any progress on the review or merge? I am about to open another PR.

xvjiarui · 2021-04-29T03:35:57Z

Hi @sshuair
Sorry for the late reply. I will review it today.

xvjiarui · 2021-04-29T04:09:24Z

mmseg/core/evaluation/metrics.py

@@ -146,7 +166,7 @@ def mean_iou(results,
        nan_to_num=nan_to_num,
        label_map=label_map,
        reduce_zero_label=reduce_zero_label)
-    return all_acc, acc, iou
+    return mIoU_result


Suggested change

-    return mIoU_result
+    return iou_result

We may use snake case for all variables.

xvjiarui · 2021-04-29T04:09:49Z

mmseg/core/evaluation/metrics.py

@@ -185,7 +206,52 @@ def mean_dice(results,
        nan_to_num=nan_to_num,
        label_map=label_map,
        reduce_zero_label=reduce_zero_label)
-    return all_acc, acc, dice
+    return mDice_result


Suggested change

-    return mDice_result
+    return dice_result

xvjiarui · 2021-04-29T04:15:08Z

mmseg/datasets/custom.py

+
+        summary_table_data = PrettyTable()
+        for key, val in ret_metrics_summary.items():
+            summary_table_data.add_column(key, [val])


We may also use the term mIoU in the table.
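
One way to act on this suggestion is to rename the summary keys before they are added as table columns. The sketch below is only an illustration: the helper `to_summary_keys` is hypothetical, and the rule "prefix everything except aAcc with m" is an assumption based on the column names that appear in the logs of this PR. The values are taken from the summary row of the second log.

```python
from collections import OrderedDict


def to_summary_keys(ret_metrics_summary):
    """Prefix class-averaged metrics with 'm'; leave 'aAcc' unchanged."""
    return OrderedDict(
        (key if key == "aAcc" else "m" + key, val)
        for key, val in ret_metrics_summary.items()
    )


summary = to_summary_keys(OrderedDict([
    ("aAcc", 42.48),
    ("IoU", 6.43),
    ("Acc", 18.37),
    ("Dice", 10.63),
]))
print(list(summary))  # ['aAcc', 'mIoU', 'mAcc', 'mDice']
```

The renamed dict could then be passed to the `add_column` loop shown in the diff above, so that the summary table and the `Iter(val)` log line use the same names.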

xvjiarui · 2021-04-29T05:07:59Z

mmseg/core/evaluation/metrics.py

+        <aAcc> float: Overall accuracy on all images.
+        <Acc> ndarray: Per category accuracy, shape (num_classes, ).
+        <Dice> ndarray: Per category dice, shape (num_classes, ).


We may indent here.

xvjiarui · 2021-04-29T05:08:25Z

mmseg/core/evaluation/metrics.py

+         <aAcc> float: Overall accuracy on all images.
+         <Fscore> ndarray: Per category f-score, shape (num_classes, ).
+         <Precision> ndarray: Per category precision, shape (num_classes, ).
+         <Recall> ndarray: Per category recall, shape (num_classes, ).


We may use a 4-space indent here.

tests/test_metrics.py

xvjiarui · 2021-04-29T05:11:05Z

@sshuair Very nice PR. Just a few comments.

sshuair · 2021-04-30T06:51:30Z

@xvjiarui The comments have been addressed. Please take a look.

.....
|    person   | 24.65  |   63.46   |  15.3  | 14.06 |  15.3 | 24.65 |
| pottedplant |  0.23  |   30.68   |  0.12  |  0.12 |  0.12 |  0.23 |
|    sheep    |  0.81  |   17.06   |  0.41  |  0.4  |  0.41 |  0.81 |
|     sofa    | 15.42  |   12.16   | 21.04  |  8.35 | 21.04 | 15.42 |
|    train    |  9.86  |    5.48   | 48.95  |  5.19 | 48.95 |  9.86 |
|  tvmonitor  |  3.23  |    5.39   |  2.31  |  1.64 |  2.31 |  3.23 |
+-------------+--------+-----------+--------+-------+-------+-------+
2021-04-30 13:55:06,861 - mmseg - INFO - Summary:
2021-04-30 13:55:06,861 - mmseg - INFO - 
+-------+---------+------------+---------+------+-------+-------+
|  aAcc | mFscore | mPrecision | mRecall | mIoU |  mAcc | mDice |
+-------+---------+------------+---------+------+-------+-------+
| 32.62 |  10.72  |   20.85    |  16.05  | 5.76 | 16.05 |  9.69 |
+-------+---------+------------+---------+------+-------+-------+
2021-04-30 13:55:06,864 - mmseg - INFO - Iter(val) [10] aAcc: 0.3262....
.....

lorinczszabolcs · 2022-07-17T18:47:37Z

Hi! Isn't mFscore in the current implementation the same as the mDice score, since the default beta=1 is used? Am I mistaken?
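
For reference, the relationship asked about here can be checked numerically. With beta=1 the F-score reduces to 2·TP / (2·TP + FP + FN), which is the Dice coefficient, so per-class values coincide whenever both are defined. The sketch below uses made-up pixel counts and helper names (`precision_recall`, `f_beta`, `dice`) chosen for this example, not taken from the PR.

```python
import math


def precision_recall(tp, fp, fn):
    """Precision and recall from true/false positive and false negative counts."""
    return tp / (tp + fp), tp / (tp + fn)


def f_beta(tp, fp, fn, beta=1.0):
    """F-beta score computed via precision and recall."""
    p, r = precision_recall(tp, fp, fn)
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)


def dice(tp, fp, fn):
    """Dice coefficient computed directly from the counts."""
    return 2 * tp / (2 * tp + fp + fn)


tp, fp, fn = 30, 10, 20  # made-up pixel counts for one class
print(math.isclose(f_beta(tp, fp, fn), dice(tp, fp, fn)))          # True
print(math.isclose(f_beta(tp, fp, fn, beta=2), dice(tp, fp, fn)))  # False
```

The logs earlier in this thread are consistent with this: the per-class Fscore and Dice columns match except where Fscore is nan, and any other beta would make the two metrics diverge.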

sshuair added 2 commits April 23, 2021 22:37

add mFscore and refactor the metrics return value

ee64ed2

fix linting

38709ff

xvjiarui requested a review from xiexinch April 23, 2021 17:29

xiexinch approved these changes Apr 26, 2021

xvjiarui reviewed Apr 29, 2021

some docstring and name fix

df21986

xvjiarui approved these changes Apr 30, 2021

xvjiarui merged commit e16e0e3 into open-mmlab:master Apr 30, 2021

MengzhangLI mentioned this pull request Sep 3, 2021

Is there a way to register a custom evaluation metric? #847

Closed

MengzhangLI mentioned this pull request Nov 12, 2021

Invitation of contributing to MMSegmentation. fundamentalvision/Auto-Seg-Loss#2

Closed

MengzhangLI mentioned this pull request Feb 15, 2022

How to add new metrics? #1286

Closed

bowenroom pushed a commit to bowenroom/mmsegmentation that referenced this pull request Feb 25, 2022

add metric mFscore (open-mmlab#509)

7fbdd6f

* add mFscore and refactor the metrics return value * fix linting * some docstring and name fix

MengzhangLI mentioned this pull request Jun 25, 2022

How can I compute the average recall for a dataset? #1678

Closed

aravind-h-v pushed a commit to aravind-h-v/mmsegmentation that referenced this pull request Mar 27, 2023

Finally fix the image-based SD tests (open-mmlab#509)

c727a6a

* Finally fix the image-based SD tests * Remove autocast * Remove autocast in image tests
