The test results of the anchor based model do not match the published ones #17

Closed
zhouzhouhhh opened this issue Jun 28, 2022 · 5 comments

@zhouzhouhhh

Hello. Using the detectron2-ResNeSt library, I downloaded the anchor-based model and config file you published, but my test results are inconsistent with those published in your GitHub. Your published results are 48.43 (bbox AP) and 47.89 (segm AP), while my test gives 47.9 and 47.9. Have you perhaps modified some configuration? Here are my test results:

[06/27 23:14:31 d2.evaluation.coco_evaluation]: Preparing results for COCO format ...
[06/27 23:14:31 d2.evaluation.coco_evaluation]: Saving results to PATH/TO/SAVE/RESULTS/inference/coco_instances_results.json
[06/27 23:14:35 d2.evaluation.coco_evaluation]: Evaluating predictions ...
Loading and preparing results...
DONE (t=0.42s)
creating index...
index created!
Size parameters: [[0, 10000000000.0], [0, 324], [324, 961], [961, 10000000000.0]]
Evaluate annotation type *bbox*
COCOeval_opt.evaluate() finished in 75.10 seconds.
In method
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=2000 ] = 0.479
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=2000 ] = 0.804
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=2000 ] = 0.509
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=2000 ] = 0.477
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=2000 ] = 0.491
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=2000 ] = 0.540
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.218
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=500 ] = 0.478
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=2000 ] = 0.555
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=2000 ] = 0.528
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=2000 ] = 0.573
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=2000 ] = 0.643
Precision and Recall per iou: [0.5  0.55 0.6  0.65 0.7  0.75 0.8  0.85 0.9  0.95]
[0.8039 0.7617 0.7166 0.6623 0.5963 0.5091 0.395  0.2458 0.0906 0.0083]
[0.8623 0.829  0.7875 0.7366 0.6738 0.5938 0.4904 0.3537 0.1872 0.0372]
_derive_coco_results
[06/27 23:15:54 d2.evaluation.coco_evaluation]: Evaluation results for bbox:
|   AP   |  AP50  |  AP75  |  APs   |  APm   |  APl   |
|:------:|:------:|:------:|:------:|:------:|:------:|
| 47.895 | 80.389 | 50.911 | 47.670 | 49.056 | 54.012 |
Loading and preparing results...
DONE (t=3.93s)
creating index...
index created!
Size parameters: [[0, 10000000000.0], [0, 324], [324, 961], [961, 10000000000.0]]
Evaluate annotation type *segm*
COCOeval_opt.evaluate() finished in 82.66 seconds.
In method
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=2000 ] = 0.479
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=2000 ] = 0.808
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=2000 ] = 0.516
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=2000 ] = 0.458
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=2000 ] = 0.483
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=2000 ] = 0.569
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.214
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=500 ] = 0.472
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=2000 ] = 0.547
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=2000 ] = 0.524
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=2000 ] = 0.556
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=2000 ] = 0.633
Precision and Recall per iou: [0.5  0.55 0.6  0.65 0.7  0.75 0.8  0.85 0.9  0.95]
[0.808  0.772  0.7289 0.6765 0.6106 0.5164 0.3885 0.2239 0.0627 0.0013]
[0.8632 0.8323 0.7948 0.7475 0.6839 0.5968 0.4772 0.3193 0.1395 0.0128]
_derive_coco_results
[06/27 23:17:40 d2.evaluation.coco_evaluation]: Evaluation results for segm:
|   AP   |  AP50  |  AP75  |  APs   |  APm   |  APl   |
|:------:|:------:|:------:|:------:|:------:|:------:|
| 47.890 | 80.797 | 51.644 | 45.752 | 48.330 | 56.935 |
[06/27 23:17:40 d2.engine.defaults]: Evaluation results for livecelltest in csv format:
[06/27 23:17:40 d2.evaluation.testing]: copypaste: Task: bbox
[06/27 23:17:40 d2.evaluation.testing]: copypaste: AP,AP50,AP75,APs,APm,APl
[06/27 23:17:40 d2.evaluation.testing]: copypaste: 47.8952,80.3888,50.9107,47.6699,49.0555,54.0118
[06/27 23:17:40 d2.evaluation.testing]: copypaste: Task: segm
[06/27 23:17:40 d2.evaluation.testing]: copypaste: AP,AP50,AP75,APs,APm,APl
[06/27 23:17:40 d2.evaluation.testing]: copypaste: 47.8895,80.7974,51.6442,45.7517,48.3298,56.9348
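
For reference, an eval-only run that produces logs like the above typically looks something like this with detectron2's Python API (a minimal sketch: the weights filename and paths are placeholders, the dataset "livecelltest" must already be registered, and the cfg-style COCOEvaluator constructor of detectron2 0.1.x is assumed):

```python
from detectron2.config import get_cfg  # the detectron2-ResNeSt fork's cfg knows RESNETS.RADIX etc.
from detectron2.data import build_detection_test_loader
from detectron2.engine import DefaultPredictor
from detectron2.evaluation import COCOEvaluator, inference_on_dataset

cfg = get_cfg()
cfg.merge_from_file("livecell_config.yaml")           # the config quoted below
cfg.MODEL.WEIGHTS = "path/to/anchor_based_model.pth"  # downloaded LIVECell weights (placeholder name)
cfg.DATASETS.TEST = ("livecelltest",)                 # must be registered beforehand

predictor = DefaultPredictor(cfg)
evaluator = COCOEvaluator("livecelltest", cfg, distributed=False, output_dir=cfg.OUTPUT_DIR)
loader = build_detection_test_loader(cfg, "livecelltest")
print(inference_on_dataset(predictor.model, loader, evaluator))
```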
@ChristofferEdlund
Contributor

Hi @zhouzhouhhh! Thank you for reaching out.

Could you share the config file you used?

@zhouzhouhhh
Author

Hello, the configuration I used is the one provided in the repository: https://github.com/sartorius-research/LIVECell/blob/main/model/anchor_based/livecell_config.yaml

MODEL:
  META_ARCHITECTURE: "GeneralizedRCNN"
  WEIGHTS: "https://s3.us-west-1.wasabisys.com/resnest/detectron/mask_cascade_rcnn_ResNeSt_200_FPN_dcn_syncBN_all_tricks_3x-e1901134.pth"
  BACKBONE:
    NAME: "build_resnet_fpn_backbone"
    FREEZE_AT: 0
  MASK_ON: True
  RESNETS:
    OUT_FEATURES: ["res2", "res3", "res4", "res5"]
    DEPTH: 200
    STRIDE_IN_1X1: False
    RADIX: 2
    DEFORM_ON_PER_STAGE: [False, True, True, True] # on Res3,Res4,Res5
    DEFORM_MODULATED: True
    DEFORM_NUM_GROUPS: 2
    NORM: "SyncBN"

  FPN:
    NORM: "SyncBN"
    IN_FEATURES: ["res2", "res3", "res4", "res5"]
  ANCHOR_GENERATOR:
    SIZES: [[4], [9], [17], [31], [64], [127]]  # One size for each in feature map
    ASPECT_RATIOS: [[0.25, 0.5, 1.0, 2.0, 4.0]]  # Five aspect ratios (same for all feature maps)
  ROI_HEADS:
    NUM_CLASSES: 1
    BATCH_SIZE_PER_IMAGE: 512
    NAME: CascadeROIHeads
    IN_FEATURES: ["p2", "p3", "p4", "p5"]
  ROI_BOX_HEAD:
    NAME: "FastRCNNConvFCHead"
    NUM_CONV: 4
    NUM_FC: 1
    NORM: "SyncBN"
    POOLER_RESOLUTION: 7
    CLS_AGNOSTIC_BBOX_REG: True
  ROI_MASK_HEAD:
    NUM_CONV: 8
    NORM: "SyncBN"
  RPN:
    IN_FEATURES: ["p2", "p2", "p3", "p4", "p5", "p6"]
    BATCH_SIZE_PER_IMAGE: 256
    POST_NMS_TOPK_TEST: 3000
    POST_NMS_TOPK_TRAIN: 3000
    PRE_NMS_TOPK_TEST: 6000
    PRE_NMS_TOPK_TRAIN: 12000
  RETINANET:
    NUM_CLASSES: 1
    TOPK_CANDIDATES_TEST: 3000
  PIXEL_MEAN: [128, 128, 128]
  PIXEL_STD: [11.578, 11.578, 11.578]
SOLVER:
  IMS_PER_BATCH: 16
  BASE_LR: 0.02
  STEPS: (17500, 20000)
  MAX_ITER: 30000
  CHECKPOINT_PERIOD: 500
DATASETS:
  TRAIN: ("TRAIN",)  # REPLACE TRAIN WITH THE REGISTERED NAME
  TEST: ("TEST",)    # REPLACE TEST WITH THE REGISTERED NAME
  PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000
INPUT:
  MIN_SIZE_TRAIN: (440, 480, 520, 560, 580, 620)

  CROP:
    ENABLED: False
  FORMAT: "BGR"
TEST:
  DETECTIONS_PER_IMAGE: 3000 # 1000
  EVAL_PERIOD: 500
  PRECISE_BN:
    ENABLED: False
  AUG:
    ENABLED: False
OUTPUT_DIR: "PATH/TO/SAVE/RESULTS" # PATH TO SAVE THE OUTPUT RESULTS
DATALOADER:
  NUM_WORKERS: 8
VERSION: 2
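
The `TRAIN`/`TEST` placeholders above must name datasets registered with detectron2 before the config is used. A minimal sketch of that registration, with hypothetical file paths (only the name "livecelltest" is confirmed by the log above):

```python
from detectron2.data.datasets import register_coco_instances

# Hypothetical paths; the registered names just have to match DATASETS.TRAIN / DATASETS.TEST.
register_coco_instances("livecelltrain", {}, "livecell_coco_train.json", "path/to/train_images")
register_coco_instances("livecelltest", {}, "livecell_coco_test.json", "path/to/test_images")
```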

@RickardSjogren
Contributor

Hi @zhouzhouhhh, thanks for providing the config. We are, however, unable to reproduce the bbox-AP discrepancy you report. Can you share information about your environment?

This model was trained using Python v.3.6.10, PyTorch v.1.5.0, and Detectron2 v.2.1.
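
For comparison, one quick way to print the local environment is detectron2's built-in report (a minimal sketch; the expected values in the comments are the versions quoted above):

```python
import sys
import torch
import detectron2
from detectron2.utils.collect_env import collect_env_info

print(sys.version)             # paper environment: Python 3.6.10
print(torch.__version__)       # paper environment: PyTorch 1.5.0
print(detectron2.__version__)  # note: detectron2-ResNeSt is a fork, its version may differ
print(collect_env_info())      # full report: CUDA, cuDNN, GPU, build flags
```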

@zhouzhouhhh
Author

zhouzhouhhh commented Jul 1, 2022

Thanks for your reply. My environment is: Python 3.7.13, PyTorch 1.6.0, detectron2-ResNeSt v0.1.1. @RickardSjogren

@RickardSjogren
Contributor

Thanks @zhouzhouhhh. Can you please check whether the difference persists in another environment using the same versions as the paper? There might be a minor incompatibility between the trained weights and the implementation across versions.
