The test results of the anchor based model do not match the published ones #17

Closed
zhouzhouhhh opened this issue Jun 28, 2022 · 5 comments

@zhouzhouhhh

Hello. Using the detectron2-ResNeSt library, I downloaded the anchor-based model and config file you published, but my test results are inconsistent with those published in your GitHub. Your published results are 48.43 (bbox AP) and 47.89 (segm AP), while my test gives 47.9 and 47.9. Have you perhaps modified some configuration? Here are my test results:

[06/27 23:14:31 d2.evaluation.coco_evaluation]: Preparing results for COCO format ...
[06/27 23:14:31 d2.evaluation.coco_evaluation]: Saving results to PATH/TO/SAVE/RESULTS/inference/coco_instances_results.json
[06/27 23:14:35 d2.evaluation.coco_evaluation]: Evaluating predictions ...
Loading and preparing results...
DONE (t=0.42s)
creating index...
index created!
Size parameters: [[0, 10000000000.0], [0, 324], [324, 961], [961, 10000000000.0]]
Evaluate annotation type *bbox*
COCOeval_opt.evaluate() finished in 75.10 seconds.
In method
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=2000 ] = 0.479
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=2000 ] = 0.804
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=2000 ] = 0.509
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=2000 ] = 0.477
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=2000 ] = 0.491
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=2000 ] = 0.540
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.218
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=500 ] = 0.478
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=2000 ] = 0.555
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=2000 ] = 0.528
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=2000 ] = 0.573
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=2000 ] = 0.643
Precision and Recall per iou: [0.5  0.55 0.6  0.65 0.7  0.75 0.8  0.85 0.9  0.95]
[0.8039 0.7617 0.7166 0.6623 0.5963 0.5091 0.395  0.2458 0.0906 0.0083]
[0.8623 0.829  0.7875 0.7366 0.6738 0.5938 0.4904 0.3537 0.1872 0.0372]
_derive_coco_results
[06/27 23:15:54 d2.evaluation.coco_evaluation]: Evaluation results for bbox:
|   AP   |  AP50  |  AP75  |  APs   |  APm   |  APl   |
|:------:|:------:|:------:|:------:|:------:|:------:|
| 47.895 | 80.389 | 50.911 | 47.670 | 49.056 | 54.012 |
Loading and preparing results...
DONE (t=3.93s)
creating index...
index created!
Size parameters: [[0, 10000000000.0], [0, 324], [324, 961], [961, 10000000000.0]]
Evaluate annotation type *segm*
COCOeval_opt.evaluate() finished in 82.66 seconds.
In method
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=2000 ] = 0.479
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=2000 ] = 0.808
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=2000 ] = 0.516
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=2000 ] = 0.458
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=2000 ] = 0.483
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=2000 ] = 0.569
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.214
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=500 ] = 0.472
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=2000 ] = 0.547
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=2000 ] = 0.524
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=2000 ] = 0.556
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=2000 ] = 0.633
Precision and Recall per iou: [0.5  0.55 0.6  0.65 0.7  0.75 0.8  0.85 0.9  0.95]
[0.808  0.772  0.7289 0.6765 0.6106 0.5164 0.3885 0.2239 0.0627 0.0013]
[0.8632 0.8323 0.7948 0.7475 0.6839 0.5968 0.4772 0.3193 0.1395 0.0128]
_derive_coco_results
[06/27 23:17:40 d2.evaluation.coco_evaluation]: Evaluation results for segm:
|   AP   |  AP50  |  AP75  |  APs   |  APm   |  APl   |
|:------:|:------:|:------:|:------:|:------:|:------:|
| 47.890 | 80.797 | 51.644 | 45.752 | 48.330 | 56.935 |
[06/27 23:17:40 d2.engine.defaults]: Evaluation results for livecelltest in csv format:
[06/27 23:17:40 d2.evaluation.testing]: copypaste: Task: bbox
[06/27 23:17:40 d2.evaluation.testing]: copypaste: AP,AP50,AP75,APs,APm,APl
[06/27 23:17:40 d2.evaluation.testing]: copypaste: 47.8952,80.3888,50.9107,47.6699,49.0555,54.0118
[06/27 23:17:40 d2.evaluation.testing]: copypaste: Task: segm
[06/27 23:17:40 d2.evaluation.testing]: copypaste: AP,AP50,AP75,APs,APm,APl
[06/27 23:17:40 d2.evaluation.testing]: copypaste: 47.8895,80.7974,51.6442,45.7517,48.3298,56.9348
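
For reference, an eval-only run that produces logs like the above typically looks something like this with detectron2's Python API (a minimal sketch: the weights filename and paths are placeholders, the dataset "livecelltest" must already be registered, and the cfg-style COCOEvaluator constructor of detectron2 0.1.x is assumed):

```python
from detectron2.config import get_cfg  # the detectron2-ResNeSt fork's cfg knows RESNETS.RADIX etc.
from detectron2.data import build_detection_test_loader
from detectron2.engine import DefaultPredictor
from detectron2.evaluation import COCOEvaluator, inference_on_dataset

cfg = get_cfg()
cfg.merge_from_file("livecell_config.yaml")           # the config quoted below
cfg.MODEL.WEIGHTS = "path/to/anchor_based_model.pth"  # downloaded LIVECell weights (placeholder name)
cfg.DATASETS.TEST = ("livecelltest",)                 # must be registered beforehand

predictor = DefaultPredictor(cfg)
evaluator = COCOEvaluator("livecelltest", cfg, distributed=False, output_dir=cfg.OUTPUT_DIR)
loader = build_detection_test_loader(cfg, "livecelltest")
print(inference_on_dataset(predictor.model, loader, evaluator))
```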
@ChristofferEdlund
Contributor

Hi @zhouzhouhhh! Thank you for reaching out.

Could you share the config file you used?

@zhouzhouhhh
Author

Hello, the configuration I used is the one provided in the repository: https://github.com/sartorius-research/LIVECell/blob/main/model/anchor_based/livecell_config.yaml

MODEL:
  META_ARCHITECTURE: "GeneralizedRCNN"
  WEIGHTS: "https://s3.us-west-1.wasabisys.com/resnest/detectron/mask_cascade_rcnn_ResNeSt_200_FPN_dcn_syncBN_all_tricks_3x-e1901134.pth"
  BACKBONE:
    NAME: "build_resnet_fpn_backbone"
    FREEZE_AT: 0
  MASK_ON: True
  RESNETS:
    OUT_FEATURES: ["res2", "res3", "res4", "res5"]
    DEPTH: 200
    STRIDE_IN_1X1: False
    RADIX: 2
    DEFORM_ON_PER_STAGE: [False, True, True, True] # on Res3,Res4,Res5
    DEFORM_MODULATED: True
    DEFORM_NUM_GROUPS: 2
    NORM: "SyncBN"

  FPN:
    NORM: "SyncBN"
    IN_FEATURES: ["res2", "res3", "res4", "res5"]
  ANCHOR_GENERATOR:
    SIZES: [[4], [9], [17], [31], [64], [127]]  # One size for each in feature map
    ASPECT_RATIOS: [[0.25, 0.5, 1.0, 2.0, 4.0]]  # Five aspect ratios (same for all feature maps)
  ROI_HEADS:
    NUM_CLASSES: 1
    BATCH_SIZE_PER_IMAGE: 512
    NAME: CascadeROIHeads
    IN_FEATURES: ["p2", "p3", "p4", "p5"]
  ROI_BOX_HEAD:
    NAME: "FastRCNNConvFCHead"
    NUM_CONV: 4
    NUM_FC: 1
    NORM: "SyncBN"
    POOLER_RESOLUTION: 7
    CLS_AGNOSTIC_BBOX_REG: True
  ROI_MASK_HEAD:
    NUM_CONV: 8
    NORM: "SyncBN"
  RPN:
    IN_FEATURES: ["p2", "p2", "p3", "p4", "p5", "p6"]
    BATCH_SIZE_PER_IMAGE: 256
    POST_NMS_TOPK_TEST: 3000
    POST_NMS_TOPK_TRAIN: 3000
    PRE_NMS_TOPK_TEST: 6000
    PRE_NMS_TOPK_TRAIN: 12000
  RETINANET:
    NUM_CLASSES: 1
    TOPK_CANDIDATES_TEST: 3000
  PIXEL_MEAN: [128, 128, 128]
  PIXEL_STD: [11.578, 11.578, 11.578]
SOLVER:
  IMS_PER_BATCH: 16
  BASE_LR: 0.02
  STEPS: (17500, 20000)
  MAX_ITER: 30000
  CHECKPOINT_PERIOD: 500
DATASETS:
  TRAIN: ("TRAIN",)  # REPLACE TRAIN WITH THE REGISTERED NAME
  TEST: ("TEST",)    # REPLACE TEST WITH THE REGISTERED NAME
  PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000
INPUT:
  MIN_SIZE_TRAIN: (440, 480, 520, 560, 580, 620)

  CROP:
    ENABLED: False
  FORMAT: "BGR"
TEST:
  DETECTIONS_PER_IMAGE: 3000 # 1000
  EVAL_PERIOD: 500
  PRECISE_BN:
    ENABLED: False
  AUG:
    ENABLED: False
OUTPUT_DIR: "PATH/TO/SAVE/RESULTS" # PATH TO SAVE THE OUTPUT RESULTS
DATALOADER:
  NUM_WORKERS: 8
VERSION: 2
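
The `TRAIN`/`TEST` placeholders above must name datasets registered with detectron2 before the config is used. A minimal sketch of that registration, with hypothetical file paths (only the name "livecelltest" is confirmed by the log above):

```python
from detectron2.data.datasets import register_coco_instances

# Hypothetical paths; the registered names just have to match DATASETS.TRAIN / DATASETS.TEST.
register_coco_instances("livecelltrain", {}, "livecell_coco_train.json", "path/to/train_images")
register_coco_instances("livecelltest", {}, "livecell_coco_test.json", "path/to/test_images")
```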

@RickardSjogren
Contributor

Hi @zhouzhouhhh, thanks for providing the config. We are, however, unable to reproduce the bbox-AP discrepancy you report. Can you share information about your environment?

This model was trained using Python v.3.6.10, PyTorch v.1.5.0, and Detectron2 v.2.1.
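
For comparison, one quick way to print the local environment is detectron2's built-in report (a minimal sketch; the expected values in the comments are the versions quoted above):

```python
import sys
import torch
import detectron2
from detectron2.utils.collect_env import collect_env_info

print(sys.version)             # paper environment: Python 3.6.10
print(torch.__version__)       # paper environment: PyTorch 1.5.0
print(detectron2.__version__)  # note: detectron2-ResNeSt is a fork, its version may differ
print(collect_env_info())      # full report: CUDA, cuDNN, GPU, build flags
```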

@zhouzhouhhh
Author

zhouzhouhhh commented Jul 1, 2022

Thanks for your reply. My environment is: Python 3.7.13, PyTorch 1.6.0, detectron2-ResNeSt v0.1.1. @RickardSjogren

@RickardSjogren
Contributor

Thanks @zhouzhouhhh. Can you please check whether the difference persists in another environment using the same versions as the paper? There might be a minor incompatibility between the trained weights and the implementation across versions.
