
MaskRCNN training - Issue with pytorch lightning vs pytorch #11462

@purnasai-soulpageit

Description

Model output differs between PyTorch and PyTorch Lightning

We trained the PyTorch detection model maskrcnn_resnet50_fpn in both PyTorch and PyTorch Lightning to perform instance segmentation of weapons and knives, using the same data, data loaders, number of epochs, and environment. The framework is the only difference.

After training for the same number of epochs (40), the loss follows the same pattern in both PyTorch and Lightning, but the models' predictions differ considerably. Predictions from both frameworks are shown below and in the notebooks.

PyTorch prediction
[image: output]

PyTorch Lightning prediction
[image: output1]

To Reproduce

Use the notebooks below to reproduce:

  1. PyTorch Lightning instance segmentation: here
  2. PyTorch instance segmentation: here
  3. Sample data: here

Expected behavior

The prediction results should be the same in PyTorch and PyTorch Lightning.
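One way to make "the same" measurable is to compare the predicted boxes from the two runs by intersection-over-union (IoU). The following is only a minimal sketch, not code from the notebooks: it assumes boxes in [x1, y1, x2, y2] format (the format torchvision detection models return under the "boxes" key), and the helper names `box_iou` and `match_fraction` and the 0.5 threshold are hypothetical choices for illustration.

```python
def box_iou(a, b):
    """IoU of two boxes, each given as [x1, y1, x2, y2]."""
    # Corners of the intersection rectangle.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so disjoint boxes give zero overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_fraction(boxes_a, boxes_b, iou_threshold=0.5):
    """Fraction of boxes in boxes_a that overlap some box in boxes_b
    with IoU >= iou_threshold (threshold is an illustrative default)."""
    if not boxes_a:
        # Two empty predictions agree; an empty one vs. a non-empty one does not.
        return 1.0 if not boxes_b else 0.0
    matched = sum(
        1 for a in boxes_a
        if any(box_iou(a, b) >= iou_threshold for b in boxes_b)
    )
    return matched / len(boxes_a)
```

If the predictions are torch tensors, `pred["boxes"].tolist()` converts them to nested lists that these helpers accept. A match fraction near 1.0 in both directions on the same image would suggest the two models largely agree; a low value gives a number to attach to "differs a lot".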

Environment

  • CUDA:
    - GPU: Tesla T4
    - available: True
    - version: 10.2
  • Packages:
    - numpy: 1.20.3
    - pyTorch_debug: False
    - pyTorch_version: 1.8.0
    - pytorch-lightning: 1.5.8
    - tqdm: 4.62.3
  • System:
    - OS: Linux
    - architecture: 64bit
    - processor: x86_64
    - python: 3.7.10
    - version: #64~18.04.1-Ubuntu SMP Fri Dec 3 17:59:13 UTC 2021

cc @awaelchli @ananthsub @ninginthecloud @rohitgr7


Labels: bug (Something isn't working), checkpointing (Related to checkpointing)
