
Reproducibility Issues with Reported Accuracy #57

Description

@miralys1

Hi,

Thank you for your work on MambaVision! I have been trying to reproduce the reported results, specifically for MambaVision-T, and I ran into some problems.

Here is the summary:

  • I ran 16 training runs on up to 4 GPUs, increasing the per-GPU batch size so that the global batch size matched the reference setup.
  • I experimented with different seeds, but even with the same seed I observed run-to-run fluctuations of ~0.1-0.2 percentage points in accuracy (my seeding setup is sketched after this list).
  • My best achieved accuracy is 82.21%, whereas the reported result is 82.3%.
  • When validating the provided model checkpoint with validate.sh, I get an accuracy of 82.244%, which rounds to 82.2%, not 82.3%.
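
For reference, this is how I seed each run. It is a minimal sketch of my setup; even with these settings, some fused CUDA kernels (possibly including the Mamba scan ops) may remain non-deterministic, which could explain the residual ~0.1-0.2 pp spread:

```python
import random

import numpy as np
import torch


def seed_everything(seed: int = 42) -> None:
    """Seed Python, NumPy, and torch RNGs, and prefer deterministic kernels."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)  # also seeds all CUDA devices
    torch.backends.cudnn.benchmark = False
    torch.backends.cudnn.deterministic = True
    # warn_only=True because some fused CUDA ops have no deterministic
    # implementation; fully deterministic cuBLAS additionally needs the
    # CUBLAS_WORKSPACE_CONFIG=:4096:8 environment variable.
    torch.use_deterministic_algorithms(True, warn_only=True)


seed_everything(42)
```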

My environment:

  • Python 3.10.12
  • torch==2.5.1
  • timm==1.0.14
  • einops==0.8.0
  • transformers==4.48.1
  • causal-conv1d @ file:///causal-conv1d (using the newest commit as of this post: 82867a9)
  • mamba-ssm @ file:///mamba (using the newest commit as of this post: 0cce0fa)
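
For completeness, this is the snippet I use to record these versions (a minimal sketch; the /causal-conv1d and /mamba paths are the source checkouts referenced above):

```python
import subprocess

import einops
import timm
import torch
import transformers

print("torch:", torch.__version__, "| CUDA:", torch.version.cuda)
print("timm:", timm.__version__)
print("einops:", einops.__version__)
print("transformers:", transformers.__version__)

# causal-conv1d and mamba-ssm are installed from source checkouts,
# so report the git commit of each instead of a version pin.
for repo in ("/causal-conv1d", "/mamba"):
    sha = subprocess.check_output(
        ["git", "-C", repo, "rev-parse", "--short", "HEAD"], text=True
    ).strip()
    print(f"{repo} @ {sha}")
```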

Could you clarify whether there are any additional details about the training setup or hyperparameters that might explain these discrepancies? Also, was any post-processing or averaging (e.g., EMA or checkpoint averaging) applied to obtain the reported accuracy? A sketch of what I mean follows.
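
To be concrete about the kind of averaging I mean, here is a minimal sketch of plain checkpoint averaging; the file names and the "state_dict" key are assumptions for illustration, not taken from this repo:

```python
import torch

# Hypothetical checkpoint files from the tail of training; adjust to your output.
paths = ["checkpoint-297.pth", "checkpoint-298.pth", "checkpoint-299.pth"]

avg = None
for p in paths:
    sd = torch.load(p, map_location="cpu")["state_dict"]  # key is an assumption
    if avg is None:
        avg = {k: v.float().clone() for k, v in sd.items()}
    else:
        for k in avg:
            avg[k] += sd[k].float()

# Average the summed weights and save them back under the same (assumed) key.
avg = {k: v / len(paths) for k, v in avg.items()}
torch.save({"state_dict": avg}, "checkpoint-averaged.pth")
```

Similarly, if an EMA of the weights was kept during training (e.g., timm's ModelEmaV2) and the EMA weights were evaluated rather than the raw ones, that might also account for a ~0.1 pp difference.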
