Skip to content

Can't validating on Mac MPS #566

Closed
Closed
@fatevase

Description

@fatevase

Describe the bug
when I follow doc to using mmengine, train is fine, but an error occurred while validate dataset.
User specified autocast device_type must be cuda or cpu, but got mps

Reproduction

  1. What command or script did you run?
    code from here: https://mmengine.readthedocs.io/zh_CN/latest/get_started/15_minutes.html

  2. Did you make any modifications on the code or config? Did you understand what you have modified?
    No

  3. What dataset did you use?
    cifar-10

Environment

------------------------------------------------------------
System environment:
    sys.platform: darwin
    Python: 3.9.13 (main, Aug 25 2022, 18:24:45) [Clang 12.0.0 ]
    CUDA available: False
    numpy_random_seed: 1950343038
    GCC: Apple clang version 13.1.6 (clang-1316.0.21.2.5)
    PyTorch: 1.12.1
    PyTorch compiling details: PyTorch built with:
  - GCC 4.2
  - C++ Version: 201402
  - clang 13.1.6
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: NO AVX
  - Build settings: BLAS_INFO=accelerate, BUILD_TYPE=Release, CXX_COMPILER=/Applications/Xcode_13.4.1.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -Wno-deprecated-declarations -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_PYTORCH_METAL_EXPORT -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -DUSE_COREML_DELEGATE -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-range-loop-analysis -Wno-pass-failed -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -Wno-invalid-partial-specialization -Wno-typedef-redefinition -Wno-unknown-warning-option -Wno-unused-private-field -Wno-inconsistent-missing-override -Wno-aligned-allocation-unavailable -Wno-c++14-extensions -Wno-constexpr-not-const -Wno-missing-braces -Qunused-arguments -fcolor-diagnostics -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -DUSE_MPS -fno-objc-arc -Wno-unused-private-field -Wno-missing-braces -Wno-c++14-extensions -Wno-constexpr-not-const, LAPACK_INFO=accelerate, TORCH_VERSION=1.12.1, USE_CUDA=0, USE_CUDNN=OFF, USE_EIGEN_FOR_BLAS=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=OFF, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=OFF, USE_NNPACK=ON, USE_OPENMP=OFF, USE_ROCM=OFF, 

    TorchVision: 0.13.1
    OpenCV: 4.6.0
    MMEngine: 0.1.0

Runtime environment:
    dist_cfg: {'backend': 'nccl'}
    seed: None
    Distributed launcher: none
    Distributed training: False
    GPU number: 1
------------------------------------------------------------
  1. You may add addition that may be helpful for locating the problem, such as
    use conda install PyTorch.

Bug fix
I found the error caused by autocast on runner.loops.py on 431

        with autocast(enabled=self.fp16):
            outputs = self.runner.model.test_step(data_batch)

Metadata

Metadata

Assignees

Labels

Type

No type

Projects

No projects

Milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions