
FileNotFoundError - caffe2_detectron_ops.dll on Windows source build if Python 3.8 used #35803

Closed
realiti4 opened this issue Apr 1, 2020 · 27 comments


@realiti4

realiti4 commented Apr 1, 2020

Hi, I've been getting the DLL import error below when trying to import torch since 1.4 if I use Python 3.8 for the build. The error is from a source build in a virtualenv that I tried today. I can build and import it just fine if I use Python 3.7.

>>> import torch
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "F:\pythonapps\pytorch-test\env\lib\site-packages\torch\__init__.py", line 77, in <module>
    ctypes.CDLL(dll)
  File "C:\Users\onurc\AppData\Local\Programs\Python\Python38\lib\ctypes\__init__.py", line 373, in __init__
    self._handle = _dlopen(self._name, mode)
FileNotFoundError: Could not find module 'F:\pythonapps\pytorch-test\env\lib\site-packages\torch\lib\caffe2_detectron_ops.dll' (or one of its dependencies). Try using the full path with constructor syntax.

Environment

  • PyTorch Version (e.g., 1.0): master
  • OS (e.g., Linux): Windows 10
  • How you installed PyTorch (conda, pip, source): source
  • Build command you used (if compiling from source): python setup.py install
  • Python version: 3.8.2
  • CUDA/cuDNN version: 10.2
  • GPU models and configuration:
  • Any other relevant information: VS 2017
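
For anyone trying to narrow this down: Python 3.8 no longer consults PATH when resolving DLL dependencies (directories have to be registered with os.add_dll_directory), which is the most likely reason the same build imports fine under 3.7. A minimal diagnostic sketch, assuming the torch\lib path from the traceback above, that loads each shipped DLL individually so the one with the missing dependency stands out:

# Diagnostic sketch only: load every DLL in torch\lib one by one so the failing
# dependency is easier to spot than from the single error raised at import time.
import ctypes
import glob
import os

torch_lib = r"F:\pythonapps\pytorch-test\env\lib\site-packages\torch\lib"  # path taken from the traceback above
os.add_dll_directory(torch_lib)  # Python 3.8+: make sibling DLLs resolvable

for dll in sorted(glob.glob(os.path.join(torch_lib, "*.dll"))):
    try:
        ctypes.CDLL(dll)
        print("OK  ", os.path.basename(dll))
    except OSError as exc:
        print("FAIL", os.path.basename(dll), "->", exc)
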
@peterjc123
Collaborator

Looking into it.

@peterjc123
Collaborator

peterjc123 commented Apr 2, 2020

I cannot reproduce the error locally. Could you please use Process Monitor to capture a log of the Windows API calls made by python.exe and post it here?

@realiti4
Author

realiti4 commented Apr 2, 2020

Hi, I'm uploading logs for system python.exe and virtualenv python.exe below.

logs.zip

@realiti4
Author

realiti4 commented Apr 2, 2020

I didn't build this with CUDA since I just wanted to test whether importing works. Should I build with CUDA as well and then post logs?

@peterjc123
Collaborator

Hi, I'm uploading logs for system python.exe and virtualenv python.exe below.

logs.zip

You may resolve this by doing pip install intel-openmp.
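
A quick way to sanity-check that the runtime actually arrived (a sketch only; the intel-openmp wheel ships libiomp5md.dll, typically under Library\bin of the environment, though the exact location may vary):

# Sketch: confirm libiomp5md.dll exists somewhere in the environment and loads.
import ctypes
import glob
import os
import sys

hits = glob.glob(os.path.join(sys.prefix, "**", "libiomp5md.dll"), recursive=True)
print("libiomp5md.dll found at:", hits or "nowhere in this environment")
if hits:
    ctypes.CDLL(hits[0])  # load by full path to avoid search-path surprises
    print("loaded OK")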

@realiti4
Author

realiti4 commented Apr 2, 2020

Hi, I'm uploading logs for system python.exe and virtualenv python.exe below.
logs.zip

You may resolve this by doing pip install intel-openmp.

Yes, my test environment didn't have intel-openmp, and installing it fixed the issue. Thank you.

@girishnjha

I have installed PyTorch 1.6 on Windows 7 64-bit. While trying to train OpenNMT, I'm getting this error from __init__.py:
OSError: [WinError 126] The specified module could not be found. Error loading "C:\Users\Girish\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\lib\caffe2_detectron_ops.dll" or one of its
dependencies.
I checked; the file is present in the folder.

@peterjc123
Collaborator

@girishnjha Have you installed intel-openmp? You can do that with pip install intel-openmp.

@girishnjha

I installed intel-openmp. I'm still getting the same error.

@peterjc123
Collaborator

@girishnjha Could you please provide the log using Process Monitor or the Debugging Tools for Windows?

@girishnjha

Logfile.zip

@peterjc123
Collaborator

@girishnjha VCOMP140.dll is missing. Maybe you'll need to install VS Redist.
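
For reference, a quick check of whether the runtime named in that log is resolvable at all from Python (a minimal sketch; vcomp140.dll is the MSVC OpenMP runtime installed system-wide by the Visual C++ Redistributable):

# Sketch: see whether vcomp140.dll (MSVC OpenMP runtime from the VC Redistributable)
# can be resolved by this Python process.
import ctypes
try:
    ctypes.CDLL("vcomp140.dll")
    print("vcomp140.dll resolved")
except OSError as exc:
    print("vcomp140.dll NOT resolvable:", exc)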

@ekdnam

ekdnam commented Sep 6, 2020

You may resolve this by doing pip install intel-openmp.

This worked for me. Thanks a lot!

@LJHG

LJHG commented Oct 19, 2020

@girishnjha VCOMP140.dll is missing. Maybe you'll need to install VS Redist.

It's working. Thanks!

@skyline75489
Contributor

If adding vcruntime-related DLLs does not fix this for you, consider deleting caffe2_detectron_ops.dll if your build has CUDA support.

At first I was building without CUDA, then with CUDA. Because I was using the same conda environment, this left me with both caffe2_detectron_ops.dll and caffe2_detectron_ops_gpu.dll, and that got me into DLL loading trouble:

Error loading "C:\tools\Anaconda3\envs\pytorch-build-py36\lib\site-packages\torch\lib\caffe2_detectron_ops.dll" or one of its dependencies.

Deleting caffe2_detectron_ops.dll fixed this for me, but I don't understand the root cause of this issue. It seems to me that on a PyTorch build with CUDA, caffe2_detectron_ops.dll should not be loaded at all.
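
A small helper in the spirit of that workaround (a sketch only, not part of PyTorch; it locates the installed torch package without importing it, since importing is exactly what fails here, and reports whether both the CPU and GPU variants ended up side by side):

# Sketch: detect the leftover-CPU-DLL situation described above without importing torch.
import importlib.util
import os

spec = importlib.util.find_spec("torch")              # finds torch without executing it
lib_dir = os.path.join(os.path.dirname(spec.origin), "lib")
cpu_dll = os.path.join(lib_dir, "caffe2_detectron_ops.dll")
gpu_dll = os.path.join(lib_dir, "caffe2_detectron_ops_gpu.dll")
if os.path.exists(cpu_dll) and os.path.exists(gpu_dll):
    print("Both variants present; the stale CPU DLL can be deleted manually:", cpu_dll)
else:
    print("No CPU/GPU conflict in", lib_dir)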

@Siddhr90

Thank you @skyline75489, deleting caffe2_detectron_ops.dll worked for me.

@TheSeeker23

Thanks @skyline75489, this worked for me!

@ilyak93

ilyak93 commented Apr 14, 2021

@skyline75489, I used that as well; it made the problem disappear, but I got another error (I'm trying to install the NVIDIA Apex package, and this problem appeared during that process):

Any clues on how to fix it?

(GAIN2) C:\Users\Student1\PycharmProjects\GAIN2\apex\apex>pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
C:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\pip\_internal\commands\install.py:230: UserWarning: Disabling all use of wheels due to the use of --build-option / --global-option / --install-option.
cmdoptions.check_install_build_global(options)
Using pip 21.0.1 from C:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\pip (python 3.6)
Non-user install because site-packages writeable
Created temporary directory: C:\Users\Student1\AppData\Local\Temp\2\pip-ephem-wheel-cache-x4uye1ij
Created temporary directory: C:\Users\Student1\AppData\Local\Temp\2\pip-req-tracker-r1hffaop
Initialized build tracking at C:\Users\Student1\AppData\Local\Temp\2\pip-req-tracker-r1hffaop
Created build tracker: C:\Users\Student1\AppData\Local\Temp\2\pip-req-tracker-r1hffaop
Entered build tracker: C:\Users\Student1\AppData\Local\Temp\2\pip-req-tracker-r1hffaop
Created temporary directory: C:\Users\Student1\AppData\Local\Temp\2\pip-install-wt_qv_ie
Processing c:\users\student1\pycharmprojects\gain2\apex\apex
Created temporary directory: C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9
Added file:///C:/Users/Student1/PycharmProjects/GAIN2/apex/apex to build tracker 'C:\Users\Student1\AppData\Local\Temp\2\pip-req-tracker-r1hffaop'
Running setup.py (path:C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\setup.py) egg_info for package from file:///C:/Users/Student1/PycharmProjects/GAIN2/apex/apex
Created temporary directory: C:\Users\Student1\AppData\Local\Temp\2\pip-pip-egg-info-u6r1s3md
Running command python setup.py egg_info

torch.__version__  = 1.7.1


running egg_info
creating C:\Users\Student1\AppData\Local\Temp\2\pip-pip-egg-info-u6r1s3md\apex.egg-info
writing C:\Users\Student1\AppData\Local\Temp\2\pip-pip-egg-info-u6r1s3md\apex.egg-info\PKG-INFO
writing dependency_links to C:\Users\Student1\AppData\Local\Temp\2\pip-pip-egg-info-u6r1s3md\apex.egg-info\dependency_links.txt
writing top-level names to C:\Users\Student1\AppData\Local\Temp\2\pip-pip-egg-info-u6r1s3md\apex.egg-info\top_level.txt
writing manifest file 'C:\Users\Student1\AppData\Local\Temp\2\pip-pip-egg-info-u6r1s3md\apex.egg-info\SOURCES.txt'
reading manifest file 'C:\Users\Student1\AppData\Local\Temp\2\pip-pip-egg-info-u6r1s3md\apex.egg-info\SOURCES.txt'
writing manifest file 'C:\Users\Student1\AppData\Local\Temp\2\pip-pip-egg-info-u6r1s3md\apex.egg-info\SOURCES.txt'
C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\setup.py:67: UserWarning: Option --pyprof not specified. Not installing PyProf dependencies!
  warnings.warn("Option --pyprof not specified. Not installing PyProf dependencies!")

Source in c:\users\student1\appdata\local\temp\2\pip-req-build-6rzo82d9 has version 0.1, which satisfies requirement apex==0.1 from file:///C:/Users/Student1/PycharmProjects/GAIN2/apex/apex
Removed apex==0.1 from file:///C:/Users/Student1/PycharmProjects/GAIN2/apex/apex from build tracker 'C:\Users\Student1\AppData\Local\Temp\2\pip-req-tracker-r1hffaop'
Created temporary directory: C:\Users\Student1\AppData\Local\Temp\2\pip-unpack-y0zjk7g5
Skipping wheel build for apex, due to binaries being disabled for it.
Installing collected packages: apex
Created temporary directory: C:\Users\Student1\AppData\Local\Temp\2\pip-record-k20gx55k
Running command 'C:\Users\Student1\anaconda3\envs\GAIN2\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\setup.py'"'"'; file='"'"'C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' --cpp_ext --cuda_ext install --record 'C:\Users\Student1\AppData\Local\Temp\2\pip-record-k20gx55k\install-record.txt' --single-version-externally-managed --compile --install-headers 'C:\Users\Student1\anaconda3\envs\GAIN2\Include\apex'

torch.__version__  = 1.7.1


C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\setup.py:67: UserWarning: Option --pyprof not specified. Not installing PyProf dependencies!
  warnings.warn("Option --pyprof not specified. Not installing PyProf dependencies!")

Compiling cuda extensions with
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_19:32:27_Pacific_Daylight_Time_2019
Cuda compilation tools, release 10.2, V10.2.89
from C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2/bin

running install
running build
running build_py
creating build
creating build\lib.win-amd64-3.6
creating build\lib.win-amd64-3.6\apex
copying apex\__init__.py -> build\lib.win-amd64-3.6\apex
creating build\lib.win-amd64-3.6\apex\amp
copying apex\amp\amp.py -> build\lib.win-amd64-3.6\apex\amp
copying apex\amp\compat.py -> build\lib.win-amd64-3.6\apex\amp
copying apex\amp\frontend.py -> build\lib.win-amd64-3.6\apex\amp
copying apex\amp\handle.py -> build\lib.win-amd64-3.6\apex\amp
copying apex\amp\opt.py -> build\lib.win-amd64-3.6\apex\amp
copying apex\amp\rnn_compat.py -> build\lib.win-amd64-3.6\apex\amp
copying apex\amp\scaler.py -> build\lib.win-amd64-3.6\apex\amp
copying apex\amp\utils.py -> build\lib.win-amd64-3.6\apex\amp
copying apex\amp\wrap.py -> build\lib.win-amd64-3.6\apex\amp
copying apex\amp\_amp_state.py -> build\lib.win-amd64-3.6\apex\amp
copying apex\amp\_initialize.py -> build\lib.win-amd64-3.6\apex\amp
copying apex\amp\_process_optimizer.py -> build\lib.win-amd64-3.6\apex\amp
copying apex\amp\__init__.py -> build\lib.win-amd64-3.6\apex\amp
copying apex\amp\__version__.py -> build\lib.win-amd64-3.6\apex\amp
creating build\lib.win-amd64-3.6\apex\contrib
copying apex\contrib\__init__.py -> build\lib.win-amd64-3.6\apex\contrib
creating build\lib.win-amd64-3.6\apex\fp16_utils
copying apex\fp16_utils\fp16util.py -> build\lib.win-amd64-3.6\apex\fp16_utils
copying apex\fp16_utils\fp16_optimizer.py -> build\lib.win-amd64-3.6\apex\fp16_utils
copying apex\fp16_utils\loss_scaler.py -> build\lib.win-amd64-3.6\apex\fp16_utils
copying apex\fp16_utils\__init__.py -> build\lib.win-amd64-3.6\apex\fp16_utils
creating build\lib.win-amd64-3.6\apex\mlp
copying apex\mlp\mlp.py -> build\lib.win-amd64-3.6\apex\mlp
copying apex\mlp\__init__.py -> build\lib.win-amd64-3.6\apex\mlp
creating build\lib.win-amd64-3.6\apex\multi_tensor_apply
copying apex\multi_tensor_apply\multi_tensor_apply.py -> build\lib.win-amd64-3.6\apex\multi_tensor_apply
copying apex\multi_tensor_apply\__init__.py -> build\lib.win-amd64-3.6\apex\multi_tensor_apply
creating build\lib.win-amd64-3.6\apex\normalization
copying apex\normalization\fused_layer_norm.py -> build\lib.win-amd64-3.6\apex\normalization
copying apex\normalization\__init__.py -> build\lib.win-amd64-3.6\apex\normalization
creating build\lib.win-amd64-3.6\apex\optimizers
copying apex\optimizers\fused_adagrad.py -> build\lib.win-amd64-3.6\apex\optimizers
copying apex\optimizers\fused_adam.py -> build\lib.win-amd64-3.6\apex\optimizers
copying apex\optimizers\fused_lamb.py -> build\lib.win-amd64-3.6\apex\optimizers
copying apex\optimizers\fused_novograd.py -> build\lib.win-amd64-3.6\apex\optimizers
copying apex\optimizers\fused_sgd.py -> build\lib.win-amd64-3.6\apex\optimizers
copying apex\optimizers\__init__.py -> build\lib.win-amd64-3.6\apex\optimizers
creating build\lib.win-amd64-3.6\apex\parallel
copying apex\parallel\distributed.py -> build\lib.win-amd64-3.6\apex\parallel
copying apex\parallel\LARC.py -> build\lib.win-amd64-3.6\apex\parallel
copying apex\parallel\multiproc.py -> build\lib.win-amd64-3.6\apex\parallel
copying apex\parallel\optimized_sync_batchnorm.py -> build\lib.win-amd64-3.6\apex\parallel
copying apex\parallel\optimized_sync_batchnorm_kernel.py -> build\lib.win-amd64-3.6\apex\parallel
copying apex\parallel\sync_batchnorm.py -> build\lib.win-amd64-3.6\apex\parallel
copying apex\parallel\sync_batchnorm_kernel.py -> build\lib.win-amd64-3.6\apex\parallel
copying apex\parallel\__init__.py -> build\lib.win-amd64-3.6\apex\parallel
creating build\lib.win-amd64-3.6\apex\pyprof
copying apex\pyprof\__init__.py -> build\lib.win-amd64-3.6\apex\pyprof
creating build\lib.win-amd64-3.6\apex\reparameterization
copying apex\reparameterization\reparameterization.py -> build\lib.win-amd64-3.6\apex\reparameterization
copying apex\reparameterization\weight_norm.py -> build\lib.win-amd64-3.6\apex\reparameterization
copying apex\reparameterization\__init__.py -> build\lib.win-amd64-3.6\apex\reparameterization
creating build\lib.win-amd64-3.6\apex\RNN
copying apex\RNN\cells.py -> build\lib.win-amd64-3.6\apex\RNN
copying apex\RNN\models.py -> build\lib.win-amd64-3.6\apex\RNN
copying apex\RNN\RNNBackend.py -> build\lib.win-amd64-3.6\apex\RNN
copying apex\RNN\__init__.py -> build\lib.win-amd64-3.6\apex\RNN
creating build\lib.win-amd64-3.6\apex\amp\lists
copying apex\amp\lists\functional_overrides.py -> build\lib.win-amd64-3.6\apex\amp\lists
copying apex\amp\lists\tensor_overrides.py -> build\lib.win-amd64-3.6\apex\amp\lists
copying apex\amp\lists\torch_overrides.py -> build\lib.win-amd64-3.6\apex\amp\lists
copying apex\amp\lists\__init__.py -> build\lib.win-amd64-3.6\apex\amp\lists
creating build\lib.win-amd64-3.6\apex\contrib\groupbn
copying apex\contrib\groupbn\batch_norm.py -> build\lib.win-amd64-3.6\apex\contrib\groupbn
copying apex\contrib\groupbn\__init__.py -> build\lib.win-amd64-3.6\apex\contrib\groupbn
creating build\lib.win-amd64-3.6\apex\contrib\layer_norm
copying apex\contrib\layer_norm\layer_norm.py -> build\lib.win-amd64-3.6\apex\contrib\layer_norm
copying apex\contrib\layer_norm\__init__.py -> build\lib.win-amd64-3.6\apex\contrib\layer_norm
creating build\lib.win-amd64-3.6\apex\contrib\multihead_attn
copying apex\contrib\multihead_attn\encdec_multihead_attn.py -> build\lib.win-amd64-3.6\apex\contrib\multihead_attn
copying apex\contrib\multihead_attn\encdec_multihead_attn_func.py -> build\lib.win-amd64-3.6\apex\contrib\multihead_attn
copying apex\contrib\multihead_attn\fast_encdec_multihead_attn_func.py -> build\lib.win-amd64-3.6\apex\contrib\multihead_attn
copying apex\contrib\multihead_attn\fast_encdec_multihead_attn_norm_add_func.py -> build\lib.win-amd64-3.6\apex\contrib\multihead_attn
copying apex\contrib\multihead_attn\fast_self_multihead_attn_func.py -> build\lib.win-amd64-3.6\apex\contrib\multihead_attn
copying apex\contrib\multihead_attn\fast_self_multihead_attn_norm_add_func.py -> build\lib.win-amd64-3.6\apex\contrib\multihead_attn
copying apex\contrib\multihead_attn\mask_softmax_dropout_func.py -> build\lib.win-amd64-3.6\apex\contrib\multihead_attn
copying apex\contrib\multihead_attn\self_multihead_attn.py -> build\lib.win-amd64-3.6\apex\contrib\multihead_attn
copying apex\contrib\multihead_attn\self_multihead_attn_func.py -> build\lib.win-amd64-3.6\apex\contrib\multihead_attn
copying apex\contrib\multihead_attn\__init__.py -> build\lib.win-amd64-3.6\apex\contrib\multihead_attn
creating build\lib.win-amd64-3.6\apex\contrib\optimizers
copying apex\contrib\optimizers\distributed_fused_adam.py -> build\lib.win-amd64-3.6\apex\contrib\optimizers
copying apex\contrib\optimizers\distributed_fused_adam_v2.py -> build\lib.win-amd64-3.6\apex\contrib\optimizers
copying apex\contrib\optimizers\distributed_fused_adam_v3.py -> build\lib.win-amd64-3.6\apex\contrib\optimizers
copying apex\contrib\optimizers\distributed_fused_lamb.py -> build\lib.win-amd64-3.6\apex\contrib\optimizers
copying apex\contrib\optimizers\fp16_optimizer.py -> build\lib.win-amd64-3.6\apex\contrib\optimizers
copying apex\contrib\optimizers\fused_adam.py -> build\lib.win-amd64-3.6\apex\contrib\optimizers
copying apex\contrib\optimizers\fused_lamb.py -> build\lib.win-amd64-3.6\apex\contrib\optimizers
copying apex\contrib\optimizers\fused_sgd.py -> build\lib.win-amd64-3.6\apex\contrib\optimizers
copying apex\contrib\optimizers\__init__.py -> build\lib.win-amd64-3.6\apex\contrib\optimizers
creating build\lib.win-amd64-3.6\apex\contrib\sparsity
copying apex\contrib\sparsity\asp.py -> build\lib.win-amd64-3.6\apex\contrib\sparsity
copying apex\contrib\sparsity\sparse_masklib.py -> build\lib.win-amd64-3.6\apex\contrib\sparsity
copying apex\contrib\sparsity\__init__.py -> build\lib.win-amd64-3.6\apex\contrib\sparsity
creating build\lib.win-amd64-3.6\apex\contrib\transducer
copying apex\contrib\transducer\transducer.py -> build\lib.win-amd64-3.6\apex\contrib\transducer
copying apex\contrib\transducer\__init__.py -> build\lib.win-amd64-3.6\apex\contrib\transducer
creating build\lib.win-amd64-3.6\apex\contrib\xentropy
copying apex\contrib\xentropy\softmax_xentropy.py -> build\lib.win-amd64-3.6\apex\contrib\xentropy
copying apex\contrib\xentropy\__init__.py -> build\lib.win-amd64-3.6\apex\contrib\xentropy
creating build\lib.win-amd64-3.6\apex\pyprof\nvtx
copying apex\pyprof\nvtx\nvmarker.py -> build\lib.win-amd64-3.6\apex\pyprof\nvtx
copying apex\pyprof\nvtx\__init__.py -> build\lib.win-amd64-3.6\apex\pyprof\nvtx
creating build\lib.win-amd64-3.6\apex\pyprof\parse
copying apex\pyprof\parse\db.py -> build\lib.win-amd64-3.6\apex\pyprof\parse
copying apex\pyprof\parse\kernel.py -> build\lib.win-amd64-3.6\apex\pyprof\parse
copying apex\pyprof\parse\nvvp.py -> build\lib.win-amd64-3.6\apex\pyprof\parse
copying apex\pyprof\parse\parse.py -> build\lib.win-amd64-3.6\apex\pyprof\parse
copying apex\pyprof\parse\__init__.py -> build\lib.win-amd64-3.6\apex\pyprof\parse
copying apex\pyprof\parse\__main__.py -> build\lib.win-amd64-3.6\apex\pyprof\parse
creating build\lib.win-amd64-3.6\apex\pyprof\prof
copying apex\pyprof\prof\activation.py -> build\lib.win-amd64-3.6\apex\pyprof\prof
copying apex\pyprof\prof\base.py -> build\lib.win-amd64-3.6\apex\pyprof\prof
copying apex\pyprof\prof\blas.py -> build\lib.win-amd64-3.6\apex\pyprof\prof
copying apex\pyprof\prof\conv.py -> build\lib.win-amd64-3.6\apex\pyprof\prof
copying apex\pyprof\prof\convert.py -> build\lib.win-amd64-3.6\apex\pyprof\prof
copying apex\pyprof\prof\data.py -> build\lib.win-amd64-3.6\apex\pyprof\prof
copying apex\pyprof\prof\dropout.py -> build\lib.win-amd64-3.6\apex\pyprof\prof
copying apex\pyprof\prof\embedding.py -> build\lib.win-amd64-3.6\apex\pyprof\prof
copying apex\pyprof\prof\index_slice_join_mutate.py -> build\lib.win-amd64-3.6\apex\pyprof\prof
copying apex\pyprof\prof\linear.py -> build\lib.win-amd64-3.6\apex\pyprof\prof
copying apex\pyprof\prof\loss.py -> build\lib.win-amd64-3.6\apex\pyprof\prof
copying apex\pyprof\prof\misc.py -> build\lib.win-amd64-3.6\apex\pyprof\prof
copying apex\pyprof\prof\normalization.py -> build\lib.win-amd64-3.6\apex\pyprof\prof
copying apex\pyprof\prof\optim.py -> build\lib.win-amd64-3.6\apex\pyprof\prof
copying apex\pyprof\prof\output.py -> build\lib.win-amd64-3.6\apex\pyprof\prof
copying apex\pyprof\prof\pointwise.py -> build\lib.win-amd64-3.6\apex\pyprof\prof
copying apex\pyprof\prof\pooling.py -> build\lib.win-amd64-3.6\apex\pyprof\prof
copying apex\pyprof\prof\prof.py -> build\lib.win-amd64-3.6\apex\pyprof\prof
copying apex\pyprof\prof\randomSample.py -> build\lib.win-amd64-3.6\apex\pyprof\prof
copying apex\pyprof\prof\recurrentCell.py -> build\lib.win-amd64-3.6\apex\pyprof\prof
copying apex\pyprof\prof\reduction.py -> build\lib.win-amd64-3.6\apex\pyprof\prof
copying apex\pyprof\prof\softmax.py -> build\lib.win-amd64-3.6\apex\pyprof\prof
copying apex\pyprof\prof\usage.py -> build\lib.win-amd64-3.6\apex\pyprof\prof
copying apex\pyprof\prof\utility.py -> build\lib.win-amd64-3.6\apex\pyprof\prof
copying apex\pyprof\prof\__init__.py -> build\lib.win-amd64-3.6\apex\pyprof\prof
copying apex\pyprof\prof\__main__.py -> build\lib.win-amd64-3.6\apex\pyprof\prof
running build_ext
C:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\utils\cpp_extension.py:287: UserWarning: Error checking compiler version for cl: [WinError 2] The system cannot find the file specified
  warnings.warn('Error checking compiler version for {}: {}'.format(compiler, error))
building 'apex_C' extension
creating C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\build\temp.win-amd64-3.6
creating C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\build\temp.win-amd64-3.6\Release
creating C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\build\temp.win-amd64-3.6\Release\csrc
Emitting ninja build file C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\build\temp.win-amd64-3.6\Release\build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/1] cl /showIncludes /nologo /Ox /W3 /GL /DNDEBUG /MD /MD /wd4819 /wd4251 /wd4244 /wd4267 /wd4275 /wd4018 /wd4190 /EHsc -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include\TH -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include\THC -IC:\Users\Student1\anaconda3\envs\GAIN2\include -IC:\Users\Student1\anaconda3\envs\GAIN2\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29910\ATLMFC\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29910\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\cppwinrt" -c C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\csrc\flatten_unflatten.cpp /FoC:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\build\temp.win-amd64-3.6\Release\csrc/flatten_unflatten.obj -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=apex_C -D_GLIBCXX_USE_CXX11_ABI=0 /std:c++14
C:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include\ATen/core/ivalue_inl.h(389): warning C4101: 'e': unreferenced local variable
C:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include\pybind11\detail/common.h(106): warning C4005: 'HAVE_SNPRINTF': macro redefinition
C:\Users\Student1\anaconda3\envs\GAIN2\include\pyerrors.h(489): note: see previous definition of 'HAVE_SNPRINTF'
C:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include\torch/csrc/utils/tensor_flatten.h(36): warning C4996: 'at::Tensor::type': Tensor.type() is deprecated. Instead use Tensor.options(), which in many cases (e.g. in a constructor) is a drop-in replacement. If you were using data from type(), that is now available from Tensor itself, so instead of tensor.type().scalar_type(), use tensor.scalar_type() instead and instead of tensor.type().backend() use tensor.device().
C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29910\bin\HostX86\x64\link.exe /nologo /INCREMENTAL:NO /LTCG /DLL /MANIFEST:EMBED,ID=2 /MANIFESTUAC:NO /LIBPATH:C:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\lib /LIBPATH:C:\Users\Student1\anaconda3\envs\GAIN2\libs /LIBPATH:C:\Users\Student1\anaconda3\envs\GAIN2\PCbuild\amd64 "/LIBPATH:C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29910\ATLMFC\lib\x64" "/LIBPATH:C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29910\lib\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.19041.0\ucrt\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.19041.0\um\x64" c10.lib torch.lib torch_cpu.lib torch_python.lib /EXPORT:PyInit_apex_C C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\build\temp.win-amd64-3.6\Release\csrc/flatten_unflatten.obj /OUT:build\lib.win-amd64-3.6\apex_C.cp36-win_amd64.pyd /IMPLIB:C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\build\temp.win-amd64-3.6\Release\csrc\apex_C.cp36-win_amd64.lib
   Creating library C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\build\temp.win-amd64-3.6\Release\csrc\apex_C.cp36-win_amd64.lib and object C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\build\temp.win-amd64-3.6\Release\csrc\apex_C.cp36-win_amd64.exp
Generating code
Finished generating code
building 'amp_C' extension
Emitting ninja build file C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\build\temp.win-amd64-3.6\Release\build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/11] C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\bin\nvcc --use-local-env -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include\TH -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\include" -IC:\Users\Student1\anaconda3\envs\GAIN2\include -IC:\Users\Student1\anaconda3\envs\GAIN2\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29910\ATLMFC\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29910\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\cppwinrt" -c C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\csrc\multi_tensor_l2norm_kernel.cu -o C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\build\temp.win-amd64-3.6\Release\csrc/multi_tensor_l2norm_kernel.obj -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -lineinfo -O3 --use_fast_math -DVERSION_GE_1_1 -DVERSION_GE_1_3 -DVERSION_GE_1_5 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=amp_C -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=sm_75
FAILED: C:/Users/Student1/AppData/Local/Temp/2/pip-req-build-6rzo82d9/build/temp.win-amd64-3.6/Release/csrc/multi_tensor_l2norm_kernel.obj
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\bin\nvcc --use-local-env -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include\TH -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\include" -IC:\Users\Student1\anaconda3\envs\GAIN2\include -IC:\Users\Student1\anaconda3\envs\GAIN2\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29910\ATLMFC\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29910\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\cppwinrt" -c C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\csrc\multi_tensor_l2norm_kernel.cu -o C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\build\temp.win-amd64-3.6\Release\csrc/multi_tensor_l2norm_kernel.obj -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -lineinfo -O3 --use_fast_math -DVERSION_GE_1_1 -DVERSION_GE_1_3 -DVERSION_GE_1_5 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=amp_C -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=sm_75
C:/Users/Student1/anaconda3/envs/GAIN2/lib/site-packages/torch/include\c10/util/ThreadLocalDebugInfo.h(12): warning: modifier is ignored on an enum specifier

C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.28.29910/include\type_traits(1066): error: static assertion failed with "You've instantiated std::aligned_storage<Len, Align> with an extended alignment (in other words, Align > alignof(max_align_t)). Before VS 2017 15.8, the member "type" would non-conformingly have an alignment of only alignof(max_align_t). VS 2017 15.8 was fixed to handle this correctly, but the fix inherently changes layout and breaks binary compatibility (*only* for uses of aligned_storage with extended alignments). Please define either (1) _ENABLE_EXTENDED_ALIGNED_STORAGE to acknowledge that you understand this message and that you actually want a type with an extended alignment, or (2) _DISABLE_EXTENDED_ALIGNED_STORAGE to silence this message and get the old non-conforming behavior."
          detected during:
            instantiation of class "std::_Aligned<_Len, _Align, double, false> [with _Len=16ULL, _Align=16ULL]"
(1086): here
            instantiation of class "std::_Aligned<_Len, _Align, int, false> [with _Len=16ULL, _Align=16ULL]"
(1093): here
            instantiation of class "std::_Aligned<_Len, _Align, short, false> [with _Len=16ULL, _Align=16ULL]"
(1100): here
            instantiation of class "std::_Aligned<_Len, _Align, char, false> [with _Len=16ULL, _Align=16ULL]"
(1107): here
            instantiation of class "std::aligned_storage<_Len, _Align> [with _Len=16ULL, _Align=16ULL]"
C:/Users/Student1/AppData/Local/Temp/2/pip-req-build-6rzo82d9/csrc/multi_tensor_l2norm_kernel.cu(24): here
            instantiation of "void load_store(T *, T *, int, int) [with T=float]"
C:/Users/Student1/AppData/Local/Temp/2/pip-req-build-6rzo82d9/csrc/multi_tensor_l2norm_kernel.cu(69): here
            instantiation of "void L2NormFunctor<x_t>::operator()(int, volatile int *, TensorListMetadata<1> &, float *, float *, __nv_bool, int) [with x_t=float]"
C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\csrc\multi_tensor_apply.cuh(38): here
            instantiation of "void multi_tensor_apply_kernel(int, volatile int *, T, U, ArgTypes...) [with T=TensorListMetadata<1>, U=L2NormFunctor<float>, ArgTypes=<float *, float *, __nv_bool, int>]"
C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\csrc\multi_tensor_apply.cuh(109): here
            instantiation of "void multi_tensor_apply<depth,T,ArgTypes...>(int, int, const at::Tensor &, const std::vector<std::vector<at::Tensor, std::allocator<at::Tensor>>, std::allocator<std::vector<at::Tensor, std::allocator<at::Tensor>>>> &, T, ArgTypes...) [with depth=1, T=L2NormFunctor<float>, ArgTypes=<float *, float *, __nv_bool, int>]"
C:/Users/Student1/AppData/Local/Temp/2/pip-req-build-6rzo82d9/csrc/multi_tensor_l2norm_kernel.cu(326): here

1 error detected in the compilation of "C:/Users/Student1/AppData/Local/Temp/2/tmpxft_00001ff0_00000000-7_multi_tensor_l2norm_kernel.cpp1.ii".
multi_tensor_l2norm_kernel.cu
[2/11] C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\bin\nvcc --use-local-env -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include\TH -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\include" -IC:\Users\Student1\anaconda3\envs\GAIN2\include -IC:\Users\Student1\anaconda3\envs\GAIN2\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29910\ATLMFC\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29910\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\cppwinrt" -c C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\csrc\multi_tensor_axpby_kernel.cu -o C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\build\temp.win-amd64-3.6\Release\csrc/multi_tensor_axpby_kernel.obj -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -lineinfo -O3 --use_fast_math -DVERSION_GE_1_1 -DVERSION_GE_1_3 -DVERSION_GE_1_5 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=amp_C -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=sm_75
FAILED: C:/Users/Student1/AppData/Local/Temp/2/pip-req-build-6rzo82d9/build/temp.win-amd64-3.6/Release/csrc/multi_tensor_axpby_kernel.obj
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\bin\nvcc --use-local-env -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include\TH -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\include" -IC:\Users\Student1\anaconda3\envs\GAIN2\include -IC:\Users\Student1\anaconda3\envs\GAIN2\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29910\ATLMFC\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29910\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\cppwinrt" -c C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\csrc\multi_tensor_axpby_kernel.cu -o C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\build\temp.win-amd64-3.6\Release\csrc/multi_tensor_axpby_kernel.obj -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -lineinfo -O3 --use_fast_math -DVERSION_GE_1_1 -DVERSION_GE_1_3 -DVERSION_GE_1_5 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=amp_C -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=sm_75
C:/Users/Student1/anaconda3/envs/GAIN2/lib/site-packages/torch/include\c10/util/ThreadLocalDebugInfo.h(12): warning: modifier is ignored on an enum specifier

C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.28.29910/include\type_traits(1066): error: static assertion failed with "You've instantiated std::aligned_storage<Len, Align> with an extended alignment (in other words, Align > alignof(max_align_t)). Before VS 2017 15.8, the member "type" would non-conformingly have an alignment of only alignof(max_align_t). VS 2017 15.8 was fixed to handle this correctly, but the fix inherently changes layout and breaks binary compatibility (*only* for uses of aligned_storage with extended alignments). Please define either (1) _ENABLE_EXTENDED_ALIGNED_STORAGE to acknowledge that you understand this message and that you actually want a type with an extended alignment, or (2) _DISABLE_EXTENDED_ALIGNED_STORAGE to silence this message and get the old non-conforming behavior."
          detected during:
            instantiation of class "std::_Aligned<_Len, _Align, double, false> [with _Len=16ULL, _Align=16ULL]"
(1086): here
            instantiation of class "std::_Aligned<_Len, _Align, int, false> [with _Len=16ULL, _Align=16ULL]"
(1093): here
            instantiation of class "std::_Aligned<_Len, _Align, short, false> [with _Len=16ULL, _Align=16ULL]"
(1100): here
            instantiation of class "std::_Aligned<_Len, _Align, char, false> [with _Len=16ULL, _Align=16ULL]"
(1107): here
            instantiation of class "std::aligned_storage<_Len, _Align> [with _Len=16ULL, _Align=16ULL]"
C:/Users/Student1/AppData/Local/Temp/2/pip-req-build-6rzo82d9/csrc/multi_tensor_axpby_kernel.cu(23): here
            instantiation of "void load_store(T *, T *, int, int) [with T=float]"
C:/Users/Student1/AppData/Local/Temp/2/pip-req-build-6rzo82d9/csrc/multi_tensor_axpby_kernel.cu(68): here
            instantiation of "void AxpbyFunctor<x_t, y_t, out_t>::operator()(int, volatile int *, TensorListMetadata<3> &, float, float, int) [with x_t=float, y_t=float, out_t=float]"
C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\csrc\multi_tensor_apply.cuh(38): here
            instantiation of "void multi_tensor_apply_kernel(int, volatile int *, T, U, ArgTypes...) [with T=TensorListMetadata<3>, U=AxpbyFunctor<float, float, float>, ArgTypes=<float, float, int>]"
C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\csrc\multi_tensor_apply.cuh(109): here
            instantiation of "void multi_tensor_apply<depth,T,ArgTypes...>(int, int, const at::Tensor &, const std::vector<std::vector<at::Tensor, std::allocator<at::Tensor>>, std::allocator<std::vector<at::Tensor, std::allocator<at::Tensor>>>> &, T, ArgTypes...) [with depth=3, T=AxpbyFunctor<float, float, float>, ArgTypes=<float, float, int>]"
C:/Users/Student1/AppData/Local/Temp/2/pip-req-build-6rzo82d9/csrc/multi_tensor_axpby_kernel.cu(141): here

1 error detected in the compilation of "C:/Users/Student1/AppData/Local/Temp/2/tmpxft_00000f84_00000000-7_multi_tensor_axpby_kernel.cpp1.ii".
multi_tensor_axpby_kernel.cu
[3/11] C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\bin\nvcc --use-local-env -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include\TH -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\include" -IC:\Users\Student1\anaconda3\envs\GAIN2\include -IC:\Users\Student1\anaconda3\envs\GAIN2\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29910\ATLMFC\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29910\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\cppwinrt" -c C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\csrc\multi_tensor_scale_kernel.cu -o C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\build\temp.win-amd64-3.6\Release\csrc/multi_tensor_scale_kernel.obj -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -lineinfo -O3 --use_fast_math -DVERSION_GE_1_1 -DVERSION_GE_1_3 -DVERSION_GE_1_5 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=amp_C -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=sm_75
FAILED: C:/Users/Student1/AppData/Local/Temp/2/pip-req-build-6rzo82d9/build/temp.win-amd64-3.6/Release/csrc/multi_tensor_scale_kernel.obj
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\bin\nvcc --use-local-env -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include\TH -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\include" -IC:\Users\Student1\anaconda3\envs\GAIN2\include -IC:\Users\Student1\anaconda3\envs\GAIN2\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29910\ATLMFC\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29910\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\cppwinrt" -c C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\csrc\multi_tensor_scale_kernel.cu -o C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\build\temp.win-amd64-3.6\Release\csrc/multi_tensor_scale_kernel.obj -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -lineinfo -O3 --use_fast_math -DVERSION_GE_1_1 -DVERSION_GE_1_3 -DVERSION_GE_1_5 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=amp_C -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=sm_75
C:/Users/Student1/anaconda3/envs/GAIN2/lib/site-packages/torch/include\c10/util/ThreadLocalDebugInfo.h(12): warning: modifier is ignored on an enum specifier

C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.28.29910/include\type_traits(1066): error: static assertion failed with "You've instantiated std::aligned_storage<Len, Align> with an extended alignment (in other words, Align > alignof(max_align_t)). Before VS 2017 15.8, the member "type" would non-conformingly have an alignment of only alignof(max_align_t). VS 2017 15.8 was fixed to handle this correctly, but the fix inherently changes layout and breaks binary compatibility (*only* for uses of aligned_storage with extended alignments). Please define either (1) _ENABLE_EXTENDED_ALIGNED_STORAGE to acknowledge that you understand this message and that you actually want a type with an extended alignment, or (2) _DISABLE_EXTENDED_ALIGNED_STORAGE to silence this message and get the old non-conforming behavior."
          detected during:
            instantiation of class "std::_Aligned<_Len, _Align, double, false> [with _Len=16ULL, _Align=16ULL]"
(1086): here
            instantiation of class "std::_Aligned<_Len, _Align, int, false> [with _Len=16ULL, _Align=16ULL]"
(1093): here
            instantiation of class "std::_Aligned<_Len, _Align, short, false> [with _Len=16ULL, _Align=16ULL]"
(1100): here
            instantiation of class "std::_Aligned<_Len, _Align, char, false> [with _Len=16ULL, _Align=16ULL]"
(1107): here
            instantiation of class "std::aligned_storage<_Len, _Align> [with _Len=16ULL, _Align=16ULL]"
C:/Users/Student1/AppData/Local/Temp/2/pip-req-build-6rzo82d9/csrc/multi_tensor_scale_kernel.cu(25): here
            instantiation of "void load_store(T *, T *, int, int) [with T=float]"
C:/Users/Student1/AppData/Local/Temp/2/pip-req-build-6rzo82d9/csrc/multi_tensor_scale_kernel.cu(64): here
            instantiation of "void ScaleFunctor<in_t, out_t>::operator()(int, volatile int *, TensorListMetadata<2> &, float) [with in_t=float, out_t=float]"
C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\csrc\multi_tensor_apply.cuh(38): here
            instantiation of "void multi_tensor_apply_kernel(int, volatile int *, T, U, ArgTypes...) [with T=TensorListMetadata<2>, U=ScaleFunctor<float, float>, ArgTypes=<float>]"
C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\csrc\multi_tensor_apply.cuh(109): here
            instantiation of "void multi_tensor_apply<depth,T,ArgTypes...>(int, int, const at::Tensor &, const std::vector<std::vector<at::Tensor, std::allocator<at::Tensor>>, std::allocator<std::vector<at::Tensor, std::allocator<at::Tensor>>>> &, T, ArgTypes...) [with depth=2, T=ScaleFunctor<float, float>, ArgTypes=<float>]"
C:/Users/Student1/AppData/Local/Temp/2/pip-req-build-6rzo82d9/csrc/multi_tensor_scale_kernel.cu(124): here

1 error detected in the compilation of "C:/Users/Student1/AppData/Local/Temp/2/tmpxft_00001f98_00000000-7_multi_tensor_scale_kernel.cpp1.ii".
multi_tensor_scale_kernel.cu
[4/11] C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\bin\nvcc --use-local-env -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include\TH -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\include" -IC:\Users\Student1\anaconda3\envs\GAIN2\include -IC:\Users\Student1\anaconda3\envs\GAIN2\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29910\ATLMFC\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29910\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\cppwinrt" -c C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\csrc\multi_tensor_lamb.cu -o C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\build\temp.win-amd64-3.6\Release\csrc/multi_tensor_lamb.obj -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -lineinfo -O3 --use_fast_math -DVERSION_GE_1_1 -DVERSION_GE_1_3 -DVERSION_GE_1_5 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=amp_C -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=sm_75
FAILED: C:/Users/Student1/AppData/Local/Temp/2/pip-req-build-6rzo82d9/build/temp.win-amd64-3.6/Release/csrc/multi_tensor_lamb.obj
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\bin\nvcc --use-local-env -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include\TH -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\include" -IC:\Users\Student1\anaconda3\envs\GAIN2\include -IC:\Users\Student1\anaconda3\envs\GAIN2\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29910\ATLMFC\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29910\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\cppwinrt" -c C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\csrc\multi_tensor_lamb.cu -o C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\build\temp.win-amd64-3.6\Release\csrc/multi_tensor_lamb.obj -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -lineinfo -O3 --use_fast_math -DVERSION_GE_1_1 -DVERSION_GE_1_3 -DVERSION_GE_1_5 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=amp_C -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=sm_75
C:/Users/Student1/anaconda3/envs/GAIN2/lib/site-packages/torch/include\c10/util/ThreadLocalDebugInfo.h(12): warning: modifier is ignored on an enum specifier

C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.28.29910/include\type_traits(1066): error: static assertion failed with "You've instantiated std::aligned_storage<Len, Align> with an extended alignment (in other words, Align > alignof(max_align_t)). Before VS 2017 15.8, the member "type" would non-conformingly have an alignment of only alignof(max_align_t). VS 2017 15.8 was fixed to handle this correctly, but the fix inherently changes layout and breaks binary compatibility (*only* for uses of aligned_storage with extended alignments). Please define either (1) _ENABLE_EXTENDED_ALIGNED_STORAGE to acknowledge that you understand this message and that you actually want a type with an extended alignment, or (2) _DISABLE_EXTENDED_ALIGNED_STORAGE to silence this message and get the old non-conforming behavior."
          detected during:
            instantiation of class "std::_Aligned<_Len, _Align, double, false> [with _Len=16ULL, _Align=16ULL]"
(1086): here
            instantiation of class "std::_Aligned<_Len, _Align, int, false> [with _Len=16ULL, _Align=16ULL]"
(1093): here
            instantiation of class "std::_Aligned<_Len, _Align, short, false> [with _Len=16ULL, _Align=16ULL]"
(1100): here
            instantiation of class "std::_Aligned<_Len, _Align, char, false> [with _Len=16ULL, _Align=16ULL]"
(1107): here
            instantiation of class "std::aligned_storage<_Len, _Align> [with _Len=16ULL, _Align=16ULL]"
C:/Users/Student1/AppData/Local/Temp/2/pip-req-build-6rzo82d9/csrc/multi_tensor_lamb.cu(23): here
            instantiation of "void load_store(T *, T *, int, int) [with T=float]"
C:/Users/Student1/AppData/Local/Temp/2/pip-req-build-6rzo82d9/csrc/multi_tensor_lamb.cu(101): here
            instantiation of "void LAMBStage1Functor<T>::operator()(int, volatile int *, TensorListMetadata<4> &, float, float, float, float, float, float, adamMode_t, float, const float *, float) [with T=float]"
C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\csrc\multi_tensor_apply.cuh(38): here
            instantiation of "void multi_tensor_apply_kernel(int, volatile int *, T, U, ArgTypes...) [with T=TensorListMetadata<4>, U=LAMBStage1Functor<float>, ArgTypes=<float, float, float, float, float, float, adamMode_t, float, float *, float>]"
C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\csrc\multi_tensor_apply.cuh(109): here
            instantiation of "void multi_tensor_apply<depth,T,ArgTypes...>(int, int, const at::Tensor &, const std::vector<std::vector<at::Tensor, std::allocator<at::Tensor>>, std::allocator<std::vector<at::Tensor, std::allocator<at::Tensor>>>> &, T, ArgTypes...) [with depth=4, T=LAMBStage1Functor<float>, ArgTypes=<float, float, float, float, float, float, adamMode_t, float, float *, float>]"
C:/Users/Student1/AppData/Local/Temp/2/pip-req-build-6rzo82d9/csrc/multi_tensor_lamb.cu(375): here

1 error detected in the compilation of "C:/Users/Student1/AppData/Local/Temp/2/tmpxft_00000e14_00000000-7_multi_tensor_lamb.cpp1.ii".
multi_tensor_lamb.cu
[5/11] C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\bin\nvcc --use-local-env -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include\TH -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\include" -IC:\Users\Student1\anaconda3\envs\GAIN2\include -IC:\Users\Student1\anaconda3\envs\GAIN2\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29910\ATLMFC\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29910\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\cppwinrt" -c C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\csrc\multi_tensor_sgd_kernel.cu -o C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\build\temp.win-amd64-3.6\Release\csrc/multi_tensor_sgd_kernel.obj -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -lineinfo -O3 --use_fast_math -DVERSION_GE_1_1 -DVERSION_GE_1_3 -DVERSION_GE_1_5 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=amp_C -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=sm_75
C:/Users/Student1/anaconda3/envs/GAIN2/lib/site-packages/torch/include\c10/util/ThreadLocalDebugInfo.h(12): warning: modifier is ignored on an enum specifier

C:/Users/Student1/anaconda3/envs/GAIN2/lib/site-packages/torch/include\c10/util/ThreadLocalDebugInfo.h(12): warning: modifier is ignored on an enum specifier

multi_tensor_sgd_kernel.cu
[6/11] cl /showIncludes /nologo /Ox /W3 /GL /DNDEBUG /MD /MD /wd4819 /wd4251 /wd4244 /wd4267 /wd4275 /wd4018 /wd4190 /EHsc -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include\TH -IC:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\include" -IC:\Users\Student1\anaconda3\envs\GAIN2\include -IC:\Users\Student1\anaconda3\envs\GAIN2\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29910\ATLMFC\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29910\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\cppwinrt" -c C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\csrc\amp_C_frontend.cpp /FoC:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\build\temp.win-amd64-3.6\Release\csrc/amp_C_frontend.obj -O3 -DVERSION_GE_1_1 -DVERSION_GE_1_3 -DVERSION_GE_1_5 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=amp_C -D_GLIBCXX_USE_CXX11_ABI=0 /std:c++14
cl : Command line warning D9002 : ignoring unknown option '-O3'
C:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include\ATen/core/ivalue_inl.h(389): warning C4101: 'e': unreferenced local variable
C:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\include\pybind11\detail/common.h(106): warning C4005: 'HAVE_SNPRINTF': macro redefinition
C:\Users\Student1\anaconda3\envs\GAIN2\include\pyerrors.h(489): note: see previous definition of 'HAVE_SNPRINTF'
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "C:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\utils\cpp_extension.py", line 1539, in _run_ninja_build
    env=env)
  File "C:\Users\Student1\anaconda3\envs\GAIN2\lib\subprocess.py", line 438, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\setup.py", line 496, in <module>
    extras_require=extras,
  File "C:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\setuptools\__init__.py", line 153, in setup
    return distutils.core.setup(**attrs)
  File "C:\Users\Student1\anaconda3\envs\GAIN2\lib\distutils\core.py", line 148, in setup
    dist.run_commands()
  File "C:\Users\Student1\anaconda3\envs\GAIN2\lib\distutils\dist.py", line 955, in run_commands
    self.run_command(cmd)
  File "C:\Users\Student1\anaconda3\envs\GAIN2\lib\distutils\dist.py", line 974, in run_command
    cmd_obj.run()
  File "C:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\setuptools\command\install.py", line 61, in run
    return orig.install.run(self)
  File "C:\Users\Student1\anaconda3\envs\GAIN2\lib\distutils\command\install.py", line 545, in run
    self.run_command('build')
  File "C:\Users\Student1\anaconda3\envs\GAIN2\lib\distutils\cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "C:\Users\Student1\anaconda3\envs\GAIN2\lib\distutils\dist.py", line 974, in run_command
    cmd_obj.run()
  File "C:\Users\Student1\anaconda3\envs\GAIN2\lib\distutils\command\build.py", line 135, in run
    self.run_command(cmd_name)
  File "C:\Users\Student1\anaconda3\envs\GAIN2\lib\distutils\cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "C:\Users\Student1\anaconda3\envs\GAIN2\lib\distutils\dist.py", line 974, in run_command
    cmd_obj.run()
  File "C:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\setuptools\command\build_ext.py", line 79, in run
    _build_ext.run(self)
  File "C:\Users\Student1\anaconda3\envs\GAIN2\lib\distutils\command\build_ext.py", line 339, in run
    self.build_extensions()
  File "C:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\utils\cpp_extension.py", line 670, in build_extensions
    build_ext.build_extensions(self)
  File "C:\Users\Student1\anaconda3\envs\GAIN2\lib\distutils\command\build_ext.py", line 448, in build_extensions
    self._build_extensions_serial()
  File "C:\Users\Student1\anaconda3\envs\GAIN2\lib\distutils\command\build_ext.py", line 473, in _build_extensions_serial
    self.build_extension(ext)
  File "C:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\setuptools\command\build_ext.py", line 196, in build_extension
    _build_ext.build_extension(self, ext)
  File "C:\Users\Student1\anaconda3\envs\GAIN2\lib\distutils\command\build_ext.py", line 533, in build_extension
    depends=ext.depends)
  File "C:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\utils\cpp_extension.py", line 652, in win_wrap_ninja_compile
    with_cuda=with_cuda)
  File "C:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\utils\cpp_extension.py", line 1255, in _write_ninja_file_and_compile_objects
    error_prefix='Error compiling objects for extension')
  File "C:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\torch\utils\cpp_extension.py", line 1555, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension
Running setup.py install for apex ... error

ERROR: Command errored out with exit status 1: 'C:\Users\Student1\anaconda3\envs\GAIN2\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\setup.py'"'"'; __file__='"'"'C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' --cpp_ext --cuda_ext install --record 'C:\Users\Student1\AppData\Local\Temp\2\pip-record-k20gx55k\install-record.txt' --single-version-externally-managed --compile --install-headers 'C:\Users\Student1\anaconda3\envs\GAIN2\Include\apex' Check the logs for full command output.
Exception information:
Traceback (most recent call last):
  File "C:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\pip\_internal\req\req_install.py", line 826, in install
    req_description=str(self.req),
  File "C:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\pip\_internal\operations\install\legacy.py", line 86, in install
    raise LegacyInstallFailure
pip._internal.operations.install.legacy.LegacyInstallFailure

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\pip\_internal\cli\base_command.py", line 189, in main
    status = self.run(options, args)
  File "C:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\pip\_internal\cli\req_command.py", line 178, in wrapper
    return func(self, options, args)
  File "C:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\pip\_internal\commands\install.py", line 400, in run
    pycompile=options.compile,
  File "C:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\pip\_internal\req\__init__.py", line 88, in install_given_reqs
    pycompile=pycompile,
  File "C:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\pip\_internal\req\req_install.py", line 830, in install
    six.reraise(*exc.parent)
  File "C:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\pip\_vendor\six.py", line 703, in reraise
    raise value
  File "C:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\pip\_internal\operations\install\legacy.py", line 76, in install
    cwd=unpacked_source_directory,
  File "C:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\pip\_internal\utils\subprocess.py", line 293, in runner
    spinner=spinner,
  File "C:\Users\Student1\anaconda3\envs\GAIN2\lib\site-packages\pip\_internal\utils\subprocess.py", line 258, in call_subprocess
    raise InstallationSubprocessError(proc.returncode, command_desc)
pip._internal.exceptions.InstallationSubprocessError: Command errored out with exit status 1: 'C:\Users\Student1\anaconda3\envs\GAIN2\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\setup.py'"'"'; __file__='"'"'C:\Users\Student1\AppData\Local\Temp\2\pip-req-build-6rzo82d9\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' --cpp_ext --cuda_ext install --record 'C:\Users\Student1\AppData\Local\Temp\2\pip-record-k20gx55k\install-record.txt' --single-version-externally-managed --compile --install-headers 'C:\Users\Student1\anaconda3\envs\GAIN2\Include\apex' Check the logs for full command output.
Removed build tracker: 'C:\Users\Student1\AppData\Local\Temp\2\pip-req-tracker-r1hffaop'

@albert-jin
Copy link

thank you everybody

@mche0106
Copy link

mche0106 commented Jul 27, 2021

Python version: 3.8.8
When executing "import torch" I got the same error.
This problem didn't occur before, but after installing torchvision I started getting it.
If I uninstall torchvision the problem disappears, but I need torchvision. Is there any workaround?

UPDATE:
It can be worked around by deleting caffe2_detectron_ops.dll, but then torchvision raises:
AttributeError: module 'torch' has no attribute '_utils_internal'

SOLVED:
https://stackoverflow.com/questions/64653750/pytorch-dll-issues-for-caffe2
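
For reference, the underlying cause on Python 3.8+ is that dependent DLLs are no longer resolved via PATH, which is why the file can exist yet still fail to load. Below is a minimal sketch (not taken from the linked answer) of the os.add_dll_directory workaround; the directory paths are placeholders and should point at wherever the missing dependencies, e.g. the intel-openmp DLLs, actually live on your machine:

import os

# Hypothetical locations; point these at the folders that actually contain the
# missing dependencies (e.g. libiomp5md.dll from intel-openmp) on your system.
candidate_dirs = [
    r"C:\path\to\env\Library\bin",
    r"C:\path\to\env\Lib\site-packages\torch\lib",
]

for d in candidate_dirs:
    if os.path.isdir(d):
        os.add_dll_directory(d)  # Python 3.8+: register extra DLL search directories

import torch  # caffe2_detectron_ops.dll should now be able to resolve its dependencies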

@Folkdance1526
Copy link

WOW!!! Thank you @skyline75489, it worked for me (newbie here)!

@lucasmieri
Copy link

The caffe2_detectron_ops.dll workaround fixed the issue, thanks a lot.

@MPS-MASTER
Copy link

That solves the loading problem, thanks!

@qq1243196045
Copy link

Deleting this file, without downloading anything else, solved the problem for me!

@jcwchen
Copy link
Contributor

jcwchen commented Jan 14, 2022

Just to provide another possible workaround: I still hit the same error with the latest source code after building PyTorch with cmake_generator=Visual Studio 16 2019. However, I can import torch successfully after building PyTorch with Ninja (leaving cmake_generator unset and setting USE_NINJA=ON).
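
A minimal sketch of driving that Ninja-based build from a script, assuming the PyTorch source tree is already checked out and the usual build prerequisites are installed; the repository path below is a placeholder:

import os
import subprocess

env = os.environ.copy()
env.pop("CMAKE_GENERATOR", None)  # leave the generator unset so CMake does not pick Visual Studio
env["USE_NINJA"] = "ON"           # let the PyTorch build drive compilation with ninja

# The source path is hypothetical; replace it with your PyTorch checkout.
subprocess.run(["python", "setup.py", "install"], cwd=r"C:\src\pytorch", env=env, check=True)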

@illtellyoulater
Copy link

illtellyoulater commented Feb 7, 2022

I was getting this error no matter which python version I used (3.7 or 3.8).

The problem was that my home directory contained a .condarc file which, due to previous experiments, was configured to give packages from the conda-forge channel higher priority than conda's default main channel. This caused the wrong packages to be installed when running conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch as instructed by pytorch.org.

This was a bit nasty to debug, because the .condarc file persisted even after multiple full Anaconda uninstalls and reinstalls.

Anyway, after deleting .condarc, cleaning the package cache (with conda clean --all), and installing pytorch again with the same command from pytorch.org, the error was gone.

I hope it helps!
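
If you suspect the same channel mix-up, one quick sanity check (just a sketch, not part of the original report) is to inspect the installed build's metadata once import torch works again:

import torch

print(torch.__version__)          # the installed build string
print(torch.version.cuda)         # None usually means a CPU-only package was pulled in
print(torch.cuda.is_available())  # False on a CUDA machine also hints at the wrong package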

P.S.

@PotatoLzy
Copy link

Thanks @skyline75489, I deleted caffe2_detectron_ops.dll, caffe2_module_test_dynamic.dll, and caffe2_observers.dll, and it finally works.
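
For anyone following the delete-the-DLL route, here is a hedged sketch that renames those three files instead of deleting them, so the change stays reversible; it locates the installed package without importing it, since the import itself is what fails:

import importlib.util
import os

# Locate the installed torch package without importing it (the import is what fails).
spec = importlib.util.find_spec("torch")
if spec is not None and spec.origin:
    lib_dir = os.path.join(os.path.dirname(spec.origin), "lib")
    for name in ("caffe2_detectron_ops.dll",
                 "caffe2_module_test_dynamic.dll",
                 "caffe2_observers.dll"):
        path = os.path.join(lib_dir, name)
        if os.path.exists(path):
            os.rename(path, path + ".bak")  # rename rather than delete, so it can be undone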
