CUDA out of memory on test.py #15

Closed

Xu-Justin opened this issue Apr 19, 2022 · 2 comments

Comments
@Xu-Justin
Contributor

I'm following the installation guide. When running test.py in step 4, I got RuntimeError: CUDA out of memory. Is it okay to proceed (using a smaller batch for training or inference), or will this affect performance?

* True check_forward_equal_with_pytorch_double: max_abs_err 8.67e-19 max_rel_err 2.35e-16
* True check_forward_equal_with_pytorch_float: max_abs_err 4.66e-10 max_rel_err 1.13e-07
* True check_gradient_numerical(D=30)
* True check_gradient_numerical(D=32)
* True check_gradient_numerical(D=64)
* True check_gradient_numerical(D=71)
* True check_gradient_numerical(D=1025)
Traceback (most recent call last):
  File "/home/azureuser/WilliamJustin/DAB-DETR/models/dab_deformable_detr/ops/test.py", line 86, in <module>
    check_gradient_numerical(channels, True, True, True)
  File "/home/azureuser/WilliamJustin/DAB-DETR/models/dab_deformable_detr/ops/test.py", line 76, in check_gradient_numerical
    gradok = gradcheck(func, (value.double(), shapes, level_start_index, sampling_locations.double(), attention_weights.double(), im2col_step))
  File "/home/azureuser/miniconda3/envs/jstnxu-DAB-DETR/lib/python3.9/site-packages/torch/autograd/gradcheck.py", line 1400, in gradcheck
    return _gradcheck_helper(**args)
  File "/home/azureuser/miniconda3/envs/jstnxu-DAB-DETR/lib/python3.9/site-packages/torch/autograd/gradcheck.py", line 1414, in _gradcheck_helper
    _gradcheck_real_imag(gradcheck_fn, func, func_out, tupled_inputs, outputs, eps,
  File "/home/azureuser/miniconda3/envs/jstnxu-DAB-DETR/lib/python3.9/site-packages/torch/autograd/gradcheck.py", line 1061, in _gradcheck_real_imag
    gradcheck_fn(func, func_out, tupled_inputs, outputs, eps,
  File "/home/azureuser/miniconda3/envs/jstnxu-DAB-DETR/lib/python3.9/site-packages/torch/autograd/gradcheck.py", line 1097, in _slow_gradcheck
    numerical = _transpose(_get_numerical_jacobian(func, tupled_inputs, outputs, eps=eps, is_forward_ad=use_forward_ad))
  File "/home/azureuser/miniconda3/envs/jstnxu-DAB-DETR/lib/python3.9/site-packages/torch/autograd/gradcheck.py", line 146, in _get_numerical_jacobian
    jacobians += [get_numerical_jacobian_wrt_specific_input(fn, inp_idx, inputs, outputs, eps,
  File "/home/azureuser/miniconda3/envs/jstnxu-DAB-DETR/lib/python3.9/site-packages/torch/autograd/gradcheck.py", line 290, in get_numerical_jacobian_wrt_specific_input
    return _combine_jacobian_cols(jacobian_cols, outputs, input, input.numel())
  File "/home/azureuser/miniconda3/envs/jstnxu-DAB-DETR/lib/python3.9/site-packages/torch/autograd/gradcheck.py", line 230, in _combine_jacobian_cols
    jacobians = _allocate_jacobians_with_outputs(outputs, numel, dtype=input.dtype if input.dtype.is_complex else None)
  File "/home/azureuser/miniconda3/envs/jstnxu-DAB-DETR/lib/python3.9/site-packages/torch/autograd/gradcheck.py", line 45, in _allocate_jacobians_with_outputs
    out.append(t.new_zeros((numel_input, t.numel()), **options))
RuntimeError: CUDA out of memory. Tried to allocate 7.50 GiB (GPU 0; 15.75 GiB total capacity; 7.50 GiB already allocated; 7.30 GiB free; 7.50 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
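
For anyone hitting the same error: the message above names one knob to try, the PYTORCH_CUDA_ALLOC_CONF allocator setting. A minimal sketch of setting it before torch initializes CUDA is below; the 128 MiB split size is only an example value, and note that this mainly helps with fragmentation, whereas here the requested 7.50 GiB simply exceeds the 7.30 GiB free.

```python
# Minimal sketch of the allocator setting named in the error message above.
# The 128 MiB split size is an arbitrary example value, not a repo recommendation.
import os
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "max_split_size_mb:128")

import torch  # import after setting the variable so the CUDA allocator picks it up

print(torch.cuda.is_available())
```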
@SlongLiu
Collaborator

It seems there is no severe problem with your installation, so I think it is okay to continue.
You can try experiments with a smaller batch size.
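
For illustration, a generic PyTorch sketch of what a smaller batch size means in practice; the toy dataset and loader below are placeholders, not DAB-DETR's actual training code or arguments.

```python
# Generic sketch of "try a smaller batch size"; the dataset here is a placeholder.
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.randn(16, 3, 224, 224), torch.randint(0, 10, (16,)))
loader = DataLoader(dataset, batch_size=1, shuffle=True)  # smaller batch -> smaller peak GPU memory

for images, labels in loader:
    pass  # the forward/backward pass would run here with a reduced memory footprint
```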

@Xu-Justin
Contributor Author

Xu-Justin commented Apr 19, 2022

Thank you for the fast response.
