Not working with GPU on Windows #109

Closed
osamahothman opened this issue May 22, 2024 · 4 comments

@osamahothman

I am trying to use surya on Windows with an NVIDIA GPU and CUDA enabled. It returns empty predictions, and the logs show the following error:
Plan failed with a cudnnException: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_NOT_SUPPORTED

@VikParuchuri
Owner

VikParuchuri commented May 23, 2024

I have only seen this as a warning with PyTorch 2.3 (it doesn't stop execution). If it is raised as an exception, you may have an older version of CUDA that needs to be updated.

@osamahothman
Author

Yes, it is a warning. The script executes, but it returns empty predictions for any file I give it. Only the image included in the repo produces actual predictions.
This is the only warning in the logs.

@akshay6893

akshay6893 commented Jun 27, 2024

It turns out that the problem was this layer in the class SegformerEfficientSelfAttention:

hidden_states = self.sr(hidden_states)

where

self.sr = nn.Conv2d(
    hidden_size, hidden_size, kernel_size=sequence_reduction_ratio, stride=sequence_reduction_ratio
)

It is a known problem that, on some GPU architectures, conv2d returns NaN values when both the weights and the inputs are float16.
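
One way to check whether a given machine is affected is to run a convolution of the same shape as self.sr in both dtypes and look for NaNs in the output. This is only a minimal sketch: the function name conv_output_has_nan, the channel count, the reduction ratio, and the input size are illustrative assumptions and are not taken from surya.

```python
import torch
from torch import nn


def conv_output_has_nan(dtype: torch.dtype, device: str,
                        channels: int = 64, ratio: int = 4) -> bool:
    """Run one strided Conv2d forward pass and report whether the output has NaNs."""
    torch.manual_seed(0)
    # Same kind of layer as self.sr: kernel_size == stride == reduction ratio.
    # channels and ratio are illustrative values, not surya's real config.
    conv = nn.Conv2d(channels, channels, kernel_size=ratio, stride=ratio)
    conv = conv.to(device=device, dtype=dtype)
    x = torch.randn(1, channels, 32, 32, device=device, dtype=dtype)
    with torch.no_grad():
        out = conv(x)
    return bool(torch.isnan(out).any().item())


if __name__ == "__main__":
    print("float32 on cpu:", conv_output_has_nan(torch.float32, "cpu"))
    if torch.cuda.is_available():
        # On an affected GPU, the float16 run would be the one reporting NaNs.
        for dtype in (torch.float16, torch.float32):
            print(f"{dtype} on cuda:", conv_output_has_nan(dtype, "cuda"))
```

If only the float16 run on the GPU reports NaNs, that supports switching the model dtype to float32. Random inputs may not trigger the problem that real images do, so a clean result here does not rule it out.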

So I changed float16 to float32 in settings.py, as shown below:

@computed_field
@property
def MODEL_DTYPE(self) -> torch.dtype:
    # The non-CPU branch previously returned torch.float16.
    return torch.float32 if self.TORCH_DEVICE_MODEL == "cpu" else torch.float32

@computed_field
@property
def MODEL_DTYPE_DETECTION(self) -> torch.dtype:
    # The non-CPU branch previously returned torch.float16.
    return torch.float32 if self.TORCH_DEVICE_DETECTION == "cpu" else torch.float32

Another error I got was:

CUDA error: an illegal memory access was encountered

which I solved by adding os.environ['TORCH_CUDNN_V8_API_DISABLED'] = "1" in my Jupyter notebook cell.
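
The order of statements probably matters here. A minimal sketch, assuming the variable is read when PyTorch first sets up its cuDNN convolutions, so the safest place to set it is before torch is imported at all. In a notebook this means the first cell, after restarting the kernel. The find_spec guard is only there so the snippet also runs on a machine without torch installed.

```python
import importlib.util
import os

# Set the variable first. Assumption: PyTorch picks it up when it first
# configures cuDNN convolutions, so a later assignment may have no effect.
os.environ["TORCH_CUDNN_V8_API_DISABLED"] = "1"

# Import torch only after the variable is in place.
if importlib.util.find_spec("torch") is not None:
    import torch

    print("torch", torch.__version__, "| CUDA available:", torch.cuda.is_available())

# Load the surya models and run predictions after this point.
```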
