MeanIoU and its child classes throwing error when used with ignore_class #19524

Open
savindi-wijenayaka opened this issue Apr 16, 2024 · 4 comments
@savindi-wijenayaka

Problem

When using the JAX backend with MeanIoU or one of its child classes and ignore_class set, training throws an error.

Code to reproduce

https://gist.github.com/savindi-wijenayaka/43da7ac5930afc3ffbf20686ecca1193

To trigger the error, add ignore_class=0 to the MeanIoU and OneHotMeanIoU initialization step in the gist.
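For readers without access to the gist, the change described above might look roughly like the sketch below. The value of NUM_CLASSES is a placeholder chosen for illustration and is not taken from the gist; only ignore_class=0 comes from the report.

```python
import os

# Select the JAX backend before importing keras (the backend used in the report).
os.environ.setdefault("KERAS_BACKEND", "jax")

import keras

NUM_CLASSES = 3  # hypothetical value; the gist defines its own class count

# Both metrics accept ignore_class; the report sets it to 0.
mean_iou = keras.metrics.MeanIoU(num_classes=NUM_CLASSES, ignore_class=0)
one_hot_mean_iou = keras.metrics.OneHotMeanIoU(
    num_classes=NUM_CLASSES, ignore_class=0
)
```

These metrics would then be passed to model.compile(metrics=[...]) as in the gist.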

Observations

  • Without ignore_class: the model trains without errors.
  • With ignore_class: training throws an error: Array boolean indices must be concrete; got ShapedArray(bool[2097152])
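The error message is what JAX raises when an array is indexed with a boolean mask inside traced (jit-compiled) code, because the result would have a data-dependent shape. The sketch below reproduces that class of error in plain JAX and shows one static-shape alternative that zero-weights the ignored positions instead of dropping them. It is an illustration only, not the Keras implementation or the actual patch; the function names, NUM_CLASSES, and the sample arrays are made up for the example.

```python
import jax
import jax.numpy as jnp

IGNORE_CLASS = 0  # class id to exclude, as in the report
NUM_CLASSES = 3   # hypothetical number of classes


def confusion_boolean_index(y_true, y_pred):
    # Boolean-mask indexing produces a data-dependent shape,
    # which JAX cannot trace under jit.
    valid = y_true != IGNORE_CLASS
    y_true = y_true[valid]
    y_pred = y_pred[valid]
    return jnp.zeros((NUM_CLASSES, NUM_CLASSES)).at[y_true, y_pred].add(1.0)


def confusion_weighted(y_true, y_pred):
    # Static-shape alternative: keep every element, but give
    # ignored positions a weight of zero.
    weights = jnp.where(y_true != IGNORE_CLASS, 1.0, 0.0)
    return jnp.zeros((NUM_CLASSES, NUM_CLASSES)).at[y_true, y_pred].add(weights)


y_true = jnp.array([0, 1, 2, 1, 0, 2])
y_pred = jnp.array([0, 1, 1, 1, 2, 2])

try:
    jax.jit(confusion_boolean_index)(y_true, y_pred)
    boolean_index_failed = False
except jax.errors.NonConcreteBooleanIndexError:
    boolean_index_failed = True

# Rows are true labels, columns are predictions; row 0 stays empty.
cm = jax.jit(confusion_weighted)(y_true, y_pred)
```

The weighted version keeps array shapes fixed, so it compiles under jit, at the cost of doing work for elements that end up contributing nothing.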

Warnings noticed in logs

W external/xla/xla/service/gpu/nvptx_compiler.cc:718] The NVIDIA driver's CUDA version is 12.3 which is older than the ptxas CUDA version (12.4.131). Because the driver is older than the ptxas version, XLA is disabling parallel compilation, which may slow down compilation. You should update your NVIDIA driver or use the NVIDIA-provided CUDA forward compatibility packages.

Version details:

  • OS: Red Hat Enterprise Linux 9.3 (Plow)
  • GPU: NVIDIA H100 PCIe
  • CUDA Version: 12.3
  • NVIDIA-SMI 545.23.08
  • Driver Version: 545.23.08
  • jax: 0.4.26
@fchollet
Member

Thanks for the report. I just fixed it at HEAD.

@SuryanarayanaY
Collaborator

Hi @savindi-wijenayaka ,

Could you please check and confirm whether we can mark this issue as resolved? Thanks!

github-actions bot commented May 7, 2024

This issue is stale because it has been open for 14 days with no activity. It will be closed if no further activity occurs. Thank you.

@github-actions github-actions bot added the stale label May 7, 2024
@savindi-wijenayaka
Author

savindi-wijenayaka commented May 8, 2024

Hi @fchollet and @SuryanarayanaY,

I installed Keras from the merge commit containing the fix (pip install git+https://github.com/keras-team/keras.git@fed28a7357e13aeb955f891747a1f9b26d5bc581) and ran the above code. No errors were thrown. However, there is a recurring warning:

'+ptx84' is not a recognized feature for this target (ignoring feature)
