I ran into a strange issue: if you load both tensorflow and jax in the same notebook, one of them fails to work. I created this notebook to reproduce it.
I am using the free tier. Maybe premium users don't get this error.
The code works on the CPU and GPU runtimes.
I did not test it on the deprecated TPU v1 runtime.
The 3rd cell reports an ioctl failed error. However, if you restart the session and skip the 2nd cell, which loads the jax lib, the 3rd cell works. In this state, if you go back and run the 2nd cell, that cell fails with the ioctl failed error instead.
It feels like one of the libs is trying to hold the TPU device exclusively.
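As a rough analogy for what "holding the device exclusively" would look like, here is a minimal sketch using an ordinary file lock on a Unix system. The thread does not say how the TPU is actually locked, so the temporary file, `try_exclusive`, and `demo` are hypothetical stand-ins and not the real mechanism.

```python
import fcntl
import os
import tempfile


def try_exclusive(path):
    """Open `path` and try a non-blocking exclusive lock.

    Returns the file descriptor on success, or None if something
    else already holds the lock.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


def demo():
    # The temp file stands in for the device (an assumption for illustration).
    with tempfile.NamedTemporaryFile() as device:
        first = try_exclusive(device.name)    # first library: gets the lock
        second = try_exclusive(device.name)   # second library: refused
        os.close(first)                       # "restart the session"
        third = try_exclusive(device.name)    # now the other one can load
        got_third = third is not None
        if got_third:
            os.close(third)
        return first is not None, second is None, got_third


if __name__ == "__main__":
    print(demo())
```

The pattern matches the symptom: whichever holder arrives first wins, the second one fails, and the order can be swapped after a release.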
Describe the expected behavior
The simple example should work without any errors.
What web browser you are using
(Chrome, Firefox, Safari, etc.)
Additional context
Link to a minimal, public, self-contained notebook that reproduces this issue.
Hello! Unfortunately, you're correct: the libraries currently hold exclusive locks on the TPU. TF is also very aggressive about acquiring the TPU, so it will take an exclusive lock even if you're only using it for tf.data. If you want to use tf.data alongside a different framework, you'll need to replace TF with its CPU-only build.
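After switching builds, it may help to confirm which TF distribution the runtime actually has installed. The following is a minimal sketch using only the standard library. The helper names `installed_tf_builds` and `tf_is_cpu_only` are hypothetical, and the distribution names `tensorflow` and `tensorflow-cpu` are the ones used in the pip commands quoted in this thread.

```python
from importlib import metadata


def installed_tf_builds(candidates=("tensorflow", "tensorflow-cpu")):
    """Return {distribution name: version} for each candidate that is installed."""
    found = {}
    for name in candidates:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass  # this distribution is not installed
    return found


def tf_is_cpu_only(found):
    """True if only the CPU-only distribution is present."""
    return "tensorflow-cpu" in found and "tensorflow" not in found


if __name__ == "__main__":
    builds = installed_tf_builds()
    print(builds, "cpu-only:", tf_is_cpu_only(builds))
```

Either distribution is imported as `import tensorflow`, so the import name alone does not tell them apart. In a notebook, the session will likely need a restart after reinstalling before the change takes effect.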
So TF and flax only compete for the TPU, but not the GPU? Will this change so that they compete for the GPU as well in the future?
Also, if I prefer flax for training, what is the recommended library for managing training data?
Thanks
On Thu, 23 May 2024, 06:47 Benjamin Bastian, ***@***.***> wrote:
Hello! Unfortunately, you're correct that the libraries hold exclusive
locks on the TPU currently. And unfortunately, TF is very aggressive about
acquiring the TPU, so it'll acquire an exclusive lock on the TPU even if
you're using it just for tf.data. If you want to use tf.data at the same
time as you're using a different framework, you'll need to downgrade the TF
version to the CPU version.
pip uninstall tensorflow -y
pip install tensorflow-cpu
Describe the current behavior
To reproduce this issue:
Connect to the TPU v2 runtime.
Additional context
Link to a minimal, public, self-contained notebook that reproduces this issue.
https://colab.research.google.com/drive/1jDkLsEkWA5KDVoCTCkXNmz955Qgwn4g9?usp=sharing