TPU support is incomplete #24412

Open
martinwicke opened this issue Dec 18, 2018 · 12 comments

@martinwicke (Member) commented Dec 18, 2018

TensorFlow version: 2.0 preview

TPU support is work in progress, and the 2.0 preview does not yet contain a DistributionStrategy for TPU.

This is a tracking issue and will be updated as progress is made.

@huan (Contributor) commented Apr 15, 2019

Hello @martinwicke, thanks for setting up this thread as the main tracking issue for TPU support in TF 2.0.

Do we have a plan or an ETA yet?

@martinwicke (Member, Author) commented Apr 15, 2019

@jhseu Any timeline you can share?

@jhseu (Member) commented Apr 15, 2019

It works right now on master, but we don't have a matching Cloud TPU release. We'll release an official Cloud TPU version alongside TF 2.0 final.

@huan (Contributor) commented Apr 15, 2019

@jhseu Thanks for letting me know that master already works!

Do we have any code example showing how TPU works in TF 2.0?

A demo with several lines of core API calls would be enough, thanks!

@jhseu (Member) commented Apr 15, 2019

@huan Yeah, there's an example here:
https://www.tensorflow.org/guide/distribute_strategy

You would use TPUStrategy instead of MirroredStrategy.
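
For reference, a minimal sketch of that substitution on a Colab TPU, assuming the TF 2.0 preview APIs that come up later in this thread (TPUClusterResolver, initialize_tpu_system, experimental_connect_to_host); the toy model and data are placeholders, and preview builds may still hit the errors reported below:

import os
import tensorflow as tf

# Colab exposes the TPU address through the COLAB_TPU_ADDR environment variable.
tpu_address = 'grpc://' + os.environ['COLAB_TPU_ADDR']
tf.config.experimental_connect_to_host(tpu_address)

resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu=tpu_address)
tf.tpu.experimental.initialize_tpu_system(resolver)

# TPUStrategy takes the place of MirroredStrategy from the guide.
strategy = tf.distribute.experimental.TPUStrategy(resolver)

with strategy.scope():
  model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(10,))])
  model.compile(loss='mse', optimizer='sgd')

# Toy dataset just to show the usual fit() call under the strategy.
dataset = tf.data.Dataset.from_tensor_slices(
    (tf.random.normal([64, 10]), tf.random.normal([64, 1]))).batch(8)
model.fit(dataset, epochs=1)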

@thcktw commented Apr 17, 2019

@jhseu Hi, does it work on Colab TPU? I got this error: "InvalidArgumentError: /job:tpu_worker/replica:0/task:1/device:CPU:0 unknown device."

@bduclaux commented Apr 21, 2019

@jhseu @TTaEE Same problem here.

It seems that there is an issue with the job_name parameter in TPUClusterResolver.
Neither 'worker' nor 'tpu_worker' works when using the TPUStrategy scope() method in combination with a call to tf.config.experimental_connect_to_host.

I have submitted a bug report at #27992, but it would be super helpful to get a working notebook using TF 2.0 and TPUStrategy on Colab.
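
For concreteness, this is roughly the combination being described (a sketch only; the address handling and the exact job_name value are assumptions based on the description above):

import os
import tensorflow as tf

tpu_address = 'grpc://' + os.environ['COLAB_TPU_ADDR']
tf.config.experimental_connect_to_host(tpu_address)

# Both job_name='worker' and job_name='tpu_worker' reportedly fail here.
resolver = tf.distribute.cluster_resolver.TPUClusterResolver(
    tpu=tpu_address, job_name='tpu_worker')
tf.tpu.experimental.initialize_tpu_system(resolver)
strategy = tf.distribute.experimental.TPUStrategy(resolver)

with strategy.scope():
  model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(1,))])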

@bduclaux commented Apr 21, 2019

The following code generates an exception on Colab with TensorFlow version 2.0.0-dev20190421 when instantiating a basic Keras model within the scope of a TPUStrategy:

ValueError: variable object with name 'cd2c89b7-88b7-44c8-ad83-06c2a9158347' already created. Use get_variable() if reuse is desired.

!pip install --upgrade tensorflow==2.0.0-alpha0
!pip install --upgrade tf-nightly-2.0-preview

import os
import tensorflow as tf

print("Tensorflow version " + tf.__version__)

# Connect to the Colab TPU and initialize it.
TPU_WORKER = 'grpc://' + os.environ['COLAB_TPU_ADDR']
tf.config.experimental_connect_to_host(TPU_WORKER)
resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
tf.tpu.experimental.initialize_tpu_system(resolver)
strategy = tf.distribute.experimental.TPUStrategy(resolver)

devices = tf.config.experimental_list_devices()
print(*devices, sep="\n")

# Creating even a basic Keras model inside the strategy scope raises the ValueError above.
with strategy.scope():
  model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(1,))])
  model.compile(loss='mse', optimizer='sgd')

@AntGul commented May 13, 2019

It would be good to have one working example with TF 2.0.

  1. This is a great article, but unfortunately the code does not work with TF 2.0; see:

     tpu_model = tf.contrib.tpu.keras_to_tpu_model(
         model,
         strategy=tf.contrib.tpu.TPUDistributionStrategy(
             tf.contrib.cluster_resolver.TPUClusterResolver(TPU_ADDRESS)))

Keras support is now deprecated in support of TPU Strategy. Please follow the distribution strategy guide on tensorflow.org to migrate to the 2.0 supported version.

  2. On the other hand, the example here on distribution strategy does not seem to work either, as already mentioned above. (A rough sketch of the TPUStrategy replacement follows below.)
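
For reference, a minimal sketch of the migration from the contrib API to TPUStrategy, assuming the TF 2.0 preview APIs discussed earlier in this thread (the model and the way TPU_ADDRESS is obtained here are placeholders):

import os
import tensorflow as tf

# Placeholder: on Colab the TPU address comes from the COLAB_TPU_ADDR env var.
TPU_ADDRESS = 'grpc://' + os.environ['COLAB_TPU_ADDR']

# TF 1.x / contrib style (no longer available in TF 2.0):
#   tpu_model = tf.contrib.tpu.keras_to_tpu_model(
#       model,
#       strategy=tf.contrib.tpu.TPUDistributionStrategy(
#           tf.contrib.cluster_resolver.TPUClusterResolver(TPU_ADDRESS)))

# TF 2.0 style: build and compile the Keras model inside a TPUStrategy scope instead.
tf.config.experimental_connect_to_host(TPU_ADDRESS)
resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu=TPU_ADDRESS)
tf.tpu.experimental.initialize_tpu_system(resolver)
strategy = tf.distribute.experimental.TPUStrategy(resolver)

with strategy.scope():
  model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(1,))])
  model.compile(loss='mse', optimizer='sgd')
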
@lukemelas commented Jun 8, 2019

Given that TF 2.0 beta is now out, is there an update with regard to this issue (about either the status or the timeline of TPU support)?

@huan (Contributor) commented Jun 8, 2019

@lukemelas +1

A roadmap or an ETA would be very helpful for Cloud TPU fans!

@chiayewken commented Jun 8, 2019

With reference to #29550, TPUStrategy in TensorFlow 2.0 Beta has not been working for me.
