Error while retrieving transformer weights in intro notebook #907

Closed
Cquential opened this issue Jul 30, 2020 · 6 comments

@Cquential

Description

Initializing weights from the Google Cloud Storage (gs://) source leads to a NotFoundError.
...

Environment information

OS: Pop!_OS 20.04 (Based on Ubuntu 20.04)

$ pip freeze | grep trax
trax==1.3.4

$ pip freeze | grep tensor
mesh-tensorflow==0.1.16
tensor2tensor==1.15.7
tensorboard==2.3.0
tensorboard-plugin-wit==1.6.0.post3
tensorflow==2.3.0
tensorflow-addons==0.10.0
tensorflow-datasets==3.2.1
tensorflow-estimator==2.3.0
tensorflow-gan==2.0.0
tensorflow-gpu==2.3.0
tensorflow-hub==0.8.0
tensorflow-metadata==0.22.2
tensorflow-probability==0.7.0
tensorflow-text==2.3.0

$ pip freeze | grep jax
jax==0.1.74
jaxlib==0.1.52

$ python -V
Python 3.8.2

For bugs: reproduction and error logs

# Steps to reproduce:
...
model = trax.models.Transformer(
    input_vocab_size=33300, 
    d_model=512, d_ff=2048, 
    n_heads=8, n_encoder_layers=6, n_decoder_layers=6, 
    max_len=2048, mode='predict') 
# Initialize using pre-trained weights. 
model.init_from_file('gs://trax-ml/models/translation/ende_wmt32k.pkl.gz',  weights_only=True)
# Error logs:
...
NotFoundError: Error executing an HTTP request: HTTP response code 404 with body '<?xml version='1.0' encoding='UTF-8'?><Error><Code>NoSuchKey</Code><Message>The specified key does not exist.</Message><Details>No such object: trax-ml/models/translation/end_wmt32k.pkl.gz</Details></Error>'
	 when reading gs://trax-ml/models/translation/end_wmt32k.pkl.gz

@lukaszkaiser
Contributor

Hmm, the code seems to say model.init_from_file('gs://trax-ml/models/translation/ende_wmt32k.pkl.gz') but your error says gs://trax-ml/models/translation/end_wmt32k.pkl.gz -- there is an "e" missing in "ende"?

@Cquential
Author

Unfortunately, correcting the typo has not solved the issue yet:

model.init_from_file('gs://trax-ml/models/translation/ende_wmt32k.pkl.gz', weights_only=True)
07-30 23:15:42.979550: W tensorflow/core/platform/cloud/google_auth_provider.cc:184] 

All attempts to get a Google authentication bearer token failed, returning an empty token. Retrieving token from files failed with "Not found: Could not locate the credentials file.".
Retrieving token from GCE failed with
 "Failed precondition: Error executing an HTTP request: libcurl code 6 meaning 'Couldn't resolve host name', error details: Couldn't resolve host 'metadata'".

@Cquential
Author

Cquential commented Jul 30, 2020

Interestingly, I seem to have encountered a new issue with trax now:

UserWarning: No GPU/TPU found, falling back to CPU.

But tensorflow seems to recognise my Nvidia 1050

tf.config.get_visible_devices('GPU')                                    
Out[5]: [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

Should I report this as a separate issue?

@lukaszkaiser
Contributor

@Cquential : if you're using an installation at home, you need to follow the instructions for JAX on GPU:
https://github.com/google/jax#pip-installation

@lukaszkaiser
Contributor

Did the above help? Files with the gs:// prefix are on GCP, so if you cannot access them directly, copy them locally with gsutil cp gs://... [local dir]. You can install gsutil as described here: https://cloud.google.com/storage/docs/gsutil_install
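
A minimal sketch of that workaround in Python, assuming gsutil is installed and on the PATH and that the target directory already exists. The helper names (`local_copy_path`, `gsutil_cp_command`, `fetch_and_init`) are made up for illustration and are not part of trax; only `init_from_file(..., weights_only=True)` comes from the report above.

```python
import os
import subprocess


def local_copy_path(gs_uri, local_dir):
    """Return where the object named by gs_uri should land inside local_dir."""
    return os.path.join(local_dir, os.path.basename(gs_uri))


def gsutil_cp_command(gs_uri, local_dir):
    """Build the `gsutil cp` argument list for copying one object locally."""
    return ["gsutil", "cp", gs_uri, local_dir]


def fetch_and_init(model, gs_uri, local_dir="."):
    """Copy the checkpoint with gsutil, then initialize from the local copy.

    Assumes local_dir is an existing directory, so gsutil keeps the
    original file name when copying into it.
    """
    subprocess.run(gsutil_cp_command(gs_uri, local_dir), check=True)
    model.init_from_file(local_copy_path(gs_uri, local_dir), weights_only=True)
    return model


# Hypothetical usage, with `model` built as in the report above:
# fetch_and_init(model, 'gs://trax-ml/models/translation/ende_wmt32k.pkl.gz')
```

Copying once and loading from disk also avoids repeating the credential lookup shown in the warning above on every run.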

@Cquential
Author

@lukaszkaiser, installing JAX with GPU support worked (at least the error message has disappeared). How do I verify that trax is using my GPU? I will try out gsutil and respond.
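
One possible way to check, sketched under the assumption that trax is running on its JAX backend (as the install advice above suggests): list the devices JAX reports and look at their platform names. `jax.devices()` and the `platform` attribute on each device are real JAX API; the helper names `platforms` and `report_jax_devices` are made up for this example.

```python
def platforms(devices):
    """Return the sorted, de-duplicated platform names of the given devices."""
    return sorted({d.platform for d in devices})


def report_jax_devices():
    """Print every device JAX can see and return their platform names."""
    import jax  # imported here so `platforms` is usable without JAX installed

    devices = jax.devices()
    for d in devices:
        print(d)
    return platforms(devices)


# Hypothetical usage:
# names = report_jax_devices()
# if "cpu" in names and len(names) == 1:
#     print("JAX only sees the CPU; the GPU build is probably not active.")
```

If only a CPU platform is listed, JAX (and therefore trax) is most likely still on the CPU, regardless of what `tf.config.get_visible_devices('GPU')` reports for TensorFlow.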
