
Error running Common Crawl example #11

Closed
RobertLiJN opened this issue Mar 13, 2023 · 2 comments

Comments

@RobertLiJN

Sorry to interrupt! When running

python3 .local/lib/python3.8/site-packages/paxml/main.py \
--exp=tasks.lm.params.c4.C4Spmd1BAdam4Replicas \
--job_log_dir=gs://<your-bucket> 

in the examples, I encountered the following error, which seems to suggest I cannot load from the bucket referenced in c4.py:

Traceback (most recent call last):
  File ".local/lib/python3.8/site-packages/paxml/main.py", line 407, in <module>
    app.run(main, flags_parser=absl_flags.flags_parser)
  File "/usr/local/lib/python3.8/dist-packages/absl/app.py", line 308, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.8/dist-packages/absl/app.py", line 254, in _run_main
    sys.exit(main(argv))
  File ".local/lib/python3.8/site-packages/paxml/main.py", line 382, in main
    run(experiment_config=experiment_config,
  File ".local/lib/python3.8/site-packages/paxml/main.py", line 336, in run
    search_space = tuning_lib.get_search_space(experiment_config)
  File "/home/robertli/.local/lib/python3.8/site-packages/paxml/tuning_lib.py", line 81, in get_search_space
    search_space = pg.hyper.trace(inspect_search_space, require_hyper_name=True)
  File "/home/robertli/.local/lib/python3.8/site-packages/pyglove/core/hyper/dynamic_evaluation.py", line 586, in trace
    fun()
  File "/home/robertli/.local/lib/python3.8/site-packages/paxml/tuning_lib.py", line 77, in inspect_search_space
    _ = instantiate(d)
  File "/home/robertli/.local/lib/python3.8/site-packages/praxis/base_hyperparams.py", line 1103, in instantiate
    return config.Instantiate(**kwargs)
  File "/home/robertli/.local/lib/python3.8/site-packages/praxis/base_hyperparams.py", line 601, in Instantiate
    return self.cls(self, **kwargs)
  File "/home/robertli/.local/lib/python3.8/site-packages/paxml/seqio_input.py", line 443, in __init__
    self._dataset = self._get_dataset()
  File "/home/robertli/.local/lib/python3.8/site-packages/paxml/seqio_input.py", line 551, in _get_dataset
    ds = self._get_backing_ds(
  File "/home/robertli/.local/lib/python3.8/site-packages/paxml/seqio_input.py", line 686, in _get_backing_ds
    ds = self.mixture_or_task.get_dataset(
  File "/home/robertli/.local/lib/python3.8/site-packages/seqio/dataset_providers.py", line 1205, in get_dataset
    len(self.source.list_shards(split=split)) >= shard_info.num_shards)
  File "/home/robertli/.local/lib/python3.8/site-packages/seqio/dataset_providers.py", line 455, in list_shards
    return [_get_filename(info) for info in self.tfds_dataset.files(split)]
  File "/home/robertli/.local/lib/python3.8/site-packages/seqio/utils.py", line 152, in files
    split_info = self.builder.info.splits[split]
  File "/home/robertli/.local/lib/python3.8/site-packages/seqio/utils.py", line 129, in builder
    LazyTfdsLoader._MEMOIZED_BUILDERS[builder_key] = tfds.builder(
  File "/usr/lib/python3.8/contextlib.py", line 75, in inner
    return func(*args, **kwds)
  File "/home/robertli/.local/lib/python3.8/site-packages/tensorflow_datasets/core/logging/__init__.py", line 169, in __call__
    return function(*args, **kwargs)
  File "/home/robertli/.local/lib/python3.8/site-packages/tensorflow_datasets/core/load.py", line 202, in builder
    return read_only_builder.builder_from_files(str(name), **builder_kwargs)
  File "/usr/lib/python3.8/contextlib.py", line 75, in inner
    return func(*args, **kwds)
  File "/home/robertli/.local/lib/python3.8/site-packages/tensorflow_datasets/core/read_only_builder.py", line 259, in builder_from_files
    builder_dir = _find_builder_dir(name, **builder_kwargs)
  File "/home/robertli/.local/lib/python3.8/site-packages/tensorflow_datasets/core/read_only_builder.py", line 327, in _find_builder_dir
    builder_dir = _find_builder_dir_single_dir(
  File "/home/robertli/.local/lib/python3.8/site-packages/tensorflow_datasets/core/read_only_builder.py", line 417, in _find_builder_dir_single_dir
    found_version_str = _get_version_str(
  File "/home/robertli/.local/lib/python3.8/site-packages/tensorflow_datasets/core/read_only_builder.py", line 484, in _get_version_str
    all_versions = version_lib.list_all_versions(os.fspath(builder_dir))
  File "/home/robertli/.local/lib/python3.8/site-packages/tensorflow_datasets/core/utils/version.py", line 193, in list_all_versions
    if not root_dir.exists():
  File "/home/robertli/.local/lib/python3.8/site-packages/etils/epath/gpath.py", line 130, in exists
    return self._backend.exists(self._path_str)
  File "/home/robertli/.local/lib/python3.8/site-packages/etils/epath/backend.py", line 204, in exists
    return self.gfile.exists(path)
  File "/home/robertli/.local/lib/python3.8/site-packages/tensorflow/python/lib/io/file_io.py", line 288, in file_exists_v2
    _pywrap_file_io.FileExists(compat.path_to_bytes(path))
tensorflow.python.framework.errors_impl.PermissionDeniedError: Error executing an HTTP request: HTTP response code 403 with body '{
  "error": {
    "code": 403,
    "message": "991053624826-compute@developer.gserviceaccount.com does not have storage.objects.get access to the Google Cloud Storage object. Permission 'storage.objects.get' denied on resource (or it may not exist).",
    "errors": [
      {
        "message": "991053624826-compute@developer.gserviceaccount.com does not have storage.objects.get access to the Google Cloud Storage object. Permission 'storage.objects.get' denied on resource (or it may not exist)."'
	 when reading metadata of gs://mlperf-llm-public2/c4/en
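The 403 body in the traceback is JSON from the Cloud Storage API. A quick way to see exactly which principal was denied which permission is to parse it; this is a hypothetical diagnostic helper (`summarize_gcs_403` is not part of paxml or TFDS), shown against a simplified copy of the error body above:

```python
import json

def summarize_gcs_403(body: str) -> dict:
    """Pull the denied principal and permission out of a GCS 403 error body.

    Assumes the message follows the usual
    "<principal> does not have <permission> access ... Permission '<perm>' denied ..."
    shape seen in the traceback above.
    """
    err = json.loads(body)["error"]
    msg = err["message"]
    principal = msg.split(" does not have ")[0]
    permission = msg.split("Permission '")[1].split("'")[0]
    return {"code": err["code"], "principal": principal, "permission": permission}

# Simplified version of the error body from the traceback.
body = """{
  "error": {
    "code": 403,
    "message": "991053624826-compute@developer.gserviceaccount.com does not have storage.objects.get access to the Google Cloud Storage object. Permission 'storage.objects.get' denied on resource (or it may not exist)."
  }
}"""

print(summarize_gcs_403(body))
```

Here the denied principal is the VM's default Compute Engine service account, so the failure is about the bucket's ACL, not local credentials.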

I wonder whether I haven't configured something correctly, since the bucket appears to be a public one.

I tried using the TFDS default bucket (gs://tfds-data/datasets) instead of gs://mlperf-llm-public2, and the permission problem doesn't arise, but it forces me to choose among the c4 versions available there (which do not include 3.0.4). Even then, I cannot proceed because it fails with a different error.
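For context on why switching buckets changes which versions are offered: TFDS locates a dataset by probing a directory per version under the data dir, following the standard `<data_dir>/<name>/<config>/<version>` layout. A sketch of the path it ends up checking (the helper name is illustrative, not taken from the TFDS source):

```python
def tfds_builder_dir(data_dir: str, name: str, config: str, version: str) -> str:
    """Build the directory TFDS probes for a versioned dataset, assuming the
    standard <data_dir>/<name>/<config>/<version> layout."""
    return "/".join([data_dir.rstrip("/"), name, config, version])

# The failing lookup from the traceback resolves to a dir under the private bucket:
print(tfds_builder_dir("gs://mlperf-llm-public2", "c4", "en", "3.0.4"))
# whereas the public TFDS bucket hosts other c4 versions, e.g.:
print(tfds_builder_dir("gs://tfds-data/datasets", "c4", "en", "3.0.1"))
```

So with the default bucket, only the version directories that actually exist there can be selected, and 3.0.4 is not among them.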

Thanks in advance for your attention and help!

@mathemakitten

Just a quick note that I don't think the perms on gs://mlperf-llm-public2 are configured properly for public access — I can access buckets like gs://t5-data/vocabs/ no problem, but not this one. I get a similar error as above when trying to grab the spm file per the README (gs://mlperf-llm-public2/vocab/c4_en_301_5Mexp2_spm.model).
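One way to confirm this without any credentials: a truly public GCS object is readable anonymously at `https://storage.googleapis.com/<bucket>/<object>`, so mapping the `gs://` URI and fetching it with curl should return the object rather than a 403. A small sketch of that mapping (the helper is hypothetical, for diagnosis only):

```python
def gs_to_https(gs_uri: str) -> str:
    """Map a gs://<bucket>/<object> URI to its anonymous HTTPS equivalent,
    suitable for probing with curl; a 403 there means not world-readable."""
    prefix = "gs://"
    assert gs_uri.startswith(prefix), "expected a gs:// URI"
    return "https://storage.googleapis.com/" + gs_uri[len(prefix):]

print(gs_to_https("gs://mlperf-llm-public2/vocab/c4_en_301_5Mexp2_spm.model"))
```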

@zhangqiaorjc
Member

@mathemakitten Sorry for the late reply; the mlperf-llm-public2 bucket isn't supposed to be public yet.
