dataset_setup.py --fastmri fails with TypeError after fastMRI download completed #668

@alex77g2

dataset_setup.py --fastmri fails with a TypeError after the fastMRI download completes, because setup_fastmri() is called with more arguments than it accepts.

Description

An internal error (wrong number of positional arguments) occurs in datasets/dataset_setup.py, line 715, in main:
Line 715: setup_fastmri(data_dir, updated_data_dir) # 2 args
Line 402: def setup_fastmri(data_dir): # 1 arg only
This should be easy to fix for the author of this Python file.
I tried 3 other datasets (mnist, wmt, ogbg) via dataset_setup.py (as described in the README) and they are prepared without problems (download, extract, etc.).
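
The mismatch can be reproduced in isolation. The following is a minimal sketch, not the code from dataset_setup.py: the function body, the helper call_like_main, and the example paths are hypothetical stand-ins; only the two signatures are taken from the lines quoted above.

```python
def setup_fastmri(data_dir):
    # Stand-in with the one-parameter signature reported at line 402.
    # The real function body is not reproduced here.
    return data_dir


def call_like_main(data_dir, updated_data_dir):
    # Mirrors the two-argument call reported at line 715.
    return setup_fastmri(data_dir, updated_data_dir)


try:
    # Hypothetical example paths.
    call_like_main("/tmp/data", "/tmp/data/fastmri")
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

Because Python checks the argument count only when the call is executed, the error appears only once main reaches that line, which in this report is after the download has finished.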

Steps to Reproduce

An ordered list of steps to recreate the issue

python3 --version # Python 3.11.6
pip list | grep absl # absl-py  1.4.0
python3 datasets/dataset_setup.py --data_dir $DATA_DIR --fastmri \
--fastmri_knee_singlecoil_train_url 'https://fastmri-dataset.s3.amazonaws.com/v2.0/knee_singlecoil_train.tar.xz?hidden' \
--fastmri_knee_singlecoil_val_url 'https://fastmri-dataset.s3.amazonaws.com/v2.0/knee_singlecoil_val.tar.xz?hidden' \
--fastmri_knee_singlecoil_test_url 'https://fastmri-dataset.s3.amazonaws.com/v2.0/knee_singlecoil_test.tar.xz?hidden'
...
2024-03-02 08:03:54.070394: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-03-02 08:03:54.718935: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1956] Cannot dlopen some GPU libraries.
...
I0302 08:04:08.161000 139808911519808 dataset_setup.py:714] fastMRI download completed. Extracting...
Traceback (most recent call last):
  File "/home/aleks/git/algorithmic-efficiency/datasets/dataset_setup.py", line 755, in <module>
    app.run(main)
  File "/home/aleks/git/env_mlc/lib/python3.11/site-packages/absl/app.py", line 308, in run
    _run_main(main, args)
  File "/home/aleks/git/env_mlc/lib/python3.11/site-packages/absl/app.py", line 254, in _run_main
    sys.exit(main(argv))
             ^^^^^^^^^^
  File "/home/aleks/git/algorithmic-efficiency/datasets/dataset_setup.py", line 715, in main
    setup_fastmri(data_dir, updated_data_dir)
TypeError: setup_fastmri() takes 1 positional argument but 2 were given

Source or Possible Fix

See Description (argument count mismatch inside datasets/dataset_setup.py).
Adding a second dummy argument at the end, def setup_fastmri(data_dir, dummy):, causes a file-not-found error soon after:
FileNotFoundError: [Errno 2] No such file or directory: '/home/aleks/data/knee_singlecoil_train.tar.xz'
The path should be "~/data/fastmri/knee_singlecoil_train.tar.xz".
Adding the dummy argument first instead, def setup_fastmri(dummy, data_dir):, resolves the issue (no error messages).
However, I cannot yet judge whether the resulting folder structure is what the benchmark itself expects.
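
Why the position of the dummy argument matters can be illustrated with a small sketch. It assumes that updated_data_dir points at the fastmri subdirectory of data_dir, which is only inferred from the two paths reported above and is not confirmed against the source. The helper archive_path and both setup_* functions are hypothetical and use a temporary directory instead of real data.

```python
import os
import tempfile


def archive_path(base_dir, name="knee_singlecoil_train.tar.xz"):
    # Hypothetical helper: where the extraction step would look for the archive.
    return os.path.join(base_dir, name)


def setup_dummy_second(data_dir, dummy):
    # Variant 1 from the report: the first positional argument is used.
    return archive_path(data_dir)


def setup_dummy_first(dummy, data_dir):
    # Variant 2 from the report: the second positional argument is used.
    return archive_path(data_dir)


with tempfile.TemporaryDirectory() as root:
    # Assumption: the download step stores archives under <data_dir>/fastmri.
    fastmri_dir = os.path.join(root, "fastmri")
    os.makedirs(fastmri_dir)
    open(archive_path(fastmri_dir), "wb").close()  # empty placeholder file

    # Both variants receive the same call: (data_dir, updated_data_dir).
    found_with_dummy_second = os.path.exists(setup_dummy_second(root, fastmri_dir))
    found_with_dummy_first = os.path.exists(setup_dummy_first(root, fastmri_dir))

print(found_with_dummy_second, found_with_dummy_first)
```

Under that assumption, only the variant that reads the second positional argument looks inside the fastmri subdirectory, which matches the behaviour described above. Whether the function should take one directory or both is a decision for the maintainers.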

Before checking the code, I tried the following:
I tried downgrading absl-py, but "algorithmic-efficiency 0.1.0 requires absl-py==1.4.0".
Disk has 1500 MB free space.
