This repository was archived by the owner on Dec 9, 2024. It is now read-only.

Using ImageNet data set #176

@abidmalikwaterloo


I am trying to use the ImageNet data. I am using the script to prepare the data for TF. The data is in this directory:

/home/amalik/NEWIMAGENET/tfrecrd/train

Some examples of the output files:

train-00000-of-01024 train-00106-of-01024 train-00212-of-01024 train-00318-of-01024 train-00424-of-01024 train-00530-of-01024 train-00636-of-01024 train-00742-of-01024 train-00848-of-01024 train-00954-of-01024 validation-00036-of-00128
train-00001-of-01024 train-00107-of-01024 train-00213-of-01024 train-00319-of-01024 train-00425-of-01024 train-00531-of-01024 train-00637-of-01024 train-00743-of-01024 train-00849-of-01024 train-00955-of-01024 validation-00037-of-00128
train-00002-of-01024 train-00108-of-01024 train-00214-of-01024 train-00320-of-01024 train-00426-of-01024 train-00532-of-01024 train-00638-of-01024 train-00744-of-01024 train-00850-of-01024 train-00956-of-01024 validation-00038-of-00128
train-00003-of-01024 train-00109-of-01024 train-00215-of-01024 train-00321-of-01024 train-00427-of-01024 train-00533-of-01024 train-00639-of-01024 train-00745-of-01024 train-00851-of-01024 train-00957-of-01024 validation-00039-of-00128
train-00004-of-01024 train-00110-of-01024 train-00216-of-01024 train-00322-of-01024 train-00428-of-01024 train-00534-of-01024 train-00640-of-01024 train-00746-of-01024 train-00852-of-01024 train-00958-of-01024 validation-00040-of-00128
train-00005-of-01024 train-00111-of-01024 train-00217-of-01024 train-00323-of-01024 train-00429-of-01024 train-00535-of-01024 train-00641-of-01024 train-00747-of-01024 train-00853-of-01024 train-00959-of-01024 validation-00041-of-00128
train-00006-of-01024 train-00112-of-01024 train-00218-of-01024 train-00324-of-01024 train-00430-of-01024 train-00536-of-01024 train-00642-of-01024 train-00748-of-01024 train-00854-of-01024 train-00960-of-01024 validation-00042-of-00128
train-00007-of-01024 train-00113-of-01024 train-00219-of-01024 train-00325-of-01024 train-00431-of-01024 train-00537-of-01024 train-00643-of-01024 train-00749-of-01024 train-00855-of-01024 train-00961-of-01024 validation-00043-of-00128
train-00008-of-01024 train-00114-of-01024 train-00220-of-01024 train-00326-of-01024 train-00432-of-01024 train-00538-of-01024 train-00644-of-01024 train-00750-of-01024 train-00856-of-01024 train-00962-of-01024 validation-00044-of-00128
train-00009-of-01024 train-00115-of-01024 train-00221-of-01024 train-00327-of-01024 train-00433-of-01024 train-00539-of-01024 train-00645-of-01024 train-00751-of-01024 train-00857-of-01024 train-00963-of-01024 validation-00045-of-00128
train-00010-of-01024 train-00116-of-01024 train-00222-of-01024 train-00328-of-01024 train-00434-of-01024 train-00540-of-01024 train-00646-of-01024 train-00752-of-01024 train-00858-of-01024 train-00964-of-01024 validation-00046-of-00128
train-00011-of-01024 train-00117-of-01024 train-00223-of-01024 train-00329-of-01024 train-00435-of-01024 train-00541-of-01024 train-00647-of-01024 train-00753-of-01024 train-00859-of-01024 train-00965-of-01024 validation-00047-of-00128
train-00012-of-01024 train-00118-of-01024 train-00224-of-01024 train-00330-of-01024 train-00436-of-01024 train-00542-of-01024 train-00648-of-01024 train-00754-of-01024 train-00860-of-01024 train-00966-of-01024 validation-00048-of-00128
train-00013-of-01024 train-00119-of-01024 train-00225-of-01024 train-00331-of-01024 train-00437-of-01024 train-00543-of-01024 train-00649-of-01024 train-00755-of-01024 train-00861-of-01024 train-00967-of-01024 validation-00049-of-00128
train-00014-of-01024 train-00120-of-01024 train-00226-of-01024 train-00332-of-01024 train-00438-of-01024 train-00544-of-01024 train-00650-of-01024 train-00756-of-01024 train-00862-of-01024 train-00968-of-01024 validation-00050-of-00128
train-00015-of-01024 train-00121-of-01024 train-00227-of-01024 train-00333-of-01024 train-00439-of-01024 train-00545-of-01024 train-00651-of-01024 train-00757-of-01024 train-00863-of-01024 train-00969-of-01024 validation-00051-of-00128
train-0001

===

I am using the following to run the benchmarks for alexnet:

mpirun -np 1 -npernode 1 -x NCCL_DEBUG=INFO python tf_cnn_benchmarks.py --model=alexnet --batch_size=64 --data_name=imagenet --data_dir=/home/amalik/NEWIMAGENET/tfrecrd/train --horovod_device=gpu --num_batches=200 --print_training_accuracy --variable_update horovod

However, I am getting the following error:

WARNING: Logging before flag parsing goes to stderr.
W0430 10:17:19.406574 140542494578496 tf_logging.py:126] From /home/amalik/tenENV/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/datasets/base.py:198: retry (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.
Instructions for updating:
Use the retry module or similar alternatives.
Traceback (most recent call last):
File "tf_cnn_benchmarks.py", line 60, in <module>
app.run(main) # Raises error on invalid flags, unlike tf.app.run()
File "/home/amalik/.local/lib/python2.7/site-packages/absl/app.py", line 274, in run
_run_main(main, argv)
File "/home/amalik/.local/lib/python2.7/site-packages/absl/app.py", line 238, in _run_main
sys.exit(main(argv))
File "tf_cnn_benchmarks.py", line 56, in main
bench.run()
File "/home/amalik/horovod/benchmarks/scripts/tf_cnn_benchmarks/benchmark_cnn.py", line 1270, in run
return self._benchmark_cnn()
File "/home/amalik/horovod/benchmarks/scripts/tf_cnn_benchmarks/benchmark_cnn.py", line 1391, in _benchmark_cnn
(image_producer_ops, enqueue_ops, fetches) = self._build_model()
File "/home/amalik/horovod/benchmarks/scripts/tf_cnn_benchmarks/benchmark_cnn.py", line 1702, in _build_model
image_producer_stages) = self._build_image_processing(shift_ratio=0)
File "/home/amalik/horovod/benchmarks/scripts/tf_cnn_benchmarks/benchmark_cnn.py", line 1645, in _build_image_processing
shift_ratio=shift_ratio)
File "/home/amalik/horovod/benchmarks/scripts/tf_cnn_benchmarks/preprocessing.py", line 513, in minibatch
self.parse_and_preprocess, dataset, subset, self.train, cache_data)
File "/home/amalik/horovod/benchmarks/scripts/tf_cnn_benchmarks/data_utils.py", line 82, in create_iterator
file_names = gfile.Glob(glob_pattern)
File "/home/amalik/tenENV/lib/python2.7/site-packages/tensorflow/python/lib/io/file_io.py", line 339, in get_matching_files
compat.as_bytes(single_filename), status)
File "/home/amalik/tenENV/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 516, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: /home/amalik/NEWIMAGENET/tfrecrd/train; No such file or directory

mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

Process name: [[49226,1],0]
Exit code: 1

How should I access the data for training?

Thanks
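
The traceback ends in a NotFoundError raised while globbing under the --data_dir path. One way to narrow this down is to check whether that directory is visible on the host where the process is launched, and whether any shards match. The following is only a minimal sketch: the helper name find_shards and the pattern "<subset>-*-of-*" are assumptions inferred from the file names listed above, not taken from the benchmark source.

```python
import glob
import os


def find_shards(data_dir, subset="train"):
    """Return sorted shard paths for `subset` under `data_dir`.

    Hypothetical helper for diagnosis only. Raises IOError when the
    directory itself is not visible from the current host.
    """
    if not os.path.isdir(data_dir):
        raise IOError("data_dir is not visible on this host: %s" % data_dir)
    # Assumed naming scheme, based on names such as train-00000-of-01024.
    pattern = os.path.join(data_dir, "%s-*-of-*" % subset)
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    # Path copied from the --data_dir flag in the command above.
    data_dir = "/home/amalik/NEWIMAGENET/tfrecrd/train"
    try:
        shards = find_shards(data_dir)
        print("found %d train shards" % len(shards))
    except IOError as err:
        print(err)
```

Saving this under a hypothetical name such as check_shards.py and launching it with the same mpirun options as the benchmark would show what the launched process sees. If it reports that the directory is not visible, the path is likely misspelled or not mounted on that host.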
