Help wanted: Runtime error using Numpy readers: Unknown Numpy type string. #4744

Closed
omar-valerio opened this issue Mar 25, 2023 · 8 comments

@omar-valerio

Hi DALI Team,

I am following the example for reading NumPy arrays from files listed in a text file, and I get the following runtime error:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[36], line 51
     48     print(f.read())
     49 print('\n')
---> 51 data2 = run(pipe2(file_list_path=filelist_path, data_dir=input_data_dir))
     52 #assert_all_equal(data1, data2)

Cell In[36], line 32, in run(p)
     30 def run(p):
     31     p.build()  # build the pipeline
---> 32     outputs = p.run()  # Run once
     33     # Getting the batch as a list of numpy arrays, for displaying
     34     batch = [np.array(outputs[0][s]) for s in range(batch_size)]

File ~/anaconda3/envs/lightning/lib/python3.8/site-packages/nvidia/dali/pipeline.py:1054, in Pipeline.run(self)
   1052 with self._check_api_type_scope(types.PipelineAPIType.BASIC):
   1053     self.schedule_run()
-> 1054     return self.outputs()

File ~/anaconda3/envs/lightning/lib/python3.8/site-packages/nvidia/dali/pipeline.py:953, in Pipeline.outputs(self)
    951 self._batches_to_consume -= 1
    952 self._gpu_batches_to_consume -= 1
--> 953 return self._outputs()

File ~/anaconda3/envs/lightning/lib/python3.8/site-packages/nvidia/dali/pipeline.py:1037, in Pipeline._outputs(self)
   1035 if not self._built:
   1036     raise RuntimeError("Pipeline must be built first.")
-> 1037 return self._pipe.Outputs()

RuntimeError: Critical error in pipeline:
Error when executing GPU operator readers__Numpy encountered:
Error in thread 0: [/opt/dali/dali/operators/reader/loader/numpy_loader_gpu.cc:44] [/opt/dali/dali/util/numpy.cc:36] Unknown Numpy type string
Stacktrace (9 entries):
[frame 0]: /home/valerio/anaconda3/envs/lightning/lib/python3.8/site-packages/nvidia/dali/libdali.so(+0xbf52b) [0x7f2969d7b52b]
[frame 1]: /home/valerio/anaconda3/envs/lightning/lib/python3.8/site-packages/nvidia/dali/libdali.so(+0x8b9bc) [0x7f2969d479bc]
[frame 2]: /home/valerio/anaconda3/envs/lightning/lib/python3.8/site-packages/nvidia/dali/libdali.so(dali::numpy::ParseHeaderContents(dali::numpy::HeaderData&, std::string const&)+0x10b) [0x7f2969ea3e4b]
[frame 3]: /home/valerio/anaconda3/envs/lightning/lib/python3.8/site-packages/nvidia/dali/libdali.so(dali::numpy::ParseHeader(dali::numpy::HeaderData&, dali::InputStream*)+0x307) [0x7f2969ea49b7]
[frame 4]: /home/valerio/anaconda3/envs/lightning/lib/python3.8/site-packages/nvidia/dali/libdali_operators.so(+0x3bddfab) [0x7f292e2ddfab]
[frame 5]: /home/valerio/anaconda3/envs/lightning/lib/python3.8/site-packages/nvidia/dali/libdali.so(dali::ThreadPool::ThreadMain(int, int, bool, std::string const&)+0x1e6) [0x7f2969e559a6]
[frame 6]: /home/valerio/anaconda3/envs/lightning/lib/python3.8/site-packages/nvidia/dali/libdali.so(+0x6a5a60) [0x7f296a361a60]
[frame 7]: /lib/x86_64-linux-gnu/libpthread.so.0(+0x8609) [0x7f2987b1d609]
[frame 8]: /lib/x86_64-linux-gnu/libc.so.6(clone+0x43) [0x7f29878e8133]
. File: /gpfs/soma_local/ibs/data/batch_2_AP_DEND.npy
Stacktrace (6 entries):
[frame 0]: /home/valerio/anaconda3/envs/lightning/lib/python3.8/site-packages/nvidia/dali/libdali_operators.so(+0x63341e) [0x7f292ad3341e]
[frame 1]: /home/valerio/anaconda3/envs/lightning/lib/python3.8/site-packages/nvidia/dali/libdali_operators.so(+0x4fa736) [0x7f292abfa736]
[frame 2]: /home/valerio/anaconda3/envs/lightning/lib/python3.8/site-packages/nvidia/dali/libdali.so(dali::ThreadPool::ThreadMain(int, int, bool, std::string const&)+0x1e6) [0x7f2969e559a6]
[frame 3]: /home/valerio/anaconda3/envs/lightning/lib/python3.8/site-packages/nvidia/dali/libdali.so(+0x6a5a60) [0x7f296a361a60]
[frame 4]: /lib/x86_64-linux-gnu/libpthread.so.0(+0x8609) [0x7f2987b1d609]
[frame 5]: /lib/x86_64-linux-gnu/libc.so.6(clone+0x43) [0x7f29878e8133]

Current pipeline object is no longer valid.

@omar-valerio omar-valerio changed the title Hel wanted: Runtime error using Numpy readers: Unknown Numpy type string. Help wanted: Runtime error using Numpy readers: Unknown Numpy type string. Mar 25, 2023
@JanuszL JanuszL added the bug Something isn't working label Mar 27, 2023
@JanuszL
Contributor

JanuszL commented Mar 27, 2023

Hi @omar-valerio,

It seems that the file format you use is not supported by DALI. Is there any chance you can share a toy file (not necessarily with the real data) that reproduces this problem so we can examine it?
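For anyone hitting the same error, a quick way to see which type string a file's header carries is to open the file with NumPy and print its dtype. The sketch below is not part of DALI: `describe_npy` and `write_toy_file` are hypothetical helper names, and the toy file only mimics the shape and dtype of the real data so that it can be shared.

```python
import os
import tempfile

import numpy as np


def describe_npy(path):
    """Return (shape, type string) of a .npy file without reading all of its data."""
    # mmap_mode="r" parses the header and maps the payload lazily.
    arr = np.load(path, mmap_mode="r")
    return arr.shape, arr.dtype.str


def write_toy_file(src_path, dst_path, seed=0):
    """Write random 0/1 data with the same shape and dtype as src_path.

    dst_path should end in ".npy"; np.save appends the suffix otherwise.
    """
    shape, type_str = describe_npy(src_path)
    rng = np.random.default_rng(seed)
    toy = rng.integers(0, 2, size=shape).astype(np.dtype(type_str))
    np.save(dst_path, toy)
    return dst_path


if __name__ == "__main__":
    # Stand-in for the real data: a small boolean array saved to a temp dir.
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "real.npy")
        np.save(src, np.zeros((500, 140), dtype=bool))
        print(describe_npy(src))
        print(describe_npy(write_toy_file(src, os.path.join(tmp, "toy.npy"))))
```

The type string printed here is what a reader has to recognise in the `.npy` header, so it is a useful detail to include in a bug report.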

@omar-valerio
Author

Hi @JanuszL ,

Thanks. I'm attaching one of the *.npy files (compressed as zip). Loading the files directly with numpy works.

import os
import numpy as np
from nvidia.dali import pipeline_def, fn

batch_size = 100
data_dir = '/home/valerio/bottleneck'

input_data = np.load(os.path.join(data_dir, 'test_file.npy'))
print(f'input_data shape: {input_data.shape}')
print(input_data[0:10,0:10])

print(f'data_dir: {data_dir}')
print(f'batch_size: {batch_size}')
print(f'numpy version: {np.__version__}')

@pipeline_def(batch_size=batch_size, num_threads=3, device_id=0)
def pipe1():
    data = fn.readers.numpy(device='cpu', file_root=data_dir, file_filter='test_file.npy')
    return data

def run(p):
    p.build()  # build the pipeline
    outputs = p.run()  # Run once
    # Getting the batch as a list of numpy arrays, for displaying
    batch = [np.array(outputs[0][s]) for s in range(batch_size)]
    return batch

data1 = run(pipe1())

Output:

input_data shape: (500, 140)
[[False False False False False False False False False False]
 [False False False False False False False False False False]
  ...................
 [False False False False  True False False False False False]
 [False False False False False False False False False False]]
data_dir: /home/valerio/bottleneck
batch_size: 100
numpy version: 1.24.2
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[13], line 29
     26     batch = [np.array(outputs[0][s]) for s in range(batch_size)]
     27     return batch
---> 29 data1 = run(pipe1())

Cell In[13], line 24, in run(p)
     22 def run(p):
     23     p.build()  # build the pipeline
---> 24     outputs = p.run()  # Run once
     25     # Getting the batch as a list of numpy arrays, for displaying
     26     batch = [np.array(outputs[0][s]) for s in range(batch_size)]

File ~/anaconda3/envs/lightning/lib/python3.8/site-packages/nvidia/dali/pipeline.py:1054, in Pipeline.run(self)
   1052 with self._check_api_type_scope(types.PipelineAPIType.BASIC):
   1053     self.schedule_run()
-> 1054     return self.outputs()

File ~/anaconda3/envs/lightning/lib/python3.8/site-packages/nvidia/dali/pipeline.py:953, in Pipeline.outputs(self)
    951 self._batches_to_consume -= 1
    952 self._gpu_batches_to_consume -= 1
--> 953 return self._outputs()

File ~/anaconda3/envs/lightning/lib/python3.8/site-packages/nvidia/dali/pipeline.py:1037, in Pipeline._outputs(self)
   1035 if not self._built:
   1036     raise RuntimeError("Pipeline must be built first.")
-> 1037 return self._pipe.Outputs()

RuntimeError: Critical error in pipeline:
Error when executing CPU operator readers__Numpy encountered:
[/opt/dali/dali/operators/reader/loader/numpy_loader.cc:86] [/opt/dali/dali/util/numpy.cc:36] Unknown Numpy type string
Stacktrace (11 entries):
[frame 0]: /home/valerio/anaconda3/envs/lightning/lib/python3.8/site-packages/nvidia/dali/libdali.so(+0xbf52b) [0x7f8a1580b52b]
[frame 1]: /home/valerio/anaconda3/envs/lightning/lib/python3.8/site-packages/nvidia/dali/libdali.so(+0x8b9bc) [0x7f8a157d79bc]
[frame 2]: /home/valerio/anaconda3/envs/lightning/lib/python3.8/site-packages/nvidia/dali/libdali.so(dali::numpy::ParseHeaderContents(dali::numpy::HeaderData&, std::string const&)+0x10b) [0x7f8a15933e4b]
[frame 3]: /home/valerio/anaconda3/envs/lightning/lib/python3.8/site-packages/nvidia/dali/libdali.so(dali::numpy::ParseHeader(dali::numpy::HeaderData&, dali::InputStream*)+0x307) [0x7f8a159349b7]
[frame 4]: /home/valerio/anaconda3/envs/lightning/lib/python3.8/site-packages/nvidia/dali/libdali_operators.so(+0x3bd8f4b) [0x7f89d8aaef4b]
[frame 5]: /home/valerio/anaconda3/envs/lightning/lib/python3.8/site-packages/nvidia/dali/libdali_operators.so(+0x3d5f69a) [0x7f89d8c3569a]
[frame 6]: /home/valerio/anaconda3/envs/lightning/lib/python3.8/site-packages/nvidia/dali/libdali_operators.so(+0x3d60672) [0x7f89d8c36672]
[frame 7]: /home/valerio/anaconda3/envs/lightning/lib/python3.8/site-packages/nvidia/dali/libdali_operators.so(+0x3d5e695) [0x7f89d8c34695]
[frame 8]: /home/valerio/anaconda3/envs/lightning/lib/python3.8/site-packages/nvidia/dali/libdali_operators.so(+0x46476a0) [0x7f89d951d6a0]
[frame 9]: /lib/x86_64-linux-gnu/libpthread.so.0(+0x8609) [0x7f8a331a7609]
[frame 10]: /lib/x86_64-linux-gnu/libc.so.6(clone+0x43) [0x7f8a32f72133]
. File: ./test_file.npy
Stacktrace (8 entries):
[frame 0]: /home/valerio/anaconda3/envs/lightning/lib/python3.8/site-packages/nvidia/dali/libdali_operators.so(+0x63341e) [0x7f89d550941e]
[frame 1]: /home/valerio/anaconda3/envs/lightning/lib/python3.8/site-packages/nvidia/dali/libdali_operators.so(+0x4fa28e) [0x7f89d53d028e]
[frame 2]: /home/valerio/anaconda3/envs/lightning/lib/python3.8/site-packages/nvidia/dali/libdali_operators.so(+0x3d5f69a) [0x7f89d8c3569a]
[frame 3]: /home/valerio/anaconda3/envs/lightning/lib/python3.8/site-packages/nvidia/dali/libdali_operators.so(+0x3d60672) [0x7f89d8c36672]
[frame 4]: /home/valerio/anaconda3/envs/lightning/lib/python3.8/site-packages/nvidia/dali/libdali_operators.so(+0x3d5e695) [0x7f89d8c34695]
[frame 5]: /home/valerio/anaconda3/envs/lightning/lib/python3.8/site-packages/nvidia/dali/libdali_operators.so(+0x46476a0) [0x7f89d951d6a0]
[frame 6]: /lib/x86_64-linux-gnu/libpthread.so.0(+0x8609) [0x7f8a331a7609]
[frame 7]: /lib/x86_64-linux-gnu/libc.so.6(clone+0x43) [0x7f8a32f72133]

Current pipeline object is no longer valid.

test_file.zip

@JanuszL
Contributor

JanuszL commented Mar 27, 2023

Hi @omar-valerio,

The bool type is not supported by the numpy reader operator. PR #4745 should enable it.
Please check the nightly build that follows the merge of that PR.
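Until a build with the fix is installed, one possible workaround is to re-save boolean arrays as `uint8`, which keeps the 0/1 values at one byte per element. This is only a sketch and was not suggested in this thread: `convert_bool_npy` is a hypothetical helper, and it assumes the numpy reader in your DALI version accepts `uint8` files.

```python
import numpy as np


def convert_bool_npy(src_path, dst_path):
    """Re-save a boolean .npy file as uint8; other dtypes are copied unchanged.

    dst_path should end in ".npy"; np.save appends the suffix otherwise.
    Returns the dtype that was written.
    """
    arr = np.load(src_path)
    if arr.dtype == np.bool_:
        arr = arr.astype(np.uint8)  # False -> 0, True -> 1
    np.save(dst_path, arr)
    return arr.dtype
```

The converted file can then be read like any other integer array, and the boolean mask can be recovered downstream by comparing against zero.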

@JanuszL JanuszL added the enhancement New feature or request label Mar 27, 2023
@JanuszL JanuszL added this to the Release_1.25.0 milestone Mar 27, 2023
@omar-valerio
Author

Hi @JanuszL ,

Thank you very much for the swift response. Unfortunately, I'm not familiar with tracking the PR. I ran the following command and I believe there is a nightly build from yesterday (1.25.0.dev20230327). Is this the one with the bool support?

pip index versions nvidia-dali-nightly-cuda110 --pre --extra-index-url https://developer.download.nvidia.com/compute/redist/nightly
WARNING: pip index is currently an experimental command. It may be removed/changed in a future release without prior warning.
nvidia-dali-nightly-cuda110 (1.25.0.dev20230327)
Available versions: 1.25.0.dev20230327, 1.25.0.dev20230324, 1.25.0.dev20230323, 1.25.0.dev20230320, 1.25.0.dev20230315, 1.25.0.dev20230314

@JanuszL
Contributor

JanuszL commented Mar 28, 2023

Hi @omar-valerio,

It hasn't been merged yet. I will let you know when the build is available.

@JanuszL
Contributor

JanuszL commented Mar 30, 2023

@omar-valerio - the latest nightly build (1.25.0.dev20230330) should include the fix.

@omar-valerio
Author

Hi @JanuszL - I just installed the nightly build and I'm able to read the numpy files. I noticed that I had to cast the data to FLOAT so that the PyTorch DALIGenericIterator can use my pipeline. I am posting the code for future reference. Thanks.

import os
import numpy as np
import nvidia.dali
from nvidia.dali import pipeline_def, fn
import nvidia.dali.types as types

batch_size = 4
data_dir = '/home/valerio/bottleneck'

input_data = np.load(os.path.join(data_dir, 'test_file.npy'))
print(f'input_data shape: {input_data.shape}')

print(f'data_dir: {data_dir}')
print(f'batch_size: {batch_size}')
print(f'numpy version: {np.__version__}')
print(f'nvidia.dali version: {nvidia.dali.__version__}')

@pipeline_def(batch_size=batch_size, num_threads=4, device_id=0)
def train_pipe(device='cpu'):
    data = fn.readers.numpy(name='Reader', device=device, file_root=data_dir, file_filter='test_file.npy')
    data = fn.cast(data, dtype=types.FLOAT) # PyTorch expects data as float
    return data

batch_size = 4
train_pipeline = train_pipe(
    batch_size=batch_size, device='cpu', num_threads=4, device_id=0)

train_pipeline.build() # build the pipeline

from nvidia.dali.plugin.pytorch import DALIGenericIterator, LastBatchPolicy

dali_iter = DALIGenericIterator(
    train_pipeline,
    ["data"],
    reader_name="Reader", last_batch_policy=LastBatchPolicy.PARTIAL)

for i, data in enumerate(dali_iter):
    for d in data:
        array = d["data"]
        print(f'batch: {i} array.shape: {array.shape} array.dtype: {array.dtype}')

My output:

input_data shape: (500, 140)
data_dir: /home/valerio/bottleneck
batch_size: 4
numpy version: 1.24.2
nvidia.dali version: 1.25.0dev.20230330
batch: 0 array.shape: torch.Size([1, 500, 140]) array.dtype: torch.float32

@JanuszL
Contributor

JanuszL commented Mar 31, 2023

Another PR that has just been merged should enable bools for the Torch integration. It should be available in the next nightly build.
