GPUdirect #7

Open
kanglcn opened this issue Apr 15, 2023 · 7 comments

Comments

@kanglcn
Owner

kanglcn commented Apr 15, 2023

Use https://github.com/rapidsai/kvikio when it is mature.

@beckernick

Hi @kanglcn ! I came across this issue due to the rapidsai/kvikio reference. I work on the RAPIDS team at NVIDIA.

If you're willing to share, I'd love to learn more about what potential KvikIO features and functionality would be important for your use cases.

@kanglcn
Owner Author

kanglcn commented Apr 17, 2023

Hi @beckernick. It looks like I can't install kvikio with Python 3.10.

mamba install -c rapidsai kvikio=23.02 

Looking for: ['kvikio=23.02']

conda-forge/linux-64                                        Using cache
conda-forge/noarch                                          Using cache
pkgs/main/linux-64                                            No change
pkgs/main/noarch                                              No change
pkgs/r/linux-64                                               No change
rapidsai/linux-64                                             No change
rapidsai/noarch                                               No change
pkgs/r/noarch                                                 No change

Pinned packages:
  - python 3.10.*


Could not solve for environment specs
The following packages are incompatible
└─ kvikio 23.02**  is installable with the potential options
   ├─ kvikio 23.02.00 would require
   │  └─ python >=3.8,<3.9.0a0 , which can be installed;
   └─ kvikio 23.02.00 would require
      └─ python >=3.9,<3.10.0a0 , which can be installed.
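
The solver message above boils down to a version-range check: each kvikio 23.02 build targets a single Python minor version (3.8 or 3.9), so an environment pinned to 3.10 matches neither. A minimal sketch of that check, using plain tuples rather than conda's real matching logic (`KVIKIO_23_02_BUILDS` and `compatible_builds` are hypothetical names, and the `0a0` pre-release suffix is simplified to a plain exclusive upper bound):

```python
# Python ranges copied from the solver message for kvikio 23.02.
# Each entry is (inclusive lower bound, exclusive upper bound).
KVIKIO_23_02_BUILDS = [
    ((3, 8), (3, 9)),   # python >=3.8,<3.9.0a0
    ((3, 9), (3, 10)),  # python >=3.9,<3.10.0a0
]


def compatible_builds(python_version, builds=KVIKIO_23_02_BUILDS):
    """Return the build ranges that accept the given (major, minor) version."""
    return [(lo, hi) for lo, hi in builds if lo <= python_version < hi]


print(compatible_builds((3, 9)))   # one matching build
print(compatible_builds((3, 10)))  # empty list: nothing can be installed
```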

@beckernick

We've just released v23.04, which includes Python 3.10 support. Would you be able to give that a test?

@kanglcn
Owner Author

kanglcn commented Apr 18, 2023

Thank you @beckernick ! I have successfully installed it.

I am working on satellite image processing. My current workflow is:

  • read images from zarr.load;
  • convert numpy array to cupy with cp.asarray;
  • do my processing with cupy;
  • convert cupy array to numpy back with cp.asnumpy;
  • save the data with zarr.save.
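
The five steps above can be sketched as a single function that also times each stage, which helps confirm that the read and write steps dominate. This is only an illustration: `timed` and `run_workflow` are hypothetical helpers, and `process` stands in for whatever CuPy computation is actually run.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, timings):
    """Record the wall-clock duration of the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = time.perf_counter() - start


def run_workflow(in_path, out_path, process):
    """Run the five steps once and return per-step timings in seconds."""
    # Imported here so `timed` stays usable on machines without a GPU stack.
    import cupy as cp
    import zarr

    timings = {}
    with timed("zarr.load", timings):
        host_in = zarr.load(in_path)      # disk -> host memory (NumPy)
    with timed("cp.asarray", timings):
        dev_in = cp.asarray(host_in)      # host -> device copy
    with timed("process", timings):
        dev_out = process(dev_in)         # user-supplied CuPy computation
    # CuPy launches kernels asynchronously, so part of the "process" cost can
    # show up under "cp.asnumpy", which waits for the result before copying.
    with timed("cp.asnumpy", timings):
        host_out = cp.asnumpy(dev_out)    # device -> host copy
    with timed("zarr.save", timings):
        zarr.save(out_path, host_out)     # host memory -> disk
    return timings
```

A call such as `run_workflow('../../data/rslc.zarr', 'out.zarr', lambda x: x * 2)` (the output path and the lambda are placeholders) returns a dict mapping each step to its duration.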

Reading and writing the data is too slow, which is why I am looking into kvikio.

After installing it, I ran into a problem when reading zarr:

import zarr

rslc_path = '../../data/rslc.zarr'
rslc_zarr = zarr.open(rslc_path, mode='r')
rslc_cpu = rslc_zarr[:]  # read the whole array into host memory (NumPy)

The data is successfully loaded into memory. But when I use kvikio:

from kvikio.zarr import GDSStore

rslc_zarr = zarr.open(store=GDSStore('./rslc_gpu.zarr'), mode='r')
rslc_gpu = rslc_zarr[:]  # raises the TypeError below

I got an error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[37], line 1
----> 1 rslc_gpu = rslc_zarr[:]

File ~/miniconda3/envs/work/lib/python3.10/site-packages/zarr/core.py:821, in Array.__getitem__(self, selection)
    819     result = self.vindex[selection]
    820 else:
--> 821     result = self.get_basic_selection(pure_selection, fields=fields)
    822 return result

File ~/miniconda3/envs/work/lib/python3.10/site-packages/zarr/core.py:947, in Array.get_basic_selection(self, selection, out, fields)
    944     return self._get_basic_selection_zd(selection=selection, out=out,
    945                                         fields=fields)
    946 else:
--> 947     return self._get_basic_selection_nd(selection=selection, out=out,
    948                                         fields=fields)

File ~/miniconda3/envs/work/lib/python3.10/site-packages/zarr/core.py:990, in Array._get_basic_selection_nd(self, selection, out, fields)
    984 def _get_basic_selection_nd(self, selection, out=None, fields=None):
    985     # implementation of basic selection for array with at least one dimension
    986 
    987     # setup indexer
    988     indexer = BasicIndexer(selection, self)
--> 990     return self._get_selection(indexer=indexer, out=out, fields=fields)

File ~/miniconda3/envs/work/lib/python3.10/site-packages/zarr/core.py:1285, in Array._get_selection(self, indexer, out, fields)
   1275 if (
   1276     not hasattr(self.chunk_store, "getitems") and not (
   1277         hasattr(self.chunk_store, "get_partial_values") and
   (...)
   1280 ) or any(map(lambda x: x == 0, self.shape)):
   1281     # sequentially get one key at a time from storage
   1282     for chunk_coords, chunk_selection, out_selection in indexer:
   1283 
   1284         # load chunk selection into output array
-> 1285         self._chunk_getitem(chunk_coords, chunk_selection, out, out_selection,
   1286                             drop_axes=indexer.drop_axes, fields=fields)
   1287 else:
   1288     # allow storage to get multiple items at once
   1289     lchunk_coords, lchunk_selection, lout_selection = zip(*indexer)

File ~/miniconda3/envs/work/lib/python3.10/site-packages/zarr/core.py:2006, in Array._chunk_getitem(self, chunk_coords, chunk_selection, out, out_selection, drop_axes, fields)
   2003         out[out_selection] = fill_value
   2005 else:
-> 2006     self._process_chunk(out, cdata, chunk_selection, drop_axes,
   2007                         out_is_ndarray, fields, out_selection)

File ~/miniconda3/envs/work/lib/python3.10/site-packages/zarr/core.py:1959, in Array._process_chunk(self, out, cdata, chunk_selection, drop_axes, out_is_ndarray, fields, out_selection, partial_read_decode)
   1956     tmp = np.squeeze(tmp, axis=drop_axes)
   1958 # store selected data in output
-> 1959 out[out_selection] = tmp

File cupy/_core/core.pyx:1473, in cupy._core.core._ndarray_base.__array__()

TypeError: Implicit conversion to a NumPy array is not allowed. Please use `.get()` to construct a NumPy array explicitly.

Can you please help me find out how to correctly use `kvikio`?

@madsbk

madsbk commented Apr 18, 2023

Hi @kanglcn, the GPU array support in Zarr is still in development. We just merged the final piece, which will be included in the next Zarr release, v2.15.

We need to make KvikIO use this new Zarr feature and then everything should just work hopefully :)
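
As a rough sketch of what reading straight into GPU memory might look like once both pieces are in place: Zarr's `meta_array` argument tells it which array type to allocate for results, so passing a CuPy array there should keep the output on the device instead of hitting the implicit NumPy conversion in the traceback above. This assumes a Zarr release that accepts `meta_array`, a KvikIO release whose `GDSStore` works with it, and uncompressed data. `open_gpu_array` is a hypothetical helper, not part of either library.

```python
def open_gpu_array(path):
    """Open a Zarr array whose reads are returned as CuPy arrays (sketch)."""
    # Imported lazily: these packages require a CUDA-capable environment.
    import cupy as cp
    import zarr
    from kvikio.zarr import GDSStore

    return zarr.open_array(
        store=GDSStore(path),
        mode="r",
        # Assumption: the installed Zarr supports `meta_array`. An empty CuPy
        # array asks Zarr to allocate its output buffers with CuPy.
        meta_array=cp.empty(()),
    )
```

With that helper, the failing read would become `rslc_gpu = open_gpu_array('./rslc_gpu.zarr')[:]`.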

Are you using compression?
Zarr only ships with CPU compressors, but we plan to implement GPU compression using nvCOMP.

@kanglcn
Owner Author

kanglcn commented Apr 18, 2023

Thanks @madsbk for letting me know!
I am not using any compression at the moment, but I will definitely try it if it can help speed up the IO.

Would you consider adding support for dask, i.e. dask.array.to_zarr and dask.array.from_zarr? I am scaling my code with dask, so that would be very helpful to me!
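
For context, a CPU-staged version of that Dask workflow might look like the sketch below, which is the pattern that GPU-aware `from_zarr`/`to_zarr` support would replace. `dask.array.from_zarr`, `map_blocks`, and `to_zarr` are existing Dask calls. The `process_with_dask` helper and its arguments are hypothetical, and the chunk-wise conversions are an assumption about how the code is scaled.

```python
def process_with_dask(in_path, out_path, process):
    """Read, process on the GPU chunk by chunk, and write back via host memory."""
    # Imported lazily: cupy requires a CUDA-capable environment.
    import cupy as cp
    import dask.array as da

    host_in = da.from_zarr(in_path)             # lazy, NumPy-backed chunks
    dev_in = host_in.map_blocks(cp.asarray)     # host -> device, per chunk
    dev_out = process(dev_in)                   # CuPy-backed Dask computation
    host_out = dev_out.map_blocks(cp.asnumpy)   # device -> host, per chunk
    host_out.to_zarr(out_path)                  # triggers the computation
```

Every chunk still passes through host memory twice here, which is the overhead a direct GPU path would avoid.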

@madsbk

madsbk commented Apr 18, 2023

Yes, the plan is to support dask.


3 participants