
[RFC] Rename kDLGPU to kDLCUDA, and kDLCPUPinned to kDLCUDAHost #67

Closed
YuchenJin opened this issue May 7, 2021 · 13 comments

@YuchenJin
Contributor

YuchenJin commented May 7, 2021

This RFC proposes to rename kDLGPU to kDLCUDA, and kDLCPUPinned to kDLCUDAHost; a sketch of the proposed change follows the list below. There are two main reasons for this renaming:

  1. There are now many more kinds of GPUs, e.g., AMD Radeon GPUs and Qualcomm Adreno GPUs, so "GPU" no longer unambiguously means CUDA.
  2. Mainstream frameworks such as PyTorch already name the CUDA device explicitly, e.g., PyTorch uses torch.cuda for its CUDA tensor types, so this renaming makes DLPack more consistent with other frameworks.
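
A minimal sketch of what the rename could look like in dlpack.h's DLDeviceType enum (surrounding entries omitted; the numeric values are kept exactly as they are today, shown here from memory):

```c
/* Sketch only: proposed renaming in dlpack.h; numeric values stay unchanged. */
typedef enum {
  kDLCPU = 1,
  kDLCUDA = 2,      /* previously kDLGPU: CUDA device (GPU) memory */
  kDLCUDAHost = 3,  /* previously kDLCPUPinned: CUDA pinned/page-locked host memory */
  /* ... remaining device types unchanged ... */
} DLDeviceType;
```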

Looking forward to hearing your thoughts!

@tqchen
Member

tqchen commented May 7, 2021

cc @kkraus14 @leofang @prigoyal @zdevito @rgommers @oleksandr-pavlyk

@oleksandr-pavlyk
Contributor

I think it would be feasible to add kDLGPU as a synonym for kDLCUDA for some time, should any compatibility concerns arise. Otherwise, I am in favor of the renaming.

The renaming will likely bump DLPACK_VERSION. With the adoption of DLPack in the Python array API, how would a Python user (a consumer library author) know which DLPack version the producer has been using?

Additionally, are there any guidelines about extending the DLDeviceType enum? Is kDLExtDev allowed to be moved, i.e., if I wanted to add kDLSycl_LevelZero_GPU and a bunch of other values to the enum, would I do that before the guard, or after?
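
To illustrate the synonym idea: C enums permit repeated values, so the old names could map to the same integers for a deprecation period (a rough sketch, not a concrete proposal):

```c
/* Sketch: keep the old names as deprecated aliases for one release cycle. */
typedef enum {
  kDLCPU = 1,
  kDLCUDA = 2,
  kDLGPU = 2,        /* deprecated alias of kDLCUDA */
  kDLCUDAHost = 3,
  kDLCPUPinned = 3,  /* deprecated alias of kDLCUDAHost */
  /* ... */
} DLDeviceType;
```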

@tqchen
Member

tqchen commented May 7, 2021

All of the changes so far (including this one) are ABI compatible, which means data exported by a system using an earlier version of DLPack can be imported by a system using a later version (the numeric values have not changed; we are only renaming the fields). For the same reason, we may want to avoid changing existing numbers (e.g. kDLExtDev) and instead introduce new ones.

ABI-breaking changes should be considered more seriously, and so far we have not encountered any.
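
As a rough illustration of why the rename is ABI compatible (the literals 2 and 3 are the historical values of kDLGPU and kDLCPUPinned as I recall them, used only for this sketch):

```c
#include "dlpack/dlpack.h"

/* Illustration only: the renamed members keep their numeric values, so data
 * exported by an older producer is interpreted identically by a newer consumer. */
_Static_assert((int)kDLCUDA == 2, "kDLCUDA must keep kDLGPU's old value");
_Static_assert((int)kDLCUDAHost == 3, "kDLCUDAHost must keep kDLCPUPinned's old value");
```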

@kkraus14

kkraus14 commented May 7, 2021

While this isn't an ABI-breaking change, it will break the build of anyone using kDLGPU or kDLCPUPinned.

@tqchen
Member

tqchen commented May 7, 2021

We would assume that each consumer of DLPack maintains a copy (or submodule hash) of a particular version, so a framework can choose when to upgrade DLPack and update its usage of kDLGPU to kDLCUDA at the same time. Perhaps keeping kDLGPU and kDLCPUPinned as aliases for another release cycle would resolve the source-compatibility issue.
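
As an example, a downstream project that vendors dlpack.h could bridge the rename with a small shim while it migrates. This is purely a hypothetical sketch; the MYLIB_* names and the DLPACK_VERSION cutoff are placeholders, not part of DLPack:

```c
#include "dlpack/dlpack.h"

/* Hypothetical consumer-side shim: map project-local names onto whichever
 * spelling the vendored dlpack.h provides. The cutoff below stands in for
 * the first DLPack release that ships the new names. */
#if defined(DLPACK_VERSION) && (DLPACK_VERSION >= 60)  /* placeholder cutoff */
  #define MYLIB_DEVICE_CUDA       kDLCUDA
  #define MYLIB_DEVICE_CUDA_HOST  kDLCUDAHost
#else
  #define MYLIB_DEVICE_CUDA       kDLGPU
  #define MYLIB_DEVICE_CUDA_HOST  kDLCPUPinned
#endif
```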

@leofang
Contributor

leofang commented May 7, 2021

I took a quick look at PyTorch and I think this line should be fixed, but it is likely unrelated to this RFC (not sure):
https://github.com/pytorch/pytorch/blob/4cb534f92ef6f5b2ec99109b0329f93a859ae831/caffe2/python/pybind_state_dlpack.cc#L11
cc: @emcastillo @asi1024 @kmaehashi for visibility

By the way, following this renaming (which I support for clarity, provided it introduces no backward incompatibility), can we please add a new entry, kDLCUDAManaged? It would be very useful for a project I am about to start prototyping. Moreover, managed memory should not be viewed as a CPU-only or GPU-only entity, but as something distinct. I am surprised it hasn't been proposed yet.

@leofang
Contributor

leofang commented May 7, 2021

> following this renaming (which I support for clarity, provided it introduces no backward incompatibility)

On second thought, kDLCPUPinned has its own value, as both CUDA and ROCm provide pinned memory. If we are renaming this entry, I'd prefer to rename kDLCPUPinned to kDLCUDAPinned and to add kDLROCMPinned (following kDLROCM).

EDIT: fix uppercase names.

@rgommers
Contributor

rgommers commented May 8, 2021

I think renaming is much better than using an alias. An alias just drags out the work rather than reducing it, unless the intent is to keep it forever (which seems undesirable). And while C enums can have duplicate values, they are normally expected to be unique.

It seems to me that there is no real source-compatibility issue; people just need to rename when they upgrade their vendored copy of dlpack.h.

Recently DLContext was renamed to DLDevice, and that didn't cause any issues AFAIK.

@tqchen
Member

tqchen commented May 8, 2021

To follow up on @leofang's comment, kDLCUDACPUPinned might be less ambiguous, mainly because the memory is still allocated on the host rather than on the device. If we want a shorter name, kDLCUDAHost might be fine, since "host" usually refers to host memory.

@leofang
Contributor

leofang commented May 9, 2021

> If we want a shorter name, kDLCUDAHost might be fine, since "host" usually refers to host memory.

I think this is fine, since pinned memory is allocated via cudaMallocHost (and likewise in ROCm/HIP).
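
For context, a minimal sketch of how such a buffer would be allocated and tagged, assuming the proposed kDLCUDAHost name (error handling omitted):

```c
#include <cuda_runtime_api.h>
#include "dlpack/dlpack.h"

/* Sketch: pinned (page-locked) host memory comes from cudaMallocHost, so the
 * proposed kDLCUDAHost device type would describe it. */
static DLDevice alloc_pinned(void **data, size_t nbytes) {
  cudaMallocHost(data, nbytes);       /* CUDA pinned host allocation */
  DLDevice dev = { kDLCUDAHost, 0 };  /* device_type, device_id */
  return dev;
}
```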

So, let me recap what has been discussed so far:

  • kDLGPU -> kDLCUDA (rename)
  • kDLCPUPinned -> kDLCUDAHost (rename)
  • kDLCUDAManaged (new)
  • kDLROCMHost (new)

For now we do not add kDLROCMManaged, since there is no managed memory in ROCm/HIP, but it can easily be added once it is provided and needed.
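
For reference, a rough sketch of what the combined change set could look like in dlpack.h; the values chosen for the new entries below are placeholders of mine, not something decided in this thread:

```c
/* Sketch only: renames keep their existing values; new entries take fresh,
 * previously unused values (placeholders). */
typedef enum {
  kDLCPU = 1,
  kDLCUDA = 2,          /* renamed from kDLGPU */
  kDLCUDAHost = 3,      /* renamed from kDLCPUPinned */
  /* ... other existing entries unchanged ... */
  kDLROCM = 10,
  kDLROCMHost = 11,     /* new: ROCm/HIP pinned host memory */
  kDLCUDAManaged = 13,  /* new: CUDA managed (unified) memory */
} DLDeviceType;
```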

EDIT: fix uppercase names.

@tqchen
Member

tqchen commented May 9, 2021

The managed memory proposal could go into a second RFC, as it would be more involved than a renaming.

@YuchenJin
Contributor Author

Thank you all for the discussion! I will create a PR that does only the renaming for now.

@tqchen
Member

tqchen commented May 11, 2021

Implemented by #68
