Use mtgp32_kernel_params field inside of mtgp_engine for better OO design. #22
Comments
@Jorghi12 But why exactly do these new frameworks need access to "private" types? (by "private" I mean anything that is not listed in https://docs.nvidia.com/cuda/curand/group__HOST.html and https://docs.nvidia.com/cuda/curand/group__DEVICE.html).
The type isn't private in any way. The links you've presented only list the functions available in the Host & Device API. The reason mtgp32_kernel_params_t isn't listed there is simply because it is not a function. Take a look at the Device API. You'll see some of the functions take a mtgp32_kernel_params_t object. Surely, if the type mtgp32_kernel_params_t were private, then the function curandMakeMTGP32Constants would be unusable.
That's exactly what I'm asking about: both host and device cuRAND APIs work with Why does PyTorch use such types? Does it implement something that is not possible to accomplish using public APIs?
I just didn't see any reason for the divergence away from the CUDA API and wondered why such decisions were made. This isn't urgent but it'd be nice to have.
@ex-rzr I think my original proposal may not be coming across clearly enough. I don't want or need mtgp32_kernel_params exposed at all. I'm simply saying that the mtgp_engine class should have a pointer to a mtgp32_kernel_params_t object.
The fact that it's used in PyTorch does not mean it's correct usage.
@jszuppe I believe we're discussing different things. From within device code, surely it would still be nice to be able to pass mtgp32_kernel_params_t objects around to different mtgp_engine instances instead. You can ignore that link, since that's tangential to this conversation.
Yeah. We get that. We don't remember exactly why it was done that way. It might have something to do with performance, or it was just missed (it happens; that was not the most important part of the project and it's not like the time was unlimited).
Maybe. We're just saying: what's not in a public API is an implementation detail, and I think that's obvious. (And I'm not talking about the struct, but its fields.)
Check out NVIDIA's guide for cuRAND, page 55.
@jszuppe Sounds great. Yes, I agree, it's not part of the public API, but it would be nice to maintain the idiosyncrasies. I wouldn't have expected this to be caught at all given how subtle it is and how unrelated it is to the actual public API. The rocRAND API works very well as a whole and has improved on many different levels from the hcRNG package.
Yeah. We can't predict what implementation details or implementation-defined behaviours (like in here) will be used by developers who use cuRAND. We can just hope that developers are sane ;) and it won't happen often. We will consider your request and check how it affects performance.
Thanks! Let's not talk about hcRNG :D
I'm not sure if we will make that change, as using a pointer significantly decreases the performance of the MTGP32 RNG. The drops are: from 180 GB/s to 102 GB/s on my GTX1080, and from 263 GB/s to 156 GB/s on MI25. By the way,
my_state.pos_tbl = mtgp32_kernel_params.pos_tbl;
my_state.param_tbl = mtgp32_kernel_params.param_tbl;
my_state.temper_tbl = mtgp32_kernel_params.temper_tbl;
my_state.single_temper_tbl = mtgp32_kernel_params.single_temper_tbl;
my_state.sh1_tbl = mtgp32_kernel_params.sh1_tbl;
my_state.sh2_tbl = mtgp32_kernel_params.sh2_tbl;
my_state.mask = mtgp32_kernel_params.mask;
is incorrect. For example:
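A hedged sketch of the by-value alternative discussed here: a helper copies one generator's entries out of a shared parameter object into the engine, so the engine keeps plain fields and no pointer is dereferenced during generation. It assumes the shared tables hold one entry per generator, which is why a plain field-to-field assignment would not select the right values. Every type and function name below is hypothetical and is not part of rocRAND or cuRAND.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical table length; the real parameter tables are larger.
constexpr std::size_t kNumGenerators = 4;

// Hypothetical, simplified stand-in for the shared parameter tables
// (assumption: one entry per generator, plus a single mask).
struct shared_params_sketch {
    std::array<std::uint32_t, kNumGenerators> pos_tbl{};
    std::array<std::uint32_t, kNumGenerators> sh1_tbl{};
    std::array<std::uint32_t, kNumGenerators> sh2_tbl{};
    std::uint32_t mask = 0;
};

// Hypothetical engine-side copy: plain scalars for one generator only.
struct engine_params_sketch {
    std::uint32_t pos = 0;
    std::uint32_t sh1 = 0;
    std::uint32_t sh2 = 0;
    std::uint32_t mask = 0;
};

// Copy the entries for generator `id` out of the shared tables.
// Indexing by `id` is the step a field-to-field assignment leaves out.
inline engine_params_sketch load_params(const shared_params_sketch& shared,
                                        std::size_t id) {
    assert(id < kNumGenerators);
    engine_params_sketch p;
    p.pos = shared.pos_tbl[id];
    p.sh1 = shared.sh1_tbl[id];
    p.sh2 = shared.sh2_tbl[id];
    p.mask = shared.mask;
    return p;
}
```

With a helper of this shape, one shared parameter object can initialize any number of engines while each engine still reads its own copies in the hot loop.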
@jszuppe Yeah, adding a function for that is just fine.
I'm closing this. Please reopen if you still have some issues.
I don't see why you're diverging from CUDA for no good reason. I noticed a redundancy with the design of the mtgp_engine class.
This is how it's already done in CUDA.
Let's take a look at rocRAND.
Fields of the mtgp32_kernel_params_t class:
Fields of mtgp32_kernel_params_t:
Why not simply define the mtgp_engine class to have a pointer to mtgp32_kernel_params_t?
Example:
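A minimal sketch of the pointer-based layout proposed above, using simplified stand-in types rather than the real rocRAND or cuRAND definitions. The struct names, the table length, and the `attach` helper are all assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical, heavily simplified stand-in for mtgp32_kernel_params_t.
// Two small tables are enough to show the idea.
struct kernel_params_sketch {
    std::array<std::uint32_t, 4> pos_tbl{};  // assumed: one entry per generator
    std::array<std::uint32_t, 4> sh1_tbl{};
};

// Hypothetical engine that stores a pointer to shared parameters
// instead of carrying its own copy of each table entry.
struct engine_with_pointer {
    const kernel_params_sketch* k = nullptr;  // non-owning, shared between engines
    int id = 0;                               // which generator's entries to read

    std::uint32_t pos() const {
        assert(k != nullptr);
        return k->pos_tbl[id];
    }
    std::uint32_t sh1() const {
        assert(k != nullptr);
        return k->sh1_tbl[id];
    }
};

// Point `n` engines at the same parameter object, one generator id each.
inline void attach(engine_with_pointer* engines, int n,
                   const kernel_params_sketch& params) {
    for (int i = 0; i < n; ++i) {
        engines[i].k = &params;
        engines[i].id = i;
    }
}
```

Every engine attached this way reads from the same parameter object, which is the reuse that Reason One asks for. The price is an extra indirection on each access, which is the cost the maintainers measured in their reply.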
Reason One
Better object-oriented design. I should be able to create a single mtgp32_kernel_params object and reuse it in other hiprandStateMtgp32_t states. Otherwise, you'll have to do bloated things such as
Reason Two
Unnecessary divergence from the CUDA API leading to more work to have new frameworks run seamlessly.