Use mtgp32_kernel_params field inside of mtgp_engine for better OO design. #22

Jorghi12 · 2018-07-11T18:41:06Z

I don't see why you're diverging from CUDA for no good reason. I noticed a redundancy with the design of the mtgp_engine class.

This is how it's already done in CUDA.

struct curandStateMtgp32 {
    unsigned int s[MTGP32_STATE_SIZE];
    int offset;
    int pIdx;
    mtgp32_kernel_params_t * k;
    int precise_double_flag;
};

Let's take a look at rocRAND.
Fields of the mtgp32_kernel_params_t class:

public:
    // State
    mtgp32_state m_state;
    // Parameters
    unsigned int pos_tbl;
    unsigned int param_tbl[MTGP_TS];
    unsigned int temper_tbl[MTGP_TS];
    unsigned int single_temper_tbl[MTGP_TS];
    unsigned int sh1_tbl;
    unsigned int sh2_tbl;
    unsigned int mask;

Fields of t mtgp32_kernel_params_t:

    unsigned int pos_tbl[MTGP_BN_MAX];
    unsigned int param_tbl[MTGP_BN_MAX][MTGP_TS];
    unsigned int temper_tbl[MTGP_BN_MAX][MTGP_TS];
    unsigned int single_temper_tbl[MTGP_BN_MAX][MTGP_TS];
    unsigned int sh1_tbl[MTGP_BN_MAX];
    unsigned int sh2_tbl[MTGP_BN_MAX];
    unsigned int mask[1];

Why not simply define the mtgp_engine class to have a pointer to mtgp32_kernel_params_t?

Example:

public:
    // State
    mtgp32_state m_state;
    // Parameters
    mtgp32_kernel_params_t * k;

Reason One
Better object oriented design. I should be able to create a single mtgp32_kernel_params and reuse that object in other hiprandStateMtgp32_t states. Otherwise, you'll have to do bloated things such as

my_state.pos_tbl = mtgp32_kernel_params.pos_tbl;
my_state.param_tbl = mtgp32_kernel_params.param_tbl;
my_state.temper_tbl = mtgp32_kernel_params.temper_tbl;
my_state.single_temper_tbl = mtgp32_kernel_params.single_temper_tbl;
my_state.sh1_tbl = mtgp32_kernel_params.sh1_tbl;
my_state.sh2_tbl = mtgp32_kernel_params.sh2_tbl;
my_state.mask = mtgp32_kernel_params.mask;

Reason Two
Unnecessary divergence from the CUDA API leading to more work to have new frameworks run seamlessly.

The text was updated successfully, but these errors were encountered:

Jorghi12 · 2018-07-11T19:27:28Z

@jszuppe @ex-rzr

ex-rzr · 2018-07-12T02:29:33Z

@Jorghi12 But why exactly do these new frameworks need access to "private" types? (by "private" I mean anything that is not listed in https://docs.nvidia.com/cuda/curand/group__HOST.html and https://docs.nvidia.com/cuda/curand/group__DEVICE.html).

Jorghi12 · 2018-07-12T06:51:01Z

The type isn't private in any way.

The links you've presented only list the functions available in the Host & Device API.

The reason mtgp32_kernel_params_t isn't listed there is simply because it is not a function.

Take a look at the Device API. You'll see some of the functions take a mtgp32_kernel_params_t object.
https://docs.nvidia.com/cuda/curand/group__DEVICE.html
_host__ curandStatus_t curandMakeMTGP32Constants ( const mtgp32_params_fast_t params[], mtgp32_kernel_params_t* p )

Surely, if the type mtgp32_kernel_params_t were private, then the function curandMakeMTGP32Constants would be unusable.

ex-rzr · 2018-07-12T07:00:25Z

That's exactly what I'm asking about: both host and device cuRAND APIs work with *_t types, i.e. there is mtgp32_kernel_params_t but no mtgp32_kernel_params.

What does pyTorch uses such types? Does it implement something that is not possible to accomplish using public APIs?

Jorghi12 · 2018-07-12T07:02:46Z

I just didn't see any reason for the divergence away from the CUDA API and wondered why such decisions were made. This isn't urgent but it'd be nice to have.

Jorghi12 · 2018-07-12T07:05:24Z

@ex-rzr I think my original proposal may not be coming across clearly enough.

I don't want / need mtgp32_kernel_params exposed at all.

I'm simply saying that the mtgp_engine class should have a pointer to a mtgp32_kernel_params_t object.

jszuppe · 2018-07-12T07:05:52Z

mtgp32_kernel_params_t is not private, but its fields are implementation details. The same goes for fields of curandState<> structures.

Here's an example with PyTorch & CUDA: https://github.com/pytorch/pytorch/blob/master/aten/src/THC/THCTensorRandom.cu#L64

The fact that it's used in PyTorch does not mean it's correct usage.

Jorghi12 · 2018-07-12T07:08:04Z

@jszuppe I believe we're discussing different things. From within device code, surely it would still be a nice to have to be able to pass mtgp32_kernel_params_t objects around to different mtgp_engine classes instead. You can ignore that link, since that's tangential to this conversation.

jszuppe · 2018-07-12T07:13:59Z

Please re-read the initial message.

Yeah. We get that. We don't remember exactly why it was done that way. It might have to do something with performance, or just missed (it happens; that was not the most important part of the project and it's not like the time was unlimited). mtgp32_kernel_params_t is not the same as fields in mtgp_engine, because it's bigger. I get that pointer may solve that, and that's a valid idea. We would need to check performance. We will look at this when we have time.

I believe we're discussing different things.

Maybe. We're just saying: what's not in a public API is an implementation detail, and I think that's obvious. (And I'm not talking about struct, but it's fields.)

Jorghi12 · 2018-07-12T07:14:27Z

Check out Nvidia's guide for CURAND, page 55.
https://developer.download.nvidia.com/compute/DevZone/docs/html/CUDALibraries/doc/CURAND_Library.pdf

@jszuppe Sounds great. Yes I agree, it's not part of the public API, but it would be nice to maintain the idiosyncrasies.

I wouldn't have expected this to be caught at all given how subtle it is and how unrelated it is to the actual public API.

The rocRAND API works very well as a whole and has improved on many different levels from the hcRNG package.

jszuppe · 2018-07-12T07:52:12Z

Sounds great. Yes I agree, it's not part of the public API, but it would be nice to maintain the idiosyncrasies.

I wouldn't have expected this to be caught at all given how subtle it is and how unrelated it is to the actual public API.

Yeah. We can't predict what implementation details or implementation defined behaviours (like in here) will be used by developers that uses cuRAND. We can just hope that developers are sane ;) and it won't happen often. We will consider you request and check how it affects performance.

The rocRAND API works very well as a whole and has improved on many different levels from the hcRNG package.

Thanks! Let's not talk about hcRNG :D

jszuppe · 2018-07-18T11:47:34Z

I'm not sure if we will do that change as using pointer significantly decreases performance of MTGP32 RNG. The drops are: from 180GB/s to 102GB/s on my GTX1080, and from 263GB/s to 156 GB/s on MI25.

btw.

my_state.pos_tbl = mtgp32_kernel_params.pos_tbl;
my_state.param_tbl = mtgp32_kernel_params.param_tbl;
my_state.temper_tbl = mtgp32_kernel_params.temper_tbl;
my_state.single_temper_tbl = mtgp32_kernel_params.single_temper_tbl;
my_state.sh1_tbl = mtgp32_kernel_params.sh1_tbl;
my_state.sh2_tbl = mtgp32_kernel_params.sh2_tbl;
my_state.mask = mtgp32_kernel_params.mask;

is incorrect. For example: mtgp32_kernel_params.pos_tbl is array and my_state.pos_tbl is a single unsigned int value. If you need simple function for switching those parameters we can just add a function/method for that.

Jorghi12 · 2018-07-22T08:56:23Z

@jszuppe Yeah, adding a function for that is just fine.

jszuppe · 2018-07-23T12:18:33Z

@Jorghi12 check 23a7294 (develop branch)

jszuppe · 2018-07-30T10:50:08Z

@Jorghi12 Have you checked 23a7294? Can I close this issue?

jszuppe · 2018-08-10T07:45:44Z

I'm closing this. Please reopen if you still have some issues.

Jorghi12 changed the title ~~Better to use a mtgp32_kernel_params field inside of the mtgp_engine structs.~~ Use mtgp32_kernel_params field inside of mtgp_engine for better OO design. Jul 11, 2018

jszuppe mentioned this issue Jul 16, 2018

Make the state structs accessible directly from C. #21

Closed

Jorghi12 mentioned this issue Jul 17, 2018

Replace hcRNG with rocRAND. ROCm/pytorch#41

Merged

jszuppe closed this as completed Aug 10, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use mtgp32_kernel_params field inside of mtgp_engine for better OO design. #22

Use mtgp32_kernel_params field inside of mtgp_engine for better OO design. #22

Jorghi12 commented Jul 11, 2018 •

edited

Loading

Jorghi12 commented Jul 11, 2018

ex-rzr commented Jul 12, 2018

Jorghi12 commented Jul 12, 2018 •

edited

Loading

ex-rzr commented Jul 12, 2018

Jorghi12 commented Jul 12, 2018

Jorghi12 commented Jul 12, 2018 •

edited

Loading

jszuppe commented Jul 12, 2018

Jorghi12 commented Jul 12, 2018 •

edited

Loading

jszuppe commented Jul 12, 2018 •

edited

Loading

Jorghi12 commented Jul 12, 2018 •

edited

Loading

jszuppe commented Jul 12, 2018 •

edited

Loading

jszuppe commented Jul 18, 2018 •

edited

Loading

Jorghi12 commented Jul 22, 2018

jszuppe commented Jul 23, 2018 •

edited

Loading

jszuppe commented Jul 30, 2018

jszuppe commented Aug 10, 2018

Use mtgp32_kernel_params field inside of mtgp_engine for better OO design. #22

Use mtgp32_kernel_params field inside of mtgp_engine for better OO design. #22

Comments

Jorghi12 commented Jul 11, 2018 • edited Loading

Jorghi12 commented Jul 11, 2018

ex-rzr commented Jul 12, 2018

Jorghi12 commented Jul 12, 2018 • edited Loading

ex-rzr commented Jul 12, 2018

Jorghi12 commented Jul 12, 2018

Jorghi12 commented Jul 12, 2018 • edited Loading

jszuppe commented Jul 12, 2018

Jorghi12 commented Jul 12, 2018 • edited Loading

jszuppe commented Jul 12, 2018 • edited Loading

Jorghi12 commented Jul 12, 2018 • edited Loading

jszuppe commented Jul 12, 2018 • edited Loading

jszuppe commented Jul 18, 2018 • edited Loading

Jorghi12 commented Jul 22, 2018

jszuppe commented Jul 23, 2018 • edited Loading

jszuppe commented Jul 30, 2018

jszuppe commented Aug 10, 2018

Jorghi12 commented Jul 11, 2018 •

edited

Loading

Jorghi12 commented Jul 12, 2018 •

edited

Loading

Jorghi12 commented Jul 12, 2018 •

edited

Loading

Jorghi12 commented Jul 12, 2018 •

edited

Loading

jszuppe commented Jul 12, 2018 •

edited

Loading

Jorghi12 commented Jul 12, 2018 •

edited

Loading

jszuppe commented Jul 12, 2018 •

edited

Loading

jszuppe commented Jul 18, 2018 •

edited

Loading

jszuppe commented Jul 23, 2018 •

edited

Loading