
How to use cudnn backend API to do int8x32 convolution calculation on Ampere? #8

Closed
ZhaoJob opened this issue Aug 14, 2021 · 5 comments



ZhaoJob commented Aug 14, 2021

Can you give samples of int8x32 convolution calculation using the cudnn backend API?
1. How to create xTensor, wTensor, and so on?
2. How to create the conv_op node?
3. What other objects need to be created?

@Anerudhan
Collaborator

Thank you for raising this issue.

A sample with int8x32 has been created for you to look at:

        // x tensor: INT8 data, vectorized 32-wide along the channel dimension
        auto tensor_x = cudnn_frontend::TensorBuilder()
                                       .setDim(4, x_dim_padded)
                                       .setStrides(4, strideA_padded)
                                       .setId('x')
                                       .setAlignment(16)
                                       .setDataType(CUDNN_DATA_INT8)
                                       .setVectorCountAndDimension(vectorCount, vectorDimension)
                                       .build();

Note: There are numerical issues with engine_id=0 and the int8x32 vectorCount; this has been added to the errata.
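
For the conv_op node (questions 2 and 3), the same builder pattern applies. A minimal sketch along the lines of the frontend samples; the conv_stride, conv_padding, and conv_dilation arrays, and tensor_w/tensor_y, are placeholders you would fill in yourself:

        // Convolution descriptor: int8 data accumulates in INT32
        auto conv_desc = cudnn_frontend::ConvDescBuilder()
                             .setDataType(CUDNN_DATA_INT32)
                             .setMathMode(CUDNN_CROSS_CORRELATION)
                             .setNDims(2)
                             .setStrides(2, conv_stride)
                             .setPrePadding(2, conv_padding)
                             .setPostPadding(2, conv_padding)
                             .setDilation(2, conv_dilation)
                             .build();

        // Forward convolution node tying x, w, y and the descriptor together
        auto conv_op = cudnn_frontend::OperationBuilder(
                           CUDNN_BACKEND_OPERATION_CONVOLUTION_FORWARD_DESCRIPTOR)
                           .setxDesc(tensor_x)
                           .setwDesc(tensor_w)
                           .setyDesc(tensor_y)
                           .setcDesc(conv_desc)
                           .setAlpha(1.0f)
                           .setBeta(0.0f)
                           .build();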


ZhaoJob commented Aug 20, 2021

Thanks for the reply.
I have referred to the following int8x32 samples:
1. the sample you provided: conv_op with int8x32 (x, y, w are all int8x32)
2. a sample from another developer on GitHub: conv_op + add_op + bias_op + activation_op (x, y, w, z, b are all int8x4; after modification it also runs as int8x32 on the IMMA kernels)

For my project I need to test the conv_op + scale_op + bias_op + activation_op case (equivalent to using cudnnFusedOpsExecute), in which
datatype: x, y, w are all int8x32; scale and bias are float (fp32)
dim: x[N, IC, IH, IW], y[N, OC, OH, OW], w[OC, IC, KH, KW], scale[N, OC, 1, 1], bias[N, OC, 1, 1]

According to the cuDNN 8 developer guide, convolution + pointwise fusion is supported flexibly when the compute capability is 7.5 or above.
Following the samples above, I have written my own conv_op + scale_op + bias_op + activation_op sample (x, y, w all int8x32; scale and bias float), but cudnnBackendExecute() cannot find a supported engine. Can you give a sample, or correct the error in my code?
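
For reference, the graph I am building looks roughly like this (a condensed sketch, not my exact code; tensor_scale and tensor_after_scale are parameter/virtual tensors created the same way as tensor_x):

        // Pointwise scale (MUL) computed in fp32; the bias (ADD) and
        // activation (RELU) descriptors follow the same pattern
        auto scale_desc = cudnn_frontend::PointWiseDescBuilder()
                              .setMode(CUDNN_POINTWISE_MUL)
                              .setMathPrecision(CUDNN_DATA_FLOAT)
                              .build();

        // scale_op consumes the (virtual, fp32) conv output;
        // bias_op and act_op chain onto it the same way
        auto scale_op = cudnn_frontend::OperationBuilder(
                            CUDNN_BACKEND_OPERATION_POINTWISE_DESCRIPTOR)
                            .setxDesc(conv_op.getOutputTensor())
                            .setbDesc(tensor_scale)
                            .setyDesc(tensor_after_scale)
                            .setpwDesc(scale_desc)
                            .build();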

@Anerudhan
Collaborator

Hi Zhao,

Apologies for the delayed response. We found that we do not support the above data type combination because of an internal bug. We have a fix for this; it will be part of our upcoming 8.3.0 release and will be supported through the runtime fusion engine.

Thanks

@Anerudhan
Collaborator

Hi Zhao,

We have fixed this issue in cuDNN v8.3 and have released a ConvScaleBiasAct_int8 sample for it. Let us know if it addresses your use case.
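
For reference, the overall flow in that sample follows the standard frontend pattern; a rough sketch (the variable names here are placeholders, see the released sample for the exact code):

        // Collect the fused ops into one operation graph
        std::array<cudnn_frontend::Operation const*, 4> ops = {&conv_op, &scale_op, &bias_op, &act_op};
        auto op_graph = cudnn_frontend::OperationGraphBuilder()
                            .setHandle(handle)
                            .setOperationGraph(ops.size(), ops.data())
                            .build();

        // Ask heuristics for candidate engine configs and finalize a plan
        auto heuristics = cudnn_frontend::EngineHeuristicsBuilder()
                              .setOperationGraph(op_graph)
                              .setHeurMode(CUDNN_HEUR_MODE_INSTANT)
                              .build();
        auto& configs = heuristics.getEngineConfig(heuristics.getEngineConfigCount());
        auto plan = cudnn_frontend::ExecutionPlanBuilder()
                        .setHandle(handle)
                        .setEngineConfig(configs[0], op_graph.getTag())
                        .build();

        // Bind device pointers (x, w, y, scale, bias) and execute
        auto variant_pack = cudnn_frontend::VariantPackBuilder()
                                .setWorkspacePointer(workspace_ptr)
                                .setDataPointers(5, data_ptrs)
                                .setUids(5, uids)
                                .build();
        cudnnBackendExecute(handle, plan.get_raw_desc(), variant_pack.get_raw_desc());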

Thanks

@Anerudhan
Collaborator

Hi @ZhaoJob, hope the responses above answer your questions! I'm closing the issue for now. If you have additional questions, please feel free to open a new issue!

Thanks
