
Do Fast-Math Intrinsics Align with the Baseline? #3

Closed
K-Wu opened this issue Dec 16, 2021 · 2 comments

Comments


K-Wu commented Dec 16, 2021

Hi, I am very interested in your OSDI'21 work. However, I noticed that you use __fmaf_rn in your repo. According to the documentation (https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#intrinsic-functions) this is a fast-math intrinsic, yet in my experience nvcc already emits multiply-add instructions heavily even when expressions are written in the naive form and no such intrinsics are used. I am not sure whether using this intrinsic aligns with the baseline, or how it helps you achieve your goal. Could you explain this to me? Thank you.
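
For reference, this is the kind of comparison I have in mind; a minimal sketch where the kernel and variable names are hypothetical and not taken from your repo:

```cuda
// Minimal illustration only; kernel and variable names are hypothetical,
// not taken from the repo.
__global__ void axpy_plain(const float* x, const float* y, float a,
                           float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    // With nvcc's default --fmad=true, this naive expression is usually
    // contracted into a fused multiply-add (FFMA) instruction anyway.
    if (i < n) out[i] = a * x[i] + y[i];
}

__global__ void axpy_intrinsic(const float* x, const float* y, float a,
                               float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    // Explicitly requests a single-precision fused multiply-add,
    // rounded to nearest even.
    if (i < n) out[i] = __fmaf_rn(a, x[i], y[i]);
}
```

Under nvcc's default settings (--fmad=true is the default), inspecting the generated SASS (e.g., with cuobjdump -sass) usually shows FFMA for both versions, which is why I was wondering what the explicit intrinsic buys here.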

@YukeWang96 (Owner)

Hi Kun,

Thanks for your interest in our project. Here are my current answers to your questions.

  • One of the major reasons for not using fast-math intrinsics is their reduced precision. However, when we validated our kernel against both the kernels without those intrinsics and the standard graphConv kernel from DGL, we observed only very minor (less than 10^-5) or no (i.e., exactly matching) output differences, depending on the input graph.
  • Regarding your interest in the DGL baseline: DGL treats the degree norms and the embeddings of all nodes as regular dense tensors and multiplies them together directly, while its sparse kernel only handles the sparse neighbor aggregation (see the kernel-level sketch after this list).
    I have also attached the relevant code from DGL for your reference:
    https://github.com/dmlc/dgl/blob/6c81634b295b41c2d5c6d17433a2c56dc7aeda37/python/dgl/nn/pytorch/conv/graphconv.py#L424-L447
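
To make that split concrete, here is a minimal kernel-level sketch of the structure described above: a dense elementwise kernel applies the degree norms to the embeddings, and a separate sparse (CSR) kernel performs the neighbor aggregation. This is only my own illustration, not DGL's actual implementation; the toy graph, kernel names, and launch configuration are made up.

```cuda
// Sketch of the split: dense norm scaling followed by sparse (CSR)
// neighbor aggregation. Illustration only, NOT DGL's actual code;
// the toy graph, names, and launch sizes are made up.
#include <cstdio>
#include <cuda_runtime.h>

// Dense step: scale each node's embedding by that node's degree norm.
__global__ void scale_by_norm(const float* h, const float* norm,
                              float* h_scaled, int num_nodes, int dim) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < num_nodes * dim) {
        int node = idx / dim;
        h_scaled[idx] = norm[node] * h[idx];
    }
}

// Sparse step: out[v] = sum of h_scaled[u] over the CSR neighbors u of v.
__global__ void aggregate_csr(const int* row_ptr, const int* col_idx,
                              const float* h_scaled, float* out,
                              int num_nodes, int dim) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < num_nodes * dim) {
        int node = idx / dim;
        int f = idx % dim;
        float acc = 0.0f;
        for (int e = row_ptr[node]; e < row_ptr[node + 1]; ++e)
            acc += h_scaled[col_idx[e] * dim + f];
        out[idx] = acc;
    }
}

int main() {
    // Toy triangle graph: 3 nodes, each connected to the other two (CSR).
    const int num_nodes = 3, dim = 2;
    int   row_ptr[] = {0, 2, 4, 6};
    int   col_idx[] = {1, 2, 0, 2, 0, 1};
    float h[]       = {1, 2, 3, 4, 5, 6};   // 3 nodes x 2 features
    float norm[]    = {0.5f, 0.5f, 0.5f};   // e.g. 1/degree

    int *d_row, *d_col;
    float *d_h, *d_norm, *d_hs, *d_out;
    cudaMalloc(&d_row, sizeof(row_ptr));
    cudaMalloc(&d_col, sizeof(col_idx));
    cudaMalloc(&d_h, sizeof(h));
    cudaMalloc(&d_norm, sizeof(norm));
    cudaMalloc(&d_hs, sizeof(h));
    cudaMalloc(&d_out, sizeof(h));
    cudaMemcpy(d_row, row_ptr, sizeof(row_ptr), cudaMemcpyHostToDevice);
    cudaMemcpy(d_col, col_idx, sizeof(col_idx), cudaMemcpyHostToDevice);
    cudaMemcpy(d_h, h, sizeof(h), cudaMemcpyHostToDevice);
    cudaMemcpy(d_norm, norm, sizeof(norm), cudaMemcpyHostToDevice);

    scale_by_norm<<<1, 64>>>(d_h, d_norm, d_hs, num_nodes, dim);
    aggregate_csr<<<1, 64>>>(d_row, d_col, d_hs, d_out, num_nodes, dim);

    float out[num_nodes * dim];
    cudaMemcpy(out, d_out, sizeof(out), cudaMemcpyDeviceToHost);
    for (int i = 0; i < num_nodes * dim; ++i) printf("%g ", out[i]);
    printf("\n");
    return 0;
}
```

The point of the split is that only the aggregation step touches the sparse graph structure; applying the degree norms stays a purely dense elementwise operation.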


K-Wu commented Dec 17, 2021

Hi Yuke,

Thank you for the detailed explanation and the pointer to the related code in the DGL baseline. They cleared up my concerns.

Best Regards,
Kun

K-Wu closed this as completed on Dec 17, 2021.