
Conversation


@rraminen rraminen commented Feb 12, 2025

This PR contains the following two changes:

  1. `__forceinline__` needs `inline` and `always_inline` on ROCm.
  2. `extra_include_paths` is required for the hipification of the quant_cuda header files.

@rraminen rraminen changed the title __forceinline__ needs inline and always_inline on ROCm Enablement on ROCm Feb 14, 2025
@rraminen rraminen marked this pull request as ready for review February 14, 2025 22:32

rraminen commented Mar 7, 2025

Hi @Tiiiger, could you please review this PR?


k-artem commented Jul 22, 2025

Hi @rraminen, +1 for this PR. One proposal: instead of adding a definition of `__forceinline__`, we can add

diff --git a/qtorch/quant/quant_cuda/bit_helper.cu b/qtorch/quant/quant_cuda/bit_helper.cu
index 794255f..c741d58 100644
--- a/qtorch/quant/quant_cuda/bit_helper.cu
+++ b/qtorch/quant/quant_cuda/bit_helper.cu
@@ -1,3 +1,5 @@
+#include <cuda.h>
+
 #define FLOAT_TO_BITS(x) (*reinterpret_cast<unsigned int*>(x))
 #define BITS_TO_FLOAT(x) (*reinterpret_cast<float*>(x))

diff --git a/qtorch/quant/quant_cuda/sim_helper.cu b/qtorch/quant/quant_cuda/sim_helper.cu
index d165793..5a81493 100644
--- a/qtorch/quant/quant_cuda/sim_helper.cu
+++ b/qtorch/quant/quant_cuda/sim_helper.cu
@@ -1,3 +1,4 @@
+#include <cuda.h>
 #include "quant_kernel.h"
 #include <cmath>

so that the `__forceinline__` definition from the HIP library is used.


k-artem commented Jul 22, 2025

Hi @stevenygd @Tiiiger, could you please take a look at this PR? Thanks in advance.
