EDDL does not compile for GPU #40

Closed
simleo opened this issue Oct 8, 2019 · 2 comments

Comments


simleo commented Oct 8, 2019

As of 160c689 (the head of the master branch as I'm writing this), the patch in #34 is not enough to build EDDL for GPU. I had to add the missing kernel sources to CMakeLists.txt and fix a few problems in other files. This is the full patch I had to use:

diff --git a/CMakeLists.txt b/CMakeLists.txt
index 200fc36..e2947d9 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -99,7 +99,7 @@ endif()
 
 # [MACRO] Download submodules
 macro(eddl_update_third_party SUBMODULE)
-    if(GIT_FOUND AND EXISTS "${PROJECT_SOURCE_DIR}/.git")
+    if(GIT_FOUND AND IS_DIRECTORY "${PROJECT_SOURCE_DIR}/.git")
     # Update submodule as needed
         message(STATUS "${SUBMODULE} update")
         execute_process(COMMAND ${GIT_EXECUTABLE} submodule update --init --recursive third_party/${SUBMODULE}
@@ -278,19 +278,27 @@ SET(CUDA_SOURCES
         src/hardware/gpu/gpu_hw.h
         src/hardware/gpu/gpu_comparison.cu
         src/hardware/gpu/gpu_core.cu
+        src/hardware/gpu/gpu_core_kernels.cu
         src/hardware/gpu/gpu_create.cu
+        src/hardware/gpu/gpu_create_kernels.cu
         src/hardware/gpu/gpu_generator.cu
         src/hardware/gpu/gpu_math.cu
+        src/hardware/gpu/gpu_math_kernels.cu
         src/hardware/gpu/gpu_reduction.cu
         src/hardware/gpu/gpu_tensor.cu
         src/hardware/gpu/gpu_tensor.h
         src/hardware/gpu/gpu_kernels.h
         src/hardware/gpu/nn/gpu_nn.h
         src/hardware/gpu/nn/gpu_activations.cu
+        src/hardware/gpu/nn/gpu_activations_kernels.cu
         src/hardware/gpu/nn/gpu_conv.cu
+        src/hardware/gpu/nn/gpu_conv_kernels.cu
         src/hardware/gpu/nn/gpu_losses.cu
+        src/hardware/gpu/nn/gpu_losses_kernels.cu
         src/hardware/gpu/nn/gpu_metrics.cu
+        src/hardware/gpu/nn/gpu_metrics_kernels.cu
         src/hardware/gpu/nn/gpu_pool.cu
+        src/hardware/gpu/nn/gpu_pool_kernels.cu
         )
 SET(SOURCES ${CPP_SOURCES} ${CUDA_SOURCES})
 
@@ -310,6 +318,8 @@ endif(EDDL_SHARED)
 target_sources(eddl PRIVATE ${CPP_SOURCES})
 if(EDDL_WITH_CUDA AND CMAKE_CUDA_COMPILER)
     target_sources(eddl PRIVATE ${CUDA_SOURCES})
+    target_include_directories(eddl PUBLIC ${CMAKE_CUDA_TOOLKIT_INCLUDE_DIRECTORIES})
+    add_compile_definitions(cGPU)
 endif()
 target_include_directories(eddl PUBLIC
 	$<BUILD_INTERFACE:${CMAKE_CURRENT_SOURCE_DIR}/src>
diff --git a/src/hardware/gpu/gpu_hw.h b/src/hardware/gpu/gpu_hw.h
index 3130eb4..42b69e3 100644
--- a/src/hardware/gpu/gpu_hw.h
+++ b/src/hardware/gpu/gpu_hw.h
@@ -43,7 +43,7 @@ void gpu_fill(Tensor *A, int aini, int aend, Tensor *B, int bini, int bend, int
 void gpu_select(Tensor *A, Tensor *B, vector<int> sind, int ini, int end);
 
 // GPU: Create (static)
-void gpu_range(Tensor *A, float min, float step, int size)
+void gpu_range(Tensor *A, float min, float step, int size);
 
 // GPU: Generator
 void gpu_rand_uniform(Tensor *A, float v);
diff --git a/src/hardware/gpu/gpu_kernels.h b/src/hardware/gpu/gpu_kernels.h
index cd7fb71..159fb7a 100644
--- a/src/hardware/gpu/gpu_kernels.h
+++ b/src/hardware/gpu/gpu_kernels.h
@@ -56,6 +56,7 @@ __global__ void mod_(float* a, long int rows, long int cols, float v);
 __global__ void mult_(float* a, long int rows, long int cols, float v);
 __global__ void normalize_(float* a, long int rows, long int cols, float min_ori, float max_ori, float min, float max);
 __global__ void pow_(float* a, long int rows, long int cols, float exp);
+__global__ void range(float* a, long int rows, long int cols, float min, float step, int size);
 __global__ void reciprocal_(float* a, long int rows, long int cols);
 __global__ void remainder_(float* a, long int rows, long int cols, float v);
 __global__ void round_(float* a, long int rows, long int cols);
diff --git a/src/tensor/tensor_create.cpp b/src/tensor/tensor_create.cpp
index 6ce948f..8733169 100644
--- a/src/tensor/tensor_create.cpp
+++ b/src/tensor/tensor_create.cpp
@@ -16,7 +16,7 @@ Tensor* raw_range(float min, float step, int size, int dev){
         cpu_range(t, min, step);
     }
 #ifdef cGPU
-    else if (isGPU())
+    else if (t->isGPU())
       {
         cpu_range(t, min, step);
       }
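
The `add_compile_definitions(cGPU)` line and the `#ifdef cGPU` branch in the patch work together: without the definition, the preprocessor drops the GPU branch before compilation. (`add_compile_definitions` requires CMake 3.12 or newer.) The following is a minimal sketch of that mechanism, not EDDL code; `Device`, `dispatch` and `built_with_gpu` are hypothetical names used only for illustration.

```cpp
#include <string>

// Hypothetical stand-in for a device-dispatching function such as raw_range.
enum class Device { CPU, GPU };

// The GPU branch only exists in the binary when cGPU is defined at compile
// time, e.g. through add_compile_definitions(cGPU) or -DcGPU on the command line.
std::string dispatch(Device dev) {
    if (dev == Device::CPU) {
        return "cpu path";
    }
#ifdef cGPU
    else if (dev == Device::GPU) {
        return "gpu path";
    }
#endif
    return "unsupported device";
}

// Reports whether this translation unit was compiled with cGPU defined.
bool built_with_gpu() {
#ifdef cGPU
    return true;
#else
    return false;
#endif
}
```

Compiled without `-DcGPU`, `dispatch(Device::GPU)` falls through to the last `return`; compiled with it, the GPU branch is taken.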

Could you please integrate these changes so that the GPU build works out of the box?
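
The first hunk of the patch above replaces `EXISTS` with `IS_DIRECTORY` in the `.git` check. The issue does not say why. One plausible reason, stated here as an assumption, is that Git submodules and linked worktrees store `.git` as a regular file containing a `gitdir:` pointer, so the two checks give different answers in such a checkout. The sketch below reproduces that difference with `std::filesystem` (C++17); the scratch directory name and the helper functions are made up for illustration.

```cpp
#include <filesystem>
#include <fstream>

namespace fs = std::filesystem;

// Creates a scratch "checkout" whose .git entry is a regular file, as Git
// does for submodules and linked worktrees. The directory name is made up.
fs::path make_checkout_with_git_file() {
    fs::path root = fs::temp_directory_path() / "eddl_git_file_demo";
    fs::remove_all(root);
    fs::create_directories(root);
    std::ofstream(root / ".git") << "gitdir: /path/to/real/gitdir\n";
    return root;
}

// Mirrors CMake's if(EXISTS ...): true for files and directories alike.
bool git_entry_exists(const fs::path& root) {
    return fs::exists(root / ".git");
}

// Mirrors CMake's if(IS_DIRECTORY ...): true only for a real .git directory.
bool git_entry_is_directory(const fs::path& root) {
    return fs::is_directory(root / ".git");
}
```

For the scratch checkout, the existence check succeeds while the directory check fails, so under this assumption the patched macro would skip the submodule update.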


prittt commented Oct 8, 2019

@simleo, we are preparing a pull request that attempts to fix the CMake file. It already includes all the modifications you mentioned.


simleo commented Oct 9, 2019

As of 7b89d5a, these changes have either been integrated or the problems they solved have been fixed in a different way. Closing.

simleo closed this as completed on Oct 9, 2019