Part 1 of the big GPU merge #45

Merged: 61 commits, Nov 14, 2016. The changes shown below are from 28 of the 61 commits.

Commits (61)
b38f78b
nvcc can build miniapp
bcumming Sep 8, 2016
7e54957
WIP towards basic GPU implementation
bcumming Sep 9, 2016
b7bd5dd
gpu WIP
bcumming Sep 29, 2016
3edad46
Merge branch 'master' into gpu
bcumming Sep 29, 2016
1388d20
WIP: progress towards crash free GPU existance
bcumming Sep 29, 2016
8e6d97f
finish moving vector to internal memory module
bcumming Sep 30, 2016
4ee9930
add the new tests that I forgot in the last commit
bcumming Sep 30, 2016
4c0864b
updated fvm multicell code for the new memory library
bcumming Sep 30, 2016
d1a5b6e
WIP gpu version
bcumming Oct 4, 2016
36ebcc7
WIP merge
bcumming Oct 6, 2016
851f38c
Merge branch 'master' into gpu
bcumming Oct 6, 2016
4a1777f
gpu WIP
bcumming Oct 9, 2016
2c6f5a3
WIP - towards gpu support
bcumming Oct 10, 2016
c742fd5
WIP gpu
bcumming Oct 13, 2016
039d5a9
Merge pull request #1 from eth-cscs/master
bcumming Oct 13, 2016
121dade
WIP
bcumming Oct 15, 2016
b6aec69
Support for units syntax within state block.
Oct 22, 2016
44c8ac0
WIP
bcumming Oct 24, 2016
78f60f5
WIP
bcumming Oct 25, 2016
5c09594
Merge branch 'master' of github.com:eth-cscs/nestmc-proto
bcumming Oct 25, 2016
3830c9a
WIP
bcumming Oct 25, 2016
2db99a2
WIP
bcumming Oct 25, 2016
65ed2fe
WIP
bcumming Oct 25, 2016
7f1216d
WIP
bcumming Oct 25, 2016
de992ad
merge with master
bcumming Oct 26, 2016
b5bae5b
WIP
bcumming Oct 26, 2016
06ef117
Merge branch 'master' of github.com:eth-cscs/nestmc-proto into gpu
bcumming Oct 26, 2016
18a9226
Merge branch 'master' into bugfix/modcc/state-block-units
Oct 26, 2016
05e4770
Adds unit tests for the STATE block.
Oct 26, 2016
b9b1e57
WIP
bcumming Oct 27, 2016
dbca304
Address deprecated use of 'symbol' warning.
halfflat Oct 27, 2016
ff14704
final merge with master
bcumming Oct 27, 2016
9e0b874
Addresses PR comments.
Oct 27, 2016
f1eac25
Merge pull request #35 from vkarak/bugfix/modcc/state-block-units
bcumming Oct 27, 2016
5174b69
Merge remote-tracking branch 'upstream/master'
halfflat Oct 27, 2016
a39c9a3
Unit tests for math.hpp
halfflat Oct 27, 2016
63c507b
Extend range, view functionality.
halfflat Oct 27, 2016
830428a
Add `ball_and_squiggle` model; fix `ball_and_taper`.
halfflat Oct 27, 2016
cee495c
Address PR#46 review comments.
halfflat Oct 28, 2016
c97135d
Merge pull request #46 from halfflat/feature/more-range-utils
bcumming Oct 28, 2016
5ade8d0
Merge pull request #47 from halfflat/feature/new-test-model
bcumming Oct 28, 2016
1b929ff
Consolidate validation test code (issue #41)
halfflat Oct 27, 2016
f189d73
New compartment info structure for FVM.
halfflat Oct 27, 2016
550da10
Merge pull request #48 from halfflat/feature/consolidate-validation-t…
bcumming Oct 28, 2016
a03af27
Merge pull request #49 from halfflat/feature/divided-compartments
bcumming Oct 28, 2016
e7a8fb6
Complex compartments
halfflat Oct 11, 2016
5aeea90
Remove division policy type parameter.
halfflat Oct 31, 2016
e8d3285
Merge pull request #54 from halfflat/feature/complex-compartments
bcumming Oct 31, 2016
b39a93e
WIP
bcumming Nov 2, 2016
0ded25a
Omp (#38)
Ivanmartinezperez Nov 4, 2016
cd4d9ae
WIP - gpu version runs but doesn't validate
bcumming Nov 4, 2016
e74720b
WIP
bcumming Nov 9, 2016
b11990f
backend refactoring
bcumming Nov 10, 2016
bb14d18
WIP
bcumming Nov 11, 2016
1584bf3
merge master into gpu branch (new validation tests mostly)
bcumming Nov 11, 2016
46427e6
rename template parameter in indexed_view to Backend
bcumming Nov 11, 2016
7751058
clean up gpu branch changes
bcumming Nov 14, 2016
2b2b310
findunwind works when libunwind is unavailable
bcumming Nov 14, 2016
5825543
final clean up for PR #45
bcumming Nov 14, 2016
faf1a41
further simplifications to the unwind C++ code
bcumming Nov 14, 2016
6cf11d3
Tiny typo fix
Nov 14, 2016
Files changed
1 change: 1 addition & 0 deletions .gitignore
@@ -65,6 +65,7 @@ external/modparser-patch
external/modparser-update
external/tmp
mechanisms/*.hpp
mechanisms/gpu/*.hpp

# build path
build*
Empty file removed .gitmodules
13 changes: 3 additions & 10 deletions .ycm_extra_conf.py
@@ -50,17 +50,10 @@
'external',
'-I',
'miniapp',
# '-isystem',
# '/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.10.sdk/usr/include/c++/4.2.1',
# '-I',
# '/usr/include/c++/4.9.2',
# '-isystem',
# '/usr/lib/gcc/x86_64-unknown-linux-gnu/4.9.2/include'
# '-isystem',
# '/usr/local/include',
'-I',
'modcc',
]



# Set this to the absolute path to the folder (NOT the file!) containing the
# compile_commands.json file to use that instead of 'flags'. See here for
# more details: http://clang.llvm.org/docs/JSONCompilationDatabase.html
16 changes: 13 additions & 3 deletions CMakeLists.txt
@@ -39,6 +39,18 @@ if(WITH_TBB)
add_definitions(-DWITH_TBB)
endif()

# CUDA support
set(WITH_CUDA OFF CACHE BOOL "use CUDA for GPU offload" )
if(WITH_CUDA)
find_package(CUDA REQUIRED)
# the vector library has a compiled component when using the CUDA backend
include(ExternalProject)
Contributor: No longer need ExternalProject here I think.

Member Author: removed

# Turn off annoying and incorrect warnings generated in the JSON file.
# We also work around the same issue with the intel compiler.
set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS};-Xcudafe \"--diag_suppress=not_used_in_template_function_params\";-Xcudafe \"--diag_suppress=cast_to_qualified_type\")
set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS};-DWITH_CUDA)
endif()

# MPI support
set(WITH_MPI OFF CACHE BOOL "use MPI for distrubuted parallelism")
if(WITH_MPI)
@@ -115,9 +127,7 @@ else()
set(BUILD_NRN_VALIDATION_DATA TRUE)
endif()


include_directories(${CMAKE_SOURCE_DIR}/tclap/include)
include_directories(${CMAKE_SOURCE_DIR}/vector)
include_directories(${CMAKE_SOURCE_DIR}/tclap)
include_directories(${CMAKE_SOURCE_DIR}/include)
include_directories(${CMAKE_SOURCE_DIR}/src)
include_directories(${CMAKE_SOURCE_DIR}/miniapp)
36 changes: 36 additions & 0 deletions mechanisms/CMakeLists.txt
@@ -4,6 +4,7 @@ set(mechanisms pas hh expsyn exp2syn)
# set the flags for the modcc compiler that converts NMODL
# files to C++/CUDA source.
set(modcc_flags "-t cpu")

if(USE_OPTIMIZED_KERNELS) # generate optimized kernels
set(modcc_flags ${modcc_flags} -O)
endif()
@@ -33,3 +34,38 @@ endforeach()
# Fake target to always trigger .mod -> .hpp dependencies because wtf CMake
add_custom_target(build_all_mods DEPENDS ${all_mod_hpps} modcc)

# oh sweet jesus, CMake is a dog's breakfast.
# that said, let'g go through the same dance to generate CUDA kernels if
# we are targetting the GPU.
if(WITH_CUDA)
set(modcc_flags "-t gpu")

if(USE_OPTIMIZED_KERNELS)
set(modcc_flags ${modcc_flags} -O)
endif()

# generate source for each mechanism
foreach(mech ${mechanisms})
set(mod "${CMAKE_CURRENT_SOURCE_DIR}/mod/${mech}.mod")
set(hpp "${CMAKE_CURRENT_SOURCE_DIR}/gpu/${mech}.hpp")
if(use_external_modcc)
add_custom_command(
OUTPUT "${hpp}"
WORKING_DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}"
COMMAND ${modcc} ${modcc_flags} ${mod} -o ${hpp}
)
else()
add_custom_command(
OUTPUT "${hpp}"
DEPENDS modparser "${mod}"
WORKING_DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}"
COMMAND ${modcc} ${modcc_flags} ${mod} -o ${hpp}
)
endif()
set_source_files_properties("${hpp}" PROPERTIES GENERATED TRUE)
list(APPEND all_gpu_mod_hpps "${hpp}")
endforeach()

# Fake target to always trigger .mod -> .hpp dependencies because wtf CMake
add_custom_target(build_all_gpu_mods DEPENDS ${all_gpu_mod_hpps} modcc)
endif()
10 changes: 0 additions & 10 deletions mechanisms/generate.sh

This file was deleted.

21 changes: 17 additions & 4 deletions miniapp/CMakeLists.txt
@@ -1,15 +1,28 @@
set(HEADERS
)
set(MINIAPP_SOURCES
io.cpp
miniapp.cpp
io.cpp
miniapp_recipes.cpp
)
set(MINIAPP_SOURCES_CUDA
miniapp.cu
io.cpp
miniapp_recipes.cpp
)

add_executable(miniapp.exe ${MINIAPP_SOURCES} ${HEADERS})
if(WITH_CUDA)
cuda_add_executable(miniapp.exe ${MINIAPP_SOURCES_CUDA} ${HEADERS})
target_link_libraries(miniapp.exe gpu)
else()
add_executable(miniapp.exe ${MINIAPP_SOURCES} ${HEADERS})
endif()

target_link_libraries(miniapp.exe nestmc)

target_link_libraries(miniapp.exe LINK_PUBLIC cellalgo)
target_link_libraries(miniapp.exe LINK_PUBLIC ${TBB_LIBRARIES})
if(WITH_TBB)
target_link_libraries(miniapp.exe ${TBB_LIBRARIES})
Contributor: Need a LINK_PUBLIC in here and on line 16 in order to avoid CMP0023 nonsense.

endif()

if(WITH_MPI)
target_link_libraries(miniapp.exe LINK_PUBLIC ${MPI_C_LIBRARIES})
9 changes: 4 additions & 5 deletions miniapp/miniapp.cpp
@@ -7,13 +7,13 @@

#include <json/json.hpp>

#include <backends/fvm.hpp>
#include <common_types.hpp>
#include <cell.hpp>
#include <communication/communicator.hpp>
#include <communication/global_policy.hpp>
#include <cell.hpp>
#include <fvm_multicell.hpp>
#include <io/exporter_spike_file.hpp>
#include <mechanism_catalogue.hpp>
#include <model.hpp>
#include <profiling/profiler.hpp>
#include <threading/threading.hpp>
@@ -28,8 +28,7 @@
using namespace nest::mc;

using global_policy = communication::global_policy;
using lowered_cell = fvm::fvm_multicell<double, cell_local_size_type>;
//using lowered_cell = fvm::fvm_cell<double, cell_local_size_type>;
using lowered_cell = fvm::fvm_multicell<multicore::fvm_policy>;
using model_type = model<lowered_cell>;
using time_type = model_type::time_type;
using sample_trace_type = sample_trace<time_type, model_type::value_type>;
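The new lowered_cell alias above selects the FVM implementation through a backend policy type instead of explicit value and index type parameters. As a rough sketch, not taken from this PR, a build could pick the policy from the WITH_CUDA macro that the CMake changes define; multicore::fvm_policy appears in this diff, while the name gpu::fvm_policy is only an assumed placeholder:

// Sketch only: choose the lowered-cell backend at compile time.
// Assumes the nestmc headers; gpu::fvm_policy is a hypothetical name.
#ifdef WITH_CUDA
using fvm_policy = gpu::fvm_policy;        // hypothetical GPU backend policy
#else
using fvm_policy = multicore::fvm_policy;  // CPU backend, as used in miniapp.cpp
#endif

using lowered_cell = fvm::fvm_multicell<fvm_policy>;
using model_type   = model<lowered_cell>;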
@@ -128,7 +127,7 @@ int main(int argc, char** argv) {

// reset the model
m.reset();
// rest the source spikes
// reset the source spikes
for (auto source : local_sources) {
m.add_artificial_spike({source, 0});
}
1 change: 1 addition & 0 deletions miniapp/miniapp.cu
@@ -0,0 +1 @@
#include "miniapp.cpp"
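The single-line .cu file is what lets cuda_add_executable above push the unmodified miniapp sources through nvcc: the wrapper includes the whole translation unit, so the CUDA and CPU builds compile the same code without duplication. The same pattern in a generic, minimal form (file names here are illustrative, not from the PR):

// app.cu: CUDA-build wrapper for an ordinary C++ source file app.cpp.
// nvcc compiles this file, and with it the entire included translation unit,
// while the CPU build compiles app.cpp directly with the host compiler.
#include "app.cpp"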
102 changes: 34 additions & 68 deletions modcc/cprinter.cpp
@@ -34,7 +34,6 @@ CPrinter::CPrinter(Module &m, bool o)
text_.add_line("#include <limits>");
text_.add_line();
text_.add_line("#include <mechanism.hpp>");
text_.add_line("#include <mechanism_interface.hpp>");
text_.add_line("#include <algorithms.hpp>");
text_.add_line();

@@ -44,18 +43,19 @@

text_.add_line("namespace nest{ namespace mc{ namespace mechanisms{ namespace " + m.name() + "{");
text_.add_line();
text_.add_line("template<typename T, typename I>");
text_.add_line("class " + class_name + " : public mechanism<T, I> {");
text_.add_line("template<class MemoryPolicy>");
text_.add_line("class " + class_name + " : public mechanism<MemoryPolicy> {");
text_.add_line("public:");
text_.increase_indentation();
text_.add_line("using base = mechanism<T, I>;");
text_.add_line("using base = mechanism<MemoryPolicy>;");
text_.add_line("using value_type = typename base::value_type;");
text_.add_line("using size_type = typename base::size_type;");
text_.add_line("using vector_type = typename base::vector_type;");
text_.add_line("using view_type = typename base::view_type;");
text_.add_line("using index_type = typename base::index_type;");
text_.add_line("using index_view = typename base::index_view;");
text_.add_line("using const_index_view = typename base::const_index_view;");
text_.add_line();
text_.add_line("using array = typename base::array;");
text_.add_line("using iarray = typename base::iarray;");
text_.add_line("using view = typename base::view;");
text_.add_line("using iview = typename base::iview;");
text_.add_line("using const_iview = typename base::const_iview;");
text_.add_line("using indexed_view_type= typename base::indexed_view_type;");
text_.add_line("using ion_type = typename base::ion_type;");
text_.add_line();
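Because CPrinter assembles the class one add_line() at a time, the shape of its output is easy to lose in the diff. Reconstructed from the strings above, the emitted mechanism class now begins roughly as follows; the class name, ion fields, and member names are illustrative and bodies are elided:

// Approximate shape of the generated code (sketch, not the verbatim output).
namespace nest { namespace mc { namespace mechanisms { namespace hh {

template <class MemoryPolicy>
class mechanism_hh : public mechanism<MemoryPolicy> {
public:
    using base        = mechanism<MemoryPolicy>;
    using value_type  = typename base::value_type;
    using size_type   = typename base::size_type;

    // array/iarray/view/iview replace the old vector_type/view_type/index_type aliases
    using array       = typename base::array;
    using iarray      = typename base::iarray;
    using view        = typename base::view;
    using iview       = typename base::iview;
    using const_iview = typename base::const_iview;

    // per-ion bundle: views onto the ion's data plus a host index array
    struct Ion_na {
        view ina, ena;
        iarray index;
        std::size_t size() const { return index.size(); }
    };

    // constructor now takes views and a const index view
    mechanism_hh(view vec_v, view vec_i, const_iview node_index);

private:
    array data_;  // all fields packed into one allocation
};

}}}} // namespace nest::mc::mechanisms::hh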
@@ -67,12 +67,12 @@
text_.add_line("struct " + tname + " {");
text_.increase_indentation();
for(auto& field : ion.read) {
text_.add_line("view_type " + field.spelling + ";");
text_.add_line("view " + field.spelling + ";");
}
for(auto& field : ion.write) {
text_.add_line("view_type " + field.spelling + ";");
text_.add_line("view " + field.spelling + ";");
}
text_.add_line("index_type index;");
text_.add_line("iarray index;");
text_.add_line("std::size_t memory() const { return sizeof(size_type)*index.size(); }");
text_.add_line("std::size_t size() const { return index.size(); }");
text_.decrease_indentation();
@@ -85,7 +85,7 @@
// constructor
//////////////////////////////////////////////
int num_vars = array_variables.size();
text_.add_line(class_name + "(view_type vec_v, view_type vec_i, const_index_view node_index)");
text_.add_line(class_name + "(view vec_v, view vec_i, const_iview node_index)");
text_.add_line(": base(vec_v, vec_i, node_index)");
text_.add_line("{");
text_.increase_indentation();
@@ -102,8 +102,8 @@

text_.add_line();
text_.add_line("// allocate memory");
text_.add_line("data_ = vector_type(field_size * num_fields);");
text_.add_line("data_(memory::all) = std::numeric_limits<value_type>::quiet_NaN();");
text_.add_line("data_ = array(field_size * num_fields);");
text_.add_line("memory::fill(data_, std::numeric_limits<value_type>::quiet_NaN());");
Contributor: Is it worth considering an STL-like constructor for array which can take a fill value? c.f. std::vector<T>(size_type count, const T& value = T())

Member Author: The library already has this functionality. I don't know why I have done things this way, probably just sloppiness.

Member Author: Yep, it was sloppiness. I have fixed it by using just one line:

text_.add_line("data_ = array(field_size*num_fields, std::numeric_limits<value_type>::quiet_NaN());");


// assign the sub-arrays
// replace this : data_(1*n, 2*n);
@@ -261,7 +261,7 @@ CPrinter::CPrinter(Module &m, bool o)
auto ion = find_ion(ionKind::Na);
text_.add_line("if(k==ionKind::na) {");
text_.increase_indentation();
text_.add_line("ion_na.index = index_into(i.node_index(), node_index_);");
text_.add_line("// TODO a more elegant way of initializing host storage from ranges");
text_.add_line("auto index = index_into(i.node_index(), node_index_);");
text_.add_line("ion_na.index = iarray(size());");
text_.add_line("auto last = std::copy(index.begin(), index.end(), ion_na.index.begin());");
text_.add_line("EXPECTS((unsigned)std::distance(ion_na.index.begin(), last) == size());");
if(has_variable(*ion, "ina")) text_.add_line("ion_na.ina = i.current();");
if(has_variable(*ion, "ena")) text_.add_line("ion_na.ena = i.reversal_potential();");
if(has_variable(*ion, "nai")) text_.add_line("ion_na.nai = i.internal_concentration();");
@@ -274,7 +278,11 @@
auto ion = find_ion(ionKind::Ca);
text_.add_line("if(k==ionKind::ca) {");
text_.increase_indentation();
text_.add_line("ion_ca.index = index_into(i.node_index(), node_index_);");
text_.add_line("// TODO a more elegant way of initializing host storage from ranges");
text_.add_line("auto index = index_into(i.node_index(), node_index_);");
text_.add_line("ion_ca.index = iarray(size());");
text_.add_line("auto last = std::copy(index.begin(), index.end(), ion_ca.index.begin());");
text_.add_line("EXPECTS((unsigned)std::distance(ion_ca.index.begin(), last) == size());");
if(has_variable(*ion, "ica")) text_.add_line("ion_ca.ica = i.current();");
if(has_variable(*ion, "eca")) text_.add_line("ion_ca.eca = i.reversal_potential();");
if(has_variable(*ion, "cai")) text_.add_line("ion_ca.cai = i.internal_concentration();");
@@ -287,7 +295,11 @@
auto ion = find_ion(ionKind::K);
text_.add_line("if(k==ionKind::k) {");
text_.increase_indentation();
text_.add_line("ion_k.index = index_into(i.node_index(), node_index_);");
text_.add_line("// TODO a more elegant way of initializing host storage from ranges");
text_.add_line("auto index = index_into(i.node_index(), node_index_);");
text_.add_line("ion_k.index = iarray(size());");
text_.add_line("auto last = std::copy(index.begin(), index.end(), ion_k.index.begin());");
text_.add_line("EXPECTS((unsigned)std::distance(ion_k.index.begin(), last) == size());");
if(has_variable(*ion, "ik")) text_.add_line("ion_k.ik = i.current();");
if(has_variable(*ion, "ek")) text_.add_line("ion_k.ek = i.reversal_potential();");
if(has_variable(*ion, "ki")) text_.add_line("ion_k.ki = i.internal_concentration();");
@@ -324,15 +336,15 @@
//////////////////////////////////////////////
//////////////////////////////////////////////

text_.add_line("vector_type data_;");
text_.add_line("array data_;");
for(auto var: array_variables) {
if(optimize_) {
text_.add_line(
"__declspec(align(vector_type::alignment())) value_type *"
"__declspec(align(array::alignment())) value_type *"
+ var->name() + ";");
}
else {
text_.add_line("view_type " + var->name() + ";");
text_.add_line("view " + var->name() + ";");
}
}

@@ -356,52 +368,6 @@
text_.add_line("using base::node_index_;");

text_.add_line();
//text_.add_line("DATA_PROFILE");
text_.decrease_indentation();
text_.add_line("};");
text_.add_line();

// print the helper type that provides the bridge from the mechanism to
// the calling code
text_.add_line("template<typename T, typename I>");
text_.add_line("struct helper : public mechanism_helper<T, I> {");
text_.increase_indentation();
text_.add_line("using base = mechanism_helper<T, I>;");
text_.add_line("using index_view = typename base::index_view;");
text_.add_line("using view_type = typename base::view_type;");
text_.add_line("using mechanism_ptr_type = typename base::mechanism_ptr_type;");
text_.add_gutter() << "using mechanism_type = " << class_name << "<T, I>;";
text_.add_line();
text_.add_line();

text_.add_line("std::string");
text_.add_line("name() const override");
text_.add_line("{");
text_.increase_indentation();
text_.add_gutter() << "return \"" << m.name() << "\";";
text_.add_line();
text_.decrease_indentation();
text_.add_line("}");
text_.add_line();

text_.add_line("mechanism_ptr<T,I>");
text_.add_line("new_mechanism(view_type vec_v, view_type vec_i, index_view node_index) const override");
text_.add_line("{");
text_.increase_indentation();
text_.add_line("return nest::mc::mechanisms::make_mechanism<mechanism_type>(vec_v, vec_i, node_index);");
text_.decrease_indentation();
text_.add_line("}");
text_.add_line();

text_.add_line("void");
text_.add_line("set_parameters(mechanism_ptr_type&, parameter_list const&) const override");
text_.add_line("{");
text_.increase_indentation();
// TODO : interface that writes parameter_list paramaters into the mechanism's storage
text_.decrease_indentation();
text_.add_line("}");
text_.add_line();

text_.decrease_indentation();
text_.add_line("};");
text_.add_line();
@@ -713,7 +679,7 @@ void CPrinter::print_APIMethod_optimized(APIMethod* e) {
text_.add_line("int NB = n_/BSIZE;");
for(auto out: aliased_variables) {
text_.add_line(
"__declspec(align(vector_type::alignment())) value_type "
"__declspec(align(array::alignment())) value_type "
+ out->name() + "[BSIZE];");
}
//text_.add_line("START_PROFILE");