Part 1 of the big GPU merge #45

Merged: 61 commits, Nov 14, 2016
Changes from 2 commits
Commits
b38f78b
nvcc can build miniapp
bcumming Sep 8, 2016
7e54957
WIP towards basic GPU implementation
bcumming Sep 9, 2016
b7bd5dd
gpu WIP
bcumming Sep 29, 2016
3edad46
Merge branch 'master' into gpu
bcumming Sep 29, 2016
1388d20
WIP: progress towards crash free GPU existance
bcumming Sep 29, 2016
8e6d97f
finish moving vector to internal memory module
bcumming Sep 30, 2016
4ee9930
add the new tests that I forgot in the last commit
bcumming Sep 30, 2016
4c0864b
updated fvm multicell code for the new memory library
bcumming Sep 30, 2016
d1a5b6e
WIP gpu version
bcumming Oct 4, 2016
36ebcc7
WIP merge
bcumming Oct 6, 2016
851f38c
Merge branch 'master' into gpu
bcumming Oct 6, 2016
4a1777f
gpu WIP
bcumming Oct 9, 2016
2c6f5a3
WIP - towards gpu support
bcumming Oct 10, 2016
c742fd5
WIP gpu
bcumming Oct 13, 2016
039d5a9
Merge pull request #1 from eth-cscs/master
bcumming Oct 13, 2016
121dade
WIP
bcumming Oct 15, 2016
b6aec69
Support for units syntax within state block.
Oct 22, 2016
44c8ac0
WIP
bcumming Oct 24, 2016
78f60f5
WIP
bcumming Oct 25, 2016
5c09594
Merge branch 'master' of github.com:eth-cscs/nestmc-proto
bcumming Oct 25, 2016
3830c9a
WIP
bcumming Oct 25, 2016
2db99a2
WIP
bcumming Oct 25, 2016
65ed2fe
WIP
bcumming Oct 25, 2016
7f1216d
WIP
bcumming Oct 25, 2016
de992ad
merge with master
bcumming Oct 26, 2016
b5bae5b
WIP
bcumming Oct 26, 2016
06ef117
Merge branch 'master' of github.com:eth-cscs/nestmc-proto into gpu
bcumming Oct 26, 2016
18a9226
Merge branch 'master' into bugfix/modcc/state-block-units
Oct 26, 2016
05e4770
Adds unit tests for the STATE block.
Oct 26, 2016
b9b1e57
WIP
bcumming Oct 27, 2016
dbca304
Address deprecated use of 'symbol' warning.
halfflat Oct 27, 2016
ff14704
final merge with master
bcumming Oct 27, 2016
9e0b874
Addresses PR comments.
Oct 27, 2016
f1eac25
Merge pull request #35 from vkarak/bugfix/modcc/state-block-units
bcumming Oct 27, 2016
5174b69
Merge remote-tracking branch 'upstream/master'
halfflat Oct 27, 2016
a39c9a3
Unit tests for math.hpp
halfflat Oct 27, 2016
63c507b
Extend range, view functionality.
halfflat Oct 27, 2016
830428a
Add `ball_and_squiggle` model; fix `ball_and_taper`.
halfflat Oct 27, 2016
cee495c
Address PR#46 review comments.
halfflat Oct 28, 2016
c97135d
Merge pull request #46 from halfflat/feature/more-range-utils
bcumming Oct 28, 2016
5ade8d0
Merge pull request #47 from halfflat/feature/new-test-model
bcumming Oct 28, 2016
1b929ff
Consolidate validation test code (issue #41)
halfflat Oct 27, 2016
f189d73
New compartment info structure for FVM.
halfflat Oct 27, 2016
550da10
Merge pull request #48 from halfflat/feature/consolidate-validation-t…
bcumming Oct 28, 2016
a03af27
Merge pull request #49 from halfflat/feature/divided-compartments
bcumming Oct 28, 2016
e7a8fb6
Complex compartments
halfflat Oct 11, 2016
5aeea90
Remove division policy type parameter.
halfflat Oct 31, 2016
e8d3285
Merge pull request #54 from halfflat/feature/complex-compartments
bcumming Oct 31, 2016
b39a93e
WIP
bcumming Nov 2, 2016
0ded25a
Omp (#38)
Ivanmartinezperez Nov 4, 2016
cd4d9ae
WIP - gpu version runs but doesn't validate
bcumming Nov 4, 2016
e74720b
WIP
bcumming Nov 9, 2016
b11990f
backend refactoring
bcumming Nov 10, 2016
bb14d18
WIP
bcumming Nov 11, 2016
1584bf3
merge master into gpu branch (new validation tests mostly)
bcumming Nov 11, 2016
46427e6
rename template parameter in indexed_view to Backend
bcumming Nov 11, 2016
7751058
clean up gpu branch changes
bcumming Nov 14, 2016
2b2b310
findunwind works when libunwind is unavailable
bcumming Nov 14, 2016
5825543
final clean up for PR #45
bcumming Nov 14, 2016
faf1a41
further simplifications to the unwind C++ code
bcumming Nov 14, 2016
6cf11d3
Tiny typo fix
Nov 14, 2016
17 changes: 10 additions & 7 deletions CMakeLists.txt
@@ -31,13 +31,17 @@ if(WITH_TRACE)
add_definitions("-DWITH_TRACE")
endif()

# list of libraries to be linked against targets
set(EXTERNAL_LIBRARIES "")

#threading model selection
set(THREADING_MODEL "serial" CACHE STRING "set the threading model, one of serial/tbb/omp")
if(THREADING_MODEL MATCHES "tbb")
# TBB support
find_package(TBB REQUIRED)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${TBB_DEFINITIONS}")
add_definitions(-DWITH_TBB)
set(EXTERNAL_LIBRARIES ${EXTERNAL_LIBRARIES} ${TBB_LIBRARIES})

elseif(THREADING_MODEL MATCHES "omp")
# OpenMP support
@@ -47,17 +51,18 @@ elseif(THREADING_MODEL MATCHES "omp")

elseif(THREADING_MODEL MATCHES "serial")
#setup previously done

else()
message( FATAL_ERROR "-- Threading model '${THREADING_MODEL}' not supported, use one of serial/tbb/omp")

endif()

# libunwind for pretty printing stack traces
set(WITH_UNWIND OFF CACHE BOOL "use libunwind for debug messages" )
if(WITH_UNWIND)
find_package(Unwind REQUIRED)
find_package(Unwind)
if(UNWIND_FOUND)
add_definitions(-DWITH_UNWIND)
include_directories(${UNWIND_INCLUDE_DIR})
set(EXTERNAL_LIBRARIES ${EXTERNAL_LIBRARIES} ${UNWIND_LIBRARIES})
endif()

# CUDA support
@@ -80,6 +85,7 @@ if(WITH_CUDA)

add_definitions(-DWITH_GPU)
include_directories(SYSTEM ${CUDA_INCLUDE_DIRS})
set(EXTERNAL_LIBRARIES ${EXTERNAL_LIBRARIES} ${CUDA_LIBRARIES})
Contributor:
I like this approach. Might be cleaner to use

list(APPEND EXTERNAL_LIBRARIES ${CUDA_LIBRARIES})

(and likewise above) instead of set.

endif()

# MPI support
@@ -93,15 +99,12 @@ if(WITH_MPI)
set_property(DIRECTORY APPEND_STRING PROPERTY COMPILE_OPTIONS "${MPI_C_COMPILE_FLAGS}")
endif()


# Internal profiler support
set(WITH_PROFILING OFF CACHE BOOL "use built-in profiling of miniapp" )
if(WITH_PROFILING)
add_definitions(-DWITH_PROFILING)
endif()



# Cray systems
set(SYSTEM_CRAY OFF CACHE BOOL "add flags for compilation on Cray systems")
if(SYSTEM_CRAY)
76 changes: 38 additions & 38 deletions cmake/FindUnwind.cmake
@@ -8,41 +8,41 @@
# respectively can be used to help CMake finding the library if it
# is not installed in any of the usual locations.

if (NOT UNWIND_FOUND)
set(UNWIND_SEARCH_DIR ${UNWIND_ROOT_DIR} $ENV{UNWIND_ROOT})

find_path(UNWIND_INCLUDE_DIR libunwind.h
HINTS ${UNWIND_SEARCH_DIR}
PATH_SUFFIXES include
)

# libunwind requires that we link against both libunwind.so/a and
# a target-specific library libunwind-target.so/a.
# This code sets the "target" string above in libunwind_arch.
if (CMAKE_SYSTEM_PROCESSOR MATCHES "^arm")
set(libunwind_arch "arm")
elseif (CMAKE_SYSTEM_PROCESSOR STREQUAL "x86_64" OR CMAKE_SYSTEM_PROCESSOR STREQUAL "amd64")
set(libunwind_arch "x86_64")
elseif (CMAKE_SYSTEM_PROCESSOR MATCHES "^i.86$")
set(libunwind_arch "x86")
endif()

find_library(unwind_library_generic unwind
HINTS ${UNWIND_SEARCH_DIR}
PATH_SUFFIXES lib64 lib
)

find_library(unwind_library_target unwind-${libunwind_arch}
HINTS ${UNWIND_SEARCH_DIR}
PATH_SUFFIXES lib64 lib
)

set(UNWIND_LIBRARIES ${unwind_library_generic} ${unwind_library_target})

mark_as_advanced(UNWIND_LIBRARIES UNWIND_INCLUDE_DIR)

unset(unwind_search_dir)
unset(unwind_library_generic)
unset(unwind_library_target)
unset(libunwind_arch)
endif ()
set(UNWIND_FOUND ON)

set(UNWIND_SEARCH_DIR ${UNWIND_ROOT_DIR} $ENV{UNWIND_ROOT})

find_path(UNWIND_INCLUDE_DIR libunwind.h
HINTS ${UNWIND_SEARCH_DIR}
PATH_SUFFIXES include
)

# libunwind requires that we link against both libunwind.so/a and
# a target-specific library libunwind-target.so/a.
# This code sets the "target" string above in libunwind_arch.
if (CMAKE_SYSTEM_PROCESSOR MATCHES "^arm")
set(libunwind_arch "arm")
elseif (CMAKE_SYSTEM_PROCESSOR STREQUAL "x86_64" OR CMAKE_SYSTEM_PROCESSOR STREQUAL "amd64")
set(libunwind_arch "x86_64")
elseif (CMAKE_SYSTEM_PROCESSOR MATCHES "^i.86$")
set(libunwind_arch "x86")
endif()

find_library(unwind_library_generic unwind
HINTS ${UNWIND_SEARCH_DIR}
PATH_SUFFIXES lib64 lib
)

find_library(unwind_library_target unwind-${libunwind_arch}
HINTS ${UNWIND_SEARCH_DIR}
PATH_SUFFIXES lib64 lib
)

set(UNWIND_LIBRARIES ${unwind_library_generic} ${unwind_library_target})

mark_as_advanced(UNWIND_LIBRARIES UNWIND_INCLUDE_DIR)

unset(unwind_search_dir)
unset(unwind_library_generic)
unset(unwind_library_target)
unset(libunwind_arch)
11 changes: 3 additions & 8 deletions miniapp/CMakeLists.txt
@@ -18,21 +18,16 @@ else()
add_executable(miniapp.exe ${MINIAPP_SOURCES} ${HEADERS})
endif()

target_link_libraries(miniapp.exe LINK_PUBLIC nestmc)
set(aaa nestmc)
Contributor:
Is this debugging CMake code that snuck in?


if(WITH_TBB)
target_link_libraries(miniapp.exe LINK_PUBLIC ${TBB_LIBRARIES})
endif()
target_link_libraries(miniapp.exe LINK_PUBLIC nestmc)
target_link_libraries(miniapp.exe LINK_PUBLIC ${EXTERNAL_LIBRARIES})

if(WITH_MPI)
target_link_libraries(miniapp.exe LINK_PUBLIC ${MPI_C_LIBRARIES})
set_property(TARGET miniapp.exe APPEND_STRING PROPERTY LINK_FLAGS "${MPI_C_LINK_FLAGS}")
endif()

if(WITH_UNWIND)
target_link_libraries(miniapp.exe LINK_PUBLIC ${UNWIND_LIBRARIES})
endif()

set_target_properties(miniapp.exe
PROPERTIES
RUNTIME_OUTPUT_DIRECTORY "${CMAKE_BINARY_DIR}/miniapp"
66 changes: 53 additions & 13 deletions scripts/print_backtrace
@@ -5,12 +5,43 @@ import argparse
import os
import subprocess

class color:
purple = '\033[95m'
white = '\033[37m'
cyan = '\033[96m'
darkcyan = '\033[36m'
blue = '\033[94m'
green = '\033[92m'
yellow = '\033[93m'
red = '\033[91m'
bold = '\033[1m'
underline = '\033[4m'
end = '\033[0m'

class nocolor:
purple = ''
white = ''
cyan = ''
darkcyan = ''
blue = ''
green = ''
yellow = ''
red = ''
bold = ''
underline = ''
end = ''


def parse_clargs():
P = argparse.ArgumentParser(description='pretty print stack traces')
P.add_argument('input', metavar='FILE',
help='name of file with stack trace')
P.add_argument('-b', '--brief', action='store_false',
P.add_argument('-b', '--brief', action='store_true',
help='print only the file locations')
P.add_argument('-e', '--executable', metavar='FILE',
help='name of the executable or object file to look up symbols')
P.add_argument('-c', '--color', action='store_true',
help='use color output in terminal')

return P.parse_args()

@@ -23,27 +54,36 @@ def parse_backtrace(source):
tokens = line.split()
trace.append({'location':tokens[0], 'function':tokens[1]})
else:
print "error: unable to open file ", source
print "error: unable to back trace file ", source
Contributor:
missing verb in error text?


return trace

def get_function_name(location):
result = os.popen('addr2line ' + location + ' -e miniapp.exe').read()
def get_function_name(location, executable):
result = os.popen('addr2line ' + location + ' -e ' + executable).read()
descriptor = result.split()[0].split(':')
return {'filename': descriptor[0], 'line': descriptor[1]}

def unmangle(mangled):
unmangled = os.popen('c++filt ' + mangled).read().strip()
# remove the nest::mc:: namespace from all types
return unmangled.replace('nest::mc::', '')

#
# main
#
args = parse_clargs()
trace = parse_backtrace(args.input)

for frame in trace:
location = get_function_name(frame['location'])
name = unmangle(frame['function'])
if args.brief:
print location['filename'] + ':' + location['line'], name
else:
print location['filename'] + ':' + location['line']
# check that a valid executable was provided
executable = args.executable
if not os.path.isfile(executable):
print "error:", executable, "is not a valid executable"
else:
for frame in parse_backtrace(args.input):
location = get_function_name(frame['location'], executable)
name = unmangle(frame['function'])
c = color if args.color else nocolor
fname = c.yellow + location['filename'] + c.end
line = c.cyan + location['line'] + c.end
if args.brief:
print fname + ':' + line
else:
print fname + ':' + line, '\n ', name
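The lookup helper in this script builds a shell command by string concatenation and hands it to `os.popen`. As a minimal sketch of the same `addr2line` lookup in Python 3, assuming GNU binutils is on the PATH, the version below passes the arguments as a list and keeps the output parsing in a separate, testable function. The names `parse_addr2line` and `lookup_location` are hypothetical and are not part of this PR.

```python
import subprocess


def parse_addr2line(output):
    """Split the first 'file:line' token of addr2line output into its parts."""
    token = output.split()[0]
    # rpartition splits on the last colon, so colons inside the path survive
    filename, _, line = token.rpartition(':')
    return {'filename': filename, 'line': line}


def lookup_location(address, executable):
    """Run addr2line for one address and return its file name and line."""
    # Passing a list avoids shell interpretation of the executable path.
    result = subprocess.run(
        ['addr2line', address, '-e', executable],
        capture_output=True, text=True, check=True)
    return parse_addr2line(result.stdout)
```

A call such as `lookup_location(frame['location'], args.executable)` would stand in for `get_function_name` in the main loop; whether that change fits the script's Python 2 style is a separate question.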
1 change: 1 addition & 0 deletions src/CMakeLists.txt
@@ -25,6 +25,7 @@ add_library(nestmc ${BASE_SOURCES} ${HEADERS})
add_dependencies(nestmc build_all_mods)
if(WITH_CUDA)
cuda_add_library(gpu ${CUDA_SOURCES})
set(NESTMC_LIBRARIES ${NESTMC_LIBRARIES} gpu)
add_dependencies(gpu build_all_gpu_mods)
endif()

45 changes: 27 additions & 18 deletions src/fvm_multicell.hpp
@@ -15,6 +15,7 @@
#include <ion.hpp>
#include <math.hpp>
#include <matrix.hpp>
#include <memory/memory.hpp>
#include <profiling/profiler.hpp>
#include <segment.hpp>
#include <stimulus.hpp>
@@ -24,8 +25,6 @@
#include <util/rangeutil.hpp>
#include <util/span.hpp>

#include <memory/memory.hpp>

namespace nest {
namespace mc {
namespace fvm {
@@ -63,7 +62,6 @@ class fvm_multicell {

using matrix_assembler = typename backend::matrix_assembler;

/// API for cell_group (see above):
using detector_handle = size_type;
using target_handle = std::pair<size_type, size_type>;
using probe_handle = std::pair<const array fvm_multicell::*, size_type>;
@@ -97,6 +95,7 @@ class fvm_multicell {
return (this->*h.first)[h.second];
}

/// integrate all cell state forward in time
void advance(double dt);

/// Following types and methods are public only for testing:
@@ -401,6 +400,15 @@ void fvm_multicell<Backend>::initialize(
std::vector<value_type> tmp_cv_areas(ncomp);
std::vector<value_type> tmp_cv_capacitance(ncomp);

// Iterate over the input cells and build the indexes etc that describe the
// fused cell group. On completion:
// - group_paranet_index contains the full parent index for the fused cells.
Contributor:
spellling eror

Member Author:
note to self paranet is not a word

// - mech_map and syn_mech_map provide a map from mechanism names to an
// iterable container of compartment ranges, which are used later to
// generate the node index for each mechanism kind.
// - the tmp_* vectors contain compartment-specific information for each
// compartment in the fused cell group (areas, capacitance, etc).
// - each probe, stimulus and detector is attached to its compartment.
for (auto i: make_span(0, ncell)) {
const auto& c = cells[i];
auto comp_ival = cell_comp_part[i];
@@ -488,6 +496,11 @@ void fvm_multicell<Backend>::initialize(
}
}

// confirm user-supplied containers for detectors and probes were
// appropriately sized.
EXPECTS(detectors_size==detectors_count);
EXPECTS(probes_size==probes_count);

// normalize capacitance across cell
for (auto i: util::make_span(0, ncomp)) {
tmp_cv_capacitance[i] /= tmp_cv_areas[i];
@@ -505,9 +518,9 @@ void fvm_multicell<Backend>::initialize(
matrix_.d(), matrix_.u(), matrix_.rhs(), matrix_.p(),
cv_areas_, face_alpha_, voltage_, current_, cv_capacitance_);

// create density mechanisms
// For each density mechanism build the full node index, i.e. the list of
// compartments with that mechanism, then build the mechanism instance.
std::vector<size_type> mech_comp_indices(ncomp);

std::map<std::string, std::vector<size_type>> mech_index_map;
for (auto& mech: mech_map) {
mech_comp_indices.clear();
@@ -523,7 +536,7 @@ void fvm_multicell<Backend>::initialize(
mech_index_map[mech.first] = mech_comp_indices;
}

// create point (synapse) mechanisms
// Create point (synapse) mechanisms
for (const auto& syni: syn_mech_indices) {
const auto& mech_name = syni.first;
size_type mech_index = mechanisms_.size();
@@ -553,17 +566,16 @@ void fvm_multicell<Backend>::initialize(
target_hi = std::copy_n(std::begin(handles), n_indices, target_hi);
targets_count += n_indices;

//auto mech = mechanism_catalogue::make(
auto mech = backend::make_mechanism(
mech_name, voltage_, current_, comp_indices);
auto mech = backend::make_mechanism(mech_name, voltage_, current_, comp_indices);
mech->set_areas(cv_areas_);
mechanisms_.push_back(std::move(mech));

// save the compartment indexes for this synapse type
mech_index_map[mech_name] = comp_indices;
}

// confirm write-parameters were appropriately sized
EXPECTS(detectors_size==detectors_count);
// confirm user-supplied containers for targets are appropriately sized
EXPECTS(targets_size==targets_count);
EXPECTS(probes_size==probes_count);

// build the ion species
for (auto ion : mechanisms::ion_kinds()) {
@@ -641,19 +653,16 @@ void fvm_multicell<Backend>::advance(double dt) {
PL();
}

// TODO KERNEL: the stimulus might have to become a "proper" mechanism
// so that the update kernel is fully implemented on GPU.

// add current contributions from stimuli
for (auto& stim : stimuli_) {
auto ie = stim.second.amplitude(t_); // [nA]
auto loc = stim.first;

// TODO KERNEL
// is a kernel actually needed?
// for now I only make the update if the injected current is nonzero to
// avoid a redundant host->device copy on the gpu
//
// note: current_ in [mA/cm^2], ie in [nA], cv_areas_ in [µm^2].
// unit scale factor: [nA/µm^2]/[mA/cm^2] = 100
//current_[loc] -= 100*ie/cv_areas_[loc];
if (ie!=0.) {
current_[loc] = current_[loc] - 100*ie/cv_areas_[loc];
Contributor:
Is there a reason not to use -=?

Member Author:
Yes: the types returned by operator[] for DeviceVector are proxies, which are converted to value_type on assignment. Operators like -= have not been defined for these proxies.

This isn't a problem any more, because the gpu and cpu code for this loop has now been refactored into target specific loops/kernels.
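A loose Python analogy of the proxy behaviour described in this reply, assuming nothing about the project's actual DeviceVector beyond what the comment states. `ElementProxy` and `DeviceVectorModel` are hypothetical names. In the C++ code the conversion to value_type is implicit; here it is spelled out with `float()`.

```python
class ElementProxy:
    """Stands in for one element of an array that lives in device memory."""

    def __init__(self, storage, index):
        self._storage = storage
        self._index = index

    def __float__(self):
        # Reading converts the proxy to a plain value
        # (this models a device-to-host copy).
        return float(self._storage[self._index])


class DeviceVectorModel:
    """Indexing returns a proxy; assignment writes a plain value back."""

    def __init__(self, values):
        self._storage = list(values)

    def __getitem__(self, index):
        return ElementProxy(self._storage, index)

    def __setitem__(self, index, value):
        self._storage[index] = float(value)


current = DeviceVectorModel([1.0, 2.0])

# Compound assignment fails because the proxy defines no subtraction operators.
try:
    current[0] -= 0.5
    compound_ok = True
except TypeError:
    compound_ok = False

# The explicit form works: convert to a value, subtract, assign back.
current[0] = float(current[0]) - 0.5
```

The explicit form mirrors `current_[loc] = current_[loc] - 100*ie/cv_areas_[loc]` in the diff: the read and the write each go through an operation the proxy supports.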

}
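The unit scale factor of 100 quoted in the comment on the stimulus update can be checked by converting both units to SI. This is a small sketch using exact fractions; the constant and function names are illustrative and do not come from the code base.

```python
from fractions import Fraction

# SI values of the units named in the comment
NANOAMPERE = Fraction(1, 10**9)          # A
MILLIAMPERE = Fraction(1, 10**3)         # A
SQUARE_MICROMETRE = Fraction(1, 10**12)  # m^2
SQUARE_CENTIMETRE = Fraction(1, 10**4)   # m^2

nA_per_um2 = NANOAMPERE / SQUARE_MICROMETRE    # current density in A/m^2
mA_per_cm2 = MILLIAMPERE / SQUARE_CENTIMETRE   # current density in A/m^2

# [nA/um^2] / [mA/cm^2]
scale = nA_per_um2 / mA_per_cm2


def stimulus_current_density(ie_nA, area_um2):
    """Magnitude in mA/cm^2 of a stimulus of ie_nA spread over area_um2."""
    return float(scale) * ie_nA / area_um2
```

With these definitions `scale` evaluates to exactly 100, so a 2 nA stimulus on a 50 µm^2 control volume changes the current density by 4 mA/cm^2 in magnitude.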