Release 2.4.0crp #180

Merged
merged 31 commits into from
May 28, 2020
ce377f1
Use cudaDeviceGetAttribute() for querying the compute capability
sbastrakov Sep 24, 2019
450c73d
Choose the value for the -arch nvcc flag depending on CUDA version
sbastrakov Sep 24, 2019
efd20bc
Merge pull request #164 from sbastrakov/fix-nvccComputeCapability
psychocoderHPC Sep 25, 2019
eff012d
Merge pull request #161 from sbastrakov/topic-cudaDeviceGetArrribute
psychocoderHPC Sep 25, 2019
d911d0c
Add a guard around COMPUTE_CAPABILITY cmake variable
sbastrakov Sep 26, 2019
36cb7f9
Merge pull request #165 from sbastrakov/topic-nvccComputeCapabilityGuard
ax3l Sep 26, 2019
42aed7e
Travis CI: GCC 5.5.0 + CUDA 9.1.85
ax3l May 7, 2020
e383f3c
Merge pull request #170 from ax3l/topic-ciBionic
psychocoderHPC May 7, 2020
4962156
some cleanup
bernhardmgruber May 6, 2020
dafc9b7
* replaced usage of boost::mpl by static constexpr members
bernhardmgruber May 6, 2020
d37e9ed
addressed review comments
bernhardmgruber May 7, 2020
240d4ea
Suggested during review
bernhardmgruber May 7, 2020
25e0de3
renamed variables with 2 leading underscores
bernhardmgruber May 7, 2020
73e21de
* removed check that pagesize is unsigned
bernhardmgruber May 8, 2020
5071114
a little modernization of the CMakeLists
bernhardmgruber May 8, 2020
2856151
* requiring only C++11
bernhardmgruber May 11, 2020
4375461
Merge pull request #169 from bernhardmgruber/cleaning
psychocoderHPC May 11, 2020
6404efd
* added a custom target for mallocMC headers
bernhardmgruber May 11, 2020
5c16da7
applied clang-tidy
bernhardmgruber May 11, 2020
d008ddb
added .vs and build folders to ignores
bernhardmgruber May 12, 2020
95c3223
replaced remaining typedefs by using directives
bernhardmgruber May 14, 2020
8c5a861
Merge pull request #171 from bernhardmgruber/cleaning
psychocoderHPC May 14, 2020
edc10db
added .clang-format file
bernhardmgruber May 11, 2020
669443d
* setting column limit and allowing short loops
bernhardmgruber May 14, 2020
5661a80
formatting
bernhardmgruber May 14, 2020
8aec2cb
using trailing return types
bernhardmgruber May 14, 2020
4a416a2
formatting (after clang-tidy)
bernhardmgruber May 14, 2020
0185438
added CONTRIBUTING.md with instructions how to use clang-format
bernhardmgruber May 14, 2020
7a3d1ce
Merge pull request #172 from bernhardmgruber/format
psychocoderHPC May 18, 2020
c5501d9
version update 2.4.0crp and changelog
psychocoderHPC May 20, 2020
0126a7f
Merge pull request #175 from psychocoderHPC/topic-versionUpdateTo2.4.…
sbastrakov May 26, 2020
77 changes: 77 additions & 0 deletions .clang-format
@@ -0,0 +1,77 @@
---
AccessModifierOffset: -4
AlignAfterOpenBracket: AlwaysBreak
AlignConsecutiveAssignments: false
AlignConsecutiveDeclarations: false
AlignEscapedNewlines: DontAlign
AlignOperands: false
AlignTrailingComments: false
AllowAllParametersOfDeclarationOnNextLine: false
AllowShortBlocksOnASingleLine: false
AllowShortCaseLabelsOnASingleLine: false
AllowShortFunctionsOnASingleLine: Empty
AllowShortIfStatementsOnASingleLine: false
AllowShortLoopsOnASingleLine: true
AlwaysBreakAfterReturnType: None
AlwaysBreakBeforeMultilineStrings: true
AlwaysBreakTemplateDeclarations: Yes
BinPackArguments: false
BinPackParameters: false
BreakBeforeBraces: Custom
BraceWrapping:
AfterClass: true
AfterControlStatement: true
AfterEnum: true
AfterFunction: true
AfterNamespace: true
AfterStruct: true
AfterUnion: true
AfterExternBlock: true
BeforeCatch: true
BeforeElse: true
IndentBraces: false
SplitEmptyFunction: false
SplitEmptyRecord: false
SplitEmptyNamespace: false
BreakBeforeBinaryOperators: All
BreakBeforeTernaryOperators: true
BreakConstructorInitializers: AfterColon
BreakInheritanceList: AfterColon
BreakStringLiterals: true
ColumnLimit: 80
CompactNamespaces: false
ConstructorInitializerAllOnOneLineOrOnePerLine: true
ConstructorInitializerIndentWidth: 8
ContinuationIndentWidth: 4
Cpp11BracedListStyle: true
DerivePointerAlignment: false
FixNamespaceComments: false
IncludeBlocks: Regroup
IndentCaseLabels: false
IndentPPDirectives: None
IndentWidth: 4
IndentWrappedFunctionNames: false
KeepEmptyLinesAtTheStartOfBlocks: false
Language: Cpp
NamespaceIndentation: All
PointerAlignment: Middle
ReflowComments: true
SortIncludes: true
SortUsingDeclarations: true
SpaceAfterCStyleCast: false
SpaceAfterTemplateKeyword: false
SpaceBeforeAssignmentOperators: true
SpaceBeforeCpp11BracedList: false
SpaceBeforeCtorInitializerColon: true
SpaceBeforeInheritanceColon: true
SpaceBeforeParens: Never
SpaceBeforeRangeBasedForLoopColon: true
SpaceInEmptyParentheses: false
SpacesInAngles: false
SpacesInCStyleCastParentheses: false
SpacesInContainerLiterals: false
SpacesInParentheses: false
SpacesInSquareBrackets: false
Standard: Cpp11
UseTab: Never
...
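
For orientation, this is roughly how C++ looks under a few of the options above. It is a hand-written sketch, not actual clang-format output, and `sketch::Buffer` and `sketch::sum` are hypothetical names used only for illustration.

```cpp
#include <cstddef>

namespace sketch
{
    // NamespaceIndentation: All indents everything inside the namespace.
    // BraceWrapping (AfterNamespace, AfterStruct, AfterFunction) puts each
    // opening brace on its own line.
    struct Buffer
    {
        // PointerAlignment: Middle leaves a space on both sides of '*'.
        int * data;
        std::size_t size;
    };

    // Trailing return type, as introduced by the "using trailing return
    // types" commit in this release.
    inline auto sum(Buffer const & buffer) -> int
    {
        int total = 0;
        // SpaceBeforeParens: Never gives "for(" rather than "for (".
        // AllowShortLoopsOnASingleLine: true keeps a short body on the line.
        for(std::size_t i = 0; i < buffer.size; ++i) total += buffer.data[i];
        return total;
    }
}
```

With `FixNamespaceComments: false`, no `// namespace sketch` comment is appended to the closing brace.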
3 changes: 3 additions & 0 deletions .clang-tidy
@@ -0,0 +1,3 @@
---
Checks: '*,-llvm-header-guard,-fuchsia-default-arguments-declarations,-cppcoreguidelines-no-malloc,-cppcoreguidelines-owning-memory,-misc-non-private-member-variables-in-classes'
HeaderFilterRegex: '.*'
2 changes: 2 additions & 0 deletions .gitignore
@@ -14,3 +14,5 @@

*~
/nbproject
/.vs
/build
11 changes: 6 additions & 5 deletions .travis.yml
@@ -2,7 +2,7 @@ language: cpp

sudo: required

dist: trusty
dist: bionic

compiler:
- gcc
@@ -14,6 +14,7 @@ env:

script:
- mkdir build_tmp && cd build_tmp
- CXX=g++-5 && CC=gcc-5
- cmake -DCMAKE_INSTALL_PREFIX=$INSTALL_DIR $TRAVIS_BUILD_DIR
- make
- make install
@@ -28,12 +29,12 @@ before_script:
- sudo apt-get install -f -qq
- sudo dpkg --get-selections | grep hold || { echo "All packages OK."; }
- sudo apt-get install -q -y cmake-data cmake
- sudo apt-get install -qq build-essential
- gcc --version && g++ --version # 4.8
- sudo apt-get install -qq build-essential g++-5
- gcc-5 --version && g++-5 --version # 5.5.0
- apt-cache search nvidia-*
- sudo apt-get install -qq nvidia-common
- sudo apt-get install -qq nvidia-cuda-dev nvidia-cuda-toolkit # 5.5
- sudo apt-get install -qq libboost-dev # 1.54.0
- sudo apt-get install -qq nvidia-cuda-dev nvidia-cuda-toolkit # 9.1.85
- sudo apt-get install -qq libboost-dev # 1.65.1
- sudo find /usr/ -name libcuda*.so

after_script:
21 changes: 21 additions & 0 deletions CHANGELOG.md
@@ -1,6 +1,27 @@
Change Log / Release Log for mallocMC
================================================================

2.4.0crp
--------
**Date:** 2020-05-28

This release removes the Boost dependency and switches to C++11.

### Changes to mallocMC 2.3.1crp

**Features**
- Cleaning, remove Boost dependency & C++11 Migration #169

**Bug fixes**
- Choose the value for the -arch nvcc flag depending on CUDA version #164 #165

**Misc:**
- Travis CI: GCC 5.5.0 + CUDA 9.1.85 #170
- Adding headers to projects and applied clang-tidy #171
- clang-format #172

Thanks to Sergei Bastrakov, Bernhard Manfred Gruber and Axel Huebl for contributing to this release!

2.3.1crp
--------
**Date:** 2019-02-14
87 changes: 28 additions & 59 deletions CMakeLists.txt
@@ -1,7 +1,12 @@
project(mallocMC)
cmake_minimum_required(VERSION 2.8.12.2)
project(mallocMC LANGUAGES CUDA CXX)
cmake_minimum_required(VERSION 3.8)

set(CMAKE_CXX_STANDARD 11)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

# helper for libs and packages
set(CMAKE_CUDA_STANDARD 11)
set(CMAKE_CUDA_STANDARD_REQUIRED ON)
set(CMAKE_PREFIX_PATH "/usr/lib/x86_64-linux-gnu/"
"$ENV{CUDA_ROOT}" "$ENV{BOOST_ROOT}")

@@ -14,64 +19,37 @@ set(CMAKE_PREFIX_PATH "/usr/lib/x86_64-linux-gnu/"
################################################################################

if(POLICY CMP0074)
cmake_policy(SET CMP0074 NEW)
cmake_policy(SET CMP0074 NEW)
endif()


###############################################################################
# CUDA
###############################################################################
find_package(CUDA REQUIRED)
set(CUDA_NVCC_FLAGS "-arch=sm_20;-use_fast_math;")
set(CUDA_INCLUDE_DIRS ${CMAKE_CURRENT_SOURCE_DIR})
include_directories(${CUDA_INCLUDE_DIRS})
cuda_include_directories(${CUDA_INCLUDE_DIRS})
if(NOT DEFINED COMPUTE_CAPABILITY)
set(COMPUTE_CAPABILITY "30")
endif()
set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} -arch=sm_${COMPUTE_CAPABILITY} -use_fast_math")

OPTION(CUDA_OUTPUT_INTERMEDIATE_CODE "Output ptx code" OFF)
if(CUDA_OUTPUT_INTERMEDIATE_CODE)
set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS};-Xptxas;-v;--keep")
endif(CUDA_OUTPUT_INTERMEDIATE_CODE)
set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} -Xptxas -v --keep")
endif()

SET(CUDA_OPTIMIZATION_TYPE "unset" CACHE STRING "CUDA Optimization")
set_property(CACHE CUDA_OPTIMIZATION_TYPE PROPERTY STRINGS "unset;-G0;-O0;-O1;-O2;-O3")
if(NOT ${CUDA_OPTIMIZATION_TYPE} STREQUAL "unset")
set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS};${CUDA_OPTIMIZATION_TYPE}")
if(NOT ${CUDA_OPTIMIZATION_TYPE} STREQUAL "unset")
set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} ${CUDA_OPTIMIZATION_TYPE}")
endif()


###############################################################################
# Boost
###############################################################################
find_package(Boost 1.48.0 REQUIRED)
include_directories(SYSTEM ${Boost_INCLUDE_DIRS})
set(LIBS ${LIBS} ${Boost_LIBRARIES})

# nvcc + boost 1.55 work around
if(Boost_VERSION EQUAL 105500)
set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} \"-DBOOST_NOINLINE=__attribute__((noinline))\" ")
endif(Boost_VERSION EQUAL 105500)


################################################################################
# Warnings
################################################################################
# GNU
if(CMAKE_COMPILER_IS_GNUCXX)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wall")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wshadow")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-unknown-pragmas")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wextra")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-unused-parameter")
# new warning in gcc 4.8 (flag ignored in previous version)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-unused-local-typedefs")
# ICC
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wall -Wshadow -Wno-unknown-pragmas -Wextra -Wno-unused-parameter -Wno-unused-local-typedefs")
elseif("${CMAKE_CXX_COMPILER_ID}" STREQUAL "Intel")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wall")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wshadow")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -DBOOST_NO_VARIADIC_TEMPLATES")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -DBOOST_NO_CXX11_VARIADIC_TEMPLATES")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -DBOOST_NO_FENV_H")
# PGI
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wall -Wshadow")
elseif("${CMAKE_CXX_COMPILER_ID}" STREQUAL "PGI")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Minform=inform")
endif()
@@ -87,28 +65,19 @@ INSTALL(
DESTINATION include
PATTERN ".git" EXCLUDE
PATTERN "mallocMC_config.hpp" EXCLUDE
)
)


###############################################################################
# Executables
###############################################################################
file(GLOB_RECURSE headers src/include/**)
add_custom_target(mallocMC SOURCES ${headers}) # create a target with the header files for IDE projects
source_group(TREE ${CMAKE_CURRENT_LIST_DIR}/src/include FILES ${headers})

include_directories(${CMAKE_CURRENT_LIST_DIR}/src/include)
add_executable(mallocMC_Example01 EXCLUDE_FROM_ALL examples/mallocMC_example01.cu examples/mallocMC_example01_config.hpp)
add_executable(mallocMC_Example02 EXCLUDE_FROM_ALL examples/mallocMC_example02.cu)
add_executable(mallocMC_Example03 EXCLUDE_FROM_ALL examples/mallocMC_example03.cu)
add_executable(VerifyHeap EXCLUDE_FROM_ALL tests/verify_heap.cu tests/verify_heap_config.hpp)
add_custom_target(examples DEPENDS mallocMC_Example01 mallocMC_Example02 mallocMC_Example03 VerifyHeap)

cuda_add_executable(mallocMC_Example01
EXCLUDE_FROM_ALL
examples/mallocMC_example01.cu )
cuda_add_executable(mallocMC_Example02
EXCLUDE_FROM_ALL
examples/mallocMC_example02.cu )
cuda_add_executable(mallocMC_Example03
EXCLUDE_FROM_ALL
examples/mallocMC_example03.cu )
cuda_add_executable(VerifyHeap
EXCLUDE_FROM_ALL
tests/verify_heap.cu )

target_link_libraries(mallocMC_Example01 ${LIBS})
target_link_libraries(mallocMC_Example02 ${LIBS})
target_link_libraries(mallocMC_Example03 ${LIBS})
target_link_libraries(VerifyHeap ${LIBS})
15 changes: 15 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,15 @@
# Contributing

## Formatting

Please format your code before opening pull requests, using clang-format and the .clang-format file placed in the repository root.

### Visual Studio and CLion
Support for clang-format has been built in since Visual Studio 2017 15.7 and CLion 2019.1.
The .clang-format file in the repository will be automatically detected and formatting is done as you type, or triggered when pressing the format hotkey.

### Bash
First install clang-format; installation instructions can be found on the web. To format, run this command in bash:
```
find . -iname '*.cu' -o -iname '*.hpp' | xargs clang-format-10 -i
```
5 changes: 5 additions & 0 deletions README.md
@@ -22,6 +22,11 @@ mallocMC is header-only, but requires a few other C++ libraries to be
available. Our installation notes can be found in [INSTALL.md](INSTALL.md).


Contributing
------------

Rules for contributions are found in [CONTRIBUTING.md](CONTRIBUTING.md).

On the ScatterAlloc Algorithm
-----------------------------

24 changes: 12 additions & 12 deletions Usage.md
@@ -19,15 +19,15 @@ Currently, there are the following policy classes available:

|Policy | Policy Classes (implementations) | description |
|------- |----------------------------------| ----------- |
|**CreationPolicy** | Scatter`<conf1,conf2>` | A scattered allocation to tradeoff fragmentation for allocation time, as proposed in [ScatterAlloc](http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6339604). `conf1` configures the heap layout, `conf2` determines the hashing parameters|
|**CreationPolicy** | Scatter`<conf1,conf2>` | A scattered allocation to tradeoff fragmentation for allocation time, as proposed in [ScatterAlloc](http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6339604). `conf1` configures the heap layout, `conf2` determines the hashing parameters|
| | OldMalloc | device-side malloc/new and free/delete syscalls as implemented on NVidia CUDA graphics cards with compute capability sm_20 and higher |
|**DistributionPolicy** | XMallocSIMD`<conf>` | SIMD optimization for warp-wide allocation on NVIDIA CUDA accelerators, as proposed by [XMalloc](http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=5577907). `conf` is used to determine the pagesize. If used in combination with *Scatter*, the pagesizes must match |
|**DistributionPolicy** | XMallocSIMD`<conf>` | SIMD optimization for warp-wide allocation on NVIDIA CUDA accelerators, as proposed by [XMalloc](http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=5577907). `conf` is used to determine the pagesize. If used in combination with *Scatter*, the pagesizes must match |
| | Noop | no workload distribution at all |
|**OOMPolicy** | ReturnNull | pointers will be *NULL*, if the request could not be fulfilled |
|**OOMPolicy** | ReturnNull | pointers will be *nullptr*, if the request could not be fulfilled |
| | ~~BadAllocException~~ | will throw a `std::bad_alloc` exception. The accelerator has to support exceptions |
|**ReservePoolPolicy** | SimpleCudaMalloc | allocate a fixed heap with `CudaMalloc` |
| | CudaSetLimits | call to `CudaSetLimits` to increase the available Heap (e.g. when using *OldMalloc*) |
|**AlignmentPolicy** | Shrink`<conf>` | shrinks the pool so that the starting pointer is well aligned, applies padding to requested memory chunks. `conf` is used to determine the alignment|
|**AlignmentPolicy** | Shrink`<conf>` | shrinks the pool so that the starting pointer is well aligned, applies padding to requested memory chunks. `conf` is used to determine the alignment|
| | Noop | no alignment at all |
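
The way policy classes combine into one allocator type can be sketched on the host, without CUDA. Everything below is a toy under stated assumptions: `BoundedCreation`, `ReturnNullSketch`, `ToyAllocator`, and their member functions are hypothetical stand-ins, not mallocMC classes or mallocMC's interface.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical creation policy: forwards to std::malloc, but refuses
// requests larger than a fixed bound.
struct BoundedCreation
{
    static constexpr std::size_t maxBytes = 1024;

    static auto create(std::size_t bytes) -> void *
    {
        return bytes <= maxBytes ? std::malloc(bytes) : nullptr;
    }

    static auto destroy(void * p) -> void
    {
        std::free(p);
    }
};

// Hypothetical out-of-memory policy: pass the failed (null) pointer through.
struct ReturnNullSketch
{
    static auto handleOOM(void * p) -> void *
    {
        return p;
    }
};

// Toy host class whose behaviour is assembled from the policies it is given.
template<typename T_CreationPolicy, typename T_OOMPolicy>
struct ToyAllocator
{
    static auto allocate(std::size_t bytes) -> void *
    {
        void * p = T_CreationPolicy::create(bytes);
        return p == nullptr ? T_OOMPolicy::handleOOM(p) : p;
    }

    static auto deallocate(void * p) -> void
    {
        T_CreationPolicy::destroy(p);
    }
};

// An alias declaration names one concrete combination of policies.
using ToyBoundedAllocator = ToyAllocator<BoundedCreation, ReturnNullSketch>;
```

Swapping a policy only changes the alias; code that calls `ToyBoundedAllocator::allocate` stays the same, and a failed request yields `nullptr` under the return-null policy.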

The user has to choose one of each policy that will form a useful allocator
@@ -45,7 +45,7 @@ to the policy class:
```c++
// configure the AlignmentPolicy "Shrink"
struct ShrinkConfig : mallocMC::AlignmentPolicies::Shrink<>::Properties {
typedef boost::mpl::int_<16> dataAlignment;
static constexpr auto dataAlignment = 16;
};
```
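
The hunk above replaces a `boost::mpl::int_<16>` typedef with a `static constexpr` member. A minimal sketch of how a policy could consume such a member is given here; `DefaultProperties`, `MyShrinkConfig`, `ShrinkSketch`, and `paddedSize` are hypothetical names for illustration and do not reflect mallocMC's actual implementation.

```cpp
#include <cstddef>

// Hypothetical default configuration (mallocMC's real Properties may differ).
struct DefaultProperties
{
    static constexpr std::size_t dataAlignment = 16;
};

// A user configuration overrides the default by shadowing the member.
struct MyShrinkConfig : DefaultProperties
{
    static constexpr std::size_t dataAlignment = 32;
};

// Hypothetical policy: reads the value directly from the configuration type,
// without the ::value indirection a boost::mpl::int_ typedef would need.
template<typename T_Config = DefaultProperties>
struct ShrinkSketch
{
    static constexpr std::size_t dataAlignment = T_Config::dataAlignment;

    // Plain constant expressions replace the former MPL-based checks.
    static_assert(
        dataAlignment > 0 && (dataAlignment & (dataAlignment - 1)) == 0,
        "dataAlignment must be a power of two");

    // Round a requested size up to the next multiple of dataAlignment.
    static constexpr auto paddedSize(std::size_t bytes) -> std::size_t
    {
        return (bytes + dataAlignment - 1) & ~(dataAlignment - 1);
    }
};

// The values are usable at compile time.
static_assert(ShrinkSketch<>::paddedSize(13) == 16, "default config");
static_assert(ShrinkSketch<MyShrinkConfig>::paddedSize(33) == 64, "override");
```

The sketch only uses C++11 features (single-return `constexpr` functions, in-class `static constexpr` initializers), matching the language level this release requires.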

@@ -57,29 +57,29 @@ parameters to create the desired allocator type:
```c++
using namespace mallocMC;

typedef mallocMC::Allocator<
using Allocator1 = mallocMC::Allocator<
CreationPolicy::OldMalloc,
DistributionPolicy::Noop,
OOMPolicy::ReturnNull,
ReservePoolPolicy::CudaSetLimits,
AlignmentPolicy::Noop
> Allocator1;
>;
```

`Allocator1` will resemble the behaviour of classical device-side allocation known
from NVIDIA CUDA since compute capability sm_20. To get a more novel allocator, one
could create the following typedef instead:
could create the following alias instead:

```c++
using namespace mallocMC;

typedef mallocMC::Allocator<
using ScatterAllocator = mallocMC::Allocator<
CreationPolicies::Scatter<>,
DistributionPolicies::XMallocSIMD<>,
OOMPolicies::ReturnNull,
ReservePoolPolicies::SimpleCudaMalloc,
AlignmentPolicies::Shrink<ShrinkConfig>
> ScatterAllocator;
>;
```

Notice, how the policy classes `Scatter` and `XMallocSIMD` are instantiated without
@@ -122,13 +122,13 @@ A simplistic example would look like this:

namespace MC = mallocMC;

typedef MC::Allocator<
using ScatterAllocator = MC::Allocator<
MC::CreationPolicies::Scatter<>,
MC::DistributionPolicies::XMallocSIMD<>,
MC::OOMPolicies::ReturnNull,
MC::ReservePoolPolicies::SimpleCudaMalloc,
MC::AlignmentPolicies::Shrink<ShrinkConfig>
> ScatterAllocator;
>;

__global__ void exampleKernel(ScatterAllocator::AllocatorHandle sah)
{