Distributed vector #1298

Merged
merged 66 commits into from Nov 6, 2014
66 commits
1a5e154
adding feature like block, cyclic and block-cyclic
Sep 27, 2014
d8ab303
adding comments
Sep 27, 2014
24cbf44
earlier there was some problem with block-cyclic distribution because…
Sep 27, 2014
bc02829
deleting temp files
Sep 27, 2014
2d27e98
change the std::for_each to hpx::for_each .. a work around
Sep 28, 2014
8c89477
removing cout statement
Sep 28, 2014
dc7c3ec
operator[]'s block_cyclic_distribution was not accessing the right el…
Oct 1, 2014
e54b85a
this commit is for making the iterator give index and value and tuple
Oct 10, 2014
8bbfd50
adding new feature that will allow the iterator to return index as we…
Oct 10, 2014
591a8f3
Merge branch 'master' of github.com:gbibek/hpx-1
Oct 10, 2014
97de917
removing tmp files
Oct 10, 2014
d15935b
fixing the block-cylic bug which was adding extra elements
Oct 11, 2014
767b5d0
removing temp files and making git ignore tmp files
Oct 14, 2014
db2aba6
removing tmp files
Oct 14, 2014
4d62ac9
Merge branch 'master' of github.com:gbibek/hpx-1 into gbibek-master
hkaiser Oct 14, 2014
7a9475d
Cleaning up current vector implementation, current state
hkaiser Oct 15, 2014
67bf75b
Merge branch 'algorithm_cleanup' into gbibek-master
hkaiser Oct 16, 2014
7ed9082
Started to work on segmented for_each algorithm
hkaiser Oct 16, 2014
a91e690
Significant progress, implemented
hkaiser Oct 19, 2014
8fc86ef
Creation and iteration functional (block policy only), added exhausti…
hkaiser Oct 20, 2014
1969429
Adding vector_configuration, fixing index calculation ofr cyclic and …
hkaiser Oct 21, 2014
982f511
Adding configuration object, added tests for connected vectors
hkaiser Oct 22, 2014
8489925
Making vector partition server a template - linker errors
hkaiser Oct 23, 2014
64d7f90
Adding bulk allocation of vector partitions, removed potentially ambi…
hkaiser Oct 23, 2014
91ea755
Fixed linker errors, thanks @K-ballo!
hkaiser Oct 23, 2014
ad12177
Adding (limited) extraction of main iterators based on locality, some…
hkaiser Oct 24, 2014
a5f8d78
Starting vector_algorithm_test
hkaiser Oct 24, 2014
fc12bdf
Working towards integrating local and remote algorithms, committing c…
hkaiser Oct 25, 2014
e93df44
Merge branch 'master' into distributed_vector
hkaiser Oct 26, 2014
ffea804
First integration with parallel algorithms functional
hkaiser Oct 27, 2014
0077fb6
Deleting old file
hkaiser Oct 27, 2014
67f2c6c
Inheriting parent properties for distribution policies
hkaiser Oct 27, 2014
33a76c2
Adding overloads for locality-based iterator extraction
hkaiser Oct 27, 2014
95e92be
Revert accidental change to the count algorithm
hkaiser Oct 27, 2014
eead301
Merge branch 'master' into distributed_vector
hkaiser Oct 27, 2014
2bdc52b
Fixing missing return value
hkaiser Oct 28, 2014
98114a4
Minor doc fixes
sithhell Oct 28, 2014
ca09dc6
Removed superfluous 'typename'
hkaiser Oct 28, 2014
522454c
Fixing a variable name typo
hkaiser Oct 28, 2014
c239990
Remove explicit instantiation
hkaiser Oct 28, 2014
43f9b04
More goodness to make real compilers happy ...
hkaiser Oct 28, 2014
3b75239
Adding missing 'typename' keywords
hkaiser Oct 28, 2014
2cfca2d
Fix potential thread safety problem
hkaiser Oct 28, 2014
76b43e9
Fixing segmented iterator traits
hkaiser Oct 29, 2014
ff1fbc4
Adding more test cases, fixing algorithm_result
hkaiser Oct 29, 2014
cc78e4e
Adding more missing typename keywords
hkaiser Oct 29, 2014
9f0c6df
Adding proper value semantics to vector, fixing problems with compone…
hkaiser Oct 30, 2014
7b2cc6d
Add missing comma (why didn't MSVC tell me?)
hkaiser Oct 30, 2014
7fe0fc2
Adding set_values/get_values, more tests, renaming sync/async APIs
hkaiser Oct 30, 2014
11b4808
Adding more tests
hkaiser Oct 31, 2014
4791b83
Speeding up things, not quite there yet. Cleaning up examples.
hkaiser Nov 2, 2014
b47fee2
fixing two problems
hkaiser Nov 2, 2014
2f7f7d8
Adding missing include
hkaiser Nov 3, 2014
e31aa36
Enabling local iterators to be usable remotely
hkaiser Nov 3, 2014
a805fae
Adding missing #include
hkaiser Nov 3, 2014
decbd80
More work on vector itself
hkaiser Nov 3, 2014
64fab1e
Speeding up things
hkaiser Nov 3, 2014
f464a3f
Started to implement a segmented index for the vector iterator. Not d…
hkaiser Nov 5, 2014
8a4b8a7
Partially revert latest changes. Indexing is now based on global inde…
hkaiser Nov 5, 2014
016f7e3
Fixing a problem in copy_from
hkaiser Nov 5, 2014
33aa0af
Making it easier for the compiler to fuse / and % on same operands
hkaiser Nov 5, 2014
94860ed
Making vector_algorithm_test working again
hkaiser Nov 5, 2014
8f43d29
More fixes
hkaiser Nov 5, 2014
6a6ca61
Even more fixes
K-ballo Nov 5, 2014
a289ff0
Added missing return statements
K-ballo Nov 6, 2014
f2a653a
Fixing signed/unsigned mismatch errors
hkaiser Nov 6, 2014
64 changes: 32 additions & 32 deletions README.rst
@@ -59,7 +59,7 @@ If you plan to use HPX we suggest to start with the latest released version

If you would like to work with the cutting edge version from this repository
we suggest following the current health status of the master branch by looking at
-our `contiguous integration results website <http://hermione.cct.lsu.edu/waterfall>`_.
+our `contiguous integration results website <http://hermione.cct.lsu.edu/console>`_.
While we try to keep the master branch stable and usable, sometimes new bugs
trick their way into the code base - you have been warned!

@@ -76,7 +76,7 @@ Version 1.0 (See accompanying file LICENSE_1_0.txt or an online copy available
`here <http://www.boost.org/LICENSE_1_0.txt>`_).

Before starting to build HPX, please read about the
-`prerequisites <http://stellar-group.github.io/hpx/docs/html/hpx/tutorial/getting_started/prereqs.html>`_.
+`prerequisites <http://stellar-group.github.io/hpx/docs/html/hpx/manual/build_system/prerequisites.html>`_.

Linux
-----
@@ -101,7 +101,7 @@ Linux
/path/to/hpx/source/tree

for instance::

cmake -DBOOST_ROOT=~/packages/boost \
-DHWLOC_ROOT=/packages/hwloc \
-DCMAKE_INSTALL_PREFIX=~/packages/hpx \
@@ -127,7 +127,7 @@ Linux
gmake install

to build and install the examples.

Please refer `here <http://stellar-group.github.io/hpx/docs/html/hpx/manual/build_system/building_hpx/build_recipes.html#hpx.manual.build_system.building_hpx.build_recipes.unix_installation>`_
for more information about building HPX on a Linux system.

@@ -256,7 +256,7 @@ Windows
"Where to build the binaries:", enter the full path to the build folder you
created in step 2.

-4) Add CMake variable definitions (if any) by clicking the "Add Entry" button and selecting type
+4) Add CMake variable definitions (if any) by clicking the "Add Entry" button and selecting type
"String". Most probably you will need to at least add the directories where `Boost <http://www.boost.org>`_
is located as BOOST_ROOT and where `Hwloc <http://www.open-mpi.org/projects/hwloc/>`_ is
located as HWLOC_ROOT.
@@ -322,7 +322,7 @@ So far we only support BGClang for compiling HPX on the BlueGene/Q.
5) Generate the HPX buildfiles using cmake::

cmake -DHPX_PLATFORM=BlueGeneQ \
--CMAKE_TOOLCHAIN_FILE=/path/to/hpx/cmake/toolchains/BGQ.cmake \
+-DCMAKE_TOOLCHAIN_FILE=/path/to/hpx/cmake/toolchains/BGQ.cmake \
-DCMAKE_CXX_COMPILER=bgclang++11 \
-DMPI_CXX_COMPILER=mpiclang++11 \
-DHWLOC_ROOT=/path/to/hwloc/installation \
@@ -348,10 +348,10 @@ You can find more details about using HPX on a BlueGene/Q system
Intel(R) Xeon/Phi
-----------------

-After installing Boost and HWLOC, the build procedure is almost the same as
-for how to build HPX on Unix Variants with the sole difference that you have
+After installing Boost and HWLOC, the build procedure is almost the same as
+for how to build HPX on Unix Variants with the sole difference that you have
to enable the Xeon Phi in the CMake Build system. This is achieved by invoking
-CMake in the following way::
+CMake in the following way::

cmake \
-DCMAKE_TOOLCHAIN_FILE=/path/to/hpx/cmake/toolchains/XeonPhi.cmake \
@@ -367,38 +367,38 @@ the `documentation <http://stellar-group.github.io/hpx/docs/html/hpx/manual/buil
Acknowledgements
******************

-We would like to acknowledge the NSF, DoE, DARPA, the Center for Computation
-and Technology (CCT) at Louisiana State University, and the Department of
+We would like to acknowledge the NSF, DoE, DARPA, the Center for Computation
+and Technology (CCT) at Louisiana State University, and the Department of
Computer Science 3 - Computer Architecture at the University of Erlangen
-Nuremberg who fund and support our work.
+Nuremberg who fund and support our work.

-We would also like to thank the following
-organizations for granting us allocations of their compute resources:
+We would also like to thank the following
+organizations for granting us allocations of their compute resources:
LSU HPC, LONI, XSEDE, NERSC, and the Gauss Center for Supercomputing.

HPX is currently funded by

-* The National Science Foundation through awards 1117470 (APX),
-1240655 (STAR), 1447831 (PXFS), and 1339782 (STORM).
+* The National Science Foundation through awards 1117470 (APX),
+1240655 (STAR), 1447831 (PXFS), and 1339782 (STORM).

-Any opinions, findings, and conclusions or
-recommendations expressed in this material are those of the author(s)
+Any opinions, findings, and conclusions or
+recommendations expressed in this material are those of the author(s)
and do not necessarily reflect the views of the National Science Foundation.

-* The Department of Energy (DoE) through the award DE-SC0008714 (XPRESS).
-
-Neither the United States Government nor any agency thereof, nor any of
-their employees, makes any warranty, express or implied, or assumes any
-legal liability or responsibility for the accuracy, completeness, or
-usefulness of any information, apparatus, product, or process disclosed,
-or represents that its use would not infringe privately owned rights.
-Reference herein to any specific commercial product, process, or service
-by trade name, trademark, manufacturer, or otherwise does not necessarily
-constitute or imply its endorsement, recommendation, or favoring by the
-United States Government or any agency thereof. The views and opinions of
-authors expressed herein do not necessarily state or reflect those of the
+* The Department of Energy (DoE) through the award DE-SC0008714 (XPRESS).
+
+Neither the United States Government nor any agency thereof, nor any of
+their employees, makes any warranty, express or implied, or assumes any
+legal liability or responsibility for the accuracy, completeness, or
+usefulness of any information, apparatus, product, or process disclosed,
+or represents that its use would not infringe privately owned rights.
+Reference herein to any specific commercial product, process, or service
+by trade name, trademark, manufacturer, or otherwise does not necessarily
+constitute or imply its endorsement, recommendation, or favoring by the
+United States Government or any agency thereof. The views and opinions of
+authors expressed herein do not necessarily state or reflect those of the
United States Government or any agency thereof.

-* The Bavarian Research Foundation (Bayerische Forschungsstfitung) through
-the grant AZ-987-11.
+* The Bavarian Research Foundation (Bayerische Forschungsstfitung) through
+the grant AZ-987-11.

2 changes: 1 addition & 1 deletion docs/manual/build_system/recipe_bgq.qbk
@@ -58,7 +58,7 @@ You can then use this as your build command:

``
cmake -DHPX_PLATFORM=BlueGeneQ \
--CMAKE_TOOLCHAIN_FILE=/path/to/hpx/cmake/toolchains/BGQ.cmake \
+-DCMAKE_TOOLCHAIN_FILE=/path/to/hpx/cmake/toolchains/BGQ.cmake \
-DCMAKE_CXX_COMPILER=bgclang++11 \
-DMPI_CXX_COMPILER=mpiclang++11 \
-DHWLOC_ROOT=/path/to/hwloc/installation \
1 change: 1 addition & 0 deletions examples/transpose/CMakeLists.txt
@@ -10,6 +10,7 @@ set(example_programs
transpose_smp
transpose_smp_block
transpose
+transpose_serial_vector
)

foreach(example_program ${example_programs})
55 changes: 31 additions & 24 deletions examples/transpose/transpose.cpp
@@ -3,11 +3,6 @@
// Distributed under the Boost Software License, Version 1.0. (See accompanying
// file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)

-// This is the eighth in a series of examples demonstrating the development
-// of a fully distributed solver for a simple 1D heat distribution problem.
-//
-// This example builds on example seven.
-
#include <hpx/hpx_init.hpp>
#include <hpx/hpx.hpp>
#include <hpx/lcos/local/detail/invoke_when_ready.hpp>
@@ -19,10 +14,8 @@
#include <algorithm>
#include <vector>

-// Constant to shift column index
-#define COL_SHIFT 1000.00
-// Constant to shift row index
-#define ROW_SHIFT 0.001
+#define COL_SHIFT 1000.00 // Constant to shift column index
+#define ROW_SHIFT 0.001 // Constant to shift row index

bool verbose = false;

@@ -176,8 +169,12 @@ HPX_REGISTER_MINIMAL_COMPONENT_FACTORY(block_component_type, block_component);
typedef block_component::get_sub_block_action get_sub_block_action;
HPX_REGISTER_ACTION(get_sub_block_action);

-void transpose(hpx::future<sub_block> A, hpx::future<sub_block> B, hpx::future<boost::uint64_t> block_order, hpx::future<boost::uint64_t> tile_size);
-double test_results(boost::uint64_t order, boost::uint64_t block_order, std::vector<block> & trans, boost::uint64_t blocks_start, boost::uint64_t blocks_end);
+void transpose(hpx::future<sub_block> A, hpx::future<sub_block> B,
+hpx::future<boost::uint64_t> block_order,
+hpx::future<boost::uint64_t> tile_size);
+double test_results(boost::uint64_t order, boost::uint64_t block_order,
+std::vector<block> & trans, boost::uint64_t blocks_start,
+boost::uint64_t blocks_end);

///////////////////////////////////////////////////////////////////////////////
int hpx_main(boost::program_options::variables_map& vm)
@@ -194,11 +191,12 @@ int hpx_main(boost::program_options::variables_map& vm)
boost::uint64_t tile_size = order;

if(vm.count("tile_size"))
-tile_size = vm["tile_size"].as<boost::uint64_t>();
+tile_size = vm["tile_size"].as<boost::uint64_t>();

-verbose = vm.count("verbose");
+verbose = vm.count("verbose") ? true : false;

-boost::uint64_t bytes = 2.0 * sizeof(double) * order * order;
+boost::uint64_t bytes =
+static_cast<boost::uint64_t>(2.0 * sizeof(double) * order * order);

boost::uint64_t num_blocks = num_localities * num_local_blocks;

@@ -254,8 +252,10 @@ int hpx_main(boost::program_options::variables_map& vm)
for_each(par, boost::begin(range), boost::end(range),
[&](boost::uint64_t b)
{
-boost::shared_ptr<block_component> A_ptr = hpx::get_ptr<block_component>(A[b].get_gid()).get();
-boost::shared_ptr<block_component> B_ptr = hpx::get_ptr<block_component>(B[b].get_gid()).get();
+boost::shared_ptr<block_component> A_ptr =
+hpx::get_ptr<block_component>(A[b].get_gid()).get();
+boost::shared_ptr<block_component> B_ptr =
+hpx::get_ptr<block_component>(B[b].get_gid()).get();

for(boost::uint64_t i = 0; i < order; ++i)
{
@@ -273,7 +273,7 @@ int hpx_main(boost::program_options::variables_map& vm)
double avgtime = 0.0;
double maxtime = 0.0;
double mintime = 366.0 * 24.0*3600.0; // set the minimum time to a large value;
-// one leap year should be enough
+// one leap year should be enough
for(boost::uint64_t iter = 0; iter < iterations; ++iter)
{
hpx::util::high_resolution_timer t;
@@ -334,15 +334,16 @@ int hpx_main(boost::program_options::variables_map& vm)
if(errsq < epsilon)
{
std::cout << "Solution validates\n";
-avgtime = avgtime/static_cast<double>((std::max)(iterations-1, static_cast<boost::uint64_t>(1)));
+avgtime = avgtime/static_cast<double>(
+(std::max)(iterations-1, static_cast<boost::uint64_t>(1)));
std::cout
<< "Rate (MB/s): " << 1.e-6 * bytes/mintime << ", "
<< "Avg time (s): " << avgtime << ", "
<< "Min time (s): " << mintime << ", "
<< "Max time (s): " << maxtime << "\n";

if(verbose)
-std::cout << "Squared errors: " << errsq << "\n";
+std::cout << "Squared errors: " << errsq << "\n";
}
else
{
@@ -368,9 +369,11 @@ int main(int argc, char* argv[])
("iterations", value<boost::uint64_t>()->default_value(10),
"# iterations")
("tile_size", value<boost::uint64_t>(),
-"Number of tiles to divide the individual matrix blocks for improved cache and TLB performance")
+"Number of tiles to divide the individual matrix blocks for improved "
+"cache and TLB performance")
("num_blocks", value<boost::uint64_t>()->default_value(1),
-"Number of blocks to divide the individual matrix blocks for improved cache and TLB performance")
+"Number of blocks to divide the individual matrix blocks for "
+"improved cache and TLB performance")
( "verbose", "Verbose output")
;

@@ -382,7 +385,9 @@ int main(int argc, char* argv[])
return hpx::init(desc_commandline, argc, argv, cfg);
}

-void transpose(hpx::future<sub_block> Af, hpx::future<sub_block> Bf, hpx::future<boost::uint64_t> block_order_fut, hpx::future<boost::uint64_t> tile_size_fut)
+void transpose(hpx::future<sub_block> Af, hpx::future<sub_block> Bf,
+hpx::future<boost::uint64_t> block_order_fut,
+hpx::future<boost::uint64_t> tile_size_fut)
{
const sub_block A(Af.get());
sub_block B(Bf.get());
@@ -416,7 +421,9 @@ void transpose(hpx::future<sub_block> Af, hpx::future<sub_block> Bf, hpx::future
}
}

-double test_results(boost::uint64_t order, boost::uint64_t block_order, std::vector<block> & trans, boost::uint64_t blocks_start, boost::uint64_t blocks_end)
+double test_results(boost::uint64_t order, boost::uint64_t block_order,
+std::vector<block> & trans, boost::uint64_t blocks_start,
+boost::uint64_t blocks_end)
{
using hpx::parallel::for_each;
using hpx::parallel::par;
@@ -445,7 +452,7 @@ double test_results(boost::uint64_t order, boost::uint64_t block_order, std::vec
);

if(verbose)
-std::cout << " Squared sum of differences: " << errsq << "\n";
+std::cout << " Squared sum of differences: " << errsq << "\n";

return errsq;
}
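The `bytes` and `Rate (MB/s)` expressions touched in this file mix floating-point and 64-bit integer arithmetic, which is why the diff adds an explicit `static_cast`. A minimal standalone sketch of the same arithmetic, using hypothetical helper names that do not appear in the example and `std::uint64_t` in place of `boost::uint64_t`, might look like this:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration; the example computes these inline.

// Each of the order * order elements is read once and written once, so the
// amount of data moved is 2 * sizeof(double) * order^2 bytes. The product is
// evaluated in double, so the conversion back to an integer is made explicit.
inline std::uint64_t bytes_moved(std::uint64_t order)
{
    return static_cast<std::uint64_t>(2.0 * sizeof(double) * order * order);
}

// Transfer rate in MB/s (1 MB = 10^6 bytes), given the elapsed time in seconds.
inline double rate_mb_per_s(std::uint64_t bytes, double seconds)
{
    assert(seconds > 0.0);
    return 1.e-6 * static_cast<double>(bytes) / seconds;
}
```

For `order = 1000` and 8-byte doubles this gives 16,000,000 bytes, so a run whose fastest iteration takes 2 seconds would report roughly 8 MB/s.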
37 changes: 17 additions & 20 deletions examples/transpose/transpose_serial.cpp
@@ -3,21 +3,14 @@
// Distributed under the Boost Software License, Version 1.0. (See accompanying
// file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)

-// This is the eighth in a series of examples demonstrating the development
-// of a fully distributed solver for a simple 1D heat distribution problem.
-//
-// This example builds on example seven.
-
#include <hpx/hpx_init.hpp>
#include <hpx/hpx.hpp>

#include <algorithm>
#include <vector>

-// Constant to shift column index
-#define COL_SHIFT 1000.00
-// Constant to shift row index
-#define ROW_SHIFT 0.001
+#define COL_SHIFT 1000.00 // Constant to shift column index
+#define ROW_SHIFT 0.001 // Constant to shift row index

bool verbose = false;

@@ -31,11 +24,12 @@ int hpx_main(boost::program_options::variables_map& vm)
boost::uint64_t tile_size = order;

if(vm.count("tile_size"))
-tile_size = vm["tile_size"].as<boost::uint64_t>();
+tile_size = vm["tile_size"].as<boost::uint64_t>();

-verbose = vm.count("verbose");
+verbose = vm.count("verbose") ? true : false;

-boost::uint64_t bytes = 2.0 * sizeof(double) * order * order;
+boost::uint64_t bytes =
+static_cast<boost::uint64_t>(2.0 * sizeof(double) * order * order);

std::vector<double> A(order * order);
std::vector<double> B(order * order);
@@ -64,7 +58,7 @@ int hpx_main(boost::program_options::variables_map& vm)
double avgtime = 0.0;
double maxtime = 0.0;
double mintime = 366.0 * 24.0*3600.0; // set the minimum time to a large value;
-// one leap year should be enough
+// one leap year should be enough
for(boost::uint64_t iter = 0; iter < iterations; ++iter)
{
hpx::util::high_resolution_timer t;
@@ -75,16 +69,17 @@ int hpx_main(boost::program_options::variables_map& vm)
{
for(boost::uint64_t j = 0; j < order; j += tile_size)
{
-for(boost::uint64_t it = i; it < (std::min)(order, i + tile_size); ++it)
+boost::uint64_t i_max = (std::min)(order, i + tile_size);
+for(boost::uint64_t it = i; it < i_max; ++it)
{
-for(boost::uint64_t jt = j; jt < (std::min)(order, j + tile_size); ++jt)
+boost::uint64_t j_max = (std::min)(order, j + tile_size);
+for(boost::uint64_t jt = j; jt < j_max; ++jt)
{
B[it + order * jt] = A[jt + order * it];
}
}
}
}
-
}
else
{
@@ -115,15 +110,16 @@ int hpx_main(boost::program_options::variables_map& vm)
if(errsq < epsilon)
{
std::cout << "Solution validates\n";
-avgtime = avgtime/static_cast<double>((std::max)(iterations-1, static_cast<boost::uint64_t>(1)));
+avgtime = avgtime/static_cast<double>(
+(std::max)(iterations-1, static_cast<boost::uint64_t>(1)));
std::cout
<< "Rate (MB/s): " << 1.e-6 * bytes/mintime << ", "
<< "Avg time (s): " << avgtime << ", "
<< "Min time (s): " << mintime << ", "
<< "Max time (s): " << maxtime << "\n";

if(verbose)
-std::cout << "Squared errors: " << errsq << "\n";
+std::cout << "Squared errors: " << errsq << "\n";
}
else
{
@@ -147,7 +143,8 @@ int main(int argc, char* argv[])
("iterations", value<boost::uint64_t>()->default_value(10),
"# iterations")
("tile_size", value<boost::uint64_t>(),
-"Number of tiles to divide the individual matrix blocks for improved cache and TLB performance")
+"Number of tiles to divide the individual matrix blocks for improved "
+"cache and TLB performance")
( "verbose", "Verbose output")
;

@@ -173,7 +170,7 @@ double test_results(boost::uint64_t order, std::vector<double> const & trans)
}

if(verbose)
-std::cout << " Squared sum of differences: " << errsq << "\n";
+std::cout << " Squared sum of differences: " << errsq << "\n";

return errsq;
}
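The tiled loop changed in `transpose_serial.cpp` hoists the `(std::min)` bounds into `i_max` and `j_max` instead of re-evaluating them in each loop condition. Pulled out of `hpx_main`, the loop nest could be sketched as the standalone function below. The function name `transpose_tiled` and the use of `std::uint64_t` instead of `boost::uint64_t` are assumptions made to keep the sketch free of HPX and Boost dependencies.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical free function; the example performs this inside hpx_main.
// Returns B with B[it + order * jt] == A[jt + order * it], i.e. the transpose
// of an order x order matrix stored in a flat vector.
std::vector<double> transpose_tiled(std::vector<double> const& A,
    std::uint64_t order, std::uint64_t tile_size)
{
    assert(tile_size != 0 && A.size() == order * order);
    std::vector<double> B(order * order);

    // Walk the matrix tile by tile so that each tile stays cache resident.
    for (std::uint64_t i = 0; i < order; i += tile_size)
    {
        for (std::uint64_t j = 0; j < order; j += tile_size)
        {
            // Clamp the tile to the matrix edge once, outside the loop test.
            std::uint64_t const i_max = (std::min)(order, i + tile_size);
            for (std::uint64_t it = i; it < i_max; ++it)
            {
                std::uint64_t const j_max = (std::min)(order, j + tile_size);
                for (std::uint64_t jt = j; jt < j_max; ++jt)
                {
                    B[it + order * jt] = A[jt + order * it];
                }
            }
        }
    }
    return B;
}
```

The clamping matters when `tile_size` does not divide `order`: with `order = 3` and `tile_size = 2`, the last tile in each direction covers a single row or column rather than running past the end of the matrix.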