Merge pull request #14 from theZiz/topic-binary-operator-overload
Topic binary operator overload
theZiz committed Oct 24, 2018
2 parents 0aa97ab + 2dd328c commit 7f594d5
Showing 9 changed files with 234 additions and 88 deletions.
2 changes: 1 addition & 1 deletion documentation/conf.py
@@ -58,7 +58,7 @@ def fixBug( prefix, className, templateList ):
fixBug( "<definition>using ", "common::allocator::Alpaka", "T_DevAcc, T_Size" )
fixBug( "<definition>using ", "common::allocator::AlpakaMirror", "T_DevAcc, T_Size, T_Mapping" )
fixBug( "<definition>using ", "common::allocator::AlpakaShared", "T_Acc, T_count, T_uniqueID" )
fixBug( "<definition>using ", "llama::VirtualDatum", "T_View, T_BoundDatumDomain" )
fixBug( "<definition>using ", "llama::VirtualDatum", "T_View, T_BoundDatumDomain, T_ViewType" )

# -- Project information -----------------------------------------------------

8 changes: 7 additions & 1 deletion documentation/pages/user/api.rst
@@ -177,7 +177,13 @@ Factory
.. doxygentypedef:: llama::OneOnStackFactory
:project: LLAMA

.. doxygenfunction:: llama::tempAlloc
.. doxygenfunction:: llama::stackViewAlloc
:project: LLAMA

.. doxygenfunction:: llama::stackVirtualDatumAlloc
:project: LLAMA

.. doxygenfunction:: llama::stackVirtualDatumCopy
:project: LLAMA

.. _label-api-allocators:
10 changes: 3 additions & 7 deletions documentation/pages/user/plans.rst
@@ -14,13 +14,9 @@ the near future. However there is still a long way to go.
View and virtual datum interface
--------------------------------

One of the most relevant plans for end users is the extension of the view and
virtual datum functionalities. At the moment only six inplace and six logical
operators are overloaded. In the future all arithmetic and logic operators shall
be implemented -- at least if they are inplace.
:cpp:`view1( i ) = view2( j ) + view( k )` would need intermediate temporary
objects (e.g. on the stack) which may be bad or expression templates. Both are
not **planned** atm but **not completely excluded** as well.
Right now only the inplace operations like :cpp:`+=`, the logical operations and
the five most important binary operations are supported. In the future all
binary (and unary?) operators shall be overloaded.

Mappings
--------
20 changes: 16 additions & 4 deletions documentation/pages/user/views.rst
@@ -132,10 +132,18 @@ a parameter to a function (as seen in the
}

The most needed inplace operators ( :cpp:`=`, :cpp:`+=`, :cpp:`-=`, :cpp:`*=`,
:cpp:`/=`, :cpp:`%=` ) are overloaded. Only inplace, because it is not trivial
to create a needed intermediate state out of a virtual datum (without expression
templates). These operators work between two virtual datums, even if they have
different datum domains. Every naming existing in both datum domains will be
:cpp:`/=`, :cpp:`%=` ) are overloaded. These operators write directly into the
corresponding view. Furthermore the not-inplace operators ( :cpp:`+`, :cpp:`-`,
:cpp:`*`, :cpp:`/`, :cpp:`%` ) are overloaded too but return a temporary object
on the stack. Although it has a basic struct mapping without padding and is
probably not compatible with the mapping of the view at all, the compiler
will most probably be able to optimize the data accesses anyway, as it has full
knowledge about the data on the stack and can cut out all temporary operations.

These operators work between two virtual datums, even if they have
different datum domains. It is even possible to work on parts of a virtual
datum. This returns a virtual datum with the first coordinates in the datum
domain bound. Every naming existing in both datum domains will be
matched and operated on. Every not matching pair is ignored, e.g.

.. code-block:: C++
@@ -165,6 +173,10 @@ matched and operated on. Every not matching pair is ignored, e.g.
// datum2.pos.x and only datum2.pos.x will be added to datum1.pos.x because
// of pos.x existing in both datum domains although having different types.

datum1( vel() ) *= datum2( mom() );
// datum2.mom.x will be multiplied to datum1.vel.x as the first part of the
// datum domain coord is explicitly given and the same afterwards

The same operators are also overloaded for any other type so that
:cpp:`datum1 *= 7.0` will multiply 7 to every element in the datum domain.
Of course this may throw warnings about narrowing conversion. It is task of the
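
The views documentation above distinguishes in-place operators, which write into their left operand, from binary operators, which return a temporary object on the stack. The following is a minimal sketch of that pattern under stated assumptions: `Vec3`, `move` and the free operators are hypothetical stand-ins and not LLAMA types or functions, and a plain struct takes the place of a virtual datum.

```cpp
#include <cassert>

// Hypothetical stand-in for a virtual datum with three elements (not LLAMA code).
struct Vec3
{
    float x, y, z;

    // In-place operator: writes directly into the left-hand object.
    Vec3 & operator+=( Vec3 const & other )
    {
        x += other.x;
        y += other.y;
        z += other.z;
        return *this;
    }

    // In-place scaling of every element by a scalar.
    Vec3 & operator*=( float scalar )
    {
        x *= scalar;
        y *= scalar;
        z *= scalar;
        return *this;
    }
};

// Binary operators: the left operand is taken by value, so it is already a
// temporary on the stack. The in-place operator is reused on that copy.
inline Vec3 operator+( Vec3 lhs, Vec3 const & rhs )
{
    lhs += rhs;
    return lhs;
}

inline Vec3 operator*( Vec3 lhs, float scalar )
{
    lhs *= scalar;
    return lhs;
}

// Same shape as the one-line update in the n-body example: pos += vel * ts.
inline void move( Vec3 & pos, Vec3 const & vel, float ts )
{
    pos += vel * ts;
}
```

Building each binary operator on its in-place counterpart keeps the two in sync. The temporary is a small value type, which is the situation in which the documentation expects the compiler to optimize the intermediate copies away.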
10 changes: 5 additions & 5 deletions examples/asynccopy/asynccopy.cpp
@@ -162,8 +162,8 @@ struct BlurKernel
for( auto x = start[1]; x < end[1]; ++x )
{
#if ASYNCCOPY_LOCAL_SUM == 1
auto sum = llama::tempAlloc< 1, PixelOnAcc >();
sum() = 0;
auto sum = llama::stackVirtualDatumAlloc< PixelOnAcc >();
sum = 0;
#else // ASYNCCOPY_LOCAL_SUM == 0
newImage( y + kernelSize, x + kernelSize ) = 0;
#endif // ASYNCCOPY_LOCAL_SUM
@@ -193,7 +193,7 @@ struct BlurKernel
LLAMA_INDEPENDENT_DATA
for ( auto a = i_start[1]; a < i_end[1]; ++a )
#if ASYNCCOPY_LOCAL_SUM == 1
sum() +=
sum +=
#else // ASYNCCOPY_LOCAL_SUM == 0
newImage( y + kernelSize, x + kernelSize ) +=
#endif // ASYNCCOPY_LOCAL_SUM
@@ -203,8 +203,8 @@ struct BlurKernel
shared( std::size_t(b), std::size_t(a) );
#endif // ASYNCCOPY_SHARED
#if ASYNCCOPY_LOCAL_SUM == 1
sum() /= Element( (2 * kernelSize + 1) * (2 * kernelSize + 1) );
newImage( y + kernelSize, x + kernelSize ) = sum();
sum /= Element( (2 * kernelSize + 1) * (2 * kernelSize + 1) );
newImage( y + kernelSize, x + kernelSize ) = sum;
#else // ASYNCCOPY_LOCAL_SUM == 0
newImage( y + kernelSize, x + kernelSize ) /=
Element( (2 * kernelSize + 1) * (2 * kernelSize + 1) );
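
The `ASYNCCOPY_LOCAL_SUM == 1` branch above accumulates into a local stack object and writes the result to the image once, instead of updating the image element in every inner iteration. A one-dimensional analogue of that idea might look like the sketch below. It assumes plain `float` pixels in a `std::vector`; `boxBlur1D` is a hypothetical name and not part of the example program.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 1D box blur: each output element is the mean of a window of
// 2 * kernelSize + 1 input elements. Only positions with a full window are
// produced, so the output is shorter than the input.
inline std::vector< float > boxBlur1D(
    std::vector< float > const & in,
    std::size_t kernelSize
)
{
    std::size_t const window = 2 * kernelSize + 1;
    if( in.size() < window )
        return {};

    std::vector< float > out( in.size() - 2 * kernelSize );
    for( std::size_t x = 0; x < out.size(); ++x )
    {
        // Local accumulator on the stack, analogous to `sum` in the kernel.
        float sum = 0;
        for( std::size_t a = x; a < x + window; ++a )
            sum += in[ a ];
        sum /= float( window );
        // Single write to the output after the accumulation is finished.
        out[ x ] = sum;
    }
    return out;
}
```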
54 changes: 18 additions & 36 deletions examples/nbody/nbody.cpp
@@ -70,22 +70,18 @@ pPInteraction(
)
-> void
{
// Creating tempory object for distance on stack:
using DistanceDD = llama::GetTypeFromUID< Particle, dd::Pos >;
auto distance = llama::tempAlloc< 1, DistanceDD >();
distance() = p1( dd::Pos() );
distance() -= p2( dd::Pos() );

distance() *= distance(); //square for each element
// Creating temporary virtual datum object for distance on stack:
auto distance = p1( dd::Pos() ) - p2( dd::Pos() );
distance *= distance; //square for each element
Element distSqr = EPS2 +
distance()( dd::X() ) +
distance()( dd::Y() ) +
distance()( dd::Z() );
distance( dd::X() ) +
distance( dd::Y() ) +
distance( dd::Z() );
Element distSixth = distSqr * distSqr * distSqr;
Element invDistCube = 1.0f / sqrtf( distSixth );
Element s = p2( dd::Mass() ) * invDistCube;
distance() *= s * ts;
p1( dd::Vel() ) += distance();
distance *= s * ts;
p1( dd::Vel() ) += distance;
}

template<
@@ -292,14 +288,7 @@ struct MoveKernel

LLAMA_INDEPENDENT_DATA
for ( auto pos = start; pos < end; ++pos )
{
// Creating tempory object for distance on stack:
using VelocityDD = llama::GetTypeFromUID< Particle, dd::Vel >;
auto velocity = llama::tempAlloc< 1, VelocityDD >();
velocity() = particles( pos )( dd::Vel() );
velocity() *= ts;
particles( pos )( dd::Pos() ) += velocity();
}
particles( pos )( dd::Pos() ) += particles( pos )( dd::Vel() ) * ts;
}
};

Expand Down Expand Up @@ -455,29 +444,22 @@ int main(int argc,char * * argv)

chrono.printAndReset("Alloc");

std::default_random_engine generator;
std::mt19937_64 generator;
std::normal_distribution< Element > distribution(
Element( 0 ), // mean
Element( 1 ) // stddev
);
auto seed = distribution(generator);
LLAMA_INDEPENDENT_DATA
for (std::size_t i = 0; i < problemSize; ++i)
{
//~ auto temp = llama::tempAlloc< 1, Particle >();
//~ temp(dd::Pos(), dd::X()) = distribution(generator);
//~ temp(dd::Pos(), dd::Y()) = distribution(generator);
//~ temp(dd::Pos(), dd::Z()) = distribution(generator);
//~ temp(dd::Vel(), dd::X()) = distribution(generator)/Element(10);
//~ temp(dd::Vel(), dd::Y()) = distribution(generator)/Element(10);
//~ temp(dd::Vel(), dd::Z()) = distribution(generator)/Element(10);
hostView(i) = seed;
//~ hostView(dd::Pos(), dd::X()) = seed;
//~ hostView(dd::Pos(), dd::Y()) = seed;
//~ hostView(dd::Pos(), dd::Z()) = seed;
//~ hostView(dd::Vel(), dd::X()) = seed;
//~ hostView(dd::Vel(), dd::Y()) = seed;
//~ hostView(dd::Vel(), dd::Z()) = seed;
auto temp = llama::stackVirtualDatumAlloc< Particle >();
temp(dd::Pos(), dd::X()) = distribution(generator);
temp(dd::Pos(), dd::Y()) = distribution(generator);
temp(dd::Pos(), dd::Z()) = distribution(generator);
temp(dd::Vel(), dd::X()) = distribution(generator)/Element(10);
temp(dd::Vel(), dd::Y()) = distribution(generator)/Element(10);
temp(dd::Vel(), dd::Z()) = distribution(generator)/Element(10);
hostView(i) = temp;
}

chrono.printAndReset("Init");
43 changes: 28 additions & 15 deletions include/llama/Factory.hpp
@@ -18,7 +18,20 @@

#pragma once

#include "View.hpp"
//#include "View.hpp"
// Forward declaration instead of include as the View.hpp needs to use
// factories itself in some VirtualDatum overloads
namespace llama
{

template<
typename T_Mapping,
typename T_BlobType
>
struct View;

} // namespace llama

#include "allocator/Vector.hpp"
#include "allocator/Stack.hpp"
#include "mapping/One.hpp"
@@ -164,47 +177,47 @@ struct Factory
* only one element laying on the stack avoiding costly allocation operations.
* \tparam dimension dimension of the view
* \tparam DatumDomain the datum domain for the one element mapping
* \see tempAlloc
* \see stackViewAlloc
*/
template<
std::size_t dimension,
typename DatumDomain
std::size_t T_dimension,
typename T_DatumDomain
>
using OneOnStackFactory =
llama::Factory<
llama::mapping::One<
UserDomain< dimension >,
DatumDomain
UserDomain< T_dimension >,
T_DatumDomain
>,
llama::allocator::Stack<
SizeOf<DatumDomain>::value
SizeOf<T_DatumDomain>::value
>
>;

/** Uses the \ref OneOnStackFactory to allocate one (probably temporary) element
* for a given dimension and dautm domain on the stack (no costly allocation).
* for a given dimension and datum domain on the stack (no costly allocation).
* \tparam dimension dimension of the view
* \tparam DatumDomain the datum domain for the one element mapping
* \return the allocated view
* \see OneOnStackFactory
*/
template<
std::size_t dimension,
typename DatumDomain
std::size_t T_dimension,
typename T_DatumDomain
>
LLAMA_FN_HOST_ACC_INLINE
auto
tempAlloc()
stackViewAlloc()
-> decltype(
OneOnStackFactory<
dimension,
DatumDomain
T_dimension,
T_DatumDomain
>::allocView()
)
{
return OneOnStackFactory<
dimension,
DatumDomain
T_dimension,
T_DatumDomain
>::allocView();
}

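
The change to `Factory.hpp` replaces `#include "View.hpp"` with a forward declaration of `llama::View`, because `View.hpp` now needs the factories itself. The single-file sketch below shows why that is enough, assuming heavily simplified stand-ins: the `View` and `Factory` here only borrow the names and are not the LLAMA definitions. A function may be declared with an incomplete return type; the type only has to be complete where the function is defined or instantiated.

```cpp
#include <cassert>

// Forward declaration only: View is an incomplete type from here on.
template<
    typename T_Mapping,
    typename T_BlobType
>
struct View;

// Simplified stand-in for the factory. Declaring allocView() with an
// incomplete return type is allowed.
template<
    typename T_Mapping,
    typename T_BlobType
>
struct Factory
{
    static View< T_Mapping, T_BlobType > allocView();
};

// In the real code the full definition would come from another header.
template<
    typename T_Mapping,
    typename T_BlobType
>
struct View
{
    T_Mapping mapping;
    T_BlobType blob;
};

// The definition of allocView() appears after View is complete.
template<
    typename T_Mapping,
    typename T_BlobType
>
View< T_Mapping, T_BlobType >
Factory< T_Mapping, T_BlobType >::allocView()
{
    // Value-initializes both members.
    return View< T_Mapping, T_BlobType >{};
}
```

With this layout either header can be included first, as long as both are visible before `allocView()` is instantiated.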