Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Event detection w/ ODEINT #203

Open
agerlach opened this issue May 3, 2016 · 7 comments
Open

Event detection w/ ODEINT #203

agerlach opened this issue May 3, 2016 · 7 comments

Comments

@agerlach
Copy link

agerlach commented May 3, 2016

I would like to use VexCL w/ ODEINT to do "event detection" or for finding "zero crossings", but I am struggling to determine the best way to do this.

I would like to use event detection for two different purposes: 1) detect a condition for stopping a simulation, e.g. drop a ball in the atmosphere and detect ground impact to stop simulation, 2) state the state of a system at specific state values, e.g. drop a ball in the atmosphere and record the state for every 10 m descended.

I started by trying to modify odeint-v2/examples/find_crossing.cpp using vex::symbolic (included below). I got the following compilation errors though:

algorithm:859:13: No viable conversion from 'vex::symbolic' to 'bool'
/usr/local/include/boost/numeric/odeint/algebra/default_operations.hpp:443:34: No matching function for call to 'abs'
/usr/local/include/boost/numeric/odeint/algebra/detail/norm_inf.hpp:34:28: No matching function for call to 'abs'

The first error seems related to std::find_if. Are the 2nd two potentially related to headmyshoulder/odeint-v2#107?

After looking at the example some more, I think that this approach doesn't make much sense when using VexCL. One thought I had was to modify vexcl/examples/symbolic.cpp like this:

    auto X_IC = X; // store initial condition. States are [z, z_dot]
    auto X_previous = X; 
    vex::vector<float> current_event(ctx, n);
    current_event = X_IC[0];
    vex::vector<bool> justCrossed(ctx,n);
    vex::Reductor<float, vex::SUM> sum(ctx);
    for(value_type t = 0; t < t_max; t += dt) {
        kernel(X, Parameters );
        justCrossed = (X[0] < current_event && X_previous[0] > current_event)
        if(sum(justCrossed) > 0) {   // did any cross the events?
             current_event = current_event - 10 * justCrossed; // decrement event by 10 m
             Save X and X_previous...
             if(sum(current_event < 0) == 0)
                 break;
        }

        X_previous = X;     
      }

This seems like an ugly approach though. Do you have any suggestions or thoughts?

In the ideal world, I would like to do variable time step integration and be able to adjust the time steps in order to detect state events w/in a specified threshold.

#include <iostream>
#include <utility>
#include <algorithm>
#include <array>
#include <iostream>

#include <boost/numeric/odeint/stepper/runge_kutta_dopri5.hpp>
#include <boost/numeric/odeint/stepper/generation.hpp>
#include <boost/numeric/odeint/iterator/adaptive_iterator.hpp>
#include <boost/numeric/odeint/external/vexcl/vexcl_norm_inf.hpp>

#include <vexcl/devlist.hpp>
#include <vexcl/vector.hpp>
#include <vexcl/multivector.hpp>
#include <vexcl/generator.hpp>
#include <vexcl/element_index.hpp>
#include <vexcl/vexcl.hpp>


namespace odeint = boost::numeric::odeint;


typedef float value_type;
typedef vex::symbolic< value_type > sym_value;
typedef std::array<sym_value, 3> sym_state;

struct sys_func
{

    const sym_value &gam;

    sys_func(const sym_value &gam) : gam(gam){}

    template <class Sig>
    struct result {
        typedef void type;
    };

    void operator()( const sym_state &x , sym_state &dxdt , value_type ) const
    {
        dxdt[0] = x[1];
        dxdt[1] = -x[0] - gam * x[1];
    }
};

struct energy_condition {

    // defines the threshold crossing in terms of a boolean functor

    double m_min_energy;

    energy_condition(const double min_energy)
    : m_min_energy(min_energy) { }

    sym_value energy(const sym_state &x) {
        return 0.5 * x[1] * x[1] + 0.5 * x[0] * x[0];
    }

    vex::symbolic< bool > operator()(const sym_state &x) {
        // becomes true if the energy becomes smaller than the threshold
        return energy(x) <= m_min_energy;
    }
};


int main(int argc, char **argv)
{

    size_t n;
    using namespace std;

    n = argc > 1 ? atoi( argv[1] ) : 1024;

    vex::Context ctx( vex::Filter::DoublePrecision && vex::Filter::GPU );
    cout << ctx << endl;

    const value_type t_start = 0.0;
    const value_type t_end = 10.0;
    const value_type dt = 0.1;
    const value_type threshold = 0.1;

    // Real state initialization:
    value_type gam_min = 1.;
    value_type gam_max = 10.;
    value_type dGam   = (gam_max - gam_min) / (n - 1);
    vex::vector<value_type> gam(ctx, n);
    gam = gam_min + dGam * vex::element_index();

    // Custom kernel body will be recorded here:
    std::ostringstream body;
    vex::generator::set_recorder(body);

    sym_state sym_S = {
        sym_value(sym_value::VectorParameter),
        sym_value(sym_value::VectorParameter)
    };
    sym_value sym_gam(sym_value::VectorParameter, sym_value::Const);

    // Stop condition
    energy_condition cond(threshold);
    sym_state x_cond;
    value_type t_cond;

    //// Find_condition
    sys_func sys(sym_gam);

    odeint::runge_kutta_dopri5<
            sym_state , value_type , sym_state , value_type ,
            odeint::range_algebra , odeint::default_operations
            > sym_stepper;  // range_valgebra vs vector_space_algebra???

    auto stepper = odeint::make_dense_output(1.0e-6, 1.0e-6, sym_stepper);

    auto ode_range = odeint::make_adaptive_range(std::ref(stepper), sys, sym_S,
                                                 t_start, t_end, dt);


    // find the step where the condition changes
    auto found_iter = std::find_if(ode_range.first, ode_range.second, cond);
    auto kernel = vex::generator::build_kernel(ctx, "crossing", body.str(),
                                               sym_S[0], sym_S[1], sym_gam
                                               );

    ctx.finish();
}
@ddemidov
Copy link
Owner

ddemidov commented May 4, 2016

The main problem here is that algorithms that use vex::symbolic are not allowed to use any data-dependent branching. This is because C++ does not allow overloading conditional operators. So I don't think you could make odeint's adaptive algorithms work with vexcl symbolics. Still, it should be possible to get a coarse estimation of the occurred events. "Coarse" here is used in the sense you can not refine your time step to get a better estimation (or may be you could, up to some point, with a series of kernels generated for smaller and smaller time steps).

The idea is that the algorithm recorder (std::ostringstream body in the example above) is just an output stream, and you can put your own code into it. Say, you want to count for each of the balls the number of 10m levels it descended since the start of the experiment. You could do something like this (I did not test if this actually compiles):

// Custom kernel body will be recorded here:
std::ostringstream body;
vex::generator::set_recorder(body);

// Two sets of state variables to be able to track condition inside single kernel.
sym_state X_old = {
    sym_value(sym_value::VectorParameter, sym_value::Const),
    sym_value(sym_value::VectorParameter, sym_value::Const)
};

sym_state X_new = {
    sym_value(sym_value::VectorParameter),
    sym_value(sym_value::VectorParameter)
};

// Number of levels passed by each ball
vex::symbolic<int> sym_nlev(vex::symbolic<int>::VectorParameter);

// Next level for each of the balls:
vex::symbolic<double> sym_next_lev(vex::symbolic<double>::VectorParameter);

sym_value sym_gam(sym_value::VectorParameter, sym_value::Const);

// Non-adaptive stepper:
odeint::runge_kutta4<
    sym_state, value_type, sym_state, value_type ,
    odeint::range_algebra, odeint::default_operations
    > sym_stepper;

sys_func sys(sym_gam);

// Start recording the kernel.
X_new[0] = X_old[0];
X_new[1] = X_old[1];

sym_stepper.do_step(std::ref(sys), X_new, 0, dt);

// Check if the next level has been passed, advance it in case it has:
vex::symbolic<bool> passed = (X_old[1] >= next_lev && next_lev <= X_new[1]);

// We can not use a conditional directly, but we can do this:
body << "if (" << passed /*this will dump the name of the generated variable*/ << ") {\n";
sym_nlev += 1;
sym_next_lev -= 10;
body << "}\n";

// Generate the kernel
auto kernel = vex::generator::build_kernel(ctx, "crossing", body.str(),
                   X_old[0], X_old[1], X_new[0], X_new[1], sym_gam,
                   sym_nlev, sym_next_lev
                   );

I hope you will be able to work something out from this.

@agerlach
Copy link
Author

agerlach commented May 4, 2016

The main problem here is that algorithms that use vex::symbolic are not allowed to use any data-dependent branching. This is because C++ does not allow overloading conditional operators.

That makes sense. Injecting code into the output stream is a pretty simple work around though.

"Coarse" here is used in the sense you can not refine your time step to get a better estimation (or may be you could, up to some point, with a series of kernels generated for smaller and smaller time steps).

Is there any reason why dt cannot be vex::symbolic? I could then do something like this (note: I realize I will have to put the if-else logic into the output stream. I kept it like this since it would be easier to see the logic):

...
vex::symbolic<float> sym_dt(vex::symbolic<float>::VectorParameter);
sym_stepper.do_step(std::ref(sys), X_new, 0, sym_dt);

// Check if the next level has been passed, advance it in case it has:
vex::symbolic<bool> passed = (X_old[0] >= next_lev && next_lev <= X_new[0]);

// We can not use a conditional directly, but we can do this:
if (pass) {
  if(abs(X_new[0] - sym_next_lev) < 0.01) { // Within detection threshold
      sym_nlev +=1;
      sym_next_lev -=10;
      dt = 0.1; // Reset time step to original step size;
  } else {
    X_new[0] = X_old[0];
    X_new[1] = X_old[1]; //Reset state
    dt /= 0.5;           //Reduce step size
  }  
}

// Generate the kernel
auto kernel = vex::generator::build_kernel(ctx, "crossing", body.str(),
                   X_old[0], X_old[1], X_new[0], X_new[1], sym_gam,
                   sym_nlev, sym_next_lev,sym_dt
                   );

@ddemidov
Copy link
Owner

ddemidov commented May 4, 2016

Is there any reason why dt cannot be vex::symbolic?

I don't see any right away, but with the logic you showed the update to dt will be wasted anyway (the kernel will finish right after the update). May be you could get away with a while/for loop around the odeint stepper application? So that single kernel would iterate until a condition is met (be aware of the work imbalance between GPU threads though)?

@agerlach
Copy link
Author

agerlach commented May 4, 2016

I don't see any right away, but with the logic you showed the update to dt will be wasted anyway (the kernel will finish right after the update).

Why is that? Wouldn't the updated dt persist between kernels calls? I would then do something like:

    vex::vector<value_type> dt(ctx, n);
    vex::Reductor<float, vex::SUM> sum(ctx);

    // Integration loop:
    for(int ii = 0; ii < max_iterations; ++ii) {
        kernel(..., dt);
        if (sum(sym_nlvl) == num_lvls * n)   // check if all num_lvls has been detected for all n systems
             break;
     }

Maybe you could get away with a while/for loop around the odeint stepper application?

I may have missed something, but I thought that is how the kernel would be used in your example. Were you thinking something else?

be aware of the work imbalance between GPU threads though?

I guess in my example above, there really isn't a "load imbalance", but wasted computation. Example: I have n systems, but 1 of them descends much slower than the rest. Here, each system will continue to simulate past the last desired event until the last event for all the systems is detected. For my application, the time-scales between systems will not be drastically different.

@ddemidov
Copy link
Owner

ddemidov commented May 4, 2016

Why is that? Wouldn't the updated dt persist between kernels calls? I would then do something like:

Ok, if you make dt a vector parameter, then sure.

I may have missed something, but I thought that is how the kernel would be used in your example. Were you thinking something else?

I meant a loop inside the kernel, not the host-side loop that launches the kernel.

@agerlach
Copy link
Author

agerlach commented May 4, 2016

I meant a loop inside the kernel

Would you implement the loop in the kernel by manually updating the output stream, e.g.

body << "for(int i = 0; i < max_iterations; ++i) { \n";
sym_stepper.do_step(std::ref(sys), X_new, 0, dt);
body <<"if( " << stopping_condition_here << ") {break;}";

... // make updates and adjust dt
body << "}\n";

Based on your experience do you think I would be best off trying the host- or device-side loop approach first?

@ddemidov
Copy link
Owner

ddemidov commented May 4, 2016

Would you implement the loop in the kernel by manually updating the output stream

That was what I meant, yes.

Based on your experience do you think I would be best off trying the host- or device-side loop approach first?

I don't have any experience here, but I would say that (a) this needs some testing, and (b) host-side loop may be more safe, because long-running kernels may be a problem in some environments (e.g. windows watchdog process might kill any GPU kernel that runs for more than 5 seconds).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants