Runtime option for limiting optimistic lookahead (ROSS-org#107)
In irregular simulations, runs can crash at runtime because memory is exhausted.
This often happens because some LPs explore events too optimistically in
virtual time and are repeatedly rolled back. One user-controlled way of avoiding
these crashes is to restrict the optimistic lookahead in terms of virtual time.

Option added: max-opt-lookahead
Unit: virtual time

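If max-opt-lookahead is set, a PE starts a GVT computation whenever
(LVT - GVT) >= max-opt-lookahead, in addition to the regular GVT interval, so
it repeatedly computes GVT until (LVT - GVT) < max-opt-lookahead. Here LVT is
the timestamp of the next event in the PE's pending queue.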

Change-Id: I0f81b75628125e1fd2e71eddaa4116025c00a1c6
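
The trigger rule described in the message can be sketched in isolation. The following is a minimal sketch, not the ROSS implementation: `gvt_state` and `should_start_gvt` are hypothetical names that mirror the condition in `tw_gvt_step1`, and virtual time is assumed to be a `double`.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel state read by tw_gvt_step1(). */
typedef struct {
    unsigned int gvt_cnt;             /* scheduler passes since the last GVT */
    unsigned int gvt_interval;        /* mirrors g_tw_gvt_interval */
    double gvt;                       /* last computed GVT */
    unsigned long long max_lookahead; /* mirrors g_tw_max_opt_lookahead */
    bool computing;                   /* true while a GVT round is in flight */
} gvt_state;

/*
 * Returns true when a new GVT round should start. next_event_ts plays the
 * role of tw_pq_minimum(me->pq), the timestamp of the next pending event.
 */
static bool should_start_gvt(gvt_state *s, double next_event_ts)
{
    if (s->computing)
        return false; /* a round is already in progress */

    s->gvt_cnt++;
    bool interval_pending = s->gvt_cnt < s->gvt_interval;
    bool within_lookahead = (next_event_ts - s->gvt) < (double)s->max_lookahead;

    /* Skip GVT only if the interval has not elapsed AND the PE has not
     * run too far ahead of GVT in virtual time. */
    return !(interval_pending && within_lookahead);
}
```

With an interval of 16, a GVT of 100.0, and a limit of 50, a next-event timestamp of 120.0 does not force a round on the first pass, while 150.0 does, because the gap is no longer strictly below the limit.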
nikhil-jain authored and gonsie committed Mar 1, 2017
1 parent de26e58 commit 53b7fca
Showing 4 changed files with 6 additions and 3 deletions.
5 changes: 3 additions & 2 deletions core/gvt/mpi_allreduce.c
@@ -65,7 +65,7 @@ void
 tw_gvt_step1(tw_pe *me)
 {
     if(me->gvt_status == TW_GVT_COMPUTE ||
-        ++gvt_cnt < g_tw_gvt_interval)
+        (++gvt_cnt < g_tw_gvt_interval && (tw_pq_minimum(me->pq) - me->GVT < g_tw_max_opt_lookahead)))
         return;

     me->gvt_status = TW_GVT_COMPUTE;
@@ -77,7 +77,8 @@ tw_gvt_step1_realtime(tw_pe *me)
     unsigned long long current_rt;

     if( (me->gvt_status == TW_GVT_COMPUTE) ||
-        ( (current_rt = tw_clock_read()) - g_tw_gvt_interval_start_cycles < g_tw_gvt_realtime_interval))
+        ( ((current_rt = tw_clock_read()) - g_tw_gvt_interval_start_cycles < g_tw_gvt_realtime_interval)
+          && (tw_pq_minimum(me->pq) - me->GVT < g_tw_max_opt_lookahead)))
     {
         /* if( me->node == 0 ) */
         /* { */
1 change: 1 addition & 0 deletions core/ross-extern.h
@@ -27,6 +27,7 @@ extern tw_lpid g_tw_rng_default;
 extern tw_seed g_tw_rng_seed;
 extern unsigned int g_tw_mblock;
 extern unsigned int g_tw_gvt_interval;
+extern unsigned long long g_tw_max_opt_lookahead;
 extern unsigned long long g_tw_gvt_realtime_interval;
 extern unsigned long long g_tw_gvt_interval_start_cycles;
 extern tw_stime g_tw_ts_end;
2 changes: 1 addition & 1 deletion core/ross-global.c
@@ -68,7 +68,7 @@ tw_stime g_tw_min_detected_offset=DBL_MAX;
  */
 unsigned int g_tw_mblock = 16;
 unsigned int g_tw_gvt_interval = 16;
-
+unsigned long long g_tw_max_opt_lookahead = ULLONG_MAX;
 unsigned long long g_tw_gvt_realtime_interval; // calculated at runtime
 unsigned long long g_tw_gvt_interval_start_cycles = 0;

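
The default of `ULLONG_MAX` leaves the limit effectively disabled. The sketch below illustrates why, assuming virtual timestamps are `double` values (an assumption suggested by the `DBL_MAX` initializer in the hunk header); `within_lookahead` is a hypothetical helper, not a ROSS function. In a comparison between a `double` and an `unsigned long long`, C converts the integer operand to `double`, so the default becomes roughly 1.8e19 and any realistic virtual-time gap stays below it.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Same comparison shape as the patch: a double-valued gap in virtual time
 * checked against an unsigned long long cap.
 */
static bool within_lookahead(double next_event_ts, double gvt,
                             unsigned long long max_lookahead)
{
    /* max_lookahead is converted to double by the usual arithmetic
     * conversions before the comparison is made. */
    return (next_event_ts - gvt) < max_lookahead;
}
```

Calling `within_lookahead(1e15, 0.0, ULLONG_MAX)` returns true, so the default does not force extra GVT rounds for such a gap. Because the option is an integer, the limit can only be set in whole units of virtual time.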
1 change: 1 addition & 0 deletions core/tw-setup.c
@@ -20,6 +20,7 @@ static const tw_optdef kernel_options[] = {
     TWOPT_UINT("extramem", g_tw_events_per_pe_extra, "Number of extra events allocated per PE."),
     TWOPT_UINT("buddy-size", g_tw_buddy_alloc, "delta encoding buddy system allocation (2^X)"),
     TWOPT_UINT("lz4-knob", g_tw_lz4_knob, "LZ4 acceleration factor (higher = faster)"),
+    TWOPT_ULONGLONG("max-opt-lookahead", g_tw_max_opt_lookahead, "Optimistic simulation: maximum lookahead allowed in virtual clock time"),
 #ifdef AVL_TREE
     TWOPT_UINT("avl-size", g_tw_avl_node_count, "AVL Tree contains 2^avl-size nodes"),
 #endif