
prov/gni: fix a problem with progress #1347

Merged
merged 1 commit into ofi-cray:master from topic/fix_progress_problem on May 20, 2017

Conversation

hppritcha
Member

Turns out that the VC refactor had another impact:
it causes all data progress to be delayed until
the app starts calling fi_cq_read, etc. This
brings out all kinds of issues with one-sided
programming models.

Fix verified by SOS developers.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>

@sungeunchoi

Is this for manual or our pseudo-auto progress?

@chuckfossen

Are you saying that we used to progress before the VC refactor, even in manual progress mode? Or is this auto progress? And what happens to the SOS app: does it fail before it starts calling cq read?

@hppritcha
Member Author

hppritcha commented May 18, 2017

I've attached the SOS app. Note that it uses extensions to OpenSHMEM, but you can see what it's trying to do: post a whole bunch of non-blocking AMOs, then go into a shmem_quiet.

#include <shmem.h>
#include <shmemx.h>
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <sys/time.h>
#include <time.h>

double get_wtime(void)
{
    /* Wall-clock time in seconds (the printfs below report seconds,
     * and the busy-wait compares against 2.0 seconds). */
    double wtime = 0.0;

#ifdef CLOCK_MONOTONIC
    struct timespec tv;
    clock_gettime(CLOCK_MONOTONIC, &tv);
    wtime = (double)tv.tv_sec;
    wtime += (double)tv.tv_nsec / 1e9;
#else
    struct timeval tv;
    gettimeofday(&tv, NULL);
    wtime = (double)tv.tv_sec;
    wtime += (double)tv.tv_usec / 1e6;
#endif
    return wtime;
}

int main(int argc, char **argv) {
    if (argc != 2) {
        fprintf(stderr, "usage: %s natomics-per-pe\n", argv[0]);
        return 1;
    }

    const int natomics_per_pe = atoi(argv[1]);

    shmem_init();

    const int target_pe = (shmem_my_pe() + 1) % shmem_n_pes();
    unsigned long long *mem = (unsigned long long *)shmem_malloc(sizeof(*mem));
    assert(mem);

    double start_time, mid_time, start_quiet_time, end_time;

    // Warm up
    start_time = get_wtime();
    for (int i = 0; i < natomics_per_pe; i++) {
        shmemx_ulonglong_atomic_or(mem, 0xff, target_pe);
    }
    mid_time = get_wtime();
    shmem_quiet();
    end_time = get_wtime();


    start_time = get_wtime();
    for (int i = 0; i < natomics_per_pe; i++) {
        shmemx_ulonglong_atomic_or(mem, 0xff, target_pe);
    }
    mid_time = get_wtime();
    while (get_wtime() - mid_time < 2.0) ;
    start_quiet_time = get_wtime();
    shmem_quiet();
    end_time = get_wtime();
    printf("PE %d took %f seconds to issue, slept for %f seconds, %f seconds "
            "to complete\n", shmem_my_pe(), mid_time - start_time,
            start_quiet_time - mid_time, end_time - start_quiet_time);

    shmem_barrier_all();

    if (shmem_my_pe() == 0) {
        printf("\n");
    }
    shmem_barrier_all();

    start_time = get_wtime();
    for (int i = 0; i < natomics_per_pe; i++) {
        shmemx_ulonglong_atomic_or(mem, 0xff, target_pe);
    }
    mid_time = get_wtime();
    shmem_quiet();
    end_time = get_wtime();
    printf("PE %d took %f seconds to issue, did not sleep, %f seconds "
            "to complete\n", shmem_my_pe(), mid_time - start_time,
            end_time - mid_time);

    shmem_finalize();

    return 0;
}

sung: this is for either progress model.

If one tries to post many millions of atomic ops, OOMs are observed.

@chuckfossen

Normally the progress happens in a separate thread, true? Is it ok to call gnix_nic_progress in the sender's thread?

@hppritcha
Member Author

No, progress for small atomics doesn't happen from the progress thread unless you turn on ZTI's IRQ-for-every-op option; then you get high latency and a nose-dive in message throughput rates.

@chuckfossen

So, I'm hearing this won't affect latency.

@chuckfossen

Should we have a similar addition to _gnix_sendv as well?

@hppritcha
Member Author

I'll add one for _gnix_sendv too. Good catch.
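The fix amounts to driving progress from the sender's thread instead of waiting for the app to call fi_cq_read. As a rough, self-contained sketch of that pattern (the real change lives in the provider's _gnix_send/_gnix_sendv paths and calls _gnix_nic_progress; the toy_ names and the backlog limit below are illustrative, not the provider's actual API):

#include <stdio.h>

/* Illustrative stand-in for provider state (hypothetical). */
static int pending_txds = 0;    /* transactions queued but not yet reaped */

/* Stand-in for _gnix_nic_progress(): reap completed transactions. */
static void toy_nic_progress(void)
{
    pending_txds = 0;           /* pretend everything completed */
}

/* Send path: queue a descriptor, and if the backlog grows too large,
 * kick progress inline rather than deferring it all to fi_cq_read. */
#define TXD_BACKLOG_LIMIT 128

static void toy_send(void)
{
    pending_txds++;
    if (pending_txds >= TXD_BACKLOG_LIMIT)
        toy_nic_progress();     /* keeps the backlog bounded, avoiding OOM */
}

int main(void)
{
    /* Mimic the SOS reproducer: post a large burst of operations. */
    for (int i = 0; i < 1000000; i++)
        toy_send();
    printf("pending after 1M sends: %d\n", pending_txds);
    return 0;
}

Without the inline progress kick, pending_txds would grow without bound during the burst, which is the OOM behavior described above.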

Turns out that the VC refactor had another impact:
it causes all data progress to be delayed until
the app starts calling fi_cq_read, etc. This
brings out all kinds of issues with one-sided
programming models.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
@hppritcha
Member Author

Added similar code in the sendv path.

@hppritcha hppritcha merged commit 93c083b into ofi-cray:master May 20, 2017
hppritcha added a commit to hppritcha/libfabric that referenced this pull request May 25, 2017
Turns out that the VC refactor had another impact:
it causes all data progress to be delayed until
the app starts calling fi_cq_read, etc. This
brings out all kinds of issues with one-sided
programming models.

upstream merge of ofi-cray#1347

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
(cherry picked from commit ofi-cray/libfabric-cray@83b9c8f)
@hppritcha hppritcha deleted the topic/fix_progress_problem branch September 21, 2017 21:37