Issue with runguard #54

edrdo · 2022-07-20T10:55:36Z

I have been using CodeRunner with C++ programs. To work around CodeRunner not executing further tests once a test crashes the program, I tried to change the template so that each test case runs in its own subprocess created with fork().

It does not work and I believe the reason has to do with the way runguard works. Consider the following program.

#include <unistd.h>
#include <sys/wait.h>
#include <stdio.h>
#include <stdlib.h>

void doit(int x) {
  pid_t pid = fork();
  if (pid == 0) {
    printf("SON %d %d\n", x, getpid());
    exit(0);
  } else if (pid > 0) {
    waitpid(pid, 0, 0);
    printf("PARENT %d %d\n", x, getpid());
  } else {
    printf("FAIL %d\n", x);
  }
}
int main() {
  puts("A");
  doit(0);
  puts("B");
  doit(1);
  puts("C");
  doit(2);
}

During a normal execution of the program I get (as expected) something like:

A
SON 0 1963
PARENT 0 1962
B
SON 1 1964
PARENT 1 1962
C
SON 2 1965
PARENT 2 1962

When I execute the program through runguard I get surprising output, which makes me believe fork() behaves differently:

A
SON 0 2070
A
PARENT 0 2069
B
SON 1 2071
A
PARENT 0 2069
B
PARENT 1 2069
C
SON 2 2072
A
PARENT 0 2069
B
PARENT 1 2069
C
PARENT 2 2069

Is there a way to fix this behavior through CR configuration options passed on to Jobe?

Thanks

trampgeek · 2022-07-20T23:41:21Z

Surprising behaviour indeed. My first thought was it couldn't happen, but you're right - it does. Looking through the runguard code I see it forks at the start. The parent is a watchdog process, which connects to the child (which runs the test code) via pipes. The watchdog loops, reading data from the pipes until the child finishes. If the child prints something and then forks, the contents of the pipe get replicated in the grandchild, so the watchdog gets the contents twice. Or something like that, anyway - I'm not totally clear on the details.
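One reading that matches the output above line for line is ordinary stdio buffering rather than anything specific to runguard. When stdout is a pipe instead of a terminal, output is fully buffered, so the text printed before fork() is still sitting in the process's buffer when the child is created, and both processes later flush their own copy. The thread does not confirm this explanation. The sketch below reproduces the effect with a private pipe; the helper name fork_with_buffered_output is hypothetical.

```cpp
#include <cstdio>
#include <string>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: writes "A\n" to a fully buffered, pipe-backed stream,
// forks, lets child and parent each write one more line, and returns
// everything that actually arrived in the pipe.
std::string fork_with_buffered_output(bool flush_before_fork) {
    int fds[2];
    if (pipe(fds) != 0) return "pipe failed";
    FILE* out = fdopen(fds[1], "w");
    if (!out) { close(fds[0]); close(fds[1]); return "fdopen failed"; }
    setvbuf(out, nullptr, _IOFBF, BUFSIZ);  // make full buffering explicit

    fputs("A\n", out);                      // stays in the user-space buffer
    if (flush_before_fork) fflush(out);     // push it into the pipe now

    pid_t pid = fork();
    if (pid < 0) { fclose(out); close(fds[0]); return "fork failed"; }
    if (pid == 0) {
        close(fds[0]);
        fputs("SON\n", out);
        fclose(out);                        // flushes the child's copy of the buffer
        _exit(0);
    }
    waitpid(pid, nullptr, 0);
    fputs("PARENT\n", out);
    fclose(out);                            // flushes the parent's buffer, closes the write end

    std::string result;
    char buf[256];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0) result.append(buf, n);
    close(fds[0]);
    return result;
}
```

Without the flush, the "A" line arrives twice, once from each process; with it, each line arrives once. If this reading is right, calling fflush(stdout) immediately before each fork() in the program from the question should remove the repeated lines.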

I "borrowed" runguard from the domjudge project many years ago and I've stuck with the old version because it doesn't have to use cgroups, which would considerably complicate the Jobe installation if I had to use them. It has proved extremely reliable, but I've never tried using it in a multitasking mode. I checked with the latest version of runguard and the behaviour under forking is unchanged. So there's no easy fix here and I think you'll have to live with it as it stands.

However, I believe you can still use a fork to avoid aborting testing when a test case fails. I've created a proof of concept question, which I attach (zipped, because this UI won't accept a .xml file). It uses a non-combinator template, which means it runs each test case separately. For each test, it forks and the child runs the test code after redirecting stderr output to stdout. The parent waits for the child to finish, then returns with a return code of 0. You could make that example into a prototype for a new question type if you want to do this often.
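The attached question is not reproduced in this thread, so the following is only a sketch of the pattern described above, not the contents of the zip file. The helper name run_isolated and its bool return value are assumptions made for illustration.

```cpp
#include <cstdio>
#include <cstdlib>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs test_body in a forked child whose stderr is
// redirected to stdout. Returns true if the child exited normally with
// status 0, false if it crashed or the fork failed.
bool run_isolated(void (*test_body)()) {
    fflush(stdout);                 // avoid duplicating buffered output in the child
    fflush(stderr);
    pid_t pid = fork();
    if (pid < 0) return false;      // fork failed
    if (pid == 0) {
        dup2(STDOUT_FILENO, STDERR_FILENO);  // error output joins the test output
        test_body();
        exit(0);                    // normal exit flushes the child's stdio buffers
    }
    int status = 0;
    waitpid(pid, &status, 0);       // the parent survives even if the child crashed
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

In a per-test template, main() would call something like this once and then return 0 regardless of the result, so a crash in the test code is not reported as a failure of the run itself.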

Does that solve your problem?

c_function_prevent_abort_on_error.zip

edrdo · 2022-07-21T20:28:05Z

Thanks, your solution had a small but important ingredient that made my own solution work: the dup2(1,2) call in the child process. The difference is that I use a combinator template. As I understand it, this is more efficient because only one C/C++ program is compiled, rather than one per test case. My template is shown next.

#include <iostream>
#include <fstream>
#include <string>
#include <cmath>
#include <vector>
#include <algorithm>
#include <unistd.h>
#include <sys/wait.h>

using namespace std;
#define SEPARATOR "#<ab@17943918#@>#"

{{ STUDENT_ANSWER }}

int main() {
{% for TEST in TESTCASES %} 
   {
       pid_t pid = fork();
       if (pid == 0) {
            dup2(1, 2);
            {{ TEST.extra }};
            {{ TEST.testcode }};
            exit(0);
       } else if (pid > 0) {
            waitpid(pid, 0, 0);
            {% if not loop.last %}cout << SEPARATOR << endl;{% endif %}

       }
   }
{% endfor %}
    return 0;
}
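The template passes 0 as the status argument of waitpid(), so the parent does not learn how the child ended. If it is useful for the test output to say that a test crashed, one option is to collect the status and turn it into a short message. This is a sketch and not part of the template above; describe_wait_status is a hypothetical name.

```cpp
#include <string>
#include <sys/wait.h>

// Hypothetical helper: turns a status value filled in by waitpid() into a
// short human-readable description.
std::string describe_wait_status(int status) {
    if (WIFEXITED(status))
        return "exited with code " + std::to_string(WEXITSTATUS(status));
    if (WIFSIGNALED(status))
        return "killed by signal " + std::to_string(WTERMSIG(status));
    return "stopped or unknown status";
}
```

In the parent branch this would mean declaring an int status, calling waitpid(pid, &status, 0), and printing the description only when the child did not exit with code 0. Any such message becomes part of the test's output, so the expected output of crashing tests would need to account for it.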

A bit of context that you may find interesting: we are compiling programs with AddressSanitizer and UndefinedBehaviorSanitizer enabled (gcc and clang support both), hence crashes in student submissions are more frequent (and deterministic!), and we need to accommodate those cases more often. An additional minor note is that AddressSanitizer uses 20 terabytes (!) of virtual memory, hence MemLimit needs to be set to 0. This line in your code thus applies to other cases beyond Matlab.

trampgeek · 2022-07-21T22:14:56Z

Yep, a combinator template is definitely better. Thanks. And I'm most impressed at the 20 terabytes figure! I've taken the liberty of pasting your last posting to the question author's forum on coderunner.org.nz so it is more easily found by other users. Hope that's OK with you.

More recent versions of runguard require the use of cgroups, and so monitor actual memory usage rather than virtual memory demands. If I used such a version the issues with programs reserving huge amounts of virtual memory would go away. But at least when I first looked into that possibility it was significantly more difficult to set up a Jobe server with cgroups enabled and I wanted the procedure to be as simple as possible to encourage its use. With a docker Jobe implementation that's presumably no longer an issue so I probably should look into that. But ... I'm primarily a teacher not a developer and the current system works fine for me.
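The difference between limiting virtual memory and limiting memory actually used can be illustrated with a small experiment. The sketch assumes Linux and assumes the limit behaves like an address-space limit (RLIMIT_AS); the thread does not say how runguard enforces its limit, and can_reserve is a hypothetical helper.

```cpp
#include <cstddef>
#include <sys/mman.h>
#include <sys/resource.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: in a child process, optionally caps the address space
// at limit_bytes (0 leaves the inherited limit unchanged), then tries to
// reserve reserve_bytes of virtual memory without ever touching it.
// Returns true if the reservation succeeded.
bool can_reserve(std::size_t reserve_bytes, std::size_t limit_bytes) {
    pid_t pid = fork();
    if (pid < 0) return false;
    if (pid == 0) {
        if (limit_bytes != 0) {
            struct rlimit lim;
            if (getrlimit(RLIMIT_AS, &lim) != 0) _exit(2);
            lim.rlim_cur = limit_bytes;            // change the soft limit only
            if (setrlimit(RLIMIT_AS, &lim) != 0) _exit(2);
        }
        // PROT_NONE + MAP_NORESERVE: address space is claimed, no pages are used.
        void* p = mmap(nullptr, reserve_bytes, PROT_NONE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
        _exit(p == MAP_FAILED ? 1 : 0);
    }
    int status = 0;
    waitpid(pid, &status, 0);
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Under a 256 MB address-space cap, a 1 GB reservation is refused even though the process never touches the memory. A limit based on actual usage would not be affected by such a reservation, which is the situation a sanitizer's large up-front reservation creates.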

If you have any further comments or suggestions, would you mind using one of the forums on https://coderunner.org.nz, please? Postings there are more accessible to the community.

Thanks for the contribution.

trampgeek closed this as completed Jul 21, 2022
