Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[LibOS] poll returns immediately instead of waiting on events #917

Open
alexandermerritt opened this issue Aug 12, 2019 · 3 comments

Comments

@alexandermerritt
Copy link
Contributor

commented Aug 12, 2019

Symptom

A process using poll() to wait for input data from a remote client via a TCP socket does not block when given an infinite timeout. The function returns immediately. This behavior does not conform to the specification, and potentially breaks some designs of server applications.

Details

From the manpage, poll() with a negative timeout argument should only return under two conditions, else block indefinitely:

  1. a file descriptor becomes ready;
  2. the call is interrupted by a signal handler

Creating a socket with family AF_INET, type SOCK_STREAM and protocol IPPROTO_TCP with flag AI_NUMERICHOST, invoking accept4() correctly blocks waiting for new client connections. However, a subsequent invocation of poll(_,_,-1) will return with value zero. If it were interrupted by a signal, it should be a negative value.

I expect poll() to block until one of the above conditions is satisfied, then only return either a positive value (a file descriptor is ready) or negative value (error or signal interruption).

This bug was encountered via enabling use of OpenBSD netcat as an application in the enclave.

I will submit a PR for adding a regression test of this. Included here is a sample of the test code I use to diagnose the problem.

Remedy

2019-08-13 -- Running the client like this gives the correct behavior (poll returns positive value):

echo 'x' | nc -v 127.0.0.1 8000

But if there is a delay between making the connection and input sent on the connection, then poll() behaves incorrectly.

nc -v 127.0.0.1 8000
# Wait 1 second
# Enter some text, hit <Enter>

2019-08-12(1) -- If one simply reinvokes poll() after receiving a zero return value, it can achieve the correct behavior, but applications may not expect to be written this way.

2019-08-12(0) -- Still investigating. Will add info here as I learn more.

Test Code

/* $OpenBSD: netcat.c,v 1.203 2019/02/26 17:32:47 jsing Exp $ */
/*
 * Copyright (c) 2001 Eric Jackson <ericj@monkey.org>
 * Copyright (c) 2015 Bob Beck.  All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 *
 * 1. Redistributions of source code must retain the above copyright
 *   notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 *   notice, this list of conditions and the following disclaimer in the
 *   documentation and/or other materials provided with the distribution.
 * 3. The name of the author may not be used to endorse or promote products
 *   derived from this software without specific prior written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
 * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
 * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
 * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
 * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
 * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
 * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
 * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
 * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
 * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 */

/*
 * Re-written nc(1) for OpenBSD. Original implementation by
 * *Hobbit* <hobbit@avian.org>.
 */

/*
 * Simplified use of netcat in Graphene for a simple regression
 * test of poll.
 *
 * Run this client
 *  nc -v 127.0.0.1 8000
 *
 * Expected: server blocks on poll waiting for client input
 * Behavior: poll returns zero, program exits with nonzero status
 */

#define _GNU_SOURCE

#include <err.h>
#include <errno.h>
#include <netdb.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <sys/un.h>
#include <unistd.h>

#include <netinet/in.h>
#include <netinet/tcp.h>
#include <netinet/ip.h>
#include <arpa/telnet.h>
#include <arpa/inet.h>
#include <linux/in6.h>

#define POLL_STDIN  0
#define POLL_NETOUT 1
#define POLL_NETIN  2
#define POLL_STDOUT 3
#define BUFSIZE     16384

int kflag = 1;
int lflag = 1;

/*
 * local_listen()
 * Returns a socket listening on a local port, binds to specified source
 * address. Returns -1 on failure.
 */
int
local_listen(const char *host, const char *port, struct addrinfo hints)
{
    struct addrinfo *res, *res0;
    int s = -1, ret, x = 1, save_errno;
    int error;

    /* Allow nodename to be null. */
    hints.ai_flags |= AI_PASSIVE;

    if ((error = getaddrinfo(host, port, &hints, &res0)))
        errx(1, "getaddrinfo: %s", gai_strerror(error));

    for (res = res0; res; res = res->ai_next) {
        if ((s = socket(res->ai_family, res->ai_socktype,
            res->ai_protocol)) < 0)
            continue;

        ret = setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &x, sizeof(x));
        if (ret == -1)
            err(1, NULL);

        ret = setsockopt(s, SOL_SOCKET, SO_REUSEPORT, &x, sizeof(x));
        if (ret == -1)
            err(1, NULL);

        if (bind(s, (struct sockaddr *)res->ai_addr,
            res->ai_addrlen) == 0) {
            break;
        }

        save_errno = errno;
        close(s);
        errno = save_errno;
        s = -1;
    }

    if (s != -1 && listen(s, 1) < 0)
        err(1, "listen");

    freeaddrinfo(res0);

    return s;
}

void
check_poll(int net_fd) {
    int timeout;
    struct pollfd pfd[2];
    int num_fds;

    /* network in */
    pfd[0].fd = net_fd;
    pfd[0].events = POLLIN;

    /* From man(2) poll:
     *
     *      The timeout argument specifies the number of milliseconds that
     *      poll() should block waiting for a file descriptor to become
     *      ready.  The call will block until either:
     *
     *             *  a file descriptor becomes ready;
     *             *  the call is interrupted by a signal handler; or
     *             *  the timeout expires.
     *
     *      Specifying a negative value in timeout means an infinite
     *      timeout.  Specifying a timeout of zero causes poll() to
     *      return immediately, even if no file descriptors are ready.
     *
     * Thus our invocation of poll should only return due to an event
     * or a signal interrupted it, and never result in a zero return
     * status.
     */
    timeout = -1;
retry_poll:
    num_fds = poll(pfd, 1, timeout);

    if (num_fds < 0) {
        if (num_fds != EINTR)
            errx(1, "unexpected polling error");
        warnx("interrupted by signal, retrying");
        goto retry_poll;
    }

    /* This is the BUG */
    if (num_fds == 0)
        errx(1, "poll() returned 0 but no timeout specified");

    /* Else, num_fds > 0 and we have correct execution */
}

#define HOST "127.0.0.1"
#define PORT "8000"

int main(void) {
    struct addrinfo hints;
    int s = -1, ret = 0;
    socklen_t len;
    union {
        struct sockaddr_storage storage;
        struct sockaddr_un forunix;
    } cliaddr;

    char* host = HOST;
    char* port = PORT;

    memset(&hints, 0, sizeof(struct addrinfo));
    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_STREAM; // tcp
    hints.ai_protocol = IPPROTO_TCP; // tcp
    hints.ai_flags |= AI_NUMERICHOST; // nflag

    s = local_listen(host, port, hints);
    if (s < 0)
        err(1, "listen failed");
    for (;;) {
        int connfd;
        len = sizeof(cliaddr);
        connfd = accept4(s, (struct sockaddr *)&cliaddr,
                &len, SOCK_NONBLOCK);
        if (connfd == -1)
            err(1, "accept");
        check_poll(connfd);
        close(connfd);
        if (!kflag && s != -1) {
            close(s);
            s = -1;
            break;
        }
    }

    if (s != -1)
        close(s);

    return ret;
}
@alexandermerritt alexandermerritt changed the title [LibOS] poll returns immediately instead of waiting on input [LibOS] poll returns immediately instead of waiting on events Aug 13, 2019
@dimakuv

This comment has been minimized.

Copy link
Contributor

commented Aug 14, 2019

One thing to note here is that pfd[0].events = POLLIN is basically ignored by Graphene's LibOS layer. Graphene does its own bookkeeping of which events can and do happen (e.g., POLLOUT only makes sense to Graphene if there is any pending data on this file descriptor, otherwise events = POLLOUT is ignored).

This "Graphene knows better" logic already led to several corner-case bugs. Ideally, we should rewrite the select/poll/epoll logic such that it takes into account input fd.events from the user.

@alexandermerritt

This comment has been minimized.

Copy link
Contributor Author

commented Aug 14, 2019

I understand. I do not consider this a corner case, as this is a valid use of poll by netcat.

I am able to observe that when human involvement plays a role in data movement, this bug appears, but when removing the human (e.g., give netcat its input immediately via scripting, or use redis-benchmark with Redis, etc.) this behavior does not seem to appear (or it does but is symptom-less because user-land does not expect this kind of behavior). Probably by the time the receiving end begins to poll, the sending end has already shipped its data and the kernel has made indication the FD has an event ready, masking this issue, and permitting some data movement.

@thomasknauth

This comment has been minimized.

Copy link
Contributor

commented Aug 28, 2019

You could try and see if this branch fixes your problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.