net: Dial fails on Solaris with gccgo #6828

Closed
4ad opened this issue Nov 24, 2013 · 9 comments

@4ad
Member

commented Nov 24, 2013

What steps will reproduce the problem?

1. Download attached file (standalone $GOROOT/src/pkg/net/dial_test.go:/TestDialer)
2. gccgo a.go -lsocket -lnsl
3. ./a.out

What is the expected output?

Nothing.

What do you see instead?

Dial failed: dial tcp4 127.0.0.1:60819: invalid argument

Which compiler are you using (5g, 6g, 8g, gccgo)?

gccgo

Which operating system are you using?

Solaris

: oos:1; uname -a
SunOS oos 5.11 omnios-b281e50 i86pc i386 i86pc
: oos:1; cat /etc/release
  OmniOS v11 r151006
  Copyright 2012-2013 OmniTI Computer Consulting, Inc. All rights reserved.
  Use is subject to license terms.
: oos:1; 

Which version are you using?  (run 'go version')

: oos:1; gccgo -v
Using built-in specs.
COLLECT_GCC=gccgo
COLLECT_LTO_WRAPPER=/opt/gcc482/bin/../libexec/gcc/x86_64-sun-solaris2.11/4.8.2/lto-wrapper
Target: x86_64-sun-solaris2.11
Configured with: ../src/configure --prefix=/home/aram/gcc482 --build=x86_64-sun-solaris2.11 --host=x86_64-sun-solaris2.11 --disable-multilib --disable-shared --enable-threads=posix --enable-languages=c,c++,go --disable-libssp --disable-nls --with-gnu-as --with-gnu-ld --enable-ld --disable-gold
Thread model: posix
gcc version 4.8.2 (GCC) 

Please provide any additional information below.

I can reproduce this on a slow machine 100% of the time. On a fast machine it fails only
around 20% of the time.

This is a truss snippet:

/1: connect(6, 0xC20082EB4C, 16, SOV_XPG4_2)    Err#150 EINPROGRESS
/1: getcontext(0xC200001850)
/1: write(5, "\0", 1)             = 1
/1: getcontext(0xC200001C50)
/1: setcontext(0xC200211220)
/1: setcontext(0xFFFFFD7FFFDFECC0)
/1: getcontext(0xC200629850)
/1: mmap(0x00000000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANON, 4294967295, 0) = 0xFFFFFD7FEE300000
/1: mmap(0x00000000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANON, 4294967295, 0) = 0xFFFFFD7FEE2F0000
/1: mmap(0x00000000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANON, 4294967295, 0) = 0xFFFFFD7FEE2E0000
/1: mmap(0x00000000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANON, 4294967295, 0) = 0xFFFFFD7FEE2D0000
/2: pollsys(0xFFFFFD7FEE4529D0, 0, 0xFFFFFD7FEE452AC0, 0x00000000) = 0
/1: mmap(0x00000000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANON, 4294967295, 0) = 0xFFFFFD7FEE2C0000
/1: lwp_sigmask(SIG_SETMASK, 0xFFBFFEEF, 0xFFFFFFF7, 0x000000FF, 0x00000000) = 0xFFBFFEFF [0xFFFFFFFF]
/1: mmap(0x00000000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_NORESERVE|MAP_ANON, 4294967295, 0) = 0xFFFFFD7FEE2B0000
/1: uucopy(0xC200829970, 0xFFFFFD7FEE2B2FE8, 24)    = 0
/1: lwp_create(0xC200829A80, LWP_DETACHED|LWP_SUSPENDED, 0xC200829DEC) = 3
/2: pollsys(0xFFFFFD7FEE4529D0, 0, 0xFFFFFD7FEE452AC0, 0x00000000) = 0
/3: lwp_create()    (returning as new lwp ...)  = 0
/1: lwp_continue(3)                 = 0
/3: setustack(0xFFFFFD7FEE430AE8)
/1: lwp_sigmask(SIG_SETMASK, 0x00000000, 0x00000000, 0x00000000, 0x00000000) = 0xFFBFFEFF [0xFFFFFFFF]
/3: schedctl()                  = 0xFFFFFD7FEE49A020
/3: getcontext(0xFFFFFD7FEE2B2C10)
/3: getcontext(0xC200C4FC50)
/3: sigaltstack(0xFFFFFD7FEE2B2F30, 0x00000000) = 0
/3: lwp_sigmask(SIG_SETMASK, 0x00000000, 0x00000000, 0x00000000, 0x00000000) = 0xFFBFFEFF [0xFFFFFFFF]
/2: pollsys(0xFFFFFD7FEE4529D0, 0, 0xFFFFFD7FEE452AC0, 0x00000000) = 0
/3: setcontext(0xFFFFFD7FEE2B2B80)
/3: getcontext(0xC200849050)
/3: pollsys(0xC200A49C90, 2, 0x00000000, 0x00000000) = 2
/3: getcontext(0xC200849050)
/3: read(4, "\0", 100)                = 1
/3: getcontext(0xC200849050)
/2: pollsys(0xFFFFFD7FEE4529D0, 0, 0xFFFFFD7FEE452AC0, 0x00000000) = 0
/2: pollsys(0xFFFFFD7FEE4529D0, 0, 0xFFFFFD7FEE452AC0, 0x00000000) = 0
/2: lwp_sigmask(SIG_SETMASK, 0xFFBFFEEF, 0xFFFFFFF7, 0x000000FF, 0x00000000) = 0xFFBFFEFF [0xFFFFFFFF]
/2: mmap(0x00000000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_NORESERVE|MAP_ANON, 4294967295, 0) = 0xFFFFFD7FEE2A0000
/2: uucopy(0xFFFFFD7FEE452550, 0xFFFFFD7FEE2A2FE8, 24) = 0
/2: lwp_create(0xFFFFFD7FEE452660, LWP_DETACHED|LWP_SUSPENDED, 0xFFFFFD7FEE4529CC) = 4
/2: lwp_continue(4)                 = 0
/4: lwp_create()    (returning as new lwp ...)  = 0
/2: lwp_sigmask(SIG_SETMASK, 0x00000000, 0x00000000, 0x00000000, 0x00000000) = 0xFFBFFEFF [0xFFFFFFFF]
/4: setustack(0xFFFFFD7FEE4312E8)
/4: schedctl()                  = 0xFFFFFD7FEE49A030
/2: pollsys(0xFFFFFD7FEE4529D0, 0, 0xFFFFFD7FEE452AC0, 0x00000000) = 0
/4: getcontext(0xFFFFFD7FEE2A2C10)
/4: getcontext(0xC200C60C50)
/4: sigaltstack(0xFFFFFD7FEE2A2F30, 0x00000000) = 0
/4: lwp_sigmask(SIG_SETMASK, 0x00000000, 0x00000000, 0x00000000, 0x00000000) = 0xFFBFFEFF [0xFFFFFFFF]
/4: setcontext(0xFFFFFD7FEE2A2B80)
/4: getcontext(0xC200849850)
/4: accept(3, 0xC200831310, 0xC200000168, SOV_DEFAULT, 0) = 7
/4: getcontext(0xC200849850)
/4: fcntl(7, F_SETFD, 0x00000001)           = 0
/4: getcontext(0xC200849850)
/2: pollsys(0xFFFFFD7FEE4529D0, 0, 0xFFFFFD7FEE452AC0, 0x00000000) = 0
/4: fcntl(7, F_GETFL)               = 130
/4: getcontext(0xC200849850)
/4: fcntl(7, F_SETFL, FWRITE|FNONBLOCK)     = 0
/4: getsockname(7, 0xC200831380, 0xC200000170, SOV_DEFAULT) = 0
/4: getcontext(0xC200849850)
/4: setsockopt(7, tcp, TCP_NODELAY, 0xC200000198, 4, SOV_DEFAULT) = 0
/4: getcontext(0xC200849850)
/4: close(7)                    = 0
/4: getcontext(0xC200849C50)
/4: setcontext(0xC200C49C10)
/4: setcontext(0xFFFFFD7FEE2A2B60)
/4: getcontext(0xC200001850)
/2: pollsys(0xFFFFFD7FEE4529D0, 0, 0xFFFFFD7FEE452AC0, 0x00000000) = 0
/4: connect(6, 0xC20082EB4C, 16, SOV_XPG4_2)    Err#22 EINVAL
/4: getcontext(0xC200001850)
/4: close(6)                    = 0
/4: getcontext(0xC200001850)
Dial failed: dial tcp4 127.0.0.1:46372: invalid argument
/4: write(1, " D i a l   f a i l e d :".., 57)    = 57
/4: getcontext(0xC200001850)
/4: close(3)                    = 0
/4: _exit(0)

You can see that the first connect on fd 6 (LWP 1) returns EINPROGRESS, and the second connect on the same fd (LWP 4) returns EINVAL.

Attachments:

  1. test0.go.txt (748 bytes)
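
The attachment itself is not reproduced in this thread. As a rough idea of what a standalone dial test of this kind might look like, here is a sketch written for illustration only: the function name dialOnce, the 5 second timeout, and the exact sequence of calls are assumptions, not the contents of test0.go.txt. It listens on an ephemeral loopback port, accepts and immediately closes one connection in a goroutine, and dials the listener.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// dialOnce is a hypothetical helper: it starts a loopback listener whose
// only job is to accept one connection and close it, then dials it.
func dialOnce() error {
	ln, err := net.Listen("tcp4", "127.0.0.1:0")
	if err != nil {
		return fmt.Errorf("Listen failed: %v", err)
	}
	defer ln.Close()

	// Server side: accept a single connection and close it right away.
	accepted := make(chan error, 1)
	go func() {
		c, err := ln.Accept()
		if err != nil {
			accepted <- fmt.Errorf("Accept failed: %v", err)
			return
		}
		accepted <- c.Close()
	}()

	// Client side: the timeout is an assumption so the sketch cannot hang.
	d := &net.Dialer{Timeout: 5 * time.Second}
	c, err := d.Dial("tcp4", ln.Addr().String())
	if err != nil {
		return fmt.Errorf("Dial failed: %v", err)
	}
	c.Close()
	return <-accepted
}

func main() {
	// Silent on success, matching the expected output of "Nothing".
	if err := dialOnce(); err != nil {
		fmt.Println(err)
	}
}
```

Because the server closes the connection as soon as it is accepted, the client can still be waiting for its non-blocking connect to complete when the close happens, which is the window the trace above shows.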
@ianlancetaylor

Contributor

commented Nov 24, 2013

Comment 1:

I don't know what is happening here, but it would be more informative to try this
on tip, since the polling mechanism there has completely changed.
EINVAL is a peculiar errno value to return.  The man page suggests only one possibility:
that the address length argument is wrong.  I don't see how that could be the case here.

Labels changed: added gccgo.

Owner changed to @ianlancetaylor.

@4ad

Member Author

commented Nov 24, 2013

Comment 2:

I don't have a tip gccgo; however, our gc port uses the gccgo tip polling mechanism and
exhibits the same behavior. If and when I have time, I'll try building a more recent
gccgo as well.
Yes, EINVAL there makes little sense.
@rsc

Contributor

commented Nov 27, 2013

Comment 3:

Labels changed: added go1.3maybe.

@ianlancetaylor

Contributor

commented Dec 3, 2013

Comment 5:

Gccgo is not part of Go 1.3.

Labels changed: removed go1.3maybe.

@rsc

Contributor

commented Dec 4, 2013

Comment 6:

Labels changed: added repo-gccgo.

@ianlancetaylor

Contributor

commented Dec 28, 2013

Comment 7:

This is a guess at the problem.  The relevant calls from the trace are:
/1: connect(6, 0xC20082EB4C, 16, SOV_XPG4_2)    Err#150 EINPROGRESS
/4: accept(3, 0xC200831310, 0xC200000168, SOV_DEFAULT, 0) = 7
/4: close(7)                    = 0
/4: connect(6, 0xC20082EB4C, 16, SOV_XPG4_2)    Err#22 EINVAL
In other words, I wonder whether, when the connection is accepted and closed while a
connect is still in progress, the attempt to complete the connect returns EINVAL.

Labels changed: removed priority-triage, gccgo.

@ianlancetaylor

Contributor

commented Dec 28, 2013

Comment 8:

Confirmed.  The following program exits successfully on GNU/Linux, but on Solaris it
fails with "connect: Invalid argument".
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <pthread.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Print S with the description of the current errno and exit.  */
static void
die (const char *s)
{
  perror (s);
  exit (EXIT_FAILURE);
}

/* Server side: accept one connection on the listening socket passed
   via ARG and close it immediately.  */
static void *
thread (void *arg)
{
  int o1, o3;
  struct sockaddr_in asin;
  socklen_t slen;

  o1 = *(int *) arg;
  slen = sizeof asin;
  o3 = accept (o1, (struct sockaddr *) &asin, &slen);
  if (o3 < 0)
    die ("accept");
  if (close (o3) < 0)
    die ("close");
  return NULL;
}

int
main (void)
{
  struct sockaddr_in sin;
  int o1, o2, flags;
  socklen_t slen;
  pthread_t tid;
  int i;

  /* Listen on an ephemeral loopback port.  */
  o1 = socket (AF_INET, SOCK_STREAM, 0);
  if (o1 < 0)
    die ("socket 1");
  memset (&sin, 0, sizeof sin);
  sin.sin_family = AF_INET;
  sin.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
  if (bind (o1, (struct sockaddr *) &sin, sizeof sin) < 0)
    die ("bind");
  if (listen (o1, 5) < 0)
    die ("listen");

  /* Find out which port was assigned.  */
  slen = sizeof sin;
  if (getsockname (o1, (struct sockaddr *) &sin, &slen) < 0)
    die ("getsockname");

  /* Create a non-blocking client socket.  */
  o2 = socket (AF_INET, SOCK_STREAM, 0);
  if (o2 < 0)
    die ("socket 2");
  flags = fcntl (o2, F_GETFL);
  if (flags < 0)
    die ("fcntl 1");
  if (fcntl (o2, F_SETFL, flags | O_NONBLOCK) < 0)
    die ("fcntl 2");

  /* The first connect is expected to fail with EINPROGRESS.  */
  if (connect (o2, (struct sockaddr *) &sin, sizeof sin) >= 0)
    die ("connect succeeded");
  if (errno != EINPROGRESS)
    die ("bad connect errno");

  /* Accept and close the connection in another thread.  */
  i = pthread_create (&tid, NULL, thread, (void *) &o1);
  if (i != 0)
    {
      errno = i;
      die ("pthread_create");
    }

  /* Give the thread time to accept and close, then try to complete
     the connect.  This is the call that fails on Solaris.  */
  sleep (1);
  if (connect (o2, (struct sockaddr *) &sin, sizeof sin) < 0)
    die ("connect");
  return 0;
}
@ianlancetaylor

Contributor

commented Dec 28, 2013

Comment 9:

This issue was updated by revision 672525a.

R=golang-codereviews, bradfitz, dave
CC=golang-codereviews
https://golang.org/cl/46160043
@ianlancetaylor

Contributor

commented Dec 28, 2013

Comment 10:

Should be fixed on mainline and GCC 4.8 branch.

Status changed to Fixed.

@4ad added the fixed label Dec 28, 2013

@golang locked and limited conversation to collaborators Jun 25, 2016

This issue was closed.
