opal lifo/fifo failures on SPARC #317

Closed
jsquyres opened this issue Dec 17, 2014 · 5 comments

@jsquyres
Member

Siegmar reported the following failures on Linux on SPARC:

Linux, gcc-4.9.2:
=================
...
SUPPORT: OMPI Test Passed: opal_pointer_array: (14 tests)
PASS: opal_pointer_array
/bin/sh: line 5:  5746 Illegal instruction     ${dir}$tst
FAIL: opal_lifo
/bin/sh: line 5:  5776 Illegal instruction     ${dir}$tst
FAIL: opal_fifo

Also, on Solaris/SPARC, the tests fail to compile -- apparently timersub() doesn't exist there?

Solaris 10 Sparc and x86_64, Sun C 5.13 and gcc-4.9.2:
======================================================
...
 CC       opal_lifo.o
"../../../openmpi-dev-557-g01a24c4/test/class/opal_lifo.c", line 45: warning: implicit function declaration: timersub
 CCLD     opal_lifo
Undefined                       first referenced
symbol                             in file
timersub                            opal_lifo.o
ld: fatal: symbol referencing errors. No output written to .libs/opal_lifo
make[3]: *** [opal_lifo] Error 2

@hjelmn Can you have a look?

Link to Siegmar's original post: http://www.open-mpi.org/community/lists/users/2014/12/26012.php

jsquyres added the bug label Dec 17, 2014
jsquyres added this to the Open MPI 1.9 milestone Dec 17, 2014
@hjelmn
Member

hjelmn commented Dec 18, 2014

Ok, there are two issues here. I fixed the second one by adding a definition of timersub for when one is not already defined. Looking at the first now.
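
For readers unfamiliar with this kind of fix, here is a minimal sketch of a fallback timersub definition. It is not the actual Open MPI patch; it assumes timersub is provided as a macro where it exists (as on glibc and the BSDs), so that `#ifndef` can detect it, and the helper name `elapsed_usec` is hypothetical.

```c
#include <sys/time.h>

/* Fallback for platforms whose <sys/time.h> lacks timersub().
 * Assumption: where timersub exists, it is a macro, so #ifndef detects it. */
#ifndef timersub
#define timersub(a, b, result)                               \
    do {                                                     \
        (result)->tv_sec = (a)->tv_sec - (b)->tv_sec;        \
        (result)->tv_usec = (a)->tv_usec - (b)->tv_usec;     \
        if ((result)->tv_usec < 0) {                         \
            /* borrow one second when microseconds go negative */ \
            --(result)->tv_sec;                              \
            (result)->tv_usec += 1000000;                    \
        }                                                    \
    } while (0)
#endif

/* Hypothetical helper: elapsed time between two timestamps, in microseconds. */
static long elapsed_usec(struct timeval *start, struct timeval *end)
{
    struct timeval diff;
    timersub(end, start, &diff);
    return (long)diff.tv_sec * 1000000L + (long)diff.tv_usec;
}
```

A test would take one timestamp with gettimeofday() before the timed loop and one after, then pass both to the helper.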

@hjelmn
Member

hjelmn commented Dec 18, 2014

@jsquyres Looking at the original email now. It looks like the illegal instruction is on x86_64. This is probably a system that doesn't support cmpxchg16b... Wonder how it got past the assembly test at configure. Let me see if I have a system old enough to not support the instruction. I will also see if I can get his config.log.

@hjelmn
Member

hjelmn commented Dec 18, 2014

Ok, I see what is happening. The compiler accepts the instruction, but the CPU running the tests doesn't support it. We have to do something similar to how portals4 handles this. Should have that in soon.

@hjelmn
Member

hjelmn commented Dec 18, 2014

Should be fixed in 79d8f6e. I took a similar approach to portals4: only enable the 128-bit atomic if a binary containing it can be run or the check is overridden by a configure option.
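
To illustrate the idea of a run-time probe, here is a rough sketch of the kind of program such a check could execute. It is not the code from 79d8f6e. The names `pair128_t`, `cas128`, and `probe_cas128` are made up for this example, and it assumes GCC-style inline assembly on x86_64. On a CPU without cmpxchg16b the process dies with an illegal instruction, so the probe never returns 0 there.

```c
#include <stdint.h>

/* Illustrative 128-bit value; cmpxchg16b requires 16-byte alignment. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} __attribute__((aligned(16))) pair128_t;

#if defined(__x86_64__)
/* 128-bit compare-and-swap. Returns 1 if *ptr matched *expected and was
 * replaced by desired; otherwise returns 0 and loads *ptr into *expected. */
static int cas128(volatile pair128_t *ptr, pair128_t *expected, pair128_t desired)
{
    unsigned char ok;
    __asm__ __volatile__("lock cmpxchg16b %1\n\t"
                         "sete %0"
                         : "=q"(ok), "+m"(*ptr),
                           "+a"(expected->lo), "+d"(expected->hi)
                         : "b"(desired.lo), "c"(desired.hi)
                         : "cc", "memory");
    return ok;
}
#endif

/* Returns 0 if the instruction ran and behaved as expected, 1 if the result
 * was wrong, and 77 (an arbitrary "not applicable" value) on other CPUs. */
int probe_cas128(void)
{
#if defined(__x86_64__)
    pair128_t target = {1, 2};
    pair128_t expected = {1, 2};
    const pair128_t desired = {3, 4};

    if (!cas128(&target, &expected, desired)) {
        return 1;
    }
    return (target.lo == 3 && target.hi == 4) ? 0 : 1;
#else
    return 77;
#endif
}
```

A configure-style run test would call probe_cas128() from main() and use its return value as the exit status, enabling the 128-bit atomic only when the program exits with 0.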

@jsquyres
Member Author

Great; thanks.

FWIW: you can use the same "Fixed #x" notation in commit messages, and it will automatically close the issue (like we used to have in Trac).

jsquyres pushed a commit to jsquyres/ompi that referenced this issue Sep 19, 2016
dong0321 pushed a commit to dong0321/ompi that referenced this issue Feb 19, 2020
A couple of headers were missing from the tarball generated
by "make dist", which makes running nightly built / Coverity
difficult.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
BrendanCunningham pushed a commit to BrendanCunningham/ompi that referenced this issue Jul 13, 2022
@asn-the-goblin-slayer:
  "the name_parse() function in libevent's DNS code is vulnerable to a buffer overread.

       if (cp != name_out) {
           if (cp + 1 >= end) return -1;
           *cp++ = '.';
       }
       if (cp + label_len >= end) return -1;
       memcpy(cp, packet + j, label_len);
       cp += label_len;
       j += label_len;
   No check is made against length before the memcpy occurs.

   This was found through the Tor bug bounty program and the discovery should be credited to 'Guido Vranken'."

Reproducer for gdb (https://gist.github.com/azat/e4fcf540e9b89ab86d02):
  set $PROT_NONE=0x0
  set $PROT_READ=0x1
  set $PROT_WRITE=0x2
  set $MAP_ANONYMOUS=0x20
  set $MAP_SHARED=0x01
  set $MAP_FIXED=0x10
  set $MAP_32BIT=0x40

  start

  set $length=202
  # overread
  set $length=2
  # allocate with mmap to have a seg fault on page boundary
  set $l=(1<<20)*2
  p mmap(0, $l, $PROT_READ|$PROT_WRITE, $MAP_ANONYMOUS|$MAP_SHARED|$MAP_32BIT, -1, 0)
  set $packet=(char *)$1+$l-$length
  # hack the packet
  set $packet[0]=63
  set $packet[1]='/'

  p malloc(sizeof(int))
  set $idx=(int *)$2
  set $idx[0]=0
  set $name_out_len=202

  p malloc($name_out_len)
  set $name_out=$3

  # map a PROT_NONE guard page right after the buffer so that an overread faults
  set $end=$1+$l
  p (void *)mmap($end, 1<<12, $PROT_NONE, $MAP_ANONYMOUS|$MAP_SHARED|$MAP_FIXED|$MAP_32BIT, -1, 0)
  set $m=$4

  p name_parse($packet, $length, $idx, $name_out, $name_out_len)
  x/2s (char *)$name_out

Before this patch:
$ gdb -ex 'source gdb' dns-example
$1 = 1073741824
$2 = (void *) 0x633010
$3 = (void *) 0x633030
$4 = (void *) 0x40200000

Program received signal SIGSEGV, Segmentation fault.
__memcpy_sse2_unaligned () at memcpy-sse2-unaligned.S:33

After this patch:
$ gdb -ex 'source gdb' dns-example
$1 = 1073741824
$2 = (void *) 0x633010
$3 = (void *) 0x633030
$4 = (void *) 0x40200000
$5 = -1
0x633030:       "/"
0x633032:       ""
(gdb) p $m
$6 = (void *) 0x40200000
(gdb) p $1
$7 = 1073741824
(gdb) p/x $1
$8 = 0x40000000
(gdb) quit

P.S. This also drops one duplicated condition.

Fixes: open-mpi#317
Signed-off-by: Brendan Cunningham <14318587+BrendanCunningham@users.noreply.github.com>