luajit-gdb: ValueError: sequence.index(x): x not in sequence #4828

Closed
olegrok opened this issue Apr 1, 2020 · 4 comments
Labels: bug, devtools, luajit


@olegrok
Collaborator

olegrok commented Apr 1, 2020

Tarantool version:
Tarantool 2.4.0-137-g9933c5d
Target: Linux-x86_64-Debug
Build options: cmake . -DCMAKE_INSTALL_PREFIX=/usr/local -DENABLE_BACKTRACE=OFF
Compiler: /usr/bin/cc /usr/bin/c++
flags: ' -fexceptions -funwind-tables -fno-common -fopenmp -msse2 -std=c11 -Wall
-Wextra -Wno-strict-aliasing -Wno-char-subscripts -Wno-format-truncation -fno-gnu89-inline
-Wno-cast-function-type -Werror'

OS version: centos-7 (docker run -it --cap-add=SYS_PTRACE --security-opt seccomp=unconfined tarantool/tarantool:2.x-centos7 bash)

Bug description:

(gdb) source luajit-gdb.py 
Traceback (most recent call last):
  File "luajit-gdb.py", line 702, in <module>
    load(None)
  File "luajit-gdb.py", line 699, in load
    'lj-gc': LJGC,
  File "luajit-gdb.py", line 687, in init
    command(name)
  File "luajit-gdb.py", line 468, in __init__
    gdb.write('{} command initialized\n'.format(name))
ValueError: sequence.index(x): x not in sequence

After #4827 I decided to build Tarantool from source properly, in Debug mode. I simply ran cmake . && make -j && ./src/tarantool and got the error shown above.

@igormunkin
Collaborator

@Buristan also faced a similar issue. Here are some artifacts:

$ uname -r
4.19.52-1.el7.x86_64
$ cat /etc/os-release 
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

$ git lo -1 | cat
377198b jit: abort trace execution on JIT mode change
$ gdb ./src/luajit
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-115.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/i.munkin/luajit/src/luajit...done.
Traceback (most recent call last):
  File "/home/i.munkin/luajit/src/luajit-gdb.py", line 702, in <module>
    load(None)
  File "/home/i.munkin/luajit/src/luajit-gdb.py", line 699, in load
    'lj-gc': LJGC,
  File "/home/i.munkin/luajit/src/luajit-gdb.py", line 687, in init
    command(name)
  File "/home/i.munkin/luajit/src/luajit-gdb.py", line 468, in __init__
    gdb.write('{} command initialized\n'.format(name))
ValueError: sequence.index(x): x not in sequence
(gdb) python
>print(sys.version)
>2.7.5 (default, Aug  7 2019, 00:51:29) 
[GCC 4.8.5 20150623 (Red Hat 4.8.5-39)]

It looks like another issue with ancient Python support, since everything is OK on my Gentoo host:

$ uname -r
5.4.28-gentoo
$ cat /etc/os-release 
NAME=Gentoo
ID=gentoo
PRETTY_NAME="Gentoo/Linux"
ANSI_COLOR="1;32"
HOME_URL="https://www.gentoo.org/"
SUPPORT_URL="https://www.gentoo.org/support/"
BUG_REPORT_URL="https://bugs.gentoo.org/"
$ git lo -1 | cat    
377198b jit: abort trace execution on JIT mode change
$ gdb ./src/luajit 
GNU gdb (Gentoo 9.1 vanilla) 9.1
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://bugs.gentoo.org/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./src/luajit...
lj-arch command initialized
lj-tv command initialized
lj-str command initialized
lj-tab command initialized
lj-stack command initialized
lj-state command initialized
lj-gc command initialized
luajit-gdb.py is successfully loaded
(gdb) python
>print(sys.version)
>3.6.10 (default, May  1 2020, 22:49:58) 
[GCC 9.3.0]

@igormunkin igormunkin added bug Something isn't working luajit labels May 15, 2020
@olegrok
Collaborator Author

olegrok commented May 15, 2020

I don't know whether this is correct or not.
But if you put any command right after gdb.events.new_objfile.disconnect(load) (e.g. print or sleep), this exception will be thrown earlier. So you can continue to use gdb:

(None, ValueError('sequence.index(x): x not in sequence',)) -- this is my print
lj-gc command initialized
lj-tab command initialized
lj-str command initialized
lj-state command initialized
lj-stack command initialized
lj-tv command initialized
lj-arch command initialized
luajit-gdb.py is successfully loaded
(gdb) lj-gc
GC stats: PROPAGATE
	total: 1325573
	threshold: 1325579
	debt: 395
	estimate: 603262
	stepmul: 200
	pause: 200
	sweepstr: 8192/8192
	root: 5923 objects
	gray: 2139 objects
	grayagain: 238 objects
	weak: 1087 objects

And here are my experimental, absolutely dirty changes:

diff --git a/src/luajit-gdb.py b/src/luajit-gdb.py
index f142fc5..ed92fbe 100644
--- a/src/luajit-gdb.py
+++ b/src/luajit-gdb.py
@@ -669,10 +669,13 @@ def init(commands):
                   'until libluajit objfile is loaded\n')
         gdb.events.new_objfile.connect(load)
         return
-
+    e = None
     try:
-        gdb.events.new_objfile.disconnect(load)
-    except:
+        e = gdb.events.new_objfile.disconnect(load)
+        import time
+        time.sleep(0.1)
+    except Exception as ex:
+        print(e, ex)
         pass # was not connected
 
     try:

igormunkin added a commit to tarantool/luajit that referenced this issue Jul 4, 2020
There was a mysterious error when the extension was loaded by old gdb
versions built against Python 2:
| (gdb) source luajit-gdb.py
| Traceback (most recent call last):
|   File "luajit-gdb.py", line 702, in <module>
|     load(None)
|   File "luajit-gdb.py", line 699, in load
|     'lj-gc': LJGC,
|   File "luajit-gdb.py", line 687, in init
|     command(name)
|   File "luajit-gdb.py", line 468, in __init__
|     gdb.write('{} command initialized\n'.format(name))
| ValueError: sequence.index(x): x not in sequence

I made a little investigation (for more info see the mentioned issue)
and found the following: the exception is raised much earlier than
<str.format>, more precisely in <gdb.events.new_objfile.disconnect>.
However, the handled exception is preserved until the <str.format> call
and hits the condition underneath it, leading to the extension load
failure.

To avoid raising the exception, a special global variable is introduced
for the legacy (i.e. Python 2) environment. It records whether any
callback is associated with the new_objfile event prior to disconnecting
it. Its usage is encapsulated within two new routines, <connect> and
<disconnect>, which wrap the ones provided by gdb.

Furthermore, after diving into the gdb sources related to Python
embedding, I found that callbacks are grouped into an internal list. The
previous implementation appended the <load> function to this callback
list on each unsuccessful call, but only a successful call removed it
from the list. Therefore the disconnect action is moved before the
connect one, so that no more than one <load> instance is kept in the
callback list.

Fixes tarantool/tarantool#4828

Reported-by: Oleg Babin <olegrok@tarantool.org>
Signed-off-by: Igor Munkin <imun@tarantool.org>
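
The callback-list behaviour described in the commit message can be illustrated with a small standalone model. This is only a sketch and not gdb's code: CallbackRegistry, reconnect and the load placeholder are hypothetical names, and the use of operator.indexOf for the lookup is an assumption chosen because in CPython it fails with the same message as in the traceback.

```python
import operator


class CallbackRegistry(object):
    """Toy stand-in for an event registry that keeps callbacks in a list."""

    def __init__(self):
        self.callbacks = []

    def connect(self, func):
        # Appends unconditionally, so repeated connects create duplicates.
        self.callbacks.append(func)

    def disconnect(self, func):
        # operator.indexOf raises ValueError when func is not in the list.
        index = operator.indexOf(self.callbacks, func)
        del self.callbacks[index]


def load(event):
    """Placeholder for the extension's load callback (hypothetical)."""


def reconnect(registry, func):
    """Disconnect first, then connect, tolerating the 'not connected' case."""
    try:
        registry.disconnect(func)
    except ValueError:
        pass  # was not connected
    registry.connect(func)


# Old order: connect on every unsuccessful load, disconnect once on success.
registry = CallbackRegistry()
for _ in range(3):
    registry.connect(load)
registry.disconnect(load)
leftover_old = len(registry.callbacks)  # two stale copies of load remain

# New order: disconnect before each connect keeps at most one copy.
registry = CallbackRegistry()
for _ in range(3):
    reconnect(registry, load)
leftover_new = len(registry.callbacks)

# Disconnecting a callback that was never connected fails the lookup.
try:
    CallbackRegistry().disconnect(load)
except ValueError as exc:
    message = str(exc)
```

In this model the old ordering leaves stale entries behind, while disconnecting before connecting keeps the list at one entry at most.
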
@igormunkin
Collaborator

@olegrok, thanks for your investigation, it pushed me to go a bit deeper. I applied the diff below and obtained the following output:

diff --git a/src/luajit-gdb.py b/src/luajit-gdb.py
index f142fc5..93f9058 100644
--- a/src/luajit-gdb.py
+++ b/src/luajit-gdb.py
@@ -676,6 +676,11 @@ def init(commands):
         pass # was not connected
 
     try:
+        print("Normal person's loading log")
+    except:
+        print("Smoker's loading log")
+
+    try:
         LJ_64 = str(gdb.parse_and_eval('IRT_PTR')) == 'IRT_P64'
         LJ_FR2 = LJ_GC64 = str(gdb.parse_and_eval('IRT_PGC')) == 'IRT_P64'
     except:
gdb w/ Python 2.7.5
$ gdb ./luajit
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-115.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/i.munkin/luajit/src/luajit...done.
Smoker's loading log
lj-gc command initialized
lj-tab command initialized
lj-str command initialized
lj-state command initialized
lj-stack command initialized
lj-tv command initialized
lj-arch command initialized
luajit-gdb.py is successfully loaded
(gdb) python
>print(sys.version)
>2.7.5 (default, Aug  7 2019, 00:51:29)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-39)]
gdb w/ Python 3.6.8
$ gdb ./luajit
GNU gdb (GDB) Red Hat Enterprise Linux 8.2-11.el8
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./luajit...done.
Normal person's loading log
lj-arch command initialized
lj-tv command initialized
lj-str command initialized
lj-tab command initialized
lj-stack command initialized
lj-state command initialized
lj-gc command initialized
luajit-gdb.py is successfully loaded
(gdb) python
>print(sys.version)
>3.6.8 (default, Apr 16 2020, 01:36:27)
[GCC 8.3.1 20191121 (Red Hat 8.3.1-5)]

The result looks freaking ridiculous to me. I tried to debug the case to find the root cause, but after some time digging into the problem I can only state the following: it seems that the exception raised within gdb.events.new_objfile.disconnect(load) (i.e. in _PySequence_IterSearch) is not properly handled by the ancient Python. Here is the simple gdb scenario I used, with some comments and results:

gdb --args gdb ./luajit
b _PySequence_IterSearch        # lookup made on disconnect
r
b PyErr_Restore                 # function setting an exception to thread state
c
s
p/x tstate                      # dump thread state address to be used later
finish
p $1->curexc_type               # dump current exception type (i.e. ValueError)
# <type at remote 0x7f5713a22640>
# p PyExc_ValueError
# <type at remote 0x7f5713a22640>
p $1->curexc_value              # dump current exception value
# 'sequence.index(x): x not in sequence'
d
b do_string_format              # str.format implementation
c

And here is the difference: Python 2.7.5 doesn't purge the handled exception

p $1->curexc_type               # dump current exception type (i.e. ValueError)
# <type at remote 0x7f5713a22640>
# p PyExc_ValueError
# <type at remote 0x7f5713a22640>
p $1->curexc_value              # dump current exception value
# 'sequence.index(x): x not in sequence'

but Python 3.6.8 (at least) does

p $1->curexc_type               # dump current exception type (i.e. ValueError)
# 0x0
p $1->curexc_value              # dump current exception value
# 0x0

As a result, the stale exception hits the condition underneath the do_string_format function and the error occurs.
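
The effect can be mimicked with a pure-Python toy model. This is a sketch under stated assumptions, not CPython internals: ToyThreadState, toy_disconnect and toy_format are hypothetical names, and the single curexc slot only stands in for the curexc_type and curexc_value fields dumped above.

```python
class ToyThreadState(object):
    """Toy model of the per-thread 'current exception' slot."""

    def __init__(self):
        self.curexc = None


def toy_disconnect(tstate, callbacks, func):
    """Model a failed lookup that records an error but returns normally."""
    if func not in callbacks:
        tstate.curexc = ValueError('sequence.index(x): x not in sequence')
        return None  # the caller sees success; the slot stays set
    callbacks.remove(func)
    return None


def toy_format(tstate, template, *args):
    """Model a later call that checks the slot after doing its own work."""
    result = template.format(*args)
    if tstate.curexc is not None:
        exc, tstate.curexc = tstate.curexc, None
        raise exc  # an unrelated, stale error surfaces here
    return result


def load(event):
    """Placeholder for the extension's load callback (hypothetical)."""


# Stale slot (the behaviour observed with Python 2.7.5 above).
tstate = ToyThreadState()
toy_disconnect(tstate, [], load)
try:
    toy_format(tstate, '{} command initialized', 'lj-gc')
except ValueError as exc:
    surfaced = str(exc)

# Purged slot (the behaviour observed with Python 3.6.8 above).
tstate = ToyThreadState()
toy_disconnect(tstate, [], load)
tstate.curexc = None
formatted = toy_format(tstate, '{} command initialized', 'lj-gc')
```

The model explains why the traceback points at the format call even though the failing lookup happened during disconnect.
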

Finally, I've just added custom connect and disconnect routines to the extension respecting both major Python versions (like we've done here). Could you please check whether the patched version works for you?
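
A minimal sketch of such guarded wrappers follows. It is not the code from the patch: GuardedEvent and RecordingRegistry are hypothetical names, the flag is kept as an instance attribute here rather than the global variable the patch uses, and the registry is passed in so that the example runs outside gdb. Inside gdb the registry would be gdb.events.new_objfile.

```python
import sys


class GuardedEvent(object):
    """Wrap an event registry; skip disconnect when nothing is connected."""

    def __init__(self, registry, legacy=(sys.version_info[0] == 2)):
        self.registry = registry
        self.legacy = legacy      # True models the Python 2 environment
        self.connected = False    # tracked only in the legacy environment

    def connect(self, callback):
        if self.legacy:
            self.connected = True
        self.registry.connect(callback)

    def disconnect(self, callback):
        if self.legacy:
            if not self.connected:
                return  # nothing registered: avoid the failing lookup
            self.connected = False
        self.registry.disconnect(callback)


class RecordingRegistry(object):
    """Stub registry that records the calls forwarded to it."""

    def __init__(self):
        self.calls = []

    def connect(self, callback):
        self.calls.append(('connect', callback))

    def disconnect(self, callback):
        self.calls.append(('disconnect', callback))


def load(event):
    """Placeholder for the extension's load callback (hypothetical)."""


legacy_registry = RecordingRegistry()
legacy_event = GuardedEvent(legacy_registry, legacy=True)
legacy_event.disconnect(load)   # skipped: load was never connected
legacy_event.connect(load)
legacy_event.disconnect(load)   # forwarded to the registry

modern_registry = RecordingRegistry()
modern_event = GuardedEvent(modern_registry, legacy=False)
modern_event.disconnect(load)   # always forwarded outside legacy mode
```

The guard only changes behaviour in the legacy case, so the newer Python path keeps calling the registry directly.
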

@igormunkin igormunkin self-assigned this Jul 4, 2020
igormunkin added a commit to tarantool/luajit that referenced this issue Jul 6, 2020
Reported-by: Oleg Babin <olegrok@tarantool.org>
Reviewed-by: Oleg Babin <olegrok@tarantool.org>
Signed-off-by: Igor Munkin <imun@tarantool.org>
igormunkin added a commit to tarantool/luajit that referenced this issue Jul 22, 2020
Reported-by: Oleg Babin <olegrok@tarantool.org>
Reviewed-by: Oleg Babin <olegrok@tarantool.org>
Reviewed-by: Sergey Ostanevich <sergos@tarantool.org>
Signed-off-by: Igor Munkin <imun@tarantool.org>
@kyukhin kyukhin added this to the 1.10.8 milestone Jul 22, 2020
@igormunkin
Collaborator

The fix was introduced in tarantool/luajit in tarantool/luajit@ad1d444. The submodule was updated in tarantool in 2.6.0-12-gbc58c0679 (bc58c06), 2.5.1-4-g0c3f90bd2 (0c3f90b), 2.4.2-4-gae283d828 (ae283d8), 1.10.7-3-g2eddcb432 (2eddcb4).

igormunkin added a commit to tarantool/luajit that referenced this issue Jun 16, 2022
Reported-by: Oleg Babin <olegrok@tarantool.org>
Reviewed-by: Oleg Babin <olegrok@tarantool.org>
Reviewed-by: Sergey Ostanevich <sergos@tarantool.org>
Signed-off-by: Igor Munkin <imun@tarantool.org>