Commit

co_lnotab supports negative line number delta
Issue #26107: The format of the co_lnotab attribute of code objects changes to
support negative line number delta.

Changes:

* assemble_lnotab(): if line number delta is less than -128 or greater than
  127, emit multiple (offset_delta, lineno_delta) in co_lnotab
* update functions decoding co_lnotab to use signed 8-bit integers

  - dis.findlinestarts()
  - PyCode_Addr2Line()
  - _PyCode_CheckLineNumber()
  - frame_setlineno()

* update lnotab_notes.txt
* increase importlib MAGIC_NUMBER to 3361
* document the change in What's New in Python 3.6
* also clean up PyCode_Optimize() to use better variable names
vstinner committed Jan 20, 2016
1 parent 316fcc8 commit f3914eb
Showing 11 changed files with 203 additions and 161 deletions.
10 changes: 10 additions & 0 deletions Doc/whatsnew/3.6.rst
@@ -244,6 +244,16 @@ that may require changes to your code.
Changes in the Python API
-------------------------

* The format of the ``co_lnotab`` attribute of code objects changed to support
negative line number delta. By default, Python does not emit bytecode with
negative line number delta. Functions using ``frame.f_lineno``,
``PyFrame_GetLineNumber()`` or ``PyCode_Addr2Line()`` are not affected.
Functions directly decoding ``co_lnotab`` should be updated to use a signed
8-bit integer type for the line number delta, but this is only required to
support applications using negative line number delta. See
``Objects/lnotab_notes.txt`` for the ``co_lnotab`` format and how to decode
it, and see :pep:`511` for the rationale.

* The functions in the :mod:`compileall` module now return booleans instead
of ``1`` or ``0`` to represent success or failure, respectively. Thanks to
booleans being a subclass of integers, this should only be an issue if you
2 changes: 1 addition & 1 deletion Include/code.h
@@ -117,7 +117,7 @@ PyAPI_FUNC(int) _PyCode_CheckLineNumber(PyCodeObject* co,
#endif

PyAPI_FUNC(PyObject*) PyCode_Optimize(PyObject *code, PyObject* consts,
PyObject *names, PyObject *lineno_obj);
PyObject *names, PyObject *lnotab);

#ifdef __cplusplus
}
7 changes: 5 additions & 2 deletions Lib/dis.py
@@ -397,8 +397,8 @@ def findlinestarts(code):
Generate pairs (offset, lineno) as described in Python/compile.c.
"""
byte_increments = list(code.co_lnotab[0::2])
line_increments = list(code.co_lnotab[1::2])
byte_increments = code.co_lnotab[0::2]
line_increments = code.co_lnotab[1::2]

lastlineno = None
lineno = code.co_firstlineno
@@ -409,6 +409,9 @@ def findlinestarts(code):
yield (addr, lineno)
lastlineno = lineno
addr += byte_incr
if line_incr >= 0x80:
# line_increments is an array of 8-bit signed integers
line_incr -= 0x100
lineno += line_incr
if lineno != lastlineno:
yield (addr, lineno)
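
The decoding change in this hunk can be tried in isolation. The following is a minimal sketch modelled on the hunk, not the actual `dis.findlinestarts()` source: the function name `iter_line_starts`, the `if byte_incr:` guard (assumed from the elided context lines) and the sample table are hypothetical. It works on raw bytes instead of a code object so it can run on any Python 3.

```python
def iter_line_starts(lnotab, first_lineno):
    """Yield (offset, lineno) pairs from raw co_lnotab-style bytes.

    Hypothetical helper: line deltas are read as signed 8-bit integers,
    as in the hunk above.
    """
    last_lineno = None
    lineno = first_lineno
    addr = 0
    for byte_incr, line_incr in zip(lnotab[0::2], lnotab[1::2]):
        if byte_incr:
            if lineno != last_lineno:
                yield (addr, lineno)
                last_lineno = lineno
            addr += byte_incr
        if line_incr >= 0x80:
            # Reinterpret the unsigned byte as a signed 8-bit delta.
            line_incr -= 0x100
        lineno += line_incr
    if lineno != last_lineno:
        yield (addr, lineno)


# Hypothetical table: +6 bytes / +2 lines, then +4 bytes / -1 line (0xFF).
table = bytes([6, 2, 4, 0xFF])
starts = list(iter_line_starts(table, first_lineno=10))
print(starts)
```

With an unsigned reading, the second pair would advance the line number by 255 instead of stepping back by one, which is the case the `0x80` check handles.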
5 changes: 3 additions & 2 deletions Lib/importlib/_bootstrap_external.py
@@ -223,13 +223,14 @@ def _write_atomic(path, data, mode=0o666):
# Python 3.5b1 3330 (PEP 448: Additional Unpacking Generalizations)
# Python 3.5b2 3340 (fix dictionary display evaluation order #11205)
# Python 3.5b2 3350 (add GET_YIELD_FROM_ITER opcode #24400)
# Python 3.6a0 3360 (add FORMAT_VALUE opcode #25483)
# Python 3.6a0 3360 (add FORMAT_VALUE opcode #25483)
# Python 3.6a0 3361 (lineno delta of code.co_lnotab becomes signed)
#
# MAGIC must change whenever the bytecode emitted by the compiler may no
# longer be understood by older implementations of the eval loop (usually
# due to the addition of new opcodes).

MAGIC_NUMBER = (3360).to_bytes(2, 'little') + b'\r\n'
MAGIC_NUMBER = (3361).to_bytes(2, 'little') + b'\r\n'
_RAW_MAGIC_NUMBER = int.from_bytes(MAGIC_NUMBER, 'little') # For import.c

_PYCACHE = '__pycache__'
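
To see what the bumped value looks like on disk, here is a small sketch that repeats the two expressions from the hunk with local, lowercase names (the names `magic_number`, `raw_magic_number` and `version_field` are illustrative, not part of importlib). It only demonstrates the byte layout: a 2-byte little-endian version followed by `\r\n`.

```python
# Same expressions as in the hunk, with hypothetical local names.
magic_number = (3361).to_bytes(2, 'little') + b'\r\n'
raw_magic_number = int.from_bytes(magic_number, 'little')

# The first two bytes carry the version counter; the trailing b'\r\n'
# is a fixed suffix.
version_field = int.from_bytes(magic_number[:2], 'little')

print(magic_number, version_field, hex(raw_magic_number))
```

Because the first four bytes of a cached `.pyc` file must match this value, bytecode compiled with the old unsigned `co_lnotab` interpretation is recompiled rather than misread.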
3 changes: 3 additions & 0 deletions Misc/NEWS
@@ -10,6 +10,9 @@ Release date: tba
Core and Builtins
-----------------

- Issue #26107: The format of the ``co_lnotab`` attribute of code objects
changes to support negative line number delta.

- Issue #26154: Add a new private _PyThreadState_UncheckedGet() function to get
the current Python thread state, but don't issue a fatal error if it is NULL.
This new function must be used instead of accessing directly the
11 changes: 7 additions & 4 deletions Objects/codeobject.c
@@ -557,7 +557,8 @@ PyCode_Addr2Line(PyCodeObject *co, int addrq)
addr += *p++;
if (addr > addrq)
break;
line += *p++;
line += (signed char)*p;
p++;
}
return line;
}
@@ -592,17 +593,19 @@ _PyCode_CheckLineNumber(PyCodeObject* co, int lasti, PyAddrPair *bounds)
if (addr + *p > lasti)
break;
addr += *p++;
if (*p)
if ((signed char)*p)
bounds->ap_lower = addr;
line += *p++;
line += (signed char)*p;
p++;
--size;
}

if (size > 0) {
while (--size >= 0) {
addr += *p++;
if (*p++)
if ((signed char)*p)
break;
p++;
}
bounds->ap_upper = addr;
}
2 changes: 1 addition & 1 deletion Objects/frameobject.c
@@ -137,7 +137,7 @@ frame_setlineno(PyFrameObject *f, PyObject* p_new_lineno)
new_lasti = -1;
for (offset = 0; offset < lnotab_len; offset += 2) {
addr += lnotab[offset];
line += lnotab[offset+1];
line += (signed char)lnotab[offset+1];
if (line >= new_lineno) {
new_lasti = addr;
new_lineno = line;
37 changes: 21 additions & 16 deletions Objects/lnotab_notes.txt
@@ -12,42 +12,47 @@ pairs. The details are important and delicate, best illustrated by example:
0 1
6 2
50 7
350 307
361 308
350 207
361 208

Instead of storing these numbers literally, we compress the list by storing only
the increments from one row to the next. Conceptually, the stored list might
the difference from one row to the next. Conceptually, the stored list might
look like:

0, 1, 6, 1, 44, 5, 300, 300, 11, 1
0, 1, 6, 1, 44, 5, 300, 200, 11, 1

The above doesn't really work, but it's a start. Note that an unsigned byte
can't hold negative values, or values larger than 255, and the above example
contains two such values. So we make two tweaks:
The above doesn't really work, but it's a start. An unsigned byte (byte code
offset) can't hold negative values or values larger than 255, a signed byte
(line number) can't hold values larger than 127 or less than -128, and the
above example contains two such values. So we make two tweaks:

(a) there's a deep assumption that byte code offsets and their corresponding
line #s both increase monotonically, and
(b) if at least one column jumps by more than 255 from one row to the next,
more than one pair is written to the table. In case #b, there's no way to know
from looking at the table later how many were written. That's the delicate
part. A user of co_lnotab desiring to find the source line number
corresponding to a bytecode address A should do something like this
(a) there's a deep assumption that byte code offsets increase monotonically,
and
(b) if the byte code offset jumps by more than 255 from one row to the next, or
if the source code line number jumps by more than 127 or less than -128 from one
row to the next, more than one pair is written to the table. In case #b,
there's no way to know from looking at the table later how many were written.
That's the delicate part. A user of co_lnotab desiring to find the source
line number corresponding to a bytecode address A should do something like
this:

lineno = addr = 0
for addr_incr, line_incr in co_lnotab:
    addr += addr_incr
    if addr > A:
        return lineno
    if line_incr >= 0x80:
        line_incr -= 0x100
    lineno += line_incr

(In C, this is implemented by PyCode_Addr2Line().) In order for this to work,
when the addr field increments by more than 255, the line # increment in each
pair generated must be 0 until the remaining addr increment is < 256. So, in
the example above, assemble_lnotab in compile.c should not (as was actually done
until 2.2) expand 300, 300 to
until 2.2) expand 300, 200 to
255, 255, 45, 45,
but to
255, 0, 45, 255, 0, 45.
255, 0, 45, 127, 0, 73.
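
The lookup loop described in these notes can be made runnable roughly as follows. This is a sketch under assumptions: the function name `addr_to_line` and its `first_lineno` parameter are hypothetical, and the sample table is the example from the notes (offsets 0, 6, 50, 350, 361 mapping to lines 1, 2, 7, 207, 208) with the 300/200 jump split into pairs whose line deltas each fit in a signed byte.

```python
def addr_to_line(lnotab, first_lineno, target):
    """Return the line number for bytecode address `target`.

    Hypothetical helper following the pseudocode in the notes; `lnotab`
    is a bytes object of (addr_incr, line_incr) pairs.
    """
    lineno = first_lineno
    addr = 0
    for addr_incr, line_incr in zip(lnotab[0::2], lnotab[1::2]):
        addr += addr_incr
        if addr > target:
            return lineno
        if line_incr >= 0x80:
            # Signed 8-bit line delta.
            line_incr -= 0x100
        lineno += line_incr
    return lineno


# The example table from the notes, starting from line 0 as the
# pseudocode does.
example = bytes([0, 1, 6, 1, 44, 5, 255, 0, 45, 127, 0, 73, 11, 1])
for address in (0, 6, 49, 50, 349, 350, 361):
    print(address, addr_to_line(example, 0, address))
```

The zero line delta in the `255, 0` pair is what keeps addresses 305 through 349 on line 7 instead of jumping ahead early.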

The above is sufficient to reconstruct line numbers for tracebacks, but not for
line tracing. Tracing is handled by PyCode_CheckLineNumber() in codeobject.c
25 changes: 18 additions & 7 deletions Python/compile.c
@@ -4452,7 +4452,6 @@ assemble_lnotab(struct assembler *a, struct instr *i)
d_lineno = i->i_lineno - a->a_lineno;

assert(d_bytecode >= 0);
assert(d_lineno >= 0);

if(d_bytecode == 0 && d_lineno == 0)
return 1;
@@ -4482,9 +4481,21 @@ assemble_lnotab(struct assembler *a, struct instr *i)
d_bytecode -= ncodes * 255;
a->a_lnotab_off += ncodes * 2;
}
assert(d_bytecode <= 255);
if (d_lineno > 255) {
int j, nbytes, ncodes = d_lineno / 255;
assert(0 <= d_bytecode && d_bytecode <= 255);

if (d_lineno < -128 || 127 < d_lineno) {
int j, nbytes, ncodes, k;
if (d_lineno < 0) {
k = -128;
/* use division on positive numbers */
ncodes = (-d_lineno) / 128;
}
else {
k = 127;
ncodes = d_lineno / 127;
}
d_lineno -= ncodes * k;
assert(ncodes >= 1);
nbytes = a->a_lnotab_off + 2 * ncodes;
len = PyBytes_GET_SIZE(a->a_lnotab);
if (nbytes >= len) {
@@ -4502,15 +4513,15 @@ assemble_lnotab(struct assembler *a, struct instr *i)
lnotab = (unsigned char *)
PyBytes_AS_STRING(a->a_lnotab) + a->a_lnotab_off;
*lnotab++ = d_bytecode;
*lnotab++ = 255;
*lnotab++ = k;
d_bytecode = 0;
for (j = 1; j < ncodes; j++) {
*lnotab++ = 0;
*lnotab++ = 255;
*lnotab++ = k;
}
d_lineno -= ncodes * 255;
a->a_lnotab_off += ncodes * 2;
}
assert(-128 <= d_lineno && d_lineno <= 127);

len = PyBytes_GET_SIZE(a->a_lnotab);
if (a->a_lnotab_off + 2 >= len) {
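
The splitting logic in these hunks can be modelled in a few lines of Python. This is a simplified sketch, not a translation of `assemble_lnotab()`: the name `encode_delta` is hypothetical, buffer resizing is omitted, and the unconditional final pair is an assumption about the part of the function not shown in the diff.

```python
def encode_delta(d_bytecode, d_lineno):
    """Return co_lnotab-style byte values for one (offset, line) step.

    Hypothetical model: offset deltas are unsigned bytes (0..255) and
    line deltas are signed bytes (-128..127) stored as 0..255.
    """
    assert d_bytecode >= 0
    if d_bytecode == 0 and d_lineno == 0:
        return []
    out = []
    if d_bytecode > 255:
        ncodes = d_bytecode // 255
        out.extend([255, 0] * ncodes)  # line delta stays 0 in these pairs
        d_bytecode -= ncodes * 255
    if d_lineno < -128 or d_lineno > 127:
        if d_lineno < 0:
            k = -128
            ncodes = (-d_lineno) // 128  # divide on positive numbers
        else:
            k = 127
            ncodes = d_lineno // 127
        d_lineno -= ncodes * k
        out.extend([d_bytecode, k & 0xFF])
        d_bytecode = 0
        out.extend([0, k & 0xFF] * (ncodes - 1))
    # Assumed final pair carrying whatever remains.
    out.extend([d_bytecode, d_lineno & 0xFF])
    return out


print(encode_delta(300, 200))
print(encode_delta(4, -300))
```

For the 300/200 step used in `lnotab_notes.txt` this yields 255, 0, 45, 127, 0, 73, and a -300 line step is emitted as two -128 deltas followed by -44.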