8 changes: 8 additions & 0 deletions Doc/library/functions.rst
@@ -788,6 +788,14 @@ section.

The long type is described in :ref:`typesnumeric`.

.. versionchanged:: 2.7.18.6
:class:`long` string inputs and string representations can be limited to
help avoid denial of service attacks. A :exc:`ValueError` is raised when
the limit is exceeded while converting a string *x* to a :class:`long` or
when converting a :class:`long` into a string would exceed the limit.
See the :ref:`integer string conversion length limitation
<int_max_str_digits>` documentation.


.. function:: map(function, iterable, ...)

11 changes: 11 additions & 0 deletions Doc/library/json.rst
@@ -14,6 +14,11 @@ is a lightweight data interchange format inspired by
`JavaScript <https://en.wikipedia.org/wiki/JavaScript>`_ object literal syntax
(although it is not a strict subset of JavaScript [#rfc-errata]_ ).

.. warning::
Be cautious when parsing JSON data from untrusted sources. A malicious
JSON string may cause the decoder to consume considerable CPU and memory
resources. Limiting the size of data to be parsed is recommended.
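
As a minimal sketch of the size limit recommended above, a wrapper can reject
oversized input before it reaches the decoder. The name ``loads_limited`` and
the cap ``MAX_JSON_CHARS`` are hypothetical and not part of :mod:`json`; pick
a cap that suits your application.

```python
import json

# Hypothetical application-level cap; not defined by the json module.
MAX_JSON_CHARS = 1024 * 1024


def loads_limited(text, max_chars=MAX_JSON_CHARS):
    """Parse JSON only if the input is no longer than max_chars."""
    if len(text) > max_chars:
        raise ValueError("JSON input too large: %d characters" % len(text))
    return json.loads(text)
```

Checking the length first keeps the cost of rejecting hostile input constant,
independent of what the payload contains.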

:mod:`json` exposes an API familiar to users of the standard library
:mod:`marshal` and :mod:`pickle` modules.

@@ -249,6 +254,12 @@ Basic Usage
be used to use another datatype or parser for JSON integers
(e.g. :class:`float`).

.. versionchanged:: 2.7.18.6
The default *parse_int* of :func:`int` now limits the maximum length of
the integer string via the interpreter's :ref:`integer string
conversion length limitation <int_max_str_digits>` to help avoid denial
of service attacks.
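
Applications that want a tighter or interpreter-independent bound might pass
their own *parse_int*. The sketch below assumes a hypothetical cap
``MAX_INT_DIGITS`` and helper names ``bounded_int`` and ``loads_bounded``;
only the *parse_int* parameter itself comes from :mod:`json`.

```python
import json

# Hypothetical cap on the digits accepted for a single JSON integer.
MAX_INT_DIGITS = 100


def bounded_int(digits):
    """Convert a JSON integer string, rejecting overly long inputs."""
    if len(digits.lstrip("-")) > MAX_INT_DIGITS:
        raise ValueError("integer literal has too many digits")
    return int(digits)


def loads_bounded(text):
    """Decode JSON using bounded_int for every integer literal."""
    return json.loads(text, parse_int=bounded_int)
```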

*parse_constant*, if specified, will be called with one of the following
strings: ``'-Infinity'``, ``'Infinity'``, ``'NaN'``.
This can be used to raise an exception if invalid JSON numbers
168 changes: 168 additions & 0 deletions Doc/library/stdtypes.rst
@@ -521,6 +521,13 @@ class`. float also has the following additional methods.

.. versionadded:: 2.6

.. note::

The values returned by ``as_integer_ratio()`` can be huge. Attempts
to render such integers into decimal strings may bump into the
:ref:`integer string conversion length limitation
<int_max_str_digits>`.

.. method:: float.is_integer()

Return ``True`` if the float instance is finite with integral
@@ -3190,6 +3197,167 @@ The following attributes are only supported by :term:`new-style class`\ es.
[<type 'bool'>]


.. _int_max_str_digits:

Integer string conversion length limitation
===========================================

CPython has a global limit for converting between :class:`long` and :class:`str`
or :class:`unicode` to mitigate denial of service attacks. This limit *only* applies
to decimal or other non-power-of-two number bases. Hexadecimal, octal, and binary
conversions are unlimited. The limit can be configured.

The :class:`long` type in CPython is an arbitrary length number stored in binary
form (commonly known as a "bignum"). There exists no algorithm that can convert
a string to a binary integer or a binary integer to a string in linear time,
*unless* the base is a power of 2. Even the best known algorithms for base 10
have sub-quadratic complexity. Converting a large value such as
``long('1' * 500000)`` can take over a second on a fast CPU.

Limiting conversion size offers a practical way to avoid `CVE-2020-10735
<https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-10735>`_.

The limit is applied to the number of digit characters in the input or output
string when a non-linear conversion algorithm would be involved. A trailing
``L`` suffix and the sign are not counted towards the limit.

When an operation would exceed the limit, a :exc:`ValueError` is raised:

.. doctest::

>>> import sys
>>> sys.set_int_max_str_digits(4300) # Illustrative, this is the default.
>>> _ = long('2' * 5432)
Traceback (most recent call last):
...
ValueError: Exceeds the limit (4300) for integer string conversion: value has 5432 digits.
>>> i = long('2' * 4300)
>>> len(str(i))
4300
>>> i_squared = i*i
>>> len(str(i_squared))
Traceback (most recent call last):
...
ValueError: Exceeds the limit (4300) for integer string conversion: value has 8599 digits.
>>> len(hex(i_squared))
7144
>>> assert long(hex(i_squared), base=16) == i*i # Hexadecimal is unlimited.
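
Code that only needs *some* textual form of a large integer, for logging say,
could fall back to hexadecimal when the decimal conversion is refused. This is
a sketch with a hypothetical helper name, ``describe_int``; it uses :class:`int`
so that it also runs on interpreters without a separate :class:`long` type.

```python
def describe_int(n):
    """Return decimal text for n, or hexadecimal text if the limit is hit."""
    try:
        return str(n)
    except ValueError:
        # Raised when the decimal form would exceed the configured limit.
        # Hexadecimal conversion is never limited.
        return hex(n)
```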

The default limit is 4300 digits as provided in
:data:`sys.long_info.default_max_str_digits <sys.long_info>`.
The lowest limit that can be configured is 640 digits as provided in
:data:`sys.long_info.str_digits_check_threshold <sys.long_info>`.

Verification:

.. doctest::

>>> import sys
>>> assert sys.long_info.default_max_str_digits == 4300, sys.long_info
>>> assert sys.long_info.str_digits_check_threshold == 640, sys.long_info
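>>> import binascii
>>> # long has no to_bytes() in Python 2; build the 53 big-endian bytes via hex.
>>> msg = binascii.unhexlify('%0106x' % long(
... '578966293710682886880994035146873798396722250538762761564'
... '9252925514383915483333812743580549779436104706260696366600'
... '571186405732'))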

.. versionadded:: 2.7.18.6


Affected APIs
-------------

Because an :class:`int` is automatically promoted to :class:`long` when its value
exceeds :data:`sys.maxint`, this limitation applies to potentially slow conversions
between any of :class:`int` or :class:`long` and :class:`str` or :class:`unicode`:

* ``int(string)`` with default base 10.
* ``int(string, base)`` for all bases that are not a power of 2.
* ``long(string)`` with default base 10.
* ``long(string, base)`` for all bases that are not a power of 2.
* ``int(unicode)`` with default base 10.
* ``int(unicode, base)`` for all bases that are not a power of 2.
* ``long(unicode)`` with default base 10.
* ``long(unicode, base)`` for all bases that are not a power of 2.
* ``str(long)``.
* ``repr(long)``.
* ``unicode(long)``.
* any other string conversion to base 10, for example ``"{}".format(long)``.

The limitations do not apply to functions with a linear algorithm:

* ``long(string, base)`` with base 2, 4, 8, 16, or 32.
* :func:`hex`, :func:`oct`, :func:`bin`.
* :ref:`formatspec` for hex, octal, and binary numbers.
* :class:`str` to :class:`float`.
* :class:`str` to :class:`decimal.Decimal`.
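
The split between the two lists above comes down to whether the base is a
power of two. A small sketch of that test, using a hypothetical helper name
``is_limited_base`` and assuming an explicit base between 2 and 36:

```python
def is_limited_base(base):
    """Return True if string conversions in this base are subject to the limit."""
    if not 2 <= base <= 36:
        raise ValueError("base must be between 2 and 36")
    # A power of two has exactly one bit set, so base & (base - 1) is zero.
    return base & (base - 1) != 0
```

For example, ``is_limited_base(10)`` is true, while bases 2, 4, 8, 16, and 32
are unaffected.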

Configuring the limit
---------------------

Before Python starts up you can use an environment variable to configure the limit:

* :envvar:`PYTHONINTMAXSTRDIGITS`, e.g.
``PYTHONINTMAXSTRDIGITS=640 python`` to set the limit to 640 or
``PYTHONINTMAXSTRDIGITS=0 python`` to disable the limitation.
* :data:`sys.flags.long_max_str_digits` contains the value of
:envvar:`PYTHONINTMAXSTRDIGITS`. A value of *-1* indicates that none was set,
thus a value of :data:`sys.long_info.default_max_str_digits` was used during
initialization.

From code, you can inspect the current limit and set a new one using these
:mod:`sys` APIs:

* :func:`sys.get_int_max_str_digits` and :func:`sys.set_int_max_str_digits` are
a getter and setter for the interpreter-wide limit.

Information about the default and minimum can be found in :attr:`sys.long_info`:

* :data:`sys.long_info.default_max_str_digits <sys.long_info>` is the compiled-in
default limit.
* :data:`sys.long_info.str_digits_check_threshold <sys.long_info>` is the lowest
accepted value for the limit (other than 0 which disables it).

.. versionadded:: 2.7.18.6
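
A minimal sketch of using the getter and setter defensively, so that the same
code also runs on interpreters that lack these APIs. The helper names
``current_limit`` and ``raise_limit_to`` are hypothetical.

```python
import sys


def current_limit():
    """Return the configured limit, or None if the interpreter has no limiter."""
    getter = getattr(sys, "get_int_max_str_digits", None)
    if getter is None:
        return None
    return getter()


def raise_limit_to(minimum):
    """Ensure the limit is at least `minimum` digits; return the effective limit."""
    limit = current_limit()
    if limit is None or limit == 0:
        # No limiter present, or the limit is already disabled.
        return limit
    if limit < minimum:
        sys.set_int_max_str_digits(minimum)
        return minimum
    return limit
```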

.. caution::

Setting a low limit *can* lead to problems. While rare, code exists that
contains integer constants in decimal in their source that exceed the
minimum threshold. A consequence of setting the limit is that Python source
code containing decimal integer literals longer than the limit will
encounter an error during parsing, usually at startup time or import time or
even at installation time - anytime an up to date ``.pyc`` does not already
exist for the code. A workaround for source that contains such large
constants is to convert them to ``0x`` hexadecimal form as it has no limit.

Test your application thoroughly if you use a low limit. Ensure your tests
run with the limit set early via the environment so that it applies during
startup and even during any installation step that may invoke Python to
precompile ``.py`` sources to ``.pyc`` files.
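
To apply the workaround, a one-off helper could rewrite a large decimal
constant as hexadecimal. This sketch uses a hypothetical function name,
``hex_literal``, and temporarily disables the limit when the :mod:`sys` APIs
are present. On Python 2 the result of :func:`hex` for a :class:`long` carries
a trailing ``L``.

```python
import sys


def hex_literal(decimal_text):
    """Return the hexadecimal spelling of a decimal integer literal."""
    setter = getattr(sys, "set_int_max_str_digits", None)
    if setter is None:
        # Interpreter without the limiter: convert directly.
        return hex(int(decimal_text))
    previous = sys.get_int_max_str_digits()
    setter(0)  # 0 disables the limit for the duration of the conversion
    try:
        return hex(int(decimal_text))
    finally:
        setter(previous)
```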

Recommended configuration
-------------------------

The default :data:`sys.long_info.default_max_str_digits` is expected to be
reasonable for most applications. If your application requires a different
limit, set it from your main entry point using Python version agnostic code as
these APIs were ported from the original fix in version 3.12.

Example::

>>> import sys
>>> if hasattr(sys, "set_int_max_str_digits"):
...     upper_bound = 68000
...     lower_bound = 4004
...     current_limit = sys.get_int_max_str_digits()
...     if current_limit == 0 or current_limit > upper_bound:
...         sys.set_int_max_str_digits(upper_bound)
...     elif current_limit < lower_bound:
...         sys.set_int_max_str_digits(lower_bound)

If you need to disable it entirely, set it to ``0``.


.. rubric:: Footnotes

.. [1] Additional information on these special methods may be found in the Python
48 changes: 38 additions & 10 deletions Doc/library/sys.rst
@@ -431,6 +431,14 @@ always available.
an argument to :func:`getrefcount`.


.. function:: get_int_max_str_digits()

Return the current value for the :ref:`integer string conversion length
limitation <int_max_str_digits>`. See also :func:`set_int_max_str_digits`.

.. versionadded:: 2.7.18.6


.. function:: getrecursionlimit()

Return the current value of the recursion limit, the maximum depth of the Python
@@ -603,19 +611,30 @@ always available.

.. tabularcolumns:: |l|L|

+-------------------------+----------------------------------------------+
| Attribute               | Explanation                                  |
+=========================+==============================================+
| :const:`bits_per_digit` | number of bits held in each digit. Python    |
|                         | integers are stored internally in base       |
|                         | ``2**long_info.bits_per_digit``              |
+-------------------------+----------------------------------------------+
| :const:`sizeof_digit`   | size in bytes of the C type used to          |
|                         | represent a digit                            |
+-------------------------+----------------------------------------------+
+----------------------------------------+-----------------------------------------------+
| Attribute                              | Explanation                                   |
+========================================+===============================================+
| :const:`bits_per_digit`                | number of bits held in each digit. Python     |
|                                        | integers are stored internally in base        |
|                                        | ``2**long_info.bits_per_digit``               |
+----------------------------------------+-----------------------------------------------+
| :const:`sizeof_digit`                  | size in bytes of the C type used to           |
|                                        | represent a digit                             |
+----------------------------------------+-----------------------------------------------+
| :const:`default_max_str_digits`        | default value for                             |
|                                        | :func:`sys.get_int_max_str_digits` when it    |
|                                        | is not otherwise explicitly configured.       |
+----------------------------------------+-----------------------------------------------+
| :const:`str_digits_check_threshold`    | minimum non-zero value for                    |
|                                        | :func:`sys.set_int_max_str_digits`,           |
|                                        | :envvar:`PYTHONINTMAXSTRDIGITS`.              |
+----------------------------------------+-----------------------------------------------+

.. versionadded:: 2.7

.. versionchanged:: 2.7.18.6
Added ``default_max_str_digits`` and ``str_digits_check_threshold``.


.. data:: last_type
last_value
@@ -848,6 +867,15 @@ always available.
.. versionadded:: 2.2


.. function:: set_int_max_str_digits(n)

Set the :ref:`integer string conversion length limitation
<int_max_str_digits>` used by this interpreter. See also
:func:`get_int_max_str_digits`.

.. versionadded:: 2.7.18.6


.. function:: setprofile(profilefunc)

.. index::
10 changes: 10 additions & 0 deletions Doc/library/test.rst
@@ -443,6 +443,16 @@ The :mod:`test.support` module defines the following functions:
.. versionadded:: 2.7


.. function:: adjust_int_max_str_digits(max_digits)

This function returns a context manager that will change the global
:func:`sys.set_int_max_str_digits` setting for the duration of the
context to allow execution of test code that needs a different limit
on the number of digits when converting between an integer and string.

.. versionadded:: 2.7.18.6
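
The behaviour described above can be approximated outside the test suite with
a few lines. This is a sketch only, not the :mod:`test.support`
implementation; the name ``adjust_limit`` is hypothetical, and it degrades to
a no-op on interpreters without the :mod:`sys` APIs.

```python
import contextlib
import sys


@contextlib.contextmanager
def adjust_limit(max_digits):
    """Temporarily set the integer string conversion limit."""
    if not hasattr(sys, "set_int_max_str_digits"):
        # Nothing to adjust on interpreters without the limiter.
        yield
        return
    previous = sys.get_int_max_str_digits()
    sys.set_int_max_str_digits(max_digits)
    try:
        yield
    finally:
        # Restore the previous limit even if the body raised.
        sys.set_int_max_str_digits(previous)
```

Because the limit is interpreter-wide, other threads see the temporary value
while the context is active.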


The :mod:`test.support` module defines the following classes:

.. class:: TransientResource(exc[, **kwargs])
6 changes: 6 additions & 0 deletions Doc/library/xmlrpclib.rst
@@ -557,6 +557,12 @@ Convenience Functions
.. versionchanged:: 2.5
The *use_datetime* flag was added.

.. versionchanged:: 2.7.18.6
Integer values are converted with :func:`int`, which now limits the maximum
length of the integer string via the interpreter's :ref:`integer string
conversion length limitation <int_max_str_digits>` to help avoid denial
of service attacks.


.. _xmlrpc-client-example:

9 changes: 9 additions & 0 deletions Doc/using/cmdline.rst
@@ -638,6 +638,15 @@ conflict.

.. versionadded:: 2.7.12


.. envvar:: PYTHONINTMAXSTRDIGITS

If this variable is set to an integer, it is used to configure the
interpreter's global :ref:`integer string conversion length limitation
<int_max_str_digits>`.

.. versionadded:: 2.7.18.6
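
When a child interpreter is launched from Python, the variable can be set on
a copy of the environment. A minimal sketch with hypothetical helper names
(``child_env``, ``run_with_limit``); ``script_path`` stands for whatever
script you want to run.

```python
import os
import subprocess
import sys


def child_env(limit):
    """Return a copy of the environment with PYTHONINTMAXSTRDIGITS set."""
    env = dict(os.environ)
    env["PYTHONINTMAXSTRDIGITS"] = str(limit)
    return env


def run_with_limit(script_path, limit):
    """Run a script in a child interpreter started with the given limit."""
    return subprocess.call([sys.executable, script_path], env=child_env(limit))
```

Setting the variable before startup means the limit is already in force while
the child parses and compiles its own source files.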

Debug-mode variables
~~~~~~~~~~~~~~~~~~~~

12 changes: 12 additions & 0 deletions Doc/whatsnew/2.7.rst
@@ -884,6 +884,18 @@ Some smaller changes made to the core Python language are:
now only cleared if no one else is holding a reference to the
dictionary (:issue:`7140`).

* Converting between :class:`int` or :class:`long` and :class:`str` or
:class:`unicode` in bases other than 2 (binary), 4, 8 (octal), 16
(hexadecimal), or 32, such as base 10 (decimal), now raises a
:exc:`ValueError` if the number of digits in string form is above a
limit to avoid potential denial of service attacks due to the
algorithmic complexity. This is a mitigation for `CVE-2020-10735
<https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-10735>`_.
This limit can be configured or disabled by environment variable or
:mod:`sys` APIs. See the :ref:`integer string conversion length
limitation <int_max_str_digits>` documentation. The default limit
is 4300 digits in string form.

.. ======================================================================

.. _new-27-interpreter:
26 changes: 26 additions & 0 deletions Include/longobject.h
@@ -129,6 +129,32 @@ PyAPI_FUNC(PyObject *) _PyLong_FormatAdvanced(PyObject *obj,
char *format_spec,
Py_ssize_t format_spec_len);

#define _MAX_STR_DIGITS_ERROR_FMT "Exceeds the limit (%d) for integer string conversion: value has %zd digits"
/*
* Default long base conversion size limitation: Denial of Service prevention.
*
* Chosen such that this isn't wildly slow on modern hardware
* 4300 decimal digits fits a ~14284 bit number.
*/
#define _PY_LONG_DEFAULT_MAX_STR_DIGITS 4300
/*
* Threshold for max digits check. For performance reasons long() and
* long.__str__() don't check values that are smaller than this
* threshold. Acts as a guaranteed minimum size limit for bignums that
* applications can expect from CPython.
*
* "640 digits should be enough for anyone." - gps
* fits a ~2126 bit decimal number.
*/
#define _PY_LONG_MAX_STR_DIGITS_THRESHOLD 640

#if ((_PY_LONG_DEFAULT_MAX_STR_DIGITS != 0) && \
(_PY_LONG_DEFAULT_MAX_STR_DIGITS < _PY_LONG_MAX_STR_DIGITS_THRESHOLD))
# error "_PY_LONG_DEFAULT_MAX_STR_DIGITS smaller than threshold."
#endif

int Py_LongMaxStrDigits;

#ifdef __cplusplus
}
#endif
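
The bit sizes quoted in the header comments follow from each decimal digit
carrying log2(10), roughly 3.32, bits. A quick check in Python, with a
hypothetical helper name ``bits_for_decimal_digits``:

```python
import math


def bits_for_decimal_digits(digits):
    """Approximate bit length of the largest number with this many decimal digits."""
    return int(digits * math.log(10, 2))


default_bits = bits_for_decimal_digits(4300)   # 14284
threshold_bits = bits_for_decimal_digits(640)  # 2126
```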
1 change: 1 addition & 0 deletions Include/pydebug.h
@@ -27,6 +27,7 @@ PyAPI_DATA(int) _Py_QnewFlag;
/* Warn about 3.x issues */
PyAPI_DATA(int) Py_Py3kWarningFlag;
PyAPI_DATA(int) Py_HashRandomizationFlag;
PyAPI_DATA(int) Py_LongMaxStrDigitsFlag;

/* this is a wrapper around getenv() that pays attention to
Py_IgnoreEnvironmentFlag. It should be used for getting variables like