Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent uses of format string based PyObject_Call* that do not produce tuple required by docs #71007

Open
MojoVampire mannequin opened this issue Apr 21, 2016 · 3 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs)

Comments

@MojoVampire
Copy link
Mannequin

MojoVampire mannequin commented Apr 21, 2016

BPO 26820
Nosy @vstinner, @jkloth, @MojoVampire

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2016-04-21.18:36:47.602>
labels = ['interpreter-core']
title = 'Prevent uses of format string based PyObject_Call* that do not produce tuple required by docs'
updated_at = <Date 2016-04-22.00:26:44.891>
user = 'https://github.com/MojoVampire'

bugs.python.org fields:

activity = <Date 2016-04-22.00:26:44.891>
actor = 'jkloth'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Interpreter Core']
creation = <Date 2016-04-21.18:36:47.602>
creator = 'josh.r'
dependencies = []
files = []
hgrepos = []
issue_num = 26820
keywords = []
message_count = 3.0
messages = ['263929', '263930', '263945']
nosy_count = 3.0
nosy_names = ['vstinner', 'jkloth', 'josh.r']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = None
url = 'https://bugs.python.org/issue26820'
versions = ['Python 3.5', 'Python 3.6']

@MojoVampire
Copy link
Mannequin Author

MojoVampire mannequin commented Apr 21, 2016

PyObject_CallMethod explicitly documents that "The C arguments are described by a Py_BuildValue() format string that should produce a tuple." While PyObject_CallFunction doesn't document this requirement, it has the same behavior, and the same failures, as does the undocumented _PyObject_CallMethodId.

The issue is that, should someone pass a format string of "O", then the type of the subsequent argument determines the effect in a non-obvious way; when the argument comes from a caller, and the intent was to pass a single argument, this means that if the caller passes a non-tuple sequence, everything works, while passing a tuple tries to pass the contents of the tuple as sequential arguments. This inconsistency was the cause of both bpo-26478 and bpo-21209 (maybe others).

Assuming the API can't/shouldn't be changed, it should still be an error when a format string of "O" is passed and the argument is a non-tuple (because you've violated the spec; the result of BuildValue was not a tuple). Instead call_function_tail is silently rewrapping non-tuple args in a single element tuple.

I'm proposing that, in debug builds (and ideally release builds too), abstract.c's call_function_tail treat the "non-tuple" case as an error, rather than rewrapping in a single element tuple. This still allows the use case where the function is used inefficiently, but correctly (where the format string is "O" and the value is *always* a tuple that's supposed to be varargs; it should really just use PyObject_CallFunctionObjArgs/PyObject_CallMethodObjArgs/PyObject_CallObject or Id based optimized versions, but it's legal). But it will make the majority of cases where a user provided argument could be tuple or not fail fast, rather than silently behave themselves *until* they receive a tuple and misbehave.

Downside: It will require code changes for cases like PyObject_CallFunction(foo, "k", myunsigned);, where there was no risk of misbehavior, but those cases were also violating the spec, and should be fixable by changing the format string to wrap the single value in parens, e.g. "(k)".

@MojoVampire MojoVampire mannequin added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Apr 21, 2016
@MojoVampire
Copy link
Mannequin Author

MojoVampire mannequin commented Apr 21, 2016

The motivation for this change was Mr. STINNER's comment on bpo-26814 ( https://bugs.python.org/issue26814#msg263923 ), where he mentioned the weirdness of PyObject_CallFunction and friends, which complicates the implementation of PyObject_FastCall and alerted me to a second case ( bpo-21209 ) in which this silent fix up has caused confusing issues in CPython (I filed bpo-26478 so I recognized the issue).

If this fix could be made, it might be possible to eventually make the check for non-tuple arguments a debug build only check (or a check only on public APIs), allowing the implementation in release mode and/or internal APIs to avoid the work involved in constantly checking for and performing this workaround to fix doc violating code, and possible simplify PyObject_FastCall by removing the corner case.

@jkloth
Copy link
Contributor

jkloth commented Apr 22, 2016

IMHO, this is a documentation bug with PyObject_CallMethod. The change to its documentation to differ from PyObject_CallFunction was changed back in 2004. It should have been updated then to reflect the already well-entrenched behavior of those 2 (at the time) functions, but that ship has sailed.

It would be a huge mistake to change how they handle the format strings now as they have operated they way they have since at least Python 1.5.2 (when I learned Python's C API). There is just too much C code that would potentially break (third-party C extensions).

I think that a good change for the docs would be to separate the specification of the "building format string" away from the Py_BuildValue function so as to allow for differences in how the resulting object is created in those C functions that use that specification. Similar, I suppose, to how the "parsing format string" is defined independently from PyArg_ParseTuple.

Now to the "attractive nuisance" that the single argument passed as varargs is handled. I believe it may be best to introduce a new format character just for this purpose. Possibly "T" (for tuple) or "V" (for varargs) using uppercase as that seems to be the practice for referencing objects. And at the same time, add a check in the "call" functions (those that *use* Py_VaBuildValue to create a argument tuple, of which there are also internal ones). The check could be as simple as:

  if (format[0] == 'O' && format[1] == '\0') {
    va_start(va, format);
    PyObject *ob = (PyObject *)va_arg(va, PyObject *);
    if (ob != NULL) {
      if (PyTuple_Check(ob)) {
        if (PyErr_WarnFormat(PyExc_DeprecationWarning, 1,
                "use 'V' format code to support varargs")) {
          args = NULL;
        }
        else {
          args = ob;
        }
      }
      else {
        args = PyTuple_Pack(1, ob);
      }
    }
    else if (!PyErr_Occurred()) {
      PyErr_SetString(PyExc_SystemError, "argument is NULL");
      args = NULL;
    }
    va_end(va);
  }
  else {
    args = //...whatever happens now...
  }

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs)
Projects
None yet
Development

No branches or pull requests

1 participant