Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modify the _struct module to use FASTCALL and Argument Clinic #73486

Closed
vstinner opened this issue Jan 17, 2017 · 48 comments
Closed

Modify the _struct module to use FASTCALL and Argument Clinic #73486

vstinner opened this issue Jan 17, 2017 · 48 comments
Labels
3.7 expert-argument-clinic performance

Comments

@vstinner
Copy link
Member

@vstinner vstinner commented Jan 17, 2017

BPO 29300
Nosy @vstinner, @larryhastings, @methane, @skrah, @vadmium, @serhiy-storchaka
Files
  • struct_fastcall.patch
  • struct_fastcall-2.patch
  • struct_fastcall-3.patch
  • struct_fastcall-4.patch
  • struct_fastcall-5.patch
  • struct_fastcall-6.patch
  • struct_fastcall-7.patch
  • cache_struct.patch
  • unpack_buffer.patch
  • cache_struct-2.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2017-02-10.11:33:37.652>
    created_at = <Date 2017-01-17.16:31:45.553>
    labels = ['3.7', 'expert-argument-clinic', 'performance']
    title = 'Modify the _struct module to use FASTCALL and Argument Clinic'
    updated_at = <Date 2017-02-10.11:33:37.651>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2017-02-10.11:33:37.651>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2017-02-10.11:33:37.652>
    closer = 'vstinner'
    components = ['Argument Clinic']
    creation = <Date 2017-01-17.16:31:45.553>
    creator = 'vstinner'
    dependencies = []
    files = ['46319', '46422', '46431', '46432', '46434', '46436', '46488', '46489', '46490', '46492']
    hgrepos = []
    issue_num = 29300
    keywords = ['patch']
    message_count = 48.0
    messages = ['285663', '285669', '286326', '286327', '286329', '286346', '286348', '286349', '286350', '286351', '286357', '286358', '286364', '286367', '286373', '286645', '286659', '286712', '286756', '286761', '286766', '286767', '286768', '286770', '286771', '286772', '286773', '286774', '286775', '286776', '286781', '286782', '286783', '286784', '286785', '286786', '286789', '286790', '286791', '286792', '286795', '286799', '286800', '286834', '286934', '286943', '287125', '287517']
    nosy_count = 7.0
    nosy_names = ['vstinner', 'larry', 'methane', 'skrah', 'python-dev', 'martin.panter', 'serhiy.storchaka']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue29300'
    versions = ['Python 3.7']

    @vstinner
    Copy link
    Member Author

    @vstinner vstinner commented Jan 17, 2017

    Attached patch modify the _struct module to use FASTCALL and Argument Clinic.

    AC requires a summary line, so I duplicated the first sentence of iter_unpack() docstring:

    iter_unpack(fmt, buffer, /)
        Return an iterator yielding tuples unpacked from the given bytes.
        
        Return an iterator yielding tuples unpacked from the given bytes
        source according to the format string, like a repeated invocation of
        unpack_from().  Requires that the bytes length be a multiple of the
        format struct size.

    @vstinner vstinner added 3.7 expert-argument-clinic performance labels Jan 17, 2017
    @serhiy-storchaka
    Copy link
    Member

    @serhiy-storchaka serhiy-storchaka commented Jan 17, 2017

    Why not just convert the struct module to Argument Clinic?

    @vstinner
    Copy link
    Member Author

    @vstinner vstinner commented Jan 26, 2017

    Patch version 2 converts most functions and methods to Argument Clinic, except of unpack() and unpack_into() which require "*args" support in Argument Clinic: see the issue bpo-20291.

    @vstinner
    Copy link
    Member Author

    @vstinner vstinner commented Jan 26, 2017

    I reformatted docstrings, and wrote a summary line.

    @serhiy-storchaka
    Copy link
    Member

    @serhiy-storchaka serhiy-storchaka commented Jan 26, 2017

    It looks to me that the Py_buffer converter can be used for all the buffer parameter and special purposed converter can be used for the fmt parameter.

    @vstinner
    Copy link
    Member Author

    @vstinner vstinner commented Jan 27, 2017

    Patch version 3:

    • Use also Argument Clinic for Struct.__init__()
    • Rename fmt to format: in the code and docstring. By the way, Struct docstring was wrong: Struct() accepts a 'format' keyword argument, not 'fmt'
    • Use Py_buffer converter for the buffer parameter, except for iter_unpack()
    • Document that unpack_from() now accepts keywords and write an unit test
    • Fix Struct.__init__() error message: the method accepts also Unicode
    • Document that Struct accepts str encodable to ASCII and bytes

    I would be nice to use Py_buffer format for the "buffer" argument of iter_unpack(), but Argument Clinic calls PyBuffer_Release(&buffer), whereas the buffer should stay alive after the function exit. Is there a way to clone/copy a Py_buffer?

    For the fmt/format argument: s_init() uses custom code to accept Unicode encodable to ASCII and bytes objects. Using Argument Clinic to check fmt type would require to write a converter. I would prefer to not make too many changes in a single patch, and I don't know how to write such converter. I suggest to do that later and keep the existing code. It seems like all functions getting a format ends in s_init() to check the format argument.

    @serhiy-storchaka
    Copy link
    Member

    @serhiy-storchaka serhiy-storchaka commented Jan 27, 2017

    The converter for the format parameter already exists. It is cache_struct(). You just need to change its interface.

    • Fix Struct.__init__() error message: the method accepts also Unicode
    • Document that Struct accepts str encodable to ASCII and bytes

    I think this should be backported to older versions. Maybe there is an open issue for this.

    I would be nice to use Py_buffer format for the "buffer" argument of iter_unpack(), but Argument Clinic calls PyBuffer_Release(&buffer), whereas the buffer should stay alive after the function exit. Is there a way to clone/copy a Py_buffer?

    I agreed that the Py_buffer converter is not applicable here. Just object is good.

    @vstinner
    Copy link
    Member Author

    @vstinner vstinner commented Jan 27, 2017

    The converter for the format parameter already exists. It is cache_struct(). You just need to change its interface.

    The code to convert the format string is in s_init(). The code increases the reference counter, I don't see how to produce the Py_DECREF() code.

    Again, I would prefer to do that later. The change is already big enough :-)

    I think this should be backported to older versions.

    Ok, will do once this change is accepted.

    @serhiy-storchaka
    Copy link
    Member

    @serhiy-storchaka serhiy-storchaka commented Jan 27, 2017

    static int
    cache_struct_converter(PyObject *arg, PyObject **s_object)
    {
        if (arg == NULL) {
            Py_DECREF();
            return 1;
        }
        *s_object = cache_struct(arg); // actually inline this
        if (*s_object == NULL)
            return 0;
        return Py_CLEANUP_SUPPORTED;
    }

    @serhiy-storchaka
    Copy link
    Member

    @serhiy-storchaka serhiy-storchaka commented Jan 27, 2017

    See also ascii_buffer_converter in Modules/binascii.c for example.

    @vstinner
    Copy link
    Member Author

    @vstinner vstinner commented Jan 27, 2017

    Thank you for your reviews Serhiy :-) Here is a new patch to take in account your latest comments.

    I didn't change the API anymore: Struct.unpack_from() accepts keywords, unpack_from() doesn't.

    @vstinner
    Copy link
    Member Author

    @vstinner vstinner commented Jan 27, 2017

    See also ascii_buffer_converter in Modules/binascii.c for example.

    Nice. I didn't know the "def cleanup" feature. I used it in my patch version 4.

    @serhiy-storchaka
    Copy link
    Member

    @serhiy-storchaka serhiy-storchaka commented Jan 27, 2017

    I said not about the converter that converts format to bytes (it is needed only in Struct.__init__), but about the converter that converts format to the Struct object. It is used in all module-level functions.

    @vstinner
    Copy link
    Member Author

    @vstinner vstinner commented Jan 27, 2017

    New patch version 5 without the "format_converter".

    Serhiy Storchaka: I said not about the converter that converts format to bytes (it is needed only in Struct.__init__),

    For me, it makes sense to check format type earlier than Struct.__init__(), but since you like the current code, I'm also ok to leave this part of the code unchanged :-)

    Serhiy: but about the converter that converts format to the Struct object. It is used in all module-level functions.

    Sorry, but I dislike this change. Even if multiple functions do the same, for me creating a Struct object is not something part of the argument parsing. I prefer to keep the following 3 lines in each C function *body* to ease debugging:

        PyObject *s_object = cache_struct(format);
        if (s_object == NULL)
            return NULL;

    IMHO my struct_fastcall-5.patch is already big enough. I prefer incremental changes.

    I have no strong opinion on cache_struct(), just a matter of taste :-) And others may like your change!

    Feel free to write your own patch on top of mine, once mine is merged ;-)

    @vstinner
    Copy link
    Member Author

    @vstinner vstinner commented Jan 27, 2017

    Updated patch, version 6:

    • Rename "obj" variable to "iter"
    • Fix unpack_from() signature: only the first parameter is positional only

    @vstinner
    Copy link
    Member Author

    @vstinner vstinner commented Feb 1, 2017

    Serhiy: Would you mind to review my latest patch, struct_fastcall-6.patch? It should address all your remarks, except your proposal to write a converter for cache_struct(): see my previous comment about this idea. Thanks in advance ;)

    @serhiy-storchaka
    Copy link
    Member

    @serhiy-storchaka serhiy-storchaka commented Feb 1, 2017

    It looks to me that the type of the self parameter can be changed from PyObject* to PyStructObject*. This will make the patch larger but the final code simpler.

    class Struct "PyStructObject *" "&PyStructType"
    

    @vstinner
    Copy link
    Member Author

    @vstinner vstinner commented Feb 1, 2017

    It looks to me that the type of the self parameter can be changed from PyObject* to PyStructObject*. This will make the patch larger but the final code simpler.

    I like the idea, but I prefer to do in a separated change, once the first big one is merged. Otherwise, the first patch will do too many unrelated things at the same time IMHO.

    @serhiy-storchaka
    Copy link
    Member

    @serhiy-storchaka serhiy-storchaka commented Feb 2, 2017

    This doesn't make the patch much bigger. Usually such kind of changes are made in one patch.

    Here is a patch that also changes the type of the self parameter to PyStructObject*.

    struct_fastcall-6.patch LGTM, but I prefer struct_fastcall-7.patch.

    @vstinner
    Copy link
    Member Author

    @vstinner vstinner commented Feb 2, 2017

    • if (PyObject_GetBuffer(input, &vbuf, PyBUF_SIMPLE) < 0)
    •    return NULL;
      

    Oh by the way, I forgot to mention a subtle change. PyObject_GetBuffer(PyBUF_SIMPLE) is less strict that PyArg_Parse("y#") / "buffer" converter of Argument Clinic: getargs.c also checks that the buffer is contiguous, extract of getbuffer():

        if (!PyBuffer_IsContiguous(view, 'C')) {
            PyBuffer_Release(view);
            *errmsg = "contiguous buffer";
            return -1;
        }

    I don't know well the buffer protocol. I don't know any object which provide a non-contiguous buffer. At least, I can say that the last time I looked at this dark part of Python, the documentation was between tiny and non-existent :-/ The buffer protocol is complex but not well documented :-(

    @python-dev
    Copy link
    Mannequin

    @python-dev python-dev mannequin commented Feb 2, 2017

    New changeset f3ff4a3ce77c by Victor Stinner in branch 'default':
    Issue bpo-29300: Convert _struct module to Argument Clinic
    https://hg.python.org/cpython/rev/f3ff4a3ce77c

    @vstinner
    Copy link
    Member Author

    @vstinner vstinner commented Feb 2, 2017

    I also prefer Serhiy's struct_fastcall-7.patch over my struct_fastcall-6.patch.

    Serhiy Storchaka: "This doesn't make the patch much bigger. Usually such kind of changes are made in one patch."

    Well, it's just that you prefer to reduce the number of commits and so fold tiny changes into a single big commit, whereas I now prefer multiple tiny commits. I have not strong preferences between the two ways to commit, so I pushed your patch.

    Here is the full commit message, since Roundbot bot only shows the first line:
    ---
    Issue bpo-29300: Convert _struct module to Argument Clinic

    • The struct module now requires contiguous buffers.
    • Convert most functions and methods of the _struct module to Argument Clinic
    • Use "Py_buffer" type for the "buffer" argument. Argument Clinic is
      responsible to create and release the Py_buffer object.
    • Use "PyStructObject *" type for self to avoid explicit conversions.
    • Add an unit test on the _struct.Struct.unpack_from() method to test passing
      arguments as keywords.
    • Rephrase docstrings.
    • Rename "fmt" argument to "format" in docstrings and the documentation.

    As a side effect, functions and methods which used METH_VARARGS calling
    convention like struct.pack() now use the METH_FASTCALL calling convention
    which avoids the creation of temporary tuple to pass positional arguments and
    so is faster. For example, struct.pack("i", 1) becomes 1.56x faster (-36%)::

        $ ./python -m perf timeit \
            -s 'import struct; pack=struct.pack' 'pack("i", 1)' \
            --compare-to=../default-ref/python
        Median +- std dev: 119 ns +- 1 ns -> 76.8 ns +- 0.4 ns: 1.56x faster (-36%)
        Significant (t=295.91)

    Patch co-written with Serhiy Storchaka.
    ---

    So yeah, the "side effect" is that struct.pack("i", 1) becomes 1.56x faster (-36%). Ok, maybe it was my main goal ;-) I also mentioned the "new" (?) contiguous requirement on buffers.

    Thanks Serhiy for your multiple reviews, very useful as usual. I like the final result: better documentation, better docstrings and better code!

    @vstinner vstinner closed this as completed Feb 2, 2017
    @vadmium
    Copy link
    Member

    @vadmium vadmium commented Feb 2, 2017

    Shouldn’t the top-level unpack() parameter be called “buffer” like the other functions and methods, not “inputstr”?

    @serhiy-storchaka
    Copy link
    Member

    @serhiy-storchaka serhiy-storchaka commented Feb 2, 2017

    Propose for your consideration a patch that uses Argument Clinic for getting possible cached struct object from the format. This simplifies the implementation of module level functions.

    @vadmium
    Copy link
    Member

    @vadmium vadmium commented Feb 2, 2017

    FYI Victor, you can make non-C-contiguous buffers by slicing memoryview:

    >>> struct.unpack(">L", memoryview(b"1234")[::-1])
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    BufferError: memoryview: underlying buffer is not C-contiguous

    Can also use the built-in _testbuffer module to create stranger buffers.

    @vstinner
    Copy link
    Member Author

    @vstinner vstinner commented Feb 2, 2017

    Martin> Shouldn’t the top-level unpack() parameter be called “buffer” like the other functions and methods, not “inputstr”?

    Hum. I reopen the issue.

    Attached patch renames unpack() "inputstr" argument to "buffer" and uses the Py_buffer type for it. I had to fix unit tests which passed str instead of bytes to buffer. Before this error was missed because the unit test is written to test the format string value, not the type of arguments.

    @serhiy-storchaka
    Copy link
    Member

    @serhiy-storchaka serhiy-storchaka commented Feb 2, 2017

    Oh by the way, I forgot to mention a subtle change.
    PyObject_GetBuffer(PyBUF_SIMPLE) is less strict that PyArg_Parse("y#") /
    "buffer" converter of Argument Clinic: getargs.c also checks that the
    buffer is contiguous, extract of getbuffer():

    We already made such changes in the past. The difference is subtle and I have
    doubts that choosing any of ways was deliberate.

    @serhiy-storchaka
    Copy link
    Member

    @serhiy-storchaka serhiy-storchaka commented Feb 2, 2017

    So yeah, the "side effect" is that struct.pack("i", 1) becomes 1.56x faster
    (-36%). Ok, maybe it was my main goal ;-) I also mentioned the "new" (?)
    contiguous requirement on buffers.

    struct.pack() always was faster than int.to_bytes(). I wanted to speed up
    int.to_bytes(), and after converting to Argument Clinic in bpo-20185 it have
    became faster than struct.pack(). But after converting the struct module to
    Argument Clinic struct.pack() is faster than int.to_bytes() again! Now I need
    to find other ways to make int.to_bytes() even faster to win this chase.

    @serhiy-storchaka
    Copy link
    Member

    @serhiy-storchaka serhiy-storchaka commented Feb 2, 2017

    unpack_buffer.patch LGTM. Please backport test changes and other changes discussed before to other branches.

    @vstinner
    Copy link
    Member Author

    @vstinner vstinner commented Feb 2, 2017

    Serhiy Storchaka: "But after converting the struct module to Argument Clinic struct.pack() is faster than int.to_bytes() again!"

    Sorry about that ;-)

    Serhiy Storchaka: "Now I need to find other ways to make int.to_bytes() even faster to win this chase."

    I ran a microbenchmark:

    $ ./python -m perf timeit -s 'to_bytes=int.to_bytes' 'to_bytes(1, 4, "little")'

    Reference: ~154 ns

    Replace int_to_bytes_impl() body with:
    PyBytes_FromStringAndSize("1", 1)
    => ~120 ns (-34 ns)

    Replace int_to_bytes() body with:
    return int_to_bytes_impl(self, 4, NULL, 1);
    => ~76 ns (-44 ns)

    _PyArg_ParseStackAndKeywords() with _PyArg_Parser{"nU|$p:to_bytes"} takes 44 ns on a total of 154 ns. 29% of the runtime is spent on parsing arguments.

    If you want to optimize further int.to_bytes(), IMHO we should explore the issue bpo-29419: "Argument Clinic: inline PyArg_UnpackTuple and PyArg_ParseStack(AndKeyword)?".

    @python-dev
    Copy link
    Mannequin

    @python-dev python-dev mannequin commented Feb 2, 2017

    New changeset 32380d41e788 by Victor Stinner in branch '3.5':
    Issue bpo-29300: test_struct tests unpack_from() with keywords
    https://hg.python.org/cpython/rev/32380d41e788

    @python-dev
    Copy link
    Mannequin

    @python-dev python-dev mannequin commented Feb 2, 2017

    New changeset faa1e4f4b156 by Victor Stinner in branch 'default':
    Rename struct.unpack() 2nd parameter to "buffer"
    https://hg.python.org/cpython/rev/faa1e4f4b156

    @vstinner
    Copy link
    Member Author

    @vstinner vstinner commented Feb 2, 2017

    Serhiy Storchaka: "We already made such changes in the past. The difference is subtle and I have
    doubts that choosing any of ways was deliberate."

    Right. IMHO it's safe to make sure that the buffer is contiguous. I'm quite sure that the code doesn't support non-contiguous buffers.

    Serhiy Storchaka: "Please backport test changes and other changes discussed before to other branches."

    Done. Sorry, I forgot this part.

    Serhiy Storchaka: "unpack_buffer.patch LGTM."

    Thanks for the review, it's now merged. It was a regression in unpack() docstring, Python 3.5 docstring contains "unpack(fmt, buffer)".

    @vstinner
    Copy link
    Member Author

    @vstinner vstinner commented Feb 2, 2017

    Martin Panter: """FYI Victor, you can make non-C-contiguous buffers by slicing memoryview:

    >>> struct.unpack(">L", memoryview(b"1234")[::-1])
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    BufferError: memoryview: underlying buffer is not C-contiguous"""

    Oh, it means that the Argument Clinic change doesn't add new checks on the buffer? In Python 3.6, memory_getbuf() raises an exception in the following code:

        if (!REQ_STRIDES(flags)) {
            if (!MV_C_CONTIGUOUS(baseflags)) {
                PyErr_SetString(PyExc_BufferError,
                    "memoryview: underlying buffer is not C-contiguous");
                return -1;
            }
            view->strides = NULL;
        }

    I undersrtand that memory_getbuf() is smart enough to raise an exception becaues the buffer is not contiguous. But a weaker implementation of getbuffer may not implement such check, whereas getbuffer() double check that the buffer is C-contiguous. Am I right?

    @vstinner
    Copy link
    Member Author

    @vstinner vstinner commented Feb 2, 2017

    I like the overall cache_struct.patch change, but I have questions: see my review.

    @python-dev python-dev mannequin closed this as completed Feb 2, 2017
    @skrah
    Copy link
    Mannequin

    @skrah skrah mannequin commented Feb 2, 2017

    PyBUF_SIMPLE implies C-contiguous for a conforming buffer provider.

    @vstinner
    Copy link
    Member Author

    @vstinner vstinner commented Feb 2, 2017

    Stefan Krah: "PyBUF_SIMPLE implies C-contiguous for a conforming buffer provider."

    Oh ok, thanks for the confirmation. So my change didn't add new checks and my commit message is wrong :-) Only non-conforming objects are impacted, but in such case, it's more a bug in the object than a bug in struct. I guess that non-conforming non-contiguous objects are very rare in the wild ;-)

    The new code is more strict, so is safe in all corner cases.

    @serhiy-storchaka
    Copy link
    Member

    @serhiy-storchaka serhiy-storchaka commented Feb 2, 2017

    Dear Roundup Robot, please don't close issues.

    @vstinner
    Copy link
    Member Author

    @vstinner vstinner commented Feb 2, 2017

    Serhiy Storchaka: "Dear Roundup Robot, please don't close issues."

    For what it's worth, I reported issues of this robot to python-dev:

    https://mail.python.org/pipermail/python-dev/2017-February/147317.html

    https://mail.python.org/pipermail/python-dev/2017-February/147325.html

    @vstinner
    Copy link
    Member Author

    @vstinner vstinner commented Feb 2, 2017

    cache_struct-2.patch LGTM.

    @python-dev
    Copy link
    Mannequin

    @python-dev python-dev mannequin commented Feb 4, 2017

    New changeset 9245894af223 by Serhiy Storchaka in branch 'default':
    Issue bpo-29300: Use Argument Clinic for getting struct object from the format.
    https://hg.python.org/cpython/rev/9245894af223

    @python-dev
    Copy link
    Mannequin

    @python-dev python-dev mannequin commented Feb 4, 2017

    New changeset e21cda7 by Serhiy Storchaka in branch 'master':
    Issue bpo-29300: Use Argument Clinic for getting struct object from the format.
    e21cda7

    @vstinner
    Copy link
    Member Author

    @vstinner vstinner commented Feb 6, 2017

    @serhiy: Can we please now close this issue? Or is there still something to do?

    @vstinner
    Copy link
    Member Author

    @vstinner vstinner commented Feb 10, 2017

    I'm not aware of any pending issues, buildbots are happy, I'm happy, I close the issue :-) Don't hesitate to reopen it if I missed something.

    @python-dev
    Copy link
    Mannequin

    @python-dev python-dev mannequin commented Apr 9, 2022

    New changeset e552e18 by Victor Stinner in branch 'master':
    Issue bpo-29300: Convert _struct module to Argument Clinic
    e552e18

    @python-dev
    Copy link
    Mannequin

    @python-dev python-dev mannequin commented Apr 9, 2022

    New changeset a88eb4f by Victor Stinner in branch 'master':
    Issue bpo-29300: test_struct tests unpack_from() with keywords
    a88eb4f

    New changeset dcd4b1a by Victor Stinner in branch 'master':
    Rename struct.unpack() 2nd parameter to "buffer"
    dcd4b1a

    @python-dev
    Copy link
    Mannequin

    @python-dev python-dev mannequin commented Apr 9, 2022

    New changeset a88eb4f by Victor Stinner in branch '3.5':
    Issue bpo-29300: test_struct tests unpack_from() with keywords
    a88eb4f

    @python-dev
    Copy link
    Mannequin

    @python-dev python-dev mannequin commented Apr 9, 2022

    New changeset a88eb4f by Victor Stinner in branch '3.6':
    Issue bpo-29300: test_struct tests unpack_from() with keywords
    a88eb4f

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 expert-argument-clinic performance
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants