Reduce memory allocations in C++ to PyList Conversions #4081
Conversation
**Cython/Utility/CppConvert.pyx**

```cython
@cname("{{cname}}")
cdef object {{cname}}(const vector[X]& v):
    o = PyList_New(v.size())
    cdef size_t i = 0
```
Does this need to be `size_t` instead of the usual `Py_ssize_t` for indexing?
Suggested change:

```diff
-    cdef size_t i = 0
+    cdef Py_ssize_t i
```
This might just be a mismatch where no option can ever be "right": `size_t` matches the type of `v.size()`, but not `PyList_SET_ITEM`. Possibly the right thing to do is to raise an error if `v.size() > PY_SSIZE_T_MAX` and then use `Py_ssize_t` after that check? But I doubt that many people have the memory to run that test case.
I changed it to `Py_ssize_t`. Even for a bit vector (`vector[bool]`) overflowing it would require 1.07E+09 GiB of memory (plus the allocation for the PyList), so I do not think this is a concern. In CPython this would already cause an exception because a negative size is specified, while in PyPy it would allocate an empty list:

https://github.com/mozillazg/pypy/blob/11af013b3723c15c9ce51f68fd5c2db51f74ba3d/pypy/module/cpyext/listobject.py#L23

I personally do not see anyone running into this issue, given the large memory requirement.
Would be nice to get the safety checks in so that this can be merged.
@scoder I added the overflow check for the `Py_ssize_t` conversions.
Appears that I killed something. I will try to fix this later today.
@scoder I just checked the generated functions, and at least `__pyx_convert_vector_to_py_double` looks wrong:

```c
static PyObject *__pyx_convert_vector_to_py_double(std::vector<double> const &__pyx_v_v) {
  PyObject *__pyx_v_o = NULL;
  Py_ssize_t __pyx_v_i;
  double __pyx_v_item;
  PyObject *__pyx_r = NULL;
  __Pyx_RefNannyDeclarations
  PyObject *__pyx_t_1 = NULL;
  size_t __pyx_t_2;
  size_t __pyx_t_3;
  Py_ssize_t __pyx_t_4;
  int __pyx_lineno = 0;
  const char *__pyx_filename = NULL;
  int __pyx_clineno = 0;
  __Pyx_RefNannySetupContext("__pyx_convert_vector_to_py_double", 0);

  /* "vector.to_py":70
 *
 *
 *     o = PyList_New(<Py_ssize_t>v.size())             # <<<<<<<<<<<<<<
 *     cdef Py_ssize_t i
 *     for i in range(v.size()):
 */
  /* __pyx_t_1 allocated (Python object) */
  __pyx_t_1 = PyList_New(((Py_ssize_t)__pyx_v_v.size())); if (unlikely(!__pyx_t_1)) __PYX_ERR(1, 70, __pyx_L1_error)
  __Pyx_GOTREF(__pyx_t_1);
  __pyx_v_o = ((PyObject*)__pyx_t_1);
  __pyx_t_1 = 0;
  /* __pyx_t_1 released */

  /* "vector.to_py":72
 *     o = PyList_New(<Py_ssize_t>v.size())
 *     cdef Py_ssize_t i
 *     for i in range(v.size()):             # <<<<<<<<<<<<<<
 *         item = v[i]
 *         Py_INCREF(item)
 */
  /* __pyx_t_2 allocated (size_t) */
  __pyx_t_2 = __pyx_v_v.size();
  /* __pyx_t_3 allocated (size_t) */
  __pyx_t_3 = __pyx_t_2;
  /* __pyx_t_4 allocated (Py_ssize_t) */
  for (__pyx_t_4 = 0; __pyx_t_4 < __pyx_t_3; __pyx_t_4+=1) {
    __pyx_v_i = __pyx_t_4;

    /* "vector.to_py":73
 *     cdef Py_ssize_t i
 *     for i in range(v.size()):
 *         item = v[i]             # <<<<<<<<<<<<<<
 *         Py_INCREF(item)
 *         PyList_SET_ITEM(o, i, item)
 */
    __pyx_v_item = (__pyx_v_v[__pyx_v_i]);

    /* "vector.to_py":74
 *     for i in range(v.size()):
 *         item = v[i]
 *         Py_INCREF(item)             # <<<<<<<<<<<<<<
 *         PyList_SET_ITEM(o, i, item)
 *     return o
 */
    /* __pyx_t_1 allocated (Python object) */
    __pyx_t_1 = PyFloat_FromDouble(__pyx_v_item); if (unlikely(!__pyx_t_1)) __PYX_ERR(1, 74, __pyx_L1_error)
    __Pyx_GOTREF(__pyx_t_1);
    Py_INCREF(__pyx_t_1);
    __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;
    /* __pyx_t_1 released */

    /* "vector.to_py":75
 *         item = v[i]
 *         Py_INCREF(item)
 *         PyList_SET_ITEM(o, i, item)             # <<<<<<<<<<<<<<
 *     return o
 *
 */
    /* __pyx_t_1 allocated (Python object) */
    __pyx_t_1 = PyFloat_FromDouble(__pyx_v_item); if (unlikely(!__pyx_t_1)) __PYX_ERR(1, 75, __pyx_L1_error)
    __Pyx_GOTREF(__pyx_t_1);
    PyList_SET_ITEM(__pyx_v_o, __pyx_v_i, __pyx_t_1);
    __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;
    /* __pyx_t_1 released */
  }
  /* __pyx_t_4 released */
  /* __pyx_t_3 released */
  /* __pyx_t_2 released */

  /* "vector.to_py":76
 *         Py_INCREF(item)
 *         PyList_SET_ITEM(o, i, item)
 *     return o             # <<<<<<<<<<<<<<
 *
 */
  __Pyx_XDECREF(__pyx_r);
  __Pyx_INCREF(__pyx_v_o);
  __pyx_r = __pyx_v_o;
  goto __pyx_L0;

  /* "vector.to_py":66
 *
 * @cname("__pyx_convert_vector_to_py_double")
 * cdef object __pyx_convert_vector_to_py_double(const vector[X]& v):             # <<<<<<<<<<<<<<
 *
 *
 */

  /* function exit code */
  __pyx_L1_error:;
  __Pyx_XDECREF(__pyx_t_1);
  __Pyx_AddTraceback("vector.to_py.__pyx_convert_vector_to_py_double", __pyx_clineno, __pyx_lineno, __pyx_filename);
  __pyx_r = 0;
  __pyx_L0:;
  __Pyx_XDECREF(__pyx_v_o);
  __Pyx_XGIVEREF(__pyx_r);
  __Pyx_RefNannyFinishContext();
  return __pyx_r;
}
```

I do not understand why steps 3 and 4 (the `Py_INCREF`/`__Pyx_DECREF` pair and the duplicated `PyFloat_FromDouble` call) are generated; in a standalone example they do not appear.
In the line in question, I think this comes down to type inference. (I'm not completely clear on why the standalone example works in this case, but it's possible there are different settings for type inference. I think the compile options for utility code are now set to something consistent, but I'm not sure what.)
Thanks @da-woods, this fixed the problem :)
Thanks |
When compiling cythonized code which uses `std::vector`, we get the following compiler warning on GCC 8 and Python 3.9 (which is turned into an error in our case):

```
my_file.cpp: In function ‘PyObject* __pyx_convert_vector_to_py_int(const std::vector<int>&)’:
my_file.cpp:4716:33: warning: comparison of integer expressions of different signedness: ‘Py_ssize_t’ {aka ‘long int’} and ‘size_t’ {aka ‘long unsigned int’} [-Wsign-compare]
    for (__pyx_t_5 = 0; __pyx_t_5 < __pyx_t_4; __pyx_t_5+=1) {
        ~~~~~~~~~~^~~~~~~~~~~
```

The generated code in question is as follows:

```c
/* "vector.to_py":75
 *     cdef object item
 *
 *     for i in range(v.size()):             # <<<<<<<<<<<<<<
 *         item = v[i]
 *         Py_INCREF(item)
 */
  __pyx_t_3 = __pyx_v_v.size();
  __pyx_t_4 = __pyx_t_3;
  for (__pyx_t_5 = 0; __pyx_t_5 < __pyx_t_4; __pyx_t_5+=1) {
    __pyx_v_i = __pyx_t_5;
```

`__pyx_t_5` is of type `Py_ssize_t` (signed), and `__pyx_t_4` aka `__pyx_t_3` is `size_t` (unsigned), causing GCC to rightfully complain. Fix the generated code by explicitly using the signed variant of the vector's size in the loop.

This bug was introduced in cython#4081, which also contains some discussion on the use of signed vs unsigned types. This patch chooses to keep the status quo and only fixes the compiler warning.