Buffers: Acquisition counted buffers #871

Open
robertwb opened this issue May 6, 2009 · 4 comments

Comments

@robertwb
Contributor

robertwb commented May 6, 2009

Currently when one does

cdef object[int] a = ..., b
b = a

then the second line does a full acquisition of the buffer. This is unnecessary, as a already has all that is needed.

To implement this in full generality one needs a reference count on the acquired Py_buffer struct.

While the immediate benefit may seem small, this becomes more crucial if http://trac.cython.org/ticket/178 is implemented, where each slice would need a new reference to the buffer. The time taken by the tp_getbuffer of a given object is outside of our control, and we can't even be guaranteed to get back the same buffer each time, which makes slicing less robust.

'''Implementation notes'''

It seems clear that one needs a reference count allocated on the heap, since we want to eventually be able to pass buffers to other functions (http://trac.cython.org/ticket/177) and store them in fields and global variables (http://trac.cython.org/ticket/301). So something like:

typedef struct {
  size_t refcount;
  Py_buffer bufinfo;
} __Pyx_Buffer;

and then clients would, instead of having Py_buffers on the stack, malloc and free a __Pyx_Buffer which would be shared among all users of it.
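
A minimal sketch of how such a shared, counted buffer could behave. This is an illustration only, not Cython's implementation: demo_bufinfo is a stand-in for Py_buffer so the example compiles without Python.h, and demo_Buffer, demo_buffer_new, demo_buffer_incref, demo_buffer_decref and demo_release_calls are all hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for Py_buffer (assumption): only the fields needed here. */
typedef struct {
    void *buf;
    size_t len;
} demo_bufinfo;

/* Same shape as the struct proposed in the ticket. */
typedef struct {
    size_t refcount;
    demo_bufinfo bufinfo;
} demo_Buffer;

/* Counts how often the simulated exporter was released. */
static int demo_release_calls = 0;

/* Acquire once: heap-allocate the shared record with a single owner. */
static demo_Buffer *demo_buffer_new(void *data, size_t len) {
    demo_Buffer *b = malloc(sizeof *b);
    if (b == NULL) return NULL;
    b->refcount = 1;
    b->bufinfo.buf = data;
    b->bufinfo.len = len;
    return b;
}

/* "b = a": share the existing acquisition instead of acquiring again. */
static demo_Buffer *demo_buffer_incref(demo_Buffer *b) {
    assert(b != NULL && b->refcount > 0);
    b->refcount++;
    return b;
}

/* Drop one owner; release the buffer and free the record on the last one. */
static void demo_buffer_decref(demo_Buffer *b) {
    assert(b != NULL && b->refcount > 0);
    if (--b->refcount == 0) {
        demo_release_calls++;  /* a real version would call PyBuffer_Release here */
        free(b);
    }
}
```

With this shape, the assignment b = a costs one increment, and the exporter is only asked to release once, when the last user lets go.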

With regard to http://trac.cython.org/ticket/299, this would mean that the structs described there would keep a pointer instead. I'll add a comment there.

(One could keep the Py_buffer on the stack and copy it by value, since only the refcount absolutely needs to live on the heap. However, it is the malloc/free that is expensive, so one might as well save some hassle and memory. ''The original must be kept anyway, as we are not allowed to change it before passing it back to releasebuffer.'')
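
The constraint in the note above, that the original must stay unchanged until it is handed back to releasebuffer, can be sketched as follows. This is a hedged illustration with hypothetical names (view_bufinfo as a stand-in for Py_buffer, view_slice, view_make_slice): each user keeps its own adjusted pointer and length and only reads the original.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for Py_buffer (assumption): only the fields needed here. */
typedef struct {
    char *buf;
    size_t len;
} view_bufinfo;

/* Per-user view: adjusted values live here, never in the original. */
typedef struct {
    const view_bufinfo *orig;  /* untouched, later passed back on release */
    char *data;                /* start of this user's window */
    size_t len;                /* length of this user's window in bytes */
} view_slice;

/* Build a view over bytes [start, stop) without modifying *orig. */
static view_slice view_make_slice(const view_bufinfo *orig,
                                  size_t start, size_t stop) {
    view_slice v;
    assert(orig != NULL && start <= stop && stop <= orig->len);
    v.orig = orig;
    v.data = orig->buf + start;
    v.len = stop - start;
    return v;
}
```

Because the view takes a const pointer, the compiler rejects accidental writes to the original record through it.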

As a possible optimization, if a buffer is never passed on to another variable/function then one can use a Py_buffer on the stack instead. Not a priority though.

Migrated from http://trac.cython.org/ticket/311

@robertwb
Contributor Author

robertwb commented May 6, 2009

@dagss commented

Kurt: I tagged this as kurtgsoc only because the details would be relevant for you, not because you have to do it in your GSoC.

@robertwb
Contributor Author

robertwb commented May 6, 2009

@dagss changed description: added to the implementation notes the reason the reference count must live on the heap (passing buffers to other functions, http://trac.cython.org/ticket/177, and storing them in fields and global variables, http://trac.cython.org/ticket/301), and a first draft of the parenthetical note about keeping the Py_buffer on the stack.

@robertwb
Contributor Author

robertwb commented May 6, 2009

@dagss changed description: completed the parenthetical note about keeping the Py_buffer on the stack (the malloc/free is the expensive part, so one might as well save some hassle and memory).

@robertwb
Contributor Author

robertwb commented May 6, 2009

@dagss changed description: added the paragraph relating this ticket to http://trac.cython.org/ticket/299 (the structs there would keep a pointer instead).
