-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Python] .nbytes
should release the GIL while calculating the data size
#39096
Comments
.nbytes
blocks the GIL while calculating the data size.nbytes
should release the GIL while calculating the data size
No, I don't think there is a specific reason for not releasing the GIL, apart from missing to do that when this was added (or assuming this was cheap, I think initially |
jorisvandenbossche
pushed a commit
that referenced
this issue
Dec 7, 2023
### Rationale for this change The `.nbytes` holds the GIL while computing the data size in C++, which has caused performance issues in Dask because threads were blocking each other See #39096 ### Are these changes tested? I am not sure if additional tests are necessary here. If so, I'm happy to add them but would welcome some pointers. ### Are there any user-facing changes? No * Closes: #39096 Authored-by: Hendrik Makait <hendrik@makait.com> Signed-off-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
mapleFU
pushed a commit
to mapleFU/arrow
that referenced
this issue
Dec 13, 2023
### Rationale for this change The `.nbytes` holds the GIL while computing the data size in C++, which has caused performance issues in Dask because threads were blocking each other See apache#39096 ### Are these changes tested? I am not sure if additional tests are necessary here. If so, I'm happy to add them but would welcome some pointers. ### Are there any user-facing changes? No * Closes: apache#39096 Authored-by: Hendrik Makait <hendrik@makait.com> Signed-off-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
dgreiss
pushed a commit
to dgreiss/arrow
that referenced
this issue
Feb 19, 2024
### Rationale for this change The `.nbytes` holds the GIL while computing the data size in C++, which has caused performance issues in Dask because threads were blocking each other See apache#39096 ### Are these changes tested? I am not sure if additional tests are necessary here. If so, I'm happy to add them but would welcome some pointers. ### Are there any user-facing changes? No * Closes: apache#39096 Authored-by: Hendrik Makait <hendrik@makait.com> Signed-off-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Describe the enhancement requested
The
.nbytes
holds the GIL while computing the data size in C++, which has caused performance issues in Dask because threads were blocking each other. Am I missing a valid reason for blocking the GIL during the computation? From my understanding, there should be no possible race conditions. Similarly, I noticed that.get_total_buffer_size
holds the GIL as well, but I'm not sure if that's a problem because the computation is much cheaper.Component(s)
Python
The text was updated successfully, but these errors were encountered: