Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switching deprecated Thread Local Storage (TLS) usage in Python 3.7 to Thread Specific Storage (TSS) #1454

Merged
merged 4 commits into from
Jul 17, 2018

Conversation

YannickJadoul
Copy link
Collaborator

@YannickJadoul YannickJadoul commented Jul 13, 2018

Python 3.7 deprecated the use of the 'Thread Local Storage (TLS) API', resulting in a bunch of warnings when compiling (see #1444). This PR switches between using the old TLS API and new TSS API, based on the Python version number.

See https://python.readthedocs.io/en/latest/c-api/init.html#thread-specific-storage-tss-api and https://www.python.org/dev/peps/pep-0539/ for details on the new API.

Potential issues:

  • Parts of the code where TLS was used were already subject to conditional compilation, macros and #ifs. But now a few more are added to switch between the old and new API (though only at a handful of places). Is it worth adding a wrapper API, to get rid of some of the preprocessor directives within the code?
  • The new Py_tss_t * key type of internals::tstate needs to be allocated (PyThread_tss_alloc()), but since the internals struct gets statically allocated once (and there is no destructor or other cleanup), it never gets PyThread_tss_freed (neither is the entry removed with PyThread_tss_delete). But then the old version never called PyThread_delete_key, either, and the struct has a static lifetime. Should there be a way of cleaning this up (e.g., through a Py_AtExit callback)?

@YannickJadoul
Copy link
Collaborator Author

That failing test seems to be due to brew installing Python 3.7 and the python3.6 command not being found.

@bstaletic
Copy link
Collaborator

I tried to change the .travis.yml to use python3.7, but that had a rather surprising behaviour. It raised these deprecation warnings that you mention, only they were treated as errors.

So to properly fix CI, we need this PR and this change.

@YannickJadoul
Copy link
Collaborator Author

Hmmmm, interesting. Seems to be this line in .travis.yml that's doing that (together with something in the tests/CMakeLists.txt).

Either way, I'll incorporate those changes in this one, then. Thanks! :-)

@YannickJadoul
Copy link
Collaborator Author

Hurray, checks are passing! Thanks, @bstaletic!

So this PR would also solve this problem the other recent PRs that are getting CI failures (i.e., #1455, #1453, #1448, ...)

pybind11_fail("get_internals: could not successfully initialize the TSS key!");
PyThread_tss_set(internals_ptr->tstate, tstate);
#else
internals_ptr->tstate = PyThread_create_key();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to check for failure status

@wjakob
Copy link
Member

wjakob commented Jul 17, 2018

This looks great! — I just added one minor comment.

@wjakob
Copy link
Member

wjakob commented Jul 17, 2018

Regarding this question:

Is it worth adding a wrapper API, to get rid of some of the preprocessor directives within the code?

Pybind11 already has a few #define PYBIND11_*s for cases where function names differ but the API is otherwise unchanged. This is AFAIK the case for most of the TSS calls except for the one that resets to NULL, so this could simplify the code quite a bit.

The new Py_tss_t * key type of internals::tstate needs to be allocated (PyThread_tss_alloc()), but since the internals struct gets statically allocated once (and there is no destructor or other cleanup), it never gets PyThread_tss_freed (neither is the entry removed with PyThread_tss_delete). But then the old version never called PyThread_delete_key, either, and the struct has a static lifetime. Should there be a way of cleaning this up (e.g., through a Py_AtExit callback)?

This could be interesting, but I'd propose to do it in a separate PR.

@bstaletic
Copy link
Collaborator

This could be interesting, but I'd propose to do it in a separate PR.

I tried playing with Valgrind and ASAN. Running the benchmaarks for my project, there were consistent memory leaks (nothing alarming, but it was there) pointing to something in the python library. Considering that valgrind python -c 'x=5' (and even some slightly less trivial code) showed no signs of memory leaks, I wondder if it would be possible to track these memory leaks down and if they are caused by pybind.

Not calling PyThread_tss_free or PyThread_delete_key sounds like something ASAN/Valgrind would mark as a potential leak.

The whole point of such an excercise would be to allow projects to use ASAN as a part of the CI.

@YannickJadoul
Copy link
Collaborator Author

@wjakob Done, thanks for the suggestions.

I just added one minor comment.

Done that. Apart from that, in principle, the set call could also fail, according to the API docs. Should we still check those (because that did not happen before with the old TLS API, either) ?

[...] so this could simplify the code quite a bit

Yes, done so as well. Just noting that the macro functionality is not needed, so we could also have nice C++ typedefs and functions instead of macros.

@wjakob and @bstaletic Yeah, ok, that seems indeed like a separate effort to clean up all allocations and have a clean ASAN/Valgrind run. So, let's add this cleanup to that list of things to do in such a PR, then?

// Thread Specific Storage (TSS) API.
#if PY_VERSION_HEX >= 0x03070000
#define PYBIND11_TLS_KEY_TYPE Py_tss_t *
#define PYBIND11_TLS_KEY_NULL_VALUE nullptr
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do the type and null value need to be separate here? It looks from the way they are used that you could simplify it to something along the lines of:

#define PYBIND11_TLS_KEY_INIT(var) Py_tss_t *var = nullptr;
// or for <3.7:
#define PYBIND11_TLS_KEY_INIT(var) decltype(PyThread_create_key()) var = 0;

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, myeah, true. Depends if at some point you'd need to have the type separate.

But I can still change that, if you want? (Or feel free to push that commit to this branch yourself, as you should be allowed by GitHub.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Considering that these are entirely internal macros (i.e. if you're using them outside of pybind11, shame on you) we can also break them up later if needed -- but for the moment it doesn't look like they are.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Pushed a commit)

@bstaletic
Copy link
Collaborator

ASAN/Valgrind

Yes, that's definitely something that should be done in a different PR.

@YannickJadoul
Copy link
Collaborator Author

Thanks!
Weird how Travis CI was apparently not triggered by your commit and push, but since Travis succeeded before, this should be OK if AppVeyor succeeds...

@jagerman
Copy link
Member

Weird how Travis CI was apparently not triggered by your commit and push, but since Travis succeeded before, this should be OK if AppVeyor succeeds...

Travis-ci has flagged me for abuse for some unknown reason, and thus ignores my commits. It's a bit annoying.

@YannickJadoul
Copy link
Collaborator Author

Or they just know you don't write bugs? ;-)

@wjakob
Copy link
Member

wjakob commented Jul 17, 2018

Travis-ci has flagged me for abuse for some unknown reason, and thus ignores my commits. It's a bit annoying.

Wow, super-strange. I'll go ahead and merge this now.

@wjakob wjakob merged commit b4719a6 into pybind:master Jul 17, 2018
@YannickJadoul YannickJadoul deleted the python37-tss branch August 10, 2018 08:31
EricCousineau-TRI added a commit to EricCousineau-TRI/pybind11 that referenced this pull request Sep 3, 2018
…on 3.7 to Thread Specific Storage (TSS) (pybind#1454)"

This reverts commit b4719a6.
#else
#define PYBIND11_TLS_REPLACE_VALUE(key, value) PyThread_set_key_value((key), (value))
#endif
#define PYBIND11_TLS_DELETE_VALUE(key) PyThread_set_key_value((key), nullptr)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should use PyThread_delete_key_value for PYBIND11_TLS_DELETE_VALUE.

PyThread_set_key_value is strange that it does not change key value to nullptr when it's already set to non-null value, tested on Python 2.7.12.

I notice this because my Python callback function (overriding virtual function by PYBIND11_OVERLOAD_PURE) always segfaults upon the second call.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cyfdecyf And did it do that before this PR? Because I thought the code from before and after would be the same, on < 3.7 Python versions?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, you're right, something weird is happening, there, using this macro :|

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@YannickJadoul, @cyfdecyf : could you check if PR #1517 fixes the issue?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wjakob I don't know the actual scenario that caused this for @cyfdecyf, but this seems OK to me. Apologies for overlooking this mistake in this original PR :-/

The only difference to the original version I can see now is at here: https://github.com/pybind/pybind11/pull/1454/files#diff-2443092219154d8d844074fa320e45f7L1828, where for Python 3 PyThread_set_key_value((key), nullptr) is called instead of PyThread_delete_key_value((key)). It is hard/impossible to find documentation on the old TLS API, but I think this should be OK.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the late reply. I'll try to construct a test case for this issue.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wjakob I can confirm your patch fixes a regression we're seeing in PyTorch pytorch/pytorch#11473 (comment). We have a release this Friday, do you have an ETA on when that fix can be merged? Thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will merge it today and push out a new patch release today.

wjakob added a commit to wjakob/pybind11 that referenced this pull request Sep 11, 2018
wjakob added a commit that referenced this pull request Sep 11, 2018
wjakob pushed a commit that referenced this pull request Sep 11, 2018
…o Thread Specific Storage (TSS) (#1454)

* Switching deprecated Thread Local Storage (TLS) usage in Python 3.7 to Thread Specific Storage (TSS)

* Changing Python version from 3.6 to 3.7 for Travis CI, to match brew's version of Python 3

* Introducing PYBIND11_ macros to switch between TLS and TSS API
wjakob added a commit that referenced this pull request Sep 11, 2018
facebook-github-bot pushed a commit to pytorch/pytorch that referenced this pull request Sep 11, 2018
Summary:
Fixes #11419

In particular pulling in pybind/pybind11#1454
as well as pending bugfix in pybind/pybind11#1517 (documenting in comment)
Pull Request resolved: #11473

Differential Revision: D9776003

Pulled By: jamesr66a

fbshipit-source-id: a225dcfb66c06bcae98fd2508d9e690c24be551a
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants