Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Accessing C++ hash functions from Python and vice versa #1034

Merged
merged 5 commits into from
Aug 30, 2017

Conversation

bmerry
Copy link
Contributor

@bmerry bmerry commented Aug 27, 2017

There are two separate additions:

  1. py::hash(obj) is equivalent to the Python hash(obj).
  2. .def(hash(py::self)) registers the hash function defined by std::hash as the Python hash function.

There are two separate additions:

1. `py::hash(obj)` is equivalent to the Python `hash(obj)`.
2. `.def(hash(py::self))` registers the hash function defined by
   std::hash<T> as the Python hash function.
@bmerry
Copy link
Contributor Author

bmerry commented Aug 27, 2017

I think it would also be nice to provide a convenient way to mark a wrapped class as unhashable, but I'm not sure what a good syntax for that would be.

@@ -390,6 +390,13 @@ inline void setattr(handle obj, handle name, handle value) {
inline void setattr(handle obj, const char *name, handle value) {
if (PyObject_SetAttrString(obj.ptr(), name, value.ptr()) != 0) { throw error_already_set(); }
}

inline long hash(object obj) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

handle is generally used for function parameters to avoid an extra incref/decref of copying an object.

Py_hash_t is needed on Python 3 to resolve the compiler warnings, but it's not available on Python 2. Looking at the docs, ssize_t should satisfy the return type of both 2 and 3:

Changed in version 3.2: The return type is now Py_hash_t. This is a signed integer the same size as Py_ssize_t.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Not sure how that ended up being object - it was handle in my brain. I decided to deal with the return type by letting decltype handle it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the return type: It would be nice for pybind11's API to have the same type for both Python 2 and 3, so that we don't pass on these inconsistencies to users (unless it's absolutely necessary). If ssize_t is compatible with both, it would be preferable IMO.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think anything in the standard that prevents long from being wider than Py_ssize_t, but I don't know of any contemporary systems where that happens (it would probably have been true on 16-bit systems). I've switched to Py_ssize_t and added a static_assert just to be safe.

@@ -5,6 +5,18 @@
from pybind11_tests import debug_enabled


class Hashable(object):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two test classes are only used in test_hash(). Move them into the scope of the function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@dean0x7d
Copy link
Member

I think it would also be nice to provide a convenient way to mark a wrapped class as unhashable, but I'm not sure what a good syntax for that would be.

Something along the lines of .def(hash(<some_tag_variable>))?

@bmerry
Copy link
Contributor Author

bmerry commented Aug 28, 2017

I think it would also be nice to provide a convenient way to mark a wrapped class as unhashable, but I'm not sure what a good syntax for that would be.

Something along the lines of .def(hash(<some_tag_variable>))?

That's certainly one option, but I didn't have a good what what to name the tag variable. Other options would be something like .def_unhashable(), or just .unhashable(), or .def(unhashable) (where unhashable would be tag variable in the last case). I might leave it for another pull request, rather than bogging this one down in a naming discussion.

@dean0x7d
Copy link
Member

Could you also add a changelog entry for this?

Regarding the unhashable syntax: it would be good if it wasn't a one-off thing (e.g. .def_unhashable(), .unhashable() would be one-off members of class_). A tag variable of some form is better. And if it can be applied to other functions/operators, that would be best. But I guess cls.attr("__hash__") = py::none() also works for now so there's no hurry.

@bmerry
Copy link
Contributor Author

bmerry commented Aug 29, 2017

Could you also add a changelog entry for this?

Done. Not sure where in the changelog you feel this belongs, so feel free to directly commit to my branch to move it around.

Regarding the unhashable syntax: it would be good if it wasn't a one-off thing (e.g. .def_unhashable(), .unhashable() would be one-off members of class_). A tag variable of some form is better. And if it can be applied to other functions/operators, that would be best. But I guess cls.attr("hash") = py::none() also works for now so there's no hurry.

Maybe something like .def(not_implemented(hash(self)))? That could be extended to things like .def(not_implemented(self > self)) to create implementations that raise NotImplemented. Or alternatively .def_not_implemented, to be similar to .def_cast.

@dean0x7d
Copy link
Member

Maybe something like .def(not_implemented(hash(self)))? That could be extended to things like .def(not_implemented(self > self)) to create implementations that raise NotImplemented. Or alternatively .def_not_implemented, to be similar to .def_cast.

Things can get a little bit tricky here because of the distinction between NotImplemented, NotImplementedError and __op__ = None, where __hash__ requires the last one. Perhaps a simple assignment is still best for that case since it mirrors the Python syntax and doesn't obscure what's being done. I guess the only issue with cls.attr("__hash__") = py::none() is that it's outside the regular class definition chain. Perhaps a generalized attribute setter would be interesting?

class_<Foo>(m, "Foo")
    .attrs("__hash__"_a=py::none(),
           "bar"_a=1,
           "baz"_a="two")
    .def("method", [](){});

To mirror the usual Python syntax:

class Foo:
    __hash__ = None
    bar = 1
    baz = "two"

    def method():
       pass

@bmerry
Copy link
Contributor Author

bmerry commented Aug 30, 2017

Things can get a little bit tricky here because of the distinction between NotImplemented, NotImplementedError and __op__ = None, where __hash__ requires the last one.

Yes, I was also a bit worried about that, and thinking it might be a bit of a pain to implement.

For me it wasn't totally obvious that setting __hash__ = None from the C++ side actually works, given that this seems to imply that one has to do special things with the callback slots. That can probably be solved with documentation though.

    .attrs("__hash__"_a=py::none(),
           "bar"_a=1,
           "baz"_a="two")

I think I would rather have this as three separate function calls, each taking a name and a value (which I'm guessing would also simplify the implementation since it wouldn't need to muck about with variadic templates). Maybe it's worth adding a two-argument form of object_api::attr that takes a value to set, and returns derived() to allow chaining? That could then also be used on modules (although it may cause some confusion with module::add_object - I'm not clear about what PyModule_AddObject does that's different to PyObject_SetAttr).

@dean0x7d dean0x7d merged commit 37de2da into pybind:master Aug 30, 2017
@dean0x7d
Copy link
Member

I've merged this now and the not_implemented/unhashable can be part of a separate PR.

For me it wasn't totally obvious that setting hash = None from the C++ side actually works, given that this seems to imply that one has to do special things with the callback slots.

Seems like it's mostly just a convenience for the C API, since it assigns None later anyway.

I think I would rather have this as three separate function calls

Sure, I can see that being preferable. But I'm not sure about allowing chaining for all object_api-derived classes. Seems like it should be limited to class_.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants