Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Must not use Pybind's implicit object-identity __hash__ #102

Open
kaushikcfd opened this issue Dec 28, 2022 · 8 comments
Open

Must not use Pybind's implicit object-identity __hash__ #102

kaushikcfd opened this issue Dec 28, 2022 · 8 comments
Labels

Comments

@kaushikcfd
Copy link
Collaborator

kaushikcfd commented Dec 28, 2022

To Reproduce

  1. Create a file: islpy_hash.py as:
import islpy as isl


a = isl.BasicSet("{[i,j]: 0<=i<=j<10}")

print(hash("hashing of islpy objects isn't Python compliant"))
print(hash(a))
print(hash(a))
  1. Call it as PYTHONHASHSEED=3 python islpy_hash.py.

Observed behavior

(py311_env) [line@line ~]$ PYTHONHASHSEED=3 python islpy_hash.py 
-5820926828285319218
8778101896947
8778101896947
(py311_env) [line@line ~]$ PYTHONHASHSEED=3 python islpy_hash.py 
-5820926828285319218
8727734557427
8727734557427
(py311_env) [line@line ~]$ PYTHONHASHSEED=3 python islpy_hash.py 
-5820926828285319218
8787476382451
8787476382451

Expected behavior

The message printed to stdout must be the same across interpreter runs.

@kaushikcfd kaushikcfd added the bug label Dec 28, 2022
@inducer
Copy link
Owner

inducer commented Dec 28, 2022

Does __hash__ have documented cross-run semantics?

@isuruf
Copy link
Contributor

isuruf commented Dec 28, 2022

Yes. https://docs.python.org/3/using/cmdline.html#envvar-PYTHONHASHSEED

@kaushikcfd
Copy link
Collaborator Author

FWIW, isl_set_get_hash returns deterministic results as its seed is a constant.

@inducer
Copy link
Owner

inducer commented Dec 28, 2022

But then something's weird. set_get_hash is a direct wrapper of isl_set_get_hash:

        uint32_t  set_get_hash(set const &arg_self)
        {
          isl_ctx *islpy_ctx = nullptr;

                    if (!arg_self.is_valid())
                      throw isl::error(
                        "passed invalid arg to isl_set_get_hash for self");
                    

                        islpy_ctx = isl_set_get_ctx(arg_self.m_data);
                        
if (islpy_ctx) isl_ctx_reset_error(islpy_ctx);
uint32_t result = isl_set_get_hash(arg_self.m_data);
return result;
        }

and

wrap_set.def("__hash__", isl::set_get_hash, "get_hash(self)\n\n:param self: :class:`Set`\n:return: uint32_t \n\n.. warning::\n\n    This function is not part of the officially public isl API. Use at your own risk.");

@kaushikcfd
Copy link
Collaborator Author

kaushikcfd commented Dec 28, 2022

Aah I see. Set.__hash__ is consistent across runs (irrespective of PYTHONHASHSEED). However, BasicSet.__hash__ isn't. This is because ISL does not define isl_basic_set_get_hash and so isl.BasicSet.__hash__ is pybind11_builtins.pybind11_object.__hash__

@kaushikcfd
Copy link
Collaborator Author

kaushikcfd commented Dec 28, 2022

Using pybind11_object.__hash__ is even more problematic as:

(py311_env) [line@line temp]$ cat understand_bset_hash.py 
import islpy as isl

a1 = isl.BasicSet("{[i]: 0<=i<512}")
a2 = isl.BasicSet("{[i]: 0<=i<512}")
print(hash(a1))
print(hash(a2))
print(a1 == a2)
print({a1: "hello", a2: "world"})

(py311_env) [line@line temp]$ python understand_bset_hash.py 
8766500567951
8766500487043
True
{BasicSet("{ [i] : 0 <= i <= 511 }"): 'hello', BasicSet("{ [i] : 0 <= i <= 511 }"): 'world'}

I wonder if we should use the upcasting logic for such types for which ISL does not define isl_xxx_get_hash.

@inducer inducer changed the title __hash__ of ISLPy objects does not respect PYTHONHASHSEED Must not use Pybind's implicit object-identity __hash__ Dec 30, 2022
@inducer
Copy link
Owner

inducer commented Dec 30, 2022

See #103 for a first stab. I'm excited for all the ways in which this breaks loopy. 🙂

@matthiasdiener
Copy link
Contributor

Not sure if this is useful, but while looking at this issue, I noticed that isl does offer isl_basic_set_get_hash (see https://repo.or.cz/isl.git/blob/HEAD:/isl_map.c#l11315), but does not expose it in its headers, which is why islpy doesn't pick it up.

Manually adding it to the header and rebuilding islpy seems to make the two examples in this PR run as expected:

$ PYTHONHASHSEED=3 python islpy_hash.py
-2348628950087003433
1863046788
1863046788
$ PYTHONHASHSEED=3 python islpy_hash.py
-2348628950087003433
1863046788
1863046788
$ PYTHONHASHSEED=3 python islpy_hash.py
-2348628950087003433
1863046788
1863046788

$ PYTHONHASHSEED=3 python understand_bset_hash.py 
181834274
181834274
True
{BasicSet("{ [i] : 0 <= i <= 511 }"): 'world'}
$ PYTHONHASHSEED=3 python understand_bset_hash.py 
181834274
181834274
True
{BasicSet("{ [i] : 0 <= i <= 511 }"): 'world'}
$ PYTHONHASHSEED=3 python understand_bset_hash.py 
181834274
181834274
True
{BasicSet("{ [i] : 0 <= i <= 511 }"): 'world'}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants