Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Number normalization for Crystal::Hasher #5276

Merged
merged 6 commits into from Nov 24, 2017

Conversation

akzhan
Copy link
Contributor

@akzhan akzhan commented Nov 11, 2017

As declared by Crystal language reference, 1_i32.hash should be equal to 1_f64.hash.

This method must have the property that a == b implies a.hash == b.hash.

With this Pull request it's true.

Extracted from #4675, also replaces #4581.

Highly inspired by Python hashes, and has great optimizations for some cases by @funny-falcon.

@akzhan akzhan changed the title Indtruces real Number normalization for Crystal::Hasher Intrudces real Number normalization for Crystal::Hasher Nov 11, 2017
@akzhan
Copy link
Contributor Author

akzhan commented Nov 11, 2017

/cc @funny-falcon @RX14 @asterite

P.S.: price of hash normalization is minimal: i8-32 has no price, i64/128 has one div, floats need one frexp op (but float32/float64 use fast integer IEEE754 path). No price for others,

@akzhan akzhan changed the title Intrudces real Number normalization for Crystal::Hasher Introduces real Number normalization for Crystal::Hasher Nov 11, 2017
@akzhan akzhan force-pushed the number_normalization branch 2 times, most recently from 43c6611 to 4dfa15a Compare November 11, 2017 20:31

private HASH_BITS = 61
private HASH_MODULUS = (1_i64 << HASH_BITS) - 1
private HASH_NAN = 0_u64
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it a formatting issue?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it is aligned with _ of below 314159_u64. I think it is expected.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO it should should have single space after =.

@akzhan
Copy link
Contributor Author

akzhan commented Nov 12, 2017

I have not touched/optimized BigDecimal class here because it was born too late (#4876). I hope that @vegai helps to optimize number normalization for these numbers too.

But generic float implementation will work anyway too.

@vegai
Copy link
Contributor

vegai commented Nov 12, 2017

@akzhan Roger, I'll try to do this soon.

As declared by Crystal language reference, 1i32.hash should equal to 1f64.hash.

Extracted from crystal-lang#4675, also replaces crystal-lang#4581.
@akzhan akzhan changed the title Introduces real Number normalization for Crystal::Hasher Number normalization for Crystal::Hasher Nov 12, 2017
@akzhan akzhan mentioned this pull request Nov 12, 2017
@@ -51,6 +52,15 @@ describe "Crystal::Hasher" do
2.hash.should eq(2_u64.hash)
end

it "Big i64 numbers should be hashed ok" do
Int64::MAX.hash.should eq (Int64::MAX.hash)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove spaces between eq and the value that is expected to be equal to.

end

pending "128bit types should be hashed ok" do
1.to_i128.hash.should eq (1_i8.hash)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above. Please remove spaces between eq and the arguments given to it.

@luislavena
Copy link
Contributor

Hello @akzhan, Thank you for the PR!

I read the link you provided about Object#hash and doesn't state what you described in this request. Perhaps this was discussed in another issue by Crystal core developers? Can you provide the link to that instead?

Thank you.

@akzhan
Copy link
Contributor Author

akzhan commented Nov 13, 2017

Hello @luislavena - Crystal language reference was my starting point.

There is definition:

This method must have the property that a == b implies a.hash == b.hash.

Current Crystal (any version) doesn't follow this definition (1_i32.hash != 1_f64.hash in master).

Nice point how it is in Python: https://github.com/python/cpython/blob/master/Python/pyhash.c#L1

P.S.: spacing fixed.

@akzhan
Copy link
Contributor Author

akzhan commented Nov 15, 2017

Is it GTG?

# :nodoc:
struct Crystal::Hasher
def float(value : BigFloat)
permute(float_normalize_wrap(value) do |value|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please assign the result of the float_normalize_wrap to a variable which is passed to permute. Having a multi-line block inside a method call arguments is just bad style.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@akzhan You missed that one ;)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Sija Thanks!

end

def float(value : Float32)
permute(float_normalize_wrap(value) do |value|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

end

def float(value : Float64)
permute(float_normalize_wrap(value) do |value|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

end

def float(value : Float)
permute(float_normalize_wrap(value) do |value|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

@RX14
Copy link
Contributor

RX14 commented Nov 18, 2017

Apart from that, I think this is gtg.

@akzhan
Copy link
Contributor Author

akzhan commented Nov 18, 2017

Changes applied

@akzhan
Copy link
Contributor Author

akzhan commented Nov 23, 2017

I know that few of team known this area, but please merge it. It's required to get Crystal stable in terms of production use.

may be @mverzilli ?

P.S.: It is part of Hashing #4675.

Copy link
Contributor

@luislavena luislavena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All my comments have been addressed. 👍

@mverzilli mverzilli merged commit 3f06d20 into crystal-lang:master Nov 24, 2017
@mverzilli
Copy link

Thanks @akzhan!

@mverzilli mverzilli added this to the Next milestone Nov 24, 2017
@akzhan akzhan deleted the number_normalization branch November 24, 2017 17:48
bcardiff pushed a commit that referenced this pull request Jun 5, 2018
Conditionally run this on 64bits platforms.
Ref #5276
chris-huxtable pushed a commit to chris-huxtable/crystal that referenced this pull request Jun 6, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants