THRIFT-2877 Generate hashCode using primitives, static utility methods #448

Closed

roshan (Contributor) commented Apr 17, 2015

This is pretty much List.hashCode(), except without any list. It takes about a third of the time in some rudimentary benchmarks. I tried it out with

typedef i32 SomeId
typedef binary BinId
typedef string StringId

struct NonTrue {}

union MaybeAThing {
    1: NonTrue nt
    2: bool bl
}

enum Nomnom {
    EAT=31
    LIVE=515
}

struct AllPrims {
    1: bool boole
    2: byte single_byte
    3: i16 shrt
    4: i32 integ
    5: i64 longue
    6: double f64
    7: string str
    8: binary bin
    9: Nomnom en
    10: NonTrue stru
    11: MaybeAThing un
    12: SomeId intid
    13: BinId binid
    14: StringId strid
}

generating this hashCode().

When I populate these structs with some values, the result matches AbstractList.hashCode(), but maybe we could try a different multiplicative factor.
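To make the idea concrete, here is a minimal sketch of a "list-free" hash that reproduces the value a boxed list would give. It is not the code generated by this PR: the class and method names are hypothetical, and only four field types are shown. It relies on the recurrence documented for List.hashCode() (start at 1, multiply by 31, add each element's hash) and on the documented hash values for Boolean, Integer, and Long.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; names are not taken from the Thrift code base.
final class ListFreeHashSketch {

    // Recurrence documented for List.hashCode(): h = 31 * h + elementHash.
    static int combine(int h, int elementHash) {
        return 31 * h + elementHash;
    }

    // Boxed baseline: every primitive is wrapped and placed in a list first.
    static int hashBoxed(boolean b, int i, long l, String s) {
        List<Object> fields = Arrays.<Object>asList(b, i, l, s);
        return fields.hashCode();
    }

    // Same result without allocating a list or any wrapper objects.
    static int hashPrimitive(boolean b, int i, long l, String s) {
        int h = 1;
        h = combine(h, b ? 1231 : 1237);           // values documented for Boolean.hashCode()
        h = combine(h, i);                          // Integer.hashCode() is the int itself
        h = combine(h, (int) (l ^ (l >>> 32)));     // formula documented for Long.hashCode()
        h = combine(h, s == null ? 0 : s.hashCode()); // a list uses 0 for null elements
        return h;
    }

    public static void main(String[] args) {
        System.out.println(hashBoxed(true, 42, 1L << 40, "thrift"));
        System.out.println(hashPrimitive(true, 42, 1L << 40, "thrift"));
    }
}
```

Running main should print the same number twice, which is the "matches AbstractList.hashCode()" property described above.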

Sorry about the other one #447. I rebased to one commit, but couldn't seem to reuse the old PR (it compiled locally on gcc but failed on clang).

roshan (Contributor, Author) commented Apr 17, 2015

Ah, I used Java 8-only methods. I'll fix this and reopen later.

@roshan roshan closed this Apr 17, 2015
@roshan roshan reopened this Apr 26, 2015
@roshan roshan force-pushed the THRIFT-2877_int_based_hashcode branch from b436a7e to 0e54f3f Compare April 26, 2015 06:13
nsuke (Member) commented Feb 10, 2016

Hi @roshan, sorry for the delay, I've just discovered this.
While this change makes sense in general, I have one concern.

The TBaseHelper.hashCode methods are the Java 8 implementations of hashCode for
those types.

Does this involve any code-level transplant, such as copy-paste?
If so, I'm afraid we have no way to incorporate this, as OpenJDK is GPL (let alone disassembled Oracle JDK code).

roshan (Contributor, Author) commented Feb 12, 2016

Ha, thanks for that, @nsuke. I only used the JDK 8 methods so we could keep hashCode identical in value to its current value for people's existing structs. Is that not a concern? If not, it should be easy to write a trivial implementation that does not resemble the JDK 8 methods in functionality or value, and I'll be happy to do that.

nsuke (Member) commented Feb 12, 2016

@roshan great, thanks for working on it after such a long break.
I would suggest rewriting them "from scratch," aiming only for hash functions that involve no boxing and are "reasonably good."

If the JDK-identical return values are important, we could first ask https://issues.apache.org/jira/browse/LEGAL whether that's feasible, but I suppose that's not the case?

jsirois (Member) commented Feb 12, 2016

My 2 nonbinding cents: No reasonable client should care about hashCode values, only that the values meet the hashCode contract (this is true of any language). I don't think preserving them should ever be a goal.

A second comment, not as helpful since it requires work, is that we should enshrine perf improvements using a tool like jmh. We use this in Apache Aurora for example to ensure we have a shared history of performance improvements and regressions.

@roshan roshan force-pushed the THRIFT-2877_int_based_hashcode branch from a168c7b to a5f81a2 Compare February 13, 2016 05:56
…hods

The TBaseHelper.hashCode methods are the Java 8 implementations of hashCode for
those types.
@roshan roshan force-pushed the THRIFT-2877_int_based_hashcode branch from a5f81a2 to c79377c Compare February 13, 2016 06:20
roshan (Contributor, Author) commented Feb 13, 2016

@nsuke, I switched it so we now treat a long as two ints (the high bits and the low bits) and combine their values the way we combine hashCodes (just with a different multiplicative factor). For doubles, we get the long represented by their bytes and treat it as before. Since we're changing the values, I also took the chance to pick arbitrary constants to combine with.
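A minimal sketch of the approach described above, for readers following along. Everything here is an assumption made for illustration: the class name, the method names, the two multipliers, and the use of Double.doubleToLongBits (rather than the raw-bits variant) are placeholders, not necessarily what the PR uses.

```java
// Hypothetical sketch; constants and names are placeholders, not taken from the PR.
final class PrimitiveHashSketch {

    static final int FIELD_MULTIPLIER = 8191; // assumed factor for combining field hashes
    static final int LONG_MULTIPLIER = 127;   // assumed, deliberately different factor for long halves

    // Treat a long as two ints (high and low 32 bits) and combine them.
    static int hashLong(long value) {
        int high = (int) (value >>> 32);
        int low = (int) value;
        return high * LONG_MULTIPLIER + low;
    }

    // Reinterpret the double's bits as a long, then hash that long.
    static int hashDouble(double value) {
        return hashLong(Double.doubleToLongBits(value));
    }

    // Fold one field's hash into the running struct hash.
    static int combine(int hash, int fieldHash) {
        return hash * FIELD_MULTIPLIER + fieldHash;
    }

    // Example for a struct with an i32, an i64, and a double field; no boxing involved.
    static int hashStruct(int integ, long longue, double f64) {
        int hash = 1;
        hash = combine(hash, integ);
        hash = combine(hash, hashLong(longue));
        hash = combine(hash, hashDouble(f64));
        return hash;
    }

    public static void main(String[] args) {
        System.out.println(hashStruct(7, 1L << 40, 2.5));
    }
}
```

Equal field values always produce equal results here, which is all the hashCode contract requires; the specific numbers carry no meaning.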

For what it's worth, though, the previous hashCode helper functions are pretty much what is described in Effective Java. Anyway, I'll rebase and squash after you have a look.

@jsirois That's a good idea. Unfortunately, I don't have the time at the moment to do that :(

break;
default:
throw "compiler error: the following base type has no hashcode generator: " +
t_base_type::t_base_name(((t_base_type*)t)->get_base());
Review comment (Member):

We can take this opportunity to stop throwing a string here.
As I believe it should never happen (right?), std::logic_error would be appropriate.

Review comment (Member):

Just a style nit: could you indent the switch block the same way as the rest of this file (less indentation)?

nsuke (Member) commented Feb 13, 2016

@roshan I've added comments on minor points. Other than that I think we're good to proceed 👍

roshan (Contributor, Author) commented Feb 13, 2016

Thanks for the comments, @nsuke. I will squash after builds pass.

@roshan roshan force-pushed the THRIFT-2877_int_based_hashcode branch from 260ec2a to 8ea6669 Compare February 14, 2016 00:56
@asfgit asfgit closed this in 949e242 Feb 14, 2016
gadLinux pushed a commit to gadLinux/thrift that referenced this pull request Mar 6, 2016
…hods

Client: Java
Author: Roshan George <roshan@arjie.com>

The TBaseHelper.hashCode methods are the Java 8 implementations of hashCode for
those types.

This closes apache#448
allengeorge pushed a commit to allengeorge/thrift that referenced this pull request Jan 1, 2017
jeking3 pushed a commit to jeking3/thrift that referenced this pull request Nov 30, 2017