[Flink-5734] code generation for normalizedkey sorter #3511
Has the FLIP been posted and officially discussed on the mailing list? |
I haven't posted it yet. I will do it next week. |
@heytitle please remove the old FLINK-3722 commits and rebase to master. |
@greghogan May I ask you how to remove |
@heytitle, you can An initial thought from skimming the code: should we create an abstract |
988b207 to e1f9822
I also thought about the abstract class, but I'm not sure how to do it properly.
@heytitle, the code generation is only for a few methods, right? So the other methods in the sorter template could be moved into a
Hi @greghogan, Thanks for the explanation. I like the idea. I also think we might not need What do you think?
@heytitle yes, let's try that first.
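The idea discussed above, as far as it can be read from the thread, is to keep the shared sorter logic in an abstract class and generate only the few type-specific methods. A minimal sketch of that split follows. All class and method names here are hypothetical and are not the Flink API; the "generated" subclass is written by hand as a stand-in for what a code generator might emit.

```java
// Hypothetical sketch, not the Flink API: shared logic lives in the base class,
// and only compare/swap would be produced by code generation.
abstract class SorterBaseSketch {
    protected final long[] keys;

    SorterBaseSketch(long[] keys) {
        this.keys = keys;
    }

    // The two methods a generator would specialize for a given key layout.
    abstract int compare(int i, int j);

    abstract void swap(int i, int j);

    // Shared, hand-written logic that never needs to be generated.
    int size() {
        return keys.length;
    }

    // Simple insertion sort that relies only on compare and swap.
    void sort() {
        for (int i = 1; i < size(); i++) {
            for (int j = i; j > 0 && compare(j - 1, j) > 0; j--) {
                swap(j - 1, j);
            }
        }
    }
}

// Hand-written stand-in for a generated class with a fixed 8-byte key.
final class GeneratedLongKeySorterSketch extends SorterBaseSketch {
    GeneratedLongKeySorterSketch(long[] keys) {
        super(keys);
    }

    @Override
    int compare(int i, int j) {
        return Long.compare(keys[i], keys[j]);
    }

    @Override
    void swap(int i, int j) {
        long tmp = keys[i];
        keys[i] = keys[j];
        keys[j] = tmp;
    }
}
```

With this split, the template handed to the code generator only has to cover the two overridden methods, which keeps the generated source small and easier to review.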
- private static final long POINTER_MASK = LARGE_RECORD_TAG - 1;
+ public static final long POINTER_MASK = LARGE_RECORD_TAG - 1;
The reason public is used here is that Janino first checks the accessibility of these variables; it does not seem able to access them when protected is used, and it throws the error below.
org.codehaus.commons.compiler.CompileException: Field "LARGE_RECORD_THRESHOLD" is not accessible
at org.codehaus.janino.ReflectionIClass$ReflectionIField.getConstantValue(ReflectionIClass.java:340)
at org.codehaus.janino.UnitCompiler.getConstantValue2(UnitCompiler.java:4433)
at org.codehaus.janino.UnitCompiler.access$10000(UnitCompiler.java:182)
at org.codehaus.janino.UnitCompiler$11.visitFieldAccess(UnitCompiler.java:4407)
at org.codehaus.janino.Java$FieldAccess.accept(Java.java:3229)
Maybe put that into the comment inside the code?
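To illustrate the kind of failure described above: the stack trace goes through ReflectionIClass$ReflectionIField.getConstantValue, which suggests the constant is read reflectively. The sketch below uses only plain java.lang.reflect and does not call Janino. The class and field names are hypothetical, and it uses a private field (rather than a protected one) because all classes here sit in the same package, where protected would still be accessible.

```java
import java.lang.reflect.Field;

// Hypothetical holder class; the names are illustrative, not Flink's.
final class ConstantsSketch {
    private static final long HIDDEN_MASK = 7L;
    public static final long VISIBLE_MASK = 7L;
}

final class ConstantReaderSketch {
    // Returns the constant's value, or null when reflection refuses access.
    static Long readConstant(String name) {
        try {
            Field field = ConstantsSketch.class.getDeclaredField(name);
            // No setAccessible(true): the normal Java access checks apply.
            return field.getLong(null);
        } catch (IllegalAccessException e) {
            // A compiler reading constants this way would have to report an error here.
            return null;
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("No such field: " + name, e);
        }
    }
}
```

Here the public constant can be read from another class while the private one cannot, which is consistent with the visibility change in the diff.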
@heytitle apologies for the long delay. I've been working to improve the Gelly examples to process with each of the standard data types (from byte to string). I think we can validate both the correctness and performance of this PR in time for the 1.4 release. Are you able to continue working on this feature? There are several steps to proceed with:
Rather than extending |
Hi @greghogan, Thank you very much for the feedback.
Yes, I would like to complete the feature and will look into the issues you mentioned in the next couple of weeks.
… procedures for compare function
…atch with the new ones
We now get the classloader from the caller of createSorter.
…d FixedLengthRecordSorter
Oh, you are right @heytitle, thanks! Fixed in 9016cce
Thanks @fhueske and @heytitle! Most of the comments have been addressed. The two outstanding issues are
I think I can do both next week. Or, @heytitle, if you have time you could also work on them. (Please drop me an email if you start working on 1., to avoid both of us doing it. For 2., the more eyes on it, the better.)
@heytitle @ggevay Great work!
Do you mean So I think a simpler and better approach is to just make sure that most types have a good implementation of
I'm happy to hear that! Btw. I think a large potential in code-generation is to eliminate the overheads of the very many virtual function calls throughout the runtime, by making the calls monomorphic [1,2]. For example, there could be a custom For example, the following virtual calls are on the per-record code path:
I think most of the above calls are megamorphic (especially in larger Flink jobs with many operators), which makes them slow [1,2]. They could be made monomorphic by code-generating custom versions of these classes, where the customization would be to fix the type of the targets of these calls. (I think this could probably be done independently of the Table API.) Another potential for code-generation is customizing
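A minimal sketch of the monomorphic-call idea, under stated assumptions: the interface, the two functions, and both drivers are hypothetical names invented for illustration, and the specialized driver is written by hand as a stand-in for a generated class. It shows only the shape of the customization (fixing the target type of the call), not any measured speedup.

```java
// Hypothetical per-record function interface, standing in for the runtime's virtual calls.
interface RecordFnSketch {
    long apply(long value);
}

final class AddOneSketch implements RecordFnSketch {
    @Override
    public long apply(long value) {
        return value + 1;
    }
}

final class TwiceSketch implements RecordFnSketch {
    @Override
    public long apply(long value) {
        return value * 2;
    }
}

// Generic driver: its single call site sees every implementation used by the job,
// so the JIT may treat it as megamorphic.
final class GenericDriverSketch {
    static long run(RecordFnSketch fn, long[] input) {
        long sum = 0;
        for (long value : input) {
            sum += fn.apply(value);
        }
        return sum;
    }
}

// Stand-in for a generated driver: the target type is fixed to one final class,
// so this call site only ever has one possible receiver.
final class AddOneDriverSketch {
    static long run(AddOneSketch fn, long[] input) {
        long sum = 0;
        for (long value : input) {
            sum += fn.apply(value);
        }
        return sum;
    }
}
```

Both drivers compute the same result for the same function; the difference is only in how many receiver types each call site can observe at run time.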
Having more info about the types, UDFs, etc. of your program certainly can help. Unfortunately I don't know too much about the Table API & SQL, but I have a few random thoughts:
Btw. have you seen this PR for code generation for POJO serializers and comparators? #2211
[1] http://insightfullogic.com/2014/May/12/fast-and-megamorphic-what-influences-method-invoca/
Hi @ggevay , thanks for the detailed response.
You are right when the sort keys are simple numeric types, but not with strings, which may be the most popular choice in some ETL and data warehouse pipelines. But I agree that code generation can't help with this situation, so we are investigating some binary data formats to represent our records and modifying the interface of TypeSerializer & TypeComparator for ser/de. We don't have to consume the input/output view byte by byte, but have the ability to randomly access the underlying data, aka MemorySegment. It acts like Spark's UnsafeRow: https://reviewable.io/reviews/apache/spark/5725, so we can eliminate most of the deserialization cost such as
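A minimal sketch of the random-access idea, assuming a fixed-width record layout invented for illustration (a 4-byte id followed by an 8-byte key). It uses java.nio.ByteBuffer as a stand-in for MemorySegment and does not reproduce the UnsafeRow format or handle variable-length fields such as strings.

```java
import java.nio.ByteBuffer;

// Hypothetical layout: [int id][long key] per record, 12 bytes each.
final class BinaryRowSketch {
    static final int ID_OFFSET = 0;
    static final int KEY_OFFSET = 4;
    static final int RECORD_SIZE = 12;

    // Writes one record at the given index using absolute (position-independent) puts.
    static void write(ByteBuffer buffer, int index, int id, long key) {
        int base = index * RECORD_SIZE;
        buffer.putInt(base + ID_OFFSET, id);
        buffer.putLong(base + KEY_OFFSET, key);
    }

    // Reads only the key field, without materializing the whole record as an object.
    static long keyAt(ByteBuffer buffer, int index) {
        return buffer.getLong(index * RECORD_SIZE + KEY_OFFSET);
    }

    // Compares two records directly on their binary form.
    static int compareKeys(ByteBuffer buffer, int i, int j) {
        return Long.compare(keyAt(buffer, i), keyAt(buffer, j));
    }
}
```

Because each field sits at a known offset, a comparison touches only the bytes it needs instead of deserializing both records first.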
Totally agreed. After we finish dealing with the code generation and improving the ser/de, we will investigate this further. Good to see that you have a list of all the megamorphic calls. BTW, we are actually translating the batch jobs into the streaming runtime, so I think there will be a lot in common. Having and controlling more type information, and code-generating the whole operator, has many benefits; it can also help make most of the calls monomorphic, such as:
And you are right that this is orthogonal to runtime improvements, and we see the boundary as the Operator. The framework should provide the most efficient environment for operators to run in, and we will code-generate the most efficient operators to live in it.
I haven't seen it yet; I will find some time to check it out.
Thanks @heytitle, done in ff3a35e.
The @fhueske , I think all the comments have been addressed. |
This PR has not found support from the community for quite some time. I'm not sure if it is still mergeable. @KurtYoung you mentioned that Blink's implementation was inspired by this contribution, right? So I guess we can close this PR now. What do you guys think?
Hi @twalthr, This PR has a similar status to the serializer codegen PR, explained here: In the last paragraph of that comment I mentioned that there will be an MSc student working on an alternative approach, which would subsume both PRs. In the meantime, @mukrram-bajwa wrote a nice MSc thesis on this alternative approach, but unfortunately it's far from a PR-ready state, and I'm not sure at the moment whether we will push it further, or how feasible it is to bring the approach to production readiness. Btw. I don't know the details of the Blink improvements. Maybe they subsume both of these PRs and even the alternative approach that was pursued in the MSc thesis, but I don't know. Or Blink might win solely on the basis of being closer to production readiness. I suggest just closing both of these PRs for now, and then maybe later getting back to these performance issues with a fresh mind.
Hi @ggevay, thanks for letting us know the current status of both PRs. Currently, we are trying to perform a cleanup of stale PRs and issues before the big Blink reviewing/merging starts. It would be great if there is a solution that even beats Blink's performance. But as you said "with a fresh mind". Feel free to close both PRs for now.
OK, but could you please close them @twalthr? I don't have a close button; I guess I don't have permission because I'm not a committer.
Yes, sure. Sorry, I thought you were opening the PR. Will close it for now...
@twalthr Sorry for the late response, but yes, we borrowed lots of ideas from this PR in Blink's sorter code generation. And I want to thank @heytitle for these cool ideas.
Actually, the original ideas were from @ggevay.
Big thanks to the original author @ggevay, hope you don't mind us borrowing your ideas.
Of course! I'm actually happy that they are finally making it into Flink in some form. So thanks for borrowing them :)
This pull request implements code generation for NormalizedKeySorter. It is built on top of FLINK-3722 and based on FLIP: Code Generation for NormalizedKeySorter.
Result from SortPerformance.java