New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FLINK-6864] Fix confusing "invalid POJO type" messages from TypeExtractor #4574
Conversation
@tzulitai I create this PR to fix https://issues.apache.org/jira/browse/FLINK-6864, I think add logs instead of removing them will be better, what do you think? Thanks |
I agree with @zjureel . Even though I know the POJO rules well, it still happened to me more than once that I accidentally had a non-POJO, and I only noticed this because of these log messages, so I think it would be problematic to completely remove them. |
I also agree that these messages have been personally useful. I think this PR is an improvement but that we could do better to note and/or emphasize the performance implications of |
@ggevay @greghogan Thank you for your suggestions, I tried to add some notice about performance loss in the log, what do you think? Thanks |
docs/dev/types_serialization.md
Outdated
@@ -115,6 +115,8 @@ conditions are fulfilled: | |||
or have a public getter- and a setter- method that follows the Java beans | |||
naming conventions for getters and setters. | |||
|
|||
Note that when a data type can't be recognized as a POJO type, it will be handled as GenericType. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"... it will be processed as a GenericType and serialized with Kryo"? Is it generally understood that use of Kryo can have a significant impact on performance? It's not just serialization which is slower. It looks like if the class does not implement NormalizedKey
then objects are deserialized for each comparison.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry to reply late, and thank you for your suggestions @greghogan
In fact, I have some doubts about what information to be provided to the users when the data can't be recognized as POJO type. Do you think data type and serialization are sufficient here? thanks
@@ -1900,7 +1900,8 @@ private boolean isValidPojoField(Field f, Class<?> clazz, ArrayList<Type> typeHi | |||
ParameterizedType parameterizedType, TypeInformation<IN1> in1Type, TypeInformation<IN2> in2Type) { | |||
|
|||
if (!Modifier.isPublic(clazz.getModifiers())) { | |||
LOG.info("Class " + clazz.getName() + " is not public, cannot treat it as a POJO type. Will be handled as GenericType"); | |||
LOG.info("Class " + clazz.getName() + " is not public, cannot treat it as a POJO type. " + | |||
"Will be handled as GenericType, and may lose some serialization performance."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if we said something like "... is not public and cannot be processed as a POJO type. Please read the Flink documentation on "Data Types & Serialization" for details on the impact to performance."
6193d17
to
e73cea1
Compare
@zjureel am merging this ... thanks for the PR and edits! |
What is the purpose of the change
Fix confusing "invalid POJO type" messages from TypeExtractor
Brief change log
Verifying this change
This change needs no testing
Does this pull request potentially affect one of the following parts:
@Public(Evolving)
: (no)Documentation