-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Upgrade Avro to 1.9.1 #5938
Upgrade Avro to 1.9.1 #5938
Conversation
|
||
try { | ||
// Disable validation of default values for compatibility | ||
validateDefaults.set(false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is not a very good way. However, there seems to be no public method to disable validation. Another option is to enable validation of default values.
retest this please |
rerun cpp tests |
Add label release-2.5.1, due to #6406 dependency |
### Motivation Currently, Pulsar uses Avro 1.8.2, a version released two years ago. The latest version of Avro is 1.9.1, which uses FasterXML's Jackson 2.x instead of Codehaus's Jackson 1.x. Jackson is prone to security issues, so we should not keep using older versions. https://blog.godatadriven.com/apache-avro-1-9-release ### Modifications Avro 1.9 has some major changes: - The library used to handle logical datetime values has changed from Joda-Time to JSR-310 (apache/avro#631) - Namespaces no longer include "$" when generating schemas containing inner classes using ReflectData (apache/avro#283) - Validation of default values has been enabled (apache/avro#288). This results in a validation error when parsing the following schema: ```json { "name": "fieldName", "type": [ "null", "string" ], "default": "defaultValue" } ``` The default value of a nullable field must be null (cf. https://issues.apache.org/jira/browse/AVRO-1803), and the default value of the field as above is actually null. However, this PR disables the validation in order to maintain the traditional behavior. (cherry picked from commit d6f240e)
@massakam Thanks for this fix and looking forward for the newer Avro library to be integrated with Pulsar. |
@codelipenghui @sijie thoughts on the above comment? ^ Wouldn't that fix break backwards compatibility with existing code base? If it does, should Pulsar also take an action of being more compliant with default Avro behaviour such as :
|
@psilos good point. @codelipenghui @tuteng I think we need to verify the two questions that @psilos raised. If they are breaking behaviors, then it is a problem for 2.5.1. Can you guys verify that? |
@psilos I have pushed a PR #6704 to adds Joda time logical type conversion so that we can keep forwarding compatibility, And it should be included in 2.5.1. About the allow null, this is a legacy. We'd better keep forwarding compatibility. I agree with you that Pulsar schema should compliant with default Avro behavior. We'd better draft a PIP to change the default behavior and provide options to fall back to the old behavior. I think he can plan the default value validation and allow null in 2.6.0 since 2.5.1 is on the release process. |
### Motivation After upgrade to Apache Avro 1.9.x, the default time conversion changed to JSR-310. For forwarding compatibility, we'd better add the Joda time conversion. related to #5938 ### Modifications Add joda time conversions ### Verifying this change New integration test added
### Motivation Currently, Pulsar uses Avro 1.8.2, a version released two years ago. The latest version of Avro is 1.9.1, which uses FasterXML's Jackson 2.x instead of Codehaus's Jackson 1.x. Jackson is prone to security issues, so we should not keep using older versions. https://blog.godatadriven.com/apache-avro-1-9-release ### Modifications Avro 1.9 has some major changes: - The library used to handle logical datetime values has changed from Joda-Time to JSR-310 (apache/avro#631) - Namespaces no longer include "$" when generating schemas containing inner classes using ReflectData (apache/avro#283) - Validation of default values has been enabled (apache/avro#288). This results in a validation error when parsing the following schema: ```json { "name": "fieldName", "type": [ "null", "string" ], "default": "defaultValue" } ``` The default value of a nullable field must be null (cf. https://issues.apache.org/jira/browse/AVRO-1803), and the default value of the field as above is actually null. However, this PR disables the validation in order to maintain the traditional behavior. (cherry picked from commit d6f240e)
@codelipenghui Thanks very much for the quick fix, and looking forward for the new release. Unfortunately one of the areas where we could have used the Avro upgrade was inside Pulsar Functions, are you planning to add the extra flag(s) into Pulsar Functions as well? Currently there is no way to override the default allow null behaviour, neither the use of JSR-310 Instant. CC @sijie |
@psilos could you please help create an issue? so that we can keep it be tracked. Since 2.5.1 RC is out, we can plan it in 2.5.2 BTW, if you are using the Avro compiler, you need to upgrade the compiler. Otherwise, you may meet some class not found exception. Because some classes are deleted in the new Avro lib. |
### Motivation Currently, Pulsar uses Avro 1.8.2, a version released two years ago. The latest version of Avro is 1.9.1, which uses FasterXML's Jackson 2.x instead of Codehaus's Jackson 1.x. Jackson is prone to security issues, so we should not keep using older versions. https://blog.godatadriven.com/apache-avro-1-9-release ### Modifications Avro 1.9 has some major changes: - The library used to handle logical datetime values has changed from Joda-Time to JSR-310 (apache/avro#631) - Namespaces no longer include "$" when generating schemas containing inner classes using ReflectData (apache/avro#283) - Validation of default values has been enabled (apache/avro#288). This results in a validation error when parsing the following schema: ```json { "name": "fieldName", "type": [ "null", "string" ], "default": "defaultValue" } ``` The default value of a nullable field must be null (cf. https://issues.apache.org/jira/browse/AVRO-1803), and the default value of the field as above is actually null. However, this PR disables the validation in order to maintain the traditional behavior. (cherry picked from commit d6f240e)
After upgrade to Apache Avro 1.9.x, the default time conversion changed to JSR-310. For forwarding compatibility, we'd better add the Joda time conversion. related to apache#5938 Add joda time conversions New integration test added (cherry picked from commit 854716f)
### Motivation Currently, Pulsar uses Avro 1.8.2, a version released two years ago. The latest version of Avro is 1.9.1, which uses FasterXML's Jackson 2.x instead of Codehaus's Jackson 1.x. Jackson is prone to security issues, so we should not keep using older versions. https://blog.godatadriven.com/apache-avro-1-9-release ### Modifications Avro 1.9 has some major changes: - The library used to handle logical datetime values has changed from Joda-Time to JSR-310 (apache/avro#631) - Namespaces no longer include "$" when generating schemas containing inner classes using ReflectData (apache/avro#283) - Validation of default values has been enabled (apache/avro#288). This results in a validation error when parsing the following schema: ```json { "name": "fieldName", "type": [ "null", "string" ], "default": "defaultValue" } ``` The default value of a nullable field must be null (cf. https://issues.apache.org/jira/browse/AVRO-1803), and the default value of the field as above is actually null. However, this PR disables the validation in order to maintain the traditional behavior.
### Motivation After upgrade to Apache Avro 1.9.x, the default time conversion changed to JSR-310. For forwarding compatibility, we'd better add the Joda time conversion. related to apache#5938 ### Modifications Add joda time conversions ### Verifying this change New integration test added
Motivation
Currently, Pulsar uses Avro 1.8.2, a version released two years ago. The latest version of Avro is 1.9.1, which uses FasterXML's Jackson 2.x instead of Codehaus's Jackson 1.x. Jackson is prone to security issues, so we should not keep using older versions.
https://blog.godatadriven.com/apache-avro-1-9-release
Modifications
Avro 1.9 has some major changes:
The default value of a nullable field must be null (cf. https://issues.apache.org/jira/browse/AVRO-1803), and the default value of the field as above is actually null. However, this PR disables the validation in order to maintain the traditional behavior.