@JoshRosen
Contributor

This patch upgrades Jackson from 2.5.3 to 2.7.3. I'd like to upgrade now in order to take advantage of new performance improvements and features, as well as to be better-prepared for when we'll want to upgrade for Scala 2.12 support.

@JoshRosen JoshRosen changed the title from "[SPARK-14989] Upgrade Jackson from 2.5.3 to 2.7.3" to "[SPARK-14989][BUILD] Upgrade Jackson from 2.5.3 to 2.7.3" Apr 28, 2016
@SparkQA

SparkQA commented Apr 28, 2016

Test build #57282 has finished for PR 12766 at commit 264f76f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen
Contributor Author

/cc @srowen

opencsv-2.3.jar
oro-2.0.8.jar
-paranamer-2.6.jar
+paranamer-2.3.jar
Member

Weird that this version went down

@srowen
Member

srowen commented Apr 29, 2016

I like it. The only looming problem is whether this works in the context of Hadoop classes which may have a different Jackson version. As long as 2.7 wins, it probably works, but it bears testing. Shading could help if there's a problem. This dependency is always a little problematic but worth pushing forward.
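
One way to test which Jackson 2.x copy "wins" on a given classpath is to ask the JVM directly. The following is a minimal sketch, assuming the `PackageVersion` classes shipped in jackson-core and jackson-databind, each of which exposes a static `VERSION` field. The class name `JacksonVersionCheck` and the helper `describe` are hypothetical names used only for illustration. Reflection is used so the snippet compiles and runs even when Jackson is absent.

```java
import java.lang.reflect.Field;
import java.security.CodeSource;

public class JacksonVersionCheck {

    /**
     * Hypothetical helper: looks up a class by name, reads its public static
     * VERSION field, and reports where the class was loaded from.
     */
    public static String describe(String className) {
        try {
            Class<?> cls = Class.forName(className);
            Field versionField = cls.getField("VERSION");
            Object version = versionField.get(null); // static field, so no instance is needed
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            String location = (source == null)
                    ? "unknown location"
                    : String.valueOf(source.getLocation());
            return className + " " + version + " from " + location;
        } catch (ClassNotFoundException e) {
            return className + " not on classpath";
        } catch (ReflectiveOperationException e) {
            // The class was found but has no readable static VERSION field.
            return className + " present, version unreadable: " + e;
        }
    }

    public static void main(String[] args) {
        // Assumed class names: the PackageVersion classes in jackson-databind and jackson-core.
        System.out.println(describe("com.fasterxml.jackson.databind.cfg.PackageVersion"));
        System.out.println(describe("com.fasterxml.jackson.core.json.PackageVersion"));
    }
}
```

Running this inside the same JVM as the job (for example from a driver or executor) would show both the version and the jar it came from, which is the information needed to decide whether shading is required.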

@srowen
Member

srowen commented Apr 29, 2016

Actually, this may be fine. Hadoop includes Jackson 1.x, and the only things pulling in 2.x besides Spark seem to be Dropwizard Metrics and Scala. Those may be much more manageable.

@JoshRosen
Contributor Author

My impression is that Jackson's backwards-compatibility story is good.

It looks like Jackson 2.7.4 was just released a few hours ago, so maybe we should wait a day and upgrade straight to that instead: https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.7.4

@cowtowncoder, since you mentioned wanting to get the various "big data" frameworks to upgrade to newer Jackson versions, do you know of any compatibility issues that we should be aware of? Just thought I'd ask since you might have received feedback from other projects that are performing the same upgrade.

@cowtowncoder

@JoshRosen yes, thank you for following up on this!

On compatibility: there were some issues with 2.7.0 - 2.7.2, regarding type resolution, most of which were fixed with 2.7.3.
With 2.7.4 (released today) I hope that regressions are finally dealt with; there were some more esoteric edge cases.

2.6.x had one compatibility issue relative to 2.5.x that users have had problems with (JsonInclude.Include.NON_EMPTY applying to default values of primitive/wrapper types), so it may be safest to skip that version (2.5.5 is the latest release in the 2.5 line). The behavior of 2.7 is the same as that of 2.5; 2.7 added better support for the JsonInclude.Include.NON_DEFAULT option to allow suppression of values like 0 for int, while keeping NON_EMPTY reserved for containers (and the empty String).
This is the only major compatibility concern I am aware of.

So with Spark 2.0, I would recommend going with 2.7.4.

One additional thing that may make 2.7 the best choice is that we once again have an active owner for the Scala module, and there has been progress in getting some long-open bugs fixed for 2.7.4.
There is also a Scala 2.12 variant of that module now (starting with 2.7.3).

@JoshRosen
Contributor Author

Closing this PR for now since this upgrade was performed in a different PR while I was away.

@JoshRosen JoshRosen closed this May 27, 2016
@JoshRosen JoshRosen deleted the upgrade-jackson branch May 27, 2016 16:53