Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ThriftBinaryDeserializer incompatible with thrift 0.9.1+ #452

Closed
EugenCepoi opened this issue Oct 13, 2015 · 12 comments
Closed

ThriftBinaryDeserializer incompatible with thrift 0.9.1+ #452

EugenCepoi opened this issue Oct 13, 2015 · 12 comments

Comments

@EugenCepoi
Copy link
Contributor

Since thrift 0.9.1 the method setReadLength from TBinaryProtocol has been removed but ThriftBinaryDeserializer is still using it.

I couldn't find why it has been removed or any replacement to it. This issue happens when using emr 4 and spark 1.5 as they use thrift 0.9.2. If "the fix" is to just remove the calls to that method and make sure it works with latest thrift releases I can open a PR.

@ianoc
Copy link
Contributor

ianoc commented Oct 13, 2015

I believe for spark you should use their classloader options to isolate. Unfortunately we consume our thrift via github.com/twitter/scrooge which is based off apache thrift 0.5 or so. This method is super useful (probably always, not sure why it got removed) for us since it avoids OOM's when corrupt data reports that arrays should be huge

@EugenCepoi
Copy link
Contributor Author

The option in spark is experimental and works only in cluster mode, which in my case is not possible. I understand that this maybe doesn't match with the versions used at twitter, but at the same time freezing the version of some dependency prevents other users from upgrading too.

Some ideas:

  • Parquet seems to provide a way to build against thrift 0.9 via maven profile, maybe the same could be done for EB?
  • If the dependency on thrift is "internal" then perhaps it could be shaded and relocated to some internal package to avoid conflicts?

@ianoc
Copy link
Contributor

ianoc commented Oct 14, 2015

Well its freezing the versions used by our other open source projects, util, finagle, scrooge. So unless scrooge upgrades or removes its dependency on libthrift we cannot really upgrade.

You could investigate doing the maven profile thing, which you would need to include different versions of sources depending on the profile, that would be fine with me.
We can't shade anything since several OSS projects use the thrift version as it is, maybe eventually they will cut that dependency, but it would be a large change to thousands of lines of code, so won't likely soon.

The maven profile approach sounds like a decent approach though, would love a PR with it

@EugenCepoi
Copy link
Contributor Author

Ok I am looking at it. If we go for the maven profile I see two alternatives:

  • Have two versions of the files that are incompatible, one for the old version and another one for the new. I dont like much the idea of having two versions of the source code but it would allow us to continue using setReadLength for the old version.
  • Update the code to be compatible with both versions (from a quick look seems possible). The nice thing is that this makes things much easier and cleaner but the case you describe should be fixed in some other way.

WDYT, for which one should I go?

EugenCepoi added a commit to EugenCepoi/elephant-bird that referenced this issue Nov 6, 2015
EugenCepoi added a commit to EugenCepoi/elephant-bird that referenced this issue Nov 6, 2015
EugenCepoi added a commit to EugenCepoi/elephant-bird that referenced this issue Nov 6, 2015
EugenCepoi added a commit to EugenCepoi/elephant-bird that referenced this issue Nov 6, 2015
EugenCepoi added a commit to EugenCepoi/elephant-bird that referenced this issue Nov 6, 2015
EugenCepoi added a commit to EugenCepoi/elephant-bird that referenced this issue Nov 6, 2015
EugenCepoi added a commit to EugenCepoi/elephant-bird that referenced this issue Nov 6, 2015
EugenCepoi added a commit to EugenCepoi/elephant-bird that referenced this issue Nov 6, 2015
EugenCepoi added a commit to EugenCepoi/elephant-bird that referenced this issue Nov 6, 2015
@EugenCepoi
Copy link
Contributor Author

Sorry for the noise, had a bit of trouble to make the build pass with travis.
@ianoc does this look good to you? Should we also do cross thrift version releases using classifiers?

@ianoc
Copy link
Contributor

ianoc commented Nov 11, 2015

Thanks for the contribution in the PR @EugenCepoi , the multiple maven profiles while having a small amount of duplication look like a good way to go here as you've done it.

EugenCepoi added a commit to EugenCepoi/elephant-bird that referenced this issue Nov 12, 2015
@ianoc ianoc closed this as completed in e73bb6e Dec 17, 2015
ianoc added a commit that referenced this issue Dec 17, 2015
Fixes #452 - supporting thrift 0.7 and 0.9
@EugenCepoi
Copy link
Contributor Author

@ianoc do you think you could make a release including the merged changes? Thanks!

@ianoc
Copy link
Contributor

ianoc commented Jan 5, 2016

We are looking into it post holidays now, the release script makes assumptions about being on a debian/ubuntu env, so little effort to get going. Will let you know

@EugenCepoi
Copy link
Contributor Author

I never managed to build thrift 0.7 on a mac, so I guessed that you are releasing from something else, most likely debian/ubuntu. Looks like I was wrong...on the other hand you can easily use a vagrant box or docker to do the release, as in theory you only need the maven credentials for sonatype. I used a vagrant with debian to test it.

@ianoc
Copy link
Contributor

ianoc commented Jan 5, 2016

Yeah vagrant or virtualbox is the plan, I just need to get time this week to set it up get the creds in place and do it really. Should be done before end of week

@EugenCepoi
Copy link
Contributor Author

Cool thanks! Let me know if I can be of some help :)

@laurentgo
Copy link
Contributor

FYI You can get working thrift binaries for Mac from the pantsbuild binaries project -> https://github.com/pantsbuild/binaries

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants