upgrade protobuf compiler and library to 2.5.0 #358

Closed
simonandluna opened this issue Nov 25, 2013 · 17 comments · Fixed by #418

Comments

@simonandluna

It seems the current elephant-bird still uses the 2.4.1 protobuf compiler and library, which is 2.5 years old. We are currently using the 2.5.0 protobuf library in our project. Unfortunately, the 2.5.0 library does not fully support Java code generated by the 2.4.1 compiler.

This caused an issue when we tried to use SerializedBlock class in block_storage.proto. Error messages look like:

java.lang.UnsupportedOperationException: This is supposed to be overridden by subclasses
at com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
at com.twitter.data.proto.BlockStorage$SerializedBlock.getSerializedSize(BlockStorage.java:164)
at com.google.protobuf.AbstractMessageLite.toByteArray(AbstractMessageLite.java:62)

Would you consider upgrading the protobuf compiler and library in elephant-bird?
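For readers new to this failure, here is a simplified, stdlib-only model of what the stack trace above suggests is happening. It is a sketch, not protobuf's actual source: `BaseMessage`, `OldGeneratedMessage`, `NewGeneratedMessage`, and `CompatDemo` are hypothetical names, and the assumption (inferred from the exception message) is that the 2.5.0 runtime expects generated classes to override `getUnknownFields()`, which code generated by the 2.4.1 compiler does not do.

```java
// Stand-in for the runtime base class as the 2.5.0 stack trace suggests it
// behaves: the accessor throws unless a subclass overrides it.
abstract class BaseMessage {
    Object getUnknownFields() {
        throw new UnsupportedOperationException(
            "This is supposed to be overridden by subclasses.");
    }

    abstract int getSerializedSize();
}

// Models code generated by the older compiler (assumption): it relies on the
// base-class accessor and never overrides it.
class OldGeneratedMessage extends BaseMessage {
    @Override
    int getSerializedSize() {
        getUnknownFields(); // throws against the newer runtime model
        return 0;
    }
}

// Models code generated by the newer compiler (assumption): it carries its
// own unknown-field storage and overrides the accessor.
class NewGeneratedMessage extends BaseMessage {
    private final Object unknownFields = new Object();

    @Override
    Object getUnknownFields() {
        return unknownFields;
    }

    @Override
    int getSerializedSize() {
        getUnknownFields(); // succeeds because the accessor is overridden
        return 0;
    }
}

final class CompatDemo {
    private CompatDemo() {}

    // Returns true if the message can compute its size without throwing.
    static boolean serializes(BaseMessage message) {
        try {
            message.getSerializedSize();
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("old generated code: " + serializes(new OldGeneratedMessage()));
        System.out.println("new generated code: " + serializes(new NewGeneratedMessage()));
    }
}
```

If this model is right, the mismatch sits between the generated classes and the runtime jar, which would explain why regenerating the classes with the matching compiler makes the error go away.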

@rangadi
Contributor

rangadi commented Dec 3, 2013

Is there a workaround for you? If you build EB with 2.5.0, would it work?

Are 2.5.0-generated Java files backward compatible?

@simonandluna
Author

Thanks. That's what we did: we rebuilt EB with the 2.5.0 protobuf compiler to work around the compatibility issue with 2.4.1. We have to set up an internal Maven repository to host the binary.

@lukasnalezenec

How about excluding the 2.4.1 Maven dependency in your pom and replacing it with 2.5.0? It might work without recompiling EB.
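When experimenting with a dependency swap like this, it can help to confirm which jar a class is actually loaded from at runtime. The following is a minimal, stdlib-only sketch; `ClasspathProbe` and `whereIs` are hypothetical names introduced for illustration, not part of EB or protobuf.

```java
import java.security.CodeSource;

final class ClasspathProbe {
    private ClasspathProbe() {}

    // Reports the location (usually a jar URL) a class was loaded from,
    // or a short message when the class or its location is unavailable.
    static String whereIs(String className) {
        try {
            // Load without running static initializers; we only need the location.
            Class<?> cls = Class.forName(
                className, false, ClasspathProbe.class.getClassLoader());
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            if (source == null || source.getLocation() == null) {
                return "no code source (bootstrap class?)";
            }
            return source.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "not on classpath";
        }
    }

    public static void main(String[] args) {
        // Class name taken from the stack trace in this thread.
        System.out.println(whereIs("com.google.protobuf.GeneratedMessage"));
    }
}
```

Running this inside the job's JVM shows whether the 2.4.1 or the 2.5.0 jar wins on the classpath. It does not tell you whether the generated classes inside EB are compatible with that jar.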

@13h3r

13h3r commented Mar 3, 2014

Without 2.5 support it is impossible to use EB with Spark :(

@rangadi
Contributor

rangadi commented Mar 3, 2014

I see. We need a fix for this.

The only dependency on pre-generated protobufs is com.twitter.data.proto.BlockStorage (used for BlockStorage). One option is to replace BlockStorage with a DynamicMessage so that we don't depend on any such pre-generated files.

@rangadi
Contributor

rangadi commented Mar 5, 2014

Can you check whether pull #373 fixes this issue? It replaces the generated SerializedBlock protobuf with a DynamicMessage built at runtime.

@rangadi
Contributor

rangadi commented Mar 5, 2014

Fixed in #373. Let us know if it does not fix the problem.

@rangadi rangadi closed this as completed Mar 5, 2014
@shwethags

@rangadi, can the default protoc version be upgraded to 2.5.0?

@rangadi
Contributor

rangadi commented Sep 24, 2014

I think we can. Right now the only reason we use protoc is for tests.

What is the issue this is causing in your case? EB should work fine with protobuf 2.5.0 at runtime.

@shwethags

We get the following exception with EB 4.5:
Caused by: java.lang.UnsupportedOperationException: This is supposed to be overridden by subclasses.
at com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
at com.twitter.data.proto.Misc$ColumnarMetadata.getSerializedSize(Misc.java:596)
at com.google.protobuf.AbstractMessageLite.toByteArray(AbstractMessageLite.java:62)
at com.twitter.elephantbird.util.Protobufs.toText(Protobufs.java:289)
at com.twitter.elephantbird.mapreduce.output.RCFileThriftOutputFormat$ThriftWriter.<init>(RCFileThriftOutputFormat.java:105)
at com.twitter.elephantbird.mapreduce.output.RCFileThriftOutputFormat.getRecordWriter(RCFileThriftOutputFormat.java:235)
at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.<init>(ReduceTask.java:540)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:614)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:695)

Is it fixed in trunk? Do any of the releases have this fix?

@rangadi
Contributor

rangadi commented Sep 25, 2014

I see, thanks for reporting it. We need to handle 'ColumnarMetadata' used in RCFileThriftOutputFormat with a DynamicProto. I will have a patch soon.

Are you able to build EB with 2.5.0 to unblock?

@shwethags

Yes, we were able to build EB with proto 2.5.0 and it works fine. Thanks.

@rangadi
Contributor

rangadi commented Sep 29, 2014

Swetha, please see #418. It removes the remaining two protobufs from the EB build. This should make EB RCFile work with either version of protobuf.

Now protoc is used only for protobufs used in tests.

@shwethags

Great, thank you. Is there a release planned soon with this fix?

@row-column

I have this problem too. My project has been running for a long time, but recently I needed to add a new function, and when I tried to run the project in my Hadoop cluster environment, this problem appeared.

@datonli

datonli commented Aug 10, 2015

My code ran into this problem recently. How can I fix it, please?

@datonli

datonli commented Aug 10, 2015

It makes me mad

7 participants