Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Read header and footer if available #99

Open
wants to merge 7 commits into
base: trunk
Choose a base branch
from

Conversation

sebastian-alfers
Copy link

This commit fixes an issue, that header-values can not be restored correctly after an event was avro-serialized to disk.

The problem was, that deserializing the event from disk, the body of the event contained the binary and the header.

See: https://issues.apache.org/jira/browse/FLUME-2942

Discussion welcome!

Copy link
Contributor

@adenes adenes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sebastian-alfers, thank you for the pull request, I had some comments on it. Let me know if you have any question regarding to them.

public abstract class AbstractAvroEventDeserializer implements EventDeserializer {

protected AvroSchemaType schemaType;
protected ResettableInputStream ris;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be better to change this to private final: just add it to the constructor and replace the this.ris = ris with a super(ris) call in the subclasses.
There is no need for the subclasses to access this and pushing the responsibility of setting it (and also giving the possibility to modify it) to the subclasses would make it very error-prone.

= "flume.avro.schema.literal";


protected void initialize(ResettableInputStream ris)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parameter is not needed if AbstractAvroEventDeserializer.ris is changed to private final.

@@ -26,6 +26,7 @@
public enum EventDeserializerType {
LINE(LineDeserializer.Builder.class),
AVRO(AvroEventDeserializer.Builder.class),
FLUME(FlumeEventAvroEventDeserializer.Builder.class),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think FLUME_AVRO would be a more expressive name for this.
Please add this to the user guide as well. Thanks.

private FlumeEventAvroEventDeserializer(Context context, ResettableInputStream ris) {
this.ris = ris;

schemaType = AvroSchemaType.valueOf(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This (line 48-57) is a copy-paste from AvroEventDeserializer. Could you please simplify it by pushing it (from here and from the AvroEventDeserializer) up to the AbstractAvroEventDeserializer?
If the logic is moved to the parent class then its schemaType field can be changed to private final.

SeekableResettableInputBridge in = new SeekableResettableInputBridge(ris);
long pos = in.tell();
in.seek(0L);
fileReader = new DataFileReader<GenericRecord>(in,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: explicit type argument can be replaced with <>

@sebastian-alfers
Copy link
Author

@adenes Thanks fore your comments, I updated the pr

@sebastian-alfers
Copy link
Author

@adenes any comments?

@sebastian-alfers
Copy link
Author

@adenes ping

@sebastian-alfers
Copy link
Author

sebastian-alfers commented Feb 14, 2017

@laxman-ch @adenes @simonati anyone likes to push this PR forward? Comments welcome

@simonati
Copy link

Thanks for the heads up! Let me take a look today.

@simonati
Copy link

Hi @sebastian-alfers,

This change is quite big and hard to follow. Please give me more time to finish the review or help any other reviewer by making consumption of your changes easier by leaving some more hints about the ideas behind your changes.

@sebastian-alfers
Copy link
Author

@simonati Thanks for you comment. Actually, I wrote down all my ideas in the corresponding jira ticket. If you have any particular question feel free to point them out

@sebastian-alfers
Copy link
Author

Push

@asfgit
Copy link

asfgit commented Aug 17, 2018

Can one of the admins verify this patch?

waidr pushed a commit to waidr/flume that referenced this pull request Jul 24, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants