Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AVRO-2918: Polymorphism #1776

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

Conversation

clesaec
Copy link
Contributor

@clesaec clesaec commented Jul 21, 2022

AVRO-2918 : Proposition for polymorphism

@github-actions github-actions bot added the Java Pull Requests for Java binding label Jul 21, 2022
Comment on lines +237 to +243
final Schema realSchema;
if (expected.hasChild()) {
int index = in.readExtends();
realSchema = expected.findInHierachy(index);
} else {
realSchema = expected;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be at odds with GenericDatumWriter, lines 239-241: there, the extends is optional, based on whether the parent schema or one of its children is written.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indeed, i have to add unit test and fix that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You were right, so now writer will always write the "extends" symbol if schema has child.

Comment on lines 239 to 241
if (!Objects.equals(realSchema, schema)) {
out.writeExtends(realSchema.getIndex());
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This creates two possible binary representations of the same (parent) schema: one for writing a parent schema, and a different one for child schemata.

How can the reader distinguish between the two?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@@ -178,6 +178,11 @@ public void writeIndex(int unionIndex) throws IOException {
encodeLong(unionIndex, out);
}

@Override
public void writeExtends(int extendsIndex) throws IOException {
encodeLong(extendsIndex, out);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the extends clause be supported by the legacy encoder at all?

Or should it throw?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know ... i can throw an exception ....

@@ -345,6 +379,9 @@ public String getFullName() {
return getName();
}

public void terminateInit() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this method do?

I understand methods like hasChild, visitHierarchy and findInHierarchy above (though they also need javadoc), but I cannot understand the what&why of this one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a kind of PostConstruct, for SchemaRecord, once all schemas has been built and list of fields set.
But, indeed, i will see if we can do without it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, sort of like how the fields are marked immutable once set for the first time? (and any field use before they are set yields an exception)

How about this idea:

  1. in the constructor, mark if the record schema will have children (if the parameter is omitted, it will not)
  2. have a method where you can set all child schemata, once

This behaves like setFields: it allows us both to finalise initialisation and makes the schema unusable for (de)serialisation until the child schemata have been set (so it cannot be forgotten).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, i finally find an easy way to remove this weird method.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw; lazy initialisation. Also a good idea.

@RyanSkraba RyanSkraba changed the title Avro 2918 polymorphism AVRO-2918: Polymorphism Aug 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Java Pull Requests for Java binding
Projects
None yet
2 participants