Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Schema] Fix parse BigDecimal #14019

Merged
merged 1 commit into from
Jan 29, 2022
Merged

Conversation

nodece
Copy link
Member

@nodece nodece commented Jan 28, 2022

Signed-off-by: Zixuan Liu nodeces@gmail.com

Motivation

I can use Avro schema with BigDecimal in Pulsar 2.8.0, but this doesn't work on Pulsar 2.8.1, so I check this codebase and PR, found #10428 breaks this.

The following is error log:

org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:154) ~[spring-beans-5.3.4.jar:5.3.4]
	... 34 common frames omitted
Caused by: java.lang.UnsupportedOperationException: No recommended schema for decimal (scale is required)
	at org.apache.pulsar.shade.org.apache.avro.Conversions$DecimalConversion.getRecommendedSchema(Conversions.java:73) ~[pulsar-client-2.8.1.jar:2.8.1]
	at org.apache.pulsar.shade.org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:696) ~[pulsar-client-2.8.1.jar:2.8.1]
	at org.apache.pulsar.shade.org.apache.avro.reflect.ReflectData.createFieldSchema(ReflectData.java:873) ~[pulsar-client-2.8.1.jar:2.8.1]
	at org.apache.pulsar.shade.org.apache.avro.reflect.ReflectData$AllowNull.createFieldSchema(ReflectData.java:92) ~[pulsar-client-2.8.1.jar:2.8.1]
	at org.apache.pulsar.shade.org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:736) ~[pulsar-client-2.8.1.jar:2.8.1]
	at org.apache.pulsar.shade.org.apache.avro.specific.SpecificData$3.computeValue(SpecificData.java:328) ~[pulsar-client-2.8.1.jar:2.8.1]
	at org.apache.pulsar.shade.org.apache.avro.specific.SpecificData$3.computeValue(SpecificData.java:325) ~[pulsar-client-2.8.1.jar:2.8.1]
	at java.base/java.lang.ClassValue.getFromHashMap(ClassValue.java:228) ~[na:na]
	at java.base/java.lang.ClassValue.getFromBackup(ClassValue.java:210) ~[na:na]
	at java.base/java.lang.ClassValue.get(ClassValue.java:116) ~[na:na]
	at org.apache.pulsar.shade.org.apache.avro.specific.SpecificData.getSchema(SpecificData.java:339) ~[pulsar-client-2.8.1.jar:2.8.1]
	... 52 common frames omitted

I think that Avro cannot work on using the ReflectData with Conversions.DecimalConversion to parse the BigDecimal field without AvroSchema when getting schema, but the Avro ReflectDatumWriter and ReflectDatumReader are working, it seems that Avro support for BigDecimal is not enough.

Affected version: 2.8.1...2.8.x, 2.9.x

Modifications

  • Skip add the DecimalConversion in extractAvroSchema()

Verifying this change

  • Make sure that the change passes the CI checks.

Documentation

Need to update docs?

  • no-need-doc

@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Jan 28, 2022
@codelipenghui
Copy link
Contributor

@nodece Please help add the root cause in the PR description.

@nodece
Copy link
Member Author

nodece commented Jan 29, 2022

/pulsarbot rerun-failure-checks

Signed-off-by: Zixuan Liu <nodeces@gmail.com>
@codelipenghui
Copy link
Contributor

@nodece Looks

protected static Schema createAvroSchema(SchemaDefinition schemaDefinition) {
Class pojo = schemaDefinition.getPojo();
if (StringUtils.isNotBlank(schemaDefinition.getJsonDef())) {
return parseAvroSchema(schemaDefinition.getJsonDef());
} else if (pojo != null) {
ThreadLocal<Boolean> validateDefaults = null;
try {
Field validateDefaultsField = Schema.class.getDeclaredField("VALIDATE_DEFAULTS");
validateDefaultsField.setAccessible(true);
validateDefaults = (ThreadLocal<Boolean>) validateDefaultsField.get(null);
} catch (NoSuchFieldException | IllegalAccessException e) {
throw new RuntimeException("Cannot disable validation of default values", e);
}
final boolean savedValidateDefaults = validateDefaults.get();
try {
// Disable validation of default values for compatibility
validateDefaults.set(false);
return extractAvroSchema(schemaDefinition, pojo);
} finally {
validateDefaults.set(savedValidateDefaults);
}
} else {
throw new RuntimeException("Schema definition must specify pojo class or schema json definition");
}
}
protected static Schema extractAvroSchema(SchemaDefinition schemaDefinition, Class pojo) {
try {
return parseAvroSchema(pojo.getDeclaredField("SCHEMA$").get(null).toString());
} catch (NoSuchFieldException | IllegalAccessException | IllegalArgumentException ignored) {
return schemaDefinition.getAlwaysAllowNull() ? ReflectData.AllowNull.get().getSchema(pojo)
: ReflectData.get().getSchema(pojo);
}
}
protected static Schema parseAvroSchema(String schemaJson) {
final Schema.Parser parser = new Schema.Parser();
parser.setValidateDefaults(false);
return parser.parse(schemaJson);
}
public static <T> SchemaInfo parseSchemaInfo(SchemaDefinition<T> schemaDefinition, SchemaType schemaType) {
return SchemaInfoImpl.builder()
.schema(createAvroSchema(schemaDefinition).toString().getBytes(UTF_8))
.properties(schemaDefinition.getProperties())
.name("")
.type(schemaType).build();
}
have duplicated static method with SchemaUtil

@codelipenghui
Copy link
Contributor

codelipenghui commented Jan 29, 2022

This root cause is here: https://github.com/apache/avro/blame/master/lang/java/avro/src/main/java/org/apache/avro/reflect/ReflectData.java#L687-L699

Because BigDecimal is a stringableClasse, without @org.apache.avro.reflect.AvroSchema can work.
https://github.com/apache/avro/blame/master/lang/java/avro/src/main/java/org/apache/avro/reflect/ReflectData.java#L713-L715

https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/specific/SpecificData.java#L120

  protected Set<Class> stringableClasses = new HashSet<>(Arrays.asList(java.math.BigDecimal.class,
      java.math.BigInteger.class, java.net.URI.class, java.net.URL.class, java.io.File.class));

So the difference between without annotation and without annotation is

With annotation will encode as bytes.

@org.apache.avro.reflect.AvroSchema("{\n" +
                "  \"type\": \"bytes\",\n" +
                "  \"logicalType\": \"decimal\",\n" +
                "  \"precision\": 4,\n" +
                "  \"scale\": 2\n" +
                "}")

Without annotation will encode as String

{"type":"string","java-class":"java.math.BigDecimal"}

So, they should be incompatible.

Copy link
Contributor

@gaoran10 gaoran10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@nodece
Copy link
Member Author

nodece commented Jan 29, 2022

@nodece Looks

protected static Schema createAvroSchema(SchemaDefinition schemaDefinition) {
Class pojo = schemaDefinition.getPojo();
if (StringUtils.isNotBlank(schemaDefinition.getJsonDef())) {
return parseAvroSchema(schemaDefinition.getJsonDef());
} else if (pojo != null) {
ThreadLocal<Boolean> validateDefaults = null;
try {
Field validateDefaultsField = Schema.class.getDeclaredField("VALIDATE_DEFAULTS");
validateDefaultsField.setAccessible(true);
validateDefaults = (ThreadLocal<Boolean>) validateDefaultsField.get(null);
} catch (NoSuchFieldException | IllegalAccessException e) {
throw new RuntimeException("Cannot disable validation of default values", e);
}
final boolean savedValidateDefaults = validateDefaults.get();
try {
// Disable validation of default values for compatibility
validateDefaults.set(false);
return extractAvroSchema(schemaDefinition, pojo);
} finally {
validateDefaults.set(savedValidateDefaults);
}
} else {
throw new RuntimeException("Schema definition must specify pojo class or schema json definition");
}
}
protected static Schema extractAvroSchema(SchemaDefinition schemaDefinition, Class pojo) {
try {
return parseAvroSchema(pojo.getDeclaredField("SCHEMA$").get(null).toString());
} catch (NoSuchFieldException | IllegalAccessException | IllegalArgumentException ignored) {
return schemaDefinition.getAlwaysAllowNull() ? ReflectData.AllowNull.get().getSchema(pojo)
: ReflectData.get().getSchema(pojo);
}
}
protected static Schema parseAvroSchema(String schemaJson) {
final Schema.Parser parser = new Schema.Parser();
parser.setValidateDefaults(false);
return parser.parse(schemaJson);
}
public static <T> SchemaInfo parseSchemaInfo(SchemaDefinition<T> schemaDefinition, SchemaType schemaType) {
return SchemaInfoImpl.builder()
.schema(createAvroSchema(schemaDefinition).toString().getBytes(UTF_8))
.properties(schemaDefinition.getProperties())
.name("")
.type(schemaType).build();
}

have duplicated static method with SchemaUtil

Has been deprecated.

@hangc0276 hangc0276 merged commit 2ca8e8a into apache:master Jan 29, 2022
codelipenghui pushed a commit that referenced this pull request Jan 30, 2022
### Motivation

I can use Avro schema with BigDecimal in Pulsar 2.8.0, but this doesn't work on Pulsar 2.8.1, so I check this codebase and PR, found #10428 breaks this.

The following is error log:
```
org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:154) ~[spring-beans-5.3.4.jar:5.3.4]
	... 34 common frames omitted
Caused by: java.lang.UnsupportedOperationException: No recommended schema for decimal (scale is required)
	at org.apache.pulsar.shade.org.apache.avro.Conversions$DecimalConversion.getRecommendedSchema(Conversions.java:73) ~[pulsar-client-2.8.1.jar:2.8.1]
	at org.apache.pulsar.shade.org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:696) ~[pulsar-client-2.8.1.jar:2.8.1]
	at org.apache.pulsar.shade.org.apache.avro.reflect.ReflectData.createFieldSchema(ReflectData.java:873) ~[pulsar-client-2.8.1.jar:2.8.1]
	at org.apache.pulsar.shade.org.apache.avro.reflect.ReflectData$AllowNull.createFieldSchema(ReflectData.java:92) ~[pulsar-client-2.8.1.jar:2.8.1]
	at org.apache.pulsar.shade.org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:736) ~[pulsar-client-2.8.1.jar:2.8.1]
	at org.apache.pulsar.shade.org.apache.avro.specific.SpecificData$3.computeValue(SpecificData.java:328) ~[pulsar-client-2.8.1.jar:2.8.1]
	at org.apache.pulsar.shade.org.apache.avro.specific.SpecificData$3.computeValue(SpecificData.java:325) ~[pulsar-client-2.8.1.jar:2.8.1]
	at java.base/java.lang.ClassValue.getFromHashMap(ClassValue.java:228) ~[na:na]
	at java.base/java.lang.ClassValue.getFromBackup(ClassValue.java:210) ~[na:na]
	at java.base/java.lang.ClassValue.get(ClassValue.java:116) ~[na:na]
	at org.apache.pulsar.shade.org.apache.avro.specific.SpecificData.getSchema(SpecificData.java:339) ~[pulsar-client-2.8.1.jar:2.8.1]
	... 52 common frames omitted
```

I think that Avro cannot work on using the ReflectData with Conversions.DecimalConversion to parse the BigDecimal field without AvroSchema when getting schema, but the Avro ReflectDatumWriter and ReflectDatumReader are working, it seems that Avro support for BigDecimal is not enough.

Affected version: 2.8.1...2.8.x, 2.9.x

### Modifications

- Skip add the DecimalConversion in `extractAvroSchema()`

(cherry picked from commit 2ca8e8a)
@codelipenghui codelipenghui added the cherry-picked/branch-2.9 Archived: 2.9 is end of life label Jan 30, 2022
zymap pushed a commit that referenced this pull request Jan 30, 2022
### Motivation

I can use Avro schema with BigDecimal in Pulsar 2.8.0, but this doesn't work on Pulsar 2.8.1, so I check this codebase and PR, found #10428 breaks this.

The following is error log:
```
org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:154) ~[spring-beans-5.3.4.jar:5.3.4]
	... 34 common frames omitted
Caused by: java.lang.UnsupportedOperationException: No recommended schema for decimal (scale is required)
	at org.apache.pulsar.shade.org.apache.avro.Conversions$DecimalConversion.getRecommendedSchema(Conversions.java:73) ~[pulsar-client-2.8.1.jar:2.8.1]
	at org.apache.pulsar.shade.org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:696) ~[pulsar-client-2.8.1.jar:2.8.1]
	at org.apache.pulsar.shade.org.apache.avro.reflect.ReflectData.createFieldSchema(ReflectData.java:873) ~[pulsar-client-2.8.1.jar:2.8.1]
	at org.apache.pulsar.shade.org.apache.avro.reflect.ReflectData$AllowNull.createFieldSchema(ReflectData.java:92) ~[pulsar-client-2.8.1.jar:2.8.1]
	at org.apache.pulsar.shade.org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:736) ~[pulsar-client-2.8.1.jar:2.8.1]
	at org.apache.pulsar.shade.org.apache.avro.specific.SpecificData$3.computeValue(SpecificData.java:328) ~[pulsar-client-2.8.1.jar:2.8.1]
	at org.apache.pulsar.shade.org.apache.avro.specific.SpecificData$3.computeValue(SpecificData.java:325) ~[pulsar-client-2.8.1.jar:2.8.1]
	at java.base/java.lang.ClassValue.getFromHashMap(ClassValue.java:228) ~[na:na]
	at java.base/java.lang.ClassValue.getFromBackup(ClassValue.java:210) ~[na:na]
	at java.base/java.lang.ClassValue.get(ClassValue.java:116) ~[na:na]
	at org.apache.pulsar.shade.org.apache.avro.specific.SpecificData.getSchema(SpecificData.java:339) ~[pulsar-client-2.8.1.jar:2.8.1]
	... 52 common frames omitted
```

I think that Avro cannot work on using the ReflectData with Conversions.DecimalConversion to parse the BigDecimal field without AvroSchema when getting schema, but the Avro ReflectDatumWriter and ReflectDatumReader are working, it seems that Avro support for BigDecimal is not enough.

Affected version: 2.8.1...2.8.x, 2.9.x

### Modifications

- Skip add the DecimalConversion in `extractAvroSchema()`

(cherry picked from commit 2ca8e8a)
@zymap zymap added the cherry-picked/branch-2.8 Archived: 2.8 is end of life label Jan 30, 2022
Nicklee007 pushed a commit to Nicklee007/pulsar that referenced this pull request Apr 20, 2022
### Motivation

I can use Avro schema with BigDecimal in Pulsar 2.8.0, but this doesn't work on Pulsar 2.8.1, so I check this codebase and PR, found apache#10428 breaks this.

The following is error log:
```
org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:154) ~[spring-beans-5.3.4.jar:5.3.4]
	... 34 common frames omitted
Caused by: java.lang.UnsupportedOperationException: No recommended schema for decimal (scale is required)
	at org.apache.pulsar.shade.org.apache.avro.Conversions$DecimalConversion.getRecommendedSchema(Conversions.java:73) ~[pulsar-client-2.8.1.jar:2.8.1]
	at org.apache.pulsar.shade.org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:696) ~[pulsar-client-2.8.1.jar:2.8.1]
	at org.apache.pulsar.shade.org.apache.avro.reflect.ReflectData.createFieldSchema(ReflectData.java:873) ~[pulsar-client-2.8.1.jar:2.8.1]
	at org.apache.pulsar.shade.org.apache.avro.reflect.ReflectData$AllowNull.createFieldSchema(ReflectData.java:92) ~[pulsar-client-2.8.1.jar:2.8.1]
	at org.apache.pulsar.shade.org.apache.avro.reflect.ReflectData.createSchema(ReflectData.java:736) ~[pulsar-client-2.8.1.jar:2.8.1]
	at org.apache.pulsar.shade.org.apache.avro.specific.SpecificData$3.computeValue(SpecificData.java:328) ~[pulsar-client-2.8.1.jar:2.8.1]
	at org.apache.pulsar.shade.org.apache.avro.specific.SpecificData$3.computeValue(SpecificData.java:325) ~[pulsar-client-2.8.1.jar:2.8.1]
	at java.base/java.lang.ClassValue.getFromHashMap(ClassValue.java:228) ~[na:na]
	at java.base/java.lang.ClassValue.getFromBackup(ClassValue.java:210) ~[na:na]
	at java.base/java.lang.ClassValue.get(ClassValue.java:116) ~[na:na]
	at org.apache.pulsar.shade.org.apache.avro.specific.SpecificData.getSchema(SpecificData.java:339) ~[pulsar-client-2.8.1.jar:2.8.1]
	... 52 common frames omitted
```

I think that Avro cannot work on using the ReflectData with Conversions.DecimalConversion to parse the BigDecimal field without AvroSchema when getting schema, but the Avro ReflectDatumWriter and ReflectDatumReader are working, it seems that Avro support for BigDecimal is not enough.

Affected version: 2.8.1...2.8.x, 2.9.x

### Modifications

- Skip add the DecimalConversion in `extractAvroSchema()`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/schema cherry-picked/branch-2.8 Archived: 2.8 is end of life cherry-picked/branch-2.9 Archived: 2.9 is end of life doc-not-needed Your PR changes do not impact docs release/2.8.3 release/2.9.2
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants