[FLINK-16627][json]Support ignore null fields when serializing into JSON #24430

Closed

Conversation

@RubyChou RubyChou commented Mar 4, 2024

What is the purpose of the change

This pull request introduces the json.encode.ignore-null-fields option for the JSON formats. It controls whether null fields are omitted when serializing into JSON.
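
To illustrate the intended behavior, here is a minimal, self-contained sketch in plain Java. It is not the code from this pull request: the class `IgnoreNullFieldsSketch` and its `toJson` method are hypothetical names, the row is modeled as a flat `Map`, and string escaping is left out for brevity. It only shows the difference between writing `"key":null` and dropping the key.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical illustration only; this is not Flink's JsonRowDataSerializationSchema. */
public class IgnoreNullFieldsSketch {

    /** Serializes a flat row to a JSON object string (no string escaping, for brevity). */
    static String toJson(Map<String, Object> row, boolean ignoreNullFields) {
        StringJoiner json = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, Object> field : row.entrySet()) {
            Object value = field.getValue();
            if (value == null && ignoreNullFields) {
                continue; // drop the key entirely instead of writing "key":null
            }
            json.add("\"" + field.getKey() + "\":" + render(value));
        }
        return json.toString();
    }

    /** Quotes strings; numbers, booleans and null are rendered via String.valueOf. */
    private static String render(Object value) {
        return value instanceof String ? "\"" + value + "\"" : String.valueOf(value);
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", 1);
        row.put("name", null);
        row.put("city", "Berlin");

        System.out.println(toJson(row, false)); // {"id":1,"name":null,"city":"Berlin"}
        System.out.println(toJson(row, true));  // {"id":1,"city":"Berlin"}
    }
}
```

With the flag set to false, which mirrors the default of the new option, every field is written, including the null one.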

Brief change log

  • Add this option to JsonOptions.
  • Make the JSON format factories and JsonRowDataSerializationSchema aware of this option.

Verifying this change

This change added tests and can be verified as follows:

  • Added a test in JsonRowDataSerDeSchemaTest that checks the serialized output against the expected result.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (no)
  • The serializers: (no)
  • The runtime per-record code paths (performance sensitive): (no)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
  • The S3 file system connector: (no)

Documentation

  • Does this pull request introduce a new feature? (yes)
  • If yes, how is the feature documented? (docs)

flinkbot commented Mar 4, 2024

CI report:

Bot commands
The @flinkbot bot supports the following commands:
  • @flinkbot run azure: re-run the last Azure build

@RubyChou RubyChou force-pushed the support-json-encode-ignore-null branch from 1df5347 to 43667dc Compare March 4, 2024 04:36

@libenchao libenchao left a comment

@RubyChou This generally looks good. I only left some minor comments about style and tests.

<td>选填</td>
<td style="word-wrap: break-word;">false</td>
<td>Boolean</td>
<td>仅序列化非Null的列,默认情况下,会序列化所有列无论是否为Null。</td>

In Chinese documents, we usually add spaces around English words. See more here: https://cwiki.apache.org/confluence/display/FLINK/Flink+Translation+Specifications

Suggested change
<td>仅序列化非Null的列,默认情况下,会序列化所有列无论是否为Null。</td>
<td>仅序列化非 Null 的列,默认情况下,会序列化所有列无论是否为 Null。</td>

<td>optional</td>
<td style="word-wrap: break-word;">false</td>
<td>Boolean</td>
<td>Encode only non-null fields. By default, all fields will be written.</td>

Suggested change
<td>Encode only non-null fields. By default, all fields will be written.</td>
<td>Encode only non-null fields. By default, all fields will be included.</td>

@@ -626,6 +635,62 @@ void testSerializationDecimalEncode() throws Exception {
assertThat(scientificDecimalResult).isEqualTo(scientificDecimalJson);
}

@TestTemplate
public void testSerDeMultiRowsWithNullValuesIgnored() throws Exception {

Since we have migrated to JUnit 5, the public modifier is no longer needed. See more here: https://docs.google.com/document/d/1514Wa_aNB9bJUen4xm5uiuXOooOJTtXqS_Jqk9KJitU/edit#heading=h.4iilyfqpoiw2

Suggested change
public void testSerDeMultiRowsWithNullValuesIgnored() throws Exception {
void testSerDeMultiRowsWithNullValuesIgnored() throws Exception {

@@ -89,6 +90,7 @@ void testUserDefinedOptions() {
options.put("canal-json.map-null-key.mode", "LITERAL");
options.put("canal-json.map-null-key.literal", "nullKey");
options.put("canal-json.encode.decimal-as-plain-number", "true");
options.put("canal-json.encode.ignore-null-fields", "true");

Why do you enable "encode.ignore-null-fields" in the test for the Canal JSON format only, while leaving the others (debezium, maxwell, ogg) out?

RubyChou commented Mar 5, 2024

Hi @libenchao, I've resolved all the comments above. Please help review. Thanks.

@libenchao libenchao left a comment

+1, merging.

libenchao pushed a commit to libenchao/flink that referenced this pull request Mar 5, 2024
<tr>
<td><h5>json.encode.ignore-null-fields</h5></td>
<td>选填</td>
<td style="word-wrap: break-word;">false</td>

I think this option can be forwarded.

3 participants