Make metadata allFields ordered consistently with the valueSchema. #787
Conversation
@confluentinc It looks like @norganna just signed our Contributor License Agreement. 👍 Always at your service, clabot
@gharris1727 This is a more complete and accurate replacement for #763.
I think your previous strategy of just replacing allFields
was more robust.
@@ -129,7 +130,15 @@ public static FieldsMetadata extract(
      );
    }

    return new FieldsMetadata(keyFieldNames, nonKeyFieldNames, allFields);
    final Map<String, SinkRecordField> allFieldsOrdered = new LinkedHashMap<>();
By separating allFields and allFieldsOrdered, you're removing the PK-first ordering of the columns in favor of whatever order the record specifies them in. I think this might violate some assumptions that we have downstream.
I'm also concerned about what happens when allFields contains a key that's not in the record (such as topic/partition/offset), since these are added in a previous step to allFields, but never appear in allFieldsOrdered.
I think the build failures may be linked to this change in semantics.
I don't think there are any PK-first dependencies downstream, because the current HashMap iteration order is indeterminate anyway.
OK, then perhaps this is not the right approach, since the previous method would not preserve the original column ordering in the destination table (PKs would appear first).
Perhaps the approach needed is for the query builder to not use the order from allFields :/
I'll take a look at the additional fields that are added to the allFields and see if I can fix this PR up.
I have committed additional code to make sure all fields are added to the allFieldsOrdered map in a determinate order.
Looks like tests are completing now.
The default Kafka fields are added at the beginning of the fields, and the remainder of non-valueSchema fields are added at the end in sorted order for a repeatably determinate ordering.
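The three-phase ordering described above can be sketched roughly as follows. This is a simplified illustration outside the actual connector code: the `orderFields` helper, the `String` value type, and the PK field names used here are hypothetical stand-ins, not the connector's real API.

```java
import java.util.*;

public class OrderedFieldsSketch {
    // Hypothetical stand-in for the connector's default Kafka PK field names.
    static final List<String> DEFAULT_KAFKA_PK_NAMES =
            Arrays.asList("__connect_topic", "__connect_partition", "__connect_offset");

    // Builds a LinkedHashMap with a repeatable order:
    // default Kafka PK fields, then valueSchema order, then the rest sorted by name.
    static Map<String, String> orderFields(
            Map<String, String> allFields, List<String> valueSchemaFieldOrder) {
        Map<String, String> ordered = new LinkedHashMap<>();
        // 1. Default Kafka fields first (fixed list, so always the same order).
        for (String name : DEFAULT_KAFKA_PK_NAMES) {
            if (allFields.containsKey(name)) {
                ordered.put(name, allFields.get(name));
            }
        }
        // 2. Fields in the order the valueSchema declares them.
        for (String name : valueSchemaFieldOrder) {
            if (allFields.containsKey(name)) {
                ordered.put(name, allFields.get(name));
            }
        }
        // 3. Any remaining fields, sorted by name for determinism.
        allFields.keySet().stream().sorted()
                .filter(n -> !ordered.containsKey(n))
                .forEach(n -> ordered.put(n, allFields.get(n)));
        return ordered;
    }

    public static void main(String[] args) {
        Map<String, String> all = new HashMap<>();
        all.put("b", "B");
        all.put("a", "A");
        all.put("__connect_topic", "T");
        System.out.println(orderFields(all, Arrays.asList("a")).keySet());
    }
}
```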
  if (allFields.containsKey(fieldName)) {
    allFieldsOrdered.put(fieldName, allFields.get(fieldName));
  }
}
This will add the default PK fields first; the ordering is determinate because DEFAULT_KAFKA_PK_NAMES is always the same.
      allFieldsOrdered.put(fieldName, allFields.get(fieldName));
    }
  }
}
This will add the fields from the source DB in the order of the source table, and is thus determinate.
      allFieldsOrdered.put(fieldName, allFields.get(fieldName));
    }
  }
}
If there are fields missing (possible when pkMode != RECORD_VALUE and valueSchema is null), the remaining fields are added sorted by name for deterministic ordering.
LGTM, thanks for the improvement @norganna !
Currently the use of a HashMap to store the columns loses the ordering of the columns from the schema.
When the CREATE TABLE statement is generated, the order of the columns is unpredictable.
This change uses a LinkedHashMap, which preserves the original column order from the valueSchema, and adds tests to check that the ordering works correctly.
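The difference is easy to demonstrate in isolation (this is a generic illustration, not the connector's code): HashMap gives no iteration-order guarantee, while LinkedHashMap iterates in insertion order, so columns come out in the order the schema declared them.

```java
import java.util.*;

public class OrderDemo {
    public static void main(String[] args) {
        // HashMap: iteration order is unspecified and may vary across JVMs/runs.
        Map<String, Integer> hashed = new HashMap<>();
        // LinkedHashMap: iteration follows insertion order, so a generated
        // CREATE TABLE statement lists columns in the schema's declared order.
        Map<String, Integer> linked = new LinkedHashMap<>();
        for (String col : Arrays.asList("id", "name", "email", "created_at")) {
            hashed.put(col, 0);
            linked.put(col, 0);
        }
        // Only the LinkedHashMap is guaranteed to print the insertion order.
        System.out.println(new ArrayList<>(linked.keySet()));
    }
}
```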