UDT fields messed up on writes through gRPC #1329

Closed
pkolaczk opened this issue Oct 12, 2021 · 1 comment · Fixed by #1330

@pkolaczk
Contributor

Consider the following schema:

CREATE TYPE address_1(f1 VARCHAR, f2 VARCHAR);
CREATE TABLE users_with_addr_1(id BIGINT PRIMARY KEY, address address_1);

Now we insert a row with a UdtValue of { f1: 'Long St', f2: '7870' } and check what was written to C*:

 id | address
----+-----------------------------
  3 | {f1: 'Long St', f2: '7870'}

So far, so good.
But the field names are poor, so let's change them to something nicer.

CREATE TYPE address_2(street VARCHAR, number VARCHAR);
CREATE TABLE users_with_addr_2(id BIGINT PRIMARY KEY, address address_2);

And insert just the same thing:
UdtValue of { street: 'Long St', number: '7870' }.

 id | address
----+-------------------------------------
  3 | {street: '7870', number: 'Long St'}

Oops. The values ended up in the wrong fields.

If you make it even nicer and change the type of number to BIGINT, as it should be, you get:

org.apache.cassandra.serializers.MarshalException: String didn't validate.
	at org.apache.cassandra.serializers.UTF8Serializer.validate(UTF8Serializer.java:35)
	at org.apache.cassandra.serializers.UserTypeSerializer.validate(UserTypeSerializer.java:59)
	at org.apache.cassandra.db.marshal.AbstractType.validate(AbstractType.java:163)
	at org.apache.cassandra.cql3.UserTypes$Value.fromSerialized(UserTypes.java:166)
	at org.apache.cassandra.cql3.UserTypes$Marker.bind(UserTypes.java:270)
	at org.apache.cassandra.cql3.UserTypes$Setter.execute(UserTypes.java:283)
	at org.apache.cassandra.cql3.statements.UpdateStatement.addUpdateForKey(UpdateStatement.java:94)
	at org.apache.cassandra.cql3.statements.ModificationStatement.addUpdates(ModificationStatement.java:694)
	at org.apache.cassandra.cql3.statements.ModificationStatement.getMutations(ModificationStatement.java:635)
	at org.apache.cassandra.cql3.statements.ModificationStatement.executeWithoutCondition(ModificationStatement.java:437)
	at org.apache.cassandra.cql3.statements.ModificationStatement.execute(ModificationStatement.java:425)
	at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:225)
	at org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:532)
	at org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:509)
	at io.stargate.db.cassandra.impl.StargateQueryHandler.processPrepared(StargateQueryHandler.java:205)

because C* is trying to validate a BIGINT value as UTF-8 text (UTF8Serializer.validate in the trace), which is just a consequence of the values being attached to the wrong fields.
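
To illustrate why a swap goes unnoticed for two VARCHAR fields, here is a minimal sketch of positional encoding. It relies on the fact that a serialized UDT value carries no field names: each field is written as a 4-byte length followed by its bytes, in declaration order. The class `PositionalUdtSketch` and its methods are hypothetical and are not Stargate or Cassandra code; only text fields are handled.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the Stargate or Cassandra implementation.
final class PositionalUdtSketch {

  // Writes each value as [4-byte length][UTF-8 bytes], in the order given.
  // No field names are written, so the order is the only link to the fields.
  static ByteBuffer encode(List<String> values) {
    List<byte[]> encoded = new ArrayList<>();
    int size = 0;
    for (String value : values) {
      byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
      encoded.add(bytes);
      size += 4 + bytes.length;
    }
    ByteBuffer out = ByteBuffer.allocate(size);
    for (byte[] bytes : encoded) {
      out.putInt(bytes.length);
      out.put(bytes);
    }
    out.flip();
    return out;
  }

  // Reads the values back, pairing the i-th value with the i-th declared field.
  static Map<String, String> decode(List<String> declaredFields, ByteBuffer in) {
    ByteBuffer buf = in.duplicate();
    Map<String, String> result = new LinkedHashMap<>();
    for (String field : declaredFields) {
      byte[] bytes = new byte[buf.getInt()];
      buf.get(bytes);
      result.put(field, new String(bytes, StandardCharsets.UTF_8));
    }
    return result;
  }
}
```

Encoding the values as `["7870", "Long St"]` and decoding against the declared fields `(street, number)` yields `{street=7870, number=Long St}`, which is the symptom shown above. With a BIGINT field the same swap cannot pass validation, hence the MarshalException.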

@pkolaczk
Contributor Author

Fix coming.
Please assign to me (I can't assign myself - why?).

pkolaczk added a commit that referenced this issue Oct 12, 2021
UdtCodec#encode was encoding the fields in the order
of values received from the `UserDefinedType#columnMap`.
That order does not necessarily match the order of the fields
in the UDT, because the returned map is a HashMap.
Placing the values in the output buffer in a wrong order can
lead to silent data corruption or at best
marshalling exceptions from C*.

This commit makes the encoded values order match exactly
the order of fields in the UDT.

Fixes #1329
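
The idea described in the commit message can be sketched roughly as follows. This is a simplified illustration under assumptions, not the actual `UdtCodec` change: the class `FieldOrderSketch`, its method names, and the use of plain strings for field names and values are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; the real fix lives in UdtCodec#encode.
final class FieldOrderSketch {

  // Problematic approach: the output follows the map's iteration order.
  // For a HashMap that order is unspecified and may differ from the
  // order in which the UDT fields were declared.
  static List<String> valuesInMapOrder(Map<String, String> valuesByField) {
    return new ArrayList<>(valuesByField.values());
  }

  // Safer approach: walk the declared field list and look each value up
  // by name, so the output order always matches the UDT definition.
  static List<String> valuesInDeclaredOrder(
      List<String> declaredFields, Map<String, String> valuesByField) {
    List<String> ordered = new ArrayList<>(declaredFields.size());
    for (String field : declaredFields) {
      ordered.add(valuesByField.get(field));
    }
    return ordered;
  }
}
```

With declared fields `(street, number)`, `valuesInDeclaredOrder` returns `[Long St, 7870]` regardless of which `Map` implementation holds the values, whereas `valuesInMapOrder` makes no such guarantee for a `HashMap`.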
tomekl007 added a commit that referenced this issue Oct 13, 2021
* Encode Udt field values in the correct order

* add integration test

Co-authored-by: tomekl007 <tomasz.lelek@datastax.com>