In Kafka-Connect scenario, the same schema is updated (PUT) in the registry for each table row #734

Closed
elakito opened this issue Jul 31, 2020 · 6 comments

Comments

@elakito
Contributor

elakito commented Jul 31, 2020

I am observing that the registry-client is invoking a PUT /api/artifacts/-value for each record row and updating its version number. I am using version 1.2.3.Final and this behaviour is observed with both apicurio-registry-mem and apicurio-registry-kafka 1.2.3.Final.

I have a test table created with
CREATE TABLE Persons5 ("id" int primary key, LastName varchar(255), FirstName varchar(255));

The HTTP transcript shows the following message exchange.

==============
PUT /api/artifacts/test-sqlite-jdbc-Persons5-value HTTP/1.1
Accept: application/json
X-Registry-ArtifactType: AVRO
Content-Type: */*
User-Agent: Jersey/2.28 (HttpUrlConnection 11.0.2)
Host: localhost:8080
Connection: keep-alive
Content-Length: 221

{"type":"record","name":"Persons5","fields":[{"name":"id","type":"long"},{"name":"LastName","type":["null","string"],"default":null},{"name":"FirstName","type":["null","string"],"default":null}],"connect.name":"Persons5"}

HTTP/1.1 404 Not Found
Date: Fri, 31 Jul 2020 21:15:02 GMT
Expires: Thu, 30 Jul 2020 21:15:02 GMT
Pragma: no-cache
Cache-control: no-cache, no-store, must-revalidate
Content-Type: application/json
Content-Length: 15241

{"message":"No artifact with ID 'test-sqlite-jdbc-Persons5-value' was found.","error_code":404,"detail":"io.apicurio.registry.storage.ArtifactNotFoundException: No artifact with ID 'test-sqlite-jdbc-Persons5-value' was found.\n\tat io.apicurio.registry.storage.impl.SimpleMapRegistryStorage$SimpleStorageMap.get(SimpleMapRegistryStorage.java:124)\n\tat io.apicurio.registry.storage.impl.AbstractMapRegistryStorage.getVersion2ContentMap(AbstractMapRegistryStorage.java:93)
#### the stack trace is too long and omitted in this issue report ####
\n\tat java.lang.Thread.run(Thread.java:748)\n\tat org.jboss.threads.JBossThread.run(JBossThread.java:479)\n"}

==============
POST /api/artifacts?ifExists=FAIL HTTP/1.1
Accept: application/json
X-Registry-ArtifactId: test-sqlite-jdbc-Persons5-value
X-Registry-ArtifactType: AVRO
Content-Type: */*
User-Agent: Jersey/2.28 (HttpUrlConnection 11.0.2)
Host: localhost:8080
Connection: keep-alive
Content-Length: 221

{"type":"record","name":"Persons5","fields":[{"name":"id","type":"long"},{"name":"LastName","type":["null","string"],"default":null},{"name":"FirstName","type":["null","string"],"default":null}],"connect.name":"Persons5"}

HTTP/1.1 200 OK
Date: Fri, 31 Jul 2020 21:15:03 GMT
Expires: Thu, 30 Jul 2020 21:15:03 GMT
Pragma: no-cache
Cache-control: no-cache, no-store, must-revalidate
Content-Type: application/json
Content-Length: 168

{"name":"Persons5","createdOn":1596230103600,"modifiedOn":1596230103600,"id":"test-sqlite-jdbc-Persons5-value","version":1,"type":"AVRO","globalId":0,"state":"ENABLED"}

==============
PUT /api/artifacts/test-sqlite-jdbc-Persons5-value HTTP/1.1
Accept: application/json
X-Registry-ArtifactType: AVRO
Content-Type: */*
User-Agent: Jersey/2.28 (HttpUrlConnection 11.0.2)
Host: localhost:8080
Connection: keep-alive
Content-Length: 221

{"type":"record","name":"Persons5","fields":[{"name":"id","type":"long"},{"name":"LastName","type":["null","string"],"default":null},{"name":"FirstName","type":["null","string"],"default":null}],"connect.name":"Persons5"}

HTTP/1.1 200 OK
Date: Fri, 31 Jul 2020 21:15:03 GMT
Expires: Thu, 30 Jul 2020 21:15:03 GMT
Pragma: no-cache
Cache-control: no-cache, no-store, must-revalidate
Content-Type: application/json
Content-Length: 168

{"name":"Persons5","createdOn":1596230103960,"modifiedOn":1596230103960,"id":"test-sqlite-jdbc-Persons5-value","version":2,"type":"AVRO","globalId":1,"state":"ENABLED"}

==============
PUT /api/artifacts/test-sqlite-jdbc-Persons5-value HTTP/1.1
Accept: application/json
X-Registry-ArtifactType: AVRO
Content-Type: */*
User-Agent: Jersey/2.28 (HttpUrlConnection 11.0.2)
Host: localhost:8080
Connection: keep-alive
Content-Length: 221

{"type":"record","name":"Persons5","fields":[{"name":"id","type":"long"},{"name":"LastName","type":["null","string"],"default":null},{"name":"FirstName","type":["null","string"],"default":null}],"connect.name":"Persons5"}

HTTP/1.1 200 OK
Date: Fri, 31 Jul 2020 21:15:03 GMT
Expires: Thu, 30 Jul 2020 21:15:03 GMT
Pragma: no-cache
Cache-control: no-cache, no-store, must-revalidate
Content-Type: application/json
Content-Length: 168

{"name":"Persons5","createdOn":1596230103986,"modifiedOn":1596230103986,"id":"test-sqlite-jdbc-Persons5-value","version":3,"type":"AVRO","globalId":2,"state":"ENABLED"}

==============
PUT /api/artifacts/test-sqlite-jdbc-Persons5-value HTTP/1.1
Accept: application/json
X-Registry-ArtifactType: AVRO
Content-Type: */*
User-Agent: Jersey/2.28 (HttpUrlConnection 11.0.2)
Host: localhost:8080
Connection: keep-alive
Content-Length: 221

{"type":"record","name":"Persons5","fields":[{"name":"id","type":"long"},{"name":"LastName","type":["null","string"],"default":null},{"name":"FirstName","type":["null","string"],"default":null}],"connect.name":"Persons5"}

HTTP/1.1 200 OK
Date: Fri, 31 Jul 2020 21:15:27 GMT
Expires: Thu, 30 Jul 2020 21:15:27 GMT
Pragma: no-cache
Cache-control: no-cache, no-store, must-revalidate
Content-Type: application/json
Content-Length: 168

{"name":"Persons5","createdOn":1596230127547,"modifiedOn":1596230127547,"id":"test-sqlite-jdbc-Persons5-value","version":4,"type":"AVRO","globalId":3,"state":"ENABLED"}

==============
PUT /api/artifacts/test-sqlite-jdbc-Persons5-value HTTP/1.1
Accept: application/json
X-Registry-ArtifactType: AVRO
Content-Type: */*
User-Agent: Jersey/2.28 (HttpUrlConnection 11.0.2)
Host: localhost:8080
Connection: keep-alive
Content-Length: 221

{"type":"record","name":"Persons5","fields":[{"name":"id","type":"long"},{"name":"LastName","type":["null","string"],"default":null},{"name":"FirstName","type":["null","string"],"default":null}],"connect.name":"Persons5"}

HTTP/1.1 200 OK
Date: Fri, 31 Jul 2020 21:15:27 GMT
Expires: Thu, 30 Jul 2020 21:15:27 GMT
Pragma: no-cache
Cache-control: no-cache, no-store, must-revalidate
Content-Type: application/json
Content-Length: 168

{"name":"Persons5","createdOn":1596230127572,"modifiedOn":1596230127572,"id":"test-sqlite-jdbc-Persons5-value","version":5,"type":"AVRO","globalId":4,"state":"ENABLED"}


This results in each Kafka message containing a new schema version number.

@EricWittmann
Member

If I'm reading your post correctly (not missing any details) then I think this is expected behavior. The PUT operation for an artifact does not currently check to see if the content you are uploading already exists as a version. You can make that happen, but you need to use POST rather than PUT:

https://studio-ws.apicur.io/sharing/76f236d3-1312-4548-a4d1-400487188c66#operation/createArtifact

You'll need to use the ifExists query parameter and set its value to RETURN_OR_UPDATE. In the future we should consider adding a similar query parameter to the PUT operation to allow this optional behavior.
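As a rough illustration of the suggestion above, the following sketch builds (but does not send) a POST to the artifacts endpoint with ifExists=RETURN_OR_UPDATE. The base URL, artifact ID, schema, and X-Registry-* headers are copied from the transcript earlier in this thread; the helper name build_create_request and the application/json content type are assumptions made for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Base URL as it appears in the connector configuration in this thread.
REGISTRY_URL = "http://localhost:8080/api"


def build_create_request(artifact_id: str, schema: dict,
                         if_exists: str = "RETURN_OR_UPDATE") -> Request:
    """Build (without sending) a POST /artifacts request with ifExists set.

    Hypothetical helper for illustration; not part of any Apicurio client.
    """
    query = urlencode({"ifExists": if_exists})
    body = json.dumps(schema, separators=(",", ":")).encode("utf-8")
    return Request(
        url=f"{REGISTRY_URL}/artifacts?{query}",
        data=body,
        method="POST",
        headers={
            "Accept": "application/json",
            # Assumption: the transcript sends */*; JSON is used here instead.
            "Content-Type": "application/json",
            "X-Registry-ArtifactId": artifact_id,
            "X-Registry-ArtifactType": "AVRO",
        },
    )


# Schema taken from the request bodies in the transcript.
schema = {
    "type": "record",
    "name": "Persons5",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "LastName", "type": ["null", "string"], "default": None},
        {"name": "FirstName", "type": ["null", "string"], "default": None},
    ],
    "connect.name": "Persons5",
}

request = build_create_request("test-sqlite-jdbc-Persons5-value", schema)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would actually send it; that step is left out so the sketch runs without a registry.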

@EricWittmann
Member

If this makes sense and you agree then I would suggest we convert this issue into a Feature Request to add such a query param to the PUT.

@elakito
Contributor Author

elakito commented Aug 3, 2020

@EricWittmann Thanks for your reply. I see that a series of strange requests is sent to the registry. But aren't these requests triggered by the apicurio-registry-client, or by the way its converter uses this client? I haven't looked into the details to determine why this is happening. My connector's configuration contains only the following declarative settings.

value.converter=io.apicurio.registry.utils.converter.AvroConverter
...
value.converter.apicurio.registry.url=http://localhost:8080/api
value.converter.apicurio.registry.converter.serializer=io.apicurio.registry.utils.serde.AvroKafkaSerializer
value.converter.apicurio.registry.converter.deserializer=io.apicurio.registry.utils.serde.AvroKafkaDeserializer
value.converter.apicurio.registry.global-id=io.apicurio.registry.utils.serde.strategy.AutoRegisterIdStrategy

I thought this sequence of requests was created by Apicurio's converter. In that case, shouldn't the behaviour be fixed there so that a series of POST requests with ifExists is issued instead? Or is the problem happening in another layer?

In addition to this question, when multiple records are read at once from a table, it would make sense to reuse the same schema (without querying the registry for each record) to create the series of Kafka messages, no?
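The idea of reusing one schema lookup for a batch of records can be sketched as a small client-side cache. This is only an illustration of the question, not how the Apicurio converter or serializer works; CachingSchemaResolver and fake_register are hypothetical names, and fake_register stands in for whatever call actually contacts the registry.

```python
from typing import Callable, Dict, List, Tuple


class CachingSchemaResolver:
    """Resolve a schema to an id, calling the registry once per distinct schema."""

    def __init__(self, register: Callable[[str, str], int]) -> None:
        # Hypothetical callback that would talk to the registry.
        self._register = register
        self._ids: Dict[Tuple[str, str], int] = {}

    def resolve(self, artifact_id: str, schema_text: str) -> int:
        key = (artifact_id, schema_text)
        if key not in self._ids:
            # Only the first record with this schema reaches the registry.
            self._ids[key] = self._register(artifact_id, schema_text)
        return self._ids[key]


calls: List[str] = []


def fake_register(artifact_id: str, schema_text: str) -> int:
    """Stand-in for a registry call; records each invocation."""
    calls.append(artifact_id)
    return len(calls)


resolver = CachingSchemaResolver(fake_register)
schema_text = '{"type":"record","name":"Persons5","fields":[]}'
ids = [resolver.resolve("test-sqlite-jdbc-Persons5-value", schema_text)
       for _ in range(5)]
print(ids, len(calls))
```

With five records sharing one schema, the stand-in registry call runs once and every record gets the same id.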

@elakito
Contributor Author

elakito commented Aug 4, 2020

Maybe it was because I was using
value.converter.apicurio.registry.global-id=io.apicurio.registry.utils.serde.strategy.AutoRegisterIdStrategy
When I switched to
value.converter.apicurio.registry.global-id=io.apicurio.registry.utils.serde.strategy.GetOrCreateIdStrategy
the problem didn't happen.
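The difference in effect between the two settings can be modelled with a toy in-memory registry. This is an assumption-laden sketch of the observed behaviour only (a new version per record versus a stable version); it does not reflect how AutoRegisterIdStrategy or GetOrCreateIdStrategy are implemented, and ToyRegistry with its two methods is hypothetical.

```python
from typing import Dict, List


class ToyRegistry:
    """In-memory stand-in for a registry; versions are 1-based list positions."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[str]] = {}

    def always_update(self, artifact_id: str, content: str) -> int:
        """Append a new version on every call, even if the content is unchanged."""
        versions = self._versions.setdefault(artifact_id, [])
        versions.append(content)
        return len(versions)

    def get_or_create(self, artifact_id: str, content: str) -> int:
        """Return the existing version for known content; append only if new."""
        versions = self._versions.setdefault(artifact_id, [])
        if content in versions:
            return versions.index(content) + 1
        versions.append(content)
        return len(versions)


artifact_id = "test-sqlite-jdbc-Persons5-value"
schema_text = '{"type":"record","name":"Persons5","fields":[]}'

updating = ToyRegistry()
update_versions = [updating.always_update(artifact_id, schema_text) for _ in range(5)]

reusing = ToyRegistry()
reuse_versions = [reusing.get_or_create(artifact_id, schema_text) for _ in range(5)]

print(update_versions)
print(reuse_versions)
```

In this model the unconditional update yields versions 1 through 5 for five identical schemas, matching the climbing version numbers in the transcript, while the lookup-first variant stays at version 1.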

@elakito
Contributor Author

elakito commented Aug 4, 2020

@EricWittmann For my original concern, this issue has been resolved. As I commented above, this behavior seems to be the intended behavior of AutoRegisterIdStrategy and using an appropriate strategy will avoid this problem. I can close it. Thank you.

@elakito elakito closed this as completed Aug 4, 2020
IPT Registry Version 2 automation moved this from Backlog to Done Aug 4, 2020
@EricWittmann
Member

Thanks @elakito - I think we have some things to improve in the serdes layer to make things easier to use/understand. That might just be better documentation, or examples. Or perhaps we can bundle up some common use-cases into configuration aliases. I'm not sure yet, but there's a lot of potential for improvement I think! Any suggestions are always welcome - it's good to get the perspective of someone actually trying to get things done.
