Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

addReference() doesn't fully replace previous dependencies feature #57

Closed
denisw opened this issue Apr 21, 2021 · 20 comments · Fixed by #71
Closed

addReference() doesn't fully replace previous dependencies feature #57

denisw opened this issue Apr 21, 2021 · 20 comments · Fixed by #71

Comments

@denisw
Copy link

denisw commented Apr 21, 2021

Before version 1.0, it was possible to split an Avro schema into multiple files and pass all of them to a single subject() call in the register block.

The addReference() DSL method is meant to replace this, but it only works if the external reference is registered as a standalone subject in the Schema Registry. This is not the case in many of our uses, where we simply move the definition of a nested record type into a separate Avro file for convenience. This is not possible with the new DSL.

It would be great if the previous set-of-Avro-files approach would be re-added to the DSL somehow, perhaps through an addInternalReference() method.

@ImFlog
Copy link
Owner

ImFlog commented Apr 23, 2021

Hello @denisw,

Thank you for your interest in the plugin.
Do you have a sample project where I can reproduce the issue (or even better fill a PR with the error in a test) ?
I am pretty short on time currently and that would speed up the fix process :)

I think I will also need to check if it's still possible on the schema registry side as It changed with the arrival of JSON and ProtoBuf support (which changed the API).

@ImFlog
Copy link
Owner

ImFlog commented May 2, 2021

@denisw I've just checked a bit more the code.
All the methods that deals with Schema in the recent version needs a SchemaReference list (to parse, to register ...).

The SchemaReference looks like this:

public class SchemaReference implements Comparable<SchemaReference> {

  private String name;
  private String subject;
  private Integer version;
  ...
}

As you can see we cannot pass the schema directly in it. I've also checked on the Maven plugin, it does not support this feature either.
I don't know if It would be worse to create an issue on the Confluent Schema Registry repository to support this kind of use case.

In the meantime, would it be OK for your use case if the addInternalReference does the register of the reference schema remotely for you ?

@denisw
Copy link
Author

denisw commented May 2, 2021

@ImFlog Sorry for the late reply, and thanks for looking into this more!

My use case would not require any schema references on the remote side. I was really just looking for the previously supported feature where dependencies between Avro schema files were resolved locally and inlined before sending to Schema Registry - that is, the functionality introduced in #13.

Maybe that is still supported and the arguments just passed a bit? Otherwise, I'm happy to look into re-implementing this at least for Avro.

@ImFlog
Copy link
Owner

ImFlog commented May 5, 2021

I think it's possible to add references locally and then send the request with the schema containing all it's dependencies in one go.
Doing it for Avro mean we need to doing It for Protobuf and JSON too (if possible) and thus import the Json / Avro / Protobuf library used for parsing. The thing I don't like by adding required libraries is that it needs to be imported manually by user (but I may misunderstand how buildscript dependencies are resolved), thus we may have to be careful to make things work without the mentioned imports.

A mix of local and remote could work if we have internal and external dependencies.

Now that it's clearer for me, I will be able to start creating the structure for this mechanism. When done I will publish it like this if anyone volunteer for one of the implementation it will be easier :)

I'll keep this ticket for this the structure part and then create 3 new ones for the respective implementation.
Is it OK for you @denisw ?

@denisw
Copy link
Author

denisw commented May 6, 2021

Sounds good! Regarding the need for the serialization libraries of each data format, I can see three options:

  • Add compileOnly dependencies, requiring the user to depend on the needed libraries themselves
  • Add implementation dependencies, but then everyone gets the libraries of all schema formats, whether needed or not
  • Somehow create multiple plugin variants for the different serialization formats

@ImFlog
Copy link
Owner

ImFlog commented May 11, 2021

This section of the documentation seems to fit what we need : https://docs.gradle.org/current/userguide/implementing_gradle_plugins.html#providing_default_dependencies_for_plugins
The variants solution seems promising too as most of the time, users only need one of AVRO / JSON / PROTOBUF.

I'm going on vacations for 3 weeks starting tomorrow and unfortunately I was not able to find time to work on this before.
I will start as soon as I get back. Sorry for the delay.

@ImFlog
Copy link
Owner

ImFlog commented Jul 9, 2021

Sorry for the long delay.
This task is next on my Todo !

@dejank1986
Copy link

Any update on this? We are using the plugin in the same way, we do not need internal schema references to be registered remote.

@ImFlog
Copy link
Owner

ImFlog commented Sep 21, 2021

Hello @dejank1986, I just had a child and thus I didn't had much time to work on this. As this is not a quick fix I need to try to find more time but I did not start the design / implementation yet.
Hopefully I will have more time to work on this soon.
Once again I am sorry for the delay.

@ImFlog
Copy link
Owner

ImFlog commented Oct 1, 2021

Good new @denisw and @dejank1986, I took some time to look at this.
I have some leads that could allow me to do the implementation without the need for additional dependencies.
If we append the schemas to the base one (in a single file) it seems to be registered correctly.
I have to do some fix around protobuf (as I need to stripe the headers) and validate that It works correctly in a real use case but If it's validated it shouldn't be long.
You can see the attached draft PR.

@ImFlog ImFlog mentioned this issue Oct 1, 2021
7 tasks
@denisw
Copy link
Author

denisw commented Oct 2, 2021

@ImFlog That’s great to hear! Thank you so much for working on this. :) I am happy to test the PR in one of our production codebases, though I must admit that I never have installed a Gradle plugin from source, so I’d need to figure this out first.

@dejank1986
Copy link

@ImFlog Great! Looking forward to new plugin release! :)

@ImFlog
Copy link
Owner

ImFlog commented Oct 18, 2021

Bad news, It seems that my first hack attempt didn't work because the register method only took the first schema and didn't take the local reference into account.
The only way to override this is to use a custom provider for the 3 format.

I am currently trying with AVRO (with hugly hacks), but it's a bit of a rabbit hole. I have already implemented a custom AvroSchemaProvider that resolve the localReferences to String and made sure that the Avro parser read everything well but at one place of the code https://github.com/confluentinc/schema-registry/blob/master/client/src/main/java/io/confluent/kafka/schemaregistry/client/CachedSchemaRegistryClient.java#L491 schema.canonicalString() is used and it does not resolve the dependencies (as Kafka is supposed to read the dependencies ...). If I want to override this, I would need to extend the registry as well.

Other lead would be the first thing I had in mind with using the Parser directly (might be the easiest).

More information on the next episode

@ImFlog
Copy link
Owner

ImFlog commented Oct 18, 2021

I may have found something (again) but that would mean that if a user wants to use local + remote dependencies, It would be treated as a one big schema without remote dependencies.
I don't know if that's acceptable but from my quick 1am POC it could work.

@ImFlog
Copy link
Owner

ImFlog commented Nov 2, 2021

Good news @denisw @dejank1986, I don't know if you saw the PR changes but I have something working for AVRO (still need to check for Protobuf and Avro).
I will clean a bit the code and then I will probably release a snapshot version (that you will be able to test before the release).
Then I will open separate issues for Protobuf and Json.

@denisw
Copy link
Author

denisw commented Nov 3, 2021

@ImFlog Amazing! 🎉 Thank you so much for your continued work on this, despite all of the hurdles. It’s very appreciated. 😃

I’ll give it a spin in our projects as soon as the snapshot version is available. 👍

@ImFlog
Copy link
Owner

ImFlog commented Nov 5, 2021

Okay one last issue, I was polishing the PR and I even released a version to give it a spin on the example folder.
Turns out that my current implementation does not allow remote and local reference to be used in the same register / compatibility call.
Avro yells because it doesn't know the remote types.

The only thing I can think of is to fetch the remote references and inline the whole schema before the call.
WDYT @denisw ?

@denisw
Copy link
Author

denisw commented Nov 7, 2021

@ImFlog I think this is a perfectly fine limitation. I don't expect us to mix local and remote references; in fact, it might be a good idea to explicitly not support this mixing for the same subject to avoid surprises.

@ImFlog
Copy link
Owner

ImFlog commented Nov 29, 2021

It's finally here @denisw ! The version 1.6.0 include the localReference support for AVRO.
I've created separated tickets for JSON and Protobuf.
As discussed together, it is for now not possible to mix local and remote references but It should be doable in the future (thus another ticket).

It took 2 month to fix, sorry for the delay. I will try to implement the other types faster :)

Do not hesitate to reopen an issue if something is not working right.

@dejank1986
Copy link

@ImFlog Great! 🎉🎉🎉 Thank you for pushing this! I will try it out in next days for sure.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants