Upgrade process from previous versions to 2.x #1339

Closed
carlesarnal opened this issue Mar 12, 2021 · 9 comments

@carlesarnal
Member

Now that we have the first CR of 2.x we should start thinking about an upgrade process.

@EricWittmann
Member

I'm thinking about a standalone tool/CLI, perhaps based on Kafka Streams (?), that would simply subscribe to the old Kafka topics and produce messages on new topics in the new format. I'd like to be able to support both Streams and Kafka+SQL formats if possible. Thoughts?

We could probably copy/paste the protobuf code from 1.3.2.Final (renamed/namespaced) for the consumer of the old topics. Then have some custom Java logic that would produce new messages in the new format. Easy peasy?
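
For illustration, a minimal Kafka Streams bridge along these lines might look like the sketch below. The topic names and convertToNewFormat() are placeholders; the real topic names and the 1.3.2 protobuf decoding would come from the (copied) registry code.

```java
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;

public class TopicUpgradeBridge {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "registry-upgrade-bridge");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.ByteArray().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.ByteArray().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        // Consume the 1.3.x topic, re-encode every record, and produce it to the 2.x topic.
        builder.<byte[], byte[]>stream("old-registry-topic")
               .mapValues(TopicUpgradeBridge::convertToNewFormat)
               .to("new-registry-topic");

        new KafkaStreams(builder.build(), props).start();
    }

    // Placeholder: decode the 1.3.x protobuf value and re-serialize it in the 2.x message format.
    private static byte[] convertToNewFormat(byte[] oldValue) {
        return oldValue;
    }
}
```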

@carlesarnal
Member Author

carlesarnal commented Mar 18, 2021

Although this approach seems reasonable to me, I slightly prefer an API-based approach: get all artifact keys from a running 1.3.2.Final registry, fetch the actual artifacts, and send them to a registry running 2.0.0.Final. We can tweak this approach to preserve globalIds, I think.
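
A rough sketch of that flow, assuming the usual base paths (/api for the 1.3.x registry and /apis/registry/v2 for the 2.x one); paging, version history, artifact types and metadata are left out for brevity, and parseIds() is a placeholder for real JSON parsing.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;

public class ApiBasedUpgrade {

    private static final HttpClient HTTP = HttpClient.newHttpClient();
    private static final String OLD = "http://old-registry:8080/api";
    private static final String NEW = "http://new-registry:8080/apis/registry/v2";

    public static void main(String[] args) throws Exception {
        // v1: list all artifact IDs known to the old registry.
        String idsJson = get(OLD + "/artifacts");
        for (String artifactId : parseIds(idsJson)) {
            // v1: fetch the latest content of the artifact.
            String content = get(OLD + "/artifacts/" + artifactId);
            // v2: re-create the artifact in the new registry, preserving its ID.
            HttpRequest create = HttpRequest.newBuilder(URI.create(NEW + "/groups/default/artifacts"))
                    .header("X-Registry-ArtifactId", artifactId)
                    .POST(HttpRequest.BodyPublishers.ofString(content))
                    .build();
            HTTP.send(create, HttpResponse.BodyHandlers.ofString());
        }
    }

    private static String get(String url) throws Exception {
        HttpRequest req = HttpRequest.newBuilder(URI.create(url)).GET().build();
        return HTTP.send(req, HttpResponse.BodyHandlers.ofString()).body();
    }

    // Placeholder: parse the JSON array of artifact IDs (e.g. with Jackson).
    private static List<String> parseIds(String json) {
        return List.of();
    }
}
```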

@carlesarnal
Member Author

@Apicurio/developers thoughts?

@famarting
Contributor

This is the idea I have of how this could be implemented:

  • We implement API endpoints in registry 2.0.0 for import and export (/apis/registry/v2/admin/import). These APIs are implemented natively in the storage, so more efficient mechanisms can be used for exporting data and more complex logic can be applied when importing data (see the client sketch below). I imagine we will have a compressed file with some file structure and a bunch of JSON files with the artifacts' metadata.
  • The import API will have (at first) two flags: keep globalIds (default true) and fail on conflicting artifactIds (default true). Because native storage implementations will be used, these features can be implemented (the SQL storage allows providing globalIds beforehand); however, kafkasql uses the topic partition to set the globalId (this is an issue that needs to be investigated, I hope there is a solution).
  • For registry 1.3.2 we will implement a script or a Java tool that scrapes a registry deployment and generates the file that our import API understands.

The keep globalIds flag is useful if the user doesn't need to keep the globalIds identical to the old registry, or for importing data into a registry that already has some artifacts in it. If keep globalIds is set to true and there is a globalId conflict, the import operation should fail.
The fail on conflicting artifacts flag is, again, useful for importing data into a registry that already has some artifacts in it. If set to false, in case of an artifactId conflict, the registry will try to add versions to the existing artifacts where possible, or skip them otherwise.

With these three things we cover all of our use cases with the minimum amount of things to implement.
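
A rough client-side sketch of how the proposed endpoints could be used, assuming a matching /admin/export endpoint; the query-parameter names for the two flags are made up here purely for illustration.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Path;

public class ImportExportClient {

    public static void main(String[] args) throws Exception {
        HttpClient http = HttpClient.newHttpClient();
        String admin = "http://registry:8080/apis/registry/v2/admin";

        // Export: download the registry content as a single zip file.
        HttpRequest export = HttpRequest.newBuilder(URI.create(admin + "/export")).GET().build();
        http.send(export, HttpResponse.BodyHandlers.ofFile(Path.of("registry-export.zip")));

        // Import: upload the zip into another registry, keeping globalIds and
        // failing on conflicting artifactIds (hypothetical flag names, assumed defaults).
        HttpRequest imp = HttpRequest.newBuilder(
                    URI.create(admin + "/import?keepGlobalIds=true&failOnConflictingArtifactIds=true"))
                .header("Content-Type", "application/zip")
                .POST(HttpRequest.BodyPublishers.ofFile(Path.of("registry-export.zip")))
                .build();
        http.send(imp, HttpResponse.BodyHandlers.ofString());
    }
}
```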

@EricWittmann
Member

I like this plan very much. The only part of this that I think requires some thought and analysis is the globalId part. Everything else is pretty straightforward, and is consistent with what I imagined we would have for import/export in our V2 API. I think the utility to upgrade from Streams 1.x to Kafkasql 2.x certainly can use the import API, but it might make that tool slightly harder to write. It needs to produce our export file format rather than just copy the messages to another Kafka topic. That's probably OK though.

Note that I'm hoping we can use e.g. ZipOutputStream to stream the export file to the HTTP response (same/reverse for import). Keycloak doesn't allow export over HTTP due to the amount of data potentially involved. We may need some constraints just to make sure export can't accidentally kill the server. But I think we're unlikely to have too much data in the registry to export.
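
A minimal sketch of that idea, assuming a JAX-RS resource: ZipOutputStream wraps the response stream via StreamingOutput, so the export zip is written directly to the HTTP response without being buffered on disk. The single hard-coded entry stands in for the real per-entity export logic.

```java
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.core.Response;
import javax.ws.rs.core.StreamingOutput;

@Path("/apis/registry/v2/admin/export")
public class ExportResource {

    @GET
    @Produces("application/zip")
    public Response exportData() {
        StreamingOutput stream = output -> {
            // The zip is written entry by entry straight into the response output stream.
            try (ZipOutputStream zip = new ZipOutputStream(output)) {
                // Placeholder: the real export would write one entry per artifact/metadata record.
                zip.putNextEntry(new ZipEntry("artifacts/example.json"));
                zip.write("{}".getBytes());
                zip.closeEntry();
            }
        };
        return Response.ok(stream).build();
    }
}
```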

@mgvirtzm

A couple of questions about migration of an Apicurio server with SQL storage from 1.3.x to 2.x:

  1. Is there a need for database migration when upgrading the Apicurio server to 2.x while still using the v1 API?
  2. Will there be a need to migrate the database when upgrading to the v2 API?

@famarting
Contributor

Data migration is needed in both cases. The JPA storage we had in 1.3.x and the SQL storage we have now in 2.x use different database schemas.

In 2.x we now have an import/export API that will help with the data migration process. It's documented here: https://www.apicur.io/registry/docs/apicurio-registry/2.0.0.Final/getting-started/assembly-managing-registry-artifacts-api.html#exporting-importing-using-rest-api

And for exporting the data from a 1.3.x registry we have this tool https://github.com/Apicurio/apicurio-registry/tree/master/utils/exportV1

We are working on documenting this whole process properly...

@EricWittmann
Member

@famartinrh want to mark this as done? :)

@famarting
Contributor
