Add concept of scraper release channels #957

Open
benoit74 opened this issue May 3, 2024 · 1 comment

Comments

@benoit74
Collaborator

benoit74 commented May 3, 2024

Currently, in a recipe we configure which scraper image and which scraper tag we want to use.

This has the side effect that any time a new scraper tag (version) is released, we need to update the image tag of every recipe to use the new tag / version.

This is currently a manual effort and quite brutal: we update all recipes using a given scraper image to use the latest tag.

It also means that we cannot do it without having technical access to the DB, the risk of manipulation error is significant, rollback is difficult and, even more importantly, we do not handle at all the situations where some recipes have to be run with a different image tag. This does not happen often, but it is very uncomfortable when it does (like currently, where a few recipes are using the zimit2 tag while most of them should continue to use the "officially released" version).

I propose introducing the concept of release channels (note: this idea comes from yesterday's demo of browsertrix).

A release channel will consist of:

  • id (technical identifier)
  • scraper name (informational), e.g. zimit, ifixit, ...
  • channel name (informational), e.g. Production, Development, Beta, ...
  • scraper image name (technical information), e.g. ghcr.io/openzim/zimit
  • scraper image tag (technical information), e.g. 1.2.0
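As a minimal sketch of the fields above (the class name, the attribute names and the `zimit-production` id format are assumptions for illustration, not an agreed design), a release channel could be modelled like this:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReleaseChannel:
    """One release channel, mirroring the fields listed above."""

    id: str  # technical identifier
    scraper_name: str  # informational, e.g. "zimit"
    channel_name: str  # informational, e.g. "Production"
    image_name: str  # technical, e.g. "ghcr.io/openzim/zimit"
    image_tag: str  # technical, e.g. "1.2.0"

    @property
    def image(self) -> str:
        """Full image reference built from name and tag."""
        return f"{self.image_name}:{self.image_tag}"


zimit_prod = ReleaseChannel(
    id="zimit-production",  # hypothetical id format
    scraper_name="zimit",
    channel_name="Production",
    image_name="ghcr.io/openzim/zimit",
    image_tag="1.2.0",
)
print(zimit_prod.image)
```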

Recipes would no longer be associated with a scraper image name and scraper image tag, but with the id of a release channel.

This means that any time a new image tag is released, we will only have to update the settings of a few release channels.

A requested task will store the release channel id that was configured in the recipe at request time.

When transforming the requested task into a task, we will decide which image name and tag will be used, and this information will be stored in the task configuration. This means that whenever a release channel is updated, any task which is started after this point in time will benefit from the new setting.
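To illustrate the timing described above, here is a rough sketch in which the channel is resolved into a concrete image only when the task is created. All class, field and function names are hypothetical, and the `1.3.0` tag is made up for the example:

```python
from dataclasses import dataclass, replace
from typing import Dict


@dataclass(frozen=True)
class ReleaseChannel:
    id: str
    image_name: str
    image_tag: str


@dataclass(frozen=True)
class RequestedTask:
    recipe_name: str
    release_channel_id: str  # copied from the recipe at request time


@dataclass(frozen=True)
class Task:
    recipe_name: str
    release_channel_id: str
    image_name: str  # snapshot taken when the task is created
    image_tag: str


def start_task(requested: RequestedTask, channels: Dict[str, ReleaseChannel]) -> Task:
    """Resolve the channel into a concrete image and store it in the task."""
    channel = channels[requested.release_channel_id]
    return Task(
        recipe_name=requested.recipe_name,
        release_channel_id=channel.id,
        image_name=channel.image_name,
        image_tag=channel.image_tag,
    )


channels = {
    "zimit-production": ReleaseChannel(
        "zimit-production", "ghcr.io/openzim/zimit", "1.2.0"
    )
}
requested = RequestedTask("example_recipe", "zimit-production")
first = start_task(requested, channels)

# An admin bumps the channel (hypothetical new tag); the first task keeps
# its snapshot, while a task started afterwards picks up the new tag.
channels["zimit-production"] = replace(
    channels["zimit-production"], image_tag="1.3.0"
)
second = start_task(requested, channels)
```

Storing the resolved name and tag in the task keeps the history of past tasks accurate even after the channel has moved on.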

We need a specific database table to store these release channels.

We need a UI screen/section and new APIs for admins to:

  • see existing release channels
  • add a new release channel
  • delete a release channel (need to check that no recipe still uses this release channel)
  • edit a release channel (where we will reuse the logic / API to see which image tags are available for a given image name)
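The table and the deletion check could look roughly like the sketch below. It uses an in-memory SQLite database only to stay self-contained; the table names, column names and the `delete_channel` helper are assumptions, not the actual schema or API:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript(
    """
    CREATE TABLE release_channel (
        id           TEXT PRIMARY KEY,
        scraper_name TEXT NOT NULL,
        channel_name TEXT NOT NULL,
        image_name   TEXT NOT NULL,
        image_tag    TEXT NOT NULL,
        UNIQUE (scraper_name, channel_name)
    );
    CREATE TABLE recipe (
        name               TEXT PRIMARY KEY,
        release_channel_id TEXT NOT NULL REFERENCES release_channel (id)
    );
    """
)
conn.execute(
    "INSERT INTO release_channel VALUES (?, ?, ?, ?, ?)",
    ("zimit-production", "zimit", "Production", "ghcr.io/openzim/zimit", "1.2.0"),
)
conn.execute(
    "INSERT INTO recipe VALUES (?, ?)", ("example_recipe", "zimit-production")
)


def delete_channel(conn: sqlite3.Connection, channel_id: str) -> bool:
    """Delete a channel unless a recipe still uses it; return True if deleted."""
    in_use = conn.execute(
        "SELECT COUNT(*) FROM recipe WHERE release_channel_id = ?", (channel_id,)
    ).fetchone()[0]
    if in_use:
        return False
    conn.execute("DELETE FROM release_channel WHERE id = ?", (channel_id,))
    return True
```

The foreign key acts as a second line of defence: even if the explicit check were bypassed, the database would refuse to delete a channel that a recipe still references.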

The UI needs to be updated to:

  • display release channel information in requested task, task and recipe
  • allow selecting a release channel in the recipe configuration
    • first select a scraper name
    • then select a release channel name available for this scraper name

Open question: what do we want to do when we clone a recipe? We probably need something, since this has proved to be a source of confusion in the past, e.g. an editor clones a zimit recipe which unfortunately uses the zimit2 tag without realizing it is not using the default (zimit1) tag. Do we want a concept of default release channel per scraper, so that whenever we clone a recipe which is not on this default release channel, we either display a warning or force it to use the default one?
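To make the two options concrete, here is a small sketch of what cloning with a per-scraper default channel might look like. Everything in it (the `Recipe` fields, the `clone_recipe` function, the `force_default` flag and the channel ids) is hypothetical and only meant to support the discussion:

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Recipe:
    name: str
    scraper_name: str
    release_channel_id: str


def clone_recipe(
    source: Recipe,
    new_name: str,
    default_channels: Dict[str, str],
    force_default: bool = False,
) -> Tuple[Recipe, Optional[str]]:
    """Clone a recipe; the second value is a warning, or None if the
    source already uses the default channel of its scraper."""
    default_id = default_channels[source.scraper_name]
    clone = replace(source, name=new_name)
    if source.release_channel_id == default_id:
        return clone, None
    if force_default:
        # Option 2: silently safe, the clone is moved to the default channel.
        clone = replace(clone, release_channel_id=default_id)
        return clone, f"clone moved to default channel '{default_id}'"
    # Option 1: keep the channel of the source, but tell the editor.
    return clone, (
        f"'{source.name}' uses non-default channel "
        f"'{source.release_channel_id}' (default is '{default_id}')"
    )


defaults = {"zimit": "zimit-production"}
source = Recipe("example_recipe", "zimit", "zimit-beta")

warned, warning = clone_recipe(source, "example_clone", defaults)
forced, _ = clone_recipe(source, "example_clone", defaults, force_default=True)
```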

@rgaudin
Member

rgaudin commented May 3, 2024

I like the idea. As for the question, I still believe that we should expect a minimum of effort from the user. If they don't recognize that the Channel is Zimit 2 instead of Zimit 1, then we cannot expect anything they input to be correct.

That said, the cloning UI could be improved to prominently include the Channel (or task_name before)
