Push build output to private gin-proc repo and share with user #6

Open
achilleas-k opened this issue Jul 3, 2019 · 0 comments
Labels
discussion Ideas open to discussion

achilleas-k commented Jul 3, 2019

This is an alternative idea to the current workflow of pushing output to a gin-proc branch of the original repository.

One of the original ideas we had for serving build output to the user was having a data store that would serve archives. The output would be privately accessible, either using credentials or by secret URLs available only to the user.

This led me to the idea that we could use GIN repositories as data stores. The workflow would be:

  • When the user enables gin-proc builds on their repository (when the hook is created), create a private repository as the gin-proc user on GIN with the name gin-proc/<user>-<repository> (the data repository). The repository name is guaranteed unique since the repositories' unique names are <user>/<repository>.
  • The originating user is added as a collaborator on the repository.
    • The user could be added as a read-only collaborator. This would prevent users from adding commits to the data repository and creating conflicts on subsequent builds, or deleting the repository, creating issues for gin-proc.
  • On a successful build, the newly created files (more specifically, the output files specified by the user in their configuration) are moved to a local clone of the data repository and pushed.
    • Subdirectories can be used to separate different builds. Alternatively, each new build can create a new commit with a message specifying which build number it was. Commits can even be tagged.
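
The workflow above could be sketched roughly as follows. This is only an illustration under assumptions, not gin-proc code: the API root, the endpoint paths (taken from the Gogs v1 API as I understand it; check them against the GIN server), the tag naming scheme, and all function names are hypothetical. The functions only describe the requests and git commands; they do not send or run anything. Annexed content would need additional handling that is not shown.

```python
from urllib.parse import quote

API_ROOT = "https://gin.g-node.org/api/v1"  # assumed API root for the GIN server
SERVICE_USER = "gin-proc"  # service account that owns the data repositories


def data_repo_name(user: str, repository: str) -> str:
    """Return the data repository name, <user>-<repository>."""
    return f"{user}-{repository}"


def create_repo_request(user: str, repository: str) -> dict:
    """Describe the request that creates the private data repository.

    Sent with the service account's token, it would create the repository
    under that account (endpoint path is an assumption).
    """
    return {
        "method": "POST",
        "url": f"{API_ROOT}/user/repos",
        "json": {"name": data_repo_name(user, repository), "private": True},
    }


def add_collaborator_request(user: str, repository: str,
                             permission: str = "read") -> dict:
    """Describe the request that adds the originating user as a collaborator."""
    name = data_repo_name(user, repository)
    url = f"{API_ROOT}/repos/{SERVICE_USER}/{quote(name)}/collaborators/{quote(user)}"
    return {"method": "PUT", "url": url, "json": {"permission": permission}}


def publish_commands(build_number: int) -> list:
    """Git commands to run in the local clone once the outputs are in place."""
    tag = f"build-{build_number}"  # hypothetical tag naming scheme
    return [
        ["git", "add", "--all"],
        ["git", "commit", "-m", f"Output of build #{build_number}"],
        ["git", "tag", tag],
        ["git", "push", "origin", "HEAD", tag],
    ]
```

One caveat on naming: if user or repository names can themselves contain hyphens, joining them with a hyphen can map two different repositories to the same data repository name (for example, a-b/c and a/b-c both become a-b-c), so a different separator or an explicit check may be needed.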

Benefits of this approach:

  • When a user visits gin.g-node.org/gin-proc, they will see a list of data repositories for their CI-enabled repositories only.
  • Users have access to the data store but they can't remove any data or add extra commits that would create conflicts for the gin-proc user.
    • In the current branch-based method, users can delete the branch or modify files in it, which can create issues for the gin-proc user. We could work around this. It might also be desirable to allow the user to delete or modify build outputs, so this point is not a clear benefit.
  • The data repositories can be used to store intermediate outputs from snakemake as well as build outputs (in an appropriately named subdirectory). This might help users with troubleshooting builds when something goes wrong, without requiring the hindsight of specifying storage of intermediate files.

Disadvantages of this approach (versus branch-based):

  • User has no direct control over the storage of their build outputs.
    • This is the counterpart to the second point above and depends on whether we want to give that control to the users and how we handle conflicts and other issues.
  • If we store intermediate snakemake files in a user-accessible repository, we lose the ability to clear old build outputs if (when) storage becomes an issue, because they will always be part of the git history.
    • We can get around this by having a clearly written policy that old builds are deleted after a certain period of time. However, to do this we would have to delete git history (or annexed data), potentially causing issues for users who have cloned their data repository.
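
As a rough illustration of the time-based policy, the sketch below picks out the builds that have passed a retention period. The 90-day period, the function name, and the mapping of build numbers to finish times are all assumptions for the example. It only selects candidates; actually removing them from git history (or dropping annexed data) is the hard part described above and is not covered.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=90)  # hypothetical retention period


def expired_builds(builds: dict, now: datetime,
                   retention: timedelta = RETENTION) -> list:
    """Return the build numbers that finished before the retention cutoff.

    `builds` maps a build number to its timezone-aware finish time.
    """
    cutoff = now - retention
    return sorted(n for n, finished in builds.items() if finished < cutoff)


# Example: with a 90-day retention period, only build 1 is old enough.
now = datetime(2019, 7, 3, tzinfo=timezone.utc)
builds = {
    1: datetime(2019, 1, 10, tzinfo=timezone.utc),
    2: datetime(2019, 5, 1, tzinfo=timezone.utc),
    3: datetime(2019, 6, 30, tzinfo=timezone.utc),
}
print(expired_builds(builds, now))  # [1]
```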

For now, we should move forward with the branch-based method, since it's more straightforward. At first I thought this idea would also require GOGS changes, since there was no API call to add collaborators, but that's available now.

Feel free to use this issue for discussions on this idea and any alternatives.

achilleas-k added the "discussion" (Ideas open to discussion) label Jul 3, 2019