Design a "bot" to assist "managing" of the dandisets by authorized users #360

Open
2 tasks
yarikoptic opened this issue Oct 13, 2023 · 5 comments

@yarikoptic
Member

yarikoptic commented Oct 13, 2023

DataLad dandisets are not "sources"; they are automagically updated by a cron service running on drogon. As such, we should not give any write permissions to the original authors/owners of the dandisets as known to the DANDI system.

Only DANDI archive has information about which users have write access to the original dandiset, and thus should be capable e.g. to

It will be the responsibility of the bot to first verify that the author of a command is among the authors of the dandiset before acting on that command.
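A minimal sketch of that check, assuming the bot has already obtained the list of owners from the DANDI archive and that those records can be mapped to GitHub logins (both the mapping and the function name `is_authorized` are hypothetical, not something established in this issue):

```python
from typing import Iterable


def is_authorized(commenter_login: str, dandiset_owners: Iterable[str]) -> bool:
    """Return True if the command author is among the dandiset owners.

    Assumption: `dandiset_owners` holds GitHub logins derived from the
    DANDI archive's owner records. GitHub logins are case-insensitive,
    so the comparison ignores case.
    """
    wanted = commenter_login.casefold()
    return any(owner.casefold() == wanted for owner in dandiset_owners)
```

The bot would run this before acting on any command and ignore (or politely refuse) commands from anyone else.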

Such a bot service might also help to show/seek confirmation from the authors for some scheduled/mass maintenance tasks etc., as we brainstormed with @rly during ODIN, e.g.

  • upgrade .nwb files:
    • from some ancient version (e.g. 1.0a) over to a newer one
  • upgrade dandisets with some auxiliary metadata (e.g. semantic annotations etc)

Some extra "cron" jobs which might need to be performed but do not necessarily need to be a part of the "bot", I guess:

  • checking, for each dandiset, the list of users with the Triage role who have been invited but have neither confirmed/responded nor removed themselves, and emailing them a reminder/instructions on how to accept the invitation or remove themselves (Design a "bot" to assist "managing" of the dandisets by authorized users #360). Actually, we might not want to send a reminder if at least one "Triage" user has replied: the others might be PIs etc., whom we would rather not pester if somebody has already taken care of it.

TODOs (yet to be finalized)

  • check out e.g. conda-forge bot(s). They must reflect lots of knowledge/best practices to design such beasts
  • ...

refs:

@yarikoptic
Member Author

FWIW https://github.com/dependabot is a great bot/project/resource to check out. Since it is quite modular and configurable, I even started to wonder if we could build on top of it, even if only for some jobs...

@rly

rly commented Oct 20, 2023

The JOSS editorial-bot https://github.com/openjournals/buffy might also be a good bot/project/resource to check out.

@jwodder
Member

jwodder commented Nov 15, 2023

Preliminary research:

  • At a high level, setting up a GitHub bot (properly known as a "GitHub App") involves building a publicly-accessible HTTP server and configuring GitHub to send requests called "webhooks" to the server whenever certain events happen. If the server receives an actionable webhook, it takes action via the normal GitHub APIs.

  • The steps for creating an app seem simple enough, but I'd find it hard to believe that no one has yet created a generic GitHub App framework in Python to save us from having to write everything from scratch.

  • A GitHub App can be public or private.

    • A private app can be associated with at most one organization, so if we want to use this bot in multiple organizations (e.g., dandisets and dandizarrs), we would have to duplicate the setup on the GitHub end in order to create two identical apps in different organizations pointing to the same server.

    • Public apps can be installed in any number of organizations and by any GitHub user, though if we don't choose to put it on the Marketplace, I'm not sure how anyone else would find it.
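
To make the webhook part concrete, here is a minimal sketch of how the server could check that a request really came from GitHub. It relies on the `X-Hub-Signature-256` header that GitHub attaches when a webhook secret is configured; the function name `verify_signature` and the way the secret is passed in are illustrative assumptions, and it uses only the standard library rather than any particular framework:

```python
import hashlib
import hmac
from typing import Optional


def verify_signature(secret: bytes, body: bytes, signature_header: Optional[str]) -> bool:
    """Check a webhook body against its X-Hub-Signature-256 header.

    GitHub sends "sha256=<hex digest>", where the digest is the HMAC-SHA256
    of the raw request body keyed with the webhook secret.
    """
    if not signature_header or not signature_header.startswith("sha256="):
        return False
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), signature_header.encode())
```

The HTTP handler would call this on the raw bytes of the request (before any JSON parsing) and respond with an error status if it returns False.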


Next step: Search PyPI and elsewhere for pre-existing GitHub App frameworks

@jwodder
Member

jwodder commented Nov 15, 2023

> FWIW https://github.com/dependabot is a great bot/project/resource to check out. Since it is quite modular and configurable, I even started to wonder if we could build on top of it, even if only for some jobs...

I'm pretty sure that Dependabot's modularity & configurability are all oriented around updating dependencies; e.g., you can add a module for looking up packages in a new package source, but you can't add a module for operating on dandisets. Also, it's written in Ruby, which I don't know and I don't think anyone else on the team knows.

@jwodder
Member

jwodder commented Nov 15, 2023

[Work in Progress]

Features to judge packages on:

  • Validating webhook deliveries
  • Generating a JWT
  • Generating an installation access token
    • Installation access tokens are valid for one hour, so ideally our bot should reuse them until they expire instead of generating a new one on every webhook event; do any of these packages help with this?
  • Tracking X-GitHub-Delivery headers [1] [2]
  • Integration with Flask or another web framework
    • Dispatching based on event & action
  • I'm assuming that, for at least some webhook events, our bot is going to want to spawn asynchronous tasks (through celery or similar) rather than blocking the HTTP request handler until the webhook is fully acted upon. Do any of these packages help with this?
    • Relevant GitHub docs: [link]
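
For reference when judging packages, here is a rough standard-library sketch of three of the features above: reusing installation access tokens until they are about to expire, remembering `X-GitHub-Delivery` IDs so a redelivered webhook is acted on once, and dispatching on event & action. All class names are hypothetical, and `fetch` stands in for whatever code actually requests a token from GitHub; a real package may structure this quite differently.

```python
import time
from collections import OrderedDict
from typing import Callable, Dict, Optional, Tuple


class TokenCache:
    """Reuse an installation access token until shortly before it expires."""

    def __init__(self, fetch: Callable[[int], Tuple[str, float]],
                 margin: float = 60.0,
                 clock: Callable[[], float] = time.time) -> None:
        self._fetch = fetch    # hypothetical: returns (token, expiry in epoch seconds)
        self._margin = margin  # refresh this many seconds before expiry
        self._clock = clock
        self._tokens: Dict[int, Tuple[str, float]] = {}

    def get(self, installation_id: int) -> str:
        cached = self._tokens.get(installation_id)
        if cached is None or cached[1] - self._margin <= self._clock():
            cached = self._fetch(installation_id)
            self._tokens[installation_id] = cached
        return cached[0]


class DeliveryLog:
    """Remember recent X-GitHub-Delivery IDs (bounded, oldest evicted first)."""

    def __init__(self, max_size: int = 10_000) -> None:
        self._seen: "OrderedDict[str, None]" = OrderedDict()
        self._max_size = max_size

    def first_time(self, delivery_id: str) -> bool:
        if delivery_id in self._seen:
            return False
        self._seen[delivery_id] = None
        if len(self._seen) > self._max_size:
            self._seen.popitem(last=False)
        return True


Handler = Callable[[dict], None]


class Dispatcher:
    """Route a webhook to a handler keyed on (event, action)."""

    def __init__(self) -> None:
        self._handlers: Dict[Tuple[str, Optional[str]], Handler] = {}

    def on(self, event: str, action: Optional[str] = None):
        def register(func: Handler) -> Handler:
            self._handlers[(event, action)] = func
            return func
        return register

    def dispatch(self, event: str, payload: dict) -> bool:
        """Call the most specific handler; return False if none is registered."""
        handler = (self._handlers.get((event, payload.get("action")))
                   or self._handlers.get((event, None)))
        if handler is None:
            return False
        handler(payload)
        return True
```

In this sketch a handler registered with `@dispatcher.on("issue_comment", "created")` would be the natural place to enqueue an asynchronous task instead of doing the work inside the HTTP request.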

Relevant (to varying degrees) Python packages found so far:

GitHub apps written in Python:
