Plugin installation step #6875

Closed
mariusandra opened this issue May 21, 2021 · 12 comments
@mariusandra
Collaborator

Currently plugins are installed from Django, in the main posthog app's plugin.py. The code there either 1) takes a GitHub repo URL, parses it, fetches a .zip file with the plugin code, does basic validation (a version check), and stores the archive in postgres, or 2) opens a source editor and stores its contents in postgres.
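That fetch path could be sketched roughly like this (a hypothetical helper, not the actual plugin.py code; the function name and URL scheme here are assumptions):

```python
import re
from typing import Optional

def parse_github_url(url: str) -> Optional[dict]:
    """Parse a GitHub repo URL and derive the .zip archive URL to fetch.
    Hypothetical sketch of the Django-side flow described above."""
    match = re.match(
        r"^https://github\.com/([\w.-]+)/([\w.-]+?)(?:/tree/([\w.-]+))?/?$", url
    )
    if not match:
        return None
    owner, repo, ref = match.groups()
    ref = ref or "main"  # default branch when no tag/commit is given
    return {
        "owner": owner,
        "repo": repo,
        "ref": ref,
        # codeload.github.com serves repository snapshots as .zip archives
        "zip_url": f"https://codeload.github.com/{owner}/{repo}/zip/{ref}",
    }
```

The archive bytes would then be validated (version check) and stored in postgres as today.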

In both cases, we won't know if the plugin actually works before we try it out for real.

Instead, I'd like there to be a separate "install plugin" step and "installing..."/"updating..." states. This might mean pushing some of the logic away from Python into NodeJS (e.g. downloading the archives?) and just dispatching tasks from the plugin server, and awaiting their responses.

Eventually, the plugin server would spin up a VM for the plugin being installed, capture its capabilities, and, if everything loaded well, mark the plugin as installed and ready to use. I'd argue the VM should be spun up in a separate worker pool (with just one thread) for isolation, either via piscina or some other system. This could be the basis for plugin testing as well: https://github.com/PostHog/plugin-server/issues/165

@mariusandra
Collaborator Author

mariusandra commented May 21, 2021

This unlocks:

@neilkakkar
Collaborator

Thinking this through: there are two ways for posthog to share data with the plugin-server: the postgres database, and the celery queue.

Given the discussion in PostHog/plugin-server#165, I think we don't want to go the webserver route to open a channel of communication between the two.

Now the question becomes how installation works. There's the UI presentation side and the backend implementation side, and I'm thinking of reusing most of what we already have in both cases.

Here's how I imagine things would work:

  1. The pluginConfig gets an installation status field (that goes into the DB).
  2. Python/Django does exactly what it does right now, except instead of being enabled on installation, it sets the status to installing. (I need to think a bit more about whether I can modify the enabled field instead, but that feels very messy.)
  3. This moves the plugin into the Installed tab (name to be changed), with a third sub-tab called "Installing Plugins" (screenshot):
    1. This way, we can leverage existing screens and setup, while probably making more sense around testing. This tab gets a Send Test Event button in the future.
    2. For now, it's exactly like a normal plugin, except it's not enabled. The logs look the same, so errors show up right here.
    3. Instead of sending a reload-all-plugins signal on creation of a new plugin, we send a Celery task to the plugin server (a posthog.tasks.test.setupPlugin task), which creates a VM, runs everything, and adds an ERROR to the pluginConfig if need be. The logs and error on the frontend are populated appropriately.
    4. If there are no errors, we update the status to installed and enable the plugin from the plugin server. This is a bit icky; I'm not sure if there's a way to get the Django API to do this on its own. This then requires the reload broadcast task.
    5. For errors, there's a retry button which attempts setup again.
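The status lifecycle in the steps above could be sketched as follows (the state names are assumptions, not the real schema):

```python
from enum import Enum

class InstallStatus(str, Enum):
    """Assumed states for the pluginConfig's installation status field."""
    INSTALLING = "installing"
    INSTALLED = "installed"
    ERROR = "error"

# Legal transitions: installing resolves to installed or error;
# error can go back to installing via the retry button.
TRANSITIONS = {
    InstallStatus.INSTALLING: {InstallStatus.INSTALLED, InstallStatus.ERROR},
    InstallStatus.ERROR: {InstallStatus.INSTALLING},
    InstallStatus.INSTALLED: set(),
}

def can_transition(current: InstallStatus, new: InstallStatus) -> bool:
    return new in TRANSITIONS[current]
```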

The reason I don't want to move most of the plugin setup to the plugin server is how the Django ORM fully manages CRUD operations for every model. This gets borked when we get the plugin server to change things in the DB, but it gets borked less than making half the changes in Django and half in the plugin server.

Ideally, at the end of a pluginServer test without errors, it would send a capabilities + OK-to-enable request to the Django API, which then manages doing this and sending the reload signal back. I'm not yet sure if this is possible, but otherwise this seems like a reasonable approach to me.

Am I missing something? Are there better alternatives you can think of?


About calling the Django API from the plugin server: isn't this just a matter of figuring out the right endpoint via config and making the usual HTTP requests? The webserver is a problem on the plugin-server end, not the Django end?

@neilkakkar
Collaborator

Assuming I'm not missing anything, our data flow looks like:

  1. Django sends a request via Celery to the plugin server to test the setup of a given plugin.
  2. The plugin server does this by setting up a fresh VM, and:
    1. on error, does nothing except updating error on the pluginConfig (the usual); this update gets picked up by the frontend to show the error in the installing tab.
    2. on no error, sends capabilities plus a request to enable the plugin to the API, which manages updating capabilities if need be, and sets state to enabled and installation status to installed.

I'll have to figure out how that results in updates on the frontend, but design / concept wise - I don't see any problems (yet).
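A minimal sketch of that plugin-server handler, with setup_vm standing in for the fresh-VM setup, and all pluginConfig field names assumed:

```python
def run_install(plugin_config: dict, setup_vm) -> dict:
    """Sketch of the flow above: setup_vm raises on failure and returns a
    capabilities dict on success. Field names are assumptions."""
    try:
        capabilities = setup_vm(plugin_config)
    except Exception as exc:
        # on error: record it on the pluginConfig (the usual); the frontend
        # picks this up and shows it in the installing tab
        plugin_config["error"] = str(exc)
        plugin_config["install_status"] = "error"
    else:
        # on no error: report capabilities and request the plugin be enabled
        plugin_config["capabilities"] = capabilities
        plugin_config["install_status"] = "installed"
        plugin_config["enabled"] = True
    return plugin_config
```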

@mariusandra
Collaborator Author

Hey, this is mostly spot on! Some things to add though :).

  1. Upgrades :). Plugins need to be upgraded at times. That means uploading a new GitHub zip archive, or editing the source field. How do we manage the installed field then? I think we really can't, because if we override the code and set installed to false, the plugin will effectively be disabled for a bit. That is unfortunately not okay :/.

Ultimately the answer is to store installs in a different model, which would also allow us to keep much more metadata, like installed_at, installed_by, etc... not to mention having an auditable log/history of previous plugin versions. The plugin itself could then still contain an installed field and the latest source/archive to keep things simple... or it could just contain a link to the installed version. Up to you. However, keep in mind we can't do database migrations that remove fields, because that makes Django go crazy when doing rolling upgrades, so I'd stick with keeping a copy of the source/archive... maybe :).
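For illustration, such an installs model could look like this plain-Python sketch (a dataclass standing in for a Django model; every field name here is an assumption):

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class PluginInstallation:
    """Hypothetical per-install record; one row per install/upgrade."""
    plugin_id: int
    version: str
    archive: bytes            # copy of the zip/source at install time
    installed_by: str
    installed_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

# Append-only history: upgrades add rows instead of overwriting the plugin,
# which gives an auditable log of previous versions for free.
history: list = []
history.append(PluginInstallation(1, "0.0.1", b"...", "marius"))
history.append(PluginInstallation(1, "0.0.2", b"...", "neil"))
latest = history[-1]  # most recent install for plugin 1
```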

  2. I'd simplify this part:

> "on no error, sends capabilities + request to enable plugin to the API, which manages updating capabilities if need be, and setting state to enabled + installation status to installed."

For storing capabilities, let's not do plugin server -> HTTP or Celery -> Django -> postgres, but just plugin server -> postgres. Ideally we want to avoid relying on Django or Celery too much, just because that makes ingestion much, much easier to scale and operate. That's why we're moving action matching from Django to the plugin server, for example (#436).
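The "plugin server -> postgres" write is just a single UPDATE. A runnable sketch (table and column names are assumptions; sqlite is used here only so the sketch runs standalone, while psycopg2 against postgres would use %s placeholders):

```python
import json
import sqlite3

def store_capabilities(cursor, plugin_id: int, capabilities: dict) -> None:
    """Write capabilities straight from the plugin server to the database,
    with no Django or Celery hop. Schema names are assumptions."""
    cursor.execute(
        "UPDATE posthog_plugin SET capabilities = ? WHERE id = ?",
        (json.dumps(capabilities), plugin_id),
    )

# demo against an in-memory database
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE posthog_plugin (id INTEGER PRIMARY KEY, capabilities TEXT)"
)
cur.execute("INSERT INTO posthog_plugin (id, capabilities) VALUES (1, NULL)")
store_capabilities(cur, 1, {"methods": ["processEvent"]})
```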

  3. There's no need to enable a plugin after it has been installed, so no need to broadcast any global reload task. That will be done by the frontend when you first save the configuration that opens after installation. This is because you often have to enter some required values, and we can't auto-enable plugins anyway.

  4. I wouldn't show the plugin in any list until it's installed. That means just keep spinning the button in the repository until it's installed. The "install plugin from url" box will require some special treatment. One option is to show a new section, "stale plugins" or "installing for a while..." or "broken installs", and include in it a list of plugins which are not installed, not in the repository, and not in the current "install from custom url" box. This should be super easy if you move some kea selectors around.

  5. posthog.tasks.test.setupPlugin <-- I assume this is a temp name, but it's probably better to use install in the task name for clarity. I'm also not sure which notation (camelCase vs underscore) is the default for Celery task names, if any, but so far we've used underscore_case, so please keep it consistent. Always double-check these kinds of things :).

  6. Super important yet easy to forget: we need to make sure we never actually start uninstalled plugins.

@neilkakkar
Collaborator

Excellent points, thanks!

It makes sense about the plugin-server -> postgres bit. I had a deeper look last night and, sure enough, there are lots of places doing this already, so my assumption about Django being the only model manager is false.

> There's no need to enable a plugin after it has been installed. So no need to broadcast any global reload task. That will be done by the frontend when you first save the configuration that opens after installation. This is so because you often have to enter some required values, and can't auto-enable plugins anyway.

Oh, I think the installation can only happen after the configuration has happened, since the setup task can depend on the config?

> I wouldn't show the plugin in any list until it's installed. That means just keep spinning the button in the repository until it's installed.

Hmm, I don't think this is feasible if we want to show whether an installation failed for whatever reason (bad config, typos, etc.)?

> One option is to show a new section "stale plugins" or "installing for a while..." or "broken installs", and include in it a list of plugins, which are not installed, not in the repository, and not in the current "install from custom url" box.

Another reason I want the Installing tab is related to this, since it allows us to treat every kind of plugin the same: wherever you're coming from, you click install, things happen in the background, stuff gets loaded into the DB, and then you either end up on an installing tab if things go wrong, or get installed + enabled if things go right, no matter where the plugin came from.

(Yeah, maybe broken installs is a better name than Installing)


About the installation DB model and upgrades (related to points 1 and 6): I actually don't want to add an extra layer on top of enabled/disabled for checking installation status. Rather, "installing" is a subset of the disabled state.

So, here's how I imagine it works with updating: you click update, and the plugin goes into an installing state (and gets disabled while installing), as the source archives are changing. We go through the setup loop, and if things work fine, the installing state is set back to false and the plugin is enabled. (If things didn't work out, we go to the broken tab and keep it disabled, since it literally can't work anymore.)

So, I actually do think this works for upgrades as well? I wouldn't go the separate-model route unless we really want the installed_at / installed_by audit logs.

Another possibility with updating is to keep the plugin enabled while in the installing status. This way, the old plugin keeps running until installation finishes, at which point we either (1) disable it if installation failed, or (2) reload workers if it succeeded (somehow?).
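That "keep the old version running" path could be sketched as follows (test_install stands in for the fresh-VM setup check, and all dict fields are assumptions about the plugin record's shape):

```python
def upgrade(plugin: dict, new_archive: bytes, test_install) -> dict:
    """Sketch: the old archive keeps serving events for the whole duration
    of the test install; only on success do we swap and reload."""
    candidate = {**plugin, "archive": new_archive, "install_status": "installing"}
    if test_install(candidate):
        # success: swap archives and ask workers to reload
        plugin["archive"] = new_archive
        plugin["install_status"] = "installed"
        plugin["needs_reload"] = True
    else:
        # failure: disable the plugin and surface the broken install;
        # the stored archive is left untouched
        plugin["install_status"] = "error"
        plugin["enabled"] = False
    return plugin
```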

@mariusandra
Collaborator Author

Oh, plugin installation should happen before configuration. The point is still to capture capabilities on install and verify that all the plugin JS works. We don't need user data for that. All that's needed is to comment out the part that runs setupPlugin in vm.ts, in order not to get errors if some config var is missing.

And we really, absolutely, totally cannot disable plugins when updating. Imagine your Fortune 500 company relying on the "bigquery export" plugin... which just skips exporting a few thousand events for the 10 seconds the plugin was down for upgrades. That's a no-go.

@neilkakkar
Collaborator

Hmm, that's the thing: I don't think we can verify the JS works until we have the config in place, since those are all parameters used by the JS code?

Interesting, I was actually thinking installation would mean running the setup and seeing that everything is OK.


Fair point about disabling while updating; we can keep the old version running instead.

@mariusandra
Collaborator Author

We definitely can... and if you look at vm.ts, you'll see that we already have all we need to check capabilities before we run setupPlugin, which is the only thing that actually uses anything from the config.

I mean, we can't verify that setupPlugin works, but if it doesn't, you'll know immediately when you enable your plugin.


Regarding keeping the old version running: I don't see how you can do that without a separate table. If you override any source/archive fields on the plugin, you can't predict what happens next. Some old plugins might keep running, but what happens in a new server instance that boots up? We need to store the code we're trying to run somewhere, and a "plugin_installations" or "plugin_installed_versions" (or whatever) table makes the most sense IMO.


Regarding this "Installing tab": that's thinking like a developer. While we are marketing towards developers, even developers like tools designed for regular people :). That means the simpler, less cluttered, and less it makes you think, the better the interface.

@neilkakkar
Collaborator

I understand, that was the missing bit: I assumed installation would test that things work before actually enabling them. That clears things up, thanks!

.. And if there's no extra information to relay back, we don't need an extra tab/section.

Your initial points make a lot more sense now. Thanks!

@tiina303 tiina303 transferred this issue from PostHog/plugin-server Nov 3, 2021
@posthog-bot
Contributor

This issue hasn't seen activity in two years! If you want to keep it open, post a comment or remove the stale label – otherwise this will be closed in two weeks.

@posthog-contributions-bot
Contributor

This issue has 2037 words at 10 comments. Issues this long are hard to read or contribute to, and tend to take very long to reach a conclusion. Instead, why not:

  1. Write some code and submit a pull request! Code wins arguments
  2. Have a sync meeting to reach a conclusion
  3. Create a Request for Comments and submit a PR with it to the meta repo or product internal repo

Is this issue intended to be sprawling? Consider adding label epic or sprint to indicate this.

@posthog-bot
Contributor

This issue was closed due to lack of activity. Feel free to reopen if it's still relevant.

@posthog-bot closed this as not planned (won't fix, can't repro, duplicate, stale) on Nov 21, 2023