Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for engine calls from plugins #12029

Merged
merged 5 commits into from Mar 9, 2024

Conversation

devyn
Copy link
Contributor

@devyn devyn commented Mar 1, 2024

Description

This allows plugins to make calls back to the engine to get config, evaluate closures, and do other things that must be done within the engine process.

Engine calls can both produce and consume streams as necessary. Closures passed to plugins can both accept stream input and produce stream output sent back to the plugin.

Engine calls referring to a plugin call's context can be processed as long either the response hasn't been received, or the response created streams that haven't ended yet.

This is a breaking API change for plugins. There are some pretty major changes to the interface that plugins must implement, including:

  1. Plugins now run with &self and must be Sync. Executing multiple plugin calls in parallel is supported, and there's a chance that a closure passed to a plugin could invoke the same plugin. Supporting state across plugin invocations is left up to the plugin author to do in whichever way they feel best, but the plugin object itself is still shared. Even though the engine doesn't run multiple plugin calls through the same process yet, I still considered it important to break the API in this way at this stage. We might want to consider an optional threadpool feature for performance.

  2. Plugins take a reference to EngineInterface, which can be cloned. This interface allows plugins to make calls back to the engine, including for getting config and running closures.

  3. Plugins no longer take the config parameter. This can be accessed from the interface via the .get_plugin_config() engine call.

User-Facing Changes

Not only does this have plugin protocol changes, it will require plugins to make some code changes before they will work again. But on the plus side, the engine call feature is extensible, and we can add more things to it as needed.

Plugin maintainers will have to change the trait signature at the very least. If they were using config, they will have to call engine.get_plugin_config() instead.

If they were using the mutable reference to the plugin, they will have to come up with some strategy to work around it (for example, for Inc I just cloned it). This shouldn't be such a big deal at the moment as it's not like plugins have ever run as daemons with persistent state in the past, and they don't in this PR either. But I thought it was important to make the change before we support plugins as daemons, as an exclusive mutable reference is not compatible with parallel plugin calls.

I suggest this gets merged sometime after the current pending release, so that we have some time to adjust to the previous plugin protocol changes that don't require code changes before making ones that do.

Tests + Formatting

  • 🟢 toolkit fmt
  • 🟢 toolkit clippy
  • 🟢 toolkit test
  • 🟢 toolkit test stdlib

After Submitting

I will document the additional protocol features (EngineCall, EngineCallResponse), and constraints on plugin call processing if engine calls are used - basically, to be aware that an engine call could result in a nested plugin call, so the plugin should be able to handle that.

@WindSoilder WindSoilder added the pr:breaking-change This PR implies a change affecting users and has to be noted in the release notes label Mar 2, 2024
@WindSoilder
Copy link
Collaborator

Hi @devyn , thank you for such a good improvement! While I'm trying to review it, I found there is a conflict. Can you resolve it first?

@devyn
Copy link
Contributor Author

devyn commented Mar 4, 2024

@WindSoilder yes, this should fix the conflict

@sholderbach
Copy link
Member

Your suggestions regarding the plugin interface sound sensible (although this is certainly outside my wheelhouse)

One concern I have going forward is what we expose of Closure. It certainly depends a bit what our "threat"/failure model is.
The closure includes the block id pointing to the code to execute (and its metadata) as well as the captures both in copied Value form and a VarId.
Since #10894 this is fully serializable.

Picking arbitrary blocks is par for the course of allowing execution of code (so more in the bucket, you can make a mistake but no escalation)
With the captures this could become a bit thornier should we move from full clones to some form of reference. The scoping here is manually managed inside the Closure type.

Everything that properly round-trips serde as a Value would be fine as a clousre argument. Do we have anything that would not fit this bill?

@devyn
Copy link
Contributor Author

devyn commented Mar 5, 2024

Everything that properly round-trips serde as a Value would be fine as a clousre argument. Do we have anything that would not fit this bill?

The only Value currently not allowed to pass through the plugin interface as-is is LazyRecord, but even that just gets collapsed to a plain record before sending, so there shouldn't be any issues there.

Custom values were previously handled with pretty special behavior, but they're now a lot more sensible (plugins can only take custom values that they produced, but they can be in any context)

This allows plugins to make calls back to the engine to get config,
evaluate closures, and do other things that must be done within the
engine process.

Engine calls can both produce and consume streams as necessary. Closures
passed to plugins can both accept stream input and produce stream output
sent back to the plugin.

Engine calls referring to a plugin call's context can be processed as
long either the response hasn't been received, or the response created
streams that haven't ended yet.

This is a breaking API change for plugins. There are some pretty major
changes to the interface that plugins must implement, including:

1. Plugins now run with `&self` and must be `Sync`. Executing multiple
   plugin calls in parallel is supported, and there's a chance that a
   closure passed to a plugin could invoke the same plugin. Supporting
   state across plugin invocations is left up to the plugin author to
   do in whichever way they feel best, but the plugin object itself is
   still shared. Even though the engine doesn't run multiple plugin
   calls through the same process yet, I still considered it important
   to break the API in this way at this stage. We might want to consider
   an optional threadpool feature for performance.

2. Plugins take a reference to `EngineInterface`, which can be cloned.
   This interface allows plugins to make calls back to the engine,
   including for getting config and running closures.

3. Plugins no longer take the `config` parameter. This can be accessed
   from the interface via the `.get_plugin_config()` engine call.
@devyn
Copy link
Contributor Author

devyn commented Mar 8, 2024

@sholderbach

With the captures this could become a bit thornier should we move from full clones to some form of reference. The scoping here is manually managed inside the Closure type.

I think I finally got what you mean here, and I think it can probably be solved by just making sure that the context held by the plugin in order to do the engine calls keeps those references alive. Unless something crazy happens with global mutable state, I think it should "just work" - the plugin interface always stores an immutable copy of EngineState, Stack, Call for that particular plugin call context, and the cleanup of references I'm assuming would happen somewhere in the execution cycle where Stack or EngineState are mutable, which shouldn't affect plugins.

Now, on that note, I would really like it if EngineState and Stack were cheaper to copy, because I'm sure what's happening right now is expensive... but I think no matter what ends up happening there, unless problematic mutable state is introduced, it shouldn't cause a problem for plugins.

@sholderbach
Copy link
Member

I think for now sending the internals of the closure back and forth is OKish, even though in my view it should be an opaque unit/reference that contains both the code and closed over values in unmodifiable form. (Your example doesn't touch the Values but in theory could). That could be a general gripe with how our EngineState internals are pretty transparently pub but is exacerbated when that gets shared over the plugin boundary.

To get things rolling I am fine with landing this, but as you mentioned the mass of clones we accrued and want to get rid off in the future provides an incentive to move to more reference like architectures in the internals. We just need to watch out that we don't accidentally get mutable in places we don't want to.

@devyn
Copy link
Contributor Author

devyn commented Mar 8, 2024

Sure, I think probably then the main gotcha is that a closure that captures an engine / other plugin custom value won't be possible to send to a plugin. It won't be able to serialize the captures. So I think that's yet another incentive to fix that and make the captures some kind of reference instead.

On that note, I actually do need to fix this to support the custom values in the closure captures. One sec.

@devyn
Copy link
Contributor Author

devyn commented Mar 8, 2024

Ok, that works properly now

devyn added a commit to devyn/nushell.github.io that referenced this pull request Mar 9, 2024
@kubouch kubouch added pr:api-change This PR should be mentioned in #api-updates channel on Discord and removed pr:breaking-change This PR implies a change affecting users and has to be noted in the release notes labels Mar 9, 2024
@fdncred fdncred added the pr:breaking-change This PR implies a change affecting users and has to be noted in the release notes label Mar 9, 2024
@fdncred
Copy link
Collaborator

fdncred commented Mar 9, 2024

let's give it a try. thanks!

@fdncred fdncred merged commit 430fb1f into nushell:main Mar 9, 2024
19 checks passed
fdncred pushed a commit to nushell/nushell.github.io that referenced this pull request Mar 9, 2024
fdncred pushed a commit that referenced this pull request Mar 9, 2024
# Description
This PR uses the new plugin protocol to intelligently keep plugin
processes running in the background for further plugin calls.

Running plugins can be seen by running the new `plugin list` command,
and stopped by running the new `plugin stop` command.

This is an enhancement for the performance of plugins, as starting new
plugin processes has overhead, especially for plugins in languages that
take a significant amount of time on startup. It also enables plugins
that have persistent state between commands, making the migration of
features like dataframes and `stor` to plugins possible.

Plugins are automatically stopped by the new plugin garbage collector,
configurable with `$env.config.plugin_gc`:

```nushell
  $env.config.plugin_gc = {
      # Configuration for plugin garbage collection
      default: {
          enabled: true # true to enable stopping of inactive plugins
          stop_after: 10sec # how long to wait after a plugin is inactive to stop it
      }
      plugins: {
          # alternate configuration for specific plugins, by name, for example:
          #
          # gstat: {
          #     enabled: false
          # }
      }
  }
```

If garbage collection is enabled, plugins will be stopped after
`stop_after` passes after they were last active. Plugins are counted as
inactive if they have no running plugin calls. Reading the stream from
the response of a plugin call is still considered to be activity, but if
a plugin holds on to a stream but the call ends without an active
streaming response, it is not counted as active even if it is reading
it. Plugins can explicitly disable the GC as appropriate with
`engine.set_gc_disabled(true)`.

The `version` command now lists plugin names rather than plugin
commands. The list of plugin commands is accessible via `plugin list`.

Recommend doing this together with #12029, because it will likely force
plugin developers to do the right thing with mutability and lead to less
unexpected behavior when running plugins nested / in parallel.

# User-Facing Changes
- new command: `plugin list`
- new command: `plugin stop`
- changed command: `version` (now lists plugin names, rather than
commands)
- new config: `$env.config.plugin_gc`
- Plugins will keep running and be reused, at least for the configured
GC period
- Plugins that used mutable state in weird ways like `inc` did might
misbehave until fixed
- Plugins can disable GC if they need to
- Had to change plugin signature to accept `&EngineInterface` so that
the GC disable feature works. #12029 does this anyway, and I'm expecting
(resolvable) conflicts with that

# Tests + Formatting
- 🟢 `toolkit fmt`
- 🟢 `toolkit clippy`
- 🟢 `toolkit test`
- 🟢 `toolkit test stdlib`

Because there is some specific OS behavior required for plugins to not
respond to Ctrl-C directly, I've developed against and tested on both
Linux and Windows to ensure that works properly.

# After Submitting
I think this probably needs to be in the book somewhere
@hustcer hustcer added this to the v0.92.0 milestone Mar 10, 2024
@devyn devyn deleted the call-closure-from-plugin branch March 10, 2024 21:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr:api-change This PR should be mentioned in #api-updates channel on Discord pr:breaking-change This PR implies a change affecting users and has to be noted in the release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants