rfc: Prompt API support #14
Conversation
I love the simplicity of directly setting the handler. The main drawback I see is that with this design, the main process is always an intermediary between the utility process and the renderer.
text/0014-ai-prompt-api.md:

```js
if (!this.#aiProcess) {
  throw new Error('AI model process not started');
}
this.#aiProcess.postMessage({
```
How do we expect to get a prompt response back out of the utility process? Will it `postMessage` back to the main process?
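One plausible shape (the names and echo model here are illustrative, not from the RFC): the utility process replies over the same message channel, and the main process correlates replies by a request id. In a real Electron app the two ends would be the `UtilityProcess` handle and `process.parentPort`; this sketch stands them in with plain `EventEmitter`s so the correlation logic is visible on its own:

```typescript
import { EventEmitter } from 'node:events';

// Stand-ins for the two ends of the main <-> utility message channel.
const mainPort = new EventEmitter();
const utilityPort = new EventEmitter();

let nextRequestId = 0;
// Resolvers for prompts that have been sent but not yet answered.
const pending = new Map<number, (value: string) => void>();

// Main-process side: send a prompt, resolve when the matching reply arrives.
function sendPrompt(prompt: string): Promise<string> {
  return new Promise((resolve) => {
    const requestId = nextRequestId++;
    pending.set(requestId, resolve);
    utilityPort.emit('message', { requestId, prompt });
  });
}

mainPort.on('message', (msg: { requestId: number; response: string }) => {
  pending.get(msg.requestId)?.(msg.response);
  pending.delete(msg.requestId);
});

// Utility-process side: handle the prompt and post the result back,
// echoing the requestId so the main process can route the reply.
utilityPort.on('message', (msg: { requestId: number; prompt: string }) => {
  mainPort.emit('message', { requestId: msg.requestId, response: `echo: ${msg.prompt}` });
});
```

The request-id correlation is what lets a single port carry multiple concurrent prompts without mixing up their responses.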
text/0014-ai-prompt-api.md:

```js
  }
}

session.defaultSession.setPromptAPIHandler(() => { ElectronAI });
```
If we directly pass a construction function, I think it'll make it easier to swap out the implementation flexibly, and will also make it easier to adapt to changes in the upstream API:

```diff
- session.defaultSession.setPromptAPIHandler(() => { ElectronAI });
+ session.defaultSession.setPromptAPIHandler((webContents, options) => new ElectronAI(webContents, options));
```
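The flexibility argument can be seen in a few lines. This is a sketch with stand-ins, not Electron's API: `setPromptAPIHandler`, `LanguageModelLike`, and both model classes are illustrative; the point is only that a factory lets the implementation change without touching any caller:

```typescript
// Minimal stand-in interface for whatever the handler must construct.
interface LanguageModelLike {
  prompt(input: string): Promise<string>;
}

type PromptHandlerFactory = (webContentsId: number, options: object) => LanguageModelLike;

let factory: PromptHandlerFactory | null = null;

// Stand-in for session.defaultSession.setPromptAPIHandler.
function setPromptAPIHandler(f: PromptHandlerFactory | null): void {
  factory = f;
}

// Two interchangeable implementations.
class EchoModel implements LanguageModelLike {
  async prompt(input: string) { return `echo: ${input}`; }
}
class ShoutModel implements LanguageModelLike {
  async prompt(input: string) { return input.toUpperCase(); }
}

setPromptAPIHandler((id, opts) => new EchoModel());
// Swapping the implementation is a one-line change at the registration site:
setPromptAPIHandler((id, opts) => new ShoutModel());
```

Because callers only ever see the factory's return type, upstream API changes can be absorbed inside the constructed class.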
Some non-substantive comments, and a note that we should probably make the source.chromium.org links permalinks for historical reference.
Are there other imagined use cases at the moment for the `WebContents` argument to `setPromptAPIHandler`, or is the only imagined use case currently to distinguish different `WebContents` instances to avoid conversation bleed-over between them?

Exposing `WebContents` to `utilityProcess` would be a rather large side effect of this API, so I wonder if we want to start with the minimal surface area and only provide `webContents.id` to start?
It would also expose a bit of a security hole: utility processes shouldn't have arbitrary Mojo access to WebContents. And building a whole system where we pass handles back and forth to ensure that only the given utility process can call those methods, without a strong reason for utility processes to need that access, is gonna be a hard sell (at least to me). Agree with David, would love to avoid exposing any part of the `WebContents` here.
- Pass `webContentsId` instead of `webContents`
- Additionally pass `securityOrigin`
- Added `ses.getUtilityProcessForLocalAI` so that a handle to the utility process can be obtained and messages can be sent to and received from the utility process (e.g. the utility process sends a message with the `webContentsId` in order to get more information about the WebContents if needed)

@dsanders11 @MarshallOfSound Good points on passing the webContents. I've adjusted the design to pass a `webContentsId` instead, and I've also added the `securityOrigin`.
I've got some questions about the lifetime of the associated `UtilityProcess`.

It looks like the Prompt API allows for explicit session destruction, and I see you've got the `destroy` method in the handler script. Do we want destroy handled at that level, and leave the `UtilityProcess` alive, or do we also kill the `UtilityProcess` on destroy?

Do we support multiple `LanguageModel.create` calls, and/or the `clone` operation? Does each one get its own `UtilityProcess`, or is it a single `UtilityProcess` with multiple `ElectronAI` instances?

Do we automatically restart the `UtilityProcess` if it crashes, like the network service in Chromium? Or for now should we simply let the process die and throw errors for any `LanguageModel` associated with the dead process?
Also addressed review feedback
@dsanders11, in regards to your questions:

1. I think this should be left up to those implementing the API as to whether they want to leave the UtilityProcess running or not.
2. Again, I think this is up to those implementing the API. You could in theory register multiple local AI handler scripts/utility processes, or you could use one utility process.
3. This is a good question that I'd like the API WG to weigh in on, but I think we'd want to let the process die and throw errors, because if someone wrote a utility process that continually crashes, we wouldn't want to repeatedly restart it only to have it die again.
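The let-it-die policy amounts to: on process exit, reject everything in flight and don't respawn. A sketch under stated assumptions: the `UtilityProcess` handle is stood in by an `EventEmitter` (Electron's real handle does emit `'exit'`), and `sendPrompt` is illustrative:

```typescript
import { EventEmitter } from 'node:events';

// Stand-in for the UtilityProcess handle; only the 'exit' event is modeled.
const aiProcess = new EventEmitter();

// Rejection callbacks for prompts that have been sent but not yet answered.
const inFlightRejects = new Set<(err: Error) => void>();

function sendPrompt(): Promise<string> {
  return new Promise((resolve, reject) => {
    inFlightRejects.add(reject);
    // (actual message sending elided; a reply would call resolve
    // and remove this reject from the set)
  });
}

// Policy from the answer above: on crash, do NOT restart. Fail every
// outstanding LanguageModel call so callers see an error instead of
// hanging, and leave restart decisions to the embedder.
aiProcess.on('exit', (code: number) => {
  for (const reject of inFlightRejects) {
    reject(new Error(`AI utility process exited with code ${code}`));
  }
  inFlightRejects.clear();
});
```

A continually-crashing handler script then surfaces as repeated rejections rather than a restart loop.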
* `webContentsId` (Integer | null) - The unique id of the [WebContents](web-contents.md) calling the Prompt API. The webContents id may be null if the Prompt API is called from a service worker or shared worker.
* `securityOrigin` String - Origin of the page calling the Prompt API.
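As a sketch of how a handler might use these two fields, here is one possible gating policy. Everything here is illustrative (the allowlist, the function name, and the reject-workers choice are assumptions, not part of the RFC):

```typescript
// Hypothetical allowlist of origins permitted to use the Prompt API.
const allowedOrigins = new Set(['https://app.example.com']);

function canUsePromptAPI(webContentsId: number | null, securityOrigin: string): boolean {
  // Example policy: only frames (webContentsId present) from an
  // allowlisted origin; service/shared workers (null id) are rejected.
  return webContentsId !== null && allowedOrigins.has(securityOrigin);
}
```

A handler could run this check before constructing a `LanguageModel`, returning `null` or throwing for disallowed callers.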
Maybe we can structure this in a way that allows looking up more details of the sender, similar to how preload realms introduced `senderType` to IPCs. That way the handler can more easily validate the initiator.
rfcs/text/0008-preload-realm.md, lines 319 to 329 in d89000c:

```diff
interface IpcMainEvent {
  // Breaking change
- sender: WebContents;
+ sender: WebContents | ServiceWorkerMain;
  // Non-breaking change
+ senderType: 'frame' | 'service-worker';
+ senderServiceWorker: ServiceWorkerMain;
}
```
Could be something like:

```ts
type HandlerDetails =
  | { senderType: 'frame', webContentsId: number, frameId: number }
  | { senderType: 'service-worker', versionId: number, scopeUrl: string }
  | { senderType: 'process', processId: number }
```
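A nice property of that shape is that `senderType` acts as a discriminant, so a handler can validate each initiator kind with full type narrowing. Restating the union so the sketch compiles standalone (`describeSender` is illustrative):

```typescript
type HandlerDetails =
  | { senderType: 'frame'; webContentsId: number; frameId: number }
  | { senderType: 'service-worker'; versionId: number; scopeUrl: string }
  | { senderType: 'process'; processId: number };

// Switching on the discriminant narrows `details` in each branch, so the
// handler can only touch the fields that actually exist for that sender.
function describeSender(details: HandlerDetails): string {
  switch (details.senderType) {
    case 'frame':
      return `frame ${details.frameId} in webContents ${details.webContentsId}`;
    case 'service-worker':
      return `service worker ${details.versionId} for ${details.scopeUrl}`;
    case 'process':
      return `process ${details.processId}`;
  }
}
```

This also makes future sender kinds an additive change: a new union member fails to compile anywhere the switch isn't updated (with exhaustiveness checking enabled).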
Also, should the handler allow rejecting the prompt API request?
* `id` string - local AI handler script ID

Unregisters the specified local AI handler script and terminates the associated
Will this call the `destroy()` method on any prompt API handlers, or does that only happen if the prompt instance is destroyed?

Are there any other considerations we should make for how the utility process should handle shutdown, or is that already covered by existing utility process APIs?
The `localAIHandler` module has the following methods:

##### `localAIHandler.setPromptAPIHandler(handler)`

* `handler` Function\<[LanguageModel](https://github.com/webmachinelearning/prompt-api?tab=readme-ov-file#full-api-surface-in-web-idl)\> | null
Would it be possible to make this async? Since we're just getting a `webContentsId`, it might be necessary to make an IPC call to the main process to gather other info.
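An async handler would just mean the factory returns a promise. Sketch under stated assumptions: `fetchSenderInfo` stands in for the IPC round-trip to the main process and is not a real Electron API, and the returned object is a placeholder for the actual `LanguageModel` implementation:

```typescript
// Hypothetical helper: in a real handler this would be an async IPC call
// to the main process to look up details for the given webContentsId.
async function fetchSenderInfo(webContentsId: number): Promise<{ url: string }> {
  return { url: 'https://app.example.com/' }; // stubbed result
}

// The handler itself is async, so registration would need to await it
// (or the API would need to accept a Promise-returning handler).
async function promptAPIHandler(webContentsId: number): Promise<object> {
  const info = await fetchSenderInfo(webContentsId);
  if (!info.url.startsWith('https://')) {
    throw new Error('Prompt API not allowed for this sender');
  }
  return { /* LanguageModel implementation would go here */ };
}
```

The cost is that every `LanguageModel.create` from the renderer now blocks on that round-trip, which may argue for caching the lookup.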
We may eventually want to add other parameters in the future, so it might be more future-proof if we pass a dictionary to the handler rather than positional parameters.
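A dictionary-style signature keeps the handler stable as parameters grow, since new fields are additive. The field names below are illustrative guesses based on the RFC's current parameters, not a settled API:

```typescript
// Hypothetical options bag passed to the handler in place of
// positional (webContentsId, securityOrigin, ...) arguments.
interface PromptAPIHandlerOptions {
  webContentsId: number | null;
  securityOrigin: string;
  // Future fields can be added here without breaking existing handlers.
}

function promptHandler({ webContentsId, securityOrigin }: PromptAPIHandlerOptions): string {
  // Handlers destructure only what they need; unknown fields are ignored.
  return `${securityOrigin}#${webContentsId ?? 'worker'}`;
}
```

Existing handlers keep working when new fields land, because destructuring ignores properties the handler doesn't name.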
The Prompt API is part of the Built-in AI capabilities in Chromium-based browsers. With the Prompt API, developers can send natural-language requests from the renderer for local processing. This proposal adds support for this API in Electron.