-
Notifications
You must be signed in to change notification settings - Fork 63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add build-time execution #856
Conversation
type ICellBlockOutput = GenericNode & { | ||
data: IOutput[]; | ||
}; | ||
|
||
type ICellBlock = GenericNode & { | ||
children: (Code | ICellBlockOutput)[]; | ||
}; | ||
|
||
function isCellBlock(node: GenericNode): node is ICellBlock { | ||
return node.type === 'block' && select('code', node) !== null && select('output', node) !== null; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a better way to type this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Things look good?!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fwkoch did we have a kind
on the block?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are tagging executable code blocks with data.type = 'notebook-code'
- From notebooks - https://github.com/executablebooks/mystmd/blob/main/packages/myst-cli/src/process/notebook.ts#L24-L27 and from code cell directives - https://github.com/executablebooks/mystmd/blob/main/packages/myst-directives/src/code.ts#L215
However, I'm not sure we are relying on this field anywhere? In the myst-theme execution code at least, we find executable blocks in a similar way to your implementation: https://github.com/executablebooks/myst-theme/blob/855f29ec42a77deaa145481bfcab25b2d3c53757/packages/jupyter/src/execute/utils.ts#L8 The only difference is that function supports executable figures, i.e. container
nodes with code/output children, in addition to blocks.
@fwkoch I've pinged you for review to hopefully course correct anything that's not looking like it's going in the right direction. At a high level, this is my thinking about the transforms / user flow:
In getting the current WIP working, I switched from finding My plan is to slightly re-work inline expressions too, so that the "rendering" happens at a similar stage as the output rendering — inline expressions should likely be minified. |
At this stage, environment variables can be set to point to an existing Jupyter Server instance. In future, these will be configurable and optional (we could spin up our own server). |
I've now unified the treatment of outputs and expression results such that the notebook processor just populates these from the notebook itself (e.g. metadata). A later stage would be required to minify these, and then "simplify" them for rendering. @rowanc1 could you elaborate on the purpose of them minimisation? I assume it's to make manipulations of the AST less memory intensive? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few questions at this stage about where we are minifying outputs and rendering inline expressions/adding IDs etc.
Not sure if this is fully working yet, but looks like we have a cache, pulling out executable pieces, executing those against a kernel, and providing results.
Another thought as I am looking through this, I wonder if this would be good as a separate package? myst-execute
maybe?
type ICellBlockOutput = GenericNode & { | ||
data: IOutput[]; | ||
}; | ||
|
||
type ICellBlock = GenericNode & { | ||
children: (Code | ICellBlockOutput)[]; | ||
}; | ||
|
||
function isCellBlock(node: GenericNode): node is ICellBlock { | ||
return node.type === 'block' && select('code', node) !== null && select('output', node) !== null; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Things look good?!
The minimize outputs takes the encoded outputs (base64 images, html stuff, raw logs, etc.) and pulls them out of the mdast and writes them to disk. This is needed for writing latex, word, typst, etc. and for html is also more efficient for network requests and lazy loading of images in a long article. We are also transforming the images into other formats (webp), which further improves load time. All of that needs them on disk. |
0f27917
to
b18370e
Compare
async jupyterSessionManager(): Promise<SessionManager | undefined> { | ||
if (this._jupyterSessionManager !== null) { | ||
return Promise.resolve(this._jupyterSessionManager); | ||
} | ||
try { | ||
const partialServerSettings = await new Promise<JupyterServerSettings>( | ||
async (resolve, reject) => { | ||
if (process.env.JUPYTER_BASE_URL === undefined) { | ||
resolve(findExistingJupyterServer() || (await launchJupyterServer(this.contentPath(), this.log))); | ||
} else { | ||
resolve({ | ||
baseUrl: process.env.JUPYTER_BASE_URL, | ||
token: process.env.JUPYTER_TOKEN, | ||
}); | ||
} | ||
}, | ||
); | ||
const serverSettings = ServerConnection.makeSettings(partialServerSettings); | ||
const kernelManager = new KernelManager({ serverSettings }); | ||
const manager = new SessionManager({ kernelManager, serverSettings }); | ||
// TODO: this is a race condition, even though we shouldn't hit if if this promise is actually awaited | ||
this._jupyterSessionManager = manager; | ||
return manager; | ||
} catch { | ||
this._jupyterSessionManager = undefined; | ||
return undefined; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does not feel elegant to me, but I want to get "something" in place for us to talk about.
…nto agoose77/feat-build-execution
OK, at this point it seems like this is behaving as expected (though with no user configurability besides an ENV var). It will be good to discuss this on Monday :) |
@agoose77 re: |
I'd also just like to keep this visible: https://github.com/executablebooks/mystmd/blob/aa335d748edd4d636ad117a216ba78c1c1283e4f/packages/myst-cli/src/process/mdast.ts#L232 i.e. text-based notebooks are flagged as executable here and jupytext / myst text-based notebooks should probably be a test case for local execution too |
const serverSettings = ServerConnection.makeSettings(partialServerSettings); | ||
const kernelManager = new KernelManager({ serverSettings }); | ||
const manager = new SessionManager({ kernelManager, serverSettings }); | ||
// TODO: this is a race condition, even though we shouldn't hit if if this promise is actually awaited |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
await manager.ready
in turn does await the kernelManager.ready
if that's what you mean
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
callers of this function would need to know to await (await jupyterSessionManager()).ready
unless you await manager.ready
in here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this instance it's that there's no logic to avoid creating two managers if one doesn't properly await the promise to session.jupyterSessionManager()
. i.e., between the first await and finally setting this._jupyterSessionManager
.
This PR is a jumping off point for execution work, and should make it possible to scope discussions to just interested parties.
Closes #839, #550