Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Configure Pyolite to pre-install packages #811

Closed
wants to merge 8 commits into from

Conversation

vasiljevic
Copy link
Contributor

References

This is proposed solution for jupyterlite/pyodide-kernel#60

Code changes

The central code change is in the PyoliteRemoteKernel.initKernel() method. Those lines are added:

    for (const packageName of options.pipliteRequiredPackages) {
      await this._pyodide.runPythonAsync(`
         await piplite.install(['${packageName}'], keep_going=True);
      `);
    }

The idea is to install configurable packages in the same way as some other packages are already pre-installed (in the same method above the added lines).

New option pipliteRequiredPackages may be configured in jupyter-lite.json like:

        "litePluginSettings": {
          "@jupyterlite/pyolite-kernel-extension:kernel": {
            "pipliteRequiredPackages": ["openpyxl"]
          }
        }

All other changes implement propagation of the new option from the runtime configuration to the options parameter of the initKernel() method.

Documentation is also added in docs/howto/python/packages.md.

User-facing changes

A user who build JupyterLite website may configure packages to be pre-installed. Then a who access the JupyterLite website does not need to do piplite.install for those packages and may just import them.

Backwards-incompatible changes

No backwards-incompatible changes

@github-actions
Copy link
Contributor

lite-badge 👈 Try it on ReadTheDocs

@vasiljevic
Copy link
Contributor Author

vasiljevic commented Sep 23, 2022

I have just added the commit 'Make matplotlib optional but included by default'. Currently, there is hard-coded pre-installation of matplotlib.

When we introduce an option to configure what packages will be pre-installed, it makes sense to let matplotlib be configurable. The reason to include this change in this PR is backward compatibility: this change is backwards compatible now, but will not be backwards compatible later.

In other words, we should decide now will we instruct those who use pipliteRequiredPackages parameter to put matplotlib in the list when matplotlib is needed.

Update

I have to revert the commit 'Make matplotlib optional but included by default' since pyolite package depends on matplotlib and pillow (pyolite patches matplotlib and PIL modules)

@jtpio
Copy link
Member

jtpio commented Oct 15, 2022

Thanks @vasiljevic for opening this PR.

Although there are good reasons for not encouraging code to be run silently (as discussed in jupyterlite/pyodide-kernel#60), the new config option looks simple enough and opt-in. This would allow site deployers the ability to configure it if they want to. But it could indeed make it more difficult to troubleshoot in case something goes wrong during the kernel initialization.

If things move forward, it would be interesting to check how to unify this with what xeus-python offers: https://xeus-python-kernel.readthedocs.io/en/latest/configuration.html#pre-installed-packages

@bollwyvl what is your opinion on this? Maybe something that could be more easily iterated on once Pyolite is moved to its own repo (#386).

@jtpio
Copy link
Member

jtpio commented Oct 15, 2022

Also the UI tests failure indicate some issue while executing a cell, which looks relevant:

image

@jtpio jtpio added the enhancement New feature or request label Oct 15, 2022
@bollwyvl
Copy link
Collaborator

I'll say it again, If stuff breaks at this level, it leaves the kernel in an unrecoverable statte, and you can never get out. I don't want this repo to have to field support for those kinds of failures.

We can leave more hooks open to allow these things with additional extensions, but it should not be in core.

@vasiljevic
Copy link
Contributor Author

Also the UI tests failure indicate some issue while executing a cell, which looks relevant:

Thank you @jtpio, I have overlooked that pyolite package depends on matplotlib and pillow since since pyolite need to patch matplotlib and PIL modules. I have reverted the commit 'Make matplotlib optional but included by default' and tests have passed now.

BTW, how can I access images and videos from UI test report on GitHub? I this particular case I could not reproduce the error since it is race condition related and tests had not failed in my local environment.

@vasiljevic
Copy link
Contributor Author

If things move forward, it would be interesting to check how to unify this with what xeus-python offers: https://xeus-python-kernel.readthedocs.io/en/latest/configuration.html#pre-installed-packages

This feature is why I like xeus-python. But xeus-python do package installation in build time by using empack as opposed to Pyodide. It would be nice if there was an easy way to create a custom Pyodide distribution and integrate it into pyolite build, just like it works for xeus-python.

Before Pyodide offers an easy build of custom distributions, we can achieve similar behavior to XeusPythonEnv.packages by enhanced version of LiteBuildConfig.piplite_urls that additionally pre-installs appropriate packages in runtime. In this case a solution like one proposed in this PR may implement the runtime part of pre-installation. Then we can also unify configuration syntax for pre-installing packages in xeus-python and Pyodide, but to use different solutions under the hood.

@vasiljevic
Copy link
Contributor Author

vasiljevic commented Oct 15, 2022

I'll say it again, If stuff breaks at this level, it leaves the kernel in an unrecoverable statte, and you can never get out. I don't want this repo to have to field support for those kinds of failures.

We can leave more hooks open to allow these things with additional extensions, but it should not be in core.

I fully understand your concerns, but I just have not found better way to run package pre-installation.

The major constraint is that pre-installation must happen before anything is executed in kernel. During the Pyolite kernel initialization a pending execution may occur and stay awaiting at the beginning of the PyoliteRemoteKernel.execute() method. The last statement in the Pyolite kernel initialization is this._initializer?.resolve(). When that statement finishes, the pending execution starts immediately. Thus, we need to pre-install packages before this._initializer?.resolve().

We can put such configurable staff just before this._initializer?.resolve(), can properly catch all exceptions, etc, but looks like those staff cannot be placed after the kernel initialization.

@yuvipanda
Copy link

With my deployer hat on, this is the absolute one single feature that means I must use XPython now for end users to have a decent experience, even if that means I've to spend a bunch of time adding packages to conda-forge. That's super un-ideal, as PyPI will (IMO) always be a superset of (python) packages available on conda-forge. I don't have enough of an understanding of how this should be implemented here, and understand the difficulties with error recovery. However, from a 'can end users actually use this?' perspective, this would go a long way in helping us actually deploy Pyodide (vs xeus-python)

@jtpio
Copy link
Member

jtpio commented Nov 18, 2022

We can leave more hooks open to allow these things with additional extensions, but it should not be in core.

Maybe a good compromise would indeed be to leave some hooks. Something that could be done after #854

@tomjakubowski
Copy link
Contributor

tomjakubowski commented Oct 13, 2023

Our site's build pulls some tricks to pre-install wheels - in short, at build time we boot up a pyodide interpreter with node, install the packages we need with piplite, and then call micropip.freeze() to generate a new repodata.json file which replaces the repodata.json in the final build.

leave some hooks [for an addon]

I'm interested in working on this - are hooks for a third-party "pre-installer addon" still the way you all envision supporting this feature?

@jtpio
Copy link
Member

jtpio commented Mar 27, 2024

Thanks all for the discussion.

Support for loadPyodide landed in jupyterlite/pyodide-kernel#103, which should help support this use case at some point. So it looks like we can close this PR for now, as this is related to the Pyodide kernel only.

@vasiljevic if you would like to continue exploring and discussing this in https://github.com/jupyterlite/pyodide-kernel, please feel free to open an issue or comment in jupyterlite/pyodide-kernel#40. Thanks!

@jtpio jtpio closed this Mar 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants