Documentation: jupyter doc & API discussion #269331

Open
5 tasks
teto opened this issue Nov 22, 2023 · 11 comments
Labels
6.topic: jupyter Interactive computing tooling: kernels, notebook, jupyterlab 9.needs: documentation

Comments

@teto
Member

teto commented Nov 22, 2023

Problem

I first thought I could rely on jupyterwith (now jupyenv, https://github.com/tweag/jupyenv), but it has been unreliable, and having strong Jupyter foundations in nixpkgs is helpful anyway.
Jupyter is so complex that it is the perfect ecosystem for nix to shine (notwithstanding the ever-evolving python packages).

Proposal

With 23.05 branched off, I wonder if anyone would be willing to start documenting, in the nixpkgs documentation, how to create a jupyter notebook with several kernels available?
I am currently trying to build one such notebook, so I could prepare a skeleton, but I might be short on time.
My goal would be to just get the ball rolling so that we have nice docs for 24.05.

@thomasjm you might be the most motivated with the quarto project ?

@natsukium @GTrunSec @GaetanLepage

  • use nixos/modules/services/development/jupyter/kernel-options.nix in jupyterLib with evalModules, outside the nixos module
  • convert the top-level jupyter (pkgs/development/python-modules/jupyter/default.nix) into an attrset?


@GTrunSec
Contributor

GTrunSec commented Nov 23, 2023

I embedded Quarto in Jupyenv several months ago, and it works perfectly.

  • https://gtrunsec.github.io/desci-workflow-template/
    As a long-time user and developer of JupyWithDataScienceEnv, I still have some doubts about building a notebook environment with multiple kernels solely using nixpkgs. It has certain limitations that I find challenging to overcome.

@natsukium
Member

natsukium commented Nov 23, 2023

It has certain limitations that I find challenging to overcome.

What are the limitations?
To be honest, I'm a light user of Jupyter, so I can't imagine a complex use case in the real world, but I'm interested in adopting it as much as possible.

While fixing pkgs.jupyter, I felt that I would like to unify its API with that of pkgs.jupyter-console.
In any case, it would be nice to refactor them, including the modules, before documenting them.

@teto teto added the 6.topic: jupyter Interactive computing tooling: kernels, notebook, jupyterlab label Nov 23, 2023
@teto
Member Author

teto commented Nov 27, 2023

I've tried to create a multi-kernel jupyter experience and it was a bit maddening. Here is what I ran into while writing the following shell:

let
  # mkPythonKernelDef and mkHaskellKernel are helper functions that are not shown here
  python3PkgsFn = ps: [
    ps.numpy
    ps.scipy
    ps.matplotlib
    ps.graphviz
    ps.tensorflow
    ps.scikit-learn
    ps.keras
  ];
  pyEnv = pkgs.python3.withPackages python3PkgsFn;

  kernel-definitions = {
    python3 = mkPythonKernelDef pyEnv;
    haskell = mkHaskellKernel ghcEnv;
  };

  # creates a folder with kernels/ subpath
  allKernels = pkgs.jupyter-kernel.create {
    definitions = kernel-definitions;
  };

  /* for now must contain ihaskell */
  ghcEnv = pkgs.haskellPackages.ghcWithPackages (p: with p; [
    aeson
    ihaskell
    ihaskell-blaze
  ]);

  /* jupyter console is an attrset of 2 functions */
  jup-console = pkgs.jupyter-console.mkConsole {
    definitions = kernel-definitions;
    # why ?
    kernel = null;
  };

in
pkgs.mkShell {
  buildInputs = [
    allKernels
    # -notebook
    pkgs.jupyter
    jup-console
  ];

  shellHook = ''
    echo "ohayo "
    echo "${allKernels} "
    export JUPYTER_PATH=${allKernels}
    export IPYTHONDIR=_ipythondir
    export JUPYTER_CONFIG_DIR=_jupyter_cfg
  '';
}

1/ When I enter the shell without setting the value of IPYTHONDIR, it gets set to /ipython, which triggers:

  jupyter-console 
/nix/store/5s5djlfcb1l655fwkydm8hrsksckqrh3-python3-3.11.6-env/lib/python3.11/site-packages/IPython/paths.py:69: UserWarning: IPython parent '/' is not a writable location, using a temp directory.
  warn("IPython parent '{0}' is not a writable location,"
/nix/store/r4rwxai2g45i3ax51ic7k6flsvhg3yz3-python3-3.11.6-env/bin/python3.11: No module named ipykernel_launcher
^CTraceback (most recent call last):
  File "/nix/store/1rh6y3hlhwj3idgg16149wg0nf5sw8vy-python3.11-jupyter-console-6.6.3/bin/.jupyter-console-wrapped", line 9, in <module>
  sys.exit(main())
  echo $IPYTHONDIR
  /ipython

2/ similarly, if I don't set JUPYTER_CONFIG_DIR I get:

jupyter-console

.py", line 26, in ensure_dir_exists
    os.makedirs(path, mode=mode)
  File "<frozen os>", line 225, in makedirs
PermissionError: [Errno 13] Permission denied: '/jupyter'
$ echo $JUPYTER_CONFIG_DIR 
/jupyter

3/ the ihaskell derivation in all-packages.nix is backwards: instead of being haskellPackages.ihaskell,
it's a python environment with an ihaskell kernel

4/ similarly, when I entered my shell and ran jupyter kernelspec list --debug,
it listed only a python kernel even though my JUPYTER_PATH contained both haskell and python.
It turns out that the jupyter wrapper sets the path (no prefix, no suffix) to:
"--set JUPYTER_PATH ${jupyter-kernel.create { inherit definitions; }}"
While discarding the environment's JUPYTER_PATH could be defensible from a purity point of view, I found that surprising too. The top-level jupyter should have no kernel by default, and the API should make it clearer that it embeds kernels.
Something like jupyter.withKernels() or wrapJupyter jupyter kernels would do that; jupyter.override { definitions ? defaultDef } is treacherous.

5/ I have the same feeling about pkgs/applications/editors/jupyter/console.nix: it looks like it doesn't know what it wants to be.
It would be cleaner to have wrapJupyter jupyter-console kernels or jupyter-console.withKernels(kernels).
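
To make points 4/ and 5/ concrete, here is a minimal sketch of what a wrapJupyter helper could look like. The helper itself and its argument names (package, exe, definitions) are hypothetical and not part of nixpkgs; it assumes that pkgs.jupyter-kernel.create accepts a definitions attribute as used above, and that pkgs.jupyter-kernel.default holds the stock Python kernel definition. The point it illustrates is the use of --prefix rather than --set, so that a JUPYTER_PATH coming from the environment is extended instead of discarded.

```nix
{ pkgs ? import <nixpkgs> { } }:
let
  # Hypothetical helper, not part of nixpkgs: wraps one executable of a
  # Jupyter front end so that it also sees the given kernel definitions.
  wrapJupyter = { package, exe, definitions }:
    let
      # Builds a directory with a kernels/ subpath from the definitions.
      kernels = pkgs.jupyter-kernel.create { inherit definitions; };
    in
    pkgs.runCommand "${exe}-with-kernels"
      { nativeBuildInputs = [ pkgs.makeWrapper ]; }
      ''
        mkdir -p $out/bin
        # --prefix extends JUPYTER_PATH instead of replacing it, so kernels
        # supplied through the environment are still found.
        makeWrapper ${package}/bin/${exe} $out/bin/${exe} \
          --prefix JUPYTER_PATH : ${kernels}
      '';
in
wrapJupyter {
  package = pkgs.python3Packages.jupyter-console;
  exe = "jupyter-console";
  # Assumption: the default Python kernel definition shipped with nixpkgs.
  definitions = pkgs.jupyter-kernel.default;
}
```

Whether the wrapper should prefix, suffix, or fully replace JUPYTER_PATH is a design decision for the API discussion; the sketch only shows that the choice is a single makeWrapper flag.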

What I would like to do:

  1. add jupyter/lib.nix with some common functions
  2. add a jupyter/README.md on how to use the API (which we can migrate to the official doc later on)
  3. maybe standardize interpreter wrappers to have a .jupyterKernel attribute ?
  4. I quite like jupyter.withPackages(...), but to generalize this I wonder whether wrapJupyter jupyter kernels would not be easier. Opinions?
  5. remove pkgs/applications/editors/jupyter/console.nix ?

As for backward compatibility, we could throw errors that point to the new way of defining the equivalent?
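
As an illustration of that backward-compatibility idea, the sketch below shows an overlay in which an old attribute evaluates to an error that names its replacement. It is only a sketch: jupyter-all stands in for whichever attribute would be retired, and jupyter.withKernels is the API proposed in this thread, which does not exist yet.

```nix
# Overlay-style sketch. Because Nix is lazy, the error is raised only when
# someone actually evaluates the old attribute.
final: prev: {
  jupyter-all = throw ''
    'jupyter-all' has been removed. Use
    'jupyter.withKernels (ks: [ ks.clojure ks.octave ])' instead.
  '';
}
```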

I am willing to prepare an MR if there is interest/agreement.

@teto teto changed the title from "Documentation: add jupyter doc" to "Documentation: jupyter doc & API discussion" Nov 27, 2023
@teto
Member Author

teto commented Nov 27, 2023

To sum up, the saner approach is to ignore all the surprising top-level wrappers and use the base packages. This does what I expect in a shell: it finds the kernels in JUPYTER_PATH (in the final derivation, we would wrap it).

pkgs.mkShell {
  /* don't use the top-level 'jupyter' as it needs to be overridden with proper kernel-definitions */
  buildInputs = [
    allKernels

    pkgs.python3Packages.notebook
    pkgs.python3Packages.jupyter-console
  ];

  shellHook = ''
    echo "ohayo "
    echo "${allKernels} "
    export JUPYTER_PATH=${allKernels}
    export IPYTHONDIR=_ipythondir
    export JUPYTER_CONFIG_DIR=_jupyter_cfg
  '';
}

@thomasjm
Contributor

thomasjm commented Nov 27, 2023

Hmm, I don't think you should be running into this much trouble making an environment. If you look at the jupyter-all derivation, you can see how this is intended to work at present:

  jupyter-all = jupyter.override {
    definitions = {
      clojure = clojupyter.definition;
      octave = octave-kernel.definition;
      # wolfram = wolfram-for-jupyter-kernel.definition; # unfree
    };
  };

Now, there's a bit of a wrinkle where you're trying to add additional packages to your Python environment. My instinct would be to handle this as part of the Python kernel definition.

My general plan for this stuff has been as follows:

  • Move all scattered kernels currently in Nixpkgs into pkgs/applications/editors/jupyter-kernels. This would include the ipython kernel.
  • Create pkgs/applications/editors/jupyter-kernels/default.nix, containing a map of all kernel names => kernel definitions. Expose this in all-packages.nix as jupyter-kernels.
  • Firm up an API, which could perhaps look like this: jupyter.withKernels (ks: [ks.python ks.haskell ks.octave]). And similarly jupyter-console.withKernel, to which you can pass jupyter-kernels.whatever.
  • Add documentation etc.
  • Tackle further challenges like defining packages for kernels. You could imagine it looking like this:
# Jupyter lab with two kernels, each with custom packages
jupyter.withKernels (kernels: [
  (kernels.python.withPackages (ps: [ps.matplotlib ps.scipy]))
  (kernels.haskell.withPackages (ps: [ps.aeson]))
])

# Jupyter console with kernel with custom packages
jupyter-console.withKernel (jupyter-kernels.python.withPackages (ps: [ps.matplotlib ps.scipy]))
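
As a rough sketch of how the first steps of this plan could sit on top of what exists today, a withKernels function can be expressed with the current jupyter.override { definitions = ...; } mechanism. Everything named jupyter-kernels and withKernels here is hypothetical; only clojupyter.definition, octave-kernel.definition and the override itself are taken from the jupyter-all example above.

```nix
{ pkgs ? import <nixpkgs> { } }:
let
  inherit (pkgs) lib;

  # Hypothetical registry: each entry carries its own name, so that a list of
  # selected kernels can be turned back into an attribute set.
  jupyter-kernels = lib.mapAttrs (name: definition: { inherit name definition; }) {
    clojure = pkgs.clojupyter.definition;
    octave = pkgs.octave-kernel.definition;
  };

  # Hypothetical withKernels, expressed with the existing override.
  withKernels = select:
    pkgs.jupyter.override {
      definitions = lib.listToAttrs
        (map (k: lib.nameValuePair k.name k.definition) (select jupyter-kernels));
    };
in
# Should evaluate to the same definitions as the jupyter-all example.
withKernels (ks: [ ks.clojure ks.octave ])
```

Carrying the name inside each registry entry is one possible design; passing an attribute set instead of a list would avoid it, at the cost of a less withPackages-like interface.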

@teto
Member Author

teto commented Dec 2, 2023

Hmm, I don't think you should be running into this much trouble making an environment. If you look at the jupyter-all derivation, you can see how this is intended to work at present:

I don't think it should be that complex either. I don't like the use of "override" in your example; in my mind, it's reserved for changing things one normally shouldn't.

I think I agree with all your points. Some packages will have to be inherited from elsewhere, e.g., ihaskell derivation is generated from hackage. If @natsukium agrees let's do that ?

Also to keep track of my hardship/discoveries:

mkHaskellKernel = ghcEnv:
  {
    displayName = "Haskell";
    argv = [
      "${ghcEnv}/bin/ihaskell"

      # Without this line I can't import packages from ghcEnv.
      # The ihaskell flake does `-l $(${env}/bin/ghc --print-libdir)`;
      # we hardcode the (guessed) path here instead to avoid a wrapper.
      "-l"
      "${ghcEnv}/lib/ghc-${ghcEnv.version}"

      "kernel"
      "{connection_file}"
      "+RTS"
    ];
  };

To sum up, ihaskell is hard to use in nixpkgs because of this trick, and the flake implementation is convoluted.

@teto
Member Author

teto commented Dec 2, 2023

Also, my previous setup listing python packages would work with jupyter-console but not with jupyter-notebook, because it hit #255923: my shell had a plain python that couldn't find the files expected by notebook.
So I had to create a python environment, as done in the PR:

# I need an env else I hit https://github.com/NixOS/nixpkgs/issues/255923, aka
# the main python can't find notebook files.
pyEnv = pkgs.python3.withPackages (ps: [
  ps.notebook
  ps.jupyter-console
]);

but this would in turn break, because the (other) python environment used as my python kernel was missing ipykernel, which triggered:

# the kernel launches -m ipykernel_launcher, so if you don't add it
# you get /nix/store/r4rwxai2g45i3ax51ic7k6flsvhg3yz3-python3-3.11.6-env/bin/python3.11: No module named ipykernel_launcher

So I ended up with this flake, which works with jupyter-console and jupyter-notebook with both a python and a haskell kernel: https://gist.github.com/teto/4d12998d734f982e27f48d8bb001c8ae
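
The missing-ipykernel failure can be avoided by construction if the kernel helper always adds ipykernel to the environment it launches. The sketch below is one way to do that and is not the code from the gist: mkPythonKernelWithPackages is a hypothetical name, it takes a package-selection function rather than a ready-made environment, and the language and logo attributes are assumptions modelled on the displayName/argv shape used for the Haskell kernel above.

```nix
{ pkgs ? import <nixpkgs> { } }:
let
  # Hypothetical helper: takes a package-selection function (like the one
  # passed to withPackages) and always appends ipykernel, so that
  # `-m ipykernel_launcher` can be resolved by the kernel's interpreter.
  mkPythonKernelWithPackages = pkgsFn:
    let
      env = pkgs.python3.withPackages (ps: pkgsFn ps ++ [ ps.ipykernel ]);
    in
    {
      displayName = "Python 3";
      language = "python";
      argv = [
        env.interpreter
        "-m"
        "ipykernel_launcher"
        "-f"
        "{connection_file}"
      ];
      # Assumption: logo files as shipped inside the ipykernel package.
      logo32 = "${env}/${env.sitePackages}/ipykernel/resources/logo-32x32.png";
      logo64 = "${env}/${env.sitePackages}/ipykernel/resources/logo-64x64.png";
    };
in
pkgs.jupyter-kernel.create {
  definitions.python3 = mkPythonKernelWithPackages (ps: [ ps.numpy ps.matplotlib ]);
}
```

The result is a kernels directory that could be exported as JUPYTER_PATH in the shell shown earlier; the front-end environment (notebook, jupyter-console) stays separate from the kernel environment.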

There is no reason it should be this complex other than a lack of manpower. Now that we have a task force, let's make jupyter+nix enjoyable!

@teto
Member Author

teto commented Dec 13, 2023

@natsukium do you agree with the plan ?

At least with the easy part:

Create pkgs/applications/editors/jupyter-kernels/default.nix, containing a map of all kernel names => kernel definitions.
Expose this in all-packages.nix as jupyter-kernels.

I have some cycles to allocate to that. I am interested in adding functions such as mKernelForIhaskell to jupyter/lib.nix, for instance.

It might be interesting to create a #jupyter room in the nixpkgs matrix space, no ?

@natsukium
Member

Sorry, I missed the notification. The proposed plan seems fine to me. Could you please proceed as such?
#jupyter room sounds interesting.

@teto teto mentioned this issue Jan 2, 2024
13 tasks
@GTrunSec
Contributor

GTrunSec commented Jan 4, 2024

@teto Due to the long-standing lack of response from maintainers to my PRs in jupyenv, I've decided to fully support this issue. My current plan is to first migrate kernels to nixpkgs for unified maintenance.

@teto
Member Author

teto commented Jan 5, 2024

Thanks to the moderation team, we now have a room to discuss implementation details. I've invited all of you: https://matrix.to/#/#jupyter:nixos.org

4 participants