Merge pull request #1168 from alexarchambault/update-documentation
Add advanced installation page
alexarchambault committed Feb 15, 2024
2 parents 3ac41f3 + 86d1887 commit 92fef56
Showing 3 changed files with 278 additions and 7 deletions.
271 changes: 271 additions & 0 deletions docs/pages/install-advanced.md
@@ -0,0 +1,271 @@
---
title: Advanced
---

Before diving into how Almond starts, a few words about the Almond installation.
Installing Almond can look complex in some respects. Beyond that, customizing
the installation, or making sense of what it does in detail, can be tricky.
This stems from two concerns that Almond tries to reconcile:
- isolation of the internal dependencies of Almond from users
- allowing the customization / tweaking of the Almond class path upon installation

## Isolation of Almond internal dependencies

Almond depends on fs2 and cats-effect. In order not to force its own cats / cats-effect /
fs2 versions upon users, these internal dependencies of Almond are effectively hidden
from users.

In more detail, isolation in Almond works this way: the
[`scala-kernel-api` module](https://repo1.maven.org/maven2/sh/almond/scala-kernel-api_@SCALA213_VERSION@)
and its dependencies are user-facing. The rest of the Almond class path
(the [`scala-kernel` module](https://repo1.maven.org/maven2/sh/almond/scala-kernel_@SCALA213_VERSION@)
and its dependencies, except for everything already pulled in by `scala-kernel-api`) consists of internal
dependencies, hidden from users.

Isolation between these two sets of dependencies is achieved by starting Almond using
[launchers generated by coursier](https://get-coursier.io/docs/cli-bootstrap). (Note that
the coursier documentation doesn't really detail how dependency isolation works. We detail
that right below.)

If we put dependency isolation aside, each of these launchers is a small Java application, alongside a
resource file listing the URLs of the JARs that should be loaded. Upon startup, the launcher reads
the URL list, checks whether these JARs are in the coursier cache, and downloads them if they're not. Then it
creates a `ClassLoader`, loads the local copies of the JARs in it, and loads and calls the main
class of the application from it.
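
The last two steps of that startup sequence can be sketched in a few lines of Scala. This is only an illustration, not coursier's actual launcher code: `MiniLauncher`, `loaderFor` and `runMain` are hypothetical names, and the JARs are assumed to already be available locally (reading the URL list, the cache lookup and the download are left out).

```scala
import java.io.File
import java.net.URLClassLoader

// Hypothetical sketch of a non-isolating launcher (not coursier's code).
object MiniLauncher {
  // Builds a class loader over local copies of the JARs.
  def loaderFor(jars: Seq[File], parent: ClassLoader): URLClassLoader =
    new URLClassLoader(jars.map(_.toURI.toURL).toArray, parent)

  // Loads `mainClass` from `loader` and calls its static `main` method.
  def runMain(loader: ClassLoader, mainClass: String, args: Array[String]): Unit = {
    val main = loader.loadClass(mainClass).getMethod("main", classOf[Array[String]])
    // `args` is passed as the single argument of `main`
    main.invoke(null, args)
    ()
  }
}
```

A launcher built this way would call `loaderFor` with the cached JARs, then `runMain` with the name of the application's main class.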

With dependency isolation enabled, these launchers actually embed two lists of JARs: a "top"
one (for Almond, the user-facing dependencies), and a "bottom" one (for Almond, the internal
dependencies, with cats-effect, fs2, etc.). Upon startup, they create a first class loader
with the top dependencies, and
a second one, having the first class loader as a parent, with the bottom dependencies. In such
cases, the way class loaders work on the JVM makes the classes loaded in the bottom class loader
"see" the classes in the top one, but not the other way around. Later on,
when the app runs, it can ask for the class loader of a class it knows is part of the top
dependencies, which gives it a reference to the top class loader, which knows nothing about the
bottom dependencies.
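
The one-way visibility between the two class loaders can be observed with plain JDK classes. The sketch below is a simplified stand-in, not what coursier launchers do: it uses temporary directories with a marker file instead of JARs, and looks up resources rather than classes (both follow the same parent-first delegation). `IsolationDemo` and its helpers are hypothetical names.

```scala
import java.net.URLClassLoader
import java.nio.file.{Files, Path}

// Hypothetical demo of parent-first delegation between two class loaders.
object IsolationDemo {
  // Creates a temporary directory containing one marker file.
  // The directory stands in for a list of JARs.
  def dirWith(resource: String): Path = {
    val dir = Files.createTempDirectory("isolation-demo")
    Files.write(dir.resolve(resource), "marker".getBytes("UTF-8"))
    dir
  }

  // A class loader over a single directory, delegating to `parent` first.
  def loaderOver(dir: Path, parent: ClassLoader): URLClassLoader =
    new URLClassLoader(Array(dir.toUri.toURL), parent)

  // Whether `loader` (or one of its parents) can find `resource`.
  def sees(loader: ClassLoader, resource: String): Boolean =
    loader.getResource(resource) != null

  def main(args: Array[String]): Unit = {
    // A null parent means only the JVM's bootstrap classes sit above this loader.
    val top = loaderOver(dirWith("top.txt"), null)
    val bottom = loaderOver(dirWith("bottom.txt"), top)

    println(sees(bottom, "top.txt")) // true: lookups go through the parent first
    println(sees(top, "bottom.txt")) // false: a parent never asks its children
  }
}
```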

Almond relies on that mechanism to get a class loader that only knows about user-facing
dependencies. That class loader is used as a parent of the class loader that's going to
load the classes generated during the session (those corresponding to the input user
code in the notebook). That way, users only "see" user-facing dependencies, not internal
ones, and they can load whichever version they want of the libraries that Almond uses internally.
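
As a rough sketch of that idea (Almond's actual session class loaders are more involved, and the names below are hypothetical), the user-facing class loader can be obtained from any class known to belong to the user-facing dependencies, then used as the parent of the loader holding session classes:

```scala
import java.net.{URL, URLClassLoader}

// Hypothetical sketch, not Almond's actual code.
object SessionLoaders {
  // `apiClass` is any class known to come from the user-facing dependencies.
  // Its class loader is the "top" one, which cannot see internal dependencies.
  def userFacingLoader(apiClass: Class[_]): ClassLoader =
    apiClass.getClassLoader

  // Loader for classes compiled from notebook cells: it resolves
  // `sessionClassPath` itself and delegates everything else to the top loader.
  def sessionLoader(apiClass: Class[_], sessionClassPath: Seq[URL]): URLClassLoader =
    new URLClassLoader(sessionClassPath.toArray, userFacingLoader(apiClass))
}
```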

## About the installation process

When passed the `--install` option, Almond tries to determine the path of its own launcher,
and copies it alongside the `kernel.json` file it generates for Jupyter to be able to launch
Almond. That way, the launcher used during the installation can be safely deleted right after
the installation.

## Creating an Almond launcher and installing it

This section describes how to install Almond for a specific Scala version. Even
though the instructions below may rely on newer coursier features, installing Almond
this way has been the recommended way to proceed ever since Almond has existed. In the next section,
we'll describe a novel way to install Almond, relying on an intermediate launcher, that allows
notebook users to customize the Scala version, the JVM they use, and Java options (including
memory options), on a per-notebook basis.

### `cs launch --use-bootstrap`

The easiest way to generate an Almond launcher consists in… not generating one. The
`cs launch` command, when passed `--use-bootstrap`, launches an application via a
temporary launcher it generates on-the-fly.

```text
$ cs launch --use-bootstrap almond:@VERSION@ --scala @SCALA213_VERSION@ -- --install
```

Note the use of `--`: arguments before it are arguments for `cs`, those after it are arguments
for Almond.

To list the options that the launcher accepts, run
```text
$ cs launch --use-bootstrap almond:@VERSION@ --scala @SCALA213_VERSION@ -- --help
```

### `cs bootstrap`

As an alternative to `cs launch --use-bootstrap`, you can generate a launcher with
`cs bootstrap`, then launch it on your own:
```text
$ cs bootstrap almond:@VERSION@ --scala @SCALA213_VERSION@ -o almond
$ ./almond --install
$ rm -f almond
```

## Creating an Almond launcher and installing it - newer launcher

The newer launcher allows users to configure Almond in the first cells of notebooks
(more precisely, before any actual code - that is, not comments or blank lines - needs to be compiled), with
directives like
```scala
//> using scala "2.12"
//> using scala "@SCALA212_VERSION@"
//> using jvm "17"
//> using javaOpt "-Xmx10g"
//> using javaOpt "-Dfoo=bar"
```

The newer launcher can be installed with
```text
$ cs launch --use-bootstrap sh.almond::launcher:@VERSION@ -- --scala @SCALA213_VERSION@ --install
```
(`sh.almond:launcher_3:@VERSION@` also works as a dependency, in case `cs` has trouble working out the
`_3` suffix on its own)

Just like above, you can also generate a launcher on your own first, then use it to install Almond:
```text
$ cs bootstrap sh.almond::launcher:@VERSION@ -o almond
$ ./almond --scala @SCALA213_VERSION@ --install
$ rm -f almond
```

One important thing to note here: the Scala version must be passed as an argument to Almond itself, rather than
to `cs`. The launcher uses its own Scala version (Scala 3), then, later on, spawns the same Almond kernel as above
on its own, which takes over as a kernel. The argument passed via `--scala` corresponds to the default
Scala version that will be used if users don't specify a version of their own via a directive like
`//> using scala "@SCALA213_VERSION@"`. Passing that option is optional: without it, Almond defaults
to the Scala 3 version that Almond uses at the time of its release (which should be the latest stable Scala 3
version at the time of the release).

To list the options that the launcher accepts, run
```text
$ cs launch --use-bootstrap sh.almond::launcher:@VERSION@ -- --help
```

Note that some options can be passed through to the kernel that the launcher spawns when a notebook
starts. Put them after another `--`, like
```text
$ cs launch --use-bootstrap sh.almond::launcher:@VERSION@ -- --scala @SCALA213_VERSION@ --install -- --toree-api
```

To list the options that can be passed this way, run
```text
$ cs launch --use-bootstrap almond:@VERSION@ --scala @SCALA213_VERSION@ -- --help
```

## Custom URL protocol support

Custom protocol support for `java.net.URL` needs the JARs supporting it to be passed to the `-cp`
option of `java`. Using the new launcher above, such JARs can be passed via `--extra-startup-class-path`, like
```text
$ cs launch --use-bootstrap sh.almond::launcher:@VERSION@ -- \
--install \
--scala @SCALA213_VERSION@ \
--extra-startup-class-path "$(cs fetch io.get-coursier:s3-support:0.1.0)"
```
(In this example, we pass the JARs of [coursier s3-support](https://github.com/coursier/s3-support) to `-cp` via
`--extra-startup-class-path`.)
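
Once the kernel is installed, one way to check from a notebook cell whether a protocol is supported is to try building a URL that uses it. The sketch below is only a suggestion: `UrlProtocolCheck` and the example URLs are hypothetical. On a JVM started without extra handlers, the `s3` check is expected to fail, since the JDK ships no handler for that protocol.

```scala
import java.net.URI
import scala.util.Try

// Hypothetical helper, not part of Almond.
object UrlProtocolCheck {
  // True if this JVM can build a java.net.URL for `url`, that is,
  // if a handler for its protocol is available.
  def supported(url: String): Boolean =
    Try(new URI(url).toURL).isSuccess

  def main(args: Array[String]): Unit = {
    println(supported("file:///tmp/data.csv")) // protocol built into the JDK
    println(supported("s3://bucket/data.csv")) // needs an extra handler
  }
}
```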

## Enabling Toree compatibility

### Core magics

Almond has some support for [Toree](https://github.com/apache/incubator-toree) ["magics"](https://github.com/apache/incubator-toree/blob/5b19aac2e56a56d35c888acc4ed5e549b1f4ed7c/etc/examples/notebooks/magic-tutorial.ipynb).
For instance, this support makes it easier to migrate from Toree to Almond.

Enable it with the `--toree-magics` option, which both the former and the new Almond launcher accept:
```text
$ cs launch --use-bootstrap sh.almond::launcher:@VERSION@ -- \
--install \
--scala @SCALA213_VERSION@ \
--toree-magics
```

As an alternative to `--toree-magics`, pass `--toree-compatibility`, which enables both `--toree-magics`
and `--toree-api` (see below).

You can then use magics like `%AddDeps`, `%AddJar`, `%LsMagic`, etc., in notebook cells.

### Spark magics

Support for the `%sql` magic, relying on Spark, needs to be enabled from a "predef" script. It also
requires parts of the Toree magics support to be put in the user-facing part of the Almond class path
(so that calls to it from the user side are picked up by Almond internals later on).

To enable that from a former launcher, pass `--shared sh.almond::toree-hooks` when generating the launcher:
```text
$ cs launch --use-bootstrap almond:@VERSION@ --shared sh.almond::toree-hooks --scala @SCALA213_VERSION@ -- --install
```

To enable it from a new launcher, pass `--shared-dependencies sh.almond::toree-hooks:_` to the launcher, like
```text
$ cs launch --use-bootstrap sh.almond::launcher:@VERSION@ -- \
--install \
--scala @SCALA213_VERSION@ \
--shared-dependencies sh.almond::toree-hooks:_ \
--toree-magics
```

Then, from a predef script, call
```scala
almond.spark.ToreeSql.setup()
```

Complete example with the new launcher:
```text
$ cat predef.sc
import $ivy.`sh.almond::almond-toree-spark:@VERSION@`
almond.spark.ToreeSql.setup()
$ cs launch --use-bootstrap sh.almond::launcher:@VERSION@ -- \
--install \
--scala @SCALA213_VERSION@ \
--shared-dependencies sh.almond::toree-hooks:_ \
--toree-magics \
--predef predef.sc
```

### Toree API

Enable basic support for the Toree API with `--toree-api`.

Both the former and the new Almond launcher accept this option:
```text
$ cs launch --use-bootstrap sh.almond::launcher:@VERSION@ -- \
--install \
--scala @SCALA213_VERSION@ \
--toree-api
```

As an alternative to `--toree-api`, pass `--toree-compatibility`, which enables both `--toree-magics` (see above)
and `--toree-api`.

Some Toree API calls are then supported:
```scala
kernel.display.html("<b>foo</b>")
```

## Using custom Maven repositories

If you want to use custom Maven repositories rather than Maven Central, you need to set
[`COURSIER_REPOSITORIES`](https://github.com/coursier/coursier/blob/3e212b42d3bda5d80453b4e7804670ccf75d4197/doc/docs/other-repositories.md)
in a few places: when installing Almond, and when Jupyter starts Almond.

This can be achieved the following way, with the former launcher:
```text
$ export COURSIER_REPOSITORIES="ivy2Local|https://artifacts.company.com/maven"
$ cs launch --use-bootstrap almond:@VERSION@ --scala @SCALA213_VERSION@ -- \
--install \
--env "COURSIER_REPOSITORIES=$COURSIER_REPOSITORIES"
```

`--env` sets environment variables in the kernel spec that gets written when installing Almond. Jupyter
sets those prior to launching Almond when users open notebooks.

## Available options

To list the options that the former launcher accepts, run
```text
$ cs launch --use-bootstrap almond:@VERSION@ --scala @SCALA213_VERSION@ -- --help
```

To list the options that the newer launcher accepts, run
```text
$ cs launch --use-bootstrap sh.almond::launcher:@VERSION@ -- --help
```

Note that the newer launcher also accepts the former launcher options after an extra `--`, like
```text
$ cs launch --use-bootstrap sh.almond::launcher:@VERSION@ -- --scala @SCALA213_VERSION@ --install -- --toree-api
```
12 changes: 6 additions & 6 deletions docs/pages/quick-start-install.md
@@ -19,7 +19,7 @@ like
```text
$ curl -Lo coursier https://git.io/coursier-cli
$ chmod +x coursier
- $ ./coursier launch --fork almond -- --install
+ $ ./coursier launch --use-bootstrap almond -- --install
$ rm -f coursier
```

@@ -28,21 +28,21 @@ from the ones handled by coursier.

You can specify explicit Almond and / or Scala versions, like
```text
- $ ./coursier launch --fork almond:0.10.0 --scala 2.12.11 -- --install
+ $ ./coursier launch --use-bootstrap almond:0.10.0 --scala 2.12.11 -- --install
```

Short Scala versions, like just `2.12` or `2.13`, are accepted too.
The available versions of Almond can be found [here](https://github.com/almond-sh/almond/releases).
Not all Almond and Scala versions combinations are available.
- See the possible combinations [here](install-versions.md)).
+ See the possible combinations [here](install-versions.md).


<details>
<summary>Equivalent Windows command</summary>
```bat
> bitsadmin /transfer downloadCoursierCli https://git.io/coursier-cli "%cd%\coursier"
> bitsadmin /transfer downloadCoursierBat https://git.io/coursier-bat "%cd%\coursier.bat"
- > .\coursier launch --fork almond -M almond.ScalaKernel -- --install
+ > .\coursier launch --use-bootstrap almond -M almond.ScalaKernel -- --install
```
</details>

Expand All @@ -52,12 +52,12 @@ Once the kernel is installed, you can use it within Jupyter or nteract.

Pass `--help` instead of `--install`, like
```text
- $ ./coursier launch --fork almond -- --help
+ $ ./coursier launch --use-bootstrap almond -- --help
```

## Update the almond kernel

- To update the almond kernel, just re-install it, but passing the `--force` option to almond (like `./coursier launch --fork almond -- --install --force`). That will override any previous almond (or kernel with name `scala`).
+ To update the almond kernel, just re-install it, but passing the `--force` option to almond (like `./coursier launch --use-bootstrap almond -- --install --force`). That will override any previous almond (or kernel with name `scala`).

## Uninstall the almond kernel

2 changes: 1 addition & 1 deletion docs/website/sidebars.json
@@ -2,7 +2,7 @@
"docs": {
"Intro": ["intro"],
"Try it": ["try-mybinder", "try-docker", "try-deepnote"],
- "Installation": ["quick-start-install", "install-options", "install-multiple", "install-versions", "install-other"],
+ "Installation": ["quick-start-install", "install-options", "install-multiple", "install-versions", "install-other", "install-advanced"],
"Usage": ["usage-plotting", "usage-spark"],
"User API": ["api", "api-ammonite", "api-jupyter", "api-access-instances"],
"Development": ["dev-from-sources", "dev-custom-kernel", "dev-libraries", "dev-website"]