[pitch] Renku 2.0 Compute Sessions #35

220 changes: 220 additions & 0 deletions feature-pitches/010-compute-sessions/compute-sessions.md
@@ -0,0 +1,220 @@
# Renku 2.0 Compute Sessions

Authors: Laura Kinkead, Rok Roškar

> [!IMPORTANT]
> We are building the next version of Renku! (Currently referred to as Renku 2.0) Would you like to get involved in shaping the future of Renku? Interested in participating in our user research? Get in touch! hello@renku.io

## 🤔 Problem

Renku 2.0 needs sessions! But more than that, we want to improve on the current sessions, solving the following user problems:

- When I am starting a new project, I want to **start out with a basic compute environment** (for example with a pre-built data science stack), and I’ll decide what extra packages I need later.
- When I create a new Renku project, I want to get started right away with a pre-built environment: **I don’t want to have to wait for the environment image to be built** to get started.
- When I start a session, I want to **clearly understand what environment image the session will use and what is installed in it**, so I am not confused where the compute environment comes from.
- When I go to launch a session on a project where I recently made an edit to a code file that does not impact the project environment, **I don’t want to have to wait for the image to re-build** since the file I changed didn’t change the environment anyway.

There are a few ways we want 2.0 Environments to work in the future that we won’t be able to make work right away (these are *out of scope* for this pitch), but I include them here to paint a bigger picture and give some extra context for the direction we want Environments to go.

- When I have already created a Docker image for my project that I use in other places, **I want Renku to use that pre-existing image** so I don’t have to re-create a new environment.
- When I know my teammate has already created a great environment they do their work in, I want to **search and reuse that environment** in my own project in RenkuLab.
- When I create a dashboard in a Renku Project, I want to **provide a separate entry point for app viewers** that does not reveal the development environment I use to build the app, so that my app consumers are not confused.
- When I use my Renku project image in other places, I want the **image to be small** so it’s easy to move to other systems (aka not have Jupyter installed in it if I don’t need it).

## 🍴 Appetite

6 weeks.

## 🎯 Solution

In Renku 2.0 (generally, not already in this pitch!), we want to enable 2 key new features:

- Enabling more than one environment per project
- Sharing environments between projects

For these features, we need some new terminology.

### Terminology

#### Environment

An environment consists of a Docker image plus metadata. An environment is independent of any one project and can be reused in multiple projects.

An environment has the following properties:

- name
- an image/tag
- metadata:
  - who created it
  - date of creation
  - [optional] description (e.g. why it is useful)
  - [optional] keywords/tags to aid in searching (for example, I might tag an environment with `#python` `#pytorch` to give a quick impression of what is in the image)
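
As a rough illustration of the properties listed above, an environment could be modelled as a small record. This is only a sketch: the class name, field names, example image reference, and the `matches` helper are assumptions made for illustration and are not part of the pitch or of any existing Renku API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional, Tuple


@dataclass
class Environment:
    """Hypothetical model: a Docker image reference plus metadata."""

    name: str
    image: str  # image reference including the tag
    created_by: str
    created_at: datetime
    description: Optional[str] = None  # optional, e.g. why it is useful
    keywords: Tuple[str, ...] = ()  # optional, to aid in searching

    def matches(self, keyword: str) -> bool:
        """Case-insensitive keyword lookup; a leading '#' is ignored."""
        wanted = keyword.lower().lstrip("#")
        return wanted in (k.lower() for k in self.keywords)


# Example values are invented for illustration only.
env = Environment(
    name="Python data science",
    image="registry.example/python-ds:1.0",
    created_by="some-user",
    created_at=datetime(2024, 1, 15, tzinfo=timezone.utc),
    keywords=("python", "pytorch"),
)
```

Whether such a record is mutable or versioned is left open here, in line with the questions still to be shaped.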

Review comment:

In a very general sense, I feel that these changes are centered on the actual Docker images Renku is using. This is very helpful; however, there is something that we cannot do with Docker images: "mixing" environments.

Let's take a concrete case:

Let's say I have an app for the "paraview" GUI viewer, an app for a finite element software, another app, and finally one for Solidipes. The user would wish to create a single session with /paraview, /solidipes, /fe and /lab automatically. In the current Renku, I would do this with Python plugins and jupyter proxy: this is conditioned by the currently installed Python packages. But what is proposed here does not allow that, I think.

How am I supposed to mix environments? Can I create a session with two environments? From your proposal, I think that someone has to write a Dockerfile (somewhere) and describe the installation for the mix.
However, it would make sense to split the installation process (and not require users to write a Dockerfile for every combination).

I think the metadata attached to an environment should also include possible (compatible) requirements. That would allow the dependencies to be selected from a catalog while creating the session.

Another possibility could be to have two types of "environments": one at the Dockerfile level and one for the requirements. (Sadly, this is at odds with the request that users do not want to wait for installation when spawning a new session...)

To summarize: I think it is important to think of a procedure that allows mixing environments/apps easily.

In Renku 2.0, Renku users can search for and reuse *environments* between projects. (However, enabling environment sharing and reuse is out of scope for this pitch! See 'Sharing environments to be reused between projects' in the No-gos.)

> [!NOTE]
> Still to be shaped in the future: Are environments mutable? Are environments versioned? Can a user browse previous or related versions of an environment? Can an environment be updated? If so, are those updates propagated to project sessions that are using them?

#### Session

A session is an instantiation of an *environment* in a specific project. A user can start a session and have live, interactive access to the project and its code and storage.

Sessions can have the following statuses:

- ⬜ Inactive
- 🕗 Launching
- 🟢 Running
- ⏸ Paused
- ⚠️ Failed/Error

When an *environment* is added to a project, it automatically becomes a project *session* in the state *inactive*.
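
The statuses and the "added environment becomes an inactive session" rule could be sketched as follows. All names here (`SessionStatus`, `Session`, `add_environment_to_project`) are hypothetical and chosen for illustration; the pitch does not prescribe a data model.

```python
from dataclasses import dataclass
from enum import Enum


class SessionStatus(Enum):
    """The five statuses listed above."""

    INACTIVE = "inactive"
    LAUNCHING = "launching"
    RUNNING = "running"
    PAUSED = "paused"
    FAILED = "failed"


@dataclass
class Session:
    """Hypothetical record tying an environment to one project."""

    project: str
    environment_name: str
    status: SessionStatus = SessionStatus.INACTIVE


def add_environment_to_project(project: str, environment_name: str) -> Session:
    """Adding an environment to a project yields a session in the inactive state."""
    return Session(project=project, environment_name=environment_name)


session = add_environment_to_project("my-project", "Python data science")
```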

#### Session Configuration

The session configuration customizes a session for an environment in a project. Via the session configuration, a user can:

- set minimum compute resource requirements
- [nice to have] select which of the project’s code repos to mount in the session [default: mount all]
- [for the future/out of scope] select which of the project’s storages to mount in the session [default: mount all]
- customize the name of the session

When a user wants to customize the resources of their session (compute resources, repos or storages mounted), they modify the *session configuration*.

Session configuration is project-specific.
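
A minimal sketch of what a session configuration might hold, assuming resource requirements are expressed as CPU cores and memory in GB (the pitch does not specify units or field names). The "mount all by default" behaviour for repositories is modelled by leaving the selection unset.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class SessionConfiguration:
    """Hypothetical per-project configuration for one session."""

    session_name: str
    min_cpu_cores: float = 0.5  # assumed unit and default
    min_memory_gb: float = 1.0  # assumed unit and default
    repositories: Optional[List[str]] = None  # None means "mount all"


def repositories_to_mount(
    config: SessionConfiguration, project_repositories: Sequence[str]
) -> List[str]:
    """Return the project's repositories that this session should mount."""
    if config.repositories is None:
        return list(project_repositories)
    selected = set(config.repositories)
    return [repo for repo in project_repositories if repo in selected]


project_repos = ["repository-1", "repository-2"]
default_config = SessionConfiguration(session_name="analysis")
custom_config = SessionConfiguration(
    session_name="dashboard", min_memory_gb=4.0, repositories=["repository-2"]
)
```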

#### Session Front-End [for the future/out of scope]

In Renku 2.0, we would like to let a user swap out different front ends to their sessions, independently of the session environment. So, a user could find a data science environment on Renku that they want to reuse, but use VSCode instead of Jupyter as the front end for the session without changing the rest of the environment.

This is something we want in Renku 2.0, but is out of scope for this build for the sake of time.

### User Flows

#### User Flow: Add a Session to a Project

![breadboard-add-session-pt-1.jpg](breadboard-add-session-pt-1.jpg)

continuing to the right...

![breadboard-add-session-pt-2.jpg](breadboard-add-session-pt-2.jpg)

A user has 4 options for adding a session to their project. Only the first 2 are considered critical for this build. These options are listed in order of priority, so if this build doesn’t get to implementing all of them, that’s ok.

##### Priority 1: Select from the RenkuLab Default Environments

- These are built and maintained by us
- **At a minimum, 1 default environment must be created as part of the build!** (a Jupyter Python environment)
- These images are pre-built, so launching a session from one is fast (the user doesn’t have to wait for an image build)
- [nice to have] The default environments should be configured via the admin panel

##### Priority 2: Use an external image

- There are a whole bunch of restrictions for what must be in the image for it to work on RenkuLab. We will have to document these limitations for users.
- We *think* that any image coming from the Jupyter stack should work
- ... Or at least Renku base images
- Only publicly accessible images are supported

Review comment (Member):

This should either say "... will be supported as a part of this build, and authentication options added later." or add a "nice to have" to deal with authentication.


##### Priority 3 [nice to have]: Configure a Session from a Code repo

- This uses the already existing CI image build pipeline
- We do not automatically update the image that is used for the session when new commits are pushed and new images are built. It is up to the user if they want to update which image is used
- [nice-to-have]: The user is notified when a new image is available and invited to change the image their session uses if they wish.
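
The pinning behaviour described in this list can be sketched as below. The class, its methods, and the image references are invented for illustration; the actual integration with the CI image build pipeline is not described in the pitch.

```python
from dataclasses import dataclass


@dataclass
class RepoBasedSession:
    """Hypothetical session whose image is pinned until the user changes it."""

    pinned_image: str
    latest_image: str

    def on_image_built(self, image: str) -> None:
        """Record a newly built image without changing the pinned one."""
        self.latest_image = image

    @property
    def update_available(self) -> bool:
        """True when a newer image exists than the one the session uses."""
        return self.latest_image != self.pinned_image

    def accept_update(self) -> None:
        """The user explicitly opts in to the newest image."""
        self.pinned_image = self.latest_image


repo_session = RepoBasedSession(
    pinned_image="registry.example/project:abc123",
    latest_image="registry.example/project:abc123",
)
repo_session.on_image_built("registry.example/project:def456")
```

In this sketch, `update_available` is what a notification to the user would key off, and nothing changes for the session until `accept_update` is called.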

##### Priority 4: [for the future/out of scope] Search & select from preexisting RenkuLab environments

#### User Flow: Start Session

Review comment (Member):

there is no provision here for selecting the resource pool / class?

![breadboard-start-session.jpg](breadboard-start-session.jpg)

#### User Flow: Configure Session [nice to have]

![breadboard-configure-session.jpg](breadboard-configure-session.jpg)

#### User Flow: Publish Session [for the future/out of scope]

![breadboard-publish-session.jpg](breadboard-publish-session.jpg)

#### User Flow: Update a Repo-Based Session [nice to have]

![breadboard-update-repo-based-session.jpg](breadboard-update-repo-based-session.jpg)

### Things not communicated via the User Flows

#### Starting Assumptions

- In this build, we will not change the project GitLab CI-based image build process (for 'Configure a Session from a Code repo')

#### Key Points

- When a user creates a new project, the project has no sessions until a user adds one.
- [nice to have] Provide shortcuts for adding a session from the RenkuLab default environments
- A user can create as many sessions as they want in a project.
- A user can run 1 session per environment per project (similar to our restriction now of 1 session per user per project)
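
One way to picture the "1 session per environment per project" restriction is a registry that refuses a second launch for the same combination. This is a sketch under assumptions: the restriction is taken to apply per user, and the class and exception names are hypothetical.

```python
from typing import Set, Tuple


class SessionAlreadyRunningError(Exception):
    """Raised when the same user relaunches the same environment in a project."""


class SessionRegistry:
    """Hypothetical in-memory tracker of running sessions."""

    def __init__(self) -> None:
        self._running: Set[Tuple[str, str, str]] = set()

    def launch(self, user: str, project: str, environment: str) -> None:
        key = (user, project, environment)
        if key in self._running:
            raise SessionAlreadyRunningError(
                f"{user} already runs {environment!r} in {project!r}"
            )
        self._running.add(key)

    def stop(self, user: str, project: str, environment: str) -> None:
        self._running.discard((user, project, environment))


registry = SessionRegistry()
registry.launch("alice", "my-project", "python-env")
registry.launch("alice", "my-project", "rstudio-env")  # different environment: allowed
```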

Review comment (Member):

So we need to actually prevent the user from creating multiple sessions from the same environment - should this be mentioned in the user flow?


#### 1-click to start the Default Session

- One session in a project should be the “default” session, so that we can still offer a 1-click ‘Start Session’ button for the project as a whole, which starts the default session.

#### Permissions

- **Creating sessions:** users with write access to the project should be allowed to create and configure sessions.
- **Launching a session:** For the purposes of this pitch, anyone with access to the project can launch any session that has been added in the project.
- **Viewing running sessions:** Only the user who launched a session can see its status.
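
The three permission rules above could be expressed as simple checks. The `Access` levels and function names are assumptions for illustration; the pitch only distinguishes "write access" from "access to the project".

```python
from enum import IntEnum
from typing import Optional


class Access(IntEnum):
    """Hypothetical project access levels; None stands for no access."""

    READ = 1
    WRITE = 2


def can_create_or_configure_session(access: Optional[Access]) -> bool:
    """Creating and configuring sessions requires write access."""
    return access is not None and access >= Access.WRITE


def can_launch_session(access: Optional[Access]) -> bool:
    """Anyone with access to the project can launch any of its sessions."""
    return access is not None


def can_view_session_status(viewer: str, launched_by: str) -> bool:
    """Only the user who launched a session can see its status."""
    return viewer == launched_by
```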

#### Inside the session

The session is launched into the `/work` directory. The repositories and external storage sources are provisioned as sub-directories:

```bash
/work
  /repository-1
  /repository-2
  /external-storage-1
  /external-storage-2
```
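
For illustration, the mount points implied by this layout can be derived from the names of the project's repositories and storages. The helper below is hypothetical and only mirrors the directory structure shown above.

```python
from pathlib import PurePosixPath
from typing import List, Sequence

WORK_DIR = PurePosixPath("/work")  # sessions are launched into this directory


def mount_points(
    repositories: Sequence[str], storages: Sequence[str]
) -> List[PurePosixPath]:
    """Place every repository and external storage directly under /work."""
    return [WORK_DIR / name for name in [*repositories, *storages]]


paths = mount_points(
    ["repository-1", "repository-2"],
    ["external-storage-1", "external-storage-2"],
)
```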

### Revisiting the Motivating Problems

- When I am starting a new project, I want to **start out with a basic compute environment** (for example with a pre-built data science stack), and I’ll decide what extra packages I need later.
  - → Solved by offering RenkuLab built-in session “templates” (name TBD) as the fastest route to adding a session to a project
- When I create a new Renku project, I want to get started right away with a pre-built environment: **I don’t want to have to wait for the environment image to be built** to get started.
  - → Same as above
- When I start a session, I want to **clearly understand what environment image the session will use and what is installed in it**, so I am not confused where the compute environment comes from.

Review comment (Member):

imho this doesn't really address the "clearly understand what is installed" part?

  - → Solved by defaulting to “pinning” the image when the user chooses the “create session from project repo” route, and not automatically updating which commit image to use.
- When I go to launch a session on a project where I recently made an edit to a code file that does not impact the project environment, **I don’t want to have to wait for the image to re-build** since the file I changed didn’t change the environment anyway.
  - → Same as above

There are a few ways we want 2.0 Environments to work in the future that we won’t be able to make work right away (these are *out of scope* for this pitch), but I include them here to paint a bigger picture and give some extra context for the direction we want Environments to go.

- When I have already created a Docker image for my project that I use in other places, **I want Renku to use that pre-existing image** so I don’t have to re-create a new environment.
  - → Partially solved by offering a new option to base a session off of an existing (public) image; further work beyond this pitch will be required to enable accepting non-Renku images.
- When I know my teammate has already created a great environment they do their work in, I want to **search and reuse that environment** in my own project in RenkuLab.
  - → Will be solved in the future by making environments sharable
- When I create a dashboard in a Renku Project, I want to **provide a separate entry point for app viewers** that does not reveal the development environment I use to build the app, so that my app consumers are not confused.
  - → Groundwork laid now by supporting multiple sessions per project. Will be solved in the future with a dedicated Apps pitch.
- When I use my Renku project image in other places, I want the **image to be small** so it’s easy to move to other systems (aka not have Jupyter installed in it if I don’t need it).
  - → Will be solved in the future by implementing the Session Front-End concept.

## 🐰 Rabbit Holes

## 🙅‍♀️ No-gos

### For the Future/Out of Scope: Sharing environments to be reused between projects

In the future, environments will be able to be published and shared between users and projects. We definitely want to do this eventually, but this is out of scope of this build for the sake of making this pitch fit in 6 weeks. For the scope of this pitch, when a user creates a session from a new environment, that underlying environment is not accessible or shareable in any meaningful manner to the user.

### No Go: Ability to launch multiple environments (images) in the same session

Ability to launch multiple environments within the same session. The user could select e.g. a VSCode and a dashboard app to be launched together, giving them the opportunity to fine-tune the app while having access to a dev environment. We will not enter into this territory with this build; we simply assume 1 environment per session.

### No Go: Shared file system between sessions

With multiple environments per project, it is tempting to imagine that you could launch several of them and have them share the same underlying filesystem. This is unlikely to be doable in the short term (we don’t have RW-many storage). The only way it could work is if the sessions/environments were actually separate containers in the same pod. However, users *can* attach the same external storage to different sessions.

### No Go: Changing how GitLab repo images are defined or built

Do not redefine how a user builds an image for an environment or how this environment is defined. We are keeping *exactly* the same mechanisms for building images that exist currently i.e. through GitLab CI and a combination of `Dockerfile` with some language-specific configuration files.

### No go: Access control (user permissions) guarding specific project sessions

We can imagine it would be useful for a Renku developer to develop their project in one session, build a dashboard in another session, and then only give an external collaborator access to the dashboard session in the project (so that the collaborator doesn’t get confused by the development session?). However, this has potentially large implications, so we defer it to a future build.
23 changes: 15 additions & 8 deletions roadmap.md
@@ -4,18 +4,25 @@ Welcome to the Renku Roadmap! Here we share the developments we are currently wo

Want a feature that's not on here? [Submit a feature request!](https://github.com/SwissDataScienceCenter/renku/issues/new/choose)

## December 2023
> [!IMPORTANT]
> We are building the next version of Renku! (Currently referred to as Renku 2.0) Would you like to get involved in shaping the future of Renku? Interested in participating in our user research? Get in touch! hello@renku.io

## March 2024

- [🤖 Renku 2.0 Compute Sessions](feature-pitches/010-compute-sessions/compute-sessions.md) - **planned**

**User facing changes**
## January 2024

- [✏️ Read/Write Cloud Storage](feature-pitches/007-read-write-cloud-storage/read-write-cloud-storage.md)
- [✏️ Read/Write Cloud Storage](feature-pitches/007-read-write-cloud-storage/read-write-cloud-storage.md) - **released**

## December 2023

- [📌 Pin Projects to the Dashboard](feature-pitches/008-pin-to-dashboard/pin-to-dashboard.md)
- [📌 Pin Projects to the Dashboard](feature-pitches/008-pin-to-dashboard/pin-to-dashboard.md) - **released**

- [🎨 Redesign the Landing Page](feature-pitches/009-redesign-landing-page/redesign-landing-page.md)
- [🎨 Redesign the Landing Page](feature-pitches/009-redesign-landing-page/redesign-landing-page.md) - **released**

**Internal changes**
### Internal changes

- [🐸 Start of the Renku Native Projects](feature-pitches/005-define-renku-native-projects-and-users/define-renku-native-projects-and-users.md)
- [🐸 Start of the Renku Native Projects](feature-pitches/005-define-renku-native-projects-and-users/define-renku-native-projects-and-users.md) - **completed**

- [📈 Build Renku Product Metrics](feature-pitches/006-first-product-metrics/first-product-metrics.md)
- [📈 Build Renku Product Metrics](feature-pitches/006-first-product-metrics/first-product-metrics.md) - **completed**