Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(core): design of renku session #2529

Closed
wants to merge 18 commits into from
Closed

feat(core): design of renku session #2529

wants to merge 18 commits into from

Conversation

vigsterkr
Copy link
Contributor

@vigsterkr vigsterkr commented Dec 15, 2021

Description

Fixes #1991 #2394

session

@vigsterkr vigsterkr changed the title design of renku session feat(core): design of renku session Dec 15, 2021
@vigsterkr
Copy link
Contributor Author

@SwissDataScienceCenter/poc this is an initial implementation of supporting a local interactive session. any feedback is more than welcome.

Copy link
Member

@rokroskar rokroskar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I really like this, it's very simple and clean. There are a couple of outstanding questions, with the error handling and UX around suggesting alternative paths in case of failure being a big one here, I think, since we have to assume we are dealing with users with very little docker experience.


A command to start an interactive session.

By default, the `Dockerfile` present in the given project is going to be used for building the docker image
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say by default we look for the image on the hosted instance (if we can identify one). That would be quicker than building in most cases. We can identify the registry from the origin remote.

There is a lot of potential for failure here so we need to think about the error handling with user-friendly suggestions for what to do next. E.g.

  • check local docker images --> none exists, check remote
  • remote image lookup fails because image isn't there and no local image exists --> prompt to build one
  • remote image lookup fails (private project and no token) --> suggest to login
  • remote image lookup fails, (private project with token) --> suggest image build
  • image build fails, no docker --> link to docker install / documentation
  • image build fails, docker working but problem in Dockerfile --> suggest a base image? or 🤷

A related question is how do we actually identify which image we need? Simply by the commit sha? When the runtime doesn't change for many commits, we could just check which commit we need (i.e. when did the runtime dependencies change last - though we might not know all the paths in the repo that affect the runtime). Probably the safest is to go with the commit sha even if it results in too many image builds.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

by using commit hash when the runtime env doesn't change the caching mechanism (at least in case of docker) will circumvent the unnecessary rebuilds. by default the code-base itself (of the project) is mounted into the running container, so it shouldn't affect the image building at all.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a lot of potential for failure here so we need to think about the error handling with user-friendly suggestions for what to do next. E.g.
check local docker images --> none exists, check remote
remote image lookup fails because image isn't there and no local image exists --> prompt to build one
remote image lookup fails (private project and no token) --> suggest to login
remote image lookup fails, (private project with token) --> suggest image build
image build fails, no docker --> link to docker install / documentation
image build fails, docker working but problem in Dockerfile --> suggest a base image? or 🤷

we should describe this logic or do you think we should always build and never fetch from the remote registry?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

regarding the logic of image fetching/building: since now it is a plugin based solution, this is strictly up to the plugin what and how to do things. the most this RFC can do is to kind of write a suggestion how it should be handled, but honestly the plugin will do whatever it wants to do... note i'm not so sure what happens in case of notebook api when an image is defined...

design/003-interactive-session/003-interactive-session.md Outdated Show resolved Hide resolved
design/003-interactive-session/003-interactive-session.md Outdated Show resolved Hide resolved
## Drawbacks

One of the major drawback of the current design is the dependency on docker. Namely, to only support local
interactive session if docker is available. Many HPC environments do not support docker, but they support
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think considering other runtimes is definitely a valid concern but with regards to HPC specifically, there are a lot of other hurdles (port-forwarding to compute nodes, for example). We might want to consider that a separate case altogether.

design/003-interactive-session/003-interactive-session.md Outdated Show resolved Hide resolved
other container formats, like [runC](https://github.com/opencontainers/runc). As a long term goal of this
effort it would be good to consider support for different container engines, hence the design and implementation
of the initial `renku session` should take this into consideration.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One general question - the idea here is to mount the local repo directory into the container. Our containers typically run as user 1000 with gid 1000 - what happens with the file ownership outside the container?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep i was looking into this, specifically from windows perspective 🙃

@vigsterkr vigsterkr force-pushed the session-rfc branch 2 times, most recently from be2e037 to 1ba476e Compare February 2, 2022 13:55
@vigsterkr vigsterkr marked this pull request as ready for review February 2, 2022 13:56
@vigsterkr vigsterkr requested a review from a team as a code owner February 2, 2022 13:56
Copy link
Member

@rokroskar rokroskar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking great! Just a few clarifying comments. Sorry it took me a while 🙈

design/003-interactive-session/003-interactive-session.md Outdated Show resolved Hide resolved
design/003-interactive-session/003-interactive-session.md Outdated Show resolved Hide resolved
design/003-interactive-session/003-interactive-session.md Outdated Show resolved Hide resolved
@vigsterkr
Copy link
Contributor Author

actually today i was thinking a bit more about this whole local vs remote story and realised that this is not the best approach, i.e. it shouldn't be distinguished like this. rather renku session sub-command should just have a support for various container engines---as plugins---and just as in case of workflow execute provider one would choose which one to use when running the command, for example:

renku session start --engine singularity

by default renku would provide two different plugins: docker and renkulab. all the others would be simply plugins as usual (e.g. singularity or youki).
Note i think the other problem of the current --remote design is that a docker can be remote as well! it all depends on the env variables settings.

maybe the --engine flag is not the best name. maybe we should be consistent and use --provider just as in case of workflows... none the less this would in a way unify the whole renku session across various container backends and 'locations'

@rokroskar
Copy link
Member

I think --provider makes a lot of sense! I could imagine something like this for e.g. HPC clusters which would have very specific needs but it would be self-consistent if we name them all "providers". So I guess the RFC doc should then define the interface?

@vigsterkr
Copy link
Contributor Author

I think --provider makes a lot of sense! I could imagine something like this for e.g. HPC clusters which would have very specific needs but it would be self-consistent if we name them all "providers". So I guess the RFC doc should then define the interface?

refactored, updated the RFC. now docker based interactive sessions are implemented as a plugin.

i'll add the ISessionProvider interface definition to the RFC ASAP

@vigsterkr
Copy link
Contributor Author

mmm seems a good side-effect of this refactor that the docker plugin won't even show up if docker engine is not installed 🙃 just need to do some extra exception handling (see macos ci failure)

Viktor Gal added 3 commits February 8, 2022 10:43
this way the plugin will always be present and the error will only
be raised when actually trying to use a session command and
the docker instance cannot be found.
Copy link
Member

@olevski olevski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vigsterkr I tried to install this and run it on my mac. With a project I created with renku init. I am getting this message:

AttributeError: 'ImageNotFound' object has no attribute 'msg'

It also happens if I specify a wrong not-real image with --image

If I specify a real image then things work.

I know this is a WIP but just thought I would point this out.

Also see my comment about getting the default url from the project. I think that would be really useful.

renku/core/management/session/docker.py Outdated Show resolved Hide resolved
@olevski
Copy link
Member

olevski commented Feb 8, 2022

Sorry for spamming. I noticed I was a few commits back on this branch when I was testing. A small clarification on the errors I am seeing.

  • renku session start results in:
AttributeError: 'APIError' object has no attribute 'msg'
  • renku session start --image fdsfdsf results in:
AttributeError: 'ImageNotFound' object has no attribute 'msg'

@vigsterkr
Copy link
Contributor Author

Sorry for spamming. I noticed I was a few commits back on this branch when I was testing. A small clarification on the errors I am seeing.

* `renku session start` results in:
AttributeError: 'APIError' object has no attribute 'msg'
* `renku session start --image fdsfdsf` results in:
AttributeError: 'ImageNotFound' object has no attribute 'msg'

@olevski nw, added a fix for those comments

Copy link
Member

@rokroskar rokroskar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some additional comments on the RFC.

Also, could we separate the implementation from the RFC? The PR discussion is muddled by the fact that both things are getting addressed at once imho and it would be easier for posterity if they were separate.

an interactive session. By default two different providers are shipped with Renku CLI:
- docker
- renkulab
The provider specific configuration values can be specified by using the `--config <config.yaml>` flag.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should be able to persist them also in the project settings?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is in line with the workflow execute provider logic.... of course these values could be stored anywhere... i dont have any strong opinion about this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

although i'm not so sure why would one want to store youki, singularity etc config values on a project level...these these key-value pairs meant for actually specifying some config value for the provider plugin, that i see more like a global setting not a project level setting. none the less, they could be persisted anywhere...


A command to start an interactive session.

By default, the `Dockerfile` present in the given project is going to be used for building the docker image
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a lot of potential for failure here so we need to think about the error handling with user-friendly suggestions for what to do next. E.g.
check local docker images --> none exists, check remote
remote image lookup fails because image isn't there and no local image exists --> prompt to build one
remote image lookup fails (private project and no token) --> suggest to login
remote image lookup fails, (private project with token) --> suggest image build
image build fails, no docker --> link to docker install / documentation
image build fails, docker working but problem in Dockerfile --> suggest a base image? or 🤷

we should describe this logic or do you think we should always build and never fetch from the remote registry?

@vigsterkr
Copy link
Contributor Author

closing in favour of #2687 #2688

@vigsterkr vigsterkr closed this Feb 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CLI command for launching an interactive session locally
4 participants