Offer image choices for the JMTE hub #2683

Merged 2 commits on Jun 21, 2023

@yuvipanda (Member) commented Jun 20, 2023

I think overall, we want to reduce the number of images we maintain for our end users. A big part of this is to use upstream images directly wherever possible, and allow users to choose. This helps us benefit from upstream fixes as quickly as possible, and reduces the total amount of work done. For example, instead of specifically bumping the version of Julia just for this one image (pangeo-data/jupyter-earth#166), we could instead do that upstream and benefit everyone (jupyter/docker-stacks#1917).

Faster startup times are another benefit, as the more specific images are smaller than a big 'all-in-one' image.

There are some features of the all-in-one image that currently don't easily exist upstream:

  • Linux desktop
  • Nix
  • Specific extra packages that may be installed

We can figure these out over time, but not maintaining the all-in-one
image is a nice goal to shoot for. To this end, the JMTE image is
still the default, but marked as 'deprecated' as I don't want to
continue doing a lot of maintenance on it.

Ref #2201

This is what it looks like:

(screenshot of the image choices on the hub)
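
For readers who want a concrete picture of what "offering image choices" can mean in configuration terms, here is a minimal sketch. It assumes KubeSpawner's `profile_list` with a `profile_options` entry, which is one common way to present such a chooser; the actual change in this PR lives in the diff and may be structured differently. The slugs, display names, and image references below are placeholders for illustration, and the JMTE image reference in particular is hypothetical.

```python
# Sketch only: slugs, display names, and image references are assumptions,
# not copied from this PR. Real deployments would pin specific tags.
IMAGE_CHOICES = {
    "jmte": ("Deprecated JMTE all-in-one image", "example.invalid/jmte-all-in-one:latest"),
    "pytorch": ("Pangeo PyTorch image", "pangeo/pytorch-notebook:latest"),
    "tensorflow": ("Pangeo TensorFlow image", "pangeo/ml-notebook:latest"),
    "datascience": ("Jupyter datascience-notebook image", "jupyter/datascience-notebook:latest"),
}


def build_profile_list(choices, default_slug):
    """Build a KubeSpawner-style profile list with one 'image' option."""
    image_choices = {}
    for slug, (display_name, image) in choices.items():
        image_choices[slug] = {
            "display_name": display_name,
            # Exactly one choice is pre-selected in the chooser.
            "default": slug == default_slug,
            # Each choice only overrides the image the user pod runs.
            "kubespawner_override": {"image": image},
        }
    return [
        {
            "display_name": "Default",
            "default": True,
            "profile_options": {
                "image": {"display_name": "Image", "choices": image_choices},
            },
        }
    ]


# The all-in-one image stays the default, matching the description above.
profile_list = build_profile_list(IMAGE_CHOICES, default_slug="jmte")
```

In a `jupyterhub_config.py`, the result would typically be assigned with `c.KubeSpawner.profile_list = profile_list`. Keeping the choices in one mapping makes it easy to drop the deprecated entry later by changing `default_slug` and deleting one line.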

@yuvipanda requested a review from a team as a code owner on June 20, 2023, 22:59

@yuvipanda (Member, Author)

I will communicate this to the users of the hub.

@yuvipanda (Member, Author) commented Jun 20, 2023

The following message was communicated to hub users:

Hello hub users! More exciting changes (and features) afoot, thanks to integration with the 2i2c infrastructure! Once #2683 lands, you'll be able to specifically choose from multiple images to launch into your session! To begin with, there are 4 images:

  1. The existing, 'all-in-one' big image. No changes here.
  2. Upstream Pangeo image for PyTorch. (1) is based off an older version of this, so this newer image will bring in newer versions of libraries that may help!
  3. Upstream Pangeo image for TensorFlow. New!
  4. Upstream jupyter/docker-stacks' 'datascience-notebook' image, which brings in an up-to-date version of Julia as well as R.

In the long run, I'd like us to stop maintaining (1) and instead allow users to choose from various special purpose images for the following reasons:

  1. Faster startup time! The all-in-one image is about 9G, while the special purpose images are much smaller. For example, the datascience-notebook image is only 1.8G. This means pod startups should be much faster.
  2. Image changes are shared across the whole ecosystem, and hence land faster. There's a larger community maintaining these images than just us, so this benefits everyone. The example I have is the Julia version bump I did for our image ("Bump version of Julia", pangeo-data/jupyter-earth#166), versus me doing that upstream ("Bump Julia version to 1.9", jupyter/docker-stacks#1917).
  3. Allows us to introduce more features without paying a huge complexity penalty. The more combinations of things you want in a single image, the more complex it becomes. It's far easier to have a special purpose image that just does one thing. For example, an image focused on showing the Linux Desktop would be far better suited to that than putting it here.

Nothing changes right now, as the old image is still available. However, I ask that y'all try out the other images and let me know what you think! Particularly,

  1. Are you using GPUs? Try out the PyTorch & TensorFlow images and let me know if they are enough, or if further modifications are needed.
  2. Are you using Julia? Try out the jupyter/datascience-notebook image, and let me know if that is enough!

@yuvipanda (Member, Author)

#2583 is related

@yuvipanda (Member, Author)

A good example of this is the fact that the julia package is installed in the JMTE image for interfacing between Python and Julia, but it doesn't actually work because that package isn't compatible by default with Python installed from conda! I'd rather we solve that upstream than downstream here :)

@GeorgianaElena (Member) left a comment

This makes sense to me and I agree it's a good pattern to follow! Thank you @yuvipanda

@yuvipanda merged commit e3e526a into 2i2c-org:master on Jun 21, 2023
8 checks passed
@github-actions

🎉🎉🎉🎉

Monitor the deployment of the hubs here 👉 https://github.com/2i2c-org/infrastructure/actions/runs/5335693246
