✨add image prepulling to all swarm nodes #3073

Closed
wants to merge 14 commits into from

GitHK
Contributor

@GitHK GitHK commented May 25, 2022

What do these changes do?

The Problem

Some services are very large and take a long time to pull. To make services start faster, their images should already be present on the node when a user starts a service.

An Initial Solution

  • ✨ add an image-puller service, running on all nodes, which asks the catalog for a list of services to pull:
    • the image-puller will be the simplest possible service: it only asks for instructions at regular intervals and acts upon them
  • ♻️ catalog acts as the single source of truth, providing a list of services to each image-puller service that asks for one:
    • catalog will identify each image-puller service by its docker host
    • catalog will have to run on a swarm manager node since it requires access to the docker swarm API (asking the director-v2 for the information was discarded since it's preferable to avoid coupling and we might end up using more than 1 call in the future)
    • based on the id of each requester, the catalog will compile a list of services, taking into consideration which resources that node has available
  • as a sync policy, the latest and fattest images will be selected for sync, meaning: the latest version of a service and only services over a certain size threshold
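
A minimal sketch of how the sync policy above could be expressed, not the actual implementation. It reads "latest and fattest" as an AND condition (latest version of each service, and only if the image is over the size threshold) and treats free disk space as the node resource to respect. The names `ImageInfo`, `select_images_to_prepull`, `min_size_bytes` and `node_free_bytes` are hypothetical and do not come from this PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageInfo:
    # Hypothetical model: the PR does not define these fields.
    service_key: str               # identifies the service
    version: tuple[int, int, int]  # parsed semantic version
    size_bytes: int                # image size


def select_images_to_prepull(
    images: list[ImageInfo],
    min_size_bytes: int,
    node_free_bytes: int,
) -> list[ImageInfo]:
    """Pick images for one node: latest version AND over the size threshold."""
    # Keep only the latest version of each service.
    latest: dict[str, ImageInfo] = {}
    for image in images:
        current = latest.get(image.service_key)
        if current is None or image.version > current.version:
            latest[image.service_key] = image

    # Assumed AND policy: drop latest versions that are not over the threshold.
    candidates = [i for i in latest.values() if i.size_bytes > min_size_bytes]

    # Fattest first; skip any image that no longer fits in the free space.
    candidates.sort(key=lambda i: i.size_bytes, reverse=True)
    selected: list[ImageInfo] = []
    remaining = node_free_bytes
    for image in candidates:
        if image.size_bytes <= remaining:
            selected.append(image)
            remaining -= image.size_bytes
    return selected
```

In this sketch the catalog would call `select_images_to_prepull` once per requesting image-puller, passing that node's free space. An older version is never selected, even when it is larger than the latest one.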

Related issue/s

How to test

Checklist

@GitHK GitHK self-assigned this May 25, 2022
@GitHK GitHK added this to the Croissant milestone May 25, 2022
@codecov

codecov bot commented May 25, 2022

Codecov Report

Merging #3073 (ba93c1c) into master (58b426f) will increase coverage by 0.0%.
The diff coverage is 100.0%.

@@          Coverage Diff           @@
##           master   #3073   +/-   ##
======================================
  Coverage    80.7%   80.8%           
======================================
  Files         716     716           
  Lines       30903   30905    +2     
  Branches     4032    4032           
======================================
+ Hits        24962   24973   +11     
+ Misses       5074    5063   -11     
- Partials      867     869    +2     
Flag Coverage Δ
integrationtests 66.1% <ø> (+<0.1%) ⬆️
unittests 76.7% <100.0%> (+<0.1%) ⬆️

Impacted Files Coverage Δ
...ckages/models-library/src/models_library/docker.py 100.0% <100.0%> (ø)
.../simcore_service_catalog/db/repositories/groups.py 72.9% <0.0%> (-2.8%) ⬇️
.../simcore_service_catalog/services/access_rights.py 78.7% <0.0%> (-2.5%) ⬇️
.../director/src/simcore_service_director/producer.py 61.6% <0.0%> (-0.5%) ⬇️
...simcore_service_director_v2/modules/dask_client.py 92.5% <0.0%> (+0.6%) ⬆️
...e_service_director_v2/modules/dask_clients_pool.py 94.2% <0.0%> (+1.4%) ⬆️
...rector_v2/modules/comp_scheduler/base_scheduler.py 88.6% <0.0%> (+1.8%) ⬆️
...c/simcore_service_catalog/core/background_tasks.py 68.4% <0.0%> (+2.1%) ⬆️
...mcore_service_webserver/garbage_collector_utils.py 82.3% <0.0%> (+2.9%) ⬆️
...ore_service_director_v2/utils/client_decorators.py 76.6% <0.0%> (+3.3%) ⬆️
... and 1 more

@sonarcloud

sonarcloud bot commented May 25, 2022

SonarCloud Quality Gate failed.

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot E 1 Security Hotspot
Code Smell A 1 Code Smell

No Coverage information
0.4% Duplication

Member

@pcrespov pcrespov left a comment

Very nice description and good first iteration.
I assume this is still not complete since I see no changes in the catalog, etc.
In any case, I left some feedback here that I hope helps.

BTW: check sonarcloud input as well

@sanderegg
Member

Ok, so I guess this was decided last week.
I have some questions:

  • Why do you actually need access to docker? Just to get the host name? If that is the case, it is not necessary and we can discuss that. These services can have their host names set automatically. If possible, let's not add too many constraints. What is the use of having 20 catalogs on the same machine?
  • About the fattest policy: is this an AND policy? I hope you are not going to pre-download 20 GB images just because they are > 10 GB...
  • Personally, I still do not find this system really nice, and I think users would be OK with seeing the progress of the pull instead. Also, who or what is going to clean the nodes?

@GitHK
Contributor Author

GitHK commented Dec 6, 2022

This is obsolete; if required, it can be taken care of by:

@GitHK GitHK closed this Dec 6, 2022

3 participants