[BUG] Pull pre Image failed on K8S cluster #3364

Closed
JackyYangPassion opened this issue Nov 20, 2023 · 9 comments · Fixed by #3365

Comments

@JackyYangPassion
Contributor

Describe the bug
Pulling the pre-release image fails on a K8s cluster.

To Reproduce
Steps to reproduce the behavior:

  1. Install the GraphScope pre-release client:
pip3 install -U graphscope-client --pre   -i http://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com

  2. Launch a session on K8s:

import graphscope
graphscope.set_option(log_level='DEBUG')
graphscope.set_option(show_log=True)
# Create GraphScope client session, the 'cluster_type' is k8s by default.
session = graphscope.session(
                             k8s_coordinator_cpu=1,
                             k8s_coordinator_mem="1Gi",
                             k8s_vineyard_cpu=0.2,
                             k8s_vineyard_mem="1Gi",
                             vineyard_shared_mem="1Gi",
                             k8s_engine_cpu=0.2,
                             k8s_namespace='gs-new-orc-jacky',
                             k8s_engine_mem="1Gi",
                             num_workers=1,
                             enabled_engines="analytical,interactive",
                             k8s_client_config='~/.kube/config')
print('========= Session created. ==========')

See error

2023-11-20 16:12:26,603 [INFO][utils:187]: coordinator-jzwbkf-65fb74466f-nb967: Failed to pull image "registry.cn-hongkong.aliyuncs.com/graphscope/coordinator:0.26.0a20231119": Error response from daemon: manifest for registry.cn-hongkong.aliyuncs.com/graphscope/coordinator:0.26.0a20231119 not found: manifest unknown: manifest unknown
2023-11-20 16:12:26,604 [INFO][utils:187]: coordinator-jzwbkf-65fb74466f-nb967: Error: ErrImagePull
2023-11-20 16:12:26,605 [INFO][utils:187]: coordinator-jzwbkf-65fb74466f-nb967: Back-off pulling image "registry.cn-hongkong.aliyuncs.com/graphscope/coordinator:0.26.0a20231119"
coordinator-jzwbkf-65fb74466f-nb967: Successfully assigned gs-new-orc-jacky/coordinator-jzwbkf-65fb74466f-nb967 to minikube
coordinator-jzwbkf-65fb74466f-nb967: Pulling image "registry.cn-hongkong.aliyuncs.com/graphscope/coordinator:0.26.0a20231119"
coordinator-jzwbkf-65fb74466f-nb967: Failed to pull image "registry.cn-hongkong.aliyuncs.com/graphscope/coordinator:0.26.0a20231119": Error response from daemon: manifest for registry.cn-hongkong.aliyuncs.com/graphscope/coordinator:0.26.0a20231119 not found: manifest unknown: manifest unknown
coordinator-jzwbkf-65fb74466f-nb967: Error: ErrImagePull
coordinator-jzwbkf-65fb74466f-nb967: Back-off pulling image "registry.cn-hongkong.aliyuncs.com/graphscope/coordinator:0.26.0a20231119"
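
For triage, the image reference that failed to pull can be extracted from event lines like the ones above. This is only a minimal sketch: the helper name `failed_image` is hypothetical and not part of GraphScope, and it assumes the event text keeps the `Failed to pull image "<repository>:<tag>"` shape shown in this log.

```python
import re

# Matches the quoted image reference in a "Failed to pull image" event line.
_PULL_FAILURE = re.compile(r'Failed to pull image "([^"]+)"')


def failed_image(log_line):
    """Return (repository, tag) for a pull-failure line, or None if it is not one."""
    match = _PULL_FAILURE.search(log_line)
    if match is None:
        return None
    image = match.group(1)
    repository, sep, tag = image.rpartition(":")
    # No colon, or the last colon belongs to a registry port: there is no tag.
    if not sep or "/" in tag:
        return image, None
    return repository, tag


line = (
    'coordinator-jzwbkf-65fb74466f-nb967: Failed to pull image '
    '"registry.cn-hongkong.aliyuncs.com/graphscope/coordinator:0.26.0a20231119": '
    "Error response from daemon: manifest unknown"
)
print(failed_image(line))
```

The extracted tag can then be compared against the tags that the image build workflow actually published.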

Environment (please complete the following information):

  • GraphScope version: 0.26.0a20231119
  • OS: macOS
  • Kubernetes version: 1.23
@siyuan0322
Collaborator

siyuan0322 commented Nov 20, 2023

The three most recent image builds failed. The latest usable image is 0.26.0a20231116; please specify it manually with k8s_image_tag='0.26.0a20231116'.
We will fix the build process ASAP.

For reference: https://github.com/alibaba/GraphScope/actions/workflows/build-graphscope-images-linux.yml
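
A minimal sketch of that workaround, reusing the namespace and engine settings from the report above. The helper names `pinned_session_kwargs` and `launch_pinned_session` are hypothetical; only `k8s_image_tag` and the other keyword arguments come from this thread.

```python
# Last nightly tag reported as usable in this thread.
PINNED_TAG = "0.26.0a20231116"


def pinned_session_kwargs(image_tag=PINNED_TAG):
    """Return graphscope.session() keyword arguments with the image tag pinned."""
    return {
        "k8s_namespace": "gs-new-orc-jacky",
        # Used instead of the default tag, which in the logs above
        # matches the installed client version.
        "k8s_image_tag": image_tag,
        "num_workers": 1,
        "enabled_engines": "analytical,interactive",
        "k8s_client_config": "~/.kube/config",
    }


def launch_pinned_session(image_tag=PINNED_TAG):
    """Create the session; needs graphscope-client and a reachable cluster."""
    import graphscope  # imported lazily so the helper above works without it

    return graphscope.session(**pinned_session_kwargs(image_tag))
```

Calling `launch_pinned_session()` should then pull the 0.26.0a20231116 images rather than the missing 0.26.0a20231119 ones, assuming the resource options from the original script are added back as needed.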

@JackyYangPassion
Contributor Author

Thanks for your reply.

Motivation:
I want to use the bug fix from #3363 in our production environment as soon as possible.

I can think of two ways:

  1. Pull the official pre-release version, since it is built and released every day.
  2. Compile locally and publish to an internal Docker image repo.

Can you guide me on how to publish the image from a local development environment to an internal repo?

@siyuan0322
Collaborator

I'll first try to rerun it; it seems to have gotten stuck while running apt update.

You could refer to this workflow file. I think you only need to:

  1. Rerun make analytical in your case.
  2. Tag the result with an appropriate tag: registry.cn-hongkong.aliyuncs.com/graphscope/analytical:tag
  3. If your cluster is a single-machine cluster like minikube, use minikube image load xxx to load the image into your control plane.
  4. If your cluster is not a single machine, you could docker save the image and docker load it on each node.
  5. Make sure the image pull policy is IfNotPresent.
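
The steps above can be sketched as a small command planner. This is only an illustration under assumptions: the function name `local_image_commands`, the local image name `graphscope/analytical:latest`, the tag `my-fix`, and the archive file name are all hypothetical, and the actual name of the image produced by make analytical should be checked with docker images. The commands are printed rather than executed, and the pull policy from step 5 is not covered.

```python
import shlex

REGISTRY = "registry.cn-hongkong.aliyuncs.com/graphscope"


def local_image_commands(local_image, tag, single_node=True, archive="analytical.tar"):
    """List the shell commands for steps 1-4 as argument vectors."""
    target = f"{REGISTRY}/analytical:{tag}"
    commands = [
        ["make", "analytical"],                  # step 1: rebuild the image
        ["docker", "tag", local_image, target],  # step 2: retag it
    ]
    if single_node:
        # Step 3: load the image into a minikube cluster.
        commands.append(["minikube", "image", "load", target])
    else:
        # Step 4: export once, copy the archive to every node, then load it there.
        commands.append(["docker", "save", "-o", archive, target])
        commands.append(["docker", "load", "-i", archive])
    return commands


# Hypothetical local image name and tag, for illustration only.
for command in local_image_commands("graphscope/analytical:latest", "my-fix"):
    print(shlex.join(command))
```

Keeping the commands as argument lists means they can later be passed to `subprocess.run` unchanged once the names have been verified.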

sighingnow added a commit that referenced this issue Nov 20, 2023
## What do these changes do?

## Related issue number

Fixes #3364

See also: #3363

Signed-off-by: Tao He <linzhu.ht@alibaba-inc.com>
@siyuan0322
Collaborator

Fixed by #3365. The images should be available by tomorrow.

@JackyYangPassion
Contributor Author

JackyYangPassion commented Nov 22, 2023

coordinator-mvekpb-5f9ffddcdd-4nck8: Failed to pull image "registry.cn-hongkong.aliyuncs.com/graphscope/coordinator:0.26.0a20231121": rpc error: code = Unknown desc = Error response from daemon: manifest for registry.cn-hongkong.aliyuncs.com/graphscope/coordinator:0.26.0a20231121 not found: manifest unknown: manifest unknown
2023-11-22 16:03:20,061 [INFO][utils:187]: coordinator-mvekpb-5f9ffddcdd-4nck8: Error: ErrImagePull
coordinator-mvekpb-5f9ffddcdd-4nck8: Successfully assigned gs-new-orc-jacky2/coordinator-mvekpb-5f9ffddcdd-4nck8 to
coordinator-mvekpb-5f9ffddcdd-4nck8: Pulling image "registry.cn-hongkong.aliyuncs.com/graphscope/coordinator:0.26.0a20231121"
coordinator-mvekpb-5f9ffddcdd-4nck8: Failed to pull image "registry.cn-hongkong.aliyuncs.com/graphscope/coordinator:0.26.0a20231121": rpc error: code = Unknown desc = Error response from daemon: manifest for registry.cn-hongkong.aliyuncs.com/graphscope/coordinator:0.26.0a20231121 not found: manifest unknown: manifest unknown

@JackyYangPassion
Contributor Author

Pulling the pre-release image on the K8s cluster still fails, @siyuan0322.

@siyuan0322
Collaborator

Sorry, the runner ran out of disk space. Rerunning.

@JackyYangPassion
Contributor Author

Thanks for your reply. I will try it tomorrow.

@siyuan0322
Collaborator

This tag is available now: 0.26.0a20231122
