[Bug]: Can't get the sock shop example running due to kubernetes version #188

Open
zahidulhaque opened this issue May 30, 2023 · 9 comments
Labels: bug, feature request

Comments

@zahidulhaque

Contact Details

No response

Tell us the project / group you are associated with

Community (Default)

What happened?

I am trying out the sock shop example by following the getting started doc at https://github.com/vmware-tanzu/graph-framework-for-microservices/blob/main/docs/getting_started/Playground.md.
I have been able to compile the data model, but I am running into issues while installing the Nexus Runtime.

It seems there is a very tight dependency on the k8s version (<=1.26).
Is there a way to download the Kubernetes version that the graph framework expects?

Below is the output of the command:
$ nexus runtime install --namespace default
Error: K8s Version should not be more that 1.26, current Version is 1.27.1
Usage:
  nexus runtime install [flags]

Flags:
      --admin                         Install the Nexus Admin runtime
      --client-id string              client id of the OIDC application. ignored if not --admin runtime
      --client-secret string          client secret of the OIDC application. ignored if not --admin runtime
      --cpuResources stringArray      for configuring cpu resources
  -h, --help                          help for install
      --jwt-claim string              the JWT claim to be used as part of the admin match condition. ignored if not --admin runtime
      --jwt-claim-value string        the JWT claim to be used as part of the admin match condition. ignored if not --admin runtime
      --memoryResources stringArray   for configuring memory resources
  -n, --namespace string              name of the namespace to be created
      --oauth-issuer-url string       OAuth Issuer URL of the identity provider. ignored if not --admin runtime
      --oauth-redirect-url string     OAuth Redirect/Callback URL. ignored if not --admin runtime
      --options stringArray           for configuring additional helm values
  -r, --registry string               Registry where validation webhook and api-gw is located (default "gcr.io/nsx-sm/nexus")
  -s, --secretname string             Registry where validation webhook and api-gw is located
      --skip-bootstrap                skips the bootstrap step (only relevant for admin-runtime)

Global Flags:
      --debug               Enables extra logging
      --list-prereq         List prerequisites
      --skip-prereq-check   Skip prerequisites check

Describe the expected behavior

Nexus runtime should get successfully installed.

What version are you running?

$ nexus version
NexusCli: v0.0.163
NexusCompiler: 8f34b5f
NexusAppTemplates: v0.0.10
NexusDatamodelTemplates: v0.0.25
NexusRuntimeManifets: v0.2.66-cosmos-release-v2

How critical is this bug to you?

Major - important to fix

How can we recreate the bug?

No response

Any debug data that you are able to share?

$ nexus prereq verify
✅ docker docker daemon should be running on the host
✅ go 1.17
✅ kubernetes kubernetes cluster should be reachable via kubectl
❌ kubernetes version verify failed with err: K8s Version should not be more that 1.26, current Version is 1.27.1

$ go version
go version go1.20.4 linux/amd64

$ kubectl version
WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short. Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.3", GitCommit:"9e644106593f3f4aa98f8a84b23db5fa378900bd", GitTreeState:"clean", BuildDate:"2023-03-15T13:40:17Z", GoVersion:"go1.19.7", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.1", GitCommit:"4c9411232e10168d7b050c49a1b59f6df9d7ea4b", GitTreeState:"clean", BuildDate:"2023-05-12T19:03:40Z", GoVersion:"go1.20.3", Compiler:"gc", Platform:"linux/amd64"}

What is your operating system?

None

Any additional / relevant info

No response

zahidulhaque added the bug label May 30, 2023
@ramramu3433
Contributor

New JIRA Created with ID: https://jira.eng.vmware.com/browse/NPT-912

@xmen4xp
Contributor

xmen4xp commented May 30, 2023

Hello @zahidulhaque, thanks for the detailed info.
We have not qualified Nexus for K8s > 1.26 at this point, so we will treat this as a feature request.
We don't anticipate any major issues; it is primarily a resourcing constraint.
I will get back to you with additional info.
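
For readers who want to check a cluster against the 1.26 ceiling before running the installer, here is a minimal sketch of that comparison. The function name `k8s_version_at_most` is hypothetical and this is not the Nexus CLI's own implementation; it only mirrors the rule stated in the error message (major.minor must not exceed 1.26) and assumes a plain `major.minor[.patch]` version string.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Nexus CLI): succeeds when a Kubernetes
# version string such as "1.27.1" or "v1.26.3" is at or below a major.minor
# ceiling. The default ceiling of 1.26 is taken from the error message above.
k8s_version_at_most() {
    version=${1#v}              # strip an optional leading "v"
    ceiling=${2:-1.26}
    major=${version%%.*}        # text before the first dot
    rest=${version#*.}          # text after the first dot
    minor=${rest%%.*}           # minor number without the patch level
    max_major=${ceiling%%.*}
    max_minor=${ceiling#*.}
    [ "$major" -lt "$max_major" ] && return 0
    [ "$major" -eq "$max_major" ] && [ "$minor" -le "$max_minor" ]
}
```

For example, `k8s_version_at_most 1.27.1 || echo "cluster is newer than 1.26"` prints the message for the server version reported in this issue. The server version itself can be read from `kubectl version --output=json`.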

@zahidulhaque
Author

Thanks @ramramu3433 for the feedback. Any idea where I can get the Kubernetes version that Nexus expects?

@xmen4xp
Contributor

xmen4xp commented May 31, 2023

Thanks @ramramu3433 for the feedback. Any idea where I can get the Kubernetes version that Nexus expects?

@zahidulhaque If you are using a kind-based K8s cluster, you can create a cluster running K8s 1.23 with the following command:

kind create cluster --image=kindest/node:v1.23.3

This will be the easiest way to test it locally. Does that work for you?
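
The command above can be wrapped in a small script that also confirms the server version afterwards. This is only a sketch: the function names and the cluster name `nexus-playground` are assumptions for illustration, and not every Kubernetes version necessarily has a published `kindest/node` tag.

```shell
#!/bin/sh
# Hypothetical helper: build the kind node image reference used in the
# comment above (kindest/node:v<version>) from a Kubernetes version.
kind_node_image() {
    printf 'kindest/node:v%s\n' "${1#v}"
}

# Create a kind cluster pinned to that node image, then print the client and
# server versions so the server version can be checked before installing the
# Nexus runtime. The default cluster name is an assumption for illustration.
create_pinned_cluster() {
    k8s_version=$1
    cluster_name=${2:-nexus-playground}
    kind create cluster --name "$cluster_name" \
        --image "$(kind_node_image "$k8s_version")" || return 1
    # kind names the kubeconfig context "kind-<cluster name>".
    kubectl version --context "kind-$cluster_name"
}
```

Running `create_pinned_cluster 1.23.3` would create the cluster with the same image as the command above, under a non-default cluster name.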

@zahidulhaque
Author

zahidulhaque commented May 31, 2023

Ignore my previous comment. I was able to downgrade the Kubernetes version and install the Nexus Runtime.

$ kubectl get svc
NAME                                       TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                              AGE
kubernetes                                 ClusterIP   10.96.0.1       <none>        443/TCP                              106m
nexus-api-gw                               ClusterIP   10.100.16.75    <none>        80/TCP,443/TCP,18000/TCP             105m
nexus-apiserver                            ClusterIP   10.99.32.79     <none>        8080/TCP,6443/TCP                    105m
nexus-etcd                                 ClusterIP   10.104.147.73   <none>        2379/TCP,2380/TCP                    105m
nexus-etcd-headless                        ClusterIP   None            <none>        2379/TCP,2380/TCP                    105m
nexus-graphql                              ClusterIP   10.97.182.43    <none>        8080/TCP                             105m
nexus-ingress-nginx-controller             ClusterIP   10.105.193.91   <none>        80/TCP,443/TCP                       105m
nexus-ingress-nginx-controller-admission   ClusterIP   10.97.236.110   <none>        443/TCP                              105m
nexus-nginx                                ClusterIP   10.104.59.95    <none>        80/TCP                               105m
nexus-proxy                                ClusterIP   10.104.8.238    <none>        80/TCP,10000/TCP,443/TCP,19000/TCP   105m
nexus-proxy-container                      ClusterIP   10.105.133.28   <none>        80/TCP                               105m
nexus-validation                           ClusterIP   10.100.40.227   <none>        443/TCP                              105m

Now I am getting an error while installing the data model. The logs don't give much information:

$ nexus datamodel install image sockshop.com:latest --namespace default --debug

time="2023-05-31T11:14:53+05:30" level=debug msg="Latest available Nexus CLI version: v0.0.163\n"
time="2023-05-31T11:14:53+05:30" level=debug msg="Current Nexus CLI version: v0.0.163\n"
time="2023-05-31T11:19:53+05:30" level=error msg="could not complete datamodel install due to: Datamodel installation job sockshop.com-dmi not be completed due to exit status 1"
Error: Datamodel installation job sockshop.com-dmi not be completed due to exit status 1
Usage:
  nexus datamodel install image [flags]

Flags:
  -h, --help                help for image
  -s, --secretname string   secret to pull images on namespace - needs to be created by user

Global Flags:
      --debug                Enables extra logging
      --graphql-url string   Url where graphql plugin is available if any custom storage is used
      --list-prereq          List prerequisites
  -r, --namespace string     name of the namespace to install to
      --skip-prereq-check    Skip prerequisites check
      --title string         title of the swaggerDocs for rest endpoints

@xmen4xp
Contributor

xmen4xp commented May 31, 2023

The datamodel is installed by a K8s Job. In your case, the name of the K8s Job is: sockshop.com-dmi
The CLI is reporting that the Job did not complete successfully.
Can you share the logs of the K8s Job: kubectl logs job/sockshop.com-dmi
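
When a Job fails like this, the logs alone are often not enough, because a pod that never started has no logs to show. A minimal sketch of the usual kubectl checks follows; the function name `inspect_job` is hypothetical, and the namespace default of `default` is taken from the install command in this thread.

```shell
#!/bin/sh
# Hypothetical helper: gather common diagnostics for a Kubernetes Job.
# Arguments: Job name, then an optional namespace (defaults to "default").
inspect_job() {
    job=$1
    ns=${2:-default}
    # Job status and events, e.g. a reached back-off limit.
    kubectl describe job "$job" --namespace "$ns"
    # Pods created by the Job; the Job controller labels them with job-name.
    kubectl get pods --namespace "$ns" --selector "job-name=$job"
    # Logs from one of the Job's pods, across all of its containers.
    # This step fails if the containers never started.
    kubectl logs "job/$job" --namespace "$ns" --all-containers=true
}
```

For this issue the call would be `inspect_job sockshop.com-dmi default`.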

@xmen4xp
Contributor

xmen4xp commented May 31, 2023

@ramramu3433 can you follow up on this thread during your day if there are any updates? @zahidulhaque is attempting the sockshop-based playground workflow and seems to be running into a failure during datamodel install.

@zahidulhaque
Author

$ kubectl logs sockshop.com-dmi-mqwjp
Defaulted container "datamodel-installer-job" out of: datamodel-installer-job, datamodel-installer-job-graphql-patch, check-nexus-proxy-container (init)
Error from server (BadRequest): container "datamodel-installer-job" in pod "sockshop.com-dmi-mqwjp" is waiting to start: trying and failing to pull image

Here is a list of all pods:

$ kubectl get pods
NAME                                                READY   STATUS             RESTARTS       AGE
nexus-api-dm-obj-installer-7l5jw                    0/1     Completed          0              167m
nexus-api-dm-obj-installer-hd9mv                    0/1     Error              0              173m
nexus-api-dmi-2gk6c                                 0/1     Completed          2              173m
nexus-api-gw-574457d87f-b25x9                       1/1     Running            5 (169m ago)   173m
nexus-connect-controller-7b6864fdbb-bwqtm           1/1     Running            5 (169m ago)   173m
nexus-create-signed-cert-validation-webhook-gbttk   0/1     Completed          0              173m
nexus-etcd-0                                        1/1     Running            0              173m
nexus-etcd-defrag-28091910-dz87k                    0/1     Completed          0              29m
nexus-ingress-nginx-controller-6fbc647889-l78tt     1/1     Running            0              173m
nexus-k8scert-creation-job-z6lnc                    0/1     Completed          0              173m
nexus-kube-apiserver-6945dbb875-ll822               1/1     Running            0              173m
nexus-kube-controllermanager-7648b755d4-gpdns       1/1     Running            0              173m
nexus-nginx-79fb55bc4c-269g8                        1/1     Running            0              173m
nexus-nginx-admsn-default-create-8sbbq              0/1     Completed          0              173m
nexus-nginx-admsn-default-patch-njcj5               0/1     Completed          2              173m
nexus-proxy-6cf5576589-9rm48                        1/1     Running            0              173m
nexus-proxy-container-6c697b9cb9-hvkrp              1/1     Running            0              173m
nexus-validation-6f6cffd68c-rjm48                   1/1     Running            0              173m
nexus-validation-webhook-creation-qwd2c             0/1     Completed          0              173m
sockshop.com-dmi-mqwjp                              0/2     ImagePullBackOff   0              74m
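
In a long pod listing like the one above, the failing pod is easy to miss. As a rough sketch, the pods stuck on an image pull can be filtered out of the listing by their STATUS column; the function name `pods_failing_image_pull` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl get pods --no-headers` output on stdin
# and print the names of pods whose STATUS column (third field) reports an
# image pull failure.
pods_failing_image_pull() {
    awk '$3 == "ImagePullBackOff" || $3 == "ErrImagePull" { print $1 }'
}
```

Typical use would be `kubectl get pods --no-headers | pods_failing_image_pull`, followed by `kubectl describe pod` on each reported name to see which image could not be pulled and why.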

@xmen4xp
Contributor

xmen4xp commented May 31, 2023

@zahidulhaque

sockshop.com-dmi-mqwjp 0/2 ImagePullBackOff 0 74m

Error from server (BadRequest): container "datamodel-installer-job" in pod "sockshop.com-dmi-mqwjp" is waiting to start: trying and failing to pull image

The above errors indicate that the datamodel image you built in the sandbox is not reachable from your K8s cluster.
The image name is: sockshop.com:latest

Upload this image to a container registry that is accessible to your cluster and re-run the command: nexus datamodel install image sockshop.com:latest --namespace default

This image was built from the datamodel on your machine, so no one else has access to it.
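
Uploading the image usually means retagging it for the registry and pushing it. The sketch below uses standard docker commands. The function name and the registry path `registry.example.com/myteam` are placeholders. This thread does not confirm whether the install command then expects the registry-qualified image name, so treat that as an assumption.

```shell
#!/bin/sh
# Hypothetical helper: retag a locally built image for a registry and push it.
# The registry argument (e.g. registry.example.com/myteam) is a placeholder,
# not a real Nexus registry.
push_image_to_registry() {
    local_image=$1                          # e.g. sockshop.com:latest
    registry=$2                             # e.g. registry.example.com/myteam
    remote_image="$registry/$local_image"
    docker tag "$local_image" "$remote_image" || return 1
    docker push "$remote_image" || return 1
    # Print the pushed reference so it can be passed to later commands.
    echo "$remote_image"
}
```

The call `push_image_to_registry sockshop.com:latest registry.example.com/myteam` would push the image and print its registry-qualified name. If the registry is private, the `--secretname` flag of the install command refers to a pull secret that the user has to create.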

Note:

If you are using a kind-based K8s cluster,
i.e. a cluster created with a command like: kind create cluster --image=kindest/node:v1.24.2
then please follow the steps in the "NOTE" section here to load the image inside the kind sandbox and re-run the datamodel install command.
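
The linked NOTE section is not reproduced in this thread. Assuming it describes kind's standard image-loading command, the step can be sketched as follows. The function name `load_image_into_kind` is hypothetical, and the cluster name defaults to `kind`, which is what kind uses when no `--name` is given.

```shell
#!/bin/sh
# Hypothetical helper: copy a locally built image onto the nodes of a kind
# cluster so that pods can use it without pulling from a registry.
# Arguments: image reference, then an optional kind cluster name.
load_image_into_kind() {
    image=$1
    cluster_name=${2:-kind}
    kind load docker-image "$image" --name "$cluster_name"
}
```

After `load_image_into_kind sockshop.com:latest`, the datamodel install command can be re-run. One caveat: Kubernetes defaults `imagePullPolicy` to `Always` for images tagged `:latest`, so a pod may still try to pull from a registry unless its spec sets a different policy. Whether the Nexus install Job does so is not stated in this thread.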
