Running mizu fails silently with no logs #706

Closed
peter-dolkens opened this issue Jan 27, 2022 · 9 comments · Fixed by #776
Labels
CLI Indicates an issue or PR is related to the CLI program. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. stale Denotes an issue or PR has remained open with no activity and has become stale.

Comments

@peter-dolkens

peter-dolkens commented Jan 27, 2022

Describe the bug
A clear and concise description of what the bug is.

To Reproduce
Steps to reproduce the behavior:

  1. Run mizu tap, or any variant of it that does NOT include --set mizu-resources-namespace and --namespaces
  2. See mizu terminate instantly with no logs or error output
  3. Run mizu tap --set mizu-resources-namespace=my-ns --namespaces=my-ns
  4. Mizu executes as expected
  5. Run mizu tap --set mizu-resources-namespace=mizu --namespaces=mizu
  6. See mizu terminate instantly with no logs or error output

I'm admin on this cluster, and have even gone so far as to create the mizu namespace ahead of time after it failed originally.

Expected behavior

The mizu tap command works without the --set mizu-resources-namespace and --namespaces flags.

Logs

WARNING: No zip logs generated, only CLI logs.

Screenshots

n/a

Desktop (please complete the following information):

  • OS: Darwin Kernel Version 20.6.0: Mon Aug 30 06:12:21 PDT 2021; root:xnu-7195.141.6~3/RELEASE_X86_64
  • Web Browser: n/a

Additional context

We do have an OPA policy which enforces a naming scheme on our namespaces unless a certain label is added. As such, I tried adding the mizu namespace manually with the partner: core label.

Additionally, the Mac install instructions don't add mizu to your PATH, so I manually copied it to /usr/local/bin/mizu. The issue also occurred when running ./mizu from my home directory instead of just mizu.

➜  ~ rm -rf .mizu
➜  ~ mizu tap --set mizu-resources-namespace=mizu --namespaces=mizu --set dump-logs=true
Mizu will store up to 200MB of traffic, old traffic will be cleared once the limit is reached.
Tapping pods in namespaces "mizu"
+bash
Waiting for Mizu Agent to start...
➜  ~ cat .mizu/mizu_cli.log
[2022-01-27T02:46:07.633+0000] DEBUG ▶ Checking for newer version... ▶ [57108 versionCheck.go:47 CheckNewerVersion]
[2022-01-27T02:46:07.633+0000] DEBUG ▶ Init config finished
 Final config: {
        "Tap": {
                "UploadIntervalSec": 10,
                "PodRegexStr": ".*",
                "GuiPort": 8899,
                "ProxyHost": "127.0.0.1",
                "Namespaces": [
                        "mizu"
                ],
                "Analysis": false,
                "AllNamespaces": false,
                "PlainTextFilterRegexes": null,
                "IgnoredUserAgents": null,
                "DisableRedaction": false,
                "HumanMaxEntriesDBSize": "200MB",
                "DryRun": false,
                "Workspace": "",
                "EnforcePolicyFile": "",
                "ContractFile": "",
                "AskUploadConfirmation": true,
                "ApiServerResources": {
                        "CpuLimit": "750m",
                        "MemoryLimit": "1Gi",
                        "CpuRequests": "50m",
                        "MemoryRequests": "50Mi"
                },
                "TapperResources": {
                        "CpuLimit": "750m",
                        "MemoryLimit": "1Gi",
                        "CpuRequests": "50m",
                        "MemoryRequests": "50Mi"
                },
                "ServiceMesh": false
        },
        "Version": {
                "DebugInfo": false
        },
        "View": {
                "GuiPort": 8899,
                "Url": ""
        },
        "Logs": {
                "FileStr": ""
        },
        "Auth": {
                "EnvName": "up9.app",
                "Token": ""
        },
        "Config": {
                "Regenerate": false
        },
        "AgentImage": "gcr.io/up9-docker-hub/mizu/main:0.22.0",
        "ImagePullPolicyStr": "Always",
        "MizuResourcesNamespace": "mizu",
        "Telemetry": true,
        "DumpLogs": true,
        "KubeConfigPathStr": "",
        "ConfigFilePath": "/Users/peter.dolkens/.mizu/config.yaml",
        "HeadlessMode": false,
        "LogLevelStr": "INFO",
        "ServiceMap": false,
        "OAS": false
}
 ▶ [57108 config.go:57 InitConfig]
[2022-01-27T02:46:07.633+0000] INFO  ▶ Mizu will store up to 200MB of traffic, old traffic will be cleared once the limit is reached. ▶ [57108 tap.go:82 func8]
[2022-01-27T02:46:07.633+0000] DEBUG ▶ Using kube config /Users/peter.dolkens/.kube/config ▶ [57108 provider.go:1055 loadKubernetesConfiguration]
[2022-01-27T02:46:08.017+0000] DEBUG ▶ successfully reported telemetry for cmd tap ▶ [57108 telemetry.go:36 ReportRun]
[2022-01-27T02:46:08.175+0000] INFO  ▶ Tapping pods in namespaces "mizu" ▶ [57108 tapRunner.go:116 RunMizuTap]
[2022-01-27T02:46:08.310+0000] INFO  ▶ +bash ▶ [57108 tapRunner.go:179 printTappedPodsPreview]
[2022-01-27T02:46:08.310+0000] DEBUG ▶ Finished version validation, github version 0.22.0, current version 0.22.0, took 676.551796ms ▶ [57108 versionCheck.go:95 CheckNewerVersion]
[2022-01-27T02:46:08.310+0000] INFO  ▶ Waiting for Mizu Agent to start... ▶ [57108 tapRunner.go:126 RunMizuTap]
➜  ~ ls .mizu
total 8
-rw-r--r--  1 peter.dolkens  staff  2466 Jan 27 02:46 mizu_cli.log
➜  ~ mizu tap datasync
Mizu will store up to 200MB of traffic, old traffic will be cleared once the limit is reached.
Tapping pods in namespaces "my-namespace"
+datasync-deploy-7d94dc6446-d9h5k
Waiting for Mizu Agent to start...
➜  ~ mizu tap --set mizu-resources-namespace=my-namespace --namespaces=my-namespace datasync
Mizu will store up to 200MB of traffic, old traffic will be cleared once the limit is reached.
Tapping pods in namespaces "my-namespace"
+datasync-deploy-5c65c9868c-c44pb
Waiting for Mizu Agent to start...
Mizu is available at http://localhost:8899
➜  ~ k get ns mizu -o yaml
apiVersion: v1
kind: Namespace
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"Namespace","metadata":{"annotations":{"name":"mizu"},"labels":{"partner":"core"},"name":"mizu"}}
    name: mizu
  creationTimestamp: "2022-01-27T02:04:06Z"
  labels:
    kubernetes.io/metadata.name: mizu
    partner: core
  name: mizu
  resourceVersion: "175991464"
  uid: 078c01bb-9ef6-4f63-98aa-caefed0d2401
spec:
  finalizers:
  - kubernetes
status:
  phase: Active
@RoyIsland
Contributor

RoyIsland commented Jan 27, 2022

Hey @peter-dolkens,
Thanks for reaching out!

Can you please run it again with the --set dump-logs=true flag? The full command should look like mizu tap --set dump-logs=true.
After the execution ends, you should find the logs in ~/.mizu/mizu_logs_**.zip. Can you please attach them here?

If no zip logs are generated, can you please send us the ~/.mizu/mizu_cli.log file?

Thanks!

@RoyIsland RoyIsland self-assigned this Jan 27, 2022
@peter-dolkens
Author

Hi @RoyUP9, I performed those steps already - the CLI log is included in my initial report. No zip log was generated.

@RoyIsland
Contributor

Hey @peter-dolkens,
Thanks for your answer.

Let me explain the difference between the --set mizu-resources-namespace flag and the --namespaces flag:
--set mizu-resources-namespace: declares an existing namespace in which Mizu's resources will be installed. It defaults to mizu, in which case Mizu creates a namespace named mizu and creates the resources there.
--namespaces: declares the namespaces whose pods will be recorded (it has no effect on the creation of Mizu's resources).

When the --set mizu-resources-namespace value is mizu, we try to create a namespace named mizu and create all of Mizu's resources there, which will fail in your case: in step 1 it fails because of the OPA policy, and in step 5 it fails because the namespace already exists.

I would suggest using the --set mizu-resources-namespace flag with a different existing namespace, like you did in step 3 (it can be any existing namespace except mizu).

Thanks!

@peter-dolkens
Author

Excellent - would be great if we could see a log message if it fails because it already exists.

Even better - it would be great if it worked even if the namespace already exists!
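
As an illustration of the behavior requested here, the following is a minimal sketch of a "create, but tolerate an existing namespace" helper. It is not Mizu's actual code: `NamespaceCreator`, `ErrAlreadyExists`, `EnsureNamespace`, and `fakeCluster` are hypothetical names invented for this example, and the cluster is simulated in memory. With a real Kubernetes client, the "already exists" check would typically be done with `apierrors.IsAlreadyExists` from `k8s.io/apimachinery/pkg/api/errors` rather than a local sentinel.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrAlreadyExists is a hypothetical sentinel standing in for the
// "already exists" error a Kubernetes client would return.
var ErrAlreadyExists = errors.New("namespace already exists")

// NamespaceCreator is a hypothetical interface, not part of Mizu or client-go.
type NamespaceCreator interface {
	CreateNamespace(name string) error
}

// EnsureNamespace tries to create the namespace and treats "already exists"
// as success. Any other failure is wrapped and returned so that the caller
// can print it instead of exiting silently.
func EnsureNamespace(c NamespaceCreator, name string) (created bool, err error) {
	err = c.CreateNamespace(name)
	switch {
	case err == nil:
		return true, nil
	case errors.Is(err, ErrAlreadyExists):
		return false, nil
	default:
		return false, fmt.Errorf("creating namespace %q: %w", name, err)
	}
}

// fakeCluster is an in-memory stand-in for a cluster, for illustration only.
type fakeCluster struct {
	existing map[string]bool
	denied   map[string]bool // names rejected, e.g. by an admission policy
}

func (f *fakeCluster) CreateNamespace(name string) error {
	if f.denied[name] {
		return errors.New("admission webhook denied the request")
	}
	if f.existing[name] {
		return ErrAlreadyExists
	}
	f.existing[name] = true
	return nil
}

func main() {
	cluster := &fakeCluster{
		existing: map[string]bool{"mizu": true},
		denied:   map[string]bool{"blocked": true},
	}
	for _, name := range []string{"mizu", "fresh", "blocked"} {
		created, err := EnsureNamespace(cluster, name)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("namespace %q ready (created=%v)\n", name, created)
	}
}
```

The design choice in this sketch is to distinguish three outcomes (created, already present, genuinely failed) so that only the last one stops the run, and it always does so with a message.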

For now I'll look at adding an exception to our OPA policy

Thanks

@nimrod-up9
Contributor

Mizu should have printed an error message. We will look into that.

Besides that, did you manage to get Mizu running?
If adding an exception to the OPA policy is not possible or is undesired, could you achieve the same effect by adding a custom ClusterRole? It is possible to add custom ClusterRole and ClusterRoleBindings to Mizu. See https://github.com/up9inc/mizu/blob/main/docs/PERMISSIONS.md for instructions.

@mertyildiran mertyildiran added CLI Indicates an issue or PR is related to the CLI program. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Feb 5, 2022
@nimrod-up9
Contributor

Managed to reproduce the bug by running mizu as a user who has no permissions to create k8s resources.
The expected behavior when Mizu is missing some permissions is to print out the missing permissions and quit.
Instead, Mizu quits without warning.
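
For illustration, the expected "print the missing permissions and quit" behavior could be sketched roughly as below. This is not the code from the linked pull request: `Permission`, `AccessChecker`, and `Preflight` are hypothetical names, the list of required permissions is made up, and the access check is a plain function. In a real client it might be backed by a Kubernetes SelfSubjectAccessReview.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// Permission is a hypothetical verb/resource pair, e.g. "create namespaces".
type Permission struct {
	Verb     string
	Resource string
}

// AccessChecker reports whether the current user holds a permission.
// It is a hypothetical abstraction used only in this sketch.
type AccessChecker func(p Permission) bool

// Preflight returns an error naming every required permission the user lacks,
// or nil when all of them are present.
func Preflight(required []Permission, allowed AccessChecker) error {
	var missing []string
	for _, p := range required {
		if !allowed(p) {
			missing = append(missing, p.Verb+" "+p.Resource)
		}
	}
	if len(missing) > 0 {
		return fmt.Errorf("missing permissions: %s", strings.Join(missing, ", "))
	}
	return nil
}

func main() {
	// Assumed list of requirements, for illustration only.
	required := []Permission{
		{"create", "namespaces"},
		{"create", "pods"},
		{"list", "pods"},
	}
	// Simulates a read-only user.
	readOnly := func(p Permission) bool { return p.Verb == "list" || p.Verb == "get" }

	if err := Preflight(required, readOnly); err != nil {
		fmt.Fprintln(os.Stderr, "Error:", err)
		// A real CLI would exit with a non-zero status here; the sketch just returns.
		return
	}
	fmt.Println("all required permissions present")
}
```

Collecting every missing permission before reporting, rather than stopping at the first one, lets the user fix their role in one pass.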

@IgorGov IgorGov linked a pull request Feb 9, 2022 that will close this issue
@nimrod-up9 nimrod-up9 self-assigned this Feb 9, 2022
@nimrod-up9
Contributor

Pushed a commit that should make Mizu print out the errors again.
@peter-dolkens, can you try running with version 26.0-dev15 or newer to confirm the fix?

@peter-dolkens
Author

Sorry, I can confirm that after we added an exception to the policy controller to allow the mizu namespace, everything works as expected.

I can try recreating the problem in a bit to see if an error is printed in this use case (for us, I've got permissions to create resources, but there's an admission controller enforcing a naming policy on those resources)

@gadotroee gadotroee added the stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 18, 2022
@github-actions

github-actions bot commented Mar 5, 2022

This issue was closed because it has been inactive for 14 days since being marked as stale.

@github-actions github-actions bot closed this as completed Mar 5, 2022
5 participants