Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Marathon fails to properly escape command health checks #3333

Closed
lloesche opened this issue Feb 24, 2016 · 4 comments
Closed

Marathon fails to properly escape command health checks #3333

lloesche opened this issue Feb 24, 2016 · 4 comments
Assignees
Milestone

Comments

@lloesche
Copy link
Contributor

If I want to execute a command health check like

/bin/bash -c "</dev/tcp/$HOST/$PORT0"

using the following definition

    {
      "protocol": "COMMAND",
      "command": { "value": "/bin/bash -c \"</dev/tcp/$HOST/$PORT0\"" }
    }

the health check will fail even though it is a perfectly valid command when executed on the system.

The reason being that Mesos wrapps the command health check in another sh -c "" and Marathon fails to properly escape the quotes in the configured command.

To successfully run the check above I have to use a config like this:

  {
    "protocol": "COMMAND",
    "command": { "value": "/bin/bash -c \\\"</dev/tcp/$HOST/$PORT0\\\"" }
  }

This means a Marathon user would have to have intimate knowledge of how Mesos further processes his command. I would expect Marathon to accept any valid shell command and properly escape it for further processing.

By escape I don't mean shell escape because then you can't do pipes and redirection and command concatenation. Just replace " with \" in the user provided command before handing it off to Mesos.

@aquamatthias aquamatthias added this to the 0.16.0 milestone Feb 25, 2016
@lxwbr lxwbr self-assigned this Mar 7, 2016
@aquamatthias
Copy link
Contributor

@bbannier @lloesche Marathon just passes the command string directly to Mesos. Escaping should be done on the Mesos side.
See:
https://issues.apache.org/jira/browse/MESOS-4812
https://mesosphere.atlassian.net/browse/CORE-265

@haosdent
Copy link

Hi, do you use MesosContainerizer or DockerContainerizer? @lloesche

As I checked by HealthCheckTest.HealthyTaskShellEscape

All below ways could pass:

  // First way.
  command.set_shell(true);
  command.set_value("bash -c \"</dev/tcp/$HOST/$PORT0\"");
  // Second way.
  command.set_shell(true);
  command.set_value("</dev/tcp/$HOST/$PORT0");
  // Third way.
  command.set_shell(false);
  command.set_value("bash");
  command.add_arguments("bash");
  command.add_arguments("-c");
  command.add_arguments("</dev/tcp/$HOST/$PORT0");

And I am going to set up a marathon to test e2e.

@haosdent
Copy link

My marathon task json

{
  "app": {
    "id": "/test-health",
    "cmd": "sleep 200",
    "args": null,
    "user": null,
    "env": {},
    "instances": 1,
    "cpus": 1,
    "mem": 128,
    "disk": 0,
    "executor": "",
    "constraints": [],
    "uris": [],
    "fetch": [],
    "storeUrls": [],
    "ports": [
      10000
    ],
    "portDefinitions": [
      {
        "port": 10000,
        "protocol": "tcp",
        "labels": {}
      }
    ],
    "requirePorts": false,
    "backoffSeconds": 1,
    "backoffFactor": 1.15,
    "maxLaunchDelaySeconds": 3600,
    "container": null,
    "healthChecks": [
      {
        "protocol": "COMMAND",
        "command": {
          "value": "bash -c \"</dev/tcp/www.google.com/80\""
        },
        "gracePeriodSeconds": 300,
        "intervalSeconds": 60,
        "timeoutSeconds": 20,
        "maxConsecutiveFailures": 3,
        "ignoreHttp1xx": false
      }
    ],
    "readinessChecks": [],
    "dependencies": [],
    "upgradeStrategy": {
      "minimumHealthCapacity": 1,
      "maximumOverCapacity": 1
    },
    "labels": {},
    "acceptedResourceRoles": null,
    "ipAddress": null,
    "version": "2016-04-17T06:24:40.457Z",
    "residency": null,
    "versionInfo": {
      "lastScalingAt": "2016-04-17T06:24:40.457Z",
      "lastConfigChangeAt": "2016-04-17T06:24:40.457Z"
    },
    "tasksStaged": 0,
    "tasksRunning": 1,
    "tasksHealthy": 1,
    "tasksUnhealthy": 0,
    "deployments": [],
    "tasks": [
      {
        "id": "test-health.1108b4a6-0465-11e6-9af4-0242d20a0294",
        "slaveId": "7f8c8ce6-45b2-4e63-a5c4-a88e08af12ed-S0",
        "host": "127.0.0.1",
        "startedAt": "2016-04-17T06:24:40.770Z",
        "stagedAt": "2016-04-17T06:24:40.537Z",
        "ports": [
          31619
        ],
        "version": "2016-04-17T06:24:40.457Z",
        "ipAddresses": [
          {
            "ipAddress": "127.0.0.1",
            "protocol": "IPv4"
          }
        ],
        "appId": "/test-health",
        "healthCheckResults": [
          {
            "alive": true,
            "consecutiveFailures": 0,
            "firstSuccess": "2016-04-17T06:24:40.924Z",
            "lastFailure": null,
            "lastSuccess": "2016-04-17T06:24:40.924Z",
            "lastFailureCause": null,
            "taskId": "test-health.1108b4a6-0465-11e6-9af4-0242d20a0294"
          }
        ]
      }
    ]
  }
}

Which could pass health check in Mesos

Registered executor on 127.0.0.1
Starting task test-health.1108b4a6-0465-11e6-9af4-0242d20a0294
sh -c 'sleep 200'
Forked command at 27046
Launching health check process: /home/haosdent/mesos/build/src/mesos-health-check --executor=(1)@127.0.0.1:39822 --health_check_json={"command":{"shell":true,"value":"bash -c \"<\/dev\/tcp\/www.google.com\/80\""},"consecutive_failures":3,"delay_seconds":0.0,"grace_period_seconds":300.0,"interval_seconds":60.0,"timeout_seconds":20.0} --task_id=test-health.1108b4a6-0465-11e6-9af4-0242d20a0294
Health check process launched at pid: 27047
Received task health update, healthy: true

@haosdent
Copy link

By the way, I noticed marathon didn't set env for health check command. https://github.com/mesosphere/marathon/blob/master/src/main/scala/mesosphere/marathon/health/HealthCheck.scala#L89

        MesosProtos.HealthCheck.newBuilder
          .setCommand(this.command.get.toProto)

@mesosphere mesosphere locked and limited conversation to collaborators Mar 27, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants