Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable GCM and GCL instead of InfluxDB on GCE #7751

Merged
merged 1 commit into from May 6, 2015

Conversation

saad-ali
Copy link
Member

@saad-ali saad-ali commented May 5, 2015

  • Enabled Google Cloud Monitoring (GCM) and Google Cloud Logging (GCL) instead of InfluxDB for Google Compute Engine (GCE) deployments.
  • Modified cluster addon setup process to take JSON files in addition to YAML
  • Converted all Heapster Kubernetes config files to JSON
  • Get rid of all Heapster v1beta1 configs

CC @vishh

@vishh
Copy link
Contributor

vishh commented May 5, 2015

cc @zmerlynn for salt config changes.

@saad-ali
Copy link
Member Author

saad-ali commented May 5, 2015

E2E is green:

Ran 38 of 43 Specs in 1820.331 seconds
SUCCESS! -- 38 Passed | 0 Failed | 1 Pending | 4 Skipped I0504 23:12:37.766365   23115 driver.go:96] All tests pass

"ports": [
{
"port": 80,
"targetPort": 8082
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the heapster rc pod template expose this port?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. The heapster service is not needed, I will remove it from both influxdb and google.

@a-robinson
Copy link
Contributor

What's the motivation for switching to JSON? Just a few days ago @bgrant0607 wrote that "We should just use YAML for examples. It's generally more user-friendly than JSON, and supports comments."

"env": [
{
"name": "FLAGS",
"value": "--poll_duration=2m --stats_resolution=1m --sink influxdb:http://$MONITORING_INFLUXDB_SERVICE_HOST:$MONITORING_INFLUXDB_PORT_8086_TCP_PORT"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why use env variables? Why not use DNS? This could be http://monitoring-influxdb:80.
The advantage with dns is that all these services can be started in parallel.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the idea of setting the port to a constant, I prefer to have it dynamically picked up--so if the service is ever modified and the ports are changed, this will continue to work.

I don't see why this changes whether the sinks can be started in parallel.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The env variables won't be set for heapster pod unless the influxdb service has been created. Since we start all the monitoring services together, there is no guarantee that that influxdb service will be created before heapster.
Support for multi-port DNS is on the way. until then we can hardcode port 80. WDYT?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that is the case, there would be no guarantee that Kube-DNS would resolve monitoring-influxdb correctly--i.e. if heapster pod gets created before the heapster service, Kube-DNS would not have a entry for monitoring-influxdb--we would end up with the same problem.

With the env variables approach, if the pod comes up before the service, and the variables are not set, heapster will terminate with error--and Kubernetes will create a new container which should have the variables set.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The heapster pod is supposed to handle backend failures. So if the dns name
does not resolve, it will retry internally before deciding to crash/fail.
Crashing without retires might confuse users.

On Tue, May 5, 2015 at 2:30 PM, Saad Ali notifications@github.com wrote:

In cluster/addons/cluster-monitoring/influxdb/heapster-controller.json
#7751 (comment)
:

  • "template": {
  •  "metadata": {
    
  •    "labels": {
    
  •      "name": "heapster",
    
  •      "kubernetes.io/cluster-service": "true"
    
  •    }
    
  •  },
    
  •  "spec": {
    
  •    "containers": [
    
  •      {
    
  •        "image": "gcr.io/google_containers/heapster:v0.11.0",
    
  •        "name": "heapster",
    
  •        "env": [
    
  •          {
    
  •            "name": "FLAGS",
    
  •            "value": "--poll_duration=2m --stats_resolution=1m --sink influxdb:http://$MONITORING_INFLUXDB_SERVICE_HOST:$MONITORING_INFLUXDB_PORT_8086_TCP_PORT"
    

If that is the case, there would be no guarantee that Kube-DNS would
resolve monitoring-influxdb correctly--i.e. if heapster pod gets created
before the heapster service, Kube-DNS would not have a entry for
monitoring-influxdb--we would end up with the same problem.

With the env variables approach, if the pod comes up before the service,
and the variables are not set, heapster will terminate with error--and
Kubernetes will create a new container which should have the variables set.


Reply to this email directly or view it on GitHub
https://github.com/GoogleCloudPlatform/kubernetes/pull/7751/files#r29715109
.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's reasonable. I'll go ahead and make the change.

@saad-ali saad-ali force-pushed the updateHeapsterConfigs branch 2 times, most recently from 2cbdfc3 to 6811387 Compare May 5, 2015 22:45
@saad-ali
Copy link
Member Author

saad-ali commented May 5, 2015

PTAL.

Regarding JSON: There should be no reason that the cluster addons mechanism doesn't support both JSON and YAML. And it is good to have examples of both in our code. The change was suggested by @vishh -- his argument, as I understood it, was that YAML would internally be translated in to JSON anyway. That said, I don't feel too strongly about it and would be willing to change it back to YAML.

Rerunning E2E for verification.

@bgrant0607
Copy link
Member

See #7257

@a-robinson
Copy link
Contributor

I agree that the mechanism should support both, just wondering whether we should be writing any in JSON anymore. I'll defer to @bgrant0607 - do you care whether these cluster addon configs are in YAML vs JSON?

@saad-ali
Copy link
Member Author

saad-ali commented May 5, 2015

FYI, E2E is green:

Ran 38 of 43 Specs in 1687.865 seconds
SUCCESS! -- 38 Passed | 0 Failed | 1 Pending | 4 Skipped I0505 16:13:05.378063   24215 driver.go:96] All tests pass

@bgrant0607
Copy link
Member

Let's please leave the configs in YAML.

@saad-ali saad-ali force-pushed the updateHeapsterConfigs branch 2 times, most recently from 4cf8ccd to 6588486 Compare May 6, 2015 05:35
@saad-ali
Copy link
Member Author

saad-ali commented May 6, 2015

PTAL Switched to YAML

E2E is clean:

Ran 38 of 43 Specs in 1653.461 seconds
SUCCESS! -- 38 Passed | 0 Failed | 1 Pending | 4 Skipped I0505 22:08:03.650021   24184 driver.go:96] All tests pass

@@ -0,0 +1,32 @@
apiVersion: "v1beta3"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need to quote any of the strings in the yaml files (except perhaps if the string is "true", judging by the deleted files).

@a-robinson
Copy link
Contributor

I'm sorry for asking you to change this again, but can you un-quote all the strings in the yaml files?

@saad-ali
Copy link
Member Author

saad-ali commented May 6, 2015

No worries. Quotes removed. PTAL

Re-confirmed E2E:

Ran 38 of 43 Specs in 1800.775 seconds
SUCCESS! -- 38 Passed | 0 Failed | 1 Pending | 4 Skipped I0506 14:54:35.184922   29922 driver.go:96] All tests pass

# none - No cluster monitoring setup
# influxdb - Heapster, InfluxDB, and Grafana
# google - Heapster, Google Cloud Metrics, and Google Cloud Logging
ENABLE_CLUSTER_MONITORING="${KUBE_ENABLE_CLUSTER_MONITORING:-influxdb}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally we want to use google backend here as well. We need to update the monitoring e2e tests to not to depend on the default cluster addons before switching to google backend.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, we can do that separately.

@vishh
Copy link
Contributor

vishh commented May 6, 2015

LGTM

# Optional: Cluster monitoring to setup as part of the cluster bring up:
# none - No cluster monitoring setup
# influxdb - Heapster, InfluxDB, and Grafana
# google - Heapster, Google Cloud Metrics, and Google Cloud Logging
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Google Cloud Monitoring, not Google Cloud Metrics

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed.

# Optional: Cluster monitoring to setup as part of the cluster bring up:
# none - No cluster monitoring setup
# influxdb - Heapster, InfluxDB, and Grafana
# google - Heapster, Google Cloud Metrics, and Google Cloud Logging
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed.

Influxdb for Google Compute Engine deployments.
@a-robinson
Copy link
Contributor

Thanks! LGTM.

a-robinson added a commit that referenced this pull request May 6, 2015
Enable GCM and GCL instead of InfluxDB on GCE
@a-robinson a-robinson merged commit 0909cce into kubernetes:master May 6, 2015
@saad-ali saad-ali deleted the updateHeapsterConfigs branch May 6, 2015 23:58
@piosz
Copy link
Member

piosz commented May 13, 2015

cc @jszczepkowski

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants