autoscaling with kubernetes components running outside of docker #18652

Closed
mzupan opened this Issue Dec 14, 2015 · 15 comments

@mzupan (Contributor) commented Dec 14, 2015

I have all the Kubernetes components for the masters/nodes running outside of Docker, directly as processes on the hosts.

I have Heapster working, but autoscaling isn't picking it up; I think it expects everything to be running in the kube-system namespace inside Kubernetes.

Is there a way to make autoscaling work with components running outside of Kubernetes?

Thanks

@mzupan (Contributor) commented Dec 14, 2015
So it looks like it's hard-coded here:

https://github.com/kubernetes/kubernetes/blob/master/cmd/kube-controller-manager/app/controllermanager.go#L363

Is there any way to get around this, like using /etc/hosts to trick the controller into hitting Heapster somewhere outside of kube-system?
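
For context: because that path is hard-coded, the controller looks Heapster up through the API server proxy as a Service named heapster in the kube-system namespace, not via host DNS, so an /etc/hosts entry on the controller host is unlikely to help. A quick sanity check that the expected target exists (assuming kubectl is pointed at this cluster):

    # does the Service the controller expects actually exist?
    kubectl get svc heapster --namespace=kube-system

    # the controller effectively queries the apiserver proxy path:
    #   /api/v1/proxy/namespaces/kube-system/services/heapster/...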


@vishh (Member) commented Dec 14, 2015

@jszczepkowski

@jszczepkowski (Contributor) commented Dec 15, 2015

Pod autoscaling will only be possible if metrics for the pods are visible in Heapster.

I guess in the described case the metrics will not be exposed; @mwielgus, please correct me if I'm wrong.


@mzupan (Contributor) commented Dec 15, 2015

So the metrics are in Heapster for the replication controller I set up for the autoscaler. The issue is that Kubernetes doesn't know where Heapster is, so it stays in a <waiting> state.


mzupan closed this Dec 15, 2015

mzupan reopened this Dec 15, 2015

@vishh (Member) commented Dec 15, 2015

Mike, can you set up a heapster Service in the kube-system namespace with Heapster's IP explicitly specified?
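
One way to do that is a selector-less Service plus a manually created Endpoints object, so the in-cluster name kube-system/heapster points at the out-of-cluster Heapster process. A minimal sketch; the IP below is a placeholder for the host where Heapster actually listens, and 8082 is Heapster's usual default port:

    apiVersion: v1
    kind: Service
    metadata:
      name: heapster
      namespace: kube-system
    spec:
      ports:
        - port: 8082            # no selector, so traffic goes to the manual endpoints below
    ---
    apiVersion: v1
    kind: Endpoints
    metadata:
      name: heapster            # must match the Service name
      namespace: kube-system
    subsets:
      - addresses:
          - ip: 10.122.0.30     # placeholder: host running Heapster outside the cluster
        ports:
          - port: 8082          # placeholder: Heapster's listen port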


@mzupan (Contributor) commented Dec 15, 2015

@vishh I've tried that too, and I see no traffic coming out of k8s on port 80/443 or 8082.


@mzupan (Contributor) commented Dec 16, 2015

Here is the command I'm running:

tcpdump dst port 80 or dst port 443 or dst port 8082

I don't see any connections going out even if I keep it open for 10 minutes.
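
Worth noting: the controller's Heapster query goes through the API server rather than directly to port 8082, and it may not be issued at all if the autoscaler bails out earlier in its reconcile loop, so an empty capture on those ports doesn't pin down where things stop. Raising the controller-manager's log verbosity tends to be more informative than tcpdump here; in a setup like this one, where the components run as host processes, that might look like:

    # restart the controller-manager with verbose logging (existing flags elided)
    kube-controller-manager <existing flags...> --v=5

    # then watch its log for horizontal pod autoscaler reconcile warnings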


@vishh (Member) commented Dec 17, 2015

Maybe @bprashanth can help you with the network issues.


@mzupan (Contributor) commented Dec 23, 2015

Anything on this? I'm trying to run things inside the kube-system namespace, and I'm not sure whether it's forcing a specific location for a certain service or what.


@mzupan (Contributor) commented Dec 24, 2015

So, running the kube-up example in AWS and tcpdumping, I found the controller hits (I believe):

X-Forwarded-Uri: /api/v1/proxy/namespaces/kube-system/services/heapster/api/v1/model/namespaces/default/pod-list/my-nginx-iy3vx/metrics/cpu-usage

So I have Heapster running in kube-system with a Service, and on my other cluster I can hit the following URL:

http://10.122.0.20:8080/api/v1/proxy/namespaces/kube-system/services/heapster/api/v1/model/namespaces/default/pods/my-nginx-5k13l/stats

Stats are displayed, but I still get nothing for:

[root@kube-master ~] [dev] # kubectl get hpa
NAME       REFERENCE                              TARGET    CURRENT     MINPODS   MAXPODS   AGE
my-nginx   ReplicationController/my-nginx/scale   80%       <waiting>   1         5         19h
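
Given the X-Forwarded-Uri above, the exact query the controller issues can be reproduced by hand through the apiserver; the host, port, and pod name below are just the ones already mentioned in this thread, so substitute your own:

    curl http://10.122.0.20:8080/api/v1/proxy/namespaces/kube-system/services/heapster/api/v1/model/namespaces/default/pod-list/my-nginx-5k13l/metrics/cpu-usage

If that returns CPU samples, Heapster itself is reachable from the apiserver and the problem is further along in the HPA controller.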
@mzupan (Contributor) commented Dec 24, 2015

I enabled --v=5 on the controller, and I'm getting:

W1224 18:23:13.317083       1 horizontal.go:185] Failed to reconcile my-nginx: failed to compute desired number of replicas based on CPU utilization for ReplicationController/default/my-nginx: failed to get cpu utilization: failed to get CPU consumption and request: some pods do not have request for cpu
W1224 18:23:43.325812       1 horizontal.go:185] Failed to reconcile my-nginx: failed to compute desired number of replicas based on CPU utilization for ReplicationController/default/my-nginx: failed to get cpu utilization: failed to get CPU consumption and request: some pods do not have request for cpu
W1224 18:24:13.334022       1 horizontal.go:185] Failed to reconcile my-nginx: failed to compute desired number of replicas based on CPU utilization for ReplicationController/default/my-nginx: failed to get cpu utilization: failed to get CPU consumption and request: some pods do not have request for cpu
W1224 18:24:43.353948       1 horizontal.go:185] Failed to reconcile my-nginx: failed to compute desired number of replicas based on CPU utilization for ReplicationController/default/my-nginx: failed to get cpu utilization: failed to get CPU consumption and request: some pods do not have request for cpu
W1224 18:25:13.363546       1 horizontal.go:185] Failed to reconcile my-nginx: failed to compute desired number of replicas based on CPU utilization for ReplicationController/default/my-nginx: failed to get cpu utilization: failed to get CPU consumption and request: some pods do not have request for cpu
W1224 18:25:43.374869       1 horizontal.go:185] Failed to reconcile my-nginx: failed to compute desired number of replicas based on CPU utilization for ReplicationController/default/my-nginx: failed to get cpu utilization: failed to get CPU consumption and request: some pods do not have request for cpu
W1224 18:26:13.381464       1 horizontal.go:185] Failed to reconcile my-nginx: failed to compute desired number of replicas based on CPU utilization for ReplicationController/default/my-nginx: failed to get cpu utilization: failed to get CPU consumption and request: some pods do not have request for cpu
W1224 18:26:43.388174       1 horizontal.go:185] Failed to reconcile my-nginx: failed to compute desired number of replicas based on CPU utilization for ReplicationController/default/my-nginx: failed to get cpu utilization: failed to get CPU consumption and request: some pods do not have request for cpu
W1224 18:27:13.394663       1 horizontal.go:185] Failed to reconcile my-nginx: failed to compute desired number of replicas based on CPU utilization for ReplicationController/default/my-nginx: failed to get cpu utilization: failed to get CPU consumption and request: some pods do not have request for cpu
W1224 18:27:43.425126       1 horizontal.go:185] Failed to reconcile my-nginx: failed to compute desired number of replicas based on CPU utilization for ReplicationController/default/my-nginx: failed to get cpu utilization: failed to get CPU consumption and request: some pods do not have request for cpu
@mzupan (Contributor) commented Dec 28, 2015

So I loaded a Grafana pod to look at InfluxDB, and I don't see any pod stats; the dropdowns are empty, but cluster stats for all the k8s nodes are there.

[screenshot: Grafana dashboard, 2015-12-28 10:31:48 AM]


@mzupan (Contributor) commented Dec 30, 2015

So I'm now trying to debug the code, around this line:

https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/podautoscaler/metrics/metrics_client.go#L133

If I add the following above it:

glog.Errorf("container info: %v", container.Resources.Requests)

I get the following:

E1230 05:38:12.224275    3804 metrics_client.go:139] container info: map[]

Going to dig into why that is empty.


@mzupan (Contributor) commented Jan 4, 2016

OK, it turns out all I needed to add was:

        resources:
          requests:
            cpu: 400m

Then it started to work. Not sure why I don't have to do that in the kube-up setup.
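
This matches the warnings in the logs above: the HPA computes utilization as a percentage of each container's CPU request, so every container in the target pods needs resources.requests.cpu set. A minimal sketch of where the block goes in a replication controller spec (the names, image, and 400m value are illustrative, not prescribed):

    apiVersion: v1
    kind: ReplicationController
    metadata:
      name: my-nginx
    spec:
      replicas: 1
      template:
        metadata:
          labels:
            app: my-nginx
        spec:
          containers:
            - name: my-nginx
              image: nginx
              resources:
                requests:
                  cpu: 400m   # required on every container for CPU-based autoscaling

As for why kube-up clusters don't need this: those clusters typically install a LimitRange in the default namespace that injects a default CPU request into pods that don't set one, which would explain the difference, though that's an assumption about this particular setup.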


mzupan closed this Jan 4, 2016

@evilfrog commented Apr 3, 2017

I had one pod running two containers (one doing nothing, just providing code).

It turned out I needed to specify "requests" on both of the containers.
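
In the two-container case, that means a request on each container, for example (container names, images, and values are placeholders):

    containers:
      - name: app
        image: my-app-image        # placeholder
        resources:
          requests:
            cpu: 100m
      - name: code-provider        # the otherwise idle container still needs a request
        image: my-code-image       # placeholder
        resources:
          requests:
            cpu: 10m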


openshift-publish-robot pushed a commit to openshift/kubernetes that referenced this issue Mar 22, 2018

Merge pull request #18652 from deads2k/hypershift-02-openshift-apiserver
Automatic merge from submit-queue (batch tested with PRs 18778, 18709, 18876, 18897, 18652).

Run openshift aggregated behind kube

This is ready for review.  I've split some bits out into separate pulls to make it easier to review.  It separates openshift-apiserver and kube-apiserver and successfully produces a working cluster by aggregating openshift-apiserver using cluster up (on my machine).

Origin-commit: 6160a5e56ceffef0bd24933c79a78851045bafa5
