Wait for port to be available #90

ravisantoshgudimetla
Contributor

Add an init container that waits for port 10259 to become available before reporting an error.

/cc @deads2k @sjenning @sttts

@openshift-ci-robot added the size/S and approved labels Apr 8, 2019
@ravisantoshgudimetla
Contributor Author

/test e2e-aws

@sjenning
Contributor

xref https://bugzilla.redhat.com/show_bug.cgi?id=1698251

That bug shows port 10251 not 10259. Just want to make sure we are doing the right thing.

@ravisantoshgudimetla
Contributor Author

https://github.com/openshift/cluster-kube-scheduler-operator/pull/88/files disabled the insecure port (10251) and moved to the secure port (10259), which is the default secure port (https://github.com/openshift/cluster-kube-scheduler-operator/pull/77/files).

@wking
Member

wking commented Apr 10, 2019

Also related to the port switch: openshift/installer#1576

@sjenning
Contributor

just checking, thanks!
/lgtm
/retest

@openshift-ci-robot added the lgtm label Apr 10, 2019
- name: wait-for-host-port
  image: ${IMAGE}
  imagePullPolicy: IfNotPresent
  command: ['/usr/bin/timeout', '105', '/bin/bash', '-c'] # a bit more than 60s for graceful termination + 35s for minimum-termination-duration, 5s extra cri-o's graceful termination period
Contributor

this is far too much for kube-scheduler. It does not hurt much, but still. At least the comment is wrong :)
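
For readers unfamiliar with the pattern, the bounded wait under discussion can be sketched roughly as follows. This is a minimal illustration in Python, not the shell script the init container actually runs (that script is not shown in this thread). It assumes that "available" means nothing is still accepting TCP connections on the port; the names `port_in_use` and `wait_for_port_free` are hypothetical.

```python
import socket
import time


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


def wait_for_port_free(port: int, timeout_s: float, poll_s: float = 1.0,
                       host: str = "127.0.0.1") -> bool:
    """Poll until nothing listens on the port.

    Returns True once the port is free, or False if the deadline passes
    first (the role played by /usr/bin/timeout in the quoted command).
    """
    deadline = time.monotonic() + timeout_s
    while port_in_use(port, host):
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
    return True
```

Called as `wait_for_port_free(10259, timeout_s=105)`, this mirrors the 105-second bound in the quoted `timeout` command. A shorter bound, as the review comment suggests, would only change the `timeout_s` argument.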

@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@ravisantoshgudimetla
Contributor Author

/hold

This has been failing continuously. Perhaps there is a timing issue, which could be related to what Stefan pointed out. I need to debug this.

@openshift-ci-robot added the do-not-merge/hold label Apr 10, 2019
@ravisantoshgudimetla
Contributor Author

/retest

@openshift-ci-robot removed the lgtm label Apr 11, 2019
@mfojtik
Member

mfojtik commented Apr 12, 2019

openshift/origin#22543 might fix the port.

@ravisantoshgudimetla
Contributor Author

/retest

@ravisantoshgudimetla
Contributor Author

/retest

@ravisantoshgudimetla
Contributor Author

/retest

@openshift-ci-robot added the needs-rebase, kind/api-change, and size/XXL labels and removed the approved and size/S labels Apr 13, 2019
@openshift-ci-robot added the approved label and removed the needs-rebase label Apr 13, 2019
@ravisantoshgudimetla
Contributor Author

/retest

@wking
Member

wking commented Apr 13, 2019

bootkube.service:

Apr 13 13:01:10 ip-10-0-10-192 bootkube.sh[8748]:         Pod Status:openshift-cluster-version/cluster-version-operator        Ready
...
Apr 13 13:06:15 ip-10-0-10-192 bootkube.sh[8748]:         Pod Status:          kube-apiserver        Ready
Apr 13 13:06:15 ip-10-0-10-192 bootkube.sh[8748]:         Pod Status:openshift-kube-scheduler/openshift-kube-scheduler        Pending
Apr 13 13:06:15 ip-10-0-10-192 bootkube.sh[8748]:         Pod Status: kube-controller-manager        Ready
Apr 13 13:06:15 ip-10-0-10-192 bootkube.sh[8748]:         Pod Status:openshift-cluster-version/cluster-version-operator        Ready

Maybe too slow?

time="2019-04-13T13:07:30Z" level=fatal msg="failed to wait for bootstrap-complete event: timed out waiting for the condition"

/retest

@ravisantoshgudimetla
Contributor Author

Thank you @wking. Yeah, but it's in line with other control plane components.

To be clear, I have added port 10251 as well, because we are starting the metrics server and /healthz on 10251.

@ravisantoshgudimetla
Contributor Author

/test e2e-aws-serial

@sjenning
Contributor

/retest

@openshift-ci-robot added the size/M label and removed the size/S label Apr 15, 2019
@ravisantoshgudimetla removed the kind/api-change label Apr 15, 2019
@ravisantoshgudimetla
Contributor Author

/test verify

@ravisantoshgudimetla
Contributor Author

/test e2e-aws-operator

/test e2e-aws

@ravisantoshgudimetla force-pushed the add-init-container branch 3 times, most recently from 2a5caa9 to eef78a2, April 15, 2019 19:07
@sjenning
Contributor

/hold
while I take a look

@sjenning
Contributor

The scheduler is in Pending

bootkube.sh[8405]:         Pod Status:openshift-kube-scheduler/openshift-kube-scheduler        Pending
                "containerStatuses": [
                    {
                        "image": "registry.svc.ci.openshift.org/ci-op-23nmvg9r/stable@sha256:7e63406b1f14afd77c484907457b045cc33e31f278a5b65d5c5020d2b194cbe5",
                        "imageID": "",
                        "lastState": {},
                        "name": "scheduler",
                        "ready": false,
                        "restartCount": 0,
                        "state": {
                            "waiting": {
                                "reason": "PodInitializing"
                            }
                        }
                    }
                ],
                "hostIP": "10.0.140.122",
                "initContainerStatuses": [
                    {
                        "image": "${IMAGE}",
                        "imageID": "",
                        "lastState": {},
                        "name": "wait-for-host-port",
                        "ready": false,
                        "restartCount": 0,
                        "state": {
                            "waiting": {
                                "message": "Failed to apply default image tag \"${IMAGE}\": couldn't parse image reference \"${IMAGE}\": invalid reference format: repository name must be lowercase",
                                "reason": "InvalidImageName"
                            }
                        }
                    }
                ],
                "phase": "Pending",
                "podIP": "10.0.140.122",
                "qosClass": "Burstable",
                "startTime": "2019-04-15T19:30:35Z"
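
A status dump like the one above can be scanned mechanically for containers that are stuck waiting. The following is a minimal sketch in Python, assuming the pod's `status` object has already been parsed from JSON into a dict. It reads only the fields visible in the excerpt (`initContainerStatuses`, `containerStatuses`, `state.waiting.reason`); the function name `waiting_containers` is hypothetical.

```python
def waiting_containers(status: dict) -> list:
    """List (status_key, container_name, reason) for every waiting container.

    Init containers are reported first, since a stuck init container is
    what keeps the main containers in "PodInitializing".
    """
    found = []
    for key in ("initContainerStatuses", "containerStatuses"):
        for entry in status.get(key, []):
            waiting = entry.get("state", {}).get("waiting")
            if waiting is not None:
                found.append((key, entry["name"], waiting.get("reason", "")))
    return found


# Trimmed-down version of the status shown above.
example_status = {
    "containerStatuses": [
        {"name": "scheduler",
         "state": {"waiting": {"reason": "PodInitializing"}}},
    ],
    "initContainerStatuses": [
        {"name": "wait-for-host-port",
         "state": {"waiting": {"reason": "InvalidImageName"}}},
    ],
    "phase": "Pending",
}

for key, name, reason in waiting_containers(example_status):
    print(f"{key}: {name} is waiting ({reason})")
```

On the trimmed example, the init container's `InvalidImageName` entry is listed before the scheduler's `PodInitializing` entry, which matches the order in which the two problems need to be read.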

@sjenning
Contributor

sjenning commented Apr 15, 2019

@ravig did you mean to force push this change? There is no init container any more.

@sjenning
Contributor

sjenning commented Apr 15, 2019

Anyway, it was failing before because we need this code in pkg/operator/targetconfigcontroller/targetconfigcontroller.go:

https://github.com/openshift/cluster-kube-controller-manager-operator/pull/197/files#diff-6501874255487b3161b3dfc0f3f34070

otherwise the templating of IMAGE in the init container doesn't happen
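
To illustrate the failure mode described here: if the `${IMAGE}` placeholder in the init container is never substituted, the literal string reaches the kubelet and is rejected as an invalid image name. The sketch below is a simplified Python illustration, not the operator's Go code from the linked diff. The function `render_manifest`, the placeholder pattern, and the image name `example.invalid/scheduler:latest` are all assumptions made for the example.

```python
import re

# Matches placeholders of the form ${NAME}; this pattern is an assumption.
PLACEHOLDER = re.compile(r"\$\{[A-Z_][A-Z0-9_]*\}")


def render_manifest(template: str, values: dict) -> str:
    """Substitute ${NAME} placeholders and fail loudly if any remain."""
    rendered = template
    for name, value in values.items():
        rendered = rendered.replace("${" + name + "}", value)
    leftover = PLACEHOLDER.findall(rendered)
    if leftover:
        raise ValueError(f"unresolved placeholders: {sorted(set(leftover))}")
    return rendered


manifest = """\
initContainers:
- name: wait-for-host-port
  image: ${IMAGE}
containers:
- name: scheduler
  image: ${IMAGE}
"""

print(render_manifest(manifest, {"IMAGE": "example.invalid/scheduler:latest"}))
```

Checking for leftover placeholders at render time turns the problem into an immediate error rather than a pod that sits in Pending with `InvalidImageName`.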

@ravisantoshgudimetla
Contributor Author

Ohh you're right @sjenning, thanks for the help :)

@ravisantoshgudimetla
Contributor Author

/retest

@ravisantoshgudimetla
Contributor Author

/retest

@sjenning
Contributor

/lgtm

@openshift-ci-robot added the lgtm label Apr 16, 2019
@openshift-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ravisantoshgudimetla, sjenning

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [ravisantoshgudimetla,sjenning]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ravisantoshgudimetla
Contributor Author

/hold cancel

@openshift-ci-robot removed the do-not-merge/hold label Apr 16, 2019
@openshift-merge-robot merged commit 569458f into openshift:master Apr 16, 2019
Labels
approved, lgtm, size/M
8 participants