Streamline cluster up output #13636

Merged
openshift-bot merged 1 commit into openshift:master from clusterup_shorter_display on Apr 7, 2017

Conversation

csrwng
Contributor

@csrwng csrwng commented Apr 5, 2017

Outputs previous messages when either an error occurs during startup or
loglevel > 0.

Now only executes container network test if loglevel > 0 to speed up
startup time.

Fixes #12715
#12531

@csrwng
Contributor Author

csrwng commented Apr 5, 2017

Output when image present:

Starting OpenShift using openshift/origin:v3.6.0-alpha.0 ...
OpenShift server started.

The server is accessible via web console at:
    https://127.0.0.1:8443

You are logged in as:
    User:     developer

To login as administrator:
    oc login -u system:admin

Output when having to pull image:

Starting OpenShift using openshift/origin:v3.6.0-alpha.0 ...
Pulling image openshift/origin:v3.6.0-alpha.0
Pulled 1/3 layers, 38% complete
Pulled 2/3 layers, 86% complete
Pulled 3/3 layers, 100% complete
Extracting
Image pull complete
OpenShift server started.

The server is accessible via web console at:
    https://127.0.0.1:8443

You are logged in as:
    User:     developer

To login as administrator:
    oc login -u system:admin

Old output is still available with --loglevel=1

@csrwng
Contributor Author

csrwng commented Apr 5, 2017

@bparees @smarterclayton ptal
@jorgemoralespou fyi

@bparees
Contributor

bparees commented Apr 5, 2017

lgtm but better let @smarterclayton have the final say since he instigated these changes.

@jorgemoralespou

@csrwng I see a difference between a cluster up with and without pulling images, but I don't see a difference for a cluster start where you keep the config for a second boot. In that case, showing information on users is superfluous and might not be correct, as the user is not really logged in. I would omit this info in that case.

Also I would move the layers percentage in the pull to loglevel=1.

@csrwng
Contributor Author

csrwng commented Apr 5, 2017

@jorgemoralespou the reason the layers percentage is displayed is that if you have a particularly slow connection, you wouldn't see anything happening for a good while and you'd think that the command is just stuck.
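
As a rough illustration of that kind of progress line, here is a small sketch. The `layer` type and the `progressLine` function are hypothetical and are not taken from the origin code base; the sketch only shows how a layer count and an overall percentage could be derived from per-layer byte counts.

```go
package main

import "fmt"

// layer tracks download progress for one image layer. The type and its
// field names are hypothetical.
type layer struct {
	current, total int64 // bytes downloaded so far, and bytes expected
}

// progressLine summarizes how many layers are fully downloaded and what
// share of the total bytes has arrived, using integer percentages.
func progressLine(layers []layer) string {
	var done int
	var cur, tot int64
	for _, l := range layers {
		if l.total > 0 && l.current >= l.total {
			done++
		}
		cur += l.current
		tot += l.total
	}
	var pct int64
	if tot > 0 {
		pct = cur * 100 / tot
	}
	return fmt.Sprintf("Pulled %d/%d layers, %d%% complete", done, len(layers), pct)
}

func main() {
	layers := []layer{
		{current: 50, total: 50},
		{current: 10, total: 100},
		{current: 0, total: 50},
	}
	fmt.Println(progressLine(layers))
}
```

Printing such a line whenever the counts change gives the user a sign of life on a slow connection without dumping the full per-layer pull log.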

Would this be ok for when you're reusing existing config/data?

Starting OpenShift using openshift/origin:v3.6.0-alpha.0 ...
OpenShift server started.

The server is accessible via web console at:
    https://127.0.0.1:8443

@jorgemoralespou

jorgemoralespou commented Apr 5, 2017 via email

@csrwng csrwng force-pushed the clusterup_shorter_display branch from d1ee956 to 0ebd8b7 Compare April 5, 2017 14:11
@csrwng
Contributor Author

csrwng commented Apr 5, 2017

@jorgemoralespou updated the display for when you're reusing config/data.
The progress writer for download is a bigger change... we should tackle it in a different pull.

@jorgemoralespou

jorgemoralespou commented Apr 5, 2017 via email

@smarterclayton
Contributor

smarterclayton commented Apr 5, 2017 via email

@smarterclayton
Contributor

Network check tends to be very slow for me - that's another spot where some output is useful.

@smarterclayton
Contributor

Nm, saw your comment

@jorgemoralespou

jorgemoralespou commented Apr 5, 2017 via email

@smarterclayton
Contributor

How often does the container network test fail? Can we only run it if something else fails first?

@csrwng
Contributor Author

csrwng commented Apr 5, 2017

@smarterclayton so it likely fails when you first run 'cluster up' on a machine that doesn't have the right firewall rules set. Unfortunately, it's not something that you notice in the initial setup of things. Everything will succeed, but then you either won't be able to push to the registry or your DNS lookups will fail.

So the issue is that you pay this premium every time you start cluster up, even though once it has run successfully the first time, you likely won't need to check any more.

@csrwng
Contributor Author

csrwng commented Apr 5, 2017

Something that would be nice would be to start the test asynchronously and then notify you that things are not right as you try to use openshift. But there's not a single interaction entry point, so that's hard.

@smarterclayton
Contributor

smarterclayton commented Apr 6, 2017 via email

@jorgemoralespou

jorgemoralespou commented Apr 6, 2017 via email

@smarterclayton
Contributor

smarterclayton commented Apr 6, 2017 via email

@jorgemoralespou

jorgemoralespou commented Apr 6, 2017 via email

@csrwng
Contributor Author

csrwng commented Apr 6, 2017

I'm investigating what's making the test slow. In theory, it should not take that long. I am hitting the master api endpoint from a container after the healthz endpoint is returning ok.

https://github.com/openshift/origin/blob/master/pkg/bootstrap/docker/openshift/cnetwork.go#L5-L20

The DNS server would maybe take a little longer to come up, but I wouldn't expect it to take as long as 20sec as I've seen sometimes.

If for whatever reason that test can't be made faster, an alternate test could be done with a pair of containers, one using the pod network and the other one using the host network.

@csrwng
Contributor Author

csrwng commented Apr 6, 2017

So the container networking test is much faster now that I've fixed a very embarrassing bug (the first part of the test was not working at all and only failing after 40 tries). So now it will run every time no matter what. If the firewall is set up correctly, it doesn't add much, if any, time to startup.

@csrwng
Contributor Author

csrwng commented Apr 6, 2017

[test]

@jorgemoralespou

jorgemoralespou commented Apr 6, 2017 via email

@csrwng
Contributor Author

csrwng commented Apr 6, 2017

@jorgemoralespou I'll submit a fix for that branch

@csrwng csrwng force-pushed the clusterup_shorter_display branch from 0d63ba6 to 40fc3ed Compare April 6, 2017 20:10
@csrwng
Contributor Author

csrwng commented Apr 6, 2017

@jorgemoralespou actually in 1.5 it's not broken in the same way

@smarterclayton
Contributor

smarterclayton commented Apr 6, 2017 via email

@csrwng csrwng force-pushed the clusterup_shorter_display branch from 40fc3ed to ae9c5c4 Compare April 7, 2017 13:50
@csrwng
Contributor Author

csrwng commented Apr 7, 2017

integration test seems to have gotten stuck... retesting

@csrwng
Contributor Author

csrwng commented Apr 7, 2017

#12007
[test]

@csrwng csrwng force-pushed the clusterup_shorter_display branch from ae9c5c4 to 55dc4ef Compare April 7, 2017 18:44
@openshift-bot
Contributor

Evaluated for origin test up to 55dc4ef

@openshift-bot
Contributor

continuous-integration/openshift-jenkins/test SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pull_request_origin/646/) (Base Commit: 44d4f23)

@csrwng
Contributor Author

csrwng commented Apr 7, 2017

[merge]

@openshift-bot
Contributor

Evaluated for origin merge up to 55dc4ef

@openshift-bot
Contributor

openshift-bot commented Apr 7, 2017

continuous-integration/openshift-jenkins/merge SUCCESS (https://ci.openshift.redhat.com/jenkins/job/merge_pull_request_origin/282/) (Base Commit: 1ea122b) (Image: devenv-rhel7_6125)

@openshift-bot openshift-bot merged commit 0d82899 into openshift:master Apr 7, 2017
@hferentschik
Contributor

@csrwng sorry, late to the game. +1 for improving on the output.

What is the best approach to get an oc version containing this change? Any chance you are building and hosting binaries as part of a pull request build? Or do I need to rebuild latest origin master myself?

@csrwng
Contributor Author

csrwng commented Apr 10, 2017

@hferentschik this will be included in the next release for origin. In the meantime, you can build master locally. If you have a working openshift environment, this is easy to do with a template:
https://github.com/csrwng/build-origin

@csrwng csrwng deleted the clusterup_shorter_display branch April 10, 2017 15:45