-
Notifications
You must be signed in to change notification settings - Fork 7.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix Deployment Details Bug #699
Conversation
src/frontend/deployment_details.go
Outdated
var metaServerClient = metadata.NewClient(&http.Client{}) | ||
|
||
// The use of httpRequest here lets us to see HTTP request info in the logs. | ||
var log = httpRequest.Context().Value(ctxKeyLog{}).(logrus.FieldLogger) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think, ideally, this loadDeploymentDetails
function would not know about the httpRequest
object.
So, in the future, we could pass in the log
object instead of the httpRequest
object into the function.
But such an improvement would be outside the scope of this pull-request.
🚲 PR staged at http://34.132.215.174 |
I have tested this on a Kind cluster (i.e., a non-GCP cluster) and it works! 😌 See the footer:
I think we should eventually work on actually displaying deployment details for non-GCP clusters (instead of the above message). |
src/frontend/deployment_details.go
Outdated
DetailsAreNotLoaded = 0 | ||
DetailsAreLoading = 1 | ||
DetailsAreLoaded = 2 | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was inspired to do enum this way by this article, Using Enums (and Enum Types) in Golang.
@ckim328, thanks for testing these changes on What's happening?
The changes from this pull-request ensures that the
So yes, this is expected. If you want, instead of showing an empty value for zone and cluster, I can state, "failed to load" (or a similar message) or I can hide the "zone" and "cluster" labels completely? |
Thanks for the clarification @NimJay 👍 It's ok if it's not captured in the scope of this P1 discussion, we can def address these later! |
src/frontend/templates/footer.html
Outdated
@@ -32,6 +32,9 @@ | |||
<b>Cluster: </b>{{index .deploymentDetails "CLUSTERNAME" }}<br/> | |||
<b>Zone: </b>{{index .deploymentDetails "ZONE" }}<br/> | |||
<b>Pod: </b>{{index .deploymentDetails "HOSTNAME" }} | |||
{{ else }} | |||
Deployment details are either still loading or failed to load. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not too opiniated on this but I think we should silently fail and not show anything rather than show an error message. Or just show "Deployment details are unavailable" or similar.
The reasoning behind this is that more often than not, the deployment details are simply not available, but the current diff makes it seem like the user should be reloading, or that there is something fundamentally wrong with their cluster's well being.
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ckim328 suggested a similar "fail silently" user experience (i.e., hiding labels that have no values).
So I've gone ahead and implemented that. :)
I have removed the following verbiage:
or failed to load
because this was not true. If an error occurred, we would've just seen labels with no values.
I have preserved the following verbiage:
Deployment details are still loading.
Try refreshing this page.
because good error message are actionable.
And there is a tutorial that asks developers to look at the deployment details (see #628).
Thanks, you two.
deploymentDetailsMap["HOSTNAME"] = podHostname | ||
deploymentDetailsMap["CLUSTERNAME"] = podCluster | ||
deploymentDetailsMap["ZONE"] = podZone |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't we just get this information once, and only once ever? We can do this from an init()
function, with a package variable that we can query as many times as we want.
// deployment_details.go
package main
import (...)
var deploymentDetails map[string]string
func init() {
// get the deployment details from the meta server
deploymentDetails = map[string]string {
"HOSTNAME": podHostname,
"CLUSTERNAME": podCluster,
"ZONE": podZone,
}
}
Then from the handlers:
if err := templates.ExecuteTemplate(w, "home", map[string]interface{}{
. . .
. . .
"deploymentDetails": deploymentDetails,
}
That would mean no mutex, no enums, no race conditions, no "loading / not loading", and reduce complexity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the meta server is not available or reachable (e.g. if you're deploying locally), then you would leave the map blank, and that should be taken care of seamlessly by the HTML template with the "if deploymentDetails" bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great suggestion! I'm glad you took a look at my changes.
I'm now loading the deployment details from func init()
.
I was able to get rid of the following complexities:
- mutex
- race conditions
- enums
However, I wasn't able to get rid of the "no 'loading / not loading'" stuff. This is because we don't want the loading of details to block the main thread in non-GCP deployments (as described in #685).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder, if we should still account for temporary unreachability of the metadata server
. I thought the previous mutex approach took care of all possibilities. The current change would render the details never loaded if the first attempt on init fails.
Just a thought. We could still see how this works and add it as an improvement if need be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a fair point, though, in what scenario would the meta server to be unavailable? I thought maybe if the GCE API (https://status.cloud.google.com/) is down completely, but the meta server is internal, no? The SLA for any arbitrary single-zone cluster is >= 99.5%
, so I'm imagining the meta server to be much higher, though that is solely intuition.
I think that's a very low probability, but even so mutex seems overkill. We could do periodic retries in the go routine until we get a result, for example (though I think that's still overkill).
I think it's safe to leave it as-is, but I'm also opened to adding retries (taking into account that it may retry forever for non-GCP instances).
🚲 PR staged at http://34.132.215.174 |
🚲 PR staged at http://34.132.215.174 |
src/frontend/deployment_details.go
Outdated
DetailsAreLoaded = 2 | ||
) | ||
|
||
var areDeploymentDetailsLoaded = false | ||
var deploymentDetailsMap = make(map[string]string) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can stay as var deploymentDetailsMap = map[string]string
and then you have an easy way to check if the deployment are loaded / present, which is that deploymentDetailsMap != nil
.
That would also remove the need for that additional areDeploymentDetailsLoaded
variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great suggestion! 👌
Now the declaration at the top looks like:
var deploymentDetailsMap map[string]string
and it's initialized in the loadDeploymentDetails()
function:
func loadDeploymentDetails() {
deploymentDetailsMap = make(map[string]string)
and the getDeploymentDetailsIfLoaded
function is now:
func getDeploymentDetailsIfLoaded() map[string]string {
return deploymentDetailsMap
}
|
||
// The use of httpRequest here lets us to see HTTP request info in the logs. | ||
var log = httpRequest.Context().Value(ctxKeyLog{}).(logrus.FieldLogger) | ||
func initializeLogger() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm okay with having a single logger, but if so:
- Let's put it in either
main.go
or its ownlogger.go
, and - Let's use it everywhere. I notice some files (like
main.go
uses their own duplicated loggers)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking the same thing.
Let's track this in a different issue: #703
I'd say it's outside the scope of this P1.
src/frontend/deployment_details.go
Outdated
@@ -50,18 +60,14 @@ func loadDeploymentDetails(httpRequest *http.Request) map[string]string { | |||
"hostname": podHostname, | |||
}).Debug("Loaded deployment details") | |||
|
|||
detailsLoadingStatus = DetailsAreLoaded | |||
areDeploymentDetailsLoaded = true | |||
|
|||
return deploymentDetailsMap | |||
} | |||
|
|||
func getDeploymentDetailsIfLoaded(httpRequest *http.Request) map[string]string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is completely redundant if we change the deploymentDetailsMap
to be nil
at initialization (so remove the make()
from there, so this entire method can go away.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've preserved the getDeploymentDetailsIfLoaded()
method because it implies that the thing being returned might not have been loaded; thus, I assume anyone using/seeing getDeploymentDetailsIfLoaded()
will cautiously check if it returns nil
.
Thoughts?
src/frontend/deployment_details.go
Outdated
@@ -50,18 +60,14 @@ func loadDeploymentDetails(httpRequest *http.Request) map[string]string { | |||
"hostname": podHostname, | |||
}).Debug("Loaded deployment details") | |||
|
|||
detailsLoadingStatus = DetailsAreLoaded | |||
areDeploymentDetailsLoaded = true | |||
|
|||
return deploymentDetailsMap |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to return anything. The called (go loadDeploymentDetails()
) is not expecting anything returned. We just have to modify the global map and silently return (which is implicit at the end of the function, so we don't need a return
at all)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch!
I will work on not returning anything. :)
🚲 PR staged at http://34.132.215.174 |
@bourgeoisor thanks for the new round of comments! |
🚲 PR staged at http://34.132.215.174 |
deploymentDetailsMap["HOSTNAME"] = podHostname | ||
deploymentDetailsMap["CLUSTERNAME"] = podCluster | ||
deploymentDetailsMap["ZONE"] = podZone |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a fair point, though, in what scenario would the meta server to be unavailable? I thought maybe if the GCE API (https://status.cloud.google.com/) is down completely, but the meta server is internal, no? The SLA for any arbitrary single-zone cluster is >= 99.5%
, so I'm imagining the meta server to be much higher, though that is solely intuition.
I think that's a very low probability, but even so mutex seems overkill. We could do periodic retries in the go routine until we get a result, for example (though I think that's still overkill).
I think it's safe to leave it as-is, but I'm also opened to adding retries (taking into account that it may retry forever for non-GCP instances).
🚲 PR staged at http://34.132.215.174 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, with the caveats of the open comment threads
* Improve loading of deployment details * Hide missing deployment details labels * Load deployment details once via init() * Add comment about go loadDeploymentDetails() * Simplify deployment_details.go * Remove getDeploymentDetailsIfLoaded() function Co-authored-by: Shabir Mohamed Abdul Samadh <7249208+Shabirmean@users.noreply.github.com>
Fixes #697
Background
getDeploymentDetails
(upon every HTTP request) in thefrontend
service.getDeploymentDetails
is using a Google Cloud API (cloud.google.com/go/compute/metadata) to load the cluster name and zone.getDeploymentDetails
asynchronously (so that the HTTP request handler isn't blocked)getDeploymentDetails
load necessary info once.Summary of Changes
1. New File: deployment_details.go
deployment_details.go
.handlers.go
file is too long in my opinion.2. Don't Wait on Deployment Details
go
routine to load deployment details asynchronously.3. Only Load Deployment Details Once
4. Mutex
5. Front-end Message
Testing
brew install kind
kind create cluster
kubectx
. You should seekind-kind
highlighted.git pull nimjay/non-gcp-bug
.docker push
docker push gcr.io/my-project/new-front-end
kubernetes-manifests.yaml
file: usedocker push gcr.io/my-project/new-front-end
instead ofgcr.io/google-samples/microservices-demo/frontend:v0.3.5
.kubectl apply -f kubernetes-manifests.yaml
localhost:8080
, run:kubectl port-forward <name-of-frontend-pod> 8080:8080
. (Usekubectl get pods
to get the name of your front-end pod.)Additional Notes