Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add pprof for galley #10047

Merged
merged 2 commits into from Jan 6, 2019
Merged

Conversation

clyang82
Copy link
Member

@clyang82 clyang82 commented Nov 19, 2018

Enable pprof for galley only. the pprof will be started in 9094 by default.

Signed-off-by: Chun Lin Yang clyang@cn.ibm.com

@istio-testing
Copy link
Collaborator

Hi @clyang82. Thanks for your PR.

I'm waiting for a istio member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ayj
Copy link
Contributor

ayj commented Nov 19, 2018

Thanks @clyang82 for added this. Could we expose the pprof endpoint on a different port than metrics? Metrics are exposed outside the pod (via container port / service), whereas pprof can be accessed via kubectl port-foward when we need to.

@ayj
Copy link
Contributor

ayj commented Nov 19, 2018

/ok-to-test

@istio-testing istio-testing added ok-to-test Set this label allow normal testing to take place for a PR not submitted by an Istio org member. and removed needs-ok-to-test labels Nov 19, 2018
@clyang82
Copy link
Member Author

@ayj Thanks for your comments. Right now, mixer, pilot, citadel and galley expose the pprof endpoint on 9093. I think we should keep consistence, consider that they are for self-monitoring. What do you think?

@hzxuzhonghu
Copy link
Member

/ok-to-test

@ayj
Copy link
Contributor

ayj commented Nov 20, 2018

I thought Pilot exposed pprof on port 8080 along with the other legacy discovery services (see

func NewDiscoveryService(environment *model.Environment, o DiscoveryServiceOptions) (*DiscoveryService, error) {
).

@clyang82
Copy link
Member Author

@ayj Yes. you are right. But 9093 is also working for pilot. they are 2 http ports to expose the same content /metrics, /pprof, /debug, /ready, /version. We need to remove one of them. I suggest to keep 9093 and remove 8080. What is your idea?

@@ -123,6 +124,8 @@ func GetRootCmd(args []string, printf, fatalf shared.FormatFn) *cobra.Command {
"Interval of updating file for the Galley readiness probe.")
rootCmd.PersistentFlags().UintVar(&monitoringPort, "monitoringPort", 9093,
"Port to use for exposing self-monitoring information")
rootCmd.PersistentFlags().BoolVar(&enableProfiling, "enableProfiling", true,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe false for default?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pilot, mixer are true by default, while citadel is false. Do we need to make them to false for default?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure. Let's wait for community feedback on this because it doesn't seem right for me that profiling is on by default.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your response. Let us wait for the feedback from community.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for having them on by default?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know. It is on in the initial version. @kyessenov any thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's keep it off by default. We originally made this default=true in Pilot ~2 years ago to help with early debugging before Istio was used in production.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make it to default=false for mixer, pilot, galley and citadel. right?

@ayj
Copy link
Contributor

ayj commented Nov 21, 2018

I suggest to keep 9093 and remove 8080. What is your idea?

It seems like exposing telemetry and internal diagnostic/debug information on the same port might be risky. If they are separate ports, we can more easily limit pprof exposure by not adding its port to the k8s pod/svc.

@clyang82
Copy link
Member Author

clyang82 commented Nov 22, 2018

I suggest to keep 9093 and remove 8080. What is your idea?

It seems like exposing telemetry and internal diagnostic/debug information on the same port might be risky. If they are separate ports, we can more easily limit pprof exposure by not adding its port to the k8s pod/svc.

@ayj sounds good. how about?

  • 9093 for telemetry - it can reduce the changes. many documents mention that.
  • 9094 for debug information - not adding to pod/svc
  • 8080 pilot only for service discover - List all known services in /v1/registration

@ayj
Copy link
Contributor

ayj commented Nov 27, 2018

Sounds good in general. We need to be careful about renumbering any Pilot ports that are used by istioctl.

cc @rshriram (pilot), @douglas-reid (mixer), @liamawhite (istioctl)

@codecov
Copy link

codecov bot commented Nov 27, 2018

Codecov Report

Merging #10047 into release-1.1 will increase coverage by 2%.
The diff coverage is 13%.

Impacted file tree graph

@@              Coverage Diff               @@
##           release-1.1   #10047     +/-   ##
==============================================
+ Coverage           68%      70%     +2%     
==============================================
  Files              571      442    -129     
  Lines            48706    41362   -7344     
==============================================
- Hits             32982    28830   -4152     
+ Misses           13946    11122   -2824     
+ Partials          1778     1410    -368
Impacted Files Coverage Δ
galley/pkg/server/monitoring.go 0% <0%> (ø) ⬆️
mixer/pkg/server/args.go 100% <100%> (ø) ⬆️
mixer/adapter/inventory.gen.go 0% <0%> (-100%) ⬇️
pilot/pkg/networking/plugin/mixer/mixer.go 1% <0%> (-77%) ⬇️
mixer/adapter/rbac/rbac.go 0% <0%> (-70%) ⬇️
galley/pkg/meshconfig/cache.go 0% <0%> (-63%) ⬇️
pkg/mcp/server/monitoring.go 3% <0%> (-57%) ⬇️
mixer/pkg/runtime/config/queries.go 64% <0%> (-36%) ⬇️
mixer/pkg/runtime/config/snapshot.go 58% <0%> (-36%) ⬇️
pilot/pkg/serviceregistry/memory/discovery.go 57% <0%> (-29%) ⬇️
... and 275 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c24863b...0acc8bd. Read the comment docs.

@clyang82 clyang82 changed the title Enable pprof for galley Add pprof for galley and disable pprof for pilot/mixer/citadel/istioctl Nov 27, 2018
@clyang82
Copy link
Member Author

Thanks @ayj . I would like to use this PR to add pprof for galley and disable pprof for pilot/mixer/citadel/istioctl by default. Will have a separated PR to handle:

9093 for telemetry - it can reduce the changes. many documents mention that.
9094 for debug information - not adding to pod/svc
8080 pilot only for service discover - List all known services in /v1/registration

Would you please help review this PR firstly? Thanks.

@clyang82
Copy link
Member Author

/test e2e-dashboard

@liamawhite
Copy link
Member

@clyang82 if you're changing the debug port in Pilot to 9094 you will need to make changes so that istioctl works. This is the line that needs changing -> https://github.com/istio/istio/blob/master/pilot/cmd/pilot-discovery/request.go#L33

@@ -123,6 +124,8 @@ func GetRootCmd(args []string, printf, fatalf shared.FormatFn) *cobra.Command {
"Interval of updating file for the Galley readiness probe.")
rootCmd.PersistentFlags().UintVar(&monitoringPort, "monitoringPort", 9093,
"Port to use for exposing self-monitoring information")
rootCmd.PersistentFlags().BoolVar(&enableProfiling, "enableProfiling", false,
"Enable pprof for Galley")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add the pprofPort flag now and plumb that through StartSelfMonitoring? That way we can be done with the galley specific changes and address Pilot and Mixer individually.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ayj sorry for late. I just add pprof in 9094. please review again. Let us use this ticket to handle galley specific changes. Thanks.

@douglas-reid
Copy link
Contributor

/hold

@mandarjog wants mixer to have pprof on by default.

@istio-testing istio-testing added the do-not-merge/hold Block automatic merging of a PR. label Nov 27, 2018
@clyang82
Copy link
Member Author

@mandarjog could you share your point why you want to have pprof on by default for mixer? Thanks.

@costinm
Copy link
Contributor

costinm commented Nov 29, 2018

/hold - please don't change pilot pprof or ports without a strong reason.

@costinm
Copy link
Contributor

costinm commented Nov 29, 2018

Also: please use the PR description or comments to explain why a change is made - it makes no sense to me to make gratuitous backward incompatible changes like renaming ports.
We can be careful for new ports to make sure they use the 150xx range.

@clyang82
Copy link
Member Author

clyang82 commented Dec 3, 2018

/hold - please don't change pilot pprof or ports without a strong reason.

@costinm right now, for pilot, they are 2 http ports (8080 & 9093) to expose the same content /metrics, /pprof, /debug, /ready, /version. that is obvious duplication we should keep one for it.
if there is no strong reason to enable the debug, it should be disabled by default to reduce the performance impact and unnecessary noise. Thanks.

@liamawhite
Copy link
Member

@clyang82 /debug is required for a couple of istioctl commands. IMO we should leave debug on by default but allow users to turn it off. The people who really need the perf will be a much fewer number of people and they are more likely to understand what happens when you turn off the debug.

@stale
Copy link

stale bot commented Dec 17, 2018

This pull request has been automatically marked as stale because it has not had activity in the last 2 weeks. It will be closed in 30 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@stale stale bot added the stale label Dec 17, 2018
@istio-testing istio-testing added the needs-rebase Indicates a PR needs to be rebased before being merged label Dec 22, 2018
@stale stale bot removed the stale label Dec 26, 2018
@istio-testing istio-testing removed the needs-rebase Indicates a PR needs to be rebased before being merged label Dec 26, 2018
@clyang82 clyang82 changed the title Add pprof for galley and disable pprof for pilot/mixer/citadel/istioctl Add pprof for galley Dec 26, 2018
@istio-testing
Copy link
Collaborator

istio-testing commented Dec 26, 2018

@clyang82: The following test failed, say /retest to rerun them all:

Test name Commit Details Rerun command
prow/e2e-mixer-no_auth-mcp.sh a61a98e7dabbca82a244937c55f028cada198f1f link /test e2e-mixer-no_auth-mcp

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@clyang82
Copy link
Member Author

/retest

Signed-off-by: Chun Lin Yang <clyang@cn.ibm.com>
Signed-off-by: Chun Lin Yang <clyang@cn.ibm.com>
@clyang82
Copy link
Member Author

clyang82 commented Jan 3, 2019

/assign @ayj
I would like to use this ticket to handle galley specific changes. Thanks.

Copy link
Contributor

@ayj ayj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@istio-testing
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ayj, clyang82

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@clyang82 clyang82 removed the do-not-merge/hold Block automatic merging of a PR. label Jan 6, 2019
@istio-testing istio-testing merged commit ee30c79 into istio:release-1.1 Jan 6, 2019
@clyang82 clyang82 deleted the enable_pprof_galley branch January 11, 2019 07:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ok-to-test Set this label allow normal testing to take place for a PR not submitted by an Istio org member.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants