Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

profiler: add hooks to override functions for testing #1259

Merged
merged 3 commits into from
Apr 29, 2022

Conversation

nsrip-dd
Copy link
Contributor

The profiler uses global startCPUProfile/stopCPUProfile/lookupProfile
functions that are overridden by tests.

Any test that wants to override these functions has to reset them. This
results in duplicated code, and can cause difficult-to-understand issues
if a new test is added that overrides the values and fails to reset
them. For example, this has contributed to race conditions in our unit
tests that are unrelated to actual bugs in our code. See PR #1255 for
example. Having bugs in our tests makes it more difficult to diagnose
real bugs in our code.

Instead of having global values, add override hooks to the profiler
struct that can be set during tests and fall back to the default
behavior if no hooks are set. So long as each test uses its own profiler
struct, each test can set its own hooks without affecting any other
test.

The profiler uses global startCPUProfile/stopCPUProfile/lookupProfile
functions that are overridden by tests.

Any test that wants to override these functions has to reset them. This
results in duplicated code, and can cause difficult-to-understand issues
if a new test is added that overrides the values and fails to reset
them. For example, this has contributed to race conditions in our unit
tests that are unrelated to actual bugs in our code. See PR #1255 for
example. Having bugs in our tests makes it more difficult to diagnose
real bugs in our code.

Instead of having global values, add override hooks to the profiler
struct that can be set during tests and fall back to the default
behavior if no hooks are set. So long as each test uses its own profiler
struct, each test can set its own hooks without affecting any other
test.
@nsrip-dd nsrip-dd requested a review from felixge April 25, 2022 19:00
@nsrip-dd nsrip-dd added this to the 1.39.0 milestone Apr 25, 2022
Copy link
Member

@felixge felixge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! This is definitely an improvement, so I think you should go ahead and merge this.

However, PTAL at my comment: Going forward we should investigate getting rid of these test hooks completely. I'm not fully sure it's doable, but from a first glance it seems to be worth investigating.

// depend on accessing runtime state that is not needed/available for the test
type testHooks struct {
startCPUProfile func(w io.Writer) error
stopCPUProfile func()
Copy link
Member

@felixge felixge Apr 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just looked at the test cases that would use the CPU hooks here:

  • TestProfilerInternal
  • TestRunProfile/cpu

Now that we have TestAllUploaded they are starting to feel mostly redundant. I.e. we could probably test the same behavior with integration testing and get rid of those two hooks entirely.

Getting rid of lookupProfile will probably be a bit trickier, as there is more test code depending on it, so we'd have to carefully analyze if we're willing to throw that away and replace it.

Anyway, my main point is: We can delete existing tests if we can replace them with better ones. Especially if it would allow us to reduce the amount of implementation code that is just needed to facilitate testing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be doable. Most of the lookupProfile uses are just to put something in the profile output and check that it comes through (or nothing in the case of TestRunProfile/goroutineswaitLimit). For that use-case, TestAllUploaded-style tests can capture that while running the real profiles.

The TestRunProfile/goroutineswait test could be replaced with a standalone test of goroutineDebug2ToPprof since I think the test is really looking for whether a goroutine debug stack trace gets converted to a pprof correctly.

The other slightly complicated use of replacing lookupProfile is in TestRunProfile/delta. But I think that test is basically a duplicate of the delta profile tests in profiler/internal/pprofutils so practically speaking we don't lose anything by just deleting the test completely. We could have TestAllUploaded variants that check whether delta or non-delta profiles get uploaded, at least.

@nsrip-dd nsrip-dd merged commit e7717ad into v1 Apr 29, 2022
@nsrip-dd nsrip-dd deleted the nick.ripley/profiler-test-hooks branch April 29, 2022 12:54
nsrip-dd added a commit that referenced this pull request Apr 29, 2022
After the discussion on #1259, replace tests that hook into internal
implementation details with more integration-like tests similar to
TestAllUploaded, which only test the package through its public
interface.

The following tests were removed:

* TestProfilerInteral, TestProfilerPassthrough

  These tests have been made redundant by TestAllUploaded

* TestRunProfile

  Mostly replaced by TestAllUploaded. The "delta" test case is a
  duplicate of the tests in internal/pprofutils/delta_test. The
  "goroutineswait" test case is really testing the
  goroutineDebug2ToPprof function, and doesn't need the overhead of
  going through the rest of the profiler pipeline. So it was moved to
  its own test. The "goroutineswaitLimit" test was moved to a seprate
  test as well, based on TestAllUploaded.

As a result of these changes, the test hooks added by #1259 are no
longer used and have been removed.

This commit also picks up a small bug fix for TestAllUploaded. The test
used a wait group to signal that a profile upload has been processed. If
the followup processing after the call to wg.Wait() takes too long, the
profiler might upload another profile. If this happens, the server's
handler will make wg's counter go negative and panic. Replace with a
channel + a non-blocking send.
nsrip-dd added a commit that referenced this pull request May 13, 2022
After the discussion on #1259, replace tests that hook into internal
implementation details with more integration-like tests similar to
TestAllUploaded, which only test the package through its public
interface.

The following tests were removed:

* TestProfilerInteral, TestProfilerPassthrough

  These tests have been made redundant by TestAllUploaded

* TestRunProfile

  Mostly replaced by TestAllUploaded. The "delta" test case is a
  duplicate of the tests in internal/pprofutils/delta_test. The
  "goroutineswait" test case is really testing the
  goroutineDebug2ToPprof function, and doesn't need the overhead of
  going through the rest of the profiler pipeline. So it was moved to
  its own test. The "goroutineswaitLimit" test was moved to a seprate
  test as well, based on TestAllUploaded.

As a result of these changes, the test hooks added by #1259 are no
longer used and have been removed.
nsrip-dd added a commit that referenced this pull request May 13, 2022
After the discussion on #1259, replace tests that hook into internal
implementation details with more integration-like tests similar to
TestAllUploaded, which only test the package through its public
interface.

The following tests were removed:

* TestProfilerInteral, TestProfilerPassthrough

  These tests have been made redundant by TestAllUploaded

* TestRunProfile

  Mostly replaced by TestAllUploaded. The "delta" test case is a
  duplicate of the tests in internal/pprofutils/delta_test. The
  "goroutineswait" test case is really testing the
  goroutineDebug2ToPprof function, and doesn't need the overhead of
  going through the rest of the profiler pipeline. So it was moved to
  its own test. The "goroutineswaitLimit" test was moved to a seprate
  test as well, based on TestAllUploaded.

As a result of these changes, the test hooks added by #1259 are no
longer used and have been removed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants