Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

e2e/storage: avoid definining unusable tests #70992

Closed

Conversation

@pohly
Copy link
Contributor

commented Nov 13, 2018

What type of PR is this?
/kind cleanup

What this PR does / why we need it:

After the recent test/e2e/storage refactoring (merged for 1.13), a lot of tests get defined that can only run with specific cluster providers. Those tests then get skipped at runtime. Even tests that get skipped take time (test setup, creating client set or worse, deploying the driver if the driver needs that) and the list of skipped tests in the CI grew a lot without adding any information (skip reason not captured). Therefore this PR:

  • enables the definition of tests depending on the test context
  • changes the "skip test" API so that test definition gets skipped, while still allowing runtime skipping
  • changes the CSI and in-tree drivers to skip definition as much as possible

Special notes for your reviewer:

This contains PR #70862 because of code conflicts, so that PR must be merged first.

Does this PR introduce a user-facing change?:

NONE

/sig storage
/cc @mkimuram
/assign @msau42

@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

commented Nov 13, 2018

@pohly: GitHub didn't allow me to request PR reviews from the following users: mkimuram.

Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

What type of PR is this?
/kind cleanup

What this PR does / why we need it:

After the recent test/e2e/storage refactoring (merged for 1.13), a lot of tests get defined that can only run with specific cluster providers. Those tests then get skipped at runtime. Even tests that get skipped take time (test setup, creating client set or worse, deploying the driver if the driver needs that) and the list of skipped tests in the CI grew a lot without adding any information (skip reason not captured). Therefore this PR:

  • enables the definition of tests depending on the test context
  • changes the "skip test" API so that test definition gets skipped, while still allowing runtime skipping
  • changes the CSI and in-tree drivers to skip definition as much as possible

Special notes for your reviewer:

This contains PR #70862 because of code conflicts, so that PR must be merged first.

Does this PR introduce a user-facing change?:

NONE

/sig storage
/cc @mkimuram
/assign @msau42

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@pohly

This comment has been minimized.

Copy link
Contributor Author

commented Nov 13, 2018

/hold

PR #70862 must be merged first.

@pohly

This comment has been minimized.

Copy link
Contributor Author

commented Nov 13, 2018

There's one change that I didn't make to avoid unnecessary churn: testsuites.RunTestSuite isn't actually running any tests, it merely defines them. This is an important distinction for people reading the code or trying to understand how to use the API, because they have to understand that defining tests and executing them happens at entirely different points in time.

"test suite" is also ambiguous, because in the larger context of the final test binary, the entire set of specs is the "test suite". What the method adds is merely a certain set of tests (or even more precisely, some "specs" in Ginkgo terms).

Therefore it would be useful to rename the method to "AddTests" or "DefineTests". I'd prefer to not use the term "specs" because it is Ginkgo-specific and doesn't match at all with the naming of the package ("testsuites").

Anyway, let me know what you think about this. I can still add the renaming to this PR. But if we do it, we should do it now, to avoid breaking external tests suites which might start using the package.

@neolit123

This comment has been minimized.

Copy link
Member

commented Nov 13, 2018

/priority important-longterm

@wackxu wackxu referenced this pull request Nov 14, 2018

Merged

add e2e test for snapshot #69036

@pohly pohly force-pushed the pohly:storage-volume-testsuites-avoid-tests branch 4 times, most recently from 57c3b16 to 91bba41 Nov 14, 2018

@pohly pohly force-pushed the pohly:storage-volume-testsuites-avoid-tests branch from 91bba41 to 831e473 Nov 16, 2018

@pohly pohly force-pushed the pohly:storage-volume-testsuites-avoid-tests branch 2 times, most recently from d049d9c to 007ebd8 Nov 16, 2018

pohly added a commit to intel/pmem-CSI that referenced this pull request Nov 22, 2018

test: automatic E2E testing
To run the tests, do:
  make start
  export KUBECONFIG=<value as shown by make start>
  REPO_ROOT=$(pwd) go test -v ./test/e2e -ginkgo.v

This is intentionally pretty verbose. "go test" (without -v) will only
print something on failure, while "-ginkgo.v" can be removed to make
the test execution itself less verbose.

At the moment, a single provisioning test will be executed, with two
being skipped (mount options not declared, no block support).

The out-of-tree version of the "testsuites" package has to be used
because two PRs are still under review:
kubernetes/kubernetes#70862
kubernetes/kubernetes#70992

Kubernetes 1.13 could be used instead, but then more code would get
pulled in (PR 70862) or the API would be different (PR 70992). Either
way, the implementation of k8s.io/kubernetes/pkg/util/mount
changes, which affects the pmem-csi-driver.

That code itself hasn't changed much. The real change is that
Kubernetes now uses a fork of the more-or-less-unmaintained glog. That
is a good thing also for csi-pmem, so all remaining references to glog
get replaced.

pohly added a commit to intel/pmem-CSI that referenced this pull request Nov 22, 2018

test: automatic E2E testing
To run the tests, do:
  make start
  export KUBECONFIG=<value as shown by make start>
  REPO_ROOT=$(pwd) go test -v ./test/e2e -ginkgo.v

This is intentionally pretty verbose. "go test" (without -v) will only
print something on failure, while "-ginkgo.v" can be removed to make
the test execution itself less verbose.

At the moment, a single provisioning test will be executed, with two
being skipped (mount options not declared, no block support).

The out-of-tree version of the "testsuites" package has to be used
because two PRs are still under review:
kubernetes/kubernetes#70862
kubernetes/kubernetes#70992

Kubernetes 1.13 could be used instead, but then more code would get
pulled in (PR 70862) or the API would be different (PR 70992). Either
way, the implementation of k8s.io/kubernetes/pkg/util/mount
changes, which affects the pmem-csi-driver.

That code itself hasn't changed much. The real change is that
Kubernetes now uses a fork of the more-or-less-unmaintained glog. That
is a good thing also for csi-pmem, so all remaining references to glog
get replaced.

The YAML files get updated to make them a bit more similar to how it
is done elsewhere and how the test code expects it:
- separate storage class, because conceptually there can be more
  than one class per driver, each with different settings
- StatefulSet instead of ReplicaSet, resulting in stable pod names
@pohly

This comment has been minimized.

Copy link
Contributor Author

commented Dec 20, 2018

/test pull-kubernetes-e2e-gce-100-performance

@verult
Copy link
Contributor

left a comment

Reviewed the first 2 of the commits. Thanks for improving the skip process!

@@ -1020,8 +1006,8 @@ func (c *cinderDriver) GetDriverInfo() *testsuites.DriverInfo {
return &c.driverInfo
}

func (c *cinderDriver) SkipUnsupportedTest(pattern testpatterns.TestPattern) {
framework.SkipUnlessProviderIs("openstack")
func (c *cinderDriver) SkipUnsupportedTest(pattern testpatterns.TestPattern) bool {

This comment has been minimized.

Copy link
@verult

verult Dec 21, 2018

Contributor

Rename func to "IsTestSupported". The current form seems like a double negative to me

This comment has been minimized.

Copy link
@pohly

pohly Dec 21, 2018

Author Contributor

I agree. I just didn't want to mess with the original code more than strictly necessary.
But as this is now an API change, I might was well change the method name (not just here, but also the other "skip unsupported test" methods).

}

if bTestDriver, ok := driver.(BeforeEachTestDriver); ok {
pattern := pattern

This comment has been minimized.

Copy link
@verult

verult Dec 21, 2018

Contributor

nit: something like p := pattern to make the variable less ambiguous

This comment has been minimized.

Copy link
@pohly

pohly Dec 21, 2018

Author Contributor

https://golang.org/doc/faq#closures_and_goroutines also uses the same variable name:

for _, v := range values {
        v := v // create a new 'v'.

But I can change it, no problem.

// CreateDriver cleanup all the resources that is created in CreateDriver
// CreateDriver cleanup all the resources that is created in CreateDriver. There is
// no guarantee that CreateDriver succeeded or even was called at all, so the test driver
// has to track resources.

This comment has been minimized.

Copy link
@verult

verult Dec 21, 2018

Contributor

Is there a well-defined expected behavior if CleanupDriver() was called but CreateDriver() either failed or wasn't called?

This comment has been minimized.

Copy link
@pohly

pohly Dec 21, 2018

Author Contributor

CleanupDriver() will always get called, under all circumstances. This has been true also before, it just wasn't explicitly mentioned, therefore I added this comment.

This comment has been minimized.

Copy link
@verult

verult Dec 21, 2018

Contributor

To me, the new comment seems to imply the behavior of CleanupDriver() is different depending on what happens with CreateDriver(). Could you improve the comment to make what you said here more explicit? (Maybe ".... and the test driver has to clean up resources regardless")

This comment has been minimized.

Copy link
@pohly

pohly Dec 21, 2018

Author Contributor

Okay.

}

// StaticSkipTestDriver is an optional interface that drivers can
// implement to filter out unsuitable tests.

This comment has been minimized.

Copy link
@verult

verult Dec 21, 2018

Contributor

add: "...unsuitable tests while tests are defined."

to clarify why the struct is "Static".

This comment has been minimized.

Copy link
@pohly

pohly Dec 21, 2018

Author Contributor

I've renamed the interface to FilterTestDriver (seemed to make more sense, now that the method is called IsTestSupported) and added the comment.

}
return false

This comment has been minimized.

Copy link
@verult

verult Dec 21, 2018

Contributor

nit:

!framework.NodeOSDistroIs("gci", "ubuntu", "custom") || 
(pattern.FsType == "xfs" && 
!framework.NodeOSDistroIs("ubuntu", "custom"))

(negated if you are changing the function name)

This comment has been minimized.

Copy link
@pohly

pohly Dec 21, 2018

Author Contributor

Yep.

func RunTestSuite(f *framework.Framework, driver TestDriver, tsInits []func() TestSuite, tunePatternFunc func([]testpatterns.TestPattern) []testpatterns.TestPattern) {
for _, testSuiteInit := range tsInits {
suite := testSuiteInit()
patterns := tunePatternFunc(suite.getTestSuiteInfo().testPatterns)

for _, pattern := range patterns {
if skipUnsupportedTest(suite, driver, pattern) {
continue

This comment has been minimized.

Copy link
@verult

verult Dec 21, 2018

Contributor

Log then skip. The user should know a particular test is detected but skipped due to driver incompatibility.

This comment has been minimized.

Copy link
@pohly

pohly Dec 21, 2018

Author Contributor

That would make each test suite invocation very verbose, because that information would get printed before running any test, whether the user intends to run the filtered out test or not.

I agree that this is useful information, but we should make it depend on some non-default command line flag. Perhaps we can (ab)use -ginkgo.dryRun for this?

This comment has been minimized.

Copy link
@verult

verult Dec 21, 2018

Contributor

To further reduce the verbosity, we could print a single log message per test suite and accumulate incompatible drivers in a list. This is still a huge improvement over the verbosity we have today.

I'm not a fan of relying on a non-default flag that could potentially be hard to discover, or in the case of using -ginkgo.dryRun, hard to correlate with controlling the log level here.

This comment has been minimized.

Copy link
@pohly

pohly Dec 21, 2018

Author Contributor

I'm not sure I understand what the output should look like. How many lines should that log message have? What information do you expect?

Ideally, it would be something like:

The following tests were not added because they are unsupported:
[sig-storage] In-tree Volumes [Driver: nfs] [Testpattern: Dynamic PV (default fs) block mode] - block mode not supported
...
[sig-storage] CSI Volumes [Driver: gcepd] [Testpattern: Dynamic PV (default fs)] - not running on GCE
...

But that will be many lines of output, basically one line for each suite/pattern combination unless we implement some intelligence in RunTestSuite that determines whether all patterns or even all combinations were removed. The other problem is that the reason for skipping is not transported back to RunTestSuite. We either just print the test name prefix and no explanation, or we have to change the return type.

This comment has been minimized.

Copy link
@pohly

pohly Dec 23, 2018

Author Contributor

I tried to implement something like this, but hit a major snag: there's no way to retrieve the current set of components (i.e. the text from previous Describe and Context calls) from ginkgo during test registration. The effect is that at the place where an It call gets skipped, not enough information is available to record the entire test name. In my current prototype, the inner test name is available and (with some extra effort) the testsuite/pattern part. The driver name could be added, but that's still no guarantee that the resulting string is complete.

Here's what it looks like:

$ go test -v ./test/e2e 
W1223 17:21:55.928475   18945 test_context.go:392] Unable to find in-cluster config, using default host : http://127.0.0.1:8080
Dec 23 17:21:55.928: INFO: The --provider flag is not set.  Treating as a conformance test.  Some tests may not be run.
I1223 17:21:56.041325   18945 test_context.go:427] The following tests were not added to the test suite because they are not supported in the current configuration:
??? [Testpattern: Dynamic PV (default fs)] provisioning should provision storage with mount options
??? [Testpattern: Dynamic PV (default fs)] provisioning should provision storage with mount options
??? [Testpattern: Dynamic PV (default fs)] provisioning should provision storage with mount options
??? [Testpattern: Dynamic PV (default fs)] provisioning should provision storage with mount options
??? should create and delete block persistent volumes
??? should create and delete block persistent volumes
??? should create and delete block persistent volumes
??? should create and delete block persistent volumes
??? should create and delete block persistent volumes

The prototype commit is here: pohly@20e270a

I see two options:

  1. take this PR after addressing only the other feedback, without logging tests skipped during test registration
  2. work with Ginkgo upstream such that it better supports skipped test registration

The latter will take time. Ginkgo is not entirely frozen, but maintenance isn't particularly active either. One of my PRs has been waiting to get merged for a while now.

@verult @msau42: I'll let you guys decide.

}

func skipPersistenceTest(driver TestDriver) {
func skipPersistenceTest(driver TestDriver) bool {

This comment has been minimized.

Copy link
@verult

verult Dec 21, 2018

Contributor

also seems like a double negative

This comment has been minimized.

Copy link
@pohly

pohly Dec 21, 2018

Author Contributor

I've inlined the code from these functions and removed them. They were trivial and now testVolumes is easier to understand.

}

func skipExecTest(driver TestDriver) {
func skipExecTest(driver TestDriver) bool {

This comment has been minimized.

Copy link
@verult

verult Dec 21, 2018

Contributor

ditto

func SIGDescribe(text string, body func()) bool {
return ginkgo.Describe("[sig-storage] "+text, body)
return framework.DescribeWithTestContext("[sig-storage] "+text, body)

This comment has been minimized.

Copy link
@verult

verult Dec 21, 2018

Contributor

/cc @rmmh

// time when TestContext has been fully initialized. A callback
// invoked via framework.Describe therefore can add or parameterize
// specs depending on the test configuration.
func DescribeWithTestContext(text string, body func()) bool {

This comment has been minimized.

Copy link
@verult

verult Dec 21, 2018

Contributor

/cc @rmmh

@k8s-ci-robot k8s-ci-robot requested review from rmmh Dec 21, 2018

@pohly pohly force-pushed the pohly:storage-volume-testsuites-avoid-tests branch from 772e3ef to d6a43cc Dec 21, 2018

@pohly

This comment has been minimized.

Copy link
Contributor Author

commented Dec 21, 2018

@verult pushed, please have another look.

@verult
Copy link
Contributor

left a comment

Posting comments for first 2 commits for now; I'm working on the last 2

}
return true

This comment has been minimized.

Copy link
@verult

verult Dec 21, 2018

Contributor

nit:

Something like this is arguably easier to read:

if framework.NodeOSDistroIs("gci", "ubuntu", "custom") {
    if pattern.FsType != "xfs" || framework.NodeOSDistroIs("ubuntu", "custom") {
        return true
    }
}
return false

To improve it further, I would personally use this (but up to you):

return framework.NodeOSDistroIs("gci", "ubuntu", "custom") &&
       (pattern.FsType != "xfs" ||
        framework.NodeOSDistroIs("ubuntu", "custom"))

The double negatives in your code made it a bit hard for me to parse

This comment has been minimized.

Copy link
@pohly

pohly Dec 21, 2018

Author Contributor

I also like the latter better, but again didn't want to change too much to keep the review simpler: when a ginkgo.Skip gets replaced by a return true or now return true, then it is obvious that the overall logic is unchanged.

I'll change it.

This comment has been minimized.

Copy link
@pohly

pohly Jan 4, 2019

Author Contributor

Done.

}
return true

This comment has been minimized.

Copy link
@verult

verult Dec 21, 2018

Contributor

ditto

This comment has been minimized.

Copy link
@pohly

pohly Jan 4, 2019

Author Contributor

Done.

@@ -37,8 +37,8 @@ import (
type TestSuite interface {
// getTestSuiteInfo returns the TestSuiteInfo for this TestSuite
getTestSuiteInfo() TestSuiteInfo
// skipUnsupportedTest skips the test if this TestSuite is not suitable to be tested with the combination of TestPattern and TestDriver
skipUnsupportedTest(testpatterns.TestPattern, TestDriver)
// isTestSupported returns false if this TestSuite cannot be tested with the combination of TestPattern and TestDriver

This comment has been minimized.

Copy link
@verult

verult Dec 21, 2018

Contributor

nit: "returns true if this TestSuite can be..."

This comment has been minimized.

Copy link
@pohly

pohly Jan 4, 2019

Author Contributor

Done.

// is not suitable to be tested.
// isTestSupported will determine if the combination of driver, testsuite, and testpattern
// can be tested.
//
// Whether it needs to be skipped is checked by following steps:

This comment has been minimized.

Copy link
@verult

verult Dec 21, 2018

Contributor

"Whether the test combination is supported is checked..."

This comment has been minimized.

Copy link
@pohly

pohly Jan 4, 2019

Author Contributor

Done.

// FilterTestDriver is an optional interface that drivers can
// implement to filter out unsuitable tests while tests get defined.
type FilterTestDriver interface {
// IsTestSupported returns false if the Testpattern is not

This comment has been minimized.

Copy link
@verult

verult Dec 21, 2018

Contributor

Testpattern -> TestPattern

nit: "...returns true if the TestPattern is suitable..."

This comment has been minimized.

Copy link
@pohly

pohly Jan 4, 2019

Author Contributor

Done.

@verult
Copy link
Contributor

left a comment

2nd half look good. Added some minor comments

cs: driver.GetDriverInfo().Config.Framework.ClientSet,
pvc: resource.pvc,
sc: resource.sc,
dInfo: driver.GetDriverInfo(),

This comment has been minimized.

Copy link
@verult

verult Dec 21, 2018

Contributor

Can we remove the field entirely?

This comment has been minimized.

Copy link
@pohly

pohly Dec 21, 2018

Author Contributor

Which field? I'm sorry, but it's not obvious in the Github UI.

This comment has been minimized.

Copy link
@verult

verult Dec 21, 2018

Contributor

Np. dInfo

This comment has been minimized.

Copy link
@pohly

pohly Dec 21, 2018

Author Contributor

But dInfo is gone already, I'm removing it in this PR already because it was clearly redundant.

Is there anything else you want to get removed?

This comment has been minimized.

Copy link
@pohly

pohly Jan 4, 2019

Author Contributor

#72434 removes the entire struct holding these fields.

@@ -120,13 +119,11 @@ func (p *provisioningTestSuite) execTest(driver TestDriver, pattern testpatterns
// Ginkgo's "Global Shared Behaviors" require arguments for a shared function
// to be a single struct and to be passed as a pointer.
// Please see https://onsi.github.io/ginkgo/#global-shared-behaviors for details.
testProvisioning(&input)
testProvisioning(driver, &input)

This comment has been minimized.

Copy link
@verult

verult Dec 21, 2018

Contributor

Just want to make sure I understand: BeforeEach() runs before It()s, but not necessarily before testProvisioning()?

This comment has been minimized.

Copy link
@pohly

pohly Dec 21, 2018

Author Contributor

First testProvisioning runs. It's the function which registers tests (aka It callbacks). BeforeEach is called while tests get registered, and it's callback then gets called before the callback given to It.

That's the background behind this comment before testProvisioning: the input variable gets instantiated before calling testProvisioning, but it's content only gets initialized later when the corresponding BeforeEach callback executes.

FWIW, I find this code pattern of setting up per-test input data unnecessarily complex. When reading tests, it is not obvious where the input data really comes from. But this is unrelated to this PR, and I don't want to change it. It's part of the overall testsuites design.

This comment has been minimized.

Copy link
@pohly

pohly Jan 4, 2019

Author Contributor

PR #72434 addresses this.

This comment has been minimized.

Copy link
@mkimuram

mkimuram Jan 4, 2019

Contributor

@pohly

- testProvisioning(&input)
+ testProvisioning(driver, &input)

I didn't know that the restriction of single variable and pointer only applies to the argument for execution phase and we can pass additional arguments for preparation phase, here. Good improvement!

e2e/framework: support delayed spec description
A common pattern is to define a test and then skip it at runtime if
the current provider isn't the right one. The downside of this
approach is the high number of useless tests in many test runs.

With the new framework.Describe API it becomes possible to delay spec
definition until TestContext is initialized. Then the spec can be
created dynamically so that only tests that actually can run get
defined.

@pohly pohly force-pushed the pohly:storage-volume-testsuites-avoid-tests branch from d6a43cc to 44f677f Jan 4, 2019

@k8s-ci-robot k8s-ci-robot added size/L and removed size/XXL labels Jan 4, 2019

@pohly

This comment has been minimized.

Copy link
Contributor Author

commented Jan 4, 2019

I think I addressed all review feedback except the open question around logging of tests that don't get registered (#70992 (comment)) and rebased onto current master.

I put the logging topic onto the agenda for the next SIG-storage meeting on January 8th.

@pohly

This comment has been minimized.

Copy link
Contributor Author

commented Jan 4, 2019

/test pull-kubernetes-integration

@mkimuram
Copy link
Contributor

left a comment

@pohly

Sorry for late response.

I still don't completely understand the logic behind DescribeWithTestContext, the other logics look good to me if it works correctly.
Other than above, let me add minor comments for naming and comment.

SkipUnsupportedTest(testpatterns.TestPattern)
}

// FilterTestDriver is an optional interface that drivers can

This comment has been minimized.

Copy link
@mkimuram

mkimuram Jan 4, 2019

Contributor

IIUC, this PR aims to separate skip to two types, runtime skip and preparation-phase skip. (There will be better wording, though.) This concept sounds good and the comments for FilterTestDriver and BeforeEachTestDriver explain this concept well, but it's hard to guess it from their names and their method's names.

So, it would be better that interfaces and methods are named to imply this concept.

This comment has been minimized.

Copy link
@pohly

pohly Jan 4, 2019

Author Contributor

Do you have a better name for FilterTestDriver? I'm completely open for suggestions here.

This comment has been minimized.

Copy link
@mkimuram

mkimuram Jan 4, 2019

Contributor

@pohly

Now that BeforeEachTestDriver is deleted, so there is no need to compare and distinguish these two for whether they are runtime or not. In this situation, I think that current naming is good enough.

Thank you for your consideration and handling on this.

}

// BeforeEachTestDriver is an optional interface that drivers can
// implement to hook into test execution. It can be used to initialize

This comment has been minimized.

Copy link
@mkimuram

mkimuram Jan 4, 2019

Contributor

Do we need to allow this method to initialize additional resources?
Are there any existing use cases for it?

(I'm worrying about that it was due to my previous mistake in putting resource initialization code for GCE driver here, which has already been moved to right place in another commit.)

This comment has been minimized.

Copy link
@pohly

pohly Jan 4, 2019

Author Contributor

It's not really needed and in fact later (#72002) I ended up removing it again because it didn't turn out to be that useful. CreateDriver gets called at the same time and can replace the extra BeforeEachTestDriver.

I'll remove it directly in this PR instead.

This comment has been minimized.

Copy link
@pohly

pohly Jan 4, 2019

Author Contributor

Done.

pohly added some commits Nov 9, 2018

e2e/storage: don't define tests that are just going to be skipped
Most of the checks whether a certain test can run can already be made
while defining tests because it only depends on static
information (driver, testsuite, test pattern, test run
configuration).

Not defining tests that can't run leads to cleaner test suites (no
useless tests) and faster execution (in particular when a driver or
server need to be brought up before starting a test that then doesn't
do anything).

While implementing this, the TestDriver.SkipUnsupportedTest method was
moved into a separate interface because many drivers do not need to
implement it and the the semantic of the functions was inverted ("is
supported" instead of "skip) to make the boolean logic simpler.

Runtime skipping is still supported in CreateDriver.
e2e: update bazel BUILD files
Generated via hack/update-bazel.sh.
e2e/storage: skip provisioning tests if unsupported
Previously, running the tests against a driver without dynamic
provisioning would have resulted in a test failure - hasn't happened
in practice because all drivers support it.
test/e2e: skip unsupported provisioning sub-tests
When passing the driver to the test setup function directly, it
becomes possible to skip test definitions at init time instead of
skipping test execution at runtime. The structs also can be
simplified.

@pohly pohly force-pushed the pohly:storage-volume-testsuites-avoid-tests branch from 44f677f to e34e7bb Jan 4, 2019


var specs []spec

// DescribeWithTestContext is like ginkgo.Describe, but in contrast to

This comment has been minimized.

Copy link
@wongma7

wongma7 Jan 7, 2019

Contributor

I would like the comment to mention the motivation for why someone might want to use this function to "invoke spec bod[ies]" only after "TestContext has been fully initialized". i.e. this exists so that we can skip tests in spec bodies based on TestContext. Otherwise the file is too generic looking and the technical language imposing :]

This comment has been minimized.

Copy link
@wongma7

wongma7 Jan 7, 2019

Contributor

I understand "can add or parameterize specs depending on the test configuration" basically explains it but I would word it differently is all.

This comment has been minimized.

Copy link
@pohly

pohly Jan 8, 2019

Author Contributor

I had another use case in mind: testing CSI drivers with the upstream e2e test binary.

With this change, the e2e test binary could have a command line parameter(s) that enable testing of a driver by providing the necessary information. The "TestDriver" instance that is given to the storage test suite then gets defined based on these parameters.

Without this change, the same can still be done, but would have to be based on parsing environment variables.

@pohly

This comment has been minimized.

Copy link
Contributor Author

commented Jan 8, 2019

In the SIG-storage meeting today, the first question was why skipping tests at runtime is considered bad. I pointed out that drivers get deployed for each test, for which the reply was that a) the driver installation should be triggered when a test really runs (which I don't think can be done nicely) or b) driver installation should happen before the test suite runs.

However, it's not clear how to get the driver installation hooked into the cluster setup. This will require further discussion.

But even when driver installation is not part of the test setup, skipped tests still take time. I don't have hard numbers about that, though. Let's take the gcepd production driver as example, which is pre-installed. @msau42 what's your opinion, do you prefer skipping unusable tests for that driver at runtime or at testsuite definition time?

@msau42

This comment has been minimized.

Copy link
Member

commented Jan 8, 2019

But even when driver installation is not part of the test setup, skipped tests still take time. I don't have hard numbers about that, though.

Could you try running locally some non-supported environment tests, with only the change that adds skip to CreateDriver, to measure how much just skipping the tests take?

@pohly

This comment has been minimized.

Copy link
Contributor Author

commented Jan 9, 2019

That can already be tested without any changes. On a 1.13 cluster
brought up with hack/local-up-cluster.sh, running all tests involving
the external gcepd driver takes:

$ go test -v ./test/e2e -args -ginkgo.v -ginkgo.focus=Feature..gcePD-external
...
Will run 31 of 2161 specs
...

S [SKIPPING] in Spec Setup (BeforeEach) [6.157 seconds]
[sig-storage] CSI Volumes
/nvme/gopath/src/k8s.io/kubernetes/test/e2e/storage/utils/framework.go:22
  [Driver: pd.csi.storage.gke.io][Feature: gcePD-external]
  /nvme/gopath/src/k8s.io/kubernetes/test/e2e/storage/csi_volumes.go:129
    [Testpattern: Dynamic PV (ext3)] volumes
    /nvme/gopath/src/k8s.io/kubernetes/test/e2e/storage/testsuites/volumes.go:128
      should be mountable [BeforeEach]
      /nvme/gopath/src/k8s.io/kubernetes/test/e2e/storage/testsuites/volumes.go:169

      Jan  9 06:01:46.774: Only supported for providers [gce gke] (not )

      /nvme/gopath/src/k8s.io/kubernetes/test/e2e/framework/util.go:292
...
Ran 0 of 2161 Specs in 190.920 seconds
SUCCESS! -- 0 Passed | 0 Failed | 0 Pending | 2161 Skipped --- PASS: TestE2E (191.01s)

That's 6 seconds per test. Not too bad and 3 minutes overall don't matter in a CI setup. It could be more annoying when running the tests interactively, though.

But as soon as CreateDriver is expensive, it gets worse of course. We might be able to solve this (eventually) by moving the driver installation into the test cluster setup phase, but what about drivers like NFS which bring up a NFS server for each test? Should and can that also be part of the cluster setup?

It might be easier to delay the CreateDriver call. Right now it gets called outside of the test suites. If we make the test suites responsible for calling it right before running a specific test, then runtime skipping of tests that aren't supported by the driver + test pattern combination becomes cheap.

@pohly

This comment has been minimized.

Copy link
Contributor Author

commented Jan 9, 2019

Even better, if the framework instantiation also gets moved after checking the driver + test pattern combination, then we can also save those 6 seconds.

@pohly

This comment has been minimized.

Copy link
Contributor Author

commented Jan 10, 2019

There is value in skipping tests at runtime (consistency, result reporting via Ginkgo). By reordering the initialization steps it is possible to address the main pain point that this PR was meant for (avoiding useless driver deployment in skipped tests), see PR #72434. Therefore I am closing this PR.

@pohly pohly closed this Jan 10, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.