Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ephemeral/generic #383

Merged
merged 6 commits into from Oct 29, 2019
Merged

Ephemeral/generic #383

merged 6 commits into from Oct 29, 2019

Conversation

avalluri
Copy link
Contributor

Added ephemeral volume provisiong

@avalluri avalluri force-pushed the ephemeral/generic branch 2 times, most recently from b193579 to 439a346 Compare August 13, 2019 12:00
@avalluri avalluri requested review from okartau and pohly and removed request for okartau August 22, 2019 08:02
@pohly
Copy link
Contributor

pohly commented Sep 25, 2019

Let's consider next steps...

We need to make changes for 1.16 (https://github.com/kubernetes-csi/docs/pull/209/files):

  • extend driver info to include volumeLifecycleModes: persistent, ephemeral in a new deployment for 1.16
  • enable pod info
  • in the code which decides between "ephemeral" and "persistent" check podinfo for csi.storage.k8s.io/ephemeral and if set, use that (and only that); if not set, use the heuristic

We need automated testing on 1.16. @okartau, you were working on an extension of our test cluster setup for Cent OS + Kubernetes 1.16, right? Is that ready for merging?

@avalluri
Copy link
Contributor Author

extend driver info to include volumeLifecycleModes: persistent, ephemeral in a new deployment for 1.16

We can deal this volumeLifecycleModes change as part of Kubernetes 1.16 deployment change. This shouldn't effect v1.15.

enable pod info

our deployment already using podInfoOnMount: true,

in the code which decides between "ephemeral" and "persistent" check podinfo for csi.storage.k8s.io/ephemeral and if set, use that (and only that); if not set, use the heuristic

This check is already part of this change. In addition to make it work for v1.15, we provision ephemeral only volume if all below conditions are met:

  • Volume with given volume-id is not found
  • Missing StagingTargetPath in the NodePublishRequest
  • No provisioner information in NodePlubishRequest.VolumeContext

@pohly
Copy link
Contributor

pohly commented Sep 30, 2019 via email

@pohly
Copy link
Contributor

pohly commented Oct 4, 2019

All PRs are now getting tested against 1.16 (on Fedora) and against 1.15 (on Clear). Can you update the PR such that it works with both?

To start a cluster with 1.16 locally, use TEST_DISTRO=fedora TEST_KUBERNETES_VERSION=1.16 make start.

@avalluri avalluri force-pushed the ephemeral/generic branch 6 times, most recently from af0a2bc to f43203a Compare October 7, 2019 12:45
Copy link
Contributor

@pohly pohly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's support 1.16 from the start, i.e. avoid accumulating technical debt.

@avalluri
Copy link
Contributor Author

avalluri commented Oct 7, 2019

Let's support 1.16 from the start, i.e. avoid accumulating technical debt.

I have pushed the 1.16 deployment changes.

Copy link
Contributor

@pohly pohly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of copying quite a few files into deploy/kustomize/kubernetes-1.16-* dirs, would it be possible to take the existing Kubernetes 1.15 deployment and just patch the volumeLifecycleField in?

@@ -0,0 +1,6 @@
# Append volumeLifeCycltModes to DervierInfo valid only for Kubernetes v1.16+
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typos.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of copying quite a few files into deploy/kustomize/kubernetes-1.16-* dirs, would it be possible to take the existing Kubernetes 1.15 deployment and just patch the volumeLifecycleField in?

Do you mean something this: avalluri@099d91d

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typos.

Fixed

@pohly
Copy link
Contributor

pohly commented Oct 7, 2019

Do we have tests? Probably we need to vendor Kubernetes 1.16.0 first, then we can enable these tests: https://github.com/kubernetes/kubernetes/blob/master/test/e2e/storage/testsuites/ephemeral.go

Copy link
Contributor

@pohly pohly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of copying quite a few files into deploy/kustomize/kubernetes-1.16-* dirs, would it be possible to take the existing Kubernetes 1.15 deployment and just patch the volumeLifecycleField in?

Do you mean something this:
avalluri@099d91d

Yes, exactly what I had in mind - thanks for turning it into code :-)

But comments are currently inconsistent, some of them look like copy-and-paste leftovers. I think it is enough to have one README.md in the deploy/kubernetes-1.16 directory that explains the difference and then drop all other comments from the directories and kustomization.yaml files underneath it.

Another nit: also check the file endings in those file - sometimes you have two line breaks, sometimes one, sometimes none.

Ending a file with a line break looks best IMHO.

@avalluri avalluri force-pushed the ephemeral/generic branch 2 times, most recently from 1303e50 to f37b3b5 Compare October 9, 2019 07:08
@avalluri
Copy link
Contributor Author

avalluri commented Oct 9, 2019

Instead of copying quite a few files into deploy/kustomize/kubernetes-1.16-* dirs, would it be possible to take the existing Kubernetes 1.15 deployment and just patch the volumeLifecycleField in?

Do you mean something this:
avalluri@099d91d

Yes, exactly what I had in mind - thanks for turning it into code :-)

But comments are currently inconsistent, some of them look like copy-and-paste leftovers. I think it is enough to have one README.md in the deploy/kubernetes-1.16 directory that explains the difference and then drop all other comments from the directories and kustomization.yaml files underneath it.

Another nit: also check the file endings in those file - sometimes you have two line breaks, sometimes one, sometimes none.

Ending a file with a line break looks best IMHO.

I just pushed a new revision, that covers all these review comments.

@pohly
Copy link
Contributor

pohly commented Oct 11, 2019

devicemanager: Make GetDevice() return pointer

Why?

README.md Outdated
@@ -344,8 +344,8 @@ the node cannot start until resources become available.

#### Usage on Kubernetes

Kubernetes cluster administrators can expose above mentioned [volume
persistency types](#volume-persistency) to applications using
Kubernetes cluster administrators can expose above mentioned Persistent and Cache volumes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Lowercase please. These are normal English words.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

README.md Outdated
@@ -400,6 +400,9 @@ application](deploy/common/pmem-app-cache.yaml) example.
* A node is only chosen the first time a pod starts. After that it will always restart
on that node, because that is where the persistent volume was created.

Volume requests embedded in Pod spec are provisioned as Ephemeral volumes. Check with
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

README.md Outdated
@@ -515,7 +518,7 @@ Kubernetes cluster has been verified on:
|-------------------|--------------------------------|------------------------------- |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bit unrelated, but as we add another feature gate here: can you turn "feature gates" into a link to https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/ ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, added the link.

@@ -76,7 +89,7 @@ func (ns *nodeServer) NodeGetVolumeStats(ctx context.Context, req *csi.NodeGetVo
return nil, status.Error(codes.Unimplemented, "")
}

func (ns *nodeServer) NodePublishVolume(ctx context.Context, req *csi.NodePublishVolumeRequest) (*csi.NodePublishVolumeResponse, error) {
func (ns *nodeServer) NodePublishVolume(ctx context.Context, req *csi.NodePublishVolumeRequest) (*csi.NodePublishVolumeResponse, error) { // nolint(gocyclo)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect the linter is correct: this function is becoming too large. Can you split it up? The volume creation in the ephemeral case looks like a good candidate for a separate function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, I just pushed the ephemeral part to a separate function.

// 1) No Volume found with given volume id
// 2) No provisioner info found in VolumeContext "storage.kubernetes.io/csiProvisionerIdentity"
// 3) No StagingPath in the request
device, _ := ns.cs.dm.GetDevice(req.VolumeId)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This ignores the error from GetDevice. Wouldn't it be cleaner to introduce a "not found" error type and check for that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

devicemanager: Make GetDevice() return pointer

Why?

Its easier/cleaner to deal with pointers than values for a struct type, especially while check if a device is valid using nil

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This ignores the error from GetDevice. Wouldn't it be cleaner to introduce a "not found" error type and check for that?

Yes, I agree to introduce error codes. But in this case, the only error returned(by both implementations) is "not found" so we are not missing/ignoring any other errors.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So here the caller is allowed to ignore the error code because the only error is "not found", which isn't documented. That looks rather fragile to me - what if we introduce a second error later? Are you sure that this call site will then be updated?

If you don't want to check the error here and continue to test for nil, then better remove the error return from GetDevice entirely. Then it is obvious from the function signature that the function always succeeds (returns the device) or returns nil when not found.

Copy link
Contributor Author

@avalluri avalluri Oct 18, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GetDevice() now returns os.ErrNotExist, and the same check added here.

if ephemeral {
val, ok := req.GetVolumeContext()["size"]
if !ok {
return nil, status.Error(codes.InvalidArgument, "size not found in volume context")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error message is eventually going to be shown to the user somewhere (or least that's the hope - need to check...). volume context doesn't mean anything to the user. size not specified for ephemeral inline volume may be better.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it shows up:

  Warning  FailedMount  0s (x6 over 17s)  kubelet, pmem-csi-pmem-govm-worker3  MountVolume.SetUp failed for volume "my-csi-volume" : kubernetes.io/csi: mounter.SetupAt failed: rpc error: code = InvalidArgument desc = size not found in volume context

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, I rephrased the error test as you suggested.


size, err := resource.ParseQuantity(val)
if err != nil {
return nil, status.Error(codes.InvalidArgument, fmt.Sprintf("Failed to parse given size '%s': %s", val, err.Error()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Errors should always start with lowercase. That's because they might be wrapped (i.e. get another fooo: prefix).

Better use size of ephemeral inline volume (see comment above).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -400,6 +400,9 @@ application](deploy/common/pmem-app-cache.yaml) example.
* A node is only chosen the first time a pod starts. After that it will always restart
on that node, because that is where the persistent volume was created.

Volume requests embedded in Pod spec are provisioned as Ephemeral volumes. Check with
provided [example application](deploy/kubernetes-1.15/pmem-app-ephemeral.yaml) for
ephemeral volume usage.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of just the example, we need full documentation of all fields that can be set as volumeAttributes.

Copy link
Contributor Author

@avalluri avalluri Oct 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I added a table that lists all the possible volumeAttributes keys and their meaning.

options := []string{"bind"}
if readOnly {
options = append(options, "ro")
createResp, err := ns.cs.CreateVolume(ctx, createReq)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need full validation of the volume parameters. Unknown (or miss-spelled) parameters must trigger an error. Otherwise user errors may silently lead to a different than intended outcome.

As a testcase, I tried with foo: bar added to pmem-app-ephemeral.yaml and that was silently ignored.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And we also need an automated test case for this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beware that we enable pod info on mount and thus have to ignore all csi.storage.k8s.io/* entries. Unknown fields may get added under that namespace in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, Now we validate the volume context and raise an error if occurred any unexpected keys. Testcase is pending.

}

// Create filesystem
err = ns.provisionDevice(device, req.GetVolumeCapability().GetMount().GetFsType())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have tests which use a non-default filesystem, do we? Let's not make this a blocker for merging this PR, but keep it in mind: #431

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have tests which use a non-default filesystem, do we?

I am not sure about ephemral volumes, But at least for persistent volumes, Volumes test-suite covers this. Here is the needed change in our test code: avalluri@bd357fb

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of ephemeral volumes here.
_

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, current ephemeral tests are quite limited and not supported to choose the filesystem type for volume.

@avalluri avalluri force-pushed the ephemeral/generic branch 3 times, most recently from 1145792 to 7edb07d Compare October 21, 2019 09:22
@@ -0,0 +1,13 @@
# Patch Kubernetes 1.16 deployment such that driver
# supports both persistent and ephemral volume modes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: "ephemeral"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in avalluri@f37b3b5

@@ -0,0 +1,6 @@
# Append volumeLifeCycleModes to DriverInfo valid only for Kubernetes v1.16+
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are still a lot of kustomization.yaml files which do something other than just adding the volumeLifecycleMode, for example deploy/kustomize/kubernetes-1.16-lvm-testing/kustomization.yaml.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's my bad, result of the wrong commit. This commit should be replaced with avalluri@f37b3b5. Now I fixed this.

// 2) No provisioner info found in VolumeContext "storage.kubernetes.io/csiProvisionerIdentity"
// 3) No StagingPath in the request
device, err := ns.cs.dm.GetDevice(req.VolumeId)
if err != nil && err != os.ErrNotExist {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't GetDevice wrapping os.ErrNoExist? Then a direct comparison will always fails. You have to unwrap with errors.Cause, then compare.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are right, I should use os.IsNotExist(). This check was added when initially GetDevice() returning direct os.ErrNotExist without wrapping, later I added wrapping.

@avalluri avalluri force-pushed the ephemeral/generic branch 2 times, most recently from 941c17a to a0fa977 Compare October 21, 2019 12:33
Copy link
Contributor

@pohly pohly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The question around checking for os.ErrNotExit is also still open.

README.md Outdated
@@ -325,7 +325,7 @@ can reuse existing local data when restarting.
Volume | Kubernetes | PMEM-CSI | Limitations
--- | --- | --- | ---
Persistent | supported | supported | topology aware scheduling<sup>1</sup>
Ephemeral | [in design](https://github.com/kubernetes/enhancements/blob/master/keps/sig-storage/20190122-csi-inline-volumes.md#proposal) | in design | topology aware scheduling<sup>1</sup>, resource constraints<sup>2</sup>
Ephemeral | supported | supported | resource constraints<sup>2</sup>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably should add to the Kubernetes column remarks about which versions support it, something like "alpha support in 1.15 (CSIInlineVolume), beta in 1.15".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When looking at the rendered file, I noticed that the definition list (https://github.com/intel/pmem-csi#volume-persistency) is no longer rendered with the text on the next paragraph. This intentionally had two spaces (https://meta.stackexchange.com/questions/72395/is-it-possible-to-have-definition-lists-in-markdown/298872#298872) but we lost those in b30c9a0 - can you add them back?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When looking at the rendered file, I noticed that the definition list (https://github.com/intel/pmem-csi#volume-persistency) is no longer rendered with the text on the next paragraph. This intentionally had two spaces (https://meta.stackexchange.com/questions/72395/is-it-possible-to-have-definition-lists-in-markdown/298872#298872) but we lost those in b30c9a0 - can you add them back?

Done.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably should add to the Kubernetes column remarks about which versions support it, something like "alpha support in 1.15 (CSIInlineVolume), beta in 1.15".

Added a column remark as suggested. Can you please check if it's good enough.

README.md Outdated
|---|-------|--------|-------------|
|size|Size of the requested ephemeral volume|No||
|nsmode|PMEM namespace mode of requested ephemeral volume|Yes|"fsdax"|
|eraseAfter|Should erase volume content after use|Yes|"true"|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Should" or "will"? I think the meaning becomes more obvious when using a different verb in the description: "eraseAfter - clear all data after use and before deleting the volume"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced the description as suggested.

README.md Outdated
|key|meaning|optional|default value|
|---|-------|--------|-------------|
|size|Size of the requested ephemeral volume|No||
|nsmode|PMEM namespace mode of requested ephemeral volume|Yes|"fsdax"|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What other values besides "fsdax" are valid?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even sector is a valid option here. Modified the meaning column to: "PMEM namespace mode(fsdax/sector) of requested ephemeral volume"

@@ -0,0 +1,22 @@
# This demonstrates how to use inline pmem csi volumes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Full sentence, end with a dot at the end?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

There are no specific deployment changes needed for Kubernetes v1.16 except
that the CSIDriver supports new field of specifying volume life-cycle modes
supported by the driver. So we take Kubernetes-1.15 final
deployments(not base overlays) and patch with this new field.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: missing space before opening bracket

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@avalluri
Copy link
Contributor Author

The question around checking for os.ErrNotExit is also still open.

Now the code uses os.IsErrorNotExist() call, which does the digging and checks for real error. This should address the unwrapping issue you pointed.

@avalluri avalluri force-pushed the ephemeral/generic branch 2 times, most recently from 589927d to 6f8e30a Compare October 22, 2019 12:48
@pohly
Copy link
Contributor

pohly commented Oct 23, 2019

The question around checking for os.ErrNotExit is also still open.
Now the code uses os.IsErrorNotExist() call, which does the digging and checks for real error. This should address the unwrapping issue you pointed.

Hmm, where do you see that os.IsErrorNotExist unwraps errors? The API only says "It is satisfied by ErrNotExist as well as some syscall errors" and the source code confirms that (https://golang.org/src/os/error.go#L92). Errors that wrap ErrNotExist are opaque to it, as the following test program confirms:

package main

import (
	"fmt"
	"github.com/pkg/errors"
	"os"
)

func main() {
	err := errors.Wrap(os.ErrNotExist, "wrapping")
	fmt.Printf("%q: os.IsNotExists %v, os.IsNotExists(errors.Cause) %v\n",
		err,
		os.IsNotExist(err),
		os.IsNotExist(errors.Cause(err)))
}
$ go run ./error.go 
"wrapping: file does not exist": os.IsNotExists false, os.IsNotExists(errors.Cause) true

@avalluri
Copy link
Contributor Author

Hmm, where do you see that os.IsErrorNotExist unwraps errors?

Sorry, I did not update the comment. I later figured out that it does not unwrap, but I already updated the code to use errors.Cause().

@avalluri avalluri force-pushed the ephemeral/generic branch 2 times, most recently from 1929859 to 513afbc Compare October 24, 2019 08:02
@@ -104,7 +106,7 @@ func (ctx *Context) GetNamespaceByName(name string) (*Namespace, error) {
}
}
}
return nil, fmt.Errorf("Not found")
return nil, errors.Wrapf(os.ErrNotExist, "namespace '%s' not found", name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, still not convinced that reusing os.ErrNotExist is the right approach here. It means that the error occurred during some file-level operation, which isn't the case here. What if at some point this layer really wants to expose OS-level errors as well as the semantic "namespace not found"? os.ErrNotExist then cannot be used for both.

Please follow the example from https://blog.golang.org/go1.13-errors and create your own NotFound sentinel value.

Also consider what the final error text will look like when wrapping: the code above leads to "namespace 'foo' not found: file does not exist" (https://golang.org/src/internal/oserror/errors.go). "file" - which file?!

That message doesn't make sense, which shows that reusing os.ErrNotExist isn't suitable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Defined new error codes for DeviceManager interface: fe4f6e9

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change also demands go.1.13 in CI to build.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this need Go 1.13? You don't need to use the new features from Go 1.13; github.com/pkg/errors is good enough. With "follow the example rom https://blog.golang.org/go1.13-errors and create your own NotFound sentinel value" I meant this:

var ErrNotFound = errors.New("not found")

That works also in older Go.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would have preferred to switch from Go 1.12 to 1.13 in a separate PR. Now I am not sure how to proceed: rewrite the code so that it works with 1.12 or switch to 1.13 as minimum version? Suggestions?

Copy link
Contributor Author

@avalluri avalluri Oct 28, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would have preferred to switch from Go 1.12 to 1.13 in a separate PR. Now I am not sure how to proceed: rewrite the code so that it works with 1.12 or switch to 1.13 as minimum version? Suggestions?

Ok, then shall we deal this as part of #420, which have other changes to device manager and even unit tests. In this PR I would prefer ephemeral code to make use of existing DeviceManager API do a nil check for GetDevice() call in NodeServer, as it was in my initial change?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this PR I would prefer ephemeral code to make use of existing DeviceManager API

Okay.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I reverted the error code change. Now it's ready for final review.

README.md Outdated
|key|meaning|optional|default value|
|---|-------|--------|-------------|
|size|Size of the requested ephemeral volume|No||
|nsmode|PMEM namespace mode(fsdax/sector) of requested ephemeral volume|Yes|"fsdax"|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not obvious that fsdax and sector are the valid values. What about this?

key meaning optional values
size Size of the requested ephemeral volume No
nsmode PMEM namespace mode of requested ephemeral volume Yes fsdax (default), sector
eraseAfter Clear all data after use and before deleting the volume Yes true (default), false

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even better, mark up the keys as size, nsmode and eraseAfter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

func (m *manifestDriver) GetCSIDriverName(config *testsuites.PerTestConfig) string {
// Return real driver name.
// We can't use m.driverInfo.Name as its not the real driver name
return "pmem-csi.intel.com"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

manifestDriver was meant to be a configurable implementation implementation of the test driver interface. Hard-coding "pmem-csi.intel.com" here is inconsistent with that.

Two options:

  • keep the current approach an introduce a m.CSIDriverName field that gets set to "pmem-csi.intel.com"
  • remove the manifestDriver struct and replace the implementation with something that is specific to PMEM-CSI

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Chose option 1. Done

Also took the opportunity to make sure GetDevice() returns os.ErrNotExist error
if it could not find the device with given Id.
Added support for provisioning of ephemeral/csi-inline volumes. This only works
in Kubernetes v1.15+.

Kubernetes v1.15 does not provide any additional information about 'ephemeral'
volumes in NodePublishVolume() call. So, to address this issue, we assume to
create ephemeral volume if the following all cases are met:
  - Volume with given volume-id is not found
  - Missing StagingTargetPath in the NodePublishRequest
  - No provisioner information in NodePlubishRequest.VolumeContext

Added sample application deployment to illustrate ephemeral volume usage.

TODO: Extend e2e tests to ephemeral volumes.
Kubernetes v1.16 CSIDriver object spec supports new field named
'volumeLifecycleModes'. Which supposed to list all the volume provisioning life
cycles supported by the drivers - persisten, ephemeral.

We take final kubernetes-1.15 deployment files and patch CSIDriver to generate
Kubernetes-1.16 deployment files.
ListVolumes is supposed to act on Kubernetes persistent volumes, so we should
not return ephemeral volumes as list elements.
Placing double spaces at the end of a line to insert a line break in a list
item.
@pohly pohly merged commit 6b3e8a4 into intel:devel Oct 29, 2019
@avalluri avalluri deleted the ephemeral/generic branch October 20, 2020 20:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants