Skip to content
This repository has been archived by the owner on Jul 22, 2024. It is now read-only.

Fix/ub 1103 ds8k storwize lun0 rescan #265

Merged
merged 33 commits into from
Nov 27, 2018

Conversation

yixuan0825
Copy link
Contributor

@yixuan0825 yixuan0825 commented Nov 8, 2018

The fix has passed the CI testing for DS8k.
The fix include:

  1. Add a new struct:ScbeVolumeMapInfo, it will include the Host and LunNumber
    The LunNumber will be get from GetVolMapping.
  2. Add two new function:
    function isLun0 to judge if the volume is Lun0
    RescanSCSILun0 to scan the Lun0 volume.
  3. In the function RescanSCSILun0, I used WriteFile to rescan the volume.
    In the design, I wanted to use the command echo to rescan the volume . But the command doesn't work here. So I used WriteFile to rescan the volume, it works well.

This change is Reviewable

@coveralls
Copy link

coveralls commented Nov 8, 2018

Coverage Status

Coverage increased (+1.3%) to 59.09% when pulling 73c226f on fix/UB-1103_ds8k_storwize_lun0_rescan into ae6a0a5 on dev.

Copy link
Collaborator

@olgashtivelman olgashtivelman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 18 of 18 files at r1.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @yixuan0825 and @shay-berman)


local/scbe/scbe.go, line 433 at r1 (raw file):

	s.locker.WriteLock(attachRequest.Host)
	s.logger.Debug("Attaching", logs.Args{{"volume", existingVolume}})
	if _, err := scbeRestClient.MapVolume(existingVolume.WWN, attachRequest.Host); err != nil {

are you sure you need the ":" in the ":=" ? since u see that err was defined above. and there seems to be no new variables defined.


remote/mounter/scbe.go, line 178 at r1 (raw file):

		return false
	}
	if int(lunNumber.(float64)) == 0 {

not something critical - but I am guessing that lun number must be integer? so why convert to float64?


remote/mounter/block_device_utils/rescan.go, line 84 at r1 (raw file):

	}
	if len(hostInfos) == 0 {
		return b.logger.ErrorRet(err, "There is no fc_host found.", logs.Args{{"fcHostDirectory", fcHostDirectory}})

not sure i understand why do you need to throw an error in this case? cant you just skip the rescan?

in this PR the idempotent issues in the delete request are fixed.
the issues are as follows: (in delete volume function)

* in get Back-end in the storage_api_handler delete function - if volume does not exist then return nil.
* in get volume: if volume does not exist then return success
* in vol delete from XIV : if an 404 error is returned from SC then continue with the flow. (to delete from ubiquity DB)
* in delete volume from DB (for the case where we have a non-db volume) if a volume not found error is returned then return success.

in all the above issues an idempotent there is a warning message in the logger.
comment : an assumption is made that if a volume is removed from the DB then it is definitely removed from the storage since the removal from the DB is the latter operation.

Note: addition PR to handle delete idempotent in the flex side -> #266
Copy link
Contributor

@shay-berman shay-berman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 17 unresolved discussions (waiting on @yixuan0825)


local/scbe/scbe.go, line 51 at r1 (raw file):

	MaxVolumeNameLength      = 63                         // IBM block storage max volume name cannot exceed this length

	GetVolumeConfigExtraParams = 3 // number of extra params added to the VolumeConfig beyond the scbe volume struct

why you add another one?
please explain.


local/scbe/scbe.go, line 304 at r1 (raw file):

	}

	hostAttach := volMapInfo.Host

any reason to use this middle var hostAttach? or u can just use volMapInfo.Host for that?


local/scbe/scbe_rest_client.go, line 211 at r1 (raw file):

	if len(mappings) == 1 {
		lunNumber = mappings[0].LunNumber

r u sure that the LunNumber is the same for XIV \ ds8k and SVC? if not its an issue


local/scbe/scbe_rest_client.go, line 222 at r1 (raw file):

		host = hostResponse.Name
	} else {
		lunNumber = -1

who uses this -1
if someone use it then maybe put it as const


local/scbe/scbe_rest_client_test.go, line 194 at r1 (raw file):

			Expect(err).NotTo(HaveOccurred())
			Expect(volMapInfo.Host).To(Equal(""))
			Expect(volMapInfo.LunNumber).To(Equal(-1))

use const


remote/mounter/scbe.go, line 72 at r1 (raw file):

		_, ok := err.(*block_device_utils.VolumeNotFoundError)
		if ok && isLun0(mountRequest) {
			s.logger.Debug("It is the first lun of DS8K or Storwize, will try to rescan lun0.")
  1. how do you know its for DS8k or svc? what about A9000? is it impact a9000 as well?
  2. change from debug to Info message since its important

remote/mounter/scbe.go, line 74 at r1 (raw file):

			s.logger.Debug("It is the first lun of DS8K or Storwize, will try to rescan lun0.")
			if err := s.blockDeviceMounterUtils.RescanAll(!s.config.SkipRescanISCSI, volumeWWN, false, true); err != nil {
				return "", s.logger.ErrorRet(err, "RescanAll Targets failed", logs.Args{{"volumeWWN", volumeWWN}})

state that is the special rescan thing that was failed.


remote/mounter/scbe.go, line 78 at r1 (raw file):

			devicePath, err = s.blockDeviceMounterUtils.Discover(volumeWWN, true)
			if err != nil {
				return "", s.logger.ErrorRet(err, "Discover failed after rescan lun0 for the second time", logs.Args{{"volumeWWN", volumeWWN}})

fix text
after run rescan and also additional rescan with special lun0 scanning (please specify here exactly what was done)


remote/mounter/scbe_test.go, line 46 at r1 (raw file):

			"Profile": fakeProfile, "StorageType": fakeV7kStorageType, "UsedCapacity": fakeUsedCapacity, "Wwn": "wwn", "attach-to": "node1",
			"LogicalCapacity": fakeLogicalCapacity, "LunNumber": float64(1), "PoolName": "pool", "StorageName": "IBM.2706", "fstype": "ext4"}}
		mountRequestForDS8kLun2 = resources.MountRequest{Mountpoint: "test_mountpointDS8k", VolumeConfig: map[string]interface{}{"Name": "u_vol", "PhysicalCapacity": fakePhysicalCapacity,

please add tests also for A9000, just to make sure its not impact it


remote/mounter/block_device_mounter_utils/block_device_mounter_utils.go, line 156 at r1 (raw file):

// 3. multipathing rescan
// return error if one of the steps fail
func (b *blockDeviceMounterUtils) RescanAll(withISCSI bool, wwn string, rescanForCleanUp bool, Lun0 bool) error {
  1. lower case - keep golang standard please
  2. give better name please - like extraLunZeroScanning

remote/mounter/block_device_mounter_utils/block_device_mounter_utils.go, line 195 at r1 (raw file):

	if Lun0 {
		if err := b.blockDeviceUtils.RescanSCSILun0(); err != nil {
			return b.logger.ErrorRet(err, "Rescan failed for FC Lun0 in the second time", logs.Args{{"protocol", block_device_utils.SCSI}})

dont; state "in second time" because this is a lib - it doesn't know when it run


remote/mounter/block_device_utils/rescan.go, line 84 at r1 (raw file):

	}
	if len(hostInfos) == 0 {
		return b.logger.ErrorRet(err, "There is no fc_host found.", logs.Args{{"fcHostDirectory", fcHostDirectory}})

why to fail if there is nothing to scan?


remote/mounter/block_device_utils/rescan.go, line 90 at r1 (raw file):

		b.logger.Debug("scan the host", logs.Args{{"name: ", host.Name()}})
		fcHostFile := "/sys/class/fc_host/" + host.Name() + "/issue_lip"
		if err := ioutil.WriteFile(fcHostFile, []byte("1"), 0666); err != nil {
  1. r u sure about 666? its the devil :-(
  2. anyway please add UT for this one, i didn't see it and its critical.
  3. don't u want to send debug to the log in error

remote/mounter/block_device_utils/rescan.go, line 95 at r1 (raw file):

		filename := "/sys/class/scsi_host/" + host.Name() + "/scan"
		if err := ioutil.WriteFile(filename, []byte("- - -"), 0666); err != nil {
			continue

don't u want to send debug to the log in error

deeghuge and others added 2 commits November 12, 2018 23:16
Parameter Validation
- Added check before initializing the Spectrumscale backend
- Added check before the creation of fileset
- changed chown to spectrum scale native rest command
- Get Mount point from the spectrum scale configuration
- Check for quota before the creation of fileset
- UT change and Addition
Copy link
Contributor Author

@yixuan0825 yixuan0825 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 14 of 18 files reviewed, 17 unresolved discussions (waiting on @olgashtivelman and @shay-berman)


local/scbe/scbe.go, line 51 at r1 (raw file):

Previously, shay-berman wrote…

why you add another one?
please explain.

Because we add a LunNumber .


local/scbe/scbe.go, line 304 at r1 (raw file):

Previously, shay-berman wrote…

any reason to use this middle var hostAttach? or u can just use volMapInfo.Host for that?

Done.


local/scbe/scbe.go, line 433 at r1 (raw file):

Previously, olgashtivelman wrote…

are you sure you need the ":" in the ":=" ? since u see that err was defined above. and there seems to be no new variables defined.

Done.


local/scbe/scbe_rest_client.go, line 211 at r1 (raw file):

Previously, shay-berman wrote…

r u sure that the LunNumber is the same for XIV \ ds8k and SVC? if not its an issue

Confirmed with SCB developer, it is the same for XIV/ds8k/svc.


local/scbe/scbe_rest_client.go, line 222 at r1 (raw file):

Previously, shay-berman wrote…

who uses this -1
if someone use it then maybe put it as const

It is just for those case without lunNumber.


local/scbe/scbe_rest_client_test.go, line 194 at r1 (raw file):

Previously, shay-berman wrote…

use const

Will fix it .


remote/mounter/scbe.go, line 72 at r1 (raw file):

Previously, shay-berman wrote…
  1. how do you know its for DS8k or svc? what about A9000? is it impact a9000 as well?
  2. change from debug to Info message since its important

Because XIV doesn't use LunNubmer 0.


remote/mounter/scbe.go, line 74 at r1 (raw file):

Previously, shay-berman wrote…

state that is the special rescan thing that was failed.

Done.


remote/mounter/scbe.go, line 78 at r1 (raw file):

Previously, shay-berman wrote…

fix text
after run rescan and also additional rescan with special lun0 scanning (please specify here exactly what was done)

Done.


remote/mounter/scbe.go, line 178 at r1 (raw file):

Previously, olgashtivelman wrote…

not something critical - but I am guessing that lun number must be integer? so why convert to float64?

If we use lunNuber(int), The unit test will be failed with interface {} is float64,


remote/mounter/scbe_test.go, line 46 at r1 (raw file):

Previously, shay-berman wrote…

please add tests also for A9000, just to make sure its not impact it

Since we don't care storage type in the patch, do we need special test?


remote/mounter/block_device_mounter_utils/block_device_mounter_utils.go, line 156 at r1 (raw file):

Previously, shay-berman wrote…
  1. lower case - keep golang standard please
  2. give better name please - like extraLunZeroScanning

Done.


remote/mounter/block_device_mounter_utils/block_device_mounter_utils.go, line 195 at r1 (raw file):

Previously, shay-berman wrote…

dont; state "in second time" because this is a lib - it doesn't know when it run

Done.


remote/mounter/block_device_utils/rescan.go, line 84 at r1 (raw file):

Previously, olgashtivelman wrote…

not sure i understand why do you need to throw an error in this case? cant you just skip the rescan?

If we don't have error here, it will try to discover with multipath,The discover action is redundant


remote/mounter/block_device_utils/rescan.go, line 84 at r1 (raw file):

Previously, shay-berman wrote…

why to fail if there is nothing to scan?

As the reply for Olga's comment


remote/mounter/block_device_utils/rescan.go, line 90 at r1 (raw file):

Previously, shay-berman wrote…
  1. r u sure about 666? its the devil :-(
  2. anyway please add UT for this one, i didn't see it and its critical.
  3. don't u want to send debug to the log in error

Modify to 200, And it passed the CI testing.


remote/mounter/block_device_utils/rescan.go, line 95 at r1 (raw file):

Previously, shay-berman wrote…

don't u want to send debug to the log in error

Done.

Copy link
Contributor Author

@yixuan0825 yixuan0825 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 12 of 18 files reviewed, 17 unresolved discussions (waiting on @olgashtivelman and @shay-berman)


local/scbe/scbe.go, line 51 at r1 (raw file):

Previously, yixuan0825 wrote…

Because we add a LunNumber .

It's used for unitest


local/scbe/scbe_rest_client.go, line 222 at r1 (raw file):

Previously, yixuan0825 wrote…

It is just for those case without lunNumber.

Done.


local/scbe/scbe_rest_client_test.go, line 194 at r1 (raw file):

Previously, yixuan0825 wrote…

Will fix it .

Done.

Copy link
Contributor

@shay-berman shay-berman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reviewed it again. please follow the comments.
Also run staging CI not only on SVC, run it also on A9000 please. and update when it passed. In addition I would like you to test it on Z\P as well before merge to dev. Waiting for your staging feedback.

Reviewable status: 12 of 18 files reviewed, 6 unresolved discussions (waiting on @olgashtivelman and @yixuan0825)


remote/mounter/scbe.go, line 72 at r1 (raw file):

Previously, yixuan0825 wrote…

Because XIV doesn't use LunNubmer 0.

  1. please state it as comment line
    2.what about the second item i mentioned?

remote/mounter/scbe_test.go, line 46 at r1 (raw file):

Previously, yixuan0825 wrote…

Since we don't care storage type in the patch, do we need special test?

yes add another UT to cover that as well.


remote/mounter/block_device_utils/rescan.go, line 90 at r1 (raw file):

Previously, yixuan0825 wrote…

Modify to 200, And it passed the CI testing.

  1. 200 is much better, thanks
  2. what about UT for this function?

Copy link
Contributor Author

@yixuan0825 yixuan0825 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 11 of 19 files reviewed, 6 unresolved discussions (waiting on @olgashtivelman, @shay-berman, and @yixuan0825)


remote/mounter/scbe.go, line 72 at r1 (raw file):

Previously, shay-berman wrote…
  1. please state it as comment line
    2.what about the second item i mentioned?

Done.


remote/mounter/scbe.go, line 178 at r1 (raw file):

Previously, yixuan0825 wrote…

If we use lunNuber(int), The unit test will be failed with interface {} is float64,

Done.


remote/mounter/scbe_test.go, line 46 at r1 (raw file):

Previously, shay-berman wrote…

yes add another UT to cover that as well.

Done.


remote/mounter/block_device_utils/rescan.go, line 84 at r1 (raw file):

Previously, yixuan0825 wrote…

If we don't have error here, it will try to discover with multipath,The discover action is redundant

Done.


remote/mounter/block_device_utils/rescan.go, line 90 at r1 (raw file):

Previously, shay-berman wrote…
  1. 200 is much better, thanks
  2. what about UT for this function?

Has add the UT for the function

Copy link
Contributor Author

@yixuan0825 yixuan0825 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 11 of 19 files reviewed, 6 unresolved discussions (waiting on @olgashtivelman, @shay-berman, and @yixuan0825)


remote/mounter/block_device_utils/rescan.go, line 90 at r1 (raw file):

Previously, yixuan0825 wrote…

Has add the UT for the function

Done.

olgashtivelman and others added 3 commits November 21, 2018 15:19
in this PR the idea is to allow the user to use ISCSI + FC on the same kubernetes cluster.
before this PR the user had to define a config parameter : skipRescanIscsI to define if they want to use ISCSI or not in ubiquity.
this PR reomves this config and works under the assumption that if a user defined and installed iscsiadm on the node then this node should be connected to the storage system with ISCSI. also the existance of the iscsiadm command or a login to the storage will not fail the rescan and the mount\umount process!
Copy link
Contributor

@shay-berman shay-berman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

I would like @olga to review it as well, since you rebase to get Olga PR iscsi\FC support.
If olga provide LGTM then you can merge to dev. (please open ticket to test this PR on SVC with Power or Z)

Reviewable status: 0 of 35 files reviewed, 3 unresolved discussions (waiting on @olgashtivelman and @yixuan0825)

Copy link
Collaborator

@olgashtivelman olgashtivelman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 16 of 35 files at r5, 19 of 19 files at r6.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @yixuan0825)


remote/mounter/block_device_utils/rescan.go, line 84 at r1 (raw file):

Previously, yixuan0825 wrote…

Done.

if I remeber then this comment was near "if len(hostInfos) == 0 " right? so you are not going into the for loop in this function - because the hostInfos is empty..
so where are you skipping the Discover function?
and you are probably returning a nil error message right? how is this helpful? you can always return something other then error in this function if you want to influence something happening outside, i just dont understand how returning an empty error and logging that an error occurred helps.

Copy link
Collaborator

@olgashtivelman olgashtivelman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added one small comment - but other then that its LGTM - also seems fine with the iscsi\fc ticket as far as i could see.

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @yixuan0825)

Copy link
Contributor Author

@yixuan0825 yixuan0825 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 34 of 35 files reviewed, 1 unresolved discussion (waiting on @olgashtivelman)


remote/mounter/block_device_utils/rescan.go, line 84 at r1 (raw file):

Previously, olgashtivelman wrote…

if I remeber then this comment was near "if len(hostInfos) == 0 " right? so you are not going into the for loop in this function - because the hostInfos is empty..
so where are you skipping the Discover function?
and you are probably returning a nil error message right? how is this helpful? you can always return something other then error in this function if you want to influence something happening outside, i just dont understand how returning an empty error and logging that an error occurred helps.

Thank you, Olga. I forget to define the error here.
The function is invoked by RescanAll, and RescanAll was invokede by Mounter at line 72 in remote/mounter/scbe.go. The function blockDeviceMounterUtils.Discover will be invoked at line 75 if there is no erorr in RescanAll. If we return error here, The function RescanAll will return error, then the blockDeviceMounterUtils.Discover will not be invoked.
I have added an error information for the user to check the fc host in the new patch.

Copy link
Contributor Author

@yixuan0825 yixuan0825 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@olga, I have fixed your comment, could you review again?Then I can merge to dev.

Reviewable status: 34 of 35 files reviewed, 1 unresolved discussion (waiting on @olgashtivelman)

Copy link
Collaborator

@olgashtivelman olgashtivelman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 34 of 35 files reviewed, all discussions resolved (waiting on @olgashtivelman)

Copy link
Collaborator

@olgashtivelman olgashtivelman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: 34 of 35 files reviewed, all discussions resolved (waiting on @olgashtivelman)

Copy link
Contributor Author

@yixuan0825 yixuan0825 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Shay, since Olga has reviewed, may I merge it now?

Reviewable status: 34 of 35 files reviewed, all discussions resolved (waiting on @olgashtivelman)

@shay-berman
Copy link
Contributor

feel free to merge but don't forget to merge it as "Squash and merge" (so we will not see all the 33 commits in the dev branch only 1)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants