How to use with segregated Data networks? #50

Closed
matyat opened this issue Oct 21, 2019 · 17 comments

@matyat

matyat commented Oct 21, 2019

At the moment I can't input any information about the network topology, and the driver selects an iSCSI discovery address that is not accessible to the nodes. The API in https://github.com/hpe-storage/container-storage-provider seems to have no provision for this workflow. (Also, I can't find the source for the csp service; is it under a different license?)

@rgcostea
Collaborator

The hpenodeinfo object captures all of the networks on the host. These networks are then passed to the CSP implementation. That CSP will then report the discovery IP to use when the volume is published. The CSP will only report a discovery IP that matches the given networks. So it "should" be discoverable. Are you saying you have a network topology where the networks match but the discovery IP is not discoverable?

The Nimble CSP is closed source for now. We are working on open sourcing it over time.

@matyat
Author

matyat commented Oct 21, 2019

Ah ok, maybe that mechanism isn't working as expected then - here's some output from my system:

$ kubectl -n kube-system get hpenodeinfos -o yaml storex-k8s-2
apiVersion: storage.hpe.com/v1
kind: HPENodeInfo
metadata:
  creationTimestamp: 2019-10-18T14:06:29Z
  generation: 2
  name: storex-k8s-2
  resourceVersion: "29621871"
  selfLink: /apis/storage.hpe.com/v1/hpenodeinfos/storex-k8s-2
  uid: 7b1838bf-f1b0-11e9-8ca5-941882095e78
spec:
  iqns:
  - iqn.2019-06.com.hpe-csi-driver.nimble:storex-k8s-2
  networks:
  - 127.0.0.0
  - 16.26.128.0
  - 16.26.128.0
  - 172.17.0.0
  - 10.233.42.123
  - 10.233.65.0
  - 10.233.65.0
  uuid: 0242a1dc-4458-7374-6f72-65782d6b3873

From the hpe-csi-driver running on storex-k8s-2, it seems to be trying to connect to 10.20.0.120, which is on another VLAN that isn't accessible to the node.

time="2019-10-21T09:32:07Z" level=error msg="command iscsiadm with pid: 1731 killed as timeout of 60 seconds reached" file="cmd.go:57"
time="2019-10-21T09:32:07Z" level=error msg="error command iscsiadm with pid: 1731 killed as timeout of 60 seconds reached" file="iscsi.go:469"
time="2019-10-21T09:32:07Z" level=error msg="Unable to Perform Discovery with discoveryIp: 10.20.0.120. Error: error command iscsiadm with pid: 1731 killed as timeout of 60 seconds reached" file="iscsi.go:111"
time="2019-10-21T09:32:07Z" level=error msg="unable to create device for volume  with IQN iqn.2007-11.com.nimblestorage:pvc-43c9a374-f1be-11e9-b0b8-30e1715309c0-v1890b9d28022b717.00000204.98b20ec5" file="device.go:877" 

@rgcostea
Collaborator

That looks like a bug. Could you look at the nimble-csp logs to see what the publish action is doing? The log message should start with

Executing VolumeObjectSet::actions/publish

Thanks.

@matyat
Author

matyat commented Oct 21, 2019

Also, the 10.233.x.x networks are internal to the Kubernetes cluster - they probably shouldn't be in the HPENodeInfo?
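
For illustration, a minimal Python sketch of filtering the reported list before it is used for matching. The excluded ranges (loopback, plus 10.233.0.0/16 standing in for the cluster-internal 10.233.x.x addresses) and the function name are assumptions for this cluster, not something the driver is known to do.

```python
import ipaddress

# Assumed exclusions: loopback and the cluster-internal 10.233.x.x range.
EXCLUDED = [
    ipaddress.ip_network("127.0.0.0/8"),
    ipaddress.ip_network("10.233.0.0/16"),
]

def filter_networks(networks, excluded=EXCLUDED):
    """Drop duplicates and any address that falls inside an excluded range."""
    kept = []
    for net in networks:
        addr = ipaddress.ip_address(net)
        if any(addr in cidr for cidr in excluded):
            continue
        if net not in kept:
            kept.append(net)
    return kept

# The networks list from the HPENodeInfo output above.
reported = [
    "127.0.0.0", "16.26.128.0", "16.26.128.0", "172.17.0.0",
    "10.233.42.123", "10.233.65.0", "10.233.65.0",
]
print(filter_networks(reported))  # ['16.26.128.0', '172.17.0.0']
```

With that filter, only 16.26.128.0 and 172.17.0.0 would remain as candidates for matching.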

@matyat
Author

matyat commented Oct 21, 2019

From nimble-csp.log:

...
2019-10-21 16:00:50,441000Z INFO: executor-thread-1 GroupMgmtClient - Response: 8ms, GET: https://16.24.163.230:5392/v1/application_servers/detail?metadata.csp_ns_NIM_host_uuid=0242a1dc-4458-7374-6f72-65782d6b3873&fields=name%2Cmetadata%2Chostname%2Cid%2Cserver_type 200 (OK)         
2019-10-21 16:00:50,454000Z DEBUG: executor-thread-1 GroupMgmtClient - 3 * Sending client request on thread executor-thread-1                                                                                                                                                               
3 > POST https://16.24.163.230:5392/v1/application_servers                                                                                                                                                                                                                                  
3 > Content-Type: application/json                                                                                                                                                                                                                                                          
3 > X-Auth-Token: 29cffb153c2a8c32ce8061bc354bcc17                                                                                                                                                                                                                                          
{"data":{"name":"Container-Node-storex-k8s-2","hostname":"storex-k8s-2","metadata":[{"key":"csp_ns_NIM_host_uuid","value":"0242a1dc-4458-7374-6f72-65782d6b3873"},{"key":"csp_ns_NIM_networks","value":"127.0.0.0,16.26.128.0,16.26.128.0,172.17.0.0,10.233.42.123,10.233.65.0,10.233.65.0"}
,{"key":"csp_ns_NIM_iqns","value":"iqn.2019-06.com.hpe-csi-driver.nimble:storex-k8s-2"}]}}                                                                                                                                                                                                  
                                                                                                                                                                                                                                                                                            
2019-10-21 16:00:50,487000Z DEBUG: executor-thread-1 GroupMgmtClient - 3 * Client response received on thread executor-thread-1                                                                                                                                                             
3 < 201                                                                                                                                                                                                                                                                                     
3 < Connection: Keep-Alive                                                                                                                                                                                                                                                                  
3 < Content-Type: application/json;charset=utf-8                                                                                                                                                                                                                                            
3 < Date: Mon, 21 Oct 2019 16:00:50 GMT                                                                                                                                                                                                                                                     
3 < Transfer-Encoding: chunked                                                                                                                                                                                                                                                              
{"data":{"creation_time":1571673650,"description":"","hostname":"storex-k8s-2","id":"291890b9d28022b717000000000000000000000008","last_modified":1571673650,"metadata":[{"key":"csp_ns_NIM_host_uuid","value":"0242a1dc-4458-7374-6f72-65782d6b3873"},{"key":"csp_ns_NIM_networks","value":"
127.0.0.0,16.26.128.0,16.26.128.0,172.17.0.0,10.233.42.123,10.233.65.0,10.233.65.0"},{"key":"csp_ns_NIM_iqns","value":"iqn.2019-06.com.hpe-csi-driver.nimble:storex-k8s-2"}],"name":"Container-Node-storex-k8s-2","password":"","port":65536,"server_type":"vmware","username":""}}         
                                                                                                                                                                                                                                                                                            
2019-10-21 16:00:50,487000Z INFO: executor-thread-1 GroupMgmtClient - Response: 41ms, POST: https://16.24.163.230:5392/v1/application_servers 201 (Created)                                                                                                                                 
2019-10-21 16:00:50,491000Z INFO: executor-thread-1 VolumeObjectSet - Executing VolumeObjectSet::actions/publish with id 061890b9d28022b717000000000000000000000206 and options PublishOptions [hostUuid=0242a1dc-4458-7374-6f72-65782d6b3873, accessProtocol=iscsi]                        
2019-10-21 16:00:50,491000Z DEBUG: executor-thread-1 ServerProperties - Returning group service for array ip 16.24.163.230 based on group session                                                                                                                                           
2019-10-21 16:00:50,491000Z DEBUG: executor-thread-1 GroupMgmtClient - Enabling sslTrustAll                                                                                                                                                                                                 
2019-10-21 16:00:50,492000Z DEBUG: executor-thread-1 GroupMgmtClient - Enabling debug logging                                                                                                                                                                                               
2019-10-21 16:00:50,492000Z DEBUG: executor-thread-1 GroupMgmtClient - Enabling waiting for async jobs, waitTimeSeconds: 0                                                                                                                                                                  
2019-10-21 16:00:50,492000Z DEBUG: executor-thread-1 GroupMgmtClient - Creating client with specified token. Group Address: 16.24.163.230, token: 29cffb153c2a8c32ce8061bc354bcc17                                                                                                          
2019-10-21 16:00:50,496000Z DEBUG: executor-thread-1 GroupMgmtClient - 1 * Sending client request on thread executor-thread-1                                                                                                                                                               
1 > GET https://16.24.163.230:5392/v1/groups/detail?fields=version_current%2Cid%2Caccess_protocol_list%2Cscsi_vendor_id                       
1 > Accept: application/json                                                                                                                                                                                                                                                                
1 > X-Auth-Token: 29cffb153c2a8c32ce8061bc354bcc17                                                                                                                                                                                                                                          
                                                                                                                                                                                                                                                                                            
2019-10-21 16:00:50,505000Z DEBUG: executor-thread-1 GroupMgmtClient - 1 * Client response received on thread executor-thread-1                                                                                                                                                             
1 < 200                                                                                                                                                                                                                                                                                     
1 < Connection: Keep-Alive                                                                                                                                                                                                                                                                  
1 < Content-Type: application/json;charset=utf-8                                                                                                                                                                                                                                            
1 < Date: Mon, 21 Oct 2019 16:00:50 GMT                                                                                                                                                                                                                                                     
1 < Transfer-Encoding: chunked                                                                                                                                                                                                                                                              
{"startRow":0,"endRow":1,"totalRows":1,"data":[{"id":"001890b9d28022b717000000000000000000000001","access_protocol_list":["iscsi"],"version_current":"5.0.5.0-582658-opt","scsi_vendor_id":"Nimble  "}]}                                                                                    
                                                                                                                                              
2019-10-21 16:00:50,506000Z INFO: executor-thread-1 GroupMgmtClient - Response: 10ms, GET: https://16.24.163.230:5392/v1/groups/detail?fields=version_current%2Cid%2Caccess_protocol_list%2Cscsi_vendor_id 200 (OK)                                                                         
...

@rgcostea
Collaborator

Could you just attach the full log please?

@matyat
Author

matyat commented Oct 21, 2019

@rgcostea
Collaborator

Thanks. So it looks like you truly don't have any matching networks. Your array has the following data networks:

  • 16.24.160.0
  • 10.20.0.0
  • 172.17.13.0

And your host provides the following networks:

  • 127.0.0.0
  • 16.26.128.0
  • 16.26.128.0
  • 172.17.0.0
  • 10.233.42.123
  • 10.233.65.0
  • 10.233.65.0

Because there is no match, the CSP will just pick a random discovery IP from the array. It picked 10.20.0.120 which is not reachable.
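
To make the mismatch concrete, here is a small Python sketch of the comparison using the two lists above. It assumes the match is an exact comparison of network addresses; the CSP is closed source, so the real logic may differ, and the function name is hypothetical.

```python
# Data networks reported by the array and by the host, as listed in this thread.
array_networks = ["16.24.160.0", "10.20.0.0", "172.17.13.0"]
host_networks = [
    "127.0.0.0", "16.26.128.0", "16.26.128.0", "172.17.0.0",
    "10.233.42.123", "10.233.65.0", "10.233.65.0",
]

def matching_networks(host, array):
    """Return the array networks that also appear in the host's list (exact match)."""
    host_set = set(host)
    return [net for net in array if net in host_set]

print(matching_networks(host_networks, array_networks))  # []
```

An exact comparison like this cannot tell that a host on 16.26.128.0 may still reach 16.24.160.0 through a router, which is the situation a segregated or routed data network creates.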

@matyat
Author

matyat commented Oct 21, 2019

I think the driver is making some assumptions about my network topology, as 16.24.160.0 and 172.17.13.0 are accessible to my Kubernetes nodes. It would be nice if I could override this manually.

@rgcostea
Collaborator

Agreed. We can make that configurable. For now, the driver is following our best practices. Without a matching network, it can't make a good decision as to which discovery IP to report.

@rgcostea rgcostea self-assigned this Oct 22, 2019
@rgcostea
Collaborator

rgcostea commented Nov 5, 2019

I've updated the nimble-csp to report all discovery IPs. The CSI-side code needs to be updated to choose the one that will result in successful discovery. Reassigning to @shivamerla to take that up.

@rgcostea rgcostea assigned shivamerla and unassigned rgcostea Nov 5, 2019
@matyat
Author

matyat commented Nov 5, 2019

As a user I would like the option to be able to select the discovery IP manually. Even if two or more IP addresses are reachable by the host, there still may be a reason that I would prefer one endpoint over another.

@rgcostea
Collaborator

rgcostea commented Nov 5, 2019

Makes sense. We can probably handle that via a config map setting passed to the csi driver.

@knackaron

As a user I would like the option to be able to select the discovery IP manually. Even if two or more IP addresses are reachable by the host, there still may be a reason that I would prefer one endpoint over another.

We'd also like to see this feature added. It would also help work around this issue we're seeing: hpe-storage/common-host-libs#77

@shivamerla
Collaborator

shivamerla commented Dec 3, 2019

@Logibox @knackaron currently we have made a fix to choose a discovery IP that is reachable by the host. For Nimble targets, irrespective of the discovery IP used, the target returns all mapped targets from all portals during send-targets discovery. This list is used later to perform iSCSI login. Does this solve your original issue? We are wondering what the added advantage is of allowing the user to specify a particular discovery IP as long as we pick one that is reachable. Are there other use cases we are missing?

Also, storage arrays like 3PAR have a different discovery IP for each portal, and each one should be used. Allowing the user to configure this for each array in either the storage class or the secret adds a significant burden on the cluster admin.
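
As a rough illustration of "pick a discovery IP that is reachable", the Python sketch below tries a TCP connection to each candidate on port 3260, the registered iSCSI target port, and returns the first one that answers. This is only an assumption about what a reachability check could look like, not the code in common-host-libs; the function names and the timeout are hypothetical.

```python
import socket

ISCSI_PORT = 3260  # IANA-registered iSCSI target port

def tcp_probe(ip, port=ISCSI_PORT, timeout=2.0):
    """Return True if a TCP connection to ip:port succeeds within the timeout."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable networks, and timeouts.
        return False

def first_reachable(candidates, probe=tcp_probe):
    """Return the first candidate IP accepted by the probe, or None if none are."""
    for ip in candidates:
        if probe(ip):
            return ip
    return None
```

Calling first_reachable with the discovery IPs reported by the CSP would skip an address such as 10.20.0.120 when the node cannot reach it, after a short timeout per candidate rather than the 60-second iscsiadm timeout seen in the log. The probe is a parameter so that a manual preference or a different check could be swapped in.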

@shivamerla shivamerla added this to the csi 1.0 milestone Dec 3, 2019
@knackaron

@Logibox @knackaron currently we have made a fix to choose a discovery IP that is reachable by the host. For Nimble targets, irrespective of the discovery IP used, the target returns all mapped targets from all portals during send-targets discovery. This list is used later to perform iSCSI login. Does this solve your original issue? We are wondering what the added advantage is of allowing the user to specify a particular discovery IP as long as we pick one that is reachable. Are there other use cases we are missing?

@shivamerla The changes in hpe-storage/common-host-libs#85 should address all of my use cases.

@shivamerla
Collaborator

Thanks @knackaron, closing this issue.
