Pure api based truenas driver (ssh dependency removed) #101

Closed
travisghansen opened this issue Jun 23, 2021 · 42 comments

@travisghansen
Member

travisghansen commented Jun 23, 2021

I’m working on a purely API-based driver for TrueNAS (currently requires the latest/greatest SCALE). Any takers to do some early testing?

SCALE 21.08 epic: https://jira.ixsystems.com/browse/NAS-110637
SCALE 21.10 epic: https://jira.ixsystems.com/browse/NAS-111870

@infra-monkey

Hey,
I could spin up an instance of TrueNAS easily for that, but I can't set up a new k8s cluster for testing.
Is it possible to deploy another democratic-csi in another namespace without interfering with the existing one? I think so, but I'm not sure...

@travisghansen
Member Author

Yeah, you don’t even need to use another namespace. Just use a new Helm release name and unique storage class names, etc.
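
For example (a sketch; the release name, namespace, and values file name here are placeholders, not taken from this thread), a second independent release of the same chart can be installed alongside the existing one:

# second, independent release of the chart purely for testing;
# only the release name (and the storage class names inside the values file) must be unique
helm upgrade --install democratic-test-nfs democratic-csi/democratic-csi \
  --values democratic-test-helm-values.yaml \
  --namespace democratic-test --create-namespace   # a separate namespace is optional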

@travisghansen
Member Author

In order to test this out, use the existing Helm chart with values that mimic the existing freenas drivers almost exactly, with a few changes:

controller:
  driver:
    image: democraticcsi/democratic-csi:next
    imagePullPolicy: Always
    logLevel: debug

node:
  driver:
    image: democraticcsi/democratic-csi:next
    imagePullPolicy: Always
    logLevel: debug

Remove the sshConnection block entirely.

Ensure datasetPermissionsUser and datasetPermissionsGroup are numeric instead of alpha, i.e.:

  #datasetPermissionsUser: root
  #datasetPermissionsGroup: wheel
  datasetPermissionsUser: 0
  datasetPermissionsGroup: 0

@infra-monkey

First test result:
controller can't deploy
node deploys fine

Test environment:

kubernetes: v1.20.7
cri-o: 1.20.2
TrueNAS Scale: TrueNAS-SCALE-21.06-BETA.1 (in a vm)
namespace: democratic-test (newly created)

Deployment Status:

$ kubectl -n democratic-test get pods
NAME READY STATUS RESTARTS AGE
democratic-test-nfs-democratic-csi-controller-764b6888d4-67gdp 1/4 CrashLoopBackOff 15 8m21s
democratic-test-nfs-democratic-csi-node-lrfbl 3/3 Running 0 7m48s
democratic-test-nfs-democratic-csi-node-rhfdp 3/3 Running 0 5m12s

In order to avoid an extremely long post, here are the files with more information (values, description, logs):
democratic-test-helm-values.txt
democratic-test-pod-description.txt
democratic-test-container-logs.txt
But basically, all the controller containers fail with an error:
CSI driver probe failed: rpc error: code = Internal desc = Error: Invalid username

I guess you have tested the deployment, so I must have something wrong in my values file.

@travisghansen
Member Author

I forgot to mention the most important detail :( the driver.config.driver value should be freenas-api-{nfs,iscsi,smb}
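
In the Helm values file that translates to setting the driver name under driver.config, for example (a sketch for the NFS variant; the csiDriver name is just an example):

csiDriver:
  name: "org.democratic-csi.nfs-api"

driver:
  config:
    driver: freenas-api-nfs   # or freenas-api-iscsi / freenas-api-smb
    # httpConnection / zfs / nfs sections stay as in the existing freenas examples,
    # minus the sshConnection block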

@travisghansen
Member Author

And it may not work on the beta, it might need nightlies…a lot of collaborative work has been ongoing and I don’t think it all landed in the beta.

@infra-monkey

infra-monkey commented Jul 10, 2021

I forgot to mention the most important detail :( the driver.config.driver value should be freenas-api-{nfs,iscsi,smb}

Indeed, it works way better with this modification!
This is my values file that works:
democratic-test-helm-values.txt
And the claim definition:
test-claim-nfs.txt

And the results of testing :

Test NFS (quota enabled, reservation disabled):

create pvc: ok
extend pvc: ok
delete pvc: ok

Test NFS (quota enabled, reservation enabled):

create pvc: ok
extend pvc: nok (reservation and quota stay at the original value)

Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"democratic-test", Name:"test-claim-nfs-api", UID:"d3058c78-ff9a-458b-9909-7d959c99d75e", APIVersion:"v1", ResourceVersion:"39355165", FieldPath:""}): type: 'Warning' reason: 'VolumeResizeFailed' resize volume "pvc-d3058c78-ff9a-458b-9909-7d959c99d75e" by resizer "org.democratic-csi.nfs-api" failed: rpc error: code = Internal desc = Error: {"pool_dataset_update.refreservation":[{"message":"size is greater than available space","errno":22}]}

delete pvc: ok
Are reservation and quota mutually exclusive? I probably don't understand well enough how reservation works in TrueNAS.

Test NFS (quota disabled, reservation enabled):

create pvc: ok
extend pvc: ok
delete pvc: ok
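
For reference, the "extend pvc" step in these tests just raises the requested size on the PVC and lets the driver grow the dataset/quota, roughly like this (the target size here is arbitrary):

# patch the PVC's requested size; the CSI driver then resizes the backing dataset on TrueNAS
kubectl -n democratic-test patch pvc test-claim-nfs-api \
  --type merge -p '{"spec":{"resources":{"requests":{"storage":"2Gi"}}}}'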

Note: The NFS service on TrueNAS SCALE can be started and enabled only when a share is defined.
When creating the first PVC, the NFS service is neither started nor enabled. The PVC is, however, marked as Bound.
Workaround: Create a dummy dataset, share it over NFS, and make sure that the NFS service is started and enabled.

Is it the responsibility of the CSI driver to start and enable the service when it creates the first share? Same for stopping and disabling the service when deleting the last share?

I'll check out the iscsi and smb drivers later.

@travisghansen
Member Author

They are not mutually exclusive. That’s a good find; I’ll need to ask the iX team how it’s handled on the backend. I think that combination works fine in the ssh scenario, but I’d have to double check that.

Thanks for the feedback!

@iamwillbar

@travisghansen FYI, this does not appear to work in the latest TrueNAS CORE (12.0 U4.1) because the /pool/dataset API does not accept a create_ancestors parameter.

[freenas-iscsi-democratic-csi-controller-6ff745cd58-wn259 csi-driver] {"message":"FREENAS HTTP REQUEST: {\"method\":\"POST\",\"url\":\"https://[redacted[/api/v2.0/pool/dataset\",\"headers\":{\"Accept\":\"application/json\",\"User-Agent\":\"democratic-csi-driver\",\"Content-Type\":\"application/json\"},\"json\":true,\"body\":{\"refreservation\":0,\"volblocksize\":\"16K\",\"type\":\"VOLUME\",\"volsize\":1073741824,\"create_ancestors\":true,\"user_properties\":[{\"key\":\"democratic-csi:csi_volume_name\",\"value\":\"pvc-bfeb5a87-4a73-4937-85d0-34889d68d74e\"},{\"key\":\"democratic-csi:managed_resource\",\"value\":\"true\"},{\"key\":\"democratic-csi:volume_context_provisioner_driver\",\"value\":\"freenas-api-iscsi\"},{\"key\":\"democratic-csi:volume_context_provisioner_instance_id\",\"value\":\"iqn.2005-10.org.freenas.ctl\"}],\"name\":\"pool1/k8s/vols/pvc-bfeb5a87-4a73-4937-85d0-34889d68d74e\"},\"agentOptions\":{\"rejectUnauthorized\":false}}","level":"debug","service":"democratic-csi"}
[freenas-iscsi-democratic-csi-controller-6ff745cd58-wn259 csi-driver] {"message":"FREENAS HTTP ERROR: null","level":"debug","service":"democratic-csi"}
[freenas-iscsi-democratic-csi-controller-6ff745cd58-wn259 external-provisioner] I0725 20:29:30.239258       1 connection.go:185] GRPC response: {}
[freenas-iscsi-democratic-csi-controller-6ff745cd58-wn259 csi-driver] {"message":"FREENAS HTTP STATUS: 422","level":"debug","service":"democratic-csi"}
[freenas-iscsi-democratic-csi-controller-6ff745cd58-wn259 external-provisioner] I0725 20:29:30.239420       1 connection.go:186] GRPC error: rpc error: code = Internal desc = Error: {"create_ancestors":[{"message":"Field was not expected","errno":22}]}

@travisghansen
Member Author

Thanks for testing! That’s expected atm unfortunately. This is currently targeted at TrueNAS SCALE (we’re looking to make it more official for the 21.08 beta release), but I’m fairly confident all the middleware updates will be pulled into CORE at some point.

@iamwillbar

Thanks @travisghansen, makes sense, looking forward to breaking the SSH dependency 😄.

@travisghansen
Member Author

I have reproduced the expand volume error and submitted an issue with iX. Things are looking really good to have this all ready in time to release with the SCALE beta release.

@travisghansen
Member Author

OK, the expand volume bug has been fixed in the latest nightlies and several other things have been buttoned up. I'm getting ready to snap a massive new release, so any testing that you can do beforehand would be great. Make sure you have the latest nightly of SCALE and that your cluster has the most recent next images of democratic-csi.

@travisghansen
Member Author

Also note there are fixes in place for the NFS server issue you mentioned (not starting when there are no shares) and various other things. 21.08 + democratic-csi should be solid when it hits.

@travisghansen
Member Author

Ok, I’ve released v1.3.0 and SCALE 21.08 has landed. Any testing is appreciated!

@travisghansen
Member Author

Known issues that were NOT addressed in 21.08: https://jira.ixsystems.com/browse/NAS-111870

In particular, if the system boots without any targets/LUNs, clients will not be able to connect. A workaround is to systemctl restart scst. Note this is only required after the very first target/LUN is created (assuming the machine continues to have at least 1 target/LUN when it boots up).
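
As a copy/paste workaround (run on the TrueNAS SCALE host itself, after that first target/LUN exists):

# restart SCST so clients can connect; only needed once, after the very first target/LUN is created
systemctl restart scst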

@mister2d
Contributor

mister2d commented Sep 21, 2021

I just ran into the "create_ancestors" issue on TrueNAS Core 12.0-U5.1. Hope it's fixed soon. I'm not looking to move to SCALE any time soon, but I can set up a virtual test instance if needed to move this forward.

@travisghansen
Member Author

@mister2d thanks for trying it out! There are no definitive plans yet for pulling all the new api calls back into CORE, but I'm guessing they won't land until 13 :( In the meantime I have zero intention of ever removing the ssh-based drivers.

I would love to have additional eyes on it! I'm currently working towards passing the full suite of csi compliance tests (don't be alarmed, the things that are currently failing are generally cosmetic-level issues). I'm very close and am actually working on the final failing test right this minute. After I have that fully passing, I'm working with iX to provision some resources to run the full test suite across several different configurations, and I hope to add 13 to that as soon as possible.

ATM the test suite running against 12 CORE pretty much fails everything (with iscsi anyway) because I can't make the tests run with dataset names that stay under the 63-char limit for zvol names. In BSD 13 that limit has been bumped to 255, so running the test suite there will be more feasible anyway.

@mister2d
Contributor

There's always sadness when projects move on from BSD. Anyway, I'll be moving my ZFS pools over to SCALE by this weekend.

Is there anything specifically that needs testing on CORE?

@travisghansen
Member Author

Not as it relates to this ticket, no. The api-only drivers are currently limited in scope to SCALE only (21.08+). When CORE 13 hits I'm relatively confident it will support all the endpoints, etc., so until then, SCALE it is :)

I will be snapping a new release shortly with all the 'fixes' to make the drivers fully compliant with the csi test suite.

@PrivatePuffin

@travisghansen Any plans on SCALE integration with iX?
If not, I'm willing to spend some priority time on getting it set up as a TrueCharts App :)

@travisghansen
Member Author

@Ornias1993 there's nothing really to setup on the NAS itself, it uses the built-in api entirely. What's the use-case you have in mind?

@PrivatePuffin

@Ornias1993 there's nothing really to setup on the NAS itself, it uses the built-in api entirely. What's the use-case you have in mind?

I was referring to integration with the SCALE Kubernetes system. Unless iX told you they are going to do so (which I don't expect), I'm open to adding it as an app :)

@travisghansen
Member Author

Is the use case for a cluster of SCALE nodes trying to use storage from only 1 of the nodes? Or a single-node cluster trying to access storage from the server hosting the cluster? Or something else?

@PrivatePuffin

PrivatePuffin commented Dec 9, 2021

Is the use case for a cluster of SCALE nodes trying to use storage from only 1 of the nodes? Or a single-node cluster trying to access storage from the server hosting the cluster? Or something else?

Currently SCALE does not support Kubernetes clustering at all. So we did get requests asking if we could help find a solution for people who have 1 compute node and 1 storage node, and some requests for having SCALE connect to files stored on CORE.

So the use case would basically be any case where a normal k8s user would use democratic-csi (as far as this is concerned, it's basically stock k8s).

@travisghansen
Member Author

Ok yeah, then just wrapping the chart sounds ideal. I’m not aware of any work to include it atm in the upstream repo but I can ask if you’d like to prevent double work!

@PrivatePuffin

@travisghansen Awesome, in that case I'll save you the trouble and throw Kris a bone right away myself :)

@PrivatePuffin

And indeed a "wrap" like we did with MetalLB should be enough here as well, and would give the best stability guarantees 👍

@xeijin

xeijin commented Feb 6, 2022

@Ornias1993 @travisghansen just checking -- did this ever get a TrueCharts wrapper? Couldn't find it in the stable core train

democratic-csi deleted a comment from PrivatePuffin Feb 7, 2022
@travisghansen
Member Author

Appears it hasn’t been done yet, no :(

@PrivatePuffin

PrivatePuffin commented Feb 7, 2022

Okay, if we go all "let's delete comments because I don't like them", one can be assured the priority of this getting added does not get any bigger.

The simple fact is that if something isn't showing in the catalog (or on the website), it isn't in it. It's not because I want to be rude, but because that's simply how catalogs work.

@travisghansen
Member Author

You are welcome to not include it in your catalog if you do not wish it to be there.

I’m not interested in your negativity (and now childish threats about not adding it to the catalog) in this space however (if I didn’t have experience with the toxic behavior previously I would have never deleted the comment, but I have seen it in other places and know your history and I won’t allow it here).

So add it or don’t, but don’t come here with ridiculous comments to users sincerely seeking answers. There are far better ways to relay the message you wished to relay, I would ask you to consider those ‘better ways’ in further communication in this space.

@PrivatePuffin

You are welcome to not include it in your catalog if you do not wish it to be there.

That was never what I wrote or intended; I actually did plan to work on it, as you're well aware. But sadly enough I don't have endless amounts of time. I think you understand that very well, considering you're maintaining this repo and sometimes hit the same issue.

I’m not interested in your negativity (and now childish threats about not adding it to the catalog) in this space

I'm not making a threat, I'm stating the obvious fact that I need to prioritise my work. That means communities I don't personally feel I have "synergy" with obviously get less development time.

however (if I didn’t have experience with the toxic behavior previously I would have never deleted the comment, but I have seen it in other places and know your history and I won’t allow it here).

I don't think the answer is resorting to deletions and/or personal attacks. A friendly message along the lines of "I cannot allow that, please make it sound a bit nicer" would've sufficed. If you don't want harsh, direct comments, that is 100% your call as maintainer.

So add it or don’t, but don’t come here with ridiculous comments to users sincerely seeking answers.

They are not seeking answers in this case, that was the whole point I tried to make. The whole point of a catalog is that if it's not in the catalog, it's not in the catalog.

If you open the Fisher-Price catalog looking for a blue toy car and it's not there, it's not there. You can ask random other communities on the internet if the blue car is there, but that doesn't change the fact that "it's not there if it's not there".

There are far better ways to relay the message you wished to relay, I would ask you to consider those ‘better ways’ in further communication in this space.

You are within your rights to request that, and it's my responsibility to stick to the limits within your community.

I'll leave it at this, as I'm not likely to spend time on this due to the way priorities within TrueCharts have shifted. We basically already have 2022 completely laid out by now, and supporting external PVC-backed storage is not really something we actively aim for in BlueFin.

(mostly because BlueFin itself will already have quite a few PVC- and storage-related changes when it comes to Kubernetes)

@mister2d
Contributor

mister2d commented May 10, 2022

@mister2d thanks for trying it out! There are no definitive plans yet for pulling all the new api calls back into CORE, but I'm guessing they won't land until 13 :( In the meantime I have zero intention of ever removing the ssh-based drivers.

Now that TrueNAS Core 13 is out, does that mean this CSI driver should work without ssh?

@travisghansen
Member Author

Unfortunately not :( I would suggest opening a ticket on the iX Jira issue tracker to request support in future releases.

@ickebinberliner

When trying to connect to TrueNAS Core (13.0-U2) I still get an error:

GRPC error: rpc error: code = FailedPrecondition desc = driver is only availalbe with TrueNAS SCALE
csi-provisioner.go:197] CSI driver probe failed: rpc error: code = FailedPrecondition desc = driver is only availalbe with TrueNAS SCALE

with API key and SSH :-(
Is there any update on when it will be released for the CORE version?

@travisghansen
Member Author

The iX team has no immediate plans to add support for CORE :(

@ickebinberliner

ickebinberliner commented Nov 7, 2022

Oh, thank you for your feedback. Then the iX team should update their website:
https://www.truenas.com/blog/truenas-enables-container-storage-and-kubernetes/

@travisghansen
Member Author

I think there may be some confusion. You can use this project with CORE just fine, but you need to use the ssh-based driver vs the api-only driver. Maybe open up another issue for help getting it going with ssh and we’ll get it working for you :)

@ickebinberliner

Thank you, I'll try the SSH connection.

@dakaix

dakaix commented Jan 9, 2023

Are there any examples for the api-only driver for iSCSI and Helm? I'm trying to test this with SCALE 22.12 but Helm is rejecting my attempt to install the driver, presumably because of a syntax error on my part!

This is what I have come up with so far, as a hybrid between the community examples for the ssh driver and the NFS example provided in this issue.

csiDriver:
  name: "org.democratic-csi.iscsi"

storageClasses:
- name: freenas-iscsi-csi-api
  defaultClass: false
  reclaimPolicy: Delete
  volumeBindingMode: Immediate
  allowVolumeExpansion: true
  parameters:
    fsType: xfs

  mountOptions: []
  secrets:
    provisioner-secret:
    controller-publish-secret:
    node-stage-secret:
    node-publish-secret:
    controller-expand-secret:

driver: freenas-api-iscsi
instance_id:
httpConnection:
  protocol: https
  host: nas.domain.com
  port: 8443
  apiKey: ***api-key-omitted***
  allowInsecure: true
  apiVersion: 2
zfs:
  datasetParentName: nvme-pool/k8s/i/v
  detachedSnapshotsDatasetParentName: nvme-pool/k8s/i/s
  zvolCompression:
  zvolDedup:
  zvolEnableReservation: false
  zvolBlocksize:
iscsi:
  targetPortal: "aaa.bbb.ccc.ddd:3260"
  interface:
  namePrefix: csi-
  nameSuffix: "-clustera"

  targetGroups:
    - targetGroupPortalGroup: 4
      targetGroupInitiatorGroup: 4
      targetGroupAuthType: None
      targetGroupAuthGroup:

  extentInsecureTpc: true
  extentXenCompat: false
  extentDisablePhysicalBlocksize: true
  extentBlocksize: 4096
  extentRpm: "SSD"
  extentAvailThreshold: 0

When I attempt to execute that, this is what I get:

# helm upgrade --install --create-namespace --values freenas-api-iscsi.yaml --namespace democratic-csi --set node.kubeletHostPath="/var/snap/microk8s/common/var/lib/kubelet"  zfs-iscsi democratic-csi/democratic-csi
Release "zfs-iscsi" does not exist. Installing it now.
Error: template: democratic-csi/templates/required.yaml:3:49: executing "democratic-csi/templates/required.yaml" at <.Values.driver.config>: can't evaluate field config in type interface {}

@travisghansen
Member Author

What you have is pretty good, but the scope of the driver config is off as it relates to a helm values file. Something like this:

...
driver:
  config:
    driver: freenas-api-iscsi
    instance_id: 
    ...
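
Applied to the values file above, everything from driver: freenas-api-iscsi downward nests under driver.config, roughly like this (a sketch, abbreviated from the values already posted):

csiDriver:
  name: "org.democratic-csi.iscsi"

storageClasses:
- name: freenas-iscsi-csi-api
  # ... unchanged from above ...

driver:
  config:
    driver: freenas-api-iscsi
    instance_id:
    httpConnection:
      protocol: https
      host: nas.domain.com
      port: 8443
      apiKey: ***api-key-omitted***
      allowInsecure: true
    zfs:
      datasetParentName: nvme-pool/k8s/i/v
      detachedSnapshotsDatasetParentName: nvme-pool/k8s/i/s
      zvolEnableReservation: false
    iscsi:
      targetPortal: "aaa.bbb.ccc.ddd:3260"
      # ... remaining iscsi settings unchanged, just indented under driver.config ...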
