
API Documentation #36

Closed

lpabon opened this issue Jun 25, 2015 · 26 comments
@lpabon (Contributor) commented Jun 25, 2015

Hi all,
Now that Heketi is focusing solely on GlusterFS, we need to finalize the API. The implementation may or may not cover the entire API set, but it will need to do enough for Manila and Kubernetes to consume. Although the proposed API is quite simple, please make sure it leaves room to add future features.

Here is the proposed API:
https://github.com/heketi/heketi/wiki/API

Here is the old one for the prototype used by the demo:
https://github.com/heketi/heketi/wiki/API/c4be1ddcfd17e72117ebc584d646eec2987fcb58

lpabon self-assigned this Jun 25, 2015
lpabon added review and removed in progress labels Jun 26, 2015
@krisis commented Jun 26, 2015

@lpabon, the Add Node command takes an IP address as the 'name' of the node. How do you plan to extend this to support multiple network interfaces attached to a node?

@krisis commented Jun 26, 2015

@lpabon, the Add Node command doesn't return an id in its JSON response (nit), while Add Device requires a node id to be specified in its JSON request.

@lpabon (Contributor) commented Jun 26, 2015

@krisis Are you sure the id is not there? I have it on Add Node.

@lpabon (Contributor) commented Jun 26, 2015

@krisis Great question on the IP address. I'm not sure what to do there. What do you think we should do?

@kshlm commented Jun 26, 2015

Is using KB as the unit for size really needed? We talk of sizes in hundreds of GBs; specifying such large sizes in KB doesn't look pretty to me.

@kshlm commented Jun 26, 2015

The Create a Volume API has an optional cluster parameter. The response should include it as well, so the caller can know that the volume was created in the requested cluster.
Along the same lines, the cluster id should be included in all volume info responses.

@krisis commented Jun 26, 2015

@lpabon, Add Node returns the cluster id and not the node id.

@krisis commented Jun 26, 2015

@lpabon, Node could follow the same convention of taking an optional name, which would default to the node's id.

@krisis commented Jun 26, 2015

@lpabon you're right. Add Node does return the node id in its response.

@lpabon (Contributor) commented Jun 26, 2015

Thanks for the reviews; here are my comments:

  1. @krisis The API now separates the name from the IP by adding an ips:[] entry in the request, where the caller can pass a list of IP addresses for the specified node. name now becomes optional, just like other elements in Heketi. The only caveat is that all the IPs in the list must be accessible by Heketi for SSH (see the sketch after this list). Is this a good idea? If not, we may need a manage_ip entry in the request for Heketi to manage the node. What do you think? Link
  2. @kshlm Great idea about adding cluster information to all volume responses. I have made the change in the API. Link
  3. @kshlm The API is a software interface, and I can see clients hiding the KB size unit from their users in any manner they choose. The client would then convert the value to KB and send it to Heketi.
  4. Update: I have updated the API with volume expansion support. Link
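
To make item 1 concrete, here is a minimal sketch of what the revised Add Node request could look like. The field names follow the comment above; the values and exact shape are illustrative, not the final spec:

    POST /nodes
    {
        "name": "node1",
        "ips": ["192.168.10.100", "10.0.0.100"]
    }

Here name would be optional, defaulting to the node id, and every address in ips would have to be reachable by Heketi over SSH, per the caveat above.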

@lpabon (Contributor) commented Jun 26, 2015

At Greg Meno's suggestion, I have updated the API with support for asynchronous operations.
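
As a rough illustration of the asynchronous pattern (the paths, headers, and IDs here are illustrative, not the final spec): a long-running request returns 202 Accepted with a temporary status resource in the Location header, which the caller polls until the operation completes and is redirected to the new resource:

    POST /volumes
    --> 202 Accepted
        Location: /queue/8d2b...

    GET /queue/8d2b...
    --> 200 OK                (still in progress; poll again)

    GET /queue/8d2b...
    --> 303 See Other         (done)
        Location: /volumes/70ab...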

@lpabon (Contributor) commented Jun 29, 2015

Added Authentication documentation

@dpkshetty commented

@lpabon My comments below:

  • The Clusters, Nodes, and Devices APIs will typically be used/consumed by the storage admin, while the Volumes API will be used by the virt admin / OpenStack node. How do we ensure they don't step on each other? For example, an OpenStack node/user might accidentally remove a node from a cluster, and should be stopped from doing so.

    • Is there something in the Auth API that can create some separation/roles, such that Heketi can differentiate between storage and virt roles and ensure safe usage of its APIs?
  • Clusters API

    • How does the Heketi client know the status of a cluster? Cluster Info should provide a 'status' field with values such as 'ok', 'degraded', etc.
    • The status can be 'degraded' if one of the nodes in the cluster isn't peer probed or connected properly (Disconnected state).
    • This also means that in the Cluster Info response, along with each peer, we should also show the peer's status in the cluster?
  • Nodes API

    • In Add Node, why do you want multiple IP addresses to be provided per node? Typically, if a node has multiple IPs, they would be split across management, data, and internal networks, so just having one IP should be sufficient?
    • Alternatively, we could support a mgmt IP and a data IP as fields. Heketi would log in to the node using the mgmt IP, but use the node's data IP when using that node as part of a volume create.
    • Does the "free" field show actual free space, or free space considering thin provisioning? In other words, since we use thin LVs as bricks, "free" can have multiple meanings. We may end up overcommitting too much. Should we have an overcommit ratio for selecting which node/device combination to choose when looking for a brick?
  • Volumes API

    • In Volume Info, can we also have something like the following (a usage sketch follows this comment):

      "mount_options": {
              "backup-volfile-servers": <list of peer IPs for the said volume>
      }

    • This would help the client mount the gluster volume with `-o backup-volfile-servers`, and thus enjoy HA if the IP provided against "glusterfs" happens to be down at the time of mount.
    • "mount_options" can change based on the protocol supported / sent back as part of Volume Info.
    • Why does Volumes List need to send back brick info as well? Just basic info on each volume is enough; for more detailed info, the user should do a GET on that volume.
  • Finally, I had one more suggestion (maybe as a future potential feature):

    • Add a notion of a Pool, such that the hierarchy becomes Cluster -> Pool -> Node(s) -> Device(s) -> Brick(s).
    • A Pool is again a logical entity, like a Cluster. A set of nodes forms a pool, and volumes are created/managed inside the pool.
    • We can then create an HDDPool and an SSDPool and provide some basic form of QoS for creating volumes.
    • This also helps ensure that Pool1 running out of space doesn't affect Pool2.
    • It helps us get closer to enterprise storage semantics. A big enough cluster can have one pool for Cinder and one for Manila, all still managed as a single cluster from GlusterFS/Heketi's perspective.
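
To make the mount_options suggestion above concrete: a client consuming such a response could pass the extra servers straight to the mount call, roughly like this (the hostnames and volume name are made up; backup-volfile-servers is an existing mount.glusterfs option):

    mount -t glusterfs -o backup-volfile-servers=peer2:peer3 peer1:/myvol /mnt

If peer1 is down at mount time, the client falls back to peer2 and peer3 to fetch the volfile.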

@lpabon (Contributor) commented Jun 30, 2015

@dpkshetty "How do we ensure they don't step on each other?" Great point. I will need to add issuers (like accounts): one for topology and one for volumes.

@lpabon (Contributor) commented Jun 30, 2015

@dpkshetty

  • Cluster: "How does the Heketi client know the status of a cluster?" The first version of Heketi is about volume control; it does not track cluster status. Along with peer status, I think this is a great feature for a future version of Heketi, the one after the next ~6 months.
  • Node:
    • Multiple IPs: this was requested by the GlusterD team in preparation for v2.
    • Free shows unallocated space, without consideration of thin provisioning. Currently, I am planning on having Heketi use thin provisioning for snapshot support, but not for overprovisioning. The first version of Heketi will focus on correctness; future versions can work on efficiency.
  • Volumes:
    • Mount option: can this option be set after the volume has been created?
    • Good point on Volume List; I think I will change it to your suggestion.
  • Pools:
    • Yes, this sounds like a future enhancement. In the near term, Clusters = Pools, but we can enhance that in the future.

@lpabon (Contributor) commented Jun 30, 2015

@dpkshetty Hi, I made the following updates:

  1. Added admin and user issuers to Authentication (see the sketch after this list)
  2. Updated Volume List
  3. Following your suggestion on the volume list, I thought it would be good to add the same to Cluster Information, to show only the volumes on a specific cluster.
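
For illustration, a token-based scheme along these lines could distinguish the two roles via the token's issuer claim. This is a hypothetical payload, assuming JWT-style tokens; the claim names and values are assumptions, not the final spec:

    {
        "iss": "admin",
        "iat": 1435700000,
        "exp": 1435700300
    }

An "admin" issuer would be allowed topology calls (clusters, nodes, devices), while a "user" issuer would be limited to volume calls, matching the role separation @dpkshetty asked for.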

@dpkshetty commented

@lpabon Great, thanks for the updates.

  1. Without knowing the status of the cluster, volume creation might fail if we end up picking a node/brick which is not part of the cluster (or was part of it, but got disconnected later), thus affecting correctness. I feel we should at least add a status field and use it (maybe for now it always shows 'ok'), so that in the future, when we actually reflect the right status, we don't have to change the code much?
  2. Regarding a node's multiple IPs, I didn't quite understand what the GlusterD team said. IMHO, multiple IPs are best used when they separate the mgmt and data traffic, hence it makes sense to have that in Heketi.
  3. Regarding the volumes mount option, this is not a volume set option. IIUC, after a volume is created, when the client tries to mount, mount.glusterfs supports -o backup-volfile-servers among many other options; thus the peer IPs, if sent back as part of the Volume GET, can be used by the client during mount.
  4. Regarding Volumes List, you reduced it to just a list of IDs, which is not what I expected :) With just a list of IDs, the list won't be useful/helpful, since people remember/recognize their volumes by name and other attributes more than just the ID. So I feel we should at least return the ID, name, cluster, replica, and size as part of List, and leave the details (brick info, mount info, etc.) to the GET API (see the sketch after this list). The same goes for Cluster Info; there too, instead of only the ID, at least the ID and Name should be sent back.
  5. It would still be OK to introduce Pools now, make Cluster = Pool, have Pool info returned in all the GET responses, and have the code use Pool while creating volumes. This should help in the future when we actually make Cluster != Pool: Heketi code and client code wouldn't have to change much, and a fallback to the old behavior would be possible with an option to make Cluster = Pool (for debug purposes).
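
Something along these lines would satisfy point 4; the fields are the ones suggested above, while the shape, IDs, and values are only a sketch (size in KB, per the API's unit):

    GET /volumes
    {
        "volumes": [
            {
                "id": "70ab...",
                "name": "myvol",
                "cluster": "67e0...",
                "replica": 2,
                "size": 1048576
            }
        ]
    }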

@lpabon (Contributor) commented Jul 1, 2015

@dpkshetty Hi Deepak, thanks for the review. Before I give my comments, I just wanted to point out that this is a software interface, not one for end users. The users will not see this, so they will not be confused. Here are my comments:

  1. Since we can always add new fields, we can add this in the future. I agree that Heketi should mature and start noticing cluster health, but in the meantime I propose to keep it simple.
  2. You may want to talk to the GlusterD folks about this. But in reality, it does not matter, because an IP list with one element is the same as a single IP. The client should display this correctly to the user.
  3. Is this necessary for the first version? I do not think the API prevents this feature in the future, right?
  4. Since this is a software interface, the client would get a list and, for each element in the list, call VolumeInfo (see the trace after this list). The user would see the correct information.
  5. This one is confusing to me. I guess I would need to know what you mean by Pool in GlusterFS, since GlusterFS itself does not define what a pool is. FYI, in Heketi, all unallocated storage in the cluster is like a 'pool' of storage which it uses to allocate bricks to create volumes.
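
The flow in point 4, roughly (the paths, IDs, and fields are illustrative):

    GET /volumes
    --> {"volumes": ["70ab...", "8f3c..."]}

    GET /volumes/70ab...
    --> {"id": "70ab...", "name": "myvol", "cluster": "67e0...", "size": 1048576, ...}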

@dpkshetty commented

@lpabon Hi

  1. OK.
  2. Sure.
  3. I think it's necessary, because it adds robustness for the client (the OpenStack node).
  4. Agreed, this is a software interface, but the data returned by Heketi will be consumed by the client software and presented to the user, so the user wouldn't know which ID is which volume unless user-friendly fields (like name, size, cluster, etc.) are also provided. It shouldn't be too difficult to return these basic fields?
  5. Pool = a logical sub-collection of nodes, while Cluster = the collection of all nodes. Yes, GlusterFS doesn't have the notion of a Pool; hence I said it's a logical entity.

@lpabon (Contributor) commented Jul 2, 2015

@dpkshetty Hi, I made the following updates:

  1. Added options to the mount information in the volume
  2. Added manage_ips and storage_ips to both Add Node and Node Information (items 1 and 2 are sketched after this list)
  3. Cluster List returns a simple list of clusters
  4. Cluster Info returns the storage information about that cluster
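
A sketch of how items 1 and 2 could look in the responses; the field names are taken from the list above, while the nesting, IDs, and values are illustrative only:

    GET /nodes/{id}
    {
        "id": "8d2b...",
        "manage_ips": ["10.0.0.100"],
        "storage_ips": ["192.168.10.100"]
    }

    GET /volumes/{id}
    {
        "id": "70ab...",
        "mount": {
            "glusterfs": {
                "options": {
                    "backup-volfile-servers": "192.168.10.101,192.168.10.102"
                }
            }
        }
    }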

@lpabon (Contributor) commented Jul 6, 2015

Thank you all for the help. Work has started on implementing the proposed API.

lpabon added this to the API Implemented milestone Jul 6, 2015
lpabon closed this as completed Jul 6, 2015
lpabon removed the review label Jul 6, 2015
lpabon reopened this Jul 9, 2015
@lpabon (Contributor) commented Jul 9, 2015

@krisis @dpkshetty Question on https://github.com/heketi/heketi/wiki/API#node_add. According to [1], it seems the API should really only ask for the hostname of the system. Does it really need to use IPs? I'm assuming it still needs to know the management hostname/IP and the storage network hostname/IP, right?
[1] http://www.gluster.org/pipermail/gluster-users/2014-July/018028.html

@lpabon (Contributor) commented Jul 9, 2015

@krisis @dpkshetty , should I remove "name" from the request?

@lpabon (Contributor) commented Jul 9, 2015

@krisis @dpkshetty @kshlm Question: when adding a node, I think the username and the SSH private key should also be provided, right? Or should we assume that the ssh-agent on the system where Heketi runs has already been set up?

@dpkshetty commented

@lpabon

  1. Regarding asking for a hostname vs. an IP: I think the gluster-users link preferred hostnames for cases where the IP changes; do we envision such scenarios being frequent? Also, many times when using VMs or hosts installed via kickstart, hostnames may not be set properly, so an IP seems better. And in the case of a node that has two NICs, the IPs are different but there is only one hostname (unless you spoof the hostname via /etc/hosts on the Heketi node?).

  2. What was the significance of 'name'? If it's a user-friendly identifier for the UUID, and can be used in place of the UUID in the REST API, I think it's still better to have. Otherwise too, it's easier for someone to give nodes user-friendly names, e.g. node1_zone1, node2_zone3, so it seems useful to me at the moment.

  3. Yup, the username and key (public key, not private) can be added if you only intend to provide key-based auth. For phase 1, it should also be OK to just assume (and hence have as a prerequisite) that passwordless root access to the systems is set up in advance. ssh-agent could be another way (I haven't used it much), but if Heketi is running on all Gluster nodes and/or will be spawned when a Gluster node goes down, who takes ownership of setting up ssh-agent on the new node?

@lpabon (Contributor) commented Jul 9, 2015

@dpkshetty Great, thanks for the help.

lpabon closed this as completed Jul 9, 2015