Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CP-20676: Add gpumon interface #151

Merged
merged 2 commits into from
Apr 12, 2017
Merged

CP-20676: Add gpumon interface #151

merged 2 commits into from
Apr 12, 2017

Conversation

Frezzle
Copy link
Collaborator

@Frezzle Frezzle commented Apr 11, 2017

No description provided.

@coveralls
Copy link

Coverage Status

Coverage remained the same at 47.16% when pulling 224cb70 on Frezzle:gpumon into c41d9e9 on xapi-project:master.

type vgpu_compatibility = unit (* temporarily unit, until real type is determined *)

external get_pgpu_metadata: debug_info -> pgpu_address -> pgpu_metadata = ""
external get_pgpu_vm_compatibility: debug_info -> pgpu_metadata -> dom_id -> vgpu_compatibility list = ""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we need the address of the current pGPU as well (@mseri).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the return type, we essentially just need a Boolean, which give the answer to the question "can this VM be live migrated to this pGPU", plus a reason in case the answer is "no". Possible reasons are "host driver", "guest driver", "gpu", "other", or a combination of those. How about this type:

type incompatibility_reason = Host_driver | Guest_driver | GPU | Other
type compatibility = Compatible | Incompatible of incompatibility_reason list

Copy link
Collaborator

@mseri mseri Apr 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have extended the bindings in a way that may require some changes here. To get the pGPU handle, and then the metadata, you will need the pci.bus_id (a string) as pgpu_address.

Compatibility here can be any type useful to xapi, we can always reshape it inside gpumon when calling the idl to expose the correct informations. Seems good to simplify it as suggested by Rob

Copy link

@thomassa thomassa Apr 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For specifying the PCI device, I had imagined we might be able to use the xapi database's (misleadingly-named) pci_id field of a pci db entry, provided by xapi with code like Db.PCI.get_pci_id ~__context ~self:pci so it would be of format "%04x:%02x:%02x.%01x"
See http://wiki.xen.org/wiki/Bus:Device.Function_%28BDF%29_Notation
and/or xen-api.git/ocaml/xapi/pciops.ml

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I misunderstood the comment there. We do need the pgpu_address only to get the metadata. Fr the compatibility we need the current vgpu (we get it using the domid) and the destination pgpu metadata

Copy link
Member

@robhoes robhoes Apr 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, so:

external get_pgpu_vm_compatibility: debug_info -> pgpu_address -> pgpu_metadata -> dom_id -> compatibility

...where pgpu_address is for the pGPU that is currently in use, and pgpu_metadata is for the one we want to migrate the VM to. We can add labels or a comment to clarify that.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, yes, I see what you mean. Indeed, we need the current pgpu and domid to get the current vgpu metadata, and we need the destination pgpu metadata.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I thought we could get the metadata about the VM's (current) vGPUs from the domid without needing to specify which pgpu (or pgpus) the VM is currently using.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thomassa Rob told me we can but it's not as straight-forward as we like (IIRC it involves looping through the pGPUs to check which one we're looking for), so having it here makes it simpler.

@robhoes
Copy link
Member

robhoes commented Apr 11, 2017

In any case, this needs to go onto a feature branch.

@Frezzle Frezzle changed the base branch from master to vgpu April 11, 2017 11:15
@Frezzle Frezzle changed the base branch from vgpu to vgpu-migration April 11, 2017 11:18

type pgpu_address = string
type pgpu_metadata = string
type dom_id = int
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is called domid in other places.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that domid is a string for the bindings and the underlying library.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can translate it to a string inside gpumon where needed.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I felt int was better for this interface, so that the translation to string will be done in gpumon rather than in xapi.

open Gpumon_interface
open Xcp_client

let json_url () = "file:" ^ json_path
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't look like a file URL. I expected to see "file://"^json_path. https://en.wikipedia.org/wiki/File_URI_scheme#Unix

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

json_path is set in gpumon/gpumon_interface.ml:
let json_path = "/var/xapi/" ^ service_name ^ ".json"
so that json_url ends up as file:/var/xapi/gpu.json (as a side note, i'm changing the service_name to gpumon instead of gpu). I'm not sure if file:/... also works as the recommended file://..., but memory/memory_interface.ml does the same thing, and my assumption was that it worked.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying it out at http://www.freeformatter.com/url-parser-query-string-splitter.html
suggests that it is accepted. I still find it dubious but would have to read rules for URLs again whether this is syntactically valid.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wikipedia says:

The // after the file: is part of the general syntax of URLs. (The double slash // should always appear in a file URL according to the specification, but in practice many Web browsers allow it to be omitted).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, seeing as we seem to have had no problem with the memory interface (AFAIK) then maybe it's better to conform to what we're already doing?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd argue the other way round: do the right thing and check that it works. If not, do the wrong thing but document it.

Copy link
Collaborator Author

@Frezzle Frezzle Apr 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The xenops interface appears to do the same (uses file:/...).

Edit: All the other interfaces also do the same.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could have a separate commit to change them all to use // but I see no reason to if it works?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jonludlam, @robhoes we need your opinion, here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of all, I don't see json_url used anywhere, so it looks like dead code that should just be removed.

However, the same question applies to xml_url below. The place where this is used is in the function http_rpc in lib/xcp_client.ml, which passes it to Uri.of_string, which is provided by a library. I assume that this is the format that is accepted by this library, because indeed we use this format everywhere. For this PR, IMO, we should just comply with the other interfaces in xcp-idl, which is what the current patch does.

* GNU Lesser General Public License for more details.
*)

let service_name = "gpu"
Copy link
Collaborator

@mseri mseri Apr 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The client is called "gpumon" (if that is what ~dststr:"gpumon" means) but the service here is "gpu", is this correct?

Copy link
Collaborator Author

@Frezzle Frezzle Apr 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right; I'm changing it to gpumon now. (we previously had everything as gpu and forgot to change this one)

let service_name = "gpu"
let queue_name = Xcp_service.common_prefix ^ service_name
let json_path = "/var/xapi/" ^ service_name ^ ".json"
let xml_path = "/var/xapi/" ^ service_name
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for the client, do we use these paths anywhere?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

xml_path is used by the xml_url function in gpumon/gpumon_client.ml, which is used in the xml_http_rpc call below that in the same file. json_path, as @robhoes pointed out above, isn't used, so I'm removing it.

@Frezzle
Copy link
Collaborator Author

Frezzle commented Apr 12, 2017

^ New commit with all discussed changes.

@coveralls
Copy link

Coverage Status

Coverage remained the same at 47.16% when pulling acd8554 on Frezzle:gpumon into c41d9e9 on xapi-project:vgpu-migration.

@robhoes
Copy link
Member

robhoes commented Apr 12, 2017

Looks good now. Please squash.

type incompatibility_reason = Host_driver | Guest_driver | GPU | Other
type compatibility = Compatible | Incompatible of incompatibility_reason list

(* Get the metadata for a pGPU, given its address (PCI bus ID). *)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be (** text *) so that it is read as a docstring?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in new commit!

(* Get the metadata for a pGPU, given its address (PCI bus ID). *)
external get_pgpu_metadata: debug_info -> pgpu_address -> pgpu_metadata = ""

(* Check compatibility between a VM's vGPU(s) and another pGPU. *)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, I think this whole block of comments should be in

(** text ... 
     ...
     ... *)

to be interpreted as a docstring

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed it.

Commit 1 of 2 for new gpumon interface.

Signed-off-by: Frederico Mazzone <frederico.mazzone@citrix.com>
Commit 2 of 2 for new gpumon interface.

Signed-off-by: Frederico Mazzone <frederico.mazzone@citrix.com>
@coveralls
Copy link

Coverage Status

Coverage remained the same at 47.16% when pulling a1942f2 on Frezzle:gpumon into c41d9e9 on xapi-project:vgpu-migration.

@Frezzle
Copy link
Collaborator Author

Frezzle commented Apr 12, 2017

Squashed!

@robhoes robhoes merged commit 5bcf130 into xapi-project:vgpu-migration Apr 12, 2017
@coveralls
Copy link

Coverage Status

Coverage remained the same at 47.16% when pulling 1e9d016 on Frezzle:gpumon into c41d9e9 on xapi-project:vgpu-migration.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants