Skip to content

Conversation

@bitmeal
Copy link
Contributor

@bitmeal bitmeal commented Mar 6, 2020

When creating and enabling SR-IOV networks, using xapi, xcp-networkd tries to enable virtual functions on the physical adapter using sysfs interface. xcp-networkd tries to set the number of VFs equal to the maximum number of of virtual functions (sriov_totalvfs). When using adapters that do not fully implement the sysfs interface for SR-IOV configuration, currently, enabling fails. Enabling virtual functions via sysfs is executed even if the virtual functions are already enabled. Failing to execute the enabling procedure (even though unnecessary, when VFs are already enabled), prevents successful creation of an SR-IOV network.

An example are Mellanox ConnectX-3 cards, using mlx4_core driver: Virtual functions can be configured in firmware or using modprobe config. All configured virtual interfaces are enabled by default. The driver, though, does not allow configuration using the sysfs interface, and trying to write to sriov_numvfs fails; thus preventing successful creation and enabling of a SR-IOV network - even though, the virtual functions are enabled.

This pull request allows to use network adapters that require manual enabling (do not fully support configuration via sysfs interface) with xapi/xcp-networkd managed SR-IOV networks, by checking for enabled VFs on the interface, prior to trying to enable them. When virtual functions are already enabled on the interface, enabling is skipped, a debug message about already enabled virtual functions is logged and config_sriov returns with Ok Sysfs_successful (Ok Sysfs_successful being the only valid return value for successfully enabling virtual functions).

@edwintorok
Copy link
Contributor

Thanks for the detailed explanation.

I think it would be better to try to enable the maximum number of VFs, and only if that fails fall back to checking how many VFs we have enabled (and if that is >0 as in your code then succeed).
Otherwise there could be a device that has some VFs enabled, perhaps not the maximum, and would allow configuring the number of VFs. In that case trying to increase the VFs would succeed.

@edwintorok
Copy link
Contributor

Please also add a Signed-off-by line to your commit, see the error from the DCO bot and https://elinux.org/Developer_Certificate_Of_Origin

@robhoes
Copy link
Member

robhoes commented Mar 6, 2020

We also have the Modprobe configuration option, which is handled just above Sysfs in the same function, which comes with another valid return value for the "success" case: Modprobe_successful. This was done to support the igb driver specifically, and we considered it a legacy mode.

We could extend this method to support the cards that you mentioned, so that xcp-networkd has the power to enable VFs in case they aren't yet, by updating the modprobe file. However, I'm not sure if it's worth doing that if, as you say, VFs are enabled by default and we can count on that.

@bitmeal
Copy link
Contributor Author

bitmeal commented Mar 6, 2020

Thank you everybody for the investment!
In reply to @edwintorok first: I could implement and test that behavior not before sunday or monday.
@robhoes The modprobe configuration on the Intel adapters sets the max_vfs option, but does not enable the virtual functions (just setting the maximum number of "enableable" virtual functions). Enabling the virtual functions is still done using the sysfs interface. Ultimately, you are right and Ok Modprobe_successful is another valid, successful return value. To fully control the Mellanox adapters, adding a xapi-templated modprobe file would be necessary as well. Personally I would try to abstain for from mechanisms tailored to a specific card/vendor, modprobe (+reboot) and sysfs-interface being the common denominator here, but would try to not prevent the use of cards that require manual configuration. Modprobe based configuration requires a reboot (per mechanism) and for now assumes the virtual functions to be enabled after reboot to not be enabled after reboot, but being enabled via sysfs interface. If the return value (except Ok *) should further indicate the successfully used mechanism, I, personally, would advocate for introducing an additional successful return Value, for adapters that automatically enable VFs via modprobe.

edit: fixed a word and clarification

@bitmeal
Copy link
Contributor Author

bitmeal commented Mar 9, 2020

I just found out, that what I wrote is not true! Excuse me please.

Modprobe based configuration requires a reboot (per mechanism) and for now assumes the virtual functions to be enabled after reboot to not be enabled after reboot, but being enabled via sysfs interface.

The modprobe interface does in fact not rely on a working sysfs interface implementation! Reading the code again, it should, as you @robhoes stated, in fact rely on the the functions being enabled directly after reboot. When a templated modprobe config is detected, the "modprobe branch" is taken and no "escaping" to the sysfs configuration code is possible.

To come back to the original issue:
What is still not possible, using both mechanisms as implemented, is the use of manually configured virtual functions. It could be possible to trick networkd into assuming a successful modprobe configuration, but this should not be the way to go in my opinion.

I think the way proposed by @edwintorok (and in the pull request) would be a solution that works in most cases.

In my opinion, a clean solution could be as follows:

  • templated modprobe file existing --> modprobe interface
  • writing current value of sriov_numfvs back to sriov_numfvs (itself) succeeds --> sysfs interface
  • everything failed, just check wether virtual functions are already enabled

This would require an additional return value, as neither the modprobe not the sysfs configuration were successful. sysfs would in fact be used to determine the actual number of virtual functions, but would not be used for configuration. Adding an additional return value, e.g. Ok Manual_successful would require a modification to the xapi_network_sriov_helpers.ml in xen-api, as the return value match is not exhaustive.

Any thoughts on this?

@robhoes
Copy link
Member

robhoes commented Mar 9, 2020

Enabling the virtual functions is still done using the sysfs interface.

I don't think that's true for Intel/igb, as you'll never get into the None case of the function that you are editing. However, I do agree that we should avoid adding a modprobe-style mechanism for all the various devices that don't support the standard sysfs interface. We did it for Intel due to special requirements.

@robhoes
Copy link
Member

robhoes commented Mar 9, 2020

Yes, what you wrote at the end looks like a clean way to do it. Unfortunately, it would require changing three repos: besides xcp-networkd and xen-api, you need to define the new return code in xcp-idl (network_interface.ml).

In xen-api, you should also add the manual mode to the sriov_configuration_mode enum in ocaml/idl/datamodel.ml for the API, and network_sriov_configuration_mode_to_string in ocaml/xapi/record_util.ml for the CLI. This is a user-visible addition to the API/CLI that tells the user what's going on.

@bitmeal
Copy link
Contributor Author

bitmeal commented Mar 9, 2020

Thank you for the pointers! That should be doable.
What would be the preferred procedure from your side to go about such a modification, involving three repos? Just filing individual pull requests and referencing each other request as a dependency in the comments?

@robhoes
Copy link
Member

robhoes commented Mar 9, 2020

Yes, that's how we normally do it.

…etworks managed by xapi; network server failed to use already enabled virtual functions, for network adapters that do not allow configurations via sysfs interface, but require manual configuration by user

Signed-off-by: Arne Wendt <arne.wendt@tuhh.de>
@bitmeal
Copy link
Contributor Author

bitmeal commented Mar 12, 2020

The proposed solution of:

  • templated modprobe file existing --> modprobe interface
  • writing current value of sriov_numfvs back to sriov_numfvs (itself) succeeds --> sysfs interface
  • everything failed, just check wether virtual functions are already enabled
    did not prove successful in testing!

This may be a special problem of the Mellanox driver (mlx4_core) thoug.
When writing the back the number of currently active VFs (sriov_numfvs) back to sriov_numfvs, the operation succeeds. When writing any other number, it fails. The proposed approach thus assumes a working sysfs interface.

I modified the approach to something similiar as @edwintorok proposed as an addition to the first pull request:

  • cache the number of currently active VFs
  • try to enable maxvfs on device
    • succeeds --> sysfs interface
    • fails, 'cached-numvfs' <> 0 --> manual config
    • fails,'cached-numvfs'=0 --> failure

Trying to deduce the configuration mode only happens when enabling sriov, disabling does not call Net.Sriov.disable, when mode is `manual, as disabling is not (assumed to not be) supported on manually configured devices.

updates and pull requests for xen-api and xcp-idl will be filed and referenced shortly

  • compiled from most recent master branch commit
  • against most recent xs-opam repository
  • against xcp-ng-8.1 rpms/repo
  • tested on xcp-ng-8.1

with e -> let msg = Printf.sprintf "Error: trying sysfs SR-IOV interface failed with exception %s on device: %s" (Printexc.to_string e) dev in Error(Other, msg) ),
present_numvfs
with
| Ok _ , _ -> debug "%s SR-IOV on a device: %s via sysfs" op dev; Ok Sysfs_successful
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was a bit surprised that the code would not use error and return (as I would expect in monadic code) but the existing code does not use it either.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to stick to the original code for error handling and return types, to not mess with other code portions and keep currently expected behavior.

end
else
begin
Sysfs.unbind_child_vfs dev >>= fun () -> Sysfs.set_sriov_numvfs dev 0 >>= fun _ -> debug "%s SR-IOV on a device: %s via sysfs" op dev; Ok Sysfs_successful
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please shorten lines by using

… >>= fun () ->
… >>= fun _ ->
…

@bitmeal
Copy link
Contributor Author

bitmeal commented Apr 6, 2020

@lindig: I hope I could address all of your comments with the latest commit. Thanks for all the input and pointers!

I'm sorry for the delay, but was occupied with getting our team working in the current situation.

Copy link
Contributor

@lindig lindig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks all good to me. Any comments from @robhoes who knows the domain better than I do.

@lindig
Copy link
Contributor

lindig commented Apr 6, 2020

# File "networkd/network_server.ml", line 142, characters 75-92:
9522# 142 |         | _, _ -> debug "SR-IOV/VFs %sd manually on device: %s" op dev; Ok Manual_successful
9523#                                                                                  ^^^^^^^^^^^^^^^^^
9524# Error: This variant expression is expected to have type
9525#          S.Sriov.enable_action_result
9526#        The constructor Manual_successful does not belong to type S.Sriov.enable_action_result

Copy link
Contributor

@lindig lindig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Travis build is broken - please take a look

@bitmeal
Copy link
Contributor Author

bitmeal commented Apr 7, 2020

This should be expected, or not? Manual_successful for Sriov.enable_action_result will only be introduced with xapi-project/xcp-idl#307 but is not part of the travis build, as the pull request is not yet merged (as this one really should be approved first I think).

match (Sysfs.get_sriov_maxvfs dev >>= fun maxvfs -> maxvfs |> Sysfs.set_sriov_numvfs dev), present_numvfs
with
| Ok _ , _ -> debug "%s SR-IOV on a device: %s via sysfs" op dev; Ok Sysfs_successful
| _, _ -> debug "SR-IOV/VFs %sd manually on device: %s" op dev; Ok Manual_successful
Copy link
Member

@robhoes robhoes Apr 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are not actually matching on present_numvfs. Don't you need at least | _, present -> when present > 0? Or better, check that it is equal to maxvfs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are absolutely right! When adapting the code I was more amazed by the fact that I may integrate the exception handling in the match, and forgot to think twice about the actual semantics.
The code now does something very different and it just tested fine on my machine, as currently enabled VFs and maximum number of VFs is equal.
I am currently thinking about a way to rewrite the code with my new knowledge about matching of exceptions, but for now did not come to a conclusion. I think the old way of implementing this + the "style-fixes" will be the way to go.
I hope to come to a conclusion within the next hour.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it is a problem is the number of VFs is smaller than the max. At least, I hope that we do the accounting, when allocating VFs to VMs, based on the current number rather than the max. As long as the number is larger than zero. So the when clause I suggested above may be enough?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And to actually reply to your question:

Don't you need at least | _, present -> when present > 0? Or better, check that it is equal to maxvfs?

I would need to check if it is >0 or <>0. Problem here is, that the code to set the new number of VFs is expected to throw. I have to match for exception and > 0, or like in the last version, match exception, 0 first and know that anything matched afterwards can only be exception and > 0. This was the reason I built the Ok() and Error() returning try-catch-construct

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

| _, present -> when present > 0
So the when clause I suggested above may be enough?

The attempt to set a new number of VFs may return with Error _, when a failure is handled internally by the called methods, or may just throw. Both, exception and Error, are acceptable in the case of present_numvfs > 0.
As far as I understand, the wildcard _ does not match exceptions. But trying to match multi case on | exception _ | Error _ when present_numvfs > 0 -> ... is unfortunately not supported either (no mixing of value and exception patterns when using when-guard).
Including the exception in the match is possible, but for now I cannot find a solution that is actually cleaner and more concise than the original variant. Combining Error and exception into one match case by catching exceptions beforehand, effectively avoids code duplication.

It is possible to get rid of the try-catch and matching a tuple, but the match is not any more readable, due to the reason given above. Example:

| Ok _ -> debug [...]; Ok
| Error _ when present_numvfs > 0 -> debug [...]; Ok Manual_successful
| exception _ when present_numvfs > 0 -> debug [...]; Ok Manual_successful
| Error err -> Error err
| exception e -> let msg = [...]; Error(Other, msg)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is correct that a single pattern either matches a normal value or an exception, but never both. I still think a collection of patterns is easier to reason about and easier to extend than if-then-else or try-with chains. So I do like the above but would not insist on it.

Copy link
Contributor Author

@bitmeal bitmeal Apr 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, my take at the implementation with the least redundancy would be as follows:

let man_successful () =
    debug "SR-IOV/VFs %sd manually on device: %s" op dev;
    Manual_successful
in
if enable then
begin
    let present_numvfs = Sysfs.get_sriov_numvfs dev in
    match Sysfs.get_sriov_maxvfs dev >>= fun maxvfs -> maxvfs |> Sysfs.set_sriov_numvfs dev
    with
    | Ok _ -> debug "%s SR-IOV on a device: %s via sysfs" op dev; Ok Sysfs_successful
    | Error _     when present_numvfs > 0 -> Ok (man_successful ())
    | exception _ when present_numvfs > 0 -> Ok (man_successful ())
    | Error err -> Error err
    | exception e ->
        let msg = Printf.sprintf "Error: trying sysfs SR-IOV interface failed with exception %s on device: %s" (Printexc.to_string e) dev in
        Error(Other, msg)
end

should be equivalent to following code with doublings, but without introduction of an additional function:

if enable then
begin
    let present_numvfs = Sysfs.get_sriov_numvfs dev in
    match Sysfs.get_sriov_maxvfs dev >>= fun maxvfs -> maxvfs |> Sysfs.set_sriov_numvfs dev
    with
    | Ok _ -> debug "%s SR-IOV on a device: %s via sysfs" op dev; Ok Sysfs_successful
    | Error _     when present_numvfs > 0 -> debug "SR-IOV/VFs %sd manually on device: %s" op dev; Ok Manual_successful
    | exception _ when present_numvfs > 0 -> debug "SR-IOV/VFs %sd manually on device: %s" op dev; Ok Manual_successful
    | Error err -> Error err
    | exception e ->
        let msg = Printf.sprintf "Error: trying sysfs SR-IOV interface failed with exception %s on device: %s" (Printexc.to_string e) dev in
        Error(Other, msg)
end

or, using try catch:

begin
    let present_numvfs = Sysfs.get_sriov_numvfs dev in
    match
    ( try Sysfs.get_sriov_maxvfs dev >>= fun (maxvfs) -> maxvfs |> Sysfs.set_sriov_numvfs dev
        with e ->
        let msg = Printf.sprintf "Error: trying sysfs SR-IOV interface failed with exception %s on device: %s" (Printexc.to_string e) dev in
        Error(Other, msg) ),
    present_numvfs
    with
    | Ok _ , _ -> debug "%s SR-IOV on a device: %s via sysfs" op dev; Ok Sysfs_successful
    | Error(err), 0 -> Error(err)
    | _, _ -> debug "SR-IOV/VFs %sd manually on device: %s" op dev; Ok Manual_successful
end

if there are any preferences, tell me and I can commit the code

edit: updated code snippet 1, as there were errors

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lindig @robhoes : any preferences regarding style of the implementation?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the first option of the three, if that helps you unlock the work :)

Copy link
Member

@robhoes robhoes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good apart from what I commented above about present_numvfs.

@robhoes
Copy link
Member

robhoes commented Apr 7, 2020

I think that you need a rebase as well, since Edwin's commit shows up.

…bled sr-iov/vfs. manual configuration is assumed if: no modprobe template is present, setting numvfs to maxvfs via sysfs interface fails, but vfs are present.

a new return `Manual_successful` is introduced.
**this patch depends on updated versions of:**
	* xen-api
	* xcp-idl

Signed-off-by: Arne Wendt <arne.wendt@tuhh.de>

clean code

Signed-off-by: Arne Wendt <arne.wendt@tuhh.de>

fix broken functionality after code clean and style change

Signed-off-by: Arne Wendt <arne.wendt@tuhh.de>
@bitmeal
Copy link
Contributor Author

bitmeal commented Apr 20, 2020

As there have been no further comments on the style of the implementation, I'm following what I think to be the preferences of @lindig and @psafont.

Implementation snippet #1, as the now chosen one, included errors; snippet has been updated above with working code.

Code has been updated and functionality restored.

Copy link
Member

@robhoes robhoes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks good now, thanks.

@robhoes
Copy link
Member

robhoes commented Jun 11, 2020

Unfortunately, there appears to be a merge conflict.

@lindig
Copy link
Contributor

lindig commented Jun 11, 2020

I believe Travis failures are expected because of mutual dependencies between pull requests.

@robhoes
Copy link
Member

robhoes commented Jun 12, 2020

I've rebased it and fixed the conflict: #177. I'll close this PR now in favour of the other.

@robhoes robhoes closed this Jun 12, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants