Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

verbs: Add an API to retrieve IB device number of ports via netlink #958

Merged
merged 4 commits into from
Apr 21, 2021

Conversation

yishaih
Copy link
Member

@yishaih yishaih commented Mar 14, 2021

This series adds a verbs API to retrieve IB device number of ports via netlink.

The new API is used by mlx5 in the DR area.

@jgunthorpe
Copy link
Member

Why not make the verb return the query device value if netlink isn't support and forget about the dv API?

@mark-bloch
Copy link
Contributor

Why not make the verb return the query device value if netlink isn't support and forget about the dv API?

What dv API?
Are you talking about the DR code? this is just internal implementation which isn't exposed.

About the verb, the current definition of "netlink based number of ports" is very clear and creates a separate API from the regular query device (phys_port_cnt) which is limited to u8.

Of course doing that (your comment) will allow us to say "don't user phys_port_cnt/ibv_query_device() and move to use ibv_get_device_num_ports()) are we okay with saying that?

@jgunthorpe
Copy link
Member

Yes move the logic to call to the device information to the ibv_query_device() and get rid of the dr_ function that tries to do that. We probably need to pass in the ibv_context instead of the device to make that work. Netlink or not netlink is too complicated a detail to expose as an API. This should always return the port count and be the only way to get a port count if it is larger than u8

@mark-bloch
Copy link
Contributor

Yes move the logic to call to the device information to the ibv_query_device() and get rid of the dr_ function that tries to do that. We probably need to pass in the ibv_context instead of the device to make that work. Netlink or not netlink is too complicated a detail to expose as an API. This should always return the port count and be the only way to get a port count if it is larger than u8

Okay, so as ibv_query_device_ex() can fail as well so DR will need some logic anyway.
I think I better expose a new field (phys_port_cnt_ex) in struct ibv_device_attr_ex which is u32 bits.
If the netlink value is valid (!= 0) fill that, if not just fill phys_port_cnt.

This way an application can always use the new phys_port_cnt_ex value and there is no need for a new verb.

@rleon
Copy link
Member

rleon commented Apr 8, 2021

Yes move the logic to call to the device information to the ibv_query_device() and get rid of the dr_ function that tries to do that. We probably need to pass in the ibv_context instead of the device to make that work. Netlink or not netlink is too complicated a detail to expose as an API. This should always return the port count and be the only way to get a port count if it is larger than u8

Okay, so as ibv_query_device_ex() can fail as well so DR will need some logic anyway.
I think I better expose a new field (phys_port_cnt_ex) in struct ibv_device_attr_ex which is u32 bits.
If the netlink value is valid (!= 0) fill that, if not just fill phys_port_cnt.

This way an application can always use the new phys_port_cnt_ex value and there is no need for a new verb.

Always fill that new field and encourage users to always read.

Something like phys_port_cnt_real ???

@mark-bloch
Copy link
Contributor

Yes move the logic to call to the device information to the ibv_query_device() and get rid of the dr_ function that tries to do that. We probably need to pass in the ibv_context instead of the device to make that work. Netlink or not netlink is too complicated a detail to expose as an API. This should always return the port count and be the only way to get a port count if it is larger than u8

Okay, so as ibv_query_device_ex() can fail as well so DR will need some logic anyway.
I think I better expose a new field (phys_port_cnt_ex) in struct ibv_device_attr_ex which is u32 bits.
If the netlink value is valid (!= 0) fill that, if not just fill phys_port_cnt.
This way an application can always use the new phys_port_cnt_ex value and there is no need for a new verb.

Always fill that new field and encourage users to always read.

Something like phys_port_cnt_real ???

Yea, that's the idea, it will look something like this (in ibv_cmd_query_device_any()):

	if (attr_size >= offsetofend(struct ibv_device_attr_ex, phys_port_cnt_ex)) {
		struct verbs_sysfs_dev *sysfs_dev = verbs_get_device(context->device)->sysfs;

		if (sysfs_dev->num_ports)
			attr->phys_port_cnt_ex = sysfs_dev->num_ports;
		else
			attr->phys_port_cnt_ex = attr->orig_attr.phys_port_cnt;
	}

so if the netlink value is there (sysfs_dev->num_ports) use it, otherwise just use the legacy phys_port_cnt.
About the name I was thinking about phys_port_cnt_ex.

mark-bloch and others added 4 commits April 11, 2021 09:54
As kernel RDMA devices can have more than 255 ports the legacy
port_phys_cnt doesn't have enough bits to represent that.

Expose a new field (phys_port_cnt_ex) to let users know.  Retrieve
number of ports via netlink and in case that fails populate
phys_port_cnt_ex with the legacy phys_port_cnt value.

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Use ibv_query_port_ex() in order to support devices with more than 255
ports.

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Expose phys_port_cnt_ex, the extended version of phys_port_cnt so
devices with more than 255 ports can be represented.

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Edward Srouji <edwards@nvidia.com>
Phys_port_cnt_ex allows the user to get a uint32 value which is
retrieved via netlink. Test it by making sure the value is equal to
phys_port_cnt if the number of ports is less or equal to 255.

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Edward Srouji <edwards@nvidia.com>
@yishaih
Copy link
Member Author

yishaih commented Apr 11, 2021

The PR was updated as was discussed above, thanks.

@yishaih yishaih merged commit 486ecb3 into linux-rdma:master Apr 21, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants