UCP: Implementation of routine to query datatype attributes by rdietric · Pull Request #8150 · openucx/ucx

rdietric · 2022-04-22T10:38:13Z

What

Implementation of the routine ucp_dt_query according to PR #8120. Currently, the only datatype attribute to query is the packed size.

Why ?

The size of a message or data transfer is performance-relevant information. Performance tools should be able to query it, similar to how MPI_Type_size queries the size of any MPI datatype.

How ?

For contig data types, the return value of ucp_contig_dt_elem_size() provides the element size.
For generic data types, a field packed_size has been added to ucp_dt_generic_t, which is set with the user-defined pack routine. When the size of the generic data type is queried, the value of this field is used. All other data types are not supported and UCS_ERR_INVALID_PARAM is returned as ucs_status_t.

A unit test verifies the expected behavior for contig, iov and generic data types. It also checks that UCS_ERR_UNSUPPORTED is returned if the generic datatype is queried before packing.
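
The dispatch described under "How ?" can be sketched as a small standalone model. Every `toy_` name below is hypothetical and only mirrors the description above; this is not the UCX implementation, and the status values merely stand in for UCS_OK, UCS_ERR_INVALID_PARAM and UCS_ERR_UNSUPPORTED.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for ucs_status_t values; not the UCX definitions. */
typedef enum {
    TOY_OK,
    TOY_ERR_INVALID_PARAM,
    TOY_ERR_UNSUPPORTED
} toy_status_t;

/* Hypothetical datatype classes, modelled on the cases named in the text. */
typedef enum {
    TOY_DT_CONTIG,
    TOY_DT_IOV,
    TOY_DT_GENERIC
} toy_dt_class_t;

typedef struct {
    toy_dt_class_t dt_class;
    size_t         elem_size;   /* used by contiguous types only */
    size_t         packed_size; /* generic types: 0 until the pack path sets it */
} toy_datatype_t;

/* Assumed to be called by the pack path once the packed size is known. */
static void toy_generic_set_packed_size(toy_datatype_t *dt, size_t size)
{
    assert(dt != NULL);
    dt->packed_size = size;
}

static toy_status_t toy_dt_query(const toy_datatype_t *dt, size_t *packed_size)
{
    if ((dt == NULL) || (packed_size == NULL)) {
        return TOY_ERR_INVALID_PARAM;
    }

    switch (dt->dt_class) {
    case TOY_DT_CONTIG:
        *packed_size = dt->elem_size;
        return TOY_OK;
    case TOY_DT_GENERIC:
        if (dt->packed_size == 0) {
            return TOY_ERR_UNSUPPORTED; /* queried before packing */
        }
        *packed_size = dt->packed_size;
        return TOY_OK;
    default:
        return TOY_ERR_INVALID_PARAM;   /* e.g. iov in this first version */
    }
}
```

In this model, a generic type reports a size only after `toy_generic_set_packed_size` has run, which is the behavior the unit test is described as checking.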

yosefe · 2022-05-09T11:18:38Z

/azp run

azure-pipelines · 2022-05-09T11:18:59Z

Azure Pipelines successfully started running 3 pipeline(s).

rdietric · 2022-06-03T07:45:15Z

Since nobody reviewed yet, I rebased to have the new ucp.h with ucp_dt_query in the branch. I also fixed a bug in the implementation and another issue in the unit test. The current pipeline failures seem unrelated to this PR.

brminich · 2022-06-08T07:03:04Z

@rakhmets, can you please review?

brminich

only minor comments

brminich · 2022-06-08T15:30:05Z

-    *datatype_p     = ucp_dt_from_generic(dt_gen);
+    dt_gen->ops         = *ops;
+    dt_gen->context     = context;
+    dt_gen->packed_size = 0;


btw, maybe use SIZE_MAX as an indication of uninitialized type? I'd guess that zero-sized derived datatype is possible in theory, but currently query routine will always return UNSUPPORTED for such types.

I actually used 0 to allow checks like if (attr->packed_size), but I changed it according to your suggestion, so we can also use SIZE_MAX now. I think that a datatype of size 0 does not make sense. If you still think it's better to initialize with SIZE_MAX, I am okay doing so, especially since I probably don't know all the data descriptions/types that are possible with UCX.

I think having a 0-sized type is more likely than SIZE_MAX, so I'd change it.
@yosefe, wdyt?
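
The trade-off between the two initial values can be illustrated with a short standalone sketch. The struct and function names are hypothetical; only `SIZE_MAX` (from `<stdint.h>`) is standard C. With `SIZE_MAX` as the "not set" marker, a packed size of 0 stays a legal, distinguishable result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical marker meaning "the pack routine has not run yet". */
#define DEMO_PACKED_SIZE_UNSET SIZE_MAX

typedef struct {
    size_t packed_size;
} demo_generic_dt_t;

static void demo_generic_init(demo_generic_dt_t *dt)
{
    assert(dt != NULL);
    dt->packed_size = DEMO_PACKED_SIZE_UNSET;
}

/* Returns 1 and stores the size if it is known, 0 otherwise. */
static int demo_generic_packed_size(const demo_generic_dt_t *dt, size_t *size)
{
    if (dt->packed_size == DEMO_PACKED_SIZE_UNSET) {
        return 0;
    }

    *size = dt->packed_size;
    return 1;
}
```

The cost is that a plain truthiness check such as `if (packed_size)` no longer detects the unset state; callers have to compare against the marker instead.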

I don't think we can assume that packed size of bufferA is same as packed size of bufferB. The packing logic depends on the contents - consider compression.
Perhaps dt_query API should also accept buffer pointer and count.

You are right, if we consider compression, the packed size of a generic datatype can be different depending on the data. I thought of packing more like C struct packing/padding.
This basically means that the packed size can be different for each call to a communication function. Hence, the query would always return the same value for contiguous datatypes and could return different values after each communication operation for the same generic datatype. The function then still serves its purpose and allows me to find out how much data is transferred. However, this behavior has to be documented and I've to check what changes are needed in the implementation.

yes, IMO we should pass buffer and count and return the packed size of that buffer.
we need to decide on this before the release that freezes the API.
@shamisp WDYT?

Has a decision been made here? I agree with @yosefe that passing buffer and count and returning the buffer size fits the existing UCP datatypes better. The user of ucp_dt_query can still pass "1" as count to return the size of one element.

@shamisp ping on this

@yosefe This is a good question actually. In our recent research work the size also depended on the target EP in addition to buffer and count - think about DT conversion across different architectures. Do we use this for any of the MPI calls? I don't think MPI passes buffer (I have to double check)

I did not come across the EP during implementation. The current implementation basically returns the last datatype size, which has been determined during the regular message/transfer preparation. This should include the EP.

yosefe · 2022-08-17T08:06:32Z

 * @return Error code as defined by @ref ucs_status_t
 */
-ucs_status_t ucp_dt_query(ucp_datatype_t datatype, ucp_datatype_attr_t *attr);
+ucs_status_t ucp_dt_query(ucp_datatype_t datatype, const void *buffer,


@shamisp do you think it would be better to make buffer,count parameters optional using a struct?
PS. This API was not released yet

@yosefe - I think it is a good idea.

@rdietric let's move buffer,count into ucp_datatype_attr_t (with corresponding flags)
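
A rough sketch of what optional `buffer` and `count` inputs behind a `field_mask` could look like, combined with a toy content-dependent packer in the spirit of the compression argument above. All `sketch_`/`SKETCH_` names are hypothetical, and the "drop zero bytes" packer is an invented stand-in, not anything UCX does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    SKETCH_ATTR_FIELD_PACKED_SIZE = 1u << 0, /* caller asks for packed_size */
    SKETCH_ATTR_FIELD_BUFFER      = 1u << 1, /* caller filled in buffer */
    SKETCH_ATTR_FIELD_COUNT       = 1u << 2  /* caller filled in count */
};

typedef struct {
    uint64_t    field_mask;
    size_t      packed_size; /* output */
    const void *buffer;      /* optional input */
    size_t      count;       /* optional input; bytes for this toy type */
} sketch_attr_t;

/* Toy packer: zero bytes are dropped, so the packed size depends on content. */
static size_t sketch_packed_size(const unsigned char *data, size_t count)
{
    size_t i, size = 0;

    for (i = 0; i < count; ++i) {
        size += (data[i] != 0);
    }
    return size;
}

/* Returns 0 on success, -1 if the supplied fields do not determine the size. */
static int sketch_query(sketch_attr_t *attr)
{
    const uint64_t needed = SKETCH_ATTR_FIELD_BUFFER | SKETCH_ATTR_FIELD_COUNT;

    assert(attr != NULL);
    if (((attr->field_mask & needed) != needed) || (attr->buffer == NULL)) {
        return -1; /* unflagged fields are never read */
    }

    if (attr->field_mask & SKETCH_ATTR_FIELD_PACKED_SIZE) {
        attr->packed_size = sketch_packed_size(attr->buffer, attr->count);
    }
    return 0;
}
```

Keeping the inputs in the struct means the function signature stays fixed if further optional inputs are added later; callers opt in per field through the mask.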

yosefe · 2022-08-17T08:07:44Z

+        return UCS_ERR_INVALID_PARAM;
+    case UCP_DATATYPE_GENERIC:
+        dt_gen = ucp_dt_to_generic(datatype);
+        ucs_assert(NULL != dt_gen);


return invalid param if NULL

yosefe · 2022-08-22T10:27:47Z

 */
 enum ucp_datatype_attr_field {
-    UCP_DATATYPE_ATTR_FIELD_PACKED_SIZE = UCS_BIT(0) /**< packed datatype size */
+    /** Query the packed datatype size. */


/** @ref ucp_datatype_attr_t::packed_size field is queried. */

yosefe · 2022-08-22T10:28:19Z

 * @brief UCP datatype attributes
 *
- * This structure provides attributes that can be queried for a UCP datatype.
+ * This structure provides attributes of a UCP datatype.


This structure provides attributes for querying a UCP datatype
@tonycurtis WDYT?

yosefe · 2022-08-22T10:28:42Z

+     * Number of elements in @a buffer.
+     * This value is optional.
+     * If @ref UCP_DATATYPE_ATTR_FIELD_COUNT is not set in @ref field_mask, the
+     * value of this field defaults to 0.


maybe the default should be 1?

I am not sure. 0 should basically signal that this value is not set (invalid). 0 would also enable zero-initialization of all fields after packed_size.
Since you bring this up, what do we expect the packed_size to be for a contiguous datatype, when count >1 is passed? count * sizeOfDatatype?

IMO the default should be "1" and for contig type we should multiply it by sizeOfDatatype
@shamisp WDYT?

yosefe · 2022-08-22T10:32:05Z

+        return UCS_ERR_INVALID_PARAM;
+    case UCP_DATATYPE_GENERIC:
+        dt_gen = ucp_dt_to_generic(datatype);
+        ucs_assert(NULL != dt_gen);


return invalid param if NULL

yosefe · 2022-08-22T12:44:55Z

LGTM besides :

API doc review by @tonycurtis
Set default value of count to 1 and use it for contig types
Fix CI failures

yosefe · 2022-08-22T14:18:02Z

    case UCP_DATATYPE_CONTIG:
        attr->packed_size = ucp_contig_dt_elem_size(datatype);
+
+        if (attr->field_mask & UCP_DATATYPE_ATTR_FIELD_COUNT) {


we can treat count as default 1 also for generic and iov

Makes sense and then also fits to the API documentation.

yosefe · 2022-08-22T14:55:41Z

+    if (attr->field_mask & UCP_DATATYPE_ATTR_FIELD_COUNT) {
+        count = attr->count;
+    }


don't initialize count during declaration
use UCP_PARAM_VALUE macro

Ok, UCP_ATTR_VALUE would work with enum ucp_datatype_attr_field. Or should I change attr to param in the enum? They seem to be used almost interchangeably.
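
For illustration, the "read the field if its flag is set, otherwise use a default" pattern that such a macro captures can be sketched as below. `EX_ATTR_VALUE` and the other `ex_`/`EX_` names are hypothetical look-alikes written for this example; the actual definitions of UCP_PARAM_VALUE and UCP_ATTR_VALUE are not reproduced here. The default of 1 and the multiplication for contiguous types follow the suggestion made earlier in this review.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    EX_ATTR_FIELD_PACKED_SIZE = 1u << 0,
    EX_ATTR_FIELD_COUNT       = 1u << 1
};

typedef struct {
    uint64_t field_mask;
    size_t   packed_size; /* output */
    size_t   count;       /* optional input */
} ex_attr_t;

/* Hypothetical helper: field value if flagged in field_mask, else a default. */
#define EX_ATTR_VALUE(_attr, _name, _flag, _default) \
    (((_attr)->field_mask & EX_ATTR_FIELD_##_flag) ? (_attr)->_name : (_default))

/* Toy contiguous query: packed size is element size times element count. */
static void ex_contig_query(size_t elem_size, ex_attr_t *attr)
{
    size_t count;

    assert(attr != NULL);
    count             = EX_ATTR_VALUE(attr, count, COUNT, 1);
    attr->packed_size = elem_size * count;
}
```

Note that `count` is assigned from the macro rather than initialized at its declaration, and that a stale value in `attr->count` has no effect unless the COUNT flag is set.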

yosefe · 2022-08-23T06:06:01Z

@tonycurtis @shamisp can you pls review?

brminich · 2022-08-23T06:15:36Z

+        datatype_attr.buffer              = buf;
+        datatype_attr.field_mask         |= UCP_DATATYPE_ATTR_FIELD_BUFFER;


indentation

brminich · 2022-08-23T06:15:46Z

+        datatype_attr.buffer                  = buf;
+        datatype_attr.field_mask             |= UCP_DATATYPE_ATTR_FIELD_BUFFER;


indentation

yosefe

pls squash

rdietric · 2022-08-29T13:31:36Z

I squashed the commits and all checks have passed.

yosefe · 2022-08-31T08:17:29Z

👍
@tonycurtis @shamisp can you pls review the API change?

tonycurtis · 2022-08-31T11:29:25Z

 * @brief UCP datatype attributes
 *
- * This structure provides attributes that can be queried for a UCP datatype.
+ * This structure provides attributes of a UCP datatype.


tonycurtis · 2022-08-31T11:30:55Z

+     * Number of elements in @a buffer.
+     * This value is optional.
+     * If @ref UCP_DATATYPE_ATTR_FIELD_COUNT is not set in @ref field_mask, the
+     * value of this field defaults to 1.


would 0 be a better default size for a NULL buffer?

Agree with @tonycurtis comment - 1 is odd

We've had this topic before (#8150 (comment)). Intuitively, I would also have taken 0, but there are also arguments for 1. I do not have a strong opinion on this.

@tonycurtis @shamisp if the default is 0, it means that if this parameter is not supplied the function will always return 0. and that is essentially as if we're making this parameter mandatory.
IMO it should be 1 by default since the neutral value w.r.t. multiplication is 1.

If @ref UCP_DATATYPE_ATTR_FIELD_COUNT is not set in @ref field_mask, the function executes the query for a single ucp_datatype_t . I think this sounds better. @tonycurtis ?

@yosefe, @tonycurtis approved the above, and I think it sounds better without changing the meaning.

tonycurtis · 2022-09-04T15:39:48Z

A NULL pointer encapsulating data of length 1 seems weird. Maybe the semantics of this routine need to be revisited? tony

yosefe · 2022-09-04T15:55:09Z

A NULL pointer encapsulating data of length 1 seems weird. Maybe the semantics of this routine need to be revisited? tony

The pointer itself is not mandatory for calculating the size. The single mandatory parameter is the datatype itself, and count=1 by default would mean to calculate the size of a single element.

yosefe · 2022-09-19T07:10:21Z

@shamisp @tonycurtis WDYT?

tonycurtis · 2022-09-21T23:55:02Z

Yeah, that sounds good

rdietric mentioned this pull request Apr 22, 2022

UCP: Add API to query datatype attributes #8120

Merged

rdietric force-pushed the ucp_datatype/query_size_impl branch 4 times, most recently from 48e3dbb to 28be810 Compare June 2, 2022 07:12

brminich reviewed Jun 8, 2022

Comment thread src/ucp/dt/dt.c Outdated

Comment thread src/ucp/dt/datatype_iter.inl Outdated

Comment thread src/ucp/dt/dt.c Outdated

Comment thread src/ucp/dt/dt_generic.c Outdated

rakhmets reviewed Jun 8, 2022

Comment thread test/gtest/ucp/test_ucp_dt.cc

Comment thread test/gtest/ucp/test_ucp_dt.cc Outdated

brminich reviewed Jun 8, 2022

yosefe reviewed Aug 17, 2022

rdietric force-pushed the ucp_datatype/query_size_impl branch 2 times, most recently from f933aeb to 8b4bffb Compare August 18, 2022 06:08

yosefe reviewed Aug 21, 2022

yosefe reviewed Aug 22, 2022

yosefe previously approved these changes Aug 23, 2022

yosefe added the API label Aug 23, 2022

brminich previously approved these changes Aug 23, 2022

rdietric dismissed stale reviews from brminich and yosefe via d238631 August 23, 2022 06:55

brminich previously approved these changes Aug 23, 2022

yosefe previously approved these changes Aug 28, 2022

UCP: Implementation of ucp_dt_query routine

0718940

Changed the API so that buffer and count are optional input arguments via the `ucp_datatype_attr` parameter. Also added a unit test.

rdietric dismissed yosefe’s stale review via 0718940 August 28, 2022 11:33

rdietric dismissed brminich’s stale review via 0718940 August 28, 2022 11:33

rdietric force-pushed the ucp_datatype/query_size_impl branch from d238631 to 0718940 Compare August 28, 2022 11:33

tonycurtis suggested changes Aug 31, 2022

yosefe merged commit 3cf9746 into openucx:master Sep 22, 2022
