Skip to content

UCP: Implementation of routine to query datatype attributes#8150

Merged
yosefe merged 1 commit intoopenucx:masterfrom
rdietric:ucp_datatype/query_size_impl
Sep 22, 2022
Merged

UCP: Implementation of routine to query datatype attributes#8150
yosefe merged 1 commit intoopenucx:masterfrom
rdietric:ucp_datatype/query_size_impl

Conversation

@rdietric
Copy link
Copy Markdown

What

Implementation of the the routine ucp_dt_query according to PR #8120. Currently, the only datatype attribute to query is the packed size.

Why ?

The size of a message or data transfer is a performance-relevant information. Performance tools should be able to query this information similar to MPI_Type_size to query the size of any MPI datatype

How ?

For contig data types, the return value of ucp_contig_dt_elem_size() provides the element size.
For generic data types, a field packed_size has been added to ucp_dt_generic_t, which is set with the user-defined pack routine. When the size of the generic data type is queried, the value of this field is used. All other data types are not supported and UCS_ERR_INVALID_PARAM is returned as ucs_status_t.

A unit tests verifies the expected behavior for contig, iov and generic data types. It is also checked that UCS_ERR_UNSUPPORTED is returned, if the generic datatype is queried before packing.

@yosefe
Copy link
Copy Markdown
Contributor

yosefe commented May 9, 2022

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 3 pipeline(s).

@rdietric rdietric force-pushed the ucp_datatype/query_size_impl branch 4 times, most recently from 48e3dbb to 28be810 Compare June 2, 2022 07:12
@rdietric
Copy link
Copy Markdown
Author

rdietric commented Jun 3, 2022

Since nobody reviewed yet, I rebased to have the new ucp.h with ucp_dt_query in the branch. I also fixed a bug in the implementation and another issue in the unit test. The current pipeline failures seem unrelated to this PR.

@brminich
Copy link
Copy Markdown
Contributor

brminich commented Jun 8, 2022

@rakhmets, can you please review?

Copy link
Copy Markdown
Contributor

@brminich brminich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only minor comments

Comment thread src/ucp/dt/dt.c Outdated
Comment thread src/ucp/dt/datatype_iter.inl Outdated
Comment thread src/ucp/dt/dt.c Outdated
Comment thread src/ucp/dt/dt_generic.c Outdated
Comment thread test/gtest/ucp/test_ucp_dt.cc
Comment thread test/gtest/ucp/test_ucp_dt.cc Outdated
Comment thread src/ucp/dt/dt_generic.c Outdated
*datatype_p = ucp_dt_from_generic(dt_gen);
dt_gen->ops = *ops;
dt_gen->context = context;
dt_gen->packed_size = 0;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw, maybe use SIZE_MAX as an indication of uninitialized type? I'd guess that zero-sized derived datatype is possible in theory, but currently query routine will always return UNSUPPORTED for such types.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually used 0 to allow checks like if (attr->packed_size), but I changed it according to your suggestion, so we can also use SIZE_MAX now. I think that a datatype of size 0 does not make sense. If you still think it's better to initialize with SIZE_MAX, I am okay doing so, especially since I probably don't know all the data descriptions/types that are possible with UCX.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think having 0-sized type is more likely than SIZE_MAX, so I'd change it.
@yosefe, wdyt?

Copy link
Copy Markdown
Contributor

@yosefe yosefe Jun 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we can assume that packed size of bufferA is same as packed size of bufferB. The packing logic depends on the contents - consider compression.
Perhaps dt_query API should also accept buffer pointer and count.

Copy link
Copy Markdown
Author

@rdietric rdietric Jun 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, if we consider compression, the packed size of a generic datatype can be different depending on the data. I thought of packing more like C struct packing/padding.
This basically means that the packed size can be different for each call to a communication function. Hence, the query would always return the same value for contiguous datatypes and could return different values after each communication operation for the same generic datatype. The function then still serves its purpose and allows me to find out how much data is transferred. However, this behavior has to be documented and I've to check what changes are needed in the implementation.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, IMO we should pass buffer and count and return the packed size of that buffer.
we need to decide on this before the release that freezes the API.
@shamisp WDYT?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has a decision been made here? I agree with @yosefe that passing buffer and count and return the buffer size fits better to the existing UCP datatypes. The user of ucp_dt_query can still pass "1" as count to return the size of one element.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shamisp ping on this

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yosefe This is good question actually. In our recent research work the size also depended on the target EP in addition to buffer and count - think about DT conversion across different architectures. Do we use this for any of the MPI call ? I don't think MPI passes buffer (I have to double check)

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not come across the EP during implementation. The current implementation basically returns the last datatype size, which has been determined during the regular message/transfer preparation. This should include the EP.

Comment thread src/ucp/api/ucp.h Outdated
Comment thread src/ucp/api/ucp.h Outdated
Comment thread src/ucp/dt/dt.c Outdated
Comment thread src/ucp/api/ucp.h Outdated
* @return Error code as defined by @ref ucs_status_t
*/
ucs_status_t ucp_dt_query(ucp_datatype_t datatype, ucp_datatype_attr_t *attr);
ucs_status_t ucp_dt_query(ucp_datatype_t datatype, const void *buffer,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shamisp do you think it would be better to make buffer,count parameters optional using a struct?
PS. This API was not released yet

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yosefe - I think it is a good idea.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rdietric let's move buffer,count into ucp_datatype_attr_t (with corresponding flags)

Comment thread src/ucp/dt/dt.c Outdated
Comment thread src/ucp/dt/dt.c Outdated
Comment thread src/ucp/dt/dt.c Outdated
return UCS_ERR_INVALID_PARAM;
case UCP_DATATYPE_GENERIC:
dt_gen = ucp_dt_to_generic(datatype);
ucs_assert(NULL != dt_gen);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return invalid param if NULL

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return invalid param if NULL

Comment thread src/ucp/dt/dt.c Outdated
Comment thread src/ucp/dt/dt_generic.c Outdated
Comment thread test/gtest/ucp/test_ucp_dt.cc Outdated
@rdietric rdietric force-pushed the ucp_datatype/query_size_impl branch 2 times, most recently from f933aeb to 8b4bffb Compare August 18, 2022 06:08
Comment thread src/ucp/api/ucp.h Outdated
Comment thread src/ucp/dt/dt.c
Comment thread test/gtest/ucp/test_ucp_dt.cc Outdated
Comment thread test/gtest/ucp/test_ucp_dt.cc Outdated
Comment thread src/ucp/api/ucp.h Outdated
* @return Error code as defined by @ref ucs_status_t
*/
ucs_status_t ucp_dt_query(ucp_datatype_t datatype, ucp_datatype_attr_t *attr);
ucs_status_t ucp_dt_query(ucp_datatype_t datatype, const void *buffer,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rdietric let's move buffer,count into ucp_datatype_attr_t (with corresponding flags)

Comment thread src/ucp/api/ucp.h Outdated
*/
enum ucp_datatype_attr_field {
UCP_DATATYPE_ATTR_FIELD_PACKED_SIZE = UCS_BIT(0) /**< packed datatype size */
/** Query the packed datatype size. */
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/** @ref ucp_datatype_attr_t::packed_size field is queried. */

Comment thread src/ucp/api/ucp.h
* @brief UCP datatype attributes
*
* This structure provides attributes that can be queried for a UCP datatype.
* This structure provides attributes of a UCP datatype.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This structure provides attributes for querying a UCP datatype
@tonycurtis WDYT?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree

Comment thread src/ucp/api/ucp.h Outdated
* Number of elements in @a buffer.
* This value is optional.
* If @ref UCP_DATATYPE_ATTR_FIELD_COUNT is not set in @ref field_mask, the
* value of this field defaults to 0.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe the default should be 1?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure. 0 should basically signal that this value is not set (invalid). 0 would also enable zero-initialization of all fields after packed_size.
Since you bring this up, what do we expect the packed_size to be for a contiguous datatype, when count >1 is passed? count * sizeOfDatatype?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO the default should be "1" and for contig type we should multiply it by sizeOfDatatype
@shamisp WDYT?

Comment thread src/ucp/api/ucp.h Outdated
Comment thread src/ucp/dt/dt.c Outdated
Comment thread src/ucp/dt/dt.c Outdated
Comment thread src/ucp/dt/dt.c Outdated
return UCS_ERR_INVALID_PARAM;
case UCP_DATATYPE_GENERIC:
dt_gen = ucp_dt_to_generic(datatype);
ucs_assert(NULL != dt_gen);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return invalid param if NULL

Comment thread test/gtest/ucp/test_ucp_dt.cc Outdated
Comment thread test/gtest/ucp/test_ucp_dt.cc Outdated
Comment thread test/gtest/ucp/test_ucp_dt.cc Outdated
@yosefe
Copy link
Copy Markdown
Contributor

yosefe commented Aug 22, 2022

LGTM besides :

  • API doc review by @tonycurtis
  • Set default value of count to 1 and use it for contig types
  • Fix CI failures

Comment thread src/ucp/dt/dt.c Outdated
case UCP_DATATYPE_CONTIG:
attr->packed_size = ucp_contig_dt_elem_size(datatype);

if (attr->field_mask & UCP_DATATYPE_ATTR_FIELD_COUNT) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can treat count as default 1 also for generic and iov

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense and then also fits to the API documentation.

Comment thread src/ucp/dt/dt.c Outdated
Comment on lines +157 to +159
if (attr->field_mask & UCP_DATATYPE_ATTR_FIELD_COUNT) {
count = attr->count;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't initialize count during declaration
use UCP_PARAM_VALUE macro

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, UCP_ATTR_VALUE would work with enum ucp_datatype_attr_field. Or should I change attr to param in the enum? It seems to be used almost interchangeable.

yosefe
yosefe previously approved these changes Aug 23, 2022
@yosefe
Copy link
Copy Markdown
Contributor

yosefe commented Aug 23, 2022

@tonycurtis @shamisp can you pls review?

@yosefe yosefe added the API label Aug 23, 2022
brminich
brminich previously approved these changes Aug 23, 2022
Comment thread test/gtest/ucp/test_ucp_dt.cc Outdated
Comment on lines +203 to +204
datatype_attr.buffer = buf;
datatype_attr.field_mask |= UCP_DATATYPE_ATTR_FIELD_BUFFER;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indentation

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Comment thread test/gtest/ucp/test_ucp_dt.cc Outdated
Comment on lines +218 to +219
datatype_attr.buffer = buf;
datatype_attr.field_mask |= UCP_DATATYPE_ATTR_FIELD_BUFFER;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indentation

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@rdietric rdietric dismissed stale reviews from brminich and yosefe via d238631 August 23, 2022 06:55
brminich
brminich previously approved these changes Aug 23, 2022
yosefe
yosefe previously approved these changes Aug 28, 2022
Copy link
Copy Markdown
Contributor

@yosefe yosefe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls squash

Changed the API so that buffer and count are optional input arguments
via the `ucp_datatype_attr` parameter. Also added a unit test.
@rdietric
Copy link
Copy Markdown
Author

I squashed the commits and all checks have passed.

@yosefe
Copy link
Copy Markdown
Contributor

yosefe commented Aug 31, 2022

👍
@tonycurtis @shamisp can you pls review the API change?

Comment thread src/ucp/api/ucp.h
* @brief UCP datatype attributes
*
* This structure provides attributes that can be queried for a UCP datatype.
* This structure provides attributes of a UCP datatype.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree

Comment thread src/ucp/api/ucp.h
* Number of elements in @a buffer.
* This value is optional.
* If @ref UCP_DATATYPE_ATTR_FIELD_COUNT is not set in @ref field_mask, the
* value of this field defaults to 1.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would 0 be a better default size for a NULL buffer?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with @tonycurtis comment - 1 is odd

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've had this topic before (#8150 (comment)). Intuitively, I would also have taken 0, but there are also arguments for 1. I do not have a strong opinion on this.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tonycurtis @shamisp if the default is 0, it means that if this parameter is not supplied the function will always return 0. and that is essentially as if we're making this parameter mandatory.
IMO it should be 1 by default since the neutral value w.r.t. multiplication is 1.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If @ref UCP_DATATYPE_ATTR_FIELD_COUNT is not set in @ref field_mask, the function executes the query for a single ucp_datatype_t . I think this sounds better. @tonycurtis ?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yosefe , @tonycurtis approved the above and I think it sounds better while does not change the meaning

@tonycurtis
Copy link
Copy Markdown
Contributor

tonycurtis commented Sep 4, 2022 via email

@yosefe
Copy link
Copy Markdown
Contributor

yosefe commented Sep 4, 2022

A NULL pointer encapsulating data of length 1 seems weird. Maybe the semantics of this routine need to be revisited? tony

The pointer itself is not mandatory for calculating the size. The single mandatory parameter is the datatype itself, and count=1 by default would mean to calculate the size of a single element.

@yosefe
Copy link
Copy Markdown
Contributor

yosefe commented Sep 19, 2022

A NULL pointer encapsulating data of length 1 seems weird. Maybe the semantics of this routine need to be revisited? tony

The pointer itself is not mandatory for calculating the size. The single mandatory parameter is the datatype itself, and count=1 by default would mean to calculate the size of a single element.

@shamisp @tonycurtis WDYT?

@tonycurtis
Copy link
Copy Markdown
Contributor

tonycurtis commented Sep 21, 2022 via email

@yosefe yosefe merged commit 3cf9746 into openucx:master Sep 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants