support for MPI data count larger than INT_MAX #43
StarPU does not support it so far (e.g., https://gitlab.inria.fr/starpu/starpu/-/blob/master/src/drivers/mpi/driver_mpi_common.c?ref_type=heads#L300). I guess you are using starpu_mpi_task_insert etc., not the MPI master-slave driver support, so it's rather the MPI datatype definition from mpi/src/starpu_mpi_datatype.c that you need fixed. I understand that you have an urgent deadline. Which StarPU data type are you using in your application? |
Greetings,
I am using Chameleon actually. I am not completely sure how Chameleon
interacts with StarPU. I am going to use multiple precisions (FP64, FP32,
FP16, and FP8).
Best,
Jie
|
Ok, then I guess you are using a matrix descriptor from Chameleon? |
Yes, specifically, my customized descriptor.
|
How is it customized? Essentially, the question is which data interface it uses. |
Put another way: does it use starpu_mpi_interface_datatype_register? |
Sorry, I am having my lunch; I will answer your question in about 20 minutes.
|
Is |
(basically, we would just want to use the |
Chameleon uses both |
Then it must also be using starpu_mpi_interface_datatype_register. |
Yes, you are right, here is the type registration, although I do not understand it completely:
```c
void
starpu_cham_tile_interface_init()
{
    if ( starpu_interface_cham_tile_ops.interfaceid == STARPU_UNKNOWN_INTERFACE_ID )
    {
        starpu_interface_cham_tile_ops.interfaceid = starpu_data_interface_get_next_id();
#if defined(CHAMELEON_USE_MPI_DATATYPES)
#if defined(HAVE_STARPU_MPI_INTERFACE_DATATYPE_NODE_REGISTER)
        starpu_mpi_interface_datatype_node_register( starpu_interface_cham_tile_ops.interfaceid,
                                                     cti_allocate_datatype_node,
                                                     cti_free_datatype );
#else
        starpu_mpi_interface_datatype_register( starpu_interface_cham_tile_ops.interfaceid,
                                                cti_allocate_datatype,
                                                cti_free_datatype );
#endif
#endif
    }
}
```
This shows how Chameleon registers the tile (I thought the attributes set here might be relevant):
```c
void
starpu_cham_tile_register( starpu_data_handle_t *handleptr,
                           int home_node,
                           CHAM_tile_t *tile,
                           cham_flttype_t flttype )
{
    size_t elemsize = CHAMELEON_Element_Size( flttype );
    starpu_cham_tile_interface_t cham_tile_interface =
    {
        .id         = STARPU_CHAM_TILE_INTERFACE_ID,
        .flttype    = flttype,
        .dev_handle = (intptr_t)(tile->mat),
        .allocsize  = -1,
        .tilesize   = tile->m * tile->n * elemsize,
    };
    memcpy( &(cham_tile_interface.tile), tile, sizeof( CHAM_tile_t ) );

    /* Overwrite the flttype in case it comes from a data conversion */
    cham_tile_interface.tile.flttype = flttype;

    if ( tile->format & CHAMELEON_TILE_FULLRANK ) {
        cham_tile_interface.allocsize = tile->m * tile->n * elemsize;
    }
    else if ( tile->format & CHAMELEON_TILE_DESC ) { /* Needed in case starpu asks for it */
        cham_tile_interface.allocsize = tile->m * tile->n * elemsize;
    }
    else if ( tile->format & CHAMELEON_TILE_HMAT ) {
        /* For hmat, allocated data will be handled by the hmat library. StarPU cannot allocate it for the library */
        cham_tile_interface.allocsize = 0;
    }

    starpu_data_register( handleptr, home_node, &cham_tile_interface, &starpu_interface_cham_tile_ops );
}
```
|
Please also show cti_allocate_datatype_node. |
Here you go:
```c
#if defined(CHAMELEON_USE_MPI_DATATYPES)
int
cti_allocate_datatype_node( starpu_data_handle_t handle,
                            unsigned node,
                            MPI_Datatype *datatype )
{
    int ret;
    starpu_cham_tile_interface_t *cham_tile_interface = (starpu_cham_tile_interface_t *)
        starpu_data_get_interface_on_node( handle, node );
    size_t m  = cham_tile_interface->tile.m;
    size_t n  = cham_tile_interface->tile.n;
    size_t ld = cham_tile_interface->tile.ld;
    size_t elemsize = CHAMELEON_Element_Size( cham_tile_interface->flttype );

    ret = MPI_Type_vector( n, m * elemsize, ld * elemsize, MPI_BYTE, datatype );
    STARPU_ASSERT_MSG( ret == MPI_SUCCESS, "MPI_Type_vector failed" );

    ret = MPI_Type_commit( datatype );
    STARPU_ASSERT_MSG( ret == MPI_SUCCESS, "MPI_Type_commit failed" );

    return 0;
}

int
cti_allocate_datatype( starpu_data_handle_t handle,
                       MPI_Datatype *datatype )
{
    return cti_allocate_datatype_node( handle, STARPU_MAIN_RAM, datatype );
}

void
cti_free_datatype( MPI_Datatype *datatype )
{
    MPI_Type_free( datatype );
}
#endif
```
|
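The int-count limitation bites exactly here: MPI_Type_vector takes plain int arguments, so the size_t values n, m * elemsize, and ld * elemsize are truncated once a tile exceeds INT_MAX bytes in the relevant dimension. A minimal defensive sketch, reusing the STARPU_ASSERT_MSG macro from the snippet above (the helper name checked_int is hypothetical):
```c
#include <limits.h>

/* Hypothetical helper: narrow a size_t to int for MPI's count/stride
 * arguments, failing loudly on overflow instead of silently truncating. */
static int
checked_int( size_t value )
{
    STARPU_ASSERT_MSG( value <= (size_t)INT_MAX,
                       "size exceeds INT_MAX; MPI large-count support is needed" );
    return (int)value;
}
```
With that, the call above would read MPI_Type_vector( checked_int( n ), checked_int( m * elemsize ), checked_int( ld * elemsize ), MPI_BYTE, datatype ), turning silent corruption into an explicit error.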
Also, again, MPI_Type_vector_c. |
I didn't find any line that includes MPI_Type_vector_c. |
That's it: you want to use MPI_Type_vector_c. |
Where did you not find it? Put another way: which MPI implementation are you using? |
I see that notably openmpi doesn't seem to be providing the _c variants. |
Ok, Chameleon uses MPI_Type_vector, not MPI_Type_vector_c. |
If your MPI implementation supports MPI_Type_vector_c, that is what should be used there instead. |
I am using mpich; I mean Chameleon does not use MPI_Type_vector_c. |
So in the end it's the chameleon code that needs fixing. StarPU will however want to do the same for its predefined vector/matrix/etc. types, so keeping this issue open for that. |
mpich does have MPI_Type_vector_c. |
(mpich does so apparently since its version 4) |
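For concreteness, a sketch of what the allocator posted above could look like with the MPI-4 large-count variant; this is not an actual Chameleon patch, and the MPI_VERSION fallback guard is an assumption:
```c
/* Sketch: cti_allocate_datatype_node using MPI_Type_vector_c (MPI 4.0),
 * whose count/blocklength/stride parameters are MPI_Count (64-bit in
 * practice) rather than int, so tiles larger than INT_MAX bytes can be
 * described directly. Types and helpers are those from the snippet above. */
int
cti_allocate_datatype_node( starpu_data_handle_t handle,
                            unsigned node,
                            MPI_Datatype *datatype )
{
    int ret;
    starpu_cham_tile_interface_t *cham_tile_interface = (starpu_cham_tile_interface_t *)
        starpu_data_get_interface_on_node( handle, node );
    MPI_Count m  = cham_tile_interface->tile.m;
    MPI_Count n  = cham_tile_interface->tile.n;
    MPI_Count ld = cham_tile_interface->tile.ld;
    MPI_Count elemsize = CHAMELEON_Element_Size( cham_tile_interface->flttype );

#if MPI_VERSION >= 4
    ret = MPI_Type_vector_c( n, m * elemsize, ld * elemsize, MPI_BYTE, datatype );
#else
    /* Pre-MPI-4 fallback: the original int-based call, correct only while
     * every argument still fits in an int. */
    ret = MPI_Type_vector( (int)n, (int)(m * elemsize), (int)(ld * elemsize),
                           MPI_BYTE, datatype );
#endif
    STARPU_ASSERT_MSG( ret == MPI_SUCCESS, "MPI_Type_vector(_c) failed" );

    ret = MPI_Type_commit( datatype );
    STARPU_ASSERT_MSG( ret == MPI_SUCCESS, "MPI_Type_commit failed" );
    return 0;
}
```
Note that the send/receive paths need no change here: the committed datatype describes the whole tile, so the count passed to MPI_Send/MPI_Recv stays 1, which is exactly why fixing the datatype sidesteps the int count limit.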
Thanks a lot! I was reading StarPU, but it seems that I did not do well. So |
yes, that's the idea. For application-defined interfaces it's the application's datatype registration that needs updating. |
Thanks a lot, you helped a lot! |
Is your feature request related to a problem? Please describe.
MPI uses a 32-bit int as the data count (for example, in the signature of MPI_Send).
When we want to send a larger buffer (count > INT_MAX), we need to split the buffer into several chunks and send them one by one. However, StarPU does not support it so far. (e.g., https://gitlab.inria.fr/starpu/starpu/-/blob/master/src/drivers/mpi/driver_mpi_common.c?ref_type=heads#L300)
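For scale: a single dense FP64 matrix of 52000 x 52000 elements already holds 52000 * 52000 = 2.7e9 elements and occupies about 2.16e10 bytes, roughly ten times INT_MAX (2147483647), so it cannot be described by one int-counted MPI_BYTE message.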
Describe the solution you'd like
Split the buffer into several chunks:
The signatures of the affected functions (e.g., __starpu_mpi_common_send_to_device, __starpu_mpi_common_send, etc.) would need to change correspondingly; a sketch follows.
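A minimal sketch of the chunking idea, assuming plain MPI point-to-point calls; the helper name send_large_buffer and its signature are illustrative, not StarPU's actual internal API:
```c
#include <limits.h>
#include <stddef.h>
#include <mpi.h>

/* Illustrative helper: send a byte buffer of arbitrary length by slicing
 * it into pieces that each fit in MPI_Send's int count argument. */
static int
send_large_buffer( const char *buf, size_t len, int dst, int tag, MPI_Comm comm )
{
    size_t offset = 0;
    while ( offset < len )
    {
        size_t chunk = len - offset;
        if ( chunk > (size_t)INT_MAX )
            chunk = (size_t)INT_MAX;

        int ret = MPI_Send( buf + offset, (int)chunk, MPI_BYTE, dst, tag, comm );
        if ( ret != MPI_SUCCESS )
            return ret;

        offset += chunk;
    }
    return MPI_SUCCESS;
}
```
The matching receive must slice with the same arithmetic; since MPI guarantees ordered delivery between a pair of ranks on a given communicator and tag, the receiver can simply loop with identical chunk sizes.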
Describe alternatives you've considered
N/A
Additional context
N/A