Question: Is AMD processor + Intel compiler supported by SCHISM? #77
Comments
AMD is picky. We used to get the same problem on an AMD cluster using the Intel compiler. Dan recently found that it is related to the MPI implementation and requires a few changes in the batch scripts: unlimiting the stack size and setting a parameter related to Intel MPI:
export UCX_UNIFIED_MODE=y
…-Joseph
Y. Joseph Zhang
Web: schism.wiki
Office: 804 684 7466
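As a rough sketch of the two batch-script changes described above, assuming a bash job script launched with `mpirun` (the scheduler directives are omitted, and the process count and executable name in the commented launch line are placeholders, not taken from this thread):

```shell
#!/bin/bash
# Sketch of the two batch-script changes mentioned above.
# Assumption: the job script runs under bash; scheduler directives are omitted.

# Remove the stack size limit for this shell and its child processes.
# This can fail if the hard limit is lower, so report it instead of aborting.
if ! ulimit -s unlimited 2>/dev/null; then
    echo "warning: could not unlimit stack size (current: $(ulimit -s))" >&2
fi

# UCX setting quoted in the comment above.
export UCX_UNIFIED_MODE=y

# Hypothetical launch line; adjust the launcher, core count and binary name.
# mpirun -np 4 ./pschism
```

Both settings need to be in effect in the shell that starts the MPI launcher, so they would normally go before the launch line rather than in a login profile.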
From: Soroosh Mani ***@***.***>
Sent: Friday, August 19, 2022 4:07 PM
To: schism-dev/schism ***@***.***>
Cc: Subscribed ***@***.***>
Subject: [schism-dev/schism] Question: Is AMD processor + Intel compiler supported by SCHISM? (Issue #77)
[EXTERNAL to VIMS received message]
I'm trying this combination on ParallelWorks platform where they have AWS HPC6a instances (AMD) and I'm using the same Intel compilers (2021.3.0) that I used on Intel to run it, but the run doesn't go through, I get a segfault. So I was wondering if there are any known issues with this combination?
Interesting! Any idea how the model performs on desktop-grade AMD processors with GCC? To put it a different way, is the performance comparable between Intel i7 and Ryzen 5 processors? Thanks.
@josephzhang8, should setting |
I see, thank you.
Looks like an MPI implementation issue. Not sure.
…-Joseph
Y. Joseph Zhang
From: Soroosh Mani ***@***.***>
Sent: Monday, August 22, 2022 12:41 PM
I still see the same issue on the hpc6a platform with the
ulimit -s unlimited
export UCX_UNIFIED_MODE=y
environment. I get the following errors in my run logs. First, one of the following lines for each core:
MPI startup(): Warning: I_MPI_PMI_LIBRARY will be ignored since the hydra process manager was found
which I think is due to how the ParallelWorks environment is set up. And then one of these for each core:
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
pschism_PAHM_TVD- 00000000006F71DA for__signal_handl Unknown Unknown
libpthread-2.17.s 00002AFBEEFD8630 Unknown Unknown Unknown
libshm-fi.so 00002AFCFA21A98A Unknown Unknown Unknown
libshm-fi.so 00002AFCFA2078BE Unknown Unknown Unknown
libshm-fi.so 00002AFCFA2026B9 Unknown Unknown Unknown
libshm-fi.so 00002AFCFA202F23 Unknown Unknown Unknown
libefa-fi.so 00002AFCFAA08E31 Unknown Unknown Unknown
libefa-fi.so 00002AFCFAA11945 Unknown Unknown Unknown
libefa-fi.so 00002AFCFAA077A9 Unknown Unknown Unknown
libefa-fi.so 00002AFCFAA07865 Unknown Unknown Unknown
libmpi.so.12.0.0 00002AFBEDB26E84 Unknown Unknown Unknown
libmpi.so.12.0.0 00002AFBEDE1117B Unknown Unknown Unknown
libmpi.so.12.0.0 00002AFBEDE18094 Unknown Unknown Unknown
libmpi.so.12.0.0 00002AFBEDA0746A Unknown Unknown Unknown
libmpi.so.12.0.0 00002AFBEDA7BAF0 Unknown Unknown Unknown
libmpi.so.12.0.0 00002AFBEDA6616B Unknown Unknown Unknown
libmpi.so.12.0.0 00002AFBEDA54748 MPI_Comm_dup Unknown Unknown
libmpifort.so.12. 00002AFBED4F260B pmpi_comm_dup_ Unknown Unknown
pschism_PAHM_TVD- 0000000000448D6E Unknown Unknown Unknown
pschism_PAHM_TVD- 0000000000410794 Unknown Unknown Unknown
pschism_PAHM_TVD- 00000000004106A2 Unknown Unknown Unknown
libc-2.17.so 00002AFBEF207555 __libc_start_main Unknown Unknown
pschism_PAHM_TVD- 00000000004105A9 Unknown Unknown Unknown
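The trace above passes through `libefa-fi.so` and `libshm-fi.so` beneath `MPI_Comm_dup`, which suggests the crash happens inside the libfabric providers rather than in SCHISM code. The following is a diagnostic sketch only, not a fix confirmed anywhere in this thread. It assumes a bash job script, that Intel MPI honors `I_MPI_DEBUG`, that the bundled libfabric honors `FI_PROVIDER` and includes a `tcp` provider, and that the launch line (a placeholder) is adapted to the actual run:

```shell
#!/bin/bash
# Diagnostic sketch only; nothing here is confirmed as a fix in this thread.

# The startup warning appears when I_MPI_PMI_LIBRARY is set but the job is
# launched through the hydra process manager; unsetting it avoids the warning.
unset I_MPI_PMI_LIBRARY

# Ask Intel MPI for more startup detail (assumption: this includes the
# libfabric provider that was selected).
export I_MPI_DEBUG=5

# Assumption: forcing a different libfabric provider for one test run helps
# tell whether the crash is specific to the efa/shm providers.
export FI_PROVIDER=tcp

# List the providers libfabric can see, if the fi_info utility is installed.
if command -v fi_info >/dev/null 2>&1; then
    fi_info -l
else
    echo "fi_info not found; skipping provider listing" >&2
fi

# Hypothetical launch line; adjust the core count and binary name.
# mpirun -np 4 ./pschism
```

If the run completes with a different provider, that would point toward the EFA or shared-memory provider on the hpc6a nodes; if it still crashes, the provider choice is probably not the cause.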