Add support for Darwin 32-bit and PPC #5916

Merged on May 2, 2023 (28 commits)

barracuda156
Contributor

@barracuda156 barracuda156 commented Feb 27, 2023

Fixes #5769

Test suite almost passes:

95% tests passed, 2 tests failed out of 39

@crtrott @PhilMiller Please take a look.

@dalg24-jenkins
Collaborator

Can one of the admins verify this patch?

@barracuda156 barracuda156 changed the title from "Darwin" to "Add support for Darwin 32-bit and PPC" Feb 27, 2023
@dalg24
Member

dalg24 commented Feb 27, 2023

Please issue your pull request against the develop branch. We will handle cherry-picking into release candidate branches as necessary. This patch likely missed the train for 4.0 but has chances to make the first patch release.

core/unit_test/Makefile (outdated, resolved)
@barracuda156
Contributor Author

> Please issue your pull request against the develop branch. We will handle cherry-picking into release candidate branches as necessary. This patch likely missed the train for 4.0 but has chances to make the first patch release.

@dalg24 Got it, will be done.

@barracuda156
Contributor Author

barracuda156 commented Mar 3, 2023

UPD. False alarm, I just needed to switch base to develop. Should be good now.

@barracuda156 barracuda156 changed the base branch from release-candidate-4.0.0 to develop March 3, 2023 16:37
core/src/impl/Kokkos_ClockTic.hpp (outdated, resolved)
CMakeLists.txt (outdated, resolved)
@barracuda156
Contributor Author

Current result with tests:

--->  Testing kokkos-devel
Executing:  cd "/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_kokkos-devel/kokkos-devel/work/build" && /usr/bin/make test 
Running tests...
/opt/local/bin/ctest --force-new-ctest-process 
Test project /opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_kokkos-devel/kokkos-devel/work/build
      Start  1: KokkosCore_UnitTest_Serial1
 1/34 Test  #1: KokkosCore_UnitTest_Serial1 ..................   Passed  233.86 sec
      Start  2: KokkosCore_UnitTest_Serial2
 2/34 Test  #2: KokkosCore_UnitTest_Serial2 ..................   Passed  705.07 sec
      Start  3: KokkosCore_UnitTest_SerialGraph
 3/34 Test  #3: KokkosCore_UnitTest_SerialGraph ..............   Passed    0.10 sec
      Start  4: KokkosCore_UnitTest_OpenMP
 4/34 Test  #4: KokkosCore_UnitTest_OpenMP ...................   Passed  1348.97 sec
      Start  5: KokkosCore_UnitTest_OpenMPInterOp
 5/34 Test  #5: KokkosCore_UnitTest_OpenMPInterOp ............   Passed    0.09 sec
      Start  6: KokkosCore_UnitTest_OpenMPGraph
 6/34 Test  #6: KokkosCore_UnitTest_OpenMPGraph ..............   Passed    0.08 sec
      Start  7: KokkosCore_UnitTest_Default
 7/34 Test  #7: KokkosCore_UnitTest_Default ..................   Passed    0.46 sec
      Start  8: KokkosCore_UnitTest_LegionInitialization
 8/34 Test  #8: KokkosCore_UnitTest_LegionInitialization .....   Passed    0.06 sec
      Start  9: KokkosCore_UnitTest_PushFinalizeHook
 9/34 Test  #9: KokkosCore_UnitTest_PushFinalizeHook .........   Passed    0.05 sec
      Start 10: KokkosCore_UnitTest_Develop
10/34 Test #10: KokkosCore_UnitTest_Develop ..................   Passed    0.06 sec
      Start 11: KokkosCore_UnitTest_LogicalSpaces
11/34 Test #11: KokkosCore_UnitTest_LogicalSpaces ............   Passed    3.98 sec
      Start 12: KokkosCore_UnitTest_KokkosP
12/34 Test #12: KokkosCore_UnitTest_KokkosP ..................   Passed    0.07 sec
      Start 13: KokkosCore_UnitTest_ToolIndependence
13/34 Test #13: KokkosCore_UnitTest_ToolIndependence .........   Passed    0.05 sec
      Start 14: KokkosCore_ProfilingTestLibraryLoadHelp
14/34 Test #14: KokkosCore_ProfilingTestLibraryLoadHelp ......   Passed    0.05 sec
      Start 15: KokkosCore_ProfilingTestLibraryCmdLineHelp
15/34 Test #15: KokkosCore_ProfilingTestLibraryCmdLineHelp ...   Passed    0.05 sec
      Start 16: KokkosCore_ProfilingTestLibraryLoad
16/34 Test #16: KokkosCore_ProfilingTestLibraryLoad ..........   Passed    0.05 sec
      Start 17: KokkosCore_ProfilingTestLibraryCmdLine
17/34 Test #17: KokkosCore_ProfilingTestLibraryCmdLine .......   Passed    0.05 sec
      Start 18: KokkosCore_UnitTest_StackTraceTest
18/34 Test #18: KokkosCore_UnitTest_StackTraceTest ...........   Passed   10.50 sec
      Start 19: KokkosCore_UnitTest_HWLOC
19/34 Test #19: KokkosCore_UnitTest_HWLOC ....................   Passed    0.06 sec
      Start 20: KokkosCore_IncrementalTest_OPENMP
20/34 Test #20: KokkosCore_IncrementalTest_OPENMP ............   Passed    2.56 sec
      Start 21: KokkosCore_IncrementalTest_SERIAL
21/34 Test #21: KokkosCore_IncrementalTest_SERIAL ............   Passed    2.72 sec
      Start 22: KokkosCore_UnitTest_CTestDevice
22/34 Test #22: KokkosCore_UnitTest_CTestDevice ..............   Passed    0.10 sec
      Start 23: KokkosCore_UnitTest_CMakePassCmdLineArgs0
23/34 Test #23: KokkosCore_UnitTest_CMakePassCmdLineArgs0 ....   Passed    0.04 sec
      Start 24: KokkosCore_UnitTest_DeviceAndThreads
24/34 Test #24: KokkosCore_UnitTest_DeviceAndThreads .........   Passed    0.90 sec
      Start 25: KokkosContainers_UnitTest_Serial
25/34 Test #25: KokkosContainers_UnitTest_Serial .............   Passed   74.56 sec
      Start 26: KokkosContainers_UnitTest_OpenMP
26/34 Test #26: KokkosContainers_UnitTest_OpenMP .............***Failed  Error regular expression found in output. Regex=[  FAILED  ] 32.17 sec
      Start 27: KokkosContainers_PerformanceTest_OpenMP
27/34 Test #27: KokkosContainers_PerformanceTest_OpenMP ......   Passed  197.71 sec
      Start 28: KokkosAlgorithms_UnitTest_RandomAndSort
28/34 Test #28: KokkosAlgorithms_UnitTest_RandomAndSort ......   Passed  329.85 sec
      Start 29: KokkosAlgorithms_UnitTest_StdSet_A
29/34 Test #29: KokkosAlgorithms_UnitTest_StdSet_A ...........   Passed    0.10 sec
      Start 30: KokkosAlgorithms_UnitTest_StdSet_B
30/34 Test #30: KokkosAlgorithms_UnitTest_StdSet_B ...........   Passed    0.15 sec
      Start 31: KokkosAlgorithms_UnitTest_StdSet_C
31/34 Test #31: KokkosAlgorithms_UnitTest_StdSet_C ...........   Passed   40.51 sec
      Start 32: KokkosAlgorithms_UnitTest_StdSet_D
32/34 Test #32: KokkosAlgorithms_UnitTest_StdSet_D ...........   Passed   38.50 sec
      Start 33: KokkosAlgorithms_UnitTest_StdSet_E
33/34 Test #33: KokkosAlgorithms_UnitTest_StdSet_E ...........   Passed    7.41 sec
      Start 34: KokkosSimd_UnitTest_SIMD
34/34 Test #34: KokkosSimd_UnitTest_SIMD .....................   Passed    0.07 sec

97% tests passed, 1 tests failed out of 34

Total Test time (real) = 3031.08 sec

The following tests FAILED:
	 26 - KokkosContainers_UnitTest_OpenMP (Failed)

Skipping works as intended:

/opt/local/var/macports/build/_opt_PPCSnowLeopardPorts_devel_kokkos-devel/kokkos-devel/work/kokkos-fc4a9cecfa4d309881847626ebff718bbbe6af8c/core/unit_test/TestTeamBasic.hpp:215: Skipped
Fails on 32-bit
[  SKIPPED ] serial.large_team_scratch_size (0 ms)

The single failure is the following:

[ RUN      ] openmp.scatterview
unknown file: Failure
C++ exception with description "Kokkos failed to allocate memory for label "duplicated_original_view".  Allocation using MemorySpace named "Host" failed with the following error:  Allocation of size 4 G failed, likely due to insufficient memory.  (The allocation mechanism was standard malloc().)
" thrown in the test body.
[  FAILED  ] openmp.scatterview (4 ms)
[ RUN      ] openmp.scatterview_devicetype
unknown file: Failure
C++ exception with description "Kokkos failed to allocate memory for label "duplicated_original_view".  Allocation using MemorySpace named "Host" failed with the following error:  Allocation of size 4 G failed, likely due to insufficient memory.  (The allocation mechanism was standard malloc().)
" thrown in the test body.
[  FAILED  ] openmp.scatterview_devicetype (3 ms)

@barracuda156
Contributor Author

One thing which is also necessary and not yet added into CMakeLists is linking to libatomic for 32-bit platforms. Can we add libatomic as required and pass target_link_libraries in the block where size of pointers is detected? Or should there be a separate test for atomics support?
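For illustration, here is a minimal sketch of the kind of probe a separate atomics test could compile and link. It is an assumption about how such a check might look, not code from this PR or from Kokkos; the names `g_probe_counter` and `probe_atomic_u64` are hypothetical. On some 32-bit targets the compiler implements 64-bit atomic operations as calls into libatomic, so a translation unit like this may only link with `-latomic`. Wrapped in a `main`, it could be handed to CMake's `check_cxx_source_compiles`, once without and once with `atomic` in the required libraries.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical probe, not taken from the Kokkos sources.
// A global is used so the compiler cannot fold the atomic operations away.
std::atomic<std::uint64_t> g_probe_counter{0};

// Exercises 64-bit store, fetch_add, compare-exchange and load.
// On targets without native 64-bit atomics these may become libatomic calls.
inline std::uint64_t probe_atomic_u64() {
  g_probe_counter.store(0);
  g_probe_counter.fetch_add(1, std::memory_order_relaxed);
  std::uint64_t expected = 1;
  g_probe_counter.compare_exchange_strong(expected, 42);
  return g_probe_counter.load();
}
```

A dedicated probe has the advantage of testing the actual requirement (64-bit atomics) rather than inferring it from the pointer size.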

@crtrott
Member

crtrott commented Mar 5, 2023

Regarding the two failing tests: can we in 32bit mode simply adjust the test size to not use more than 2G memory?

@crtrott
Member

crtrott commented Mar 5, 2023

Also does github offer any type of 32bit testing we could enable for our CI?

@barracuda156
Contributor Author

> Regarding the two failing tests: can we in 32bit mode simply adjust the test size to not use more than 2G memory?

@crtrott I do not know how to do that :)
If you or anyone can suggest a solution, it would be great.
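One possible shape for such an adjustment, sketched under assumptions: detect a 32-bit build from the pointer size and clamp the requested number of elements so that a single allocation stays within a 2 GiB budget. The names `two_gib`, `is_32bit_build` and `clamp_test_extent` are hypothetical and do not appear in the Kokkos test suite, and this is not necessarily how the failing ScatterView tests were eventually handled.

```cpp
#include <cstdint>

// Hypothetical byte budget for one test allocation on 32-bit builds.
constexpr std::uint64_t two_gib = std::uint64_t{1} << 31;

// True when pointers are 4 bytes wide, i.e. a 32-bit address space.
constexpr bool is_32bit_build = sizeof(void*) == 4;

// Clamp the requested element count so that count * elem_size does not
// exceed the budget on 32-bit builds. elem_size must be non-zero.
constexpr std::uint64_t clamp_test_extent(std::uint64_t requested,
                                          std::uint64_t elem_size,
                                          bool is32 = is_32bit_build) {
  if (!is32) return requested;
  const std::uint64_t max_elems = two_gib / elem_size;
  return requested < max_elems ? requested : max_elems;
}
```

A test would then size its view with something like `clamp_test_extent(n, sizeof(double))`. Note that a view duplicated per thread multiplies the footprint, so the budget may need dividing by the thread count as well.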

@masterleinad
Contributor

> Also does github offer any type of 32bit testing we could enable for our CI?

I don't think there are any 32bit platforms available but we could create 32bit executables using -m32 for gcc if that's sufficient.

@barracuda156
Contributor Author

> > Also does github offer any type of 32bit testing we could enable for our CI?
>
> I don't think there are any 32bit platforms available but we could create 32bit executables using -m32 for gcc if that's sufficient.

IMO for most purposes that will do. I do half of my testing for ppc32 on 10.6.8 x86_64. This trick won’t work on other OSes, but i386 should be good, as long as the OS permits running 32-bit binaries (macOS from Catalina onward won’t).

@crtrott
Member

crtrott commented Mar 15, 2023

@masterleinad can you try adding a build with -m32 to this and also maybe help rebase the thing so we could get this moving?

Member

@dalg24 dalg24 left a comment

Looks good overall.

Please print something in Kokkos::print_configuration in 32-bit mode.
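As a rough illustration of this request, the sketch below writes a line describing the pointer width to an output stream. It is a standalone, hypothetical helper (`print_pointer_width` and `pointer_width_line` are made-up names) and makes no claim about how `Kokkos::print_configuration` is implemented or what the merged patch prints.

```cpp
#include <climits>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: report the pointer width of the current build.
inline void print_pointer_width(std::ostream& os) {
  os << "  Pointer width: " << sizeof(void*) * CHAR_BIT << "-bit\n";
}

// Convenience wrapper returning the same line as a string.
inline std::string pointer_width_line() {
  std::ostringstream oss;
  print_pointer_width(oss);
  return oss.str();
}
```

Calling `print_pointer_width(std::cout)` from a configuration dump would make a 32-bit build visible in bug reports without requiring the reporter to state it.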

CMakeLists.txt (outdated, resolved)
CMakeLists.txt (outdated, resolved)
.github/workflows/continuous-integration-workflow.yml (outdated, resolved)
@@ -51,8 +51,11 @@ struct test_vector_insert {
it += 17;
it_return = a.insert(it, n + 5, scalar_type(5));

using difference_type =
typename std::iterator_traits<decltype(it)>::difference_type;
Member

Any good reason not to use std::ptrdiff_t directly?
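To make the question concrete: for raw pointers, `std::iterator_traits<T*>::difference_type` is `std::ptrdiff_t` by definition, and the same holds for `std::vector` with the default allocator. The sketch below checks both and mirrors the `it += 17` step from the diff. It uses `std::vector` as a stand-in; whether the iterator of the container under test is a raw pointer is an assumption here, and `sample_offset` is a hypothetical name.

```cpp
#include <cstddef>
#include <iterator>
#include <type_traits>
#include <vector>

// For raw pointers the standard fixes difference_type to std::ptrdiff_t.
static_assert(std::is_same<std::iterator_traits<int*>::difference_type,
                           std::ptrdiff_t>::value,
              "pointer difference_type is std::ptrdiff_t");

// The same holds for std::vector with the default allocator.
static_assert(
    std::is_same<std::iterator_traits<std::vector<int>::iterator>::difference_type,
                 std::ptrdiff_t>::value,
    "vector iterator difference_type is std::ptrdiff_t");

// Hypothetical example: distance of an advanced iterator from begin().
inline std::ptrdiff_t sample_offset() {
  std::vector<int> v(40);
  auto it = v.begin();
  it += 17;
  return it - v.begin();
}
```

Either spelling yields a signed type whose width follows the platform, which is what matters on 32-bit targets; spelling it through `iterator_traits` only adds value if the iterator type might change.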

core/src/Kokkos_Core_fwd.hpp (resolved)
core/unit_test/TestDeepCopyAlignment.hpp (outdated, resolved)
core/unit_test/TestTeamReductionScan.hpp (outdated, resolved)
.github/workflows/continuous-integration-workflow.yml (outdated, resolved)
Member

I feel like this setup is becoming too complex.
Did you consider having the 32-bit build in a separate workflow?

Contributor

Sure, we can try that.

@masterleinad
Contributor

Retest this please.

@dalg24
Member

dalg24 commented May 1, 2023

Retest this please

@dalg24
Member

dalg24 commented May 1, 2023

38: [==========] 5 tests from 1 test suite ran. (588 ms total)
38: [  PASSED  ] 4 tests.
38: [  FAILED  ] 1 test, listed below:
38: [  FAILED  ] cuda.Random_XorShift1024_0
38: 
38:  1 FAILED TEST
38/44 Test #38: KokkosAlgorithms_UnitTest_RandomAndSort ......***Failed  Error regular expression found in output. Regex=[  FAILED  ]  1.31 sec

failed twice in a row.

@barracuda156
Contributor Author

> 38: [==========] 5 tests from 1 test suite ran. (588 ms total)
> 38: [  PASSED  ] 4 tests.
> 38: [  FAILED  ] 1 test, listed below:
> 38: [  FAILED  ] cuda.Random_XorShift1024_0
> 38: 
> 38:  1 FAILED TEST
> 38/44 Test #38: KokkosAlgorithms_UnitTest_RandomAndSort ......***Failed  Error regular expression found in output. Regex=[  FAILED  ]  1.31 sec
>
> failed twice in a row.

Did we make any changes for CUDA here? Should not be relevant for 32 bit, perhaps.

@masterleinad
Contributor

KokkosAlgorithms_UnitTest_RandomAndSort failed twice in a row.

@dalg24 I just ran the test manually 30 times in a row in Debug and Release mode without any failures.

- name: Configure Kokkos
run: |
cmake -B builddir \
-DCMAKE_INSTALL_PREFIX=/usr \
Member

This is the default; what is the point of specifying it?
Also, we don't even install. Please drop that option.

Comment on lines 38 to 39
-Ddesul_ROOT=/usr/desul-install/ \
-DKokkos_ENABLE_DESUL_ATOMICS_EXTERNAL=OFF \
Member

Why specify the desul root directory and then disable finding it and use the bundled version?

-DKokkos_ENABLE_TESTS=ON \
-DKokkos_ENABLE_BENCHMARKS=ON \
-DKokkos_ENABLE_EXAMPLES=ON \
-DKokkos_ENABLE_LIBDL=OFF \
Member

Why disable LIBDL?

Comment on lines 15 to 29
- name: Checkout desul
uses: actions/checkout@v3
with:
repository: desul/desul
ref: 477da9c8f40f8db369c28dd3f93a67e376d8511b
path: desul
- name: Install desul
working-directory: desul
run: |
git submodule init
git submodule update
mkdir build
cd build
cmake -DDESUL_ENABLE_TESTS=OFF -DCMAKE_INSTALL_PREFIX=/usr/desul-install ..
sudo cmake --build . --target install --parallel 2
Member

I was not convinced it was necessary to use external desul atomics for that CI build, but it is not even used below. Please remove.

-DCMAKE_BUILD_TYPE=RelWithDebInfo
- name: Build
run: |
ccache -z
Member

ccache is not actually used as the compiler launcher in the build.

Contributor

Here you go!

-DKokkos_ENABLE_TESTS=ON \
-DKokkos_ENABLE_BENCHMARKS=ON \
-DKokkos_ENABLE_EXAMPLES=ON \
-DKokkos_ENABLE_LIBDL=ON \
Member

I would have just omitted. Feel free to ignore.

Contributor

Here you go!

@masterleinad
Contributor

Only HIP-ROCm-5.2-C++20 is timing out.

@dalg24 dalg24 merged commit e5490e1 into kokkos:develop May 2, 2023
27 of 28 checks passed
@barracuda156 barracuda156 deleted the darwin branch May 2, 2023 23:13
@barracuda156
Contributor Author

Thanks to everyone for working on this!

@dalg24 dalg24 mentioned this pull request May 3, 2023
6 participants