Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change powerPC clocktic into 64 bit version and remove 32 bit version #5577

Merged
merged 2 commits into from
Oct 21, 2022

Conversation

JBludau
Copy link
Contributor

@JBludau JBludau commented Oct 18, 2022

In #5502 we discovered that sometimes one of our tests fails on PowerPC. The failing test uses the ClockTic functionality to count the cycles that elapsed. The cycles counted as elapsed in the failing test is (in this case) 11'258'999'065'812. The test TestSharedSpace uses a difference of uint64_t returned by the ClockTic to measure timings independent of the current frequency the core is running on. The differences is then added in a parallel_reduce to compute the number of cycles needed for the memory accesses.
As the difference of unsigned should not be affected by any overflow, I suspect something being wrong with how the ClockTic is read from the registers. I ended up looking at the documentation I found for the registers here

Move to/from register functions

__mftb Purpose
Move from Time Base
In 32-bit compilation mode, returns the lower word of the time base register. In 64-bit mode, returns the entire doubleword of the time base register.
Prototype
unsigned long __mftb (void);
Usage
In 32-bit mode, this function can be used in conjunction with the__mftbu built-in function to read the entire time base register. In 64-bit mode, the entire doubleword of the time base register is returned, so separate use of __mftbu is unnecessary
It is recommended that you insert the __fence built-in function before and after the __mftb built-in function.

and

__mftbu Purpose
Move from Time Base Upper
Returns the upper word of the time base register. Prototype
unsigned int __mftbu (void);
Usage
In 32-bit mode you can use this function in conjunction with the __mftb built-in function to read the entire time base register
It is recommended that you insert the __fence built-in function before and after the __mftbu built-in function.

So I tried to split the implementation for PowerPC into a 32 and 64 bit version, as the 64 bit version apparently can read the register as one unsigned long value while the 32 bit needs to read two registers and add them like on x86.
If we have anyone who is good with asm and knows this kind of stuff: any help/insight/thing here is greatly appreciated.
Also I might be looking in a completely wrong direction here ...

@PhilMiller
Copy link
Contributor

Please revise the PR title, and I think this is good to go.

@dalg24
Copy link
Member

dalg24 commented Oct 20, 2022

@ndellingwood would you please confirm this resolves #5502

@JBludau
Copy link
Contributor Author

JBludau commented Oct 20, 2022

This might be hard for @ndellingwood to do, as the problem might not show in every run. I am currently trying with @ajpowelsnl to run it for an extended period of time to look into this.

@ndellingwood
Copy link
Contributor

@JBludau @dalg24 I'll build this PR on our Power9+Volta70 system and rerun the tests several times.
Since the failure is spurious with standalone kokkos testing we'll need to continue monitoring the test

@JBludau JBludau changed the title Split powerPC clocktic into 32 and 64 bit version Change powerPC clocktic into 64 bit version and remove 32 bit version Oct 20, 2022
@JBludau JBludau marked this pull request as ready for review October 20, 2022 20:42
@ndellingwood
Copy link
Contributor

Repeat runs of tests without failures, a good sign :)

@dalg24 dalg24 merged commit 051fd73 into kokkos:develop Oct 21, 2022
@dalg24 dalg24 mentioned this pull request Oct 21, 2022
@PhilMiller
Copy link
Contributor

Later introduction of the failing test aside, this may be worth cherry picking into 3.7.01

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants