Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MDEV-8684 ut relax cpu isnt relaxing #168

Conversation

grooverdan
Copy link
Member

Combination of fixes. Details in JIRA.

My contributions here are under the MCA.

Marko Mäkelä and others added 4 commits March 31, 2016 16:34
UT_RELAX_CPU(): Use a compiler barrier.

ut_delay(): Remove the dummy global variable ut_always_false.

RB: 11399
Reviewed-by: Jimmy Yang <jimmy.yang@oracle.com>

Backported from MySQL-5.7 - patch mysql/mysql-server@5e3efb0

Suggestion by Stewart Smith
Bug#18842925 : SET THREAD PRIORITY IN INNODB MUTEX SPINLOOP
Like "pause" instruction for hyper-threading at Intel CPUs,
POWER has special instructions only for hinting priority of hardware-threads.

Approved by Sunny in rb#6256

Backport of the 5.7 fix - mysql/mysql-server@c92102a
(excluded cache line size patch)

Suggestion by Stewart Smith
Also introduce compiler barrier properly on all architectures.
Using __ppc_get_timebase will translate to mfspr instruction
The mfspr instruction will block FXU1 until complete but the other
Pipelines are available for execution of instructions from other
SMT threads on the same core.

The latency time to read the timebase SPR is ~10 cycles.

So any impact on other threads is limited other FXU1 only instructions
(basically other mfspr/mtspr ops).

Suggested by Steven J. Munroe, Linux on Power Toolchain Architect,
Linux Technology Center
IBM Corporation
@janlindstrom
Copy link
Contributor

Hi, can you explain the change and have you done some performance testing on this ?

@@ -88,15 +88,25 @@ struct ut_when_dtor {
the YieldProcessor macro defined in WinNT.h. It is a CPU architecture-
independent way by using YieldProcessor. */
# define UT_RELAX_CPU() YieldProcessor()
# elif defined(HAVE_ATOMIC_BUILTINS)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here you have replaced HAVE_ATOMIC_BUILTINS to powerpc is HAVE_ATOMIC_BUILTINS defined only on powerpc or can this be executed on other platforms before the change ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Atomic operations aren't relaxing on any CPU and have effects beyond the CPU which is counter to what ut_delay wants to achieve.

@akopytov do you have a suggestion as to what ARM should do to relax a processor?
@akamat-ibm any suggestion on what Z should do?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@grooverdan I'm still evaluating possible options for ARM. I'll submit separate PRs for MySQL and likely MariaDB.

@grooverdan
Copy link
Member Author

I put the performance testing in https://jira.mariadb.org/browse/MDEV-8684

i.e. __ppc_set_ppr_low rather than 'or 1,1,1'
@grooverdan grooverdan changed the title 10.1 mdev 8684 ut relax cpu isnt relaxing MDEV-8684 ut relax cpu isnt relaxing Apr 1, 2016
# define UT_RELAX_CPU() do { \
volatile lint volatile_var; \
os_compare_and_swap_lint(&volatile_var, 0, 1); \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But when HAVE_ATOMIC_BUILTINS was defined it did these two lines also when CPU != powerpc now not sure what happens.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because of UT_COMPILER_BARRIER the loop won't be optimised away and there will just be an empty loop.

I've no objection to leaving the HAVE_ATOMIC_BUILTINS implementation for non x86/powerpc architectures until they can come up with something better. I was just following the MySQL upstream patch in this respect.

@janlindstrom janlindstrom merged commit 9794cf2 into MariaDB:10.1 Apr 6, 2016
@grooverdan
Copy link
Member Author

thanks @janlindstrom

@grooverdan grooverdan deleted the 10.1-MDEV-8684-UT_RELAX_CPU_isnt_relaxing branch April 6, 2016 05:53
@janlindstrom
Copy link
Contributor

Naturally, it does not compile on Windows...

} while (0)
# else
# define UT_RELAX_CPU() ((void)0) /* avoid warning for an empty statement */
# endif

#define UT_COMPILER_BARRIER() __asm__ __volatile__ ("":::"memory")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line does not compile on WIndows, is this portable outside gcc/clang ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks gcc-ish The correspording intrinsic for Microsoft compiler is called MemoryBarrier()

@grooverdan
Copy link
Member Author

thanks @vaintroub, sorry for missing that.

@svoj svoj self-assigned this Apr 22, 2018
@svoj svoj added this to the 10.1 milestone Apr 22, 2018
@svoj svoj assigned janlindstrom and unassigned svoj Apr 22, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
5 participants