Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

crash on Intel CPUs with FSRM #807

Open
KonradMagnusson opened this issue Aug 22, 2023 · 5 comments
Open

crash on Intel CPUs with FSRM #807

KonradMagnusson opened this issue Aug 22, 2023 · 5 comments

Comments

@KonradMagnusson
Copy link

Hello!

We (Paradox Interactive, Victoria 3 team) have been receiving player reports of a crash that we've now narrowed down the source of to memory allocations using mimalloc 2.1.1 on 10th gen and newer Intel CPUs.

I am only sporadically able to reproduce the crash, and only in optimized release builds so I can sadly not provide much debug information. I'm using an Intel i9-13900k.

The top of the stack looks like this:

 mi_heap_malloc alloc.c:38
 mi_heap_alloc_new alloc.c:950
 mi_new alloc.c:956
 PdxNew(unsigned long) pdx_allocator.cpp:223
 operator new(unsigned long) pdx_allocator.cpp:288

The line numbers are likely misleading since things are inlined, and I'm not getting any values for the page pointer passed to mi_heap_malloc(...). However heap is null, and I'm not sure what's causing this.

Since our user reports and in-house testing hints at this only happening on newer Intel CPUs, I did some digging and found this CPU feature check (mi_detect_cpu_features) that sets a bool that is checked here (_mi_memcpy and _mi_memzero), resulting in hardware-dependent implementations of _mi_memcpy and _mi_memzero.
If I disable the FSRM-based code (see diff below), we are no longer able to reproduce the crash.

The crash is not OS-dependent, and our QA has confirmed it to happen on an i7-10700k too.
Victoria 3 uses mimalloc in version 1.2 and onwards. We will be applying the patch below, disabling the FSRM implementations, as a workaround for the crash in 1.4.


fsrm-diff.patch
diff --git a/common/mimalloc/mimalloc-2.1.1/include/mimalloc/internal.h b/common/mimalloc/mimalloc-2.1.1/include/mimalloc/internal.h
index a4495c1613..6a96ded17a 100644
--- a/common/mimalloc/mimalloc-2.1.1/include/mimalloc/internal.h
+++ b/common/mimalloc/mimalloc-2.1.1/include/mimalloc/internal.h
@@ -885,27 +885,6 @@ static inline size_t mi_bsr(uintptr_t x) {
 // (AMD Zen3+ (~2020) or Intel Ice Lake+ (~2017). See also issue #201 and pr #253.
 // ---------------------------------------------------------------------------------
 
-#if !MI_TRACK_ENABLED && defined(_WIN32) && (defined(_M_IX86) || defined(_M_X64))
-#include <intrin.h>
-#include <string.h>
-extern bool _mi_cpu_has_fsrm;
-static inline void _mi_memcpy(void* dst, const void* src, size_t n) {
-  if (_mi_cpu_has_fsrm) {
-    __movsb((unsigned char*)dst, (const unsigned char*)src, n);
-  }
-  else {
-    memcpy(dst, src, n);
-  }
-}
-static inline void _mi_memzero(void* dst, size_t n) {
-  if (_mi_cpu_has_fsrm) {
-    __stosb((unsigned char*)dst, 0, n);
-  }
-  else {
-    memset(dst, 0, n);
-  }
-}
-#else
 #include <string.h>
 static inline void _mi_memcpy(void* dst, const void* src, size_t n) {
   memcpy(dst, src, n);
@@ -913,7 +892,6 @@ static inline void _mi_memcpy(void* dst, const void* src, size_t n) {
 static inline void _mi_memzero(void* dst, size_t n) {
   memset(dst, 0, n);
 }
-#endif
 
 
 // -------------------------------------------------------------------------------
diff --git a/common/mimalloc/mimalloc-2.1.1/src/init.c b/common/mimalloc/mimalloc-2.1.1/src/init.c
index 51d42acd9b..ccda7b41c3 100644
--- a/common/mimalloc/mimalloc-2.1.1/src/init.c
+++ b/common/mimalloc/mimalloc-2.1.1/src/init.c
@@ -528,22 +528,6 @@ static void mi_process_load(void) {
   _mi_random_reinit_if_weak(&_mi_heap_main.random);
 }
 
-#if defined(_WIN32) && (defined(_M_IX86) || defined(_M_X64))
-#include <intrin.h>
-mi_decl_cache_align bool _mi_cpu_has_fsrm = false;
-
-static void mi_detect_cpu_features(void) {
-  // FSRM for fast rep movsb support (AMD Zen3+ (~2020) or Intel Ice Lake+ (~2017))
-  int32_t cpu_info[4];
-  __cpuid(cpu_info, 7);
-  _mi_cpu_has_fsrm = ((cpu_info[3] & (1 << 4)) != 0); // bit 4 of EDX : see <https://en.wikipedia.org/wiki/CPUID#EAX=7,_ECX=0:_Extended_Features>
-}
-#else
-static void mi_detect_cpu_features(void) {
-  // nothing
-}
-#endif
-
 // Initialize the process; called by thread_init or the process loader
 void mi_process_init(void) mi_attr_noexcept {
   // ensure we are called once
@@ -553,7 +537,6 @@ void mi_process_init(void) mi_attr_noexcept {
   _mi_verbose_message("process init: 0x%zx\n", _mi_thread_id());
   mi_process_setup_auto_thread_done();
 
-  mi_detect_cpu_features();
   _mi_os_init();
   mi_heap_main_init();
   #if MI_DEBUG
@KonradMagnusson KonradMagnusson changed the title crash on intel CPUs with FSRM crash on Intel CPUs with FSRM Aug 22, 2023
@KonradMagnusson
Copy link
Author

I've done some testing, and it seems mimalloc v2.0.0 does not suffer from the same issue in an otherwise identical setup. The same FSRM optimizations are present in v2.0.0.

@malkia
Copy link

malkia commented Nov 14, 2023

@malkia
Copy link

malkia commented Nov 16, 2023

@KonradMagnusson - Do you have the binaries (.dll) of the mimalloc, or at least how they were compiled? (Internally we are also using it, and I want to ensure we are alright - we haven't seen the issue though - but probably by the sheer luck that I've compiled mimalloc with TRACKING support, and for some reason this codepath got disabled there.

@KonradMagnusson
Copy link
Author

Hi @malkia, sorry about the late reply

We compile mimalloc as an (unmodified) external CMake library, that we then link to statically.

We set the following defines:

-DMI_BUILD_OBJECT=Off
-DMI_BUILD_SHARED=Off
-DMI_BUILD_STATIC=On
-DMI_OVERRIDE=Off
-DMI_BUILD_TESTS=Off

@KonradMagnusson
Copy link
Author

KonradMagnusson commented Nov 20, 2023

Possibly related https://news.ycombinator.com/item?id=38266773 -> https://lock.cmpxchg8b.com/reptar.html

It at the very least does not sound far-fetched to my ears.

I updated my CPU microcode to 20231114, and have not been able to reproduce the crash yet 👀

edit: I updated the µcode and locally reverted the removal of FSRM the same day it rolled out - November 14th.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants