use armv7 memory barriers by default #158

Merged: 1 commit

2 participants

@dicej
Owner

armv7 and later provide weaker cache coherency models than armv6 and
earlier, so we cannot just implement memory barriers as no-ops. This
patch uses the DMB instruction (or the equivalent OS-provided barrier
function) to implement barriers. This should fix concurrency issues
on newer chips such as the Apple A6 and A7.

If you still need to support ARMv6 devices, you should pass
"armv6=true" to make when building Avian. Ideally, the VM would
detect what kind of CPU it was executing on at runtime and direct the
JIT compiler accordingly, but I don't know how to do that on ARM.
Patches are welcome, though!
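
One possible starting point for the runtime detection mentioned above, on Linux only, is the `AT_PLATFORM` entry of the auxiliary vector, read with `getauxval`. The sketch below is not part of this patch. The helper names (`armVersionFromPlatform`, `needsDmb`, `hostNeedsDmb`) are hypothetical, and it assumes that 32-bit ARM kernels report platform strings such as "v6l" or "v7l". It does not cover iOS or other operating systems.

```cpp
#include <cassert>
#include <cstdlib>
#if defined(__linux__)
#include <sys/auxv.h>  // getauxval, AT_PLATFORM
#endif

// Hypothetical helper: extract the architecture version from a Linux
// AT_PLATFORM string.  Assumes 32-bit ARM kernels report strings such
// as "v6l" or "v7l"; returns 0 for anything unrecognized.
inline int armVersionFromPlatform(const char* platform)
{
  if (platform == 0 || platform[0] != 'v') return 0;
  char* end = 0;
  long version = std::strtol(platform + 1, &end, 10);
  if (end == platform + 1 || version <= 0) return 0;
  return static_cast<int>(version);
}

// Hypothetical helper: true if generated code should contain DMB.
// Unrecognized platforms default to true, mirroring the build-time
// default chosen in this patch.
inline bool needsDmb(const char* platform)
{
  int version = armVersionFromPlatform(platform);
  return version == 0 || version >= 7;
}

// Query the running host.  Only Linux is handled in this sketch.
inline bool hostNeedsDmb()
{
#if defined(__linux__)
  return needsDmb(reinterpret_cast<const char*>(getauxval(AT_PLATFORM)));
#else
  return true;
#endif
}
```

The JIT could consult something like `hostNeedsDmb()` once at startup instead of relying on `armv6=true` at build time, although the inline barrier in the VM itself would still need a separate mechanism.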

@dicej
Owner

This addresses #153

@dicej
Owner

Are we happy with DMB SY as a conservative option, or is DSB necessary? My understanding is that Linux's smp_*mb macros (which use DMB) are intended for synchronization among CPUs, while the *mb macros (which use DSB) are for synchronization with other devices (e.g. DMA, etc.), which would imply that DMB is sufficient for our purposes.

@joshuawarner32
Collaborator

Ah, that makes a lot of sense. Yes, I think DMB is a fine conservative option. It would be nice to at least leave TODOs to move to using the weaker variants of DMB where possible.
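
To make the CPU-to-CPU ordering discussed above concrete, here is a minimal message-passing sketch written with standard C++11 fences rather than Avian's own primitives. The names `payload`, `ready`, `publish`, and `tryConsume` are hypothetical. My understanding is that GCC and Clang lower each of these fences to a DMB when targeting ARMv7, but that is an assumption about the toolchain, not something this patch relies on.

```cpp
#include <atomic>
#include <cassert>

int payload = 0;                 // ordinary, non-atomic data
std::atomic<bool> ready(false);  // flag that publishes the payload

// Producer side: write the data, then publish it.  The release fence
// keeps the payload write from being reordered after the flag write.
inline void publish(int value)
{
  payload = value;
  std::atomic_thread_fence(std::memory_order_release);
  ready.store(true, std::memory_order_relaxed);
}

// Consumer side: returns true and fills *out once the flag is set.
// The acquire fence keeps the payload read from being reordered
// before the flag read.
inline bool tryConsume(int* out)
{
  if (!ready.load(std::memory_order_relaxed)) return false;
  std::atomic_thread_fence(std::memory_order_acquire);
  *out = payload;
  return true;
}
```

In real use `publish` and `tryConsume` would run on different threads. Without the two fences, a weakly ordered CPU could let the consumer observe `ready == true` while still reading a stale `payload`, which is the kind of failure a no-op barrier permits.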

@dicej dicej use armv7 memory barriers by default
2b11770
@dicej
Owner

I just force-pushed an update with TODO comments.

@joshuawarner32 joshuawarner32 merged commit 2ac66cb into ReadyTalk:master
@dicej dicej deleted the dicej:armv7 branch
Commits on Jan 11, 2014
  1. @dicej

    use armv7 memory barriers by default

    dicej authored
4 README.md
@@ -127,6 +127,10 @@ devices. See [here](https://github.com/ReadyTalk/hello-ios) for an
example of an Xcode project for iOS which uses Avian.
* _default:_ false
+ * `armv6` - if true, don't use any instructions newer than armv6. By
+default, we assume the target is armv7 or later, and thus requires explicit
+memory barrier instructions to ensure cache coherency
+
* `bootimage` - if true, create a boot image containing the pre-parsed
class library and ahead-of-time compiled methods. This option is
only valid for process=compile builds. Note that you may need to
5 makefile
@@ -470,6 +470,10 @@ ifeq ($(arch),arm)
endif
endif
+ifeq ($(armv6),true)
+ cflags += -DAVIAN_ASSUME_ARMV6
+endif
+
ifeq ($(ios),true)
cflags += -DAVIAN_IOS
use-lto = false
@@ -1895,6 +1899,7 @@ $(bootimage-generator): $(bootimage-generator-objects) $(vm-objects)
arch=$(build-arch) \
aot-only=false \
target-arch=$(arch) \
+ armv6=$(armv6) \
platform=$(bootimage-platform) \
target-format=$(target-format) \
openjdk=$(openjdk) \
18 src/avian/arm.h
@@ -79,11 +79,25 @@ trap()
#endif
}
+// todo: determine the minimal operation types and domains needed to
+// implement the following barriers (see
+// http://community.arm.com/groups/processors/blog/2011/10/19/memory-access-ordering-part-3--memory-access-ordering-in-the-arm-architecture).
+// For now, we just use DMB SY as a conservative but not necessarily
+// performant choice.
+
#ifndef _MSC_VER
inline void
memoryBarrier()
{
- asm("nop");
+#ifdef __APPLE__
+ OSMemoryBarrier();
+#elif (__GNUC__ > 4) || ((__GNUC__ == 4) && (__GNUC_MINOR__ >= 1))
+ __sync_synchronize();
+#elif (! defined AVIAN_ASSUME_ARMV6)
+ __asm__ __volatile__ ("dmb" : : : "memory");
+#else
+ __asm__ __volatile__ ("" : : : "memory");
+#endif
}
#endif
@@ -148,7 +162,7 @@ inline bool
atomicCompareAndSwap32(uint32_t* p, uint32_t old, uint32_t new_)
{
#ifdef __APPLE__
- return OSAtomicCompareAndSwap32(old, new_, reinterpret_cast<int32_t*>(p));
+ return OSAtomicCompareAndSwap32Barrier(old, new_, reinterpret_cast<int32_t*>(p));
#elif (defined __QNX__)
return old == _smp_cmpxchg(p, old, new_);
#else
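
The switch to the `Barrier` variant of the compare-and-swap matters for code that uses the CAS as a lock: accesses inside the critical section must not be reordered outside it. Below is a small sketch of that pattern using the GCC/Clang builtin `__sync_bool_compare_and_swap`, which is documented as a full barrier. `casWithBarrier`, `SpinLock`, and `guardedIncrement` are hypothetical names for illustration, not Avian code.

```cpp
#include <cassert>
#include <stdint.h>

// Hypothetical stand-in for a CAS with full-barrier semantics.
inline bool casWithBarrier(uint32_t* p, uint32_t old, uint32_t new_)
{
  return __sync_bool_compare_and_swap(p, old, new_);
}

struct SpinLock {
  uint32_t state;  // 0 = free, 1 = held

  SpinLock() : state(0) {}

  // Single attempt to take the lock.
  bool tryLock() { return casWithBarrier(&state, 0, 1); }

  // Spin until the lock is acquired.
  void lock() { while (!tryLock()) {} }

  // Release through the same barrier-carrying CAS, so writes made while
  // holding the lock are ordered before the lock is seen as free.
  void unlock() { casWithBarrier(&state, 1, 0); }
};

// Increment *counter under the lock and return the new value.
inline int guardedIncrement(SpinLock* lock, int* counter)
{
  lock->lock();
  int result = ++*counter;
  lock->unlock();
  return result;
}
```

With a CAS that carries no barrier, a weakly ordered CPU could let the increment drift outside the locked region, so two threads might both read the same old counter value.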
2  src/codegen/target/arm/encode.h
@@ -172,6 +172,8 @@ inline int blo(int offset) { return SETCOND(b(offset), CC); }
inline int bhs(int offset) { return SETCOND(b(offset), CS); }
inline int bpl(int offset) { return SETCOND(b(offset), PL); }
inline int fmstat() { return fmrx(15, FPSCR); }
+// todo: make this pretty:
+inline int dmb() { return 0xf57ff05f; }
} // namespace isa
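
As one way to "make this pretty", the constant can be split into the fixed part of the A32 DMB encoding and its 4-bit option field. The sketch below lives in a hypothetical `sketch` namespace and returns `uint32_t` rather than the `int` used by the surrounding encoders, which is a deliberate difference. The option values follow the ARM architecture manual's DMB encoding.

```cpp
#include <cassert>
#include <stdint.h>

namespace sketch {

// 4-bit option field of the A32 DMB instruction.
enum BarrierOption {
  OSHST = 0x2,  // outer shareable, stores only
  OSH = 0x3,    // outer shareable
  NSHST = 0x6,  // non-shareable, stores only
  NSH = 0x7,    // non-shareable
  ISHST = 0xA,  // inner shareable, stores only
  ISH = 0xB,    // inner shareable
  ST = 0xE,     // full system, stores only
  SY = 0xF      // full system
};

// Encode DMB <option>; the default reproduces the constant in this patch.
inline uint32_t dmb(BarrierOption option = SY)
{
  return 0xf57ff050u | static_cast<uint32_t>(option);
}

}  // namespace sketch
```

`sketch::dmb()` yields `0xf57ff05f`, the same word as the hard-coded value, while `sketch::dmb(sketch::ISHST)` yields `0xf57ff05a`.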
6 src/codegen/target/arm/multimethod.cpp
@@ -58,9 +58,9 @@ void populateTables(ArchitectureContext* con) {
BranchOperationType* bro = con->branchOperations;
zo[lir::Return] = return_;
- zo[lir::LoadBarrier] = memoryBarrier;
- zo[lir::StoreStoreBarrier] = memoryBarrier;
- zo[lir::StoreLoadBarrier] = memoryBarrier;
+ zo[lir::LoadBarrier] = loadBarrier;
+ zo[lir::StoreStoreBarrier] = storeStoreBarrier;
+ zo[lir::StoreLoadBarrier] = storeLoadBarrier;
zo[lir::Trap] = trap;
uo[Multimethod::index(lir::LongCall, C)] = CAST1(longCallC);
28 src/codegen/target/arm/operations.cpp
@@ -1228,7 +1228,33 @@ void trap(Context* con)
emit(con, bkpt(0));
}
-void memoryBarrier(Context*) {}
+// todo: determine the minimal operation types and domains needed to
+// implement the following barriers (see
+// http://community.arm.com/groups/processors/blog/2011/10/19/memory-access-ordering-part-3--memory-access-ordering-in-the-arm-architecture).
+// For now, we just use DMB SY as a conservative but not necessarily
+// performant choice.
+
+void memoryBarrier(Context* con UNUSED)
+{
+#ifndef AVIAN_ASSUME_ARMV6
+ emit(con, dmb());
+#endif
+}
+
+void loadBarrier(Context* con)
+{
+ memoryBarrier(con);
+}
+
+void storeStoreBarrier(Context* con)
+{
+ memoryBarrier(con);
+}
+
+void storeLoadBarrier(Context* con)
+{
+ memoryBarrier(con);
+}
} // namespace arm
} // namespace codegen
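
The todo above could eventually be resolved by giving each barrier kind its own DMB option. The sketch below is one untested guess at such a mapping: the store-only variant for `storeStoreBarrier` and a full inner-shareable DMB for the other two. Both the choice of the inner-shareable domain and the mapping itself are my assumptions and have not been validated against Avian. `Context` and `emit` here are simplified stand-ins in a hypothetical `sketch` namespace, not the real codegen types.

```cpp
#include <cassert>
#include <stdint.h>
#include <vector>

namespace sketch {

const uint32_t DmbIsh = 0xf57ff05b;    // DMB ISH: orders loads and stores
const uint32_t DmbIshSt = 0xf57ff05a;  // DMB ISHST: orders stores only

// Simplified stand-in for the code generator's context.
struct Context {
  std::vector<uint32_t> code;
};

inline void emit(Context* con, uint32_t instruction)
{
  con->code.push_back(instruction);
}

// Loads must be ordered against later accesses, so a store-only
// barrier is not enough here.
inline void loadBarrier(Context* con) { emit(con, DmbIsh); }

// Only store-to-store ordering is required, so the ST variant
// should suffice under the assumptions stated above.
inline void storeStoreBarrier(Context* con) { emit(con, DmbIshSt); }

// Store-to-load ordering needs the full barrier.
inline void storeLoadBarrier(Context* con) { emit(con, DmbIsh); }

}  // namespace sketch
```

Keeping the three entry points separate, as this patch already does, means a change like this would touch only the function bodies and not the dispatch tables in multimethod.cpp.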
6 src/codegen/target/arm/operations.h
@@ -230,7 +230,11 @@ void return_(Context* con);
void trap(Context* con);
-void memoryBarrier(Context*);
+void loadBarrier(Context*);
+
+void storeStoreBarrier(Context*);
+
+void storeLoadBarrier(Context*);
} // namespace arm
} // namespace codegen