
CoreMark results

ict edited this page Jan 3, 2022 · 21 revisions

This page stores various results obtained from running the CoreMark benchmark included in this package. Bold text indicates the best result from a given compiler.

The results presented on this page are strictly unofficial; please refer to the CoreMark scores page on the EEMBC website for official results.

Jump to: Embedded Systems, Mobile Phones, Laptops and Portables, Workstations, Servers

Embedded Systems

HP t5700

Released in early 2003, the t5700 was among HP's first PC-compatible thin clients, featuring Transmeta's low-power Crusoe TM5800 x86-compatible 128-bit VLIW microprocessor with independent 64 KiB instruction and data caches as well as a 512 KiB unified secondary cache. All tests are performed under Debian 9.8 on a system not specifically configured for benchmarking.

GCC 6.3.0: 40,000 iterations

Options Iterations/second
none 925.37
-O1 1732.35
-O2 2171.08
-O3 2283.91
-Ofast 2280.42
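
The relative gain from each optimization level can be read off the table above by dividing each score by the unoptimized one. The following is a minimal Python sketch of that arithmetic; the figures are copied from the table, while the helper names `speedups` and `best_option` are purely illustrative and not part of the benchmark package.

```python
# Iterations/second for the HP t5700 with GCC 6.3.0, copied from the table above.
T5700_GCC = {
    "none": 925.37,
    "-O1": 1732.35,
    "-O2": 2171.08,
    "-O3": 2283.91,
    "-Ofast": 2280.42,
}


def speedups(results, baseline="none"):
    """Return each option's score divided by the baseline option's score."""
    base = results[baseline]
    return {opt: score / base for opt, score in results.items()}


def best_option(results):
    """Return the option string with the highest iterations/second."""
    return max(results, key=results.get)


if __name__ == "__main__":
    for opt, ratio in speedups(T5700_GCC).items():
        print(f"{opt:>7}: {ratio:.2f}x over unoptimized")
    print("best:", best_option(T5700_GCC))
```

The same two helpers can be applied to any other table on this page by swapping in its option/score pairs.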

HP t5325

The t5325 is a minuscule low-power thin client unveiled by HP in late 2009, designed around a Marvell Kirkwood 88F6281 system-on-a-chip implementing a Marvell-designed ARMv5TE-compliant "Sheeva" processor core clocked at 1.2 GHz with independent 16 KiB instruction and data caches and a 256 KiB unified secondary cache. All tests are performed under the HP "ThinPro" operating system, a lightly customized variant of Debian Lenny, on a system not specifically configured for benchmarking.

GCC 4.2.4: 40,000 iterations

Options Iterations/second
none 551.95
-O1 1747.49
-O2 1873.54
-O3 2214.84

HP Media Vault Pro mv5150

The mv5150 is HP's second-generation Media Vault network storage appliance, first released in 2008. It is based on a Marvell Orion 88F5182 system-on-a-chip implementing a Marvell-designed ARMv5TE-compliant "Feroceon" processor core clocked at 500 MHz with independent 32 KiB instruction and data caches. All tests are performed under the Media Vault's embedded Linux 2.6.12.6 operating system, which has not been specifically configured for benchmarking.

GCC 3.4.4: 40,000 iterations

Options Iterations/second
none 222.72
-O1 533.19
-O2 665.56
-O3 775.95

I-O DATA USL-5P

No data is available for -O1 due to a compiler bug interfering with register allocation.

GCC 4.2.1: 20,000 iterations

Options Iterations/second
none 114.31
-O1 N/A
-O2 421.41
-O3 518.54

Mobile Phones

Palm Pre Plus

Announced at CES 2010 and launched on Verizon Wireless in March 2010, the Pre Plus was an updated version of Palm's innovative Pre smartphone with double the RAM (512 MiB) and storage (16 GiB), as well as a new touch-based gesture area in place of the previous home button. Like the Pre, the Pre Plus is designed around Texas Instruments' OMAP3430 multimedia processor featuring an ARM Cortex-A8 core clocked at 500 MHz with independent 16 KiB instruction and data caches as well as a unified 256 KiB second-level cache. All tests are performed under WebOS 1.4.5 with WebOS Internals' UberKernel, which allows a greater range of clock frequency adjustment; otherwise there is no benchmarking-specific configuration.

GCC 4.2.3: 20,000 iterations, Palm default profile (500 MHz underclock)

Options Iterations/second
none 285.63
-O1 939.41
-O2 1014.71
-O3 1175.09

GCC 4.2.3: 20,000 iterations, OMAP3430 standard clock (600 MHz)

Options Iterations/second
none 341.76
-O1 1123.28
-O2 1211.02
-O3 1405.48

GCC 4.2.3: 40,000 iterations, 1 GHz overclock

Options Iterations/second
none 565.29
-O1 1893.93
-O2 2009.04
-O3 2358.49
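
Because the three Pre Plus runs differ only in clock frequency, they offer a rough check on how linearly the score scales with clock on this core. The sketch below extrapolates linearly from the 500 MHz result and compares that with the measured -O3 scores; the figures come from the tables above, and the function name `scaling_efficiency` is an illustrative assumption.

```python
# -O3 scores (iterations/second) for the Palm Pre Plus, keyed by clock in MHz,
# copied from the three GCC 4.2.3 tables above.
PRE_PLUS_O3 = {500: 1175.09, 600: 1405.48, 1000: 2358.49}


def scaling_efficiency(scores, base_mhz=500):
    """Ratio of each measured score to a linear extrapolation from base_mhz.

    A value of 1.0 means the score scaled exactly in proportion to the clock.
    """
    base = scores[base_mhz]
    return {
        mhz: score / (base * mhz / base_mhz)
        for mhz, score in scores.items()
    }


if __name__ == "__main__":
    for mhz, eff in sorted(scaling_efficiency(PRE_PLUS_O3).items()):
        print(f"{mhz:>5} MHz: {eff:.3f} of linear scaling")
```

With these figures every ratio lands within one percent of 1.0, which suggests the benchmark is largely clock-bound on this system at -O3.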

Laptops and Portables

Apple iBook G4 (Mid-2005/1.33)

The last and fastest of the 12-inch consumer-oriented iBook G4 line, the mid-2005 model is built around the 32-bit PowerPC 7447a microprocessor fabricated by Freescale Semiconductor, which had been spun off from Motorola the previous year. The 7447a is the final desktop iteration of the PowerPC 7400 'G4' microprocessor used by Apple in their systems, featuring two 32 KiB primary caches for instructions and data, a single 512 KiB on-die unified secondary cache, and some additional mobile-oriented features, such as dynamic frequency scaling and an on-chip thermal diode. As implied by the system model, the processor clock is 1.33 GHz. All tests are performed under Mac OS X 10.4 on a system not specifically configured for benchmarking.

Apple GCC 4.0.1: 80,000 iterations

Options Iterations/second
none 847.37
-O1 3153.33
-O2 3430.53
-O3 4012.04

Workstations

Apple Power Mac G5 (Late 2005/2.3DC)

The mid-range model of Apple's final generation of PowerPC-based professional systems, the 2.3DC was introduced in October 2005 and featured IBM's new dual-core 64-bit PowerPC 970MP processor, which combines two PowerPC 970 cores, each with a 32 KiB data cache, a 64 KiB instruction cache, and a unified 1 MiB secondary cache. This system has a clock frequency of 2.3 GHz. All tests are performed under Mac OS X 10.4.11 on a system not specifically configured for benchmarking.

Apple GCC 4.0.1: 60,000 iterations

Options Iterations/second
none 851.79
-O1 3742.98
-O2 4350.98
-O3 5080.44
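
The iteration counts in the table headings differ from system to system. Since the score is the iteration count divided by the elapsed time, the approximate run time of each result can be recovered from the table. The sketch below does this for the Power Mac G5 figures above and compares the result against a 10-second minimum, which is assumed here from CoreMark's run rules rather than stated on this page; the names `run_seconds` and `MIN_VALID_SECONDS` are illustrative.

```python
# Assumption: CoreMark's run rules treat runs shorter than 10 seconds as invalid.
MIN_VALID_SECONDS = 10.0

# Apple GCC 4.0.1 results for the Power Mac G5, copied from the table above.
G5_ITERATIONS = 60_000
G5_GCC = {"none": 851.79, "-O1": 3742.98, "-O2": 4350.98, "-O3": 5080.44}


def run_seconds(iterations, iterations_per_second):
    """Approximate elapsed time implied by a score and its iteration count."""
    return iterations / iterations_per_second


if __name__ == "__main__":
    for opt, score in G5_GCC.items():
        secs = run_seconds(G5_ITERATIONS, score)
        status = "ok" if secs >= MIN_VALID_SECONDS else "too short"
        print(f"{opt:>5}: {secs:6.1f} s ({status})")
```

Under that assumption, a faster system needs a larger iteration count to stay above the minimum, which would explain why the counts on this page range from 20,000 to 80,000.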

HP VISUALIZE C3000 (9000/785/C3000)

The C3000 is a mid-range Unix workstation released in 1999, based on HP's own PA-8500 microprocessor with 1 MiB of on-die data cache, 512 KiB of on-die instruction cache and a clock frequency of 400 MHz. All tests are performed under HP-UX 11.11 (11i v1) on a system not specifically configured for benchmarking.

HP C B.11.11.16: 20,000 iterations

Options Iterations/second
none 182.82
+O1 257.37
+O2 727.28
+O3 800.64
+O4 752.73
-fast 776.70

Note: HP C +O2 is roughly equivalent to GCC -O1

GCC 4.2.3: 20,000 iterations

Options Iterations/second
none 205.21
-O1 593.12
-O2 655.74
-O3 718.13
-O3 -ffast-math 718.13

Note: -Ofast is only available in GCC 4.6 and later

Servers

Sun Fire T1000

The Sun Fire T1000 is an entry-level 1U rack-mounted server released in early 2006 as one of the first systems to use Sun's radically multi-threaded UltraSPARC T1 "Niagara" microprocessor. Derived from a SPARC implementation originally developed by Afara Websystems, the T1 features four, six or eight relatively simple SPARC V9 cores with individual 16 KiB instruction caches and 8 KiB data caches, a shared 3 MiB secondary cache and a single floating-point unit shared among all cores. Each core also has four threads, all sharing a single pipeline and a massive register file composed of 640 64-bit registers that allows a thread's state to be saved and resumed in a single cycle in order to maximize processor utilization in heavily multi-threaded workloads. The T1 utilized in the T1000 is clocked at 1 GHz.

All tests are performed on a T1000 with an 8-core UltraSPARC T1 running Solaris 10 10/09 with no specific configuration for benchmarking purposes. Because the ANSIbench makefile is not yet set up to build CoreMark with multi-thread support, all results are for execution on one thread only. Keep this in mind when interpreting these results, as the T1's single-thread performance can be very weak, weaker even than many low-power embedded processors such as the ARMv5TE chip found in the HP t5325 above.

Sun Studio 12/Sun C 5.9: 20,000 iterations, 1 thread

Options Iterations/second
none 594.71
-xO1 218.10
-xO2 870.32
-xO3 949.67
-xO4 1087.55
-xO5 1058.20
-fast 1048.77

GCC 5.5.0: 20,000 iterations, 1 thread

Options Iterations/second
none 253.07
-O1 1086.96
-O2 1327.14
-O3 1407.46
-Ofast 1408.45
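
The remark above about the T1's single-thread performance can be checked against the rest of this page by dividing each system's best score by its clock frequency. The sketch below does so for the systems whose clock is stated in their descriptions (the Pre Plus entry uses its 500 MHz default profile). Compilers and versions differ between systems, so this is only a rough comparison; the helper name `rank_per_mhz` is illustrative.

```python
# (system, best option on this page, iterations/second, clock in MHz).
# Only systems whose clock frequency is stated in the text are included.
BEST_RESULTS = [
    ("HP t5325", "GCC 4.2.4 -O3", 2214.84, 1200),
    ("HP Media Vault Pro mv5150", "GCC 3.4.4 -O3", 775.95, 500),
    ("Palm Pre Plus", "GCC 4.2.3 -O3", 1175.09, 500),
    ("Apple iBook G4", "Apple GCC 4.0.1 -O3", 4012.04, 1330),
    ("Apple Power Mac G5", "Apple GCC 4.0.1 -O3", 5080.44, 2300),
    ("HP VISUALIZE C3000", "HP C +O3", 800.64, 400),
    ("Sun Fire T1000", "GCC 5.5.0 -Ofast", 1408.45, 1000),
]


def rank_per_mhz(results):
    """Return (system, iterations/second per MHz), highest first."""
    per_mhz = [(name, score / mhz) for name, _opt, score, mhz in results]
    return sorted(per_mhz, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, value in rank_per_mhz(BEST_RESULTS):
        print(f"{name:<28} {value:.2f} iterations/s per MHz")
```

On these figures the single-threaded T1000 result ranks last per MHz, below both ARMv5TE systems, which is consistent with the caveat given in the T1000 description.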