Works on Arm News 2019-W19
This issue of Works on Arm News comes to you from Boston, after a successful Red Hat Summit which saw the release of RHEL 8 for the arm64 platform.
In the news
- RHEL 8 release
- Neoverse N1 server running Fedora 30 at Red Hat Summit
- Scaling results from the first generation of Arm supercomputers
- glibc improvements for Marvell ThunderX2
- ODroid N2 review
- Building a Kubernetes-ready kernel for the Jetson Nano (Dieter Reuter)
- Mozilla fixes Firefox add-ons in "armagadd-on" bug
- Linux 5.1 release
- GCC 9
- Arm and RISC-V memory concurrency models
- dav1d speedups
- Windows 10 on Arm - sysinternals utilities
- Tegra Docker for Jetson Nano
- Measuring on-chip delays
- Sixel graphics with vt340-compatible terminals
- $199 Pinebook Pro status
- Swift for Tensorflow on Jetson Nano
- Caffe on Jetson Nano
- k3s v0.5 and k3os 0.2
- Clearfog ITX prototype boards
- Animation Compression Library
- Alpine Linux 3.9.4
RHEL 8 release
RHEL 8 was announced at Red Hat Summit in Boston. This is the first release of RHEL that is available on Arm server systems, and the last release of RHEL before Red Hat's acquisition by IBM. Red Hat maintains a single codebase for all of RHEL 8 that covers every supported architecture, with no forks, special spins, or feature branches needed.
Neoverse N1 server running Fedora 30 at Red Hat Summit
At Red Hat Summit, the Fedora team showed off a research prototype workstation built around a 4-core Neoverse N1 chip, running Fedora 30 on a 5.1 kernel. This particular board is not expected to reach wide production, but it does let OS vendors get hands-on experience with new IP and new silicon, so that the software is well tested by the time licensees are ready for production.
Scaling results from the first generation of Arm supercomputers
At the Cray User Group conference in Montreal, a team led by Simon McIntosh-Smith of the University of Bristol won "best paper" for their work on scaling results from the first generation of Arm-based supercomputers (Isambard).
glibc improvements for Marvell ThunderX2
Marvell has contributed improvements to glibc for the ThunderX2 that speed up memmove performance by up to 30% on large forward moves. The code is specific to that platform, but should be adaptable to any system where SIMD instructions perform well. Similar optimizations are already in place in that code base for Qualcomm's now-defunct Falkor cores.
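For readers unfamiliar with the term, a "forward move" is the case where memmove can copy from low addresses to high ones without overwriting source bytes it has not read yet. The Python sketch below illustrates only that direction choice on a bytearray. It is not the glibc or Marvell code, which moves data through SIMD registers rather than one byte at a time, and the name memmove_sketch is made up for this illustration.

```python
def memmove_sketch(buf: bytearray, dst: int, src: int, n: int) -> None:
    """Copy n bytes inside buf from offset src to offset dst, tolerating overlap.

    Hypothetical helper for illustration only; real implementations copy
    many bytes per instruction instead of looping byte by byte.
    """
    if n == 0 or dst == src:
        return
    if dst < src or dst >= src + n:
        # Forward move: the destination starts below the source (or does not
        # overlap it at all), so copying low-to-high never overwrites a
        # source byte before it has been read.
        for i in range(n):
            buf[dst + i] = buf[src + i]
    else:
        # Backward move: the destination overlaps the tail of the source,
        # so copy high-to-low to preserve the unread source bytes.
        for i in range(n - 1, -1, -1):
            buf[dst + i] = buf[src + i]
```

Only the first branch corresponds to the forward case that the contributed patch speeds up; the backward branch is shown for contrast.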
ODroid N2 review
Carlos Eduardo reviews the Amlogic-powered ODroid N2 and finds it a third faster than the Rockchip RK3399 on his set of real-world sample tasks.
Building a Kubernetes-ready kernel for the Jetson Nano (Dieter Reuter)
Dieter Reuter walks through the process of completely rebuilding the kernel on the Jetson Nano in order to enable Kubernetes operations on that system. A full kernel rebuild takes about an hour on this quad-core, A57-based system with 4 GB of memory.
Mozilla fixes Firefox add-ons in "armagadd-on" bug
The Mozilla team has resolved a set of issues that caused user add-ons in their popular Firefox browser to stop working. The post-mortem points to an expired certificate as contributing to the problem. Firefox Beta is available as a native Windows 10 on Arm build for arm64 Windows systems.
Linux 5.1 release
The Linux 5.1 changelog has release notes for this latest kernel release. The new kernel enables new arm64 hardware, and now defaults to a maximum of 256 cores per system to support high-performance hardware like the Marvell ThunderX2.
GCC 9
The GCC 9.1 release is out. Among the improvements in it are better link time optimizations, illustrated in detail in a weblog by Honza Hubicka.
Arm and RISC-V memory concurrency models
Christopher Pulte's doctoral thesis on ARMv8 and RISC-V multicopy atomic memory semantics is good reading if you are interested in the complexities of the memory models of modern processors and instruction set architectures.
- Will Deacon, Twitter
- Pulte, C. (2019). The Semantics of Multicopy Atomic ARMv8 and RISC-V (Doctoral thesis)
- POPL 19 paper with Alistair Reid, Arm
dav1d speedups
dav1d is a video decoder for the royalty-free AV1 streaming media format. The 0.3 release improves performance on Arm systems with new NEON SIMD code.
Windows 10 on Arm - sysinternals utilities
Microsoft teased a version of their sysinternals debugging and process management tools for Windows 10 on Arm.
rclone
rclone is a fast command line utility for file copies, backups, and restore operations. It supports arm64 systems, and can be used to effectively manage file transfers across a number of remote file system protocols.
Tegra Docker for Jetson Nano
Work is underway to document and explore the use of GPU-enabled containers on the Jetson Nano, a $99 system from Nvidia.
- Tegra-Docker, Technica Corporation
- Dieter Reuter, Twitter
- OpenDataCam issue, Github
- Deploy GPU Enabled Kubernetes Pod, Medium
Measuring on-chip delays
In multicore and manycore systems, the delay to get data between pairs of cores in a CPU complex can vary considerably. Michael Kuron measures these delays on an AMD Threadripper using the core-latency tool from Adam Jakubek. Complex chip architectures can introduce non-uniform memory access (NUMA), and as a consequence schedulers need to be architecture aware to maximize performance.
- Michael Kuron on AMD Ryzen
- Next Platform on single socket servers
- Adam Jakubek, core-latency source code, Github
Sixel graphics with vt340-compatible terminals
Just when you thought display technology had reached its apex, along comes SIXEL, a 1980s technology from the DEC vt340 terminal that enables simple image graphics within an otherwise text-only window.
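To make the format a little more concrete, here is a minimal Python sketch of a sixel encoder. It assumes the commonly documented encoding: an escape-sequence wrapper, palette entries given as RGB percentages, and one printable character per column for each band of six vertical pixels. The function name sixel_encode and its arguments are invented for this illustration, and the sketch skips run-length compression and raster attributes.

```python
def sixel_encode(pixels, palette):
    """Return a sixel escape sequence for a small indexed-colour image.

    pixels  -- list of rows, each row a list of palette indexes
    palette -- list of (r, g, b) tuples, each component a percentage 0..100
    Hypothetical helper for illustration only.
    """
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    out = ["\x1bPq"]  # DCS introducer, 'q' selects sixel mode
    for idx, (r, g, b) in enumerate(palette):
        out.append(f"#{idx};2;{r};{g};{b}")  # define colour idx as RGB
    for top in range(0, height, 6):  # one band = 6 pixel rows
        band = []
        for idx in range(len(palette)):
            chars = []
            for x in range(width):
                bits = 0
                for dy in range(6):
                    y = top + dy
                    if y < height and pixels[y][x] == idx:
                        bits |= 1 << dy  # top pixel is the lowest bit
                chars.append(chr(63 + bits))  # '?' means no pixels set
            row = "".join(chars)
            if row.strip("?"):  # skip colours not used in this band
                band.append(f"#{idx}{row}")
        out.append("$".join(band))  # '$' returns to the start of the band
        out.append("-")  # '-' advances to the next band
    out.append("\x1b\\")  # string terminator
    return "".join(out)
```

Writing the returned string to a terminal that supports sixel should draw the image in place; other terminals will typically ignore the sequence or show it as stray text.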
$199 Pinebook Pro status
Pine64 provides an update on their $199 Pinebook Pro production efforts. The current boards for this device have a few minor issues that will be worked out before the first real production run, which is due within the next couple of months.
- May 2019 news, Pine64
- CNX Software
- Ameridroid, Pinebook Pro video
- Pinebook Pro video, Youtube
Swift for Tensorflow on Jetson Nano
A team led by Brad Larson has produced a version of the Swift language with support for Tensorflow operations targeting the Jetson Nano. Swift is a language from Apple used for both client-side and server-side programming, and Tensorflow is a Google effort for machine learning.
Caffe on Jetson Nano
Caffe is another machine learning framework, and Seeed Studio has a brief tutorial on bringing it to life on the Jetson Nano.
k3s v0.5 and k3os 0.2
The latest versions of k3s and k3os from Rancher Labs bring additional development to this very lightweight Kubernetes environment.
Clearfog ITX prototype boards
Jon Nettleton from SolidRun shows the latest photos of the Clearfog ITX, a design for a developer-ready desktop Arm system built around an NXP SoC.
npcap
npcap, a packet capture and analysis utility, adds support for Windows 10 on Arm.
Animation Compression Library
The Animation Compression Library provides substantial speedups for animation on a variety of platforms, especially arm64 for iOS and, newly, for Windows 10 on Arm. The codebase supports Linux but has not yet been ported to the linux/arm64 combination.
ACL architect Nicholas Frechette has an interesting weblog to go with the software development, providing insight into the instruction-set-level optimizations, tweaks, and deep assembly language work needed to make this package perform as well as possible.
Alpine Linux 3.9.4
Alpine Linux has produced a 3.9.4 release across a number of platforms including arm64. The latest release addresses a number of security issues from contributed packages and, new to this release, the CVEs for those security issues are included in the release notes.