Introduction

Fortran plays a pivotal role in science. Open source scientific tools like R and Python's numerical libraries leverage Fortran, and vast numbers of robust scientific tools use this language. While originally developed in the 1950s, Fortran's native support for arrays makes it well suited to numerical computation, and the language has features that allow high-performance optimization.

Unfortunately, the open source Fortran compilers do not yet support the ARM-based architecture used by Apple's M1 CPUs. Fortunately, Iain Sandoe has been working on porting both gcc and gfortran to this architecture, and François-Xavier Coudert has created an experimental release for the M1. Here we assess the performance of this nascent tool.

Here we test several popular benchmarks. These old benchmarks execute very rapidly and are probably not a good measure of compute performance. I also include a simple benchmark I wrote (f400) that simulates neuroimaging data and is probably limited more by memory bandwidth than by computation. Finally, I include a 2D filtering benchmark. For this benchmark I report the total runtime, which means the result is dominated by the slow naive implementation rather than the fast one. It is worth noting that the naive implementation performed poorly on the M1 relative to the Intel and AMD CPUs, but the reverse was true for the fast method. Therefore, the total time reported suggests the M1 does poorly in this test, but the reverse would be true if the geometric mean of the sub-tests were used. Because of these factors, I would take these results with a grain of salt. However, the fact that each of these benchmarks runs to completion suggests that scientific tools which rely on Fortran may soon be available on this promising architecture.
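The difference between reporting a total and reporting a geometric mean can be illustrated with a small sketch. The numbers below are hypothetical and chosen only to make the arithmetic easy to follow; they are not measurements from this repository. Machine A is slow on the naive sub-test and fast on the fast one, and machine B is the opposite.

```shell
# Summarize sub-test times by total and by geometric mean.
# Input columns: sub-test name, machine A seconds, machine B seconds.
summarize() {
  awk '
    { sumA += $2; sumB += $3; logA += log($2); logB += log($3); n++ }
    END {
      printf "total   A=%.2f B=%.2f\n", sumA, sumB
      printf "geomean A=%.2f B=%.2f\n", exp(logA / n), exp(logB / n)
    }'
}

# Hypothetical times (NOT measured values).
printf '%s\n' 'naive 64 16' 'fast 1 16' | summarize
```

With these made-up inputs, machine A has the larger total, because the slow sub-test dominates the sum, yet the smaller geometric mean, because each sub-test contributes equally on a ratio scale.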

Installing dependencies

Beyond gfortran and OpenMP, you will need a few libraries:

For Linux:

sudo apt-get install libapr1 libapr1-dev
sudo apt-get install libgmp3-dev

For macOS x86-64:

brew install gmp
gfortran -O3 -I/usr/local/Cellar/gmp/6.2.1/include -o fpidigits pidigits.f90 -L/usr/local/Cellar/gmp/6.2.1/lib -lgmp; time ./fpidigits 10000

For macOS ARM (at the moment, libraries must be compiled from source):

export ARCHFLAGS='-arch arm64'
brew install -s gmp
gfortran -O3 -I/opt/homebrew/Cellar/gmp/6.2.1/include -o fpidigits pidigits.f90 -L/opt/homebrew/Cellar/gmp/6.2.1/lib -lgmp; time ./fpidigits 10000
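The commands above hard-code the GMP version (6.2.1) in the Cellar path, so they stop working when Homebrew installs a different version. One possible workaround, sketched here, is to ask Homebrew for the install prefix with `brew --prefix gmp`. The helper name `gmp_flags` and the `/usr/local` fallback are assumptions made for this illustration, not part of the original instructions.

```shell
# Build the gfortran include/link flags for GMP from an install prefix.
# gmp_flags is a hypothetical helper name used only in this sketch.
gmp_flags() {
  printf '%s' "-I$1/include -L$1/lib -lgmp"
}

# Ask Homebrew for the prefix when brew is installed; otherwise fall back
# to /usr/local (an assumption; adjust for your system).
if command -v brew >/dev/null 2>&1; then
  GMP_PREFIX=$(brew --prefix gmp)
else
  GMP_PREFIX=/usr/local
fi

# Print the compile command rather than running it, so it can be inspected.
echo "gfortran -O3 $(gmp_flags "$GMP_PREFIX") -o fpidigits pidigits.f90"
```

Printing the command first makes it easy to check the resolved paths before compiling.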

Running the scripts

gfortran -O3 -o fbinarytrees binarytrees.f90 -lapr-1; time ./fbinarytrees 21
gfortran -O3 -o ffannkuch3 fannkuch3.f90; time ./ffannkuch3 12
gfortran -O3 -o ffasta fasta.f90; time ./ffasta 25000000  > /dev/null
gfortran -O3 -o fmandelbrot mandelbrot.f90 -lapr-1; time ./fmandelbrot 16000  > /dev/null
gfortran -O3 -o fnbody nbody.f90; time ./fnbody 50000000
gfortran -O3 -I/opt/homebrew/Cellar/gmp/6.2.1/include -o fpidigits pidigits.f90 -L/opt/homebrew/Cellar/gmp/6.2.1/lib -lgmp; time ./fpidigits 10000
gfortran -O3 -o fpidigits pidigits.f90 -lgmp; time ./fpidigits 10000
gfortran -O3 -o freverse reverse.f90; time ./freverse 0 < revcomp-input100000000.txt
gfortran -O3 -fopenmp -o fspectral-norm spectral-norm.f90; time ./fspectral-norm 5500
gfortran -O3 -o f400 fortran400.f95; time ./f400
gfortran -O3 -c -o fortfilt.o fortfilt.f90
gfortran -O3 -o performancetest fortfilt.o performancetest.f90
time ./benchmark.sh

Testing

All of these times are very brief. Since macOS applications often phone home at launch, such short benchmarks are unlikely to be a reliable measure of performance.
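One common way to reduce the influence of launch overhead is to run each benchmark several times and keep the fastest run. The sketch below is not how the times in this document were collected. It assumes bash, whose `time` keyword accepts `-p` and prints a line of the form `real <seconds>`; the function names `fastest` and `best_of` are hypothetical.

```shell
# Print the smallest "real" value found in `time -p` style output on stdin.
fastest() {
  awk '$1 == "real" {
         t = $2 + 0
         if (n++ == 0 || t < min) min = t
       }
       END { if (n) print min }'
}

# Run a command several times and report the fastest wall-clock time.
# Usage: best_of RUNS COMMAND [ARGS...]
# Assumes bash: the timing report is written to the shell's stderr, which
# the brace group redirects into the pipe. The command's own output is
# discarded.
best_of() {
  runs=$1; shift
  i=0
  while [ "$i" -lt "$runs" ]; do
    { time -p "$@" >/dev/null 2>&1; } 2>&1
    i=$((i + 1))
  done | fastest
}
```

For example, `best_of 5 ./fnbody 50000000` would run the nbody benchmark five times and print the shortest elapsed time. Note that `time -p` typically reports only hundredths of a second, which is still coarse for the shortest tests.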

| Test | i5-8259u | 3900X | M1 |
| --- | --- | --- | --- |
| binarytrees | 3.226 | 2.903 | 2.679 |
| fannkuch3 | 29.577 | 24.222 | 26.867 |
| fasta | 1.301 | 0.902 | 2.635 |
| mandelbrot | 4.539 | 3.405 | 3.029 |
| nbody | 3.265 | 2.437 | 2.642 |
| pidigits | 0.861 | 0.511 | 0.574 |
| reverse | 0.006 | 0.002 | 0.159 |
| spectral-norm | 0.638 | 0.165 | 0.454 |
| f400_fp32 | 296.547 | 97.050 | 42.773 |
| f400_fp64 | 334.015 | 276.307 | 85.466 |
| fasticonv | 26.746 | 27.429 | 60.570 |
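To compare the machines test by test, it can help to express the M1 time as a fraction of each other CPU's time. The following is a minimal sketch using the values from the table above; the function name `ratios` is hypothetical. A ratio below 1 means the M1 finished faster on that test.

```shell
# Input columns: test name, i5-8259u time, 3900X time, M1 time.
# Output (tab-separated): test name, M1/i5-8259u ratio, M1/3900X ratio.
ratios() {
  awk '{ printf "%s\t%.2f\t%.2f\n", $1, $4 / $2, $4 / $3 }'
}

# Times copied from the results table.
ratios <<'EOF'
binarytrees 3.226 2.903 2.679
fannkuch3 29.577 24.222 26.867
fasta 1.301 0.902 2.635
mandelbrot 4.539 3.405 3.029
nbody 3.265 2.437 2.642
pidigits 0.861 0.511 0.574
reverse 0.006 0.002 0.159
spectral-norm 0.638 0.165 0.454
f400_fp32 296.547 97.050 42.773
f400_fp64 334.015 276.307 85.466
fasticonv 26.746 27.429 60.570
EOF
```

Ratios for the very short tests (for example reverse) should be read with particular caution, since those times are the most affected by launch overhead.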