*Copyright 2024 The Specials Authors. Licensed under the Apache License, Version 2.0 (the "License").*

_Some of the code in this file is adapted from:_

_modularml/mojo_<br>
_Copyright (c) 2023, Modular Inc._<br>
_Licensed under the Apache License v2.0 with LLVM Exceptions._

# Elementary Functions in Specials

In this Jupyter notebook, we compare the implementations of elementary functions from the Specials package with those found in well-known Mojo and Python packages.

Although the Mojo standard library implements all or most of the elementary functions found in Specials, we chose to implement our own versions to match the package's priorities. For us, **Accuracy `>` Performance**: when forced to choose between FLOPS and numerical accuracy, we prefer numerical accuracy.

This prioritization is justified by the following fact: when these elementary functions do not exhibit near-perfect accuracy, minor errors can propagate and have a significant impact on the special functions that depend on them.

<div class="alert alert-block alert-info">Whenever possible, we parallelize the application of elementary functions to tensors in Mojo. To force sequential execution, set the <code>force_sequential</code> alias to <code>True</code>.</div>

In [1]:
# Option to force sequential execution in Mojo for elementwise operations
alias force_sequential = False


## Table of Contents

- Experimental Settings
  * Domains
  * Evaluation Metrics
    - Accuracy
    - Runtime Performance
  * Packages and Auxiliary Functions
- Experiment Results
  * Exp
  * Exp2
  * Expm1
  * Log
  * Log1p
- Appendix A: System information

## Experimental Settings

This section describes the experimental settings: the domains from which arguments are sampled, the metrics used to evaluate accuracy and runtime performance, and the packages and auxiliary functions used to run the experiments.

### Domains

For each elementary function, we uniformly sample 100,000 argument values from intervals of the form $[a_i, b_i]$, referred to as _domains_, where $a_i$ and $b_i$ are the minimum and maximum values of each domain, respectively.

In [2]:
# Sample size
var num_samples = 100_000


We repeat each experiment for single- and double-precision floating-point inputs (`float32` and `float64`).
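
As a rough illustration of the sampling step, the following Python sketch draws uniform samples from a single domain and optionally rounds them to single precision. The helpers `to_float32` and `sample_domain` are hypothetical names introduced here for illustration; they are not part of Specials or `test_utils`, and the actual sampling is presumably handled inside `run_experiment`.

```python
import random
import struct


def to_float32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def sample_domain(a, b, num_samples, single_precision=False, seed=0):
    """Draw `num_samples` values uniformly from the interval [a, b].

    Hypothetical helper: the seed and the rounding step are assumptions
    made for this sketch, not details taken from the notebook.
    """
    rng = random.Random(seed)
    xs = [rng.uniform(a, b) for _ in range(num_samples)]
    if single_precision:
        # Rounding is monotonic, so the samples stay within the rounded bounds.
        xs = [to_float32(x) for x in xs]
    return xs


samples = sample_domain(-1e-5, 1e-5, 1000, single_precision=True)
```

Rounding the samples before evaluation matters because the error is measured for inputs that are exactly representable in the tested precision.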

### Evaluation Metrics

In this section, we present the metrics used for accuracy and runtime performance evaluation.

#### Accuracy

To evaluate the accuracy of function implementations, we measure _error in ulps_. Ulp stands for _unit in the last place_. In this Jupyter notebook, we use the Kahan definition of $\text{ulp}(y)$, where $y$ is an arbitrary real number:

> $\text{ulp}(y)$ is the width of the interval whose endpoints are the two finite floating-point numbers nearest $y$, even if $y$ is not contained in that interval.

Let $\hat{f}$ be an implementation of the mathematical function $f$. Given a representation $\mathbb{T}$ with finite precision (e.g., `float32`) and an input $x \in \mathbb{T}$, the floating-point number $\hat{y} \equiv \hat{f}(x)$ is an approximation of the real number $y \equiv f(x)$. The error in ulps of $\hat{y}$ relative to $y$, $E_{\text{ulps}}(y, \hat{y})$, is given by:

$$E_{\text{ulps}}(y, \hat{y}) = \frac{|y - \hat{y}|}{\text{ulp}(y)}.$$

Ideally, this error is always less than 0.5 in round-to-nearest mode for any pair $(y, \hat{y})$. In fact, this metric has the following interesting property (here we assume that $\mathbb{T}$ is a binary floating-point representation):

> $E_{\text{ulps}}(y, \hat{y}) < 0.5$ if, and only if, $\hat{y} = \text{RN}(y)$, where $\text{RN}(\cdot)$ is the round-to-nearest function.

Because the exact value $y$ is generally not representable in $\mathbb{T}$, we approximate it with high precision using the Python library [`mpmath`](https://mpmath.org/).
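
To make the metric concrete, here is a minimal sketch of how the error in ulps could be computed for `float64` outputs, given a high-precision reference value as a decimal string. It uses exact rational arithmetic from the standard library rather than `mpmath`, and the names `kahan_ulp` and `error_in_ulps` are hypothetical. It assumes the reference lies well inside the finite `float64` range, and it is not the code used by `run_experiment`.

```python
import math
from fractions import Fraction


def kahan_ulp(y):
    """Width of the interval spanned by the two binary64 numbers nearest y.

    `y` is an exact Fraction, assumed to lie well inside the finite range.
    """
    nearest = float(y)  # Fraction-to-float conversion is correctly rounded
    below = math.nextafter(nearest, -math.inf)
    above = math.nextafter(nearest, math.inf)
    # The second-nearest float is one of the two neighbours of the nearest one.
    if abs(y - Fraction(below)) <= abs(y - Fraction(above)):
        second = below
    else:
        second = above
    return abs(Fraction(nearest) - Fraction(second))


def error_in_ulps(y_ref, y_hat):
    """Error of the float `y_hat` relative to the decimal string `y_ref`."""
    y = Fraction(y_ref)
    return float(abs(y - Fraction(y_hat)) / kahan_ulp(y))


# Reference value of e to 40 significant digits.
E_REF = "2.718281828459045235360287471352662497757"
err_nearest = error_in_ulps(E_REF, math.e)
err_next = error_in_ulps(E_REF, math.nextafter(math.e, math.inf))
print(err_nearest, err_next)
```

Here `math.e` is the `float64` value nearest to $e$, so its error stays below 0.5 ulps, while the next representable number above it lands above 0.5 ulps.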

To compare different implementations in terms of accuracy, we calculate the maximum and the root mean square (RMS) of the observed errors for each combination of implementation and domain, with lower values indicating higher accuracy.
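
The two summary statistics can be sketched in a few lines of Python. The function name `max_and_rms` is a hypothetical helper used only for illustration.

```python
import math


def max_and_rms(errors):
    """Return (maximum, root mean square) of a non-empty sequence of errors."""
    max_err = max(errors)
    rms_err = math.sqrt(sum(e * e for e in errors) / len(errors))
    return max_err, rms_err


# For comparison: if errors were spread uniformly over [0, 0.5],
# the RMS would be 0.5 / sqrt(3), roughly 0.2887.
uniform_rms = 0.5 / math.sqrt(3.0)

max_err, rms_err = max_and_rms([0.3, 0.4])
print(max_err, rms_err, uniform_rms)
```

The uniform-error value is a useful yardstick: an RMS error close to 0.289 ulps together with a maximum near 0.5 ulps is what one would expect from results that are almost always correctly rounded.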

#### Runtime Performance

For runtime performance, we measure the execution time using the `benchmark` module in Mojo and a custom function based on the `timeit` module in Python.

We assess runtime performance by calculating the average execution time for each implementation-domain combination, with lower values denoting better performance.
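
On the Python side, an average execution time in milliseconds could be obtained with `timeit` roughly as follows. This is only a sketch: `average_time_msecs` and its `repeat` and `number` defaults are assumptions, not the custom function actually used in these experiments.

```python
import math
import timeit


def average_time_msecs(func, data, repeat=5, number=20):
    """Average wall-clock time of one call `func(data)`, in milliseconds."""
    timer = timeit.Timer(lambda: func(data))
    # Each entry is the total time, in seconds, of `number` consecutive calls.
    totals = timer.repeat(repeat=repeat, number=number)
    return 1e3 * sum(totals) / (repeat * number)


def exp_all(xs):
    """Apply math.exp elementwise to a list of floats."""
    return [math.exp(x) for x in xs]


elapsed = average_time_msecs(exp_all, [0.001 * i for i in range(1000)])
print(f"{elapsed:.3f} msecs")
```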

### Packages and Auxiliary Functions

In this section, we import packages and auxiliary functions essential for conducting our experiments and measuring results.

In [3]:
%%python

import mpmath as mp
import numpy as np


In [4]:
import math
import specials

from test_utils.benchmark import run_experiment


## Experiment Results

This section reports the results of our experiments for each elementary function evaluated.


### Exp

This section shows the experiment results for `exp`, which computes the exponential of `x`.

In [5]:
%%python

def numpy_exp(x):
    """Computes the exponential of a given array using NumPy."""
    return np.exp(x)


def mpmath_exp(x):
    """Computes the exponential of a given array using mpmath."""
    def _mp_exp_impl(a):
        with mp.workdps(50):
            res = mp.exp(mp.mpf(a))
        return mp.nstr(res, 40)

    return np.frompyfunc(_mp_exp_impl, 1, 1)(x)


In [6]:
run_experiment[
    num_domains=5,
    specials_func = specials.exp,
    mojo_func = math.exp,
    type = DType.float32,
    force_sequential=force_sequential,
](
    experiment_name="Exp",
    num_samples=num_samples,
    min_values=StaticTuple[Float32, 5](-1e-5, -1.0, -4.0, -20.0, -87.0),
    max_values=StaticTuple[Float32, 5](1e-5, 1.0, 4.0, 20.0, 87.0),
    truth_func=mpmath_exp,
    python_func=numpy_exp,
    python_func_name="NumPy",
)



Experiment: Exp (float32)

                               Maximum Error          RMS Error    Average Execution
Domain        Solution       Observed (ulps)    Observed (ulps)         Time (msecs)
------------  -----------  -----------------  -----------------  -------------------
-1e-05,1e-05  Specials                 0.500              0.288                0.205
              Mojo Stdlib              0.500              0.288                0.068
              NumPy                    1.492              0.568                0.169
------------  -----------  -----------------  -----------------  -------------------
-1,1          Specials                 0.513              0.289                0.203
              Mojo Stdlib              0.774              0.295                0.068
              NumPy                    2.156              0.572                0.167
------------  -----------  -----------------  -----------------  -------------------
-4,4          Specials               

In [7]:
run_experiment[
    num_domains=5,
    specials_func = specials.exp,
    mojo_func = math.exp,
    type = DType.float64,
    force_sequential=force_sequential,
](
    experiment_name="Exp",
    num_samples=num_samples,
    min_values=StaticTuple[Float64, 5](-1e-14, -1.0, -8.0, -80.0, -708.0),
    max_values=StaticTuple[Float64, 5](1e-14, 1.0, 8.0, 80.0, 708.0),
    truth_func=mpmath_exp,
    python_func=numpy_exp,
    python_func_name="NumPy",
)



Experiment: Exp (float64)

                               Maximum Error          RMS Error    Average Execution
Domain        Solution       Observed (ulps)    Observed (ulps)         Time (msecs)
------------  -----------  -----------------  -----------------  -------------------
-1e-14,1e-14  Specials                 0.500              0.289                0.373
              Mojo Stdlib              0.500              0.289                0.198
              NumPy                    0.500              0.289                0.709
------------  -----------  -----------------  -----------------  -------------------
-1,1          Specials                 0.516              0.289                0.374
              Mojo Stdlib          2,543.586          1,512.896                0.191
              NumPy                    0.504              0.289                0.695
------------  -----------  -----------------  -----------------  -------------------
-8,8          Specials               

### Exp2

This section shows the experiment results for `exp2`, which computes the base-2 exponential of `x`.
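
The widest `float64` domain used in the experiments below, $[-1022, 1023]$, coincides with the exponent range of normal binary64 numbers (the binary32 analogue is $[-126, 127]$). The following sketch checks this fact with the standard library; `overflows` is a hypothetical helper, and the notebook itself does not state why these endpoints were chosen.

```python
import math
import sys

largest_pow2 = math.ldexp(1.0, 1023)      # 2**1023, the largest finite power of two
smallest_normal = math.ldexp(1.0, -1022)  # 2**-1022, the smallest normal number


def overflows(k):
    """Return True if 2**k exceeds the finite binary64 range."""
    try:
        math.ldexp(1.0, k)
    except OverflowError:
        return True
    return False


print(largest_pow2, smallest_normal, overflows(1024))
```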

In [8]:
%%python

def numpy_exp2(x):
    """Computes the base-2 exponential of a given array using NumPy."""
    return np.exp2(x)


def mpmath_exp2(x):
    """Computes the base-2 exponential of a given array using mpmath."""
    def _mp_exp2_impl(a):
        with mp.workdps(50):
            res = mp.power(mp.mpf(2), mp.mpf(a))
        return mp.nstr(res, 40)

    return np.frompyfunc(_mp_exp2_impl, 1, 1)(x)


In [9]:
run_experiment[
    num_domains=5,
    specials_func = specials.exp2,
    mojo_func = math.exp2,
    type = DType.float32,
    force_sequential=force_sequential,
](
    experiment_name="Exp2",
    num_samples=num_samples,
    min_values=StaticTuple[Float32, 5](-1e-5, -1.0, -5.0, -25.0, -126.0),
    max_values=StaticTuple[Float32, 5](1e-5, 1.0, 5.0, 25.0, 127.0),
    truth_func=mpmath_exp2,
    python_func=numpy_exp2,
    python_func_name="NumPy",
)



Experiment: Exp2 (float32)

                               Maximum Error          RMS Error    Average Execution
Domain        Solution       Observed (ulps)    Observed (ulps)         Time (msecs)
------------  -----------  -----------------  -----------------  -------------------
-1e-05,1e-05  Specials                 0.500              0.289                0.206
              Mojo Stdlib              0.500              0.289                0.056
              NumPy                    0.500              0.289                0.596
------------  -----------  -----------------  -----------------  -------------------
-1,1          Specials                 0.519              0.289                0.203
              Mojo Stdlib            115.593             46.483                0.055
              NumPy                    0.501              0.289                0.601
------------  -----------  -----------------  -----------------  -------------------
-5,5          Specials              

In [10]:
run_experiment[
    num_domains=5,
    specials_func = specials.exp2,
    mojo_func = math.exp2,
    type = DType.float64,
    force_sequential=force_sequential,
](
    experiment_name="Exp2",
    num_samples=num_samples,
    min_values=StaticTuple[Float64, 5](-1e-14, -1.0, -10.0, -100.0, -1022.0),
    max_values=StaticTuple[Float64, 5](1e-14, 1.0, 10.0, 100.0, 1023.0),
    truth_func=mpmath_exp2,
    python_func=numpy_exp2,
    python_func_name="NumPy",
)



Experiment: Exp2 (float64)

                               Maximum Error          RMS Error    Average Execution
Domain        Solution       Observed (ulps)    Observed (ulps)         Time (msecs)
------------  -----------  -----------------  -----------------  -------------------
-1e-14,1e-14  Specials                 0.500              0.289                0.398
              Mojo Stdlib              0.500              0.289                0.139
              NumPy                    0.500              0.289                0.646
------------  -----------  -----------------  -----------------  -------------------
-1,1          Specials                 0.518              0.289                0.390
              Mojo Stdlib           6.19e+10           2.50e+10                0.144
              NumPy                    0.507              0.289                0.670
------------  -----------  -----------------  -----------------  -------------------
-10,10        Specials              

### Expm1

This section shows the experiment results for `expm1`, which computes `exp(x) - 1` in a numerically stable way.
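
To see why a dedicated `expm1` matters, consider a small argument: computing `exp(x) - 1` directly subtracts two nearly equal numbers and loses most of the significant digits. The sketch below uses Python's `math` module and a two-term Taylor expansion as the reference; it is an illustration only and does not involve the Specials implementation.

```python
import math

x = 1e-10
naive = math.exp(x) - 1.0   # suffers from catastrophic cancellation
stable = math.expm1(x)      # computed without the intermediate rounding to ~1.0

# Two-term Taylor reference; the next term, x**3 / 6, is far below float64 resolution.
reference = x + x * x / 2.0

naive_rel_err = abs(naive - reference) / reference
stable_rel_err = abs(stable - reference) / reference
print(naive_rel_err, stable_rel_err)
```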

In [11]:
%%python

def numpy_expm1(x):
    """Computes `exp(x) - 1` for all elements in the given array using NumPy."""
    return np.expm1(x)


def mpmath_expm1(x):
    """Computes `exp(x) - 1` for all elements in the given array using mpmath."""
    def _mp_expm1_impl(a):
        with mp.workdps(50):
            res = mp.expm1(mp.mpf(a))
        return mp.nstr(res, 40)

    return np.frompyfunc(_mp_expm1_impl, 1, 1)(x)


In [12]:
run_experiment[
    num_domains=5,
    specials_func = specials.expm1,
    mojo_func = math.expm1,
    type = DType.float32,
    force_sequential=force_sequential,
](
    experiment_name="Expm1",
    num_samples=num_samples,
    min_values=StaticTuple[Float32, 5](-1e-5, -1.0, -4.0, -20.0, -87.0),
    max_values=StaticTuple[Float32, 5](1e-5, 1.0, 4.0, 20.0, 87.0),
    truth_func=mpmath_expm1,
    python_func=numpy_expm1,
    python_func_name="NumPy",
)



Experiment: Expm1 (float32)

                               Maximum Error          RMS Error    Average Execution
Domain        Solution       Observed (ulps)    Observed (ulps)         Time (msecs)
------------  -----------  -----------------  -----------------  -------------------
-1e-05,1e-05  Specials                 0.500              0.288                0.123
              Mojo Stdlib              0.500              0.288                0.419
              NumPy                    0.500              0.288                0.793
------------  -----------  -----------------  -----------------  -------------------
-1,1          Specials                 0.537              0.289                0.329
              Mojo Stdlib              0.778              0.311                0.807
              NumPy                    0.778              0.311                1.203
------------  -----------  -----------------  -----------------  -------------------
-4,4          Specials             

In [13]:
run_experiment[
    num_domains=5,
    specials_func = specials.expm1,
    mojo_func = math.expm1,
    type = DType.float64,
    force_sequential=force_sequential,
](
    experiment_name="Expm1",
    num_samples=num_samples,
    min_values=StaticTuple[Float64, 5](-1e-14, -1.0, -8.0, -80.0, -708.0),
    max_values=StaticTuple[Float64, 5](1e-14, 1.0, 8.0, 80.0, 708.0),
    truth_func=mpmath_expm1,
    python_func=numpy_expm1,
    python_func_name="NumPy",
)



Experiment: Expm1 (float64)

                               Maximum Error          RMS Error    Average Execution
Domain        Solution       Observed (ulps)    Observed (ulps)         Time (msecs)
------------  -----------  -----------------  -----------------  -------------------
-1e-14,1e-14  Specials                 0.500              0.288                0.261
              Mojo Stdlib              0.500              0.288                0.419
              NumPy                    0.500              0.288                0.760
------------  -----------  -----------------  -----------------  -------------------
-1,1          Specials                 0.540              0.289                0.556
              Mojo Stdlib              0.790              0.311                0.605
              NumPy                    0.790              0.311                1.243
------------  -----------  -----------------  -----------------  -------------------
-8,8          Specials             

### Log

This section shows the experiment results for `log`, which computes the natural logarithm of `x`.

In [14]:
%%python

def numpy_log(x):
    """Computes the natural logarithm for all elements in the given array using NumPy."""
    return np.log(x)


def mpmath_log(x):
    """Computes the natural logarithm for all elements in the given array using mpmath."""
    def _mp_log_impl(a):
        with mp.workdps(50):
            res = mp.log(mp.mpf(a))
        return mp.nstr(res, 40)

    return np.frompyfunc(_mp_log_impl, 1, 1)(x)


In [15]:
var xone_inf: Float32 = 0.9394130628134757861197108246223050845246808905494418220094926620
var xone_sup: Float32 = 1.0644944589178594295633905946428896731007254436493533015193075106


In [16]:
run_experiment[
    num_domains=6,
    specials_func = specials.log,
    mojo_func = math.log,
    type = DType.float32,
    force_sequential=force_sequential,
](
    experiment_name="Log",
    num_samples=num_samples,
    min_values=StaticTuple[Float32, 6](
        2e-38, xone_inf, xone_sup, 5.0, 100.0, 1e20
    ),
    max_values=StaticTuple[Float32, 6](
        xone_inf, xone_sup, 5.0, 100.0, 1e20, 3e38
    ),
    truth_func=mpmath_log,
    python_func=numpy_log,
    python_func_name="NumPy",
)



Experiment: Log (float32)

                              Maximum Error          RMS Error    Average Execution
Domain       Solution       Observed (ulps)    Observed (ulps)         Time (msecs)
-----------  -----------  -----------------  -----------------  -------------------
2e-38,0.9    Specials                 0.527              0.289                0.376
             Mojo Stdlib              0.832              0.307                0.099
             NumPy                    2.964              0.465                0.245
-----------  -----------  -----------------  -----------------  -------------------
0.9,1.1      Specials                 0.501              0.289                0.190
             Mojo Stdlib              0.546              0.289                0.095
             NumPy                    2.810              0.643                0.245
-----------  -----------  -----------------  -----------------  -------------------
1.1,5        Specials                 0.535     

In [17]:
var xone_inf: Float64 = 0.9394130628134757861197108246223050845246808905494418220094926620
var xone_sup: Float64 = 1.0644944589178594295633905946428896731007254436493533015193075106


In [18]:
run_experiment[
    num_domains=6,
    specials_func = specials.log,
    mojo_func = math.log,
    type = DType.float64,
    force_sequential=force_sequential,
](
    experiment_name="Log",
    num_samples=num_samples,
    min_values=StaticTuple[Float64, 6](
        3e-308, xone_inf, xone_sup, 5.0, 100.0, 1e155
    ),
    max_values=StaticTuple[Float64, 6](
        xone_inf, xone_sup, 5.0, 100.0, 1e155, 1e308
    ),
    truth_func=mpmath_log,
    python_func=numpy_log,
    python_func_name="NumPy",
)



Experiment: Log (float64)

                                Maximum Error          RMS Error    Average Execution
Domain         Solution       Observed (ulps)    Observed (ulps)         Time (msecs)
-------------  -----------  -----------------  -----------------  -------------------
3e-308,0.9     Specials                 0.537              0.289                0.991
               Mojo Stdlib     21,116,084.236      3,592,310.488                0.183
               NumPy                    0.514              0.289                0.577
-------------  -----------  -----------------  -----------------  -------------------
0.9,1.1        Specials                 0.504              0.289                0.367
               Mojo Stdlib        139,420.868         76,557.502                0.185
               NumPy                    0.503              0.289                0.662
-------------  -----------  -----------------  -----------------  -------------------
1.1,5          Specials   

### Log1p

This section shows the experiment results for `log1p`, which computes `log(1 + x)` in a numerically stable way.
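
A short example shows what "numerically stable" means here. For a tiny argument, `1 + x` rounds to exactly `1.0` in `float64`, so the naive formula returns zero and all information about `x` is lost. The sketch below relies only on Python's `math` module and is not tied to the Specials implementation.

```python
import math

x = 1e-17
naive = math.log(1.0 + x)  # 1.0 + x rounds to 1.0, so this evaluates log(1.0)
stable = math.log1p(x)     # keeps the leading term of x - x**2 / 2 + ...

stable_rel_err = abs(stable - x) / x
print(naive, stable)
```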

In [19]:
%%python

def numpy_log1p(x):
    """Computes `log(1 + x)` for all elements in the given array using NumPy."""
    return np.log1p(x)


def mpmath_log1p(x):
    """Computes `log(1 + x)` for all elements in the given array using mpmath."""
    def _mp_log1p_impl(a):
        with mp.workdps(50):
            res = mp.log1p(mp.mpf(a))
        return mp.nstr(res, 40)

    return np.frompyfunc(_mp_log1p_impl, 1, 1)(x)


In [20]:
var xsml_inf: Float32 = -0.060586937186524213880289175377694915475319109450558177990507337
var xsml_sup: Float32 = 0.0644944589178594295633905946428896731007254436493533015193075106


In [21]:
run_experiment[
    num_domains=6,
    specials_func = specials.log1p,
    mojo_func = math.log1p,
    type = DType.float32,
    force_sequential=force_sequential,
](
    experiment_name="Log1p",
    num_samples=num_samples,
    min_values=StaticTuple[Float32, 6](-1e-5, xsml_inf, -1.0, 1.0, 5.0, 1e7),
    max_values=StaticTuple[Float32, 6](1e-5, xsml_sup, 1.0, 5.0, 1e7, 3e38),
    truth_func=mpmath_log1p,
    python_func=numpy_log1p,
    python_func_name="NumPy",
)



Experiment: Log1p (float32)

                               Maximum Error          RMS Error    Average Execution
Domain        Solution       Observed (ulps)    Observed (ulps)         Time (msecs)
------------  -----------  -----------------  -----------------  -------------------
-1e-05,1e-05  Specials                 0.500              0.288                0.131
              Mojo Stdlib              0.500              0.288                0.476
              NumPy                    0.500              0.288                0.843
------------  -----------  -----------------  -----------------  -------------------
-0.1,0.1      Specials                 0.501              0.288                0.127
              Mojo Stdlib              0.549              0.289                0.441
              NumPy                    0.549              0.289                0.827
------------  -----------  -----------------  -----------------  -------------------
-1,1          Specials             

In [22]:
var xsml_inf: Float64 = -0.060586937186524213880289175377694915475319109450558177990507337
var xsml_sup: Float64 = 0.0644944589178594295633905946428896731007254436493533015193075106


In [23]:
run_experiment[
    num_domains=6,
    specials_func = specials.log1p,
    mojo_func = math.log1p,
    type = DType.float64,
    force_sequential=force_sequential,
](
    experiment_name="Log1p",
    num_samples=num_samples,
    min_values=StaticTuple[Float64, 6](-1e-14, xsml_inf, -1.0, 1.0, 5.0, 1e15),
    max_values=StaticTuple[Float64, 6](1e-14, xsml_sup, 1.0, 5.0, 1e15, 1e308),
    truth_func=mpmath_log1p,
    python_func=numpy_log1p,
    python_func_name="NumPy",
)



Experiment: Log1p (float64)

                               Maximum Error          RMS Error    Average Execution
Domain        Solution       Observed (ulps)    Observed (ulps)         Time (msecs)
------------  -----------  -----------------  -----------------  -------------------
-1e-14,1e-14  Specials                 0.500              0.288                0.275
              Mojo Stdlib              0.500              0.288                0.260
              NumPy                    0.500              0.288                0.635
------------  -----------  -----------------  -----------------  -------------------
-0.1,0.1      Specials                 0.506              0.289                0.284
              Mojo Stdlib              0.549              0.289                0.476
              NumPy                    0.549              0.289                0.894
------------  -----------  -----------------  -----------------  -------------------
-1,1          Specials             

## Appendix A: System information

The cells below report information about the system used to run the experiments.

In [24]:
%%python

import subprocess

subprocess.run(["modular", "-v"])
subprocess.run(["mojo", "-v"])


modular 0.7.2 (d0adc668)
mojo 24.2.1 (2f0dcf11)


In [25]:
from sys.info import (
    os_is_linux,
    os_is_windows,
    os_is_macos,
    has_sse4,
    has_avx,
    has_avx2,
    has_avx512f,
    has_vnni,
    has_neon,
    is_apple_m1,
    has_intel_amx,
    num_physical_cores,
    _current_target,
    _current_cpu,
    _triple_attr,
)

var os: StringLiteral
if os_is_linux():
    os = "linux"
elif os_is_macos():
    os = "macOS"
else:
    os = "windows"

var cpu = String(_current_cpu())
var arch = String(_triple_attr())

var cpu_features = String("")
if has_sse4():
    cpu_features += " sse4"
if has_avx():
    cpu_features += " avx"
if has_avx2():
    cpu_features += " avx2"
if has_avx512f():
    cpu_features += " avx512f"
if has_vnni():
    if has_avx512f():
        cpu_features += " avx512_vnni"
    else:
        cpu_features += " avx_vnni"
if has_intel_amx():
    cpu_features += " intel_amx"
if has_neon():
    cpu_features += " neon"
if is_apple_m1():
    cpu_features += " apple_m1"

if len(cpu_features) > 0:
    cpu_features = cpu_features[1:]

print("System Information")
print("  OS                   :", os)
print("  CPU                  :", cpu)
print("  Arch                 :", arch)
print("  Num Physical Cores   :", num_physical_cores())
print("  CPU Features         :", cpu_features)


System Information
  OS                   : linux
  CPU                  : haswell
  Arch                 : x86_64-unknown-linux-gnu
  Num Physical Cores   : 4
  CPU Features         : sse4 avx avx2


In [26]:
%%python
import sys
from importlib.metadata import version

def get_version(package):
    """Returns the installed version of a Python package."""
    # importlib.metadata replaces the deprecated pkg_resources API
    return version(package)

print("mpmath version:", mp.__version__)
print("NumPy version:", np.__version__)
print("Python version:", sys.version)
print("Tabulate version:", get_version("tabulate"))


mpmath version: 1.3.0
NumPy version: 1.26.4
Python version: 3.11.8 | packaged by conda-forge | (main, Feb 16 2024, 21:14:50) [GCC 12.3.0]
Tabulate version: 0.9.0
