In [4]:
import numpy as np

## Overflow

In [7]:
2.0**1023

8.98846567431158e+307

In [8]:
2.0**1024

OverflowError: (34, 'Result too large')
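
The boundary seen above can be checked directly. The following is a minimal sketch, assuming IEEE double precision (the format Python floats use on common platforms); the variable names `largest` and `overflowed` are illustrative only.

```python
import sys
import numpy as np

# Largest finite double, roughly 1.8e308.
largest = sys.float_info.max
print(largest)
print(largest == np.finfo(np.float64).max)

# 2.0**1023 is still representable; 2.0**1024 is not.
print(2.0**1023 < largest)

# A NumPy float64 scalar overflows to inf instead of raising OverflowError.
with np.errstate(over='ignore'):   # silence the overflow RuntimeWarning
    overflowed = np.float64(2.0)**1024
print(np.isinf(overflowed))
```

Note the difference in behaviour: a plain Python float raises `OverflowError`, while NumPy returns `inf` and (by default) emits a warning.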

**Overflow Example 1**

We want to compute

$$
\sqrt{x^2+y^2} \qquad \mbox{with }x=10^{300} \mbox{ and } y=3\cdot 10^{300}
$$

In [2]:
x = 1e300
x

1e+300

In [3]:
y = 3*x
y

3e+300

In [4]:
np.sqrt(x**2+y**2)

OverflowError: (34, 'Result too large')

We get an overflow error because $x^2 = 10^{600}$ is larger than the largest double-precision floating-point number (about $1.8\cdot 10^{308}$).

We can avoid the overflow by scaling $x$ and $y$ by the larger of their absolute values:

$$
\sqrt{x^2+y^2} = \max\{|x|,|y|\}\cdot\sqrt{\left(\frac{|x|}{\max\{|x|,|y|\}}\right)^2 + \left(\frac{|y|}{\max\{|x|,|y|\}}\right)^2}
$$

In [5]:
# scale x and y by the larger absolute value, max{|x|,|y|}
max_value = np.max(np.abs([x,y]))
max_value

3e+300

In [6]:
max_value*np.sqrt((x/max_value)**2+(y/max_value)**2)

3.1622776601683795e+300
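
The scaling trick can be wrapped in a small helper. This is a sketch only: `scaled_hypot` is a hypothetical name, and the zero check is an added assumption to avoid dividing by zero when both inputs are 0. The standard functions `math.hypot` and `np.hypot` compute the same quantity and are printed for comparison.

```python
import math
import numpy as np

def scaled_hypot(x, y):
    """Compute sqrt(x**2 + y**2) after scaling by max(|x|, |y|)."""
    scale = max(abs(x), abs(y))
    if scale == 0.0:
        return 0.0          # both inputs are zero; avoid 0/0
    return scale * math.sqrt((x / scale)**2 + (y / scale)**2)

x, y = 1e300, 3e300
print(scaled_hypot(x, y))
print(math.hypot(x, y))     # standard-library equivalent
print(np.hypot(x, y))       # NumPy equivalent
```

After scaling, both ratios have magnitude at most 1, so the sum under the square root lies between 1 and 2 and cannot overflow.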

**Overflow Example 2: the Softmax Function**

The softmax function takes as input an $n$-dimensional vector $x=[x_1,\ldots,x_n]$, and returns a vector $g(x)=[g_1(x),\ldots,g_n(x)]$ with entries

$$
g_j(x) = 
\frac{e^{x_j}}{\sum_{i=1}^n e^{x_i}}, \quad j=1,2,\ldots, n.
$$

The elements of $g$ are all between 0 and 1 and they sum to 1.
Softmax is a key function in machine learning algorithms.

In [14]:
x = np.array([10,2,30,-1])
x

array([10,  2, 30, -1])

In [15]:
np.exp(x)

array([2.20264658e+04, 7.38905610e+00, 1.06864746e+13, 3.67879441e-01])

In [16]:
np.sum(np.exp(x))

10686474603558.686

In [18]:
# softmax function at x
np.exp(x)/np.sum(np.exp(x))

array([2.06115362e-09, 6.91440009e-13, 9.99999998e-01, 3.44247710e-14])

A problem with evaluating softmax numerically is that the exponentials can overflow even for quite modest values of $x_i$, even though $g(x)$ itself, whose entries lie between 0 and 1, cannot overflow.
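
To make "modest" concrete, the sketch below estimates where `np.exp` starts to overflow in double precision. The name `threshold` is illustrative, not from the original notebook.

```python
import numpy as np

# exp(x) overflows once x exceeds the log of the largest double (about 709.78).
threshold = np.log(np.finfo(np.float64).max)
print(threshold)

with np.errstate(over='ignore'):       # silence the overflow RuntimeWarning
    print(np.isfinite(np.exp(709.0)))  # just below the threshold
    print(np.isinf(np.exp(710.0)))     # just above the threshold
```

So any entry $x_i$ larger than about 709 is enough to make the naive formula fail.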

In [28]:
x = np.array([790,650,750,700,780])
x

array([790, 650, 750, 700, 780])

In [29]:
np.exp(x)/np.sum(np.exp(x))

array([nan,  0., nan,  0., nan])

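The exponentials of the entries 790, 750 and 780 overflow to `inf`, so the corresponding quotients are `inf/inf = nan`. A standard solution is to incorporate a shift, $a$, and use the formula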

$$
g_j(x) = 
\frac{e^{-a}e^{x_j}}{e^{-a}\sum_{i=1}^n e^{x_i}}
=\frac{e^{x_j-a}}{\sum_{i=1}^n e^{x_i-a}},
\quad j=1,2,\ldots, n,
$$

where $a$ is usually set to $a=\max\{x_1,x_2,\ldots,x_n\}$.

In [30]:
a = np.max(x)
a

790

In [31]:
np.exp(x-a)/np.sum(np.exp(x-a))

array([9.99954602e-01, 1.58034831e-61, 4.24816139e-18, 8.19364063e-40,
       4.53978687e-05])
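
The shifted formula can be collected into a reusable function. This is a minimal sketch for a one-dimensional input, assuming the shift $a=\max_i x_i$ described above; the name `softmax` and the conversion to a float array are choices made here, not taken from the notebook.

```python
import numpy as np

def softmax(x):
    """Shifted softmax: subtract max(x) before exponentiating."""
    x = np.asarray(x, dtype=float)
    shifted = x - np.max(x)   # largest entry becomes 0, so every exp(...) <= 1
    e = np.exp(shifted)
    return e / np.sum(e)

g = softmax([790, 650, 750, 700, 780])
print(g)
print(np.sum(g))
```

With this shift the denominator is at least 1, because the largest entry contributes $e^0=1$, so the division is always well defined. Very negative shifted entries may underflow to 0, which is harmless here.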

## Checking the Condition Number Rule of Thumb

In [10]:
def f(x):
    return x-1

In [24]:
def condf(x):
    # relative condition number of f: |x f'(x)| / |f(x)|, with f'(x) = 1
    return np.abs(x)/np.abs(x-1)
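
For reference, `condf` appears to implement the usual relative condition number of a differentiable function, specialised to $f(x)=x-1$. This reading, and the symbol $\kappa_f$, are assumptions based on the code rather than something stated in the notebook:

$$
\kappa_f(x) = \left|\frac{x\,f'(x)}{f(x)}\right| = \frac{|x|}{|x-1|},
\qquad
\frac{|\Delta y|}{|y|} \approx \kappa_f(x)\,\frac{|\Delta x|}{|x|}.
$$

Since $f'(x)=1$, the condition number grows without bound as $x\to 1$, which is the regime the experiment below explores.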

In [25]:
xlist = [0.9,0.99,0.999,0.9999,0.99999,0.999999,0.9999999,0.99999999]

In [26]:
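# Delta x (for the x values below, x + error rounds to the next double
# above x, so the perturbation actually applied is 2**-53, about 1.11e-16)
error = 1e-16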

for x in xlist:

    # x + error (x + Delta x)
    xdx = x + error

    # y value
    y = f(x)

    # y + Delta y
    ydy = f(xdx)

    print('x value')
    print(x)
    print('relative error in x')
    print(np.abs(error)/np.abs(x))
    print('relative error in y')
    print(np.abs(ydy-y)/np.abs(y))
    print('condition number at x')
    print(condf(x))
    print('-------------------------')
    

x value
0.9
relative error in x
1.111111111111111e-16
relative error in y
1.1102230246251567e-15
condition number at x
9.000000000000002
-------------------------
x value
0.99
relative error in x
1.0101010101010101e-16
relative error in y
1.1102230246251556e-14
condition number at x
98.99999999999991
-------------------------
x value
0.999
relative error in x
1.001001001001001e-16
relative error in y
1.1102230246251555e-13
condition number at x
998.9999999999991
-------------------------
x value
0.9999
relative error in x
1.000100010001e-16
relative error in y
1.1102230246252787e-12
condition number at x
9999.0000000011
-------------------------
x value
0.99999
relative error in x
1.000010000100001e-16
relative error in y
1.1102230246302091e-11
condition number at x
99999.0000004551
-------------------------
x value
0.999999
relative error in x
1.000001000001e-16
relative error in y
1.1102230245932314e-10
condition number at x
999998.9999712444
-------------------------
x value
0.99999