## Elizabeth Daly 
## HDip Data Analytics 2020

### Machine Learning & Statistics Tasks
***

## 1: Function to calculate the square root of 2 to 100 decimal places 

For this task, we have been asked to compute the square root of 2, to 100 decimal places, without using any imported module, or any module from the standard library in Python. There is indeed a math module that could be used to complete this task easily, but we will only use it to provide a comparison with our solution later on [1].
Common methods to calculate square roots start by providing an initial guess for the square root, and then iterating through an algorithm to produce a better guess each time through the algorithm [2]. Iteration stops when a required accuracy or tolerance is reached, or when some maximum number of iterations has taken place for slowly-converging algorithms. 

Newton's method is a very common approach and the one we will use for this task [3, 4, 5]. It can be used to calculate the square root $s$ of a number $x$, where $x \gt 0$ and $s = \sqrt x $. Starting with an initial guess for the square root, $s_0$, the algorithm computes a better guess using the formula

$$ s_{n+1} = s_n - \left ( \frac{s_n^2 - x}{2 s_n} \right ). $$

Here, $s_{n}$ is the previous estimate for the square root and $s_{n+1}$ is the updated/better estimate.
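
As a short worked derivation (a sketch, writing $f$ for the function whose root we want; the symbol $f$ is introduced here for illustration), the formula follows from applying the general Newton update to $f(s) = s^2 - x$, whose derivative is $f'(s) = 2s$:

$$ s_{n+1} = s_n - \frac{f(s_n)}{f'(s_n)} = s_n - \frac{s_n^2 - x}{2 s_n} = \frac{1}{2} \left ( s_n + \frac{x}{s_n} \right ). $$

For $x = 2$ and $s_0 = 1$, the first few estimates are $s_1 = \frac{3}{2} = 1.5$, $s_2 = \frac{17}{12} \approx 1.41667$ and $s_3 = \frac{577}{408} \approx 1.4142157$.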

With Newton's method, setting the initial guess to be slightly larger than the root will result in slightly faster convergence than a guess that is slightly smaller than the root [2]. I have hard-coded the initial guess ($s_0 = 1$) into my algorithm because we have been asked to calculate the square root of a fixed number, $ x = 2$. In general, the initial guess should be between 1 and the number, and initial guesses closer to the root will converge faster. We already roughly know the answer to this task, $\sqrt 2 \approx 1.4$, and so an initial guess anywhere between 1 and 1.4 will work fine. Much more consideration of the initial guess would be required if we had been asked to calculate the square root of *any* number. The further the initial guess is from the actual answer, the more iterations are "wasted" just getting to the vicinity of the answer [].
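
To illustrate how the initial guess affects convergence, here is a minimal sketch that counts iterations for a few starting values. The helper `newton_iterations` and its `max_iter` safety limit are hypothetical names introduced for this illustration; the update rule and stopping test are the same as in the formula above, with a tolerance of 0.0000000001.

```python
def newton_iterations(x, s0, tol=1e-10, max_iter=100):
    """Count the Newton iterations needed to reach tol, starting from s0."""
    s = s0
    count = 0
    # Stop when |x - s*s| <= tol, or when the safety limit is reached.
    while abs(x - s * s) > tol and count < max_iter:
        s -= (s * s - x) / (2 * s)
        count += 1
    return count

# Compare a few initial guesses for the square root of 2.
for guess in (1.0, 1.5, 2.0, 100.0):
    print("initial guess", guess, "->", newton_iterations(2, guess), "iterations")
```

A guess far from the root, such as 100, spends most of its iterations roughly halving the estimate before the fast convergence sets in.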

In [1]:
def sqrt2(x=2, a=0.0000000001):
    """
    Calculate the square root of x (default 2) to a desired accuracy a
    using Newton's method. Returns (estimate, number of iterations).
    """
    # Provide an initial guess, s, for the square root of 2.
    s = 1
    # Iteration counter (named n_iter to avoid shadowing the built-in iter).
    n_iter = 0
    # Loop until the desired accuracy is reached,
    # i.e. loop while |x - s*s| > a. A default value for a is set if not provided.
    while abs(x - (s * s)) > a:
        # Now calculate a better guess for the square root of 2.
        s -= (s * s - x) / (2 * s)
        # Keep track of the number of iterations required to meet the desired accuracy.
        n_iter += 1
    # Return the (approximate) square root of 2 and the number of iterations.
    return s, n_iter

Call the function sqrt2

In [2]:
# sqrt2 returns a tuple: (approximate sqrt(2), number of iterations).
# Unpack both values from a single call instead of calling the function twice.
# ans is a floating point number; n is an integer.
ans, n = sqrt2()
# type(ans)
# type(n)

Now display the square root of 2 to 100 decimal places. For this we make use of string formatting.

In [3]:
# Just printing the answer does not provide enough decimal places for this task.  
# print(ans)

print("Iteration count:", n)
# Use string formatting to display the correct number of decimal places.
format(ans, '.100f')   # give 100 digits after the decimal point


Iteration count: 4


'1.4142135623746898698271934335934929549694061279296875000000000000000000000000000000000000000000000000'

### A little note about accuracy
I'm a tiny bit suspicious of all those zeros at the end of the number. We discussed in week 2 of this module how $\sqrt 2$ is an irrational number: it cannot be expressed as a ratio of two integers, so its decimal expansion never becomes periodic. My algorithm is producing a periodic decimal expansion (repeated zeros), so although it's a very good approximation of $\sqrt 2$, it is not exactly $\sqrt 2$.

The algorithm takes 4 iterations to converge with the tolerance set to 0.0000000001. Newton's method has a quadratic rate of convergence, meaning that the number of correct digits, $n$, roughly doubles with each iteration; the required number of iterations is of order $O(\log_2 n)$ [6]. We have been asked for 100 decimal places, so we would need about $\log_2 100 \approx 6.6$ iterations, i.e. about 7. However, the running time of this algorithm scales as $O( n (\log_2 n )^2)$ [6]. Taking that estimate at face value, requiring 100 digits of precision will cause the algorithm to take roughly 40 times longer to run than for 10 digits ($100 \times 6.6^2 \approx 4400$ compared with $10 \times 3.3^2 \approx 110$). In the code as written above, some improvement in accuracy can be achieved by decreasing the tolerance, but the tolerance is not the only contributor to the precision of the answer. I might come back to this as I don't want to get bogged down at this stage.

The main reason for the rational appearance of our answer is that computers store numbers and do arithmetic in floating point format [7]. Numbers in the computer are stored as binary fractions, and most decimal fractions (and all irrational numbers like $\sqrt 2$) cannot be expressed exactly as binary fractions. The actual stored value on the computer is the nearest representable binary fraction. A Python float has 53 bits of precision, which is about 16 significant decimal digits, so the digits printed beyond that point are the exact decimal expansion of the stored binary fraction rather than digits of $\sqrt 2$. It's not a big problem most of the time, as the errors are typically no more than 1 part in $2^{53}$ per operation. The Python tutorial [7] points to the decimal module for high-precision applications. Calculating $\sqrt 2$ is not usually a high-precision application, but asking for 100 decimal places makes it one.
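
The point about binary fractions can be checked directly. The sketch below assumes only built-in Python features: the `**` operator to get a float close to $\sqrt 2$, and the float method `as_integer_ratio`. It shows that the stored value is exactly a ratio of two integers with a power-of-2 denominator, so it cannot be the irrational number $\sqrt 2$. The variable names are chosen for this illustration.

```python
# A float close to sqrt(2), obtained with the built-in power operator.
approx = 2 ** 0.5

# Every finite float is exactly numerator / denominator for some integers.
numerator, denominator = approx.as_integer_ratio()

# A power of two has exactly one bit set, so d & (d - 1) is zero.
is_power_of_two = denominator & (denominator - 1) == 0

# If approx were exactly sqrt(2), numerator**2 would equal 2 * denominator**2.
exactly_sqrt2 = numerator ** 2 == 2 * denominator ** 2

print(numerator, "/", denominator)
print("denominator is a power of 2:", is_power_of_two)
print("float is exactly sqrt(2):", exactly_sqrt2)
```

Because the denominator is a power of 2, the decimal expansion of the stored value terminates, which is where the trailing zeros come from.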

First, compare this answer to what the Python math module produces [1].

In [4]:
# import the math module  
import math  
  
# print the square root of 2 to 100 decimal places
print(format(math.sqrt(2), '.100f'))

# What's the difference?
# print(format(math.sqrt(2) - sqrt2()[0], '.100f'))

1.4142135623730951454746218587388284504413604736328125000000000000000000000000000000000000000000000000


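The answer is almost identical to the one produced by our algorithm, but not quite: the two agree to 11 decimal places (1.41421356237) and differ after that. Why? Our loop stops as soon as $|x - s^2|$ falls below the tolerance of 0.0000000001, which still leaves a small error in $s$, whereas math.sqrt returns a value that is accurate to the full precision of a float. The math module's answer also has a lot of zeros at the end of the number. Clearly this function is also producing only an approximation to the answer because of the way that floating point numbers are stored on the computer.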
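
To see how much of the gap comes from stopping early, the following sketch applies a fixed number of Newton updates instead of using a tolerance. The function `newton_sqrt2` is a hypothetical helper written for this comparison, and the reference value uses the built-in `**` operator rather than the math module.

```python
def newton_sqrt2(n_steps):
    """Apply n_steps Newton updates for sqrt(2), starting from s = 1."""
    s = 1.0
    for _ in range(n_steps):
        s -= (s * s - 2) / (2 * s)
    return s

# Float reference value, used only for comparison.
reference = 2 ** 0.5

# Print each estimate and its distance from the reference value.
for n_steps in (3, 4, 5, 6):
    s = newton_sqrt2(n_steps)
    print(n_steps, format(s, ".17f"), abs(s - reference))
```

One more iteration beyond the four used above brings the estimate to within float rounding error of the reference value; further iterations cannot improve on that.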

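Lastly, look at the Python decimal module for another comparison [8, 9]. It carries out decimal arithmetic to a precision set by the user. Setting the context precision to 100 gives 100 significant digits, i.e. the leading 1 plus 99 decimal places, with the last digit rounded (a precision of 101 would be needed for 100 decimal places).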

In [5]:
from decimal import Decimal, getcontext
getcontext().prec = 100
Decimal(2).sqrt()

Decimal('1.414213562373095048801688724209698078569671875376948073176679737990732478462107038850387534327641573')
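
Since the task rules out imported modules, one possible way to get 100 correct decimal places is to run Newton's method on Python's built-in arbitrary-precision integers. This is a sketch under that assumption, not the method used above: the function name `sqrt2_digits` and its `places` parameter are hypothetical, and the result is truncated rather than rounded. The idea is to find the integer square root of $2 \times 10^{200}$, whose digits are the digits of $\sqrt 2$.

```python
def sqrt2_digits(places=100):
    """Return sqrt(2) truncated to `places` decimal places, as a string."""
    # Scale 2 by 10**(2*places) so its integer square root holds every digit.
    target = 2 * 10 ** (2 * places)
    # Initial guess above the true root: 2 * 10**places > sqrt(2) * 10**places.
    s = 2 * 10 ** places
    while True:
        # Integer form of the Newton update (s + target / s) / 2.
        s_new = (s + target // s) // 2
        # Starting from above, the estimates decrease until the root is reached.
        if s_new >= s:
            break
        s = s_new
    digits = str(s)
    # The first digit is the integer part; the rest are the decimal places.
    return digits[0] + "." + digits[1:]

print(sqrt2_digits(100))
```

The output can be compared digit by digit with the decimal module result above, remembering that the decimal result rounds its final digit.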

### Conclusion for Task 1
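Newton's method with ordinary floating point numbers converges to $\sqrt 2$ in a handful of iterations, but a float carries only about 16 significant digits. Formatting the result to 100 decimal places therefore shows the exact value of the stored binary fraction rather than 100 correct digits of $\sqrt 2$, and the same is true of math.sqrt. Getting all 100 decimal places right requires arithmetic with more precision than a float provides, as the comparison with the decimal module shows.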

### References for Task 1
[1] The Python Standard Library; Numeric and Mathematical Modules; https://docs.python.org/3/library/math.html

[2] Methods of computing square roots; Wikipedia; https://en.wikipedia.org/wiki/Methods_of_computing_square_roots

[3] Exercise: Loops and Functions: A Tour of Go; https://tour.golang.org/flowcontrol/8

[4] Newton's Method; Mathematical Python; https://www.math.ubc.ca/~pwalls/math-python/roots-optimization/newton/

[5] S. G. Johnson, MIT Course 18.335, February 4, 2015; Square Roots via Newton’s Method; https://math.mit.edu/~stevenj/18.335/newton-sqrt.pdf

[6] Arbitrary precision integer square root algorithm?; StackExchange Computer Science; https://cs.stackexchange.com/questions/37596/arbitrary-precision-integer-square-root-algorithm

[7] Floating Point Arithmetic: Issues and Limitations; The Python Tutorial; https://docs.python.org/3/tutorial/floatingpoint.html

[] 

[8] how can i show an irrational number to 100 decimal places in python?; Stack Overflow; https://stackoverflow.com/questions/4733173/how-can-i-show-an-irrational-number-to-100-decimal-places-in-python

[9] decimal — Decimal fixed point and floating point arithmetic; The Python Standard Library; https://docs.python.org/3/library/decimal.html
