***


# Emerging Technology Tasks


***
This Jupyter Notebook is a collection of solutions to the tasks given to me for the Emerging Technology module delivered by Dr. Ian Mcloughlin. Each of the four tasks is marked below. You will also find a list of references.

Author: Darragh Lally   
GMIT ID: G00220290   
Mail: g00220290@gmit.ie
***

## Task 1: SQRT2

***

In the first of our tasks we are to create a Python function that computes and displays the square root of 2 to 100 decimal places **without** using any extra imports, including but not limited to 'math'.   


Using Newton's Method we can calculate the square root of a number [1.1]. To find the square root $z$ of a number $x$, we start from a guess and repeatedly apply the following update until $z^2$ is close enough to $x$:   
    
$$ z_{next} = z - \frac{z^2 - x}{2z} $$    
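
To see where this update comes from and how quickly it converges, here is a minimal sketch (not part of the task solution). Newton's method applied to $f(z) = z^2 - x$ gives $z - f(z)/f'(z)$, which is the formula above. The helper name newton_steps and the starting guess of 1 are my own assumptions, and 'fractions' is imported only so the iterates can be shown exactly.

```python
from fractions import Fraction


def newton_steps(x, z0, steps):
    """Return the successive Newton iterates for the square root of x."""
    z = Fraction(z0)
    history = [z]
    for _ in range(steps):
        # Same update as the formula above: z - (z^2 - x) / (2z)
        z = z - (z * z - x) / (2 * z)
        history.append(z)
    return history


iterates = newton_steps(2, 1, 4)
for i, z in enumerate(iterates):
    # Show each iterate as an exact fraction and as a float
    print(i, z, float(z))
```

For $x = 2$ the iterates are 1, 3/2, 17/12, 577/408 and 665857/470832, so the number of correct digits roughly doubles with each step.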


***
#### Step 1
***
Using Newton's method to calculate the square root of 2. Testing our algorithm with the number 2 against the 'math' import confirms that it returns an acceptable value: the two results agree to five decimal places, which is what the 0.0001 tolerance allows. However, we only see a limited number of decimal places, and the problem calls for the value to be printed to one hundred decimal places, which leads us to the next step.

#### Newton's Method in Python

In [1]:
def sqrt2(x):
    """
    A function to calculate the square root of a number. 
    """
    # Initial guess for the square root of x
    z = x / 2     
    # Loop until happy with accuracy
    while abs(x - (z * z)) > 0.0001:
        # Calculate a better guess
        z -= (z*z - x) / (2*z)
    # Return the approx square root of x
    return z

#### Testing

In [2]:
# Test our function
sqrt2(2)

1.4142156862745099

In [3]:
# Now with the 'math' equivalent
import math
math.sqrt(2)

1.4142135623730951

***
#### Step 2
***
**2.1** We now need to figure out how to print more decimal places; currently we see 16. One post on Stack Overflow [1.2] suggests using the function repr() [1.3].
- repr(): This function returns a string representation of the object passed as the parameter.
- The test below shows that repr() returns the same number of decimal places as before.
    - This result suggests that the limit lies in how the number is stored, not in how it is printed. On typical platforms a Python float is a 64-bit binary floating-point value, which holds roughly 15 to 17 significant decimal digits.
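
A quick way to check this limit is to ask Python how its floats are laid out. The sketch below is an illustration only: it assumes the usual IEEE 754 double format that CPython uses on typical platforms, and the 'sys' import is for inspection rather than part of the task solution.

```python
import sys

# Number of bits in the significand of a float
print(sys.float_info.mant_dig)

# Number of decimal digits that are guaranteed to survive a round trip
print(sys.float_info.dig)

# repr() already prints enough digits to recover the stored value exactly
value = 1.4142156862745099
print(float(repr(value)) == value)
```

Since repr() already round-trips the stored value, asking it for more digits cannot reveal any extra precision.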
    
#### Testing

In [4]:
# Testing repr()
ans = sqrt2(2)
print('With repr(): ')
print( repr(ans))
print('Without repr(): ')
print(ans)

With repr(): 
1.4142156862745099
Without repr(): 
1.4142156862745099


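**2.2** The same Stack Overflow post [1.2] also suggests formatting the string output using the following:
- "{0:.100f}".format(a), where the number after the point in the format specification is the number of decimal places.
    - The output now has 100 decimal places, but the extra digits are the exact decimal expansion of the stored binary float rather than further correct digits of the square root. They run out after 52 places, leaving trailing zeros, so a float alone cannot give us the precision we need.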
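
To see why the digits stop, we can compare the formatted string with the exact value held by the float. This is a sketch that assumes the 'decimal' module is acceptable for inspection purposes only (the task itself forbids extra imports); Decimal(float) converts the stored binary value exactly, without rounding.

```python
from decimal import Decimal

# The float returned by the first version of sqrt2(2)
value = 1.4142156862745099

formatted = "{0:.100f}".format(value)
exact = str(Decimal(value))

print(exact)
# Stripping the padding zeros leaves exactly the stored value
print(formatted.rstrip("0") == exact)
```

The padding zeros are therefore formatting only; the float itself has no more digits to give.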
    
#### Testing

In [5]:
# Testing string formatting
ans = sqrt2(2)
"{0:.100f}".format(ans)

'1.4142156862745098866440685014822520315647125244140625000000000000000000000000000000000000000000000000'

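**2.3** I found a response to the question 'Generating digits of square root of 2' on Stack Overflow [1.4]. 
The function returns the square root of 'a' to 'digits' places using integer arithmetic only. It does not, however, place the decimal point in its output. It takes two inputs:
1. The number to perform the task on - 'a'
2. The number of places to print to - 'digits'

It then works as follows:
1. Scales 'a' up dramatically, so that the digits we want land in the integer part of the square root:
$$ a * 10^{(2*digits)} $$
2. Finds the integer square root of the new number. Because $\sqrt{a * 10^{(2*digits)}} = \sqrt{a} * 10^{digits}$, this is the square root of 'a' with the decimal point shifted 'digits' places to the right.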

In [6]:
# https://stackoverflow.com/questions/5187664/generating-digits-of-square-root-of-2
def sqroot(a, digits):
    a = a * (10**(2*digits))
    x_prev = 0
    x_next = 1 * (10**digits)
    while x_prev != x_next:
        x_prev = x_next
        x_next = (x_prev + (a // x_prev)) >> 1
    return x_next
print(sqroot(2,100))

14142135623730950488016887242096980785696718753769480731766797379907324784621070388503875343276415727
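
As a sanity check on the output above, here is a minimal sketch that uses only integer arithmetic, so it stays within the no-imports rule. If $r$ is the integer part of $\sqrt{a}$ then $r^2 \le a < (r+1)^2$. The helper name is_floor_sqrt is hypothetical, and the long integer is copied from the printed output.

```python
def is_floor_sqrt(root, a):
    """True when root is the integer part of the square root of a."""
    return root * root <= a < (root + 1) * (root + 1)


# Small worked case: 3 digits means scaling 2 up to 2 * 10**6
print(is_floor_sqrt(1414, 2 * 10 ** 6))

# The 101-digit value printed by sqroot(2, 100)
digits = 100
scaled = 2 * 10 ** (2 * digits)
root = int(
    "14142135623730950488016887242096980785696718753769"
    "480731766797379907324784621070388503875343276415727"
)
print(is_floor_sqrt(root, scaled))
```

Both checks hold, which confirms that every digit printed is a correct digit of the square root of 2 (the result is truncated, not rounded).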


**2.4** Refactoring the above function to meet my needs. This demonstrates my understanding of the function and also allows me to add the decimal point to the output. Although the problem only calls for the square root of 2 to 100 places, I am keeping the function parameters so that we can take the square root of any number passed in and decide the number of digits returned in each case.

In [7]:
# 2.0 Find square root of n
def sqrt2(n,d):
    # n = number to perform square root on
    # d = number of digits to return
    # 1. Multiply n by 10^(2*d)
    # 2. Get integer square root of new n
    # 3. Convert to string and store in list
    # 4. Add decimal point
    n = n*(10**(2*d))
    x_previous = 0
    x_next = 1*(10**d)
    while x_previous != x_next:
        x_previous = x_next
        x_next = (x_previous + (n//x_previous)) >> 1
    # Convert 'x_next' to a string, add to list
    l_chars = list(str(x_next))
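    # Add decimal point so that exactly d digits follow it
    # (a fixed index of 1 only works when the integer part has one digit)
    l_chars.insert(len(l_chars) - d, '.')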
    # Concat l_chars
    ans = ''.join(l_chars)
    # return our answer
    return ans

#### Testing

In [8]:
print('Square Root of 2 to 100 places:')
print(sqrt2(2,100))


Square Root of 2 to 100 places:
1.4142135623730950488016887242096980785696718753769480731766797379907324784621070388503875343276415727
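
For an independent cross-check of these digits, a sketch using the standard library's 'decimal' module is shown below. It is for verification only, since the task solution itself may not use imports. One assumption to note: Decimal rounds its result to the context precision while our function truncates, so for other inputs the last digit could differ.

```python
from decimal import Decimal, getcontext

# 1 digit before the decimal point plus 100 digits after it
getcontext().prec = 101

reference = str(Decimal(2).sqrt())
print(reference)

# Number of digits after the decimal point
print(len(reference.split(".")[1]))
```

For the square root of 2 the 101st decimal digit is a 3, so rounding and truncation agree and the reference matches the output above.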


### References; sqrt2
[1.1] A tour of Go; Exercise: Loops and Functions; https://tour.golang.org/flowcontrol/8

[1.2] Stack Overflow, more decimal places needed in python; https://stackoverflow.com/questions/14057835/more-decimal-places-needed-in-python

[1.3] repr(), Programiz, Python Tutorials; https://www.programiz.com/python-programming/methods/built-in/repr

[1.4] casevh, Stack Overflow contributor, Generating digits of square root of 2; https://stackoverflow.com/questions/5187664/generating-digits-of-square-root-of-2

***
## Task 2: Chi-squared  $\chi^2$
***
Our second task is to verify the Chi-squared value from a given table of data. The test was published by Karl Pearson in 1900 and is considered to be a 'founding stone' of modern statistics [2.1].

#### Given Table - [2.1]
| | A | B | C | D | Total |
| :------------- | :----------: | :-----------: | :-----------: | :-----------: | -----------: |
| White Collar | 90 | 60 | 104 | 95 | 349 |
| Blue Collar | 30 | 50 | 51 | 20 | 151 |
| No Collar | 30 | 40 | 45 | 35 | 150 |
| Total | 150 | 150 | 200 | 150 | 650 |


#### Value to be verified
Approximately **24.6**

A search for 'scipy.stats' and 'chi squared' led me to the SciPy.org website, which holds the documentation for chisquare [2.2]. Below I call the function using data from the given table. Testing showed that it does not return the relevant information: chisquare performs a one-way goodness-of-fit test against a set of expected frequencies, whereas our table calls for a test of independence between its rows and columns. I tried several parameter combinations:
1. Table data only
2. Table data and expected frequencies
3. Table data and delta degrees of freedom

In [9]:
from scipy.stats import chisquare
import numpy as np

# 1. params: table data.
#chisquare([[90,60,104,95],[30,50,51,20],[30,40,45,35]])

# 2. params: table data, f_exp (expected frequencies in each cat)
#chisquare([[90,60,104,95],[30,50,51,20],[30,40,45,35]], f_exp=[[90,60,104,95],[30,50,51,20],[30,40,45,35]])

# 3. params: table data, ddof (delta degrees of freedom)
#chisquare([[90,60,104,95],[30,50,51,20],[30,40,45,35]], ddof=0.5)



Further research led me to chi2_contingency in the SciPy docs [2.3]. If we pass in our table data it will return:
1. Chi test statistic 
2. 'p' value
3. Degrees of freedom
4. Expected frequencies array.

Using the code below we can confirm:
1. The approximate value 24.6 is within 0.03 of the computed statistic, 24.5712.
2. 'p' value = 0.0004098425861096696 

In [10]:
from scipy.stats import chi2_contingency
import numpy as np

# Create numpy array from table data
obs = np.array([[90,60,104,95],[30,50,51,20],[30,40,45,35]])
# Pass into function
chi2_contingency(obs)

(24.5712028585826,
 0.0004098425861096696,
 6,
 array([[ 80.53846154,  80.53846154, 107.38461538,  80.53846154],
        [ 34.84615385,  34.84615385,  46.46153846,  34.84615385],
        [ 34.61538462,  34.61538462,  46.15384615,  34.61538462]]))
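
To make clear what chi2_contingency is doing, the same numbers can be reproduced by hand. This is a minimal sketch in plain Python (no SciPy), with variable names of my own choosing. Each expected count is row total × column total / grand total, and the statistic is the sum of (observed − expected)² / expected. The p-value line uses a closed form of the chi-squared survival function that is only valid when the degrees of freedom are even, which is an assumption that happens to hold here.

```python
import math

observed = [
    [90, 60, 104, 95],  # White Collar
    [30, 50, 51, 20],   # Blue Collar
    [30, 40, 45, 35],   # No Collar
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under independence: row total * column total / grand total
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Pearson's statistic: sum of (observed - expected)^2 / expected
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom: (rows - 1) * (columns - 1)
dof = (len(row_totals) - 1) * (len(col_totals) - 1)

# Survival function of chi-squared, closed form for EVEN dof only
half = chi2 / 2
p_value = math.exp(-half) * sum(
    half ** j / math.factorial(j) for j in range(dof // 2)
)

print(chi2, dof, p_value)
```

The statistic, degrees of freedom and p-value agree with the first three values returned by chi2_contingency above, and the expected list matches its array of expected frequencies.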

### References; chi squared
[2.1] Chi-squared, wikipedia, https://en.wikipedia.org/wiki/Chi-squared_test

[2.2] SciPy.org, scipy.stats.chisquare; https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.chisquare.html

[2.3] SciPy.org, scipy.stats.chi2_contingency; https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.chi2_contingency.html

Abhik Mukherjee, Medium, Difference Between 'chi2_contingency' and 'chisquare' in scipy; https://mabhik93.medium.com/difference-between-chi2-contingency-and-chisquare-in-python-132dacf84678

Stack Exchange, Chi-squared test with scipy: whats the difference between chi2_contingency and chisquare; https://stats.stackexchange.com/questions/110718/chi-squared-test-with-scipy-whats-the-difference-between-chi2-contingency-and


***

## Task 3: MS Excel Standard Deviation, Population & Sample

***

Task 3 involves standard deviation and, in particular, the differences between Microsoft Excel's two implementations of the calculation. It also aims to demonstrate why the STDEV.S calculation is a better estimate of the standard deviation of a population when performed on a sample.

1. **STDEV.P**
    * Standard deviation of an **entire population** 

    * Biased when applied to a sample, because it divides by $n$
    
    $$\sqrt{\frac{\sum(x - \bar{x})^2}{n}}$$
    
2. **STDEV.S**
    * Standard deviation estimated from a **sample of an entire population** 

    * Reduces that bias by dividing by $n-1$ (Bessel's correction)
    
    $$\sqrt{\frac{\sum(x - \bar{x})^2}{n-1}}$$
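
As a quick numeric check of the two formulas above, here is a sketch using the standard library's 'statistics' module. Treating pstdev as the counterpart of STDEV.P and stdev as the counterpart of STDEV.S is my own mapping, based on their divisors. Since the formulas differ only in the divisor, the two results should be related by a factor of $\sqrt{n/(n-1)}$.

```python
import math
import statistics

# Sample numbers from Microsoft's support documentation
x = [1335, 1301, 1368, 1322, 1310, 1370, 1318, 1350, 1303, 1299]
n = len(x)

pop_sd = statistics.pstdev(x)    # divides by n, like STDEV.P
sample_sd = statistics.stdev(x)  # divides by n - 1, like STDEV.S

print(pop_sd, sample_sd)

# The two differ only by the factor sqrt(n / (n - 1))
print(math.isclose(sample_sd, pop_sd * math.sqrt(n / (n - 1))))
```

The factor is always greater than 1, so the sample form is always the larger of the two, and the gap shrinks as $n$ grows.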



In [11]:
import numpy as np

# Sample numbers from Microsoft's support documentation
x = [1335,1301,1368,1322,1310,1370,1318,1350,1303,1299]

# STDEV.P
def stdev_p(x):
    return np.sqrt(np.sum((x - np.mean(x))**2)/len(x))

# STDEV.S
def stdev_s(x):
    # Parentheses matter: divide by (n - 1), not divide by n and then subtract 1
    return np.sqrt(np.sum((x - np.mean(x))**2)/(len(x)-1))

In [12]:
print(stdev_p(x))
print(stdev_s(x))

25.593749236874224
26.97818048390629


In [13]:
import numpy as np
import statistics

# Same data as above
x = [1335,1301,1368,1322,1310,1370,1318,1350,1303,1299]

# Using numpy standard deviation (population form: divides by n by default)
stdev_p = np.std(x)
print(stdev_p)

# Using statistics standard deviation (sample form: divides by n - 1)
stdev_s = statistics.stdev(x)
print(stdev_s)

25.593749236874224
26.97818048390629
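
To demonstrate why the $n-1$ form is the better estimate when only a sample is available, here is a minimal sketch under assumptions of my own: the ten numbers above are treated as the whole population, and every possible sample of size 3 (drawn with replacement) is enumerated, so no randomness is involved. The variable names are hypothetical and only the standard library is used.

```python
import itertools
import math
import statistics

# Treat the ten values as the entire population (an assumption for this demo)
population = [1335, 1301, 1368, 1322, 1310, 1370, 1318, 1350, 1303, 1299]
sample_size = 3

# Every ordered sample of size 3, drawn with replacement (10**3 samples)
samples = list(itertools.product(population, repeat=sample_size))

pop_var = statistics.pvariance(population)
pop_sd = statistics.pstdev(population)

# Average estimate of the variance over all samples
mean_var_p = statistics.mean([statistics.pvariance(s) for s in samples])
mean_var_s = statistics.mean([statistics.variance(s) for s in samples])

# Average estimate of the standard deviation over all samples
mean_sd_p = statistics.mean([statistics.pstdev(s) for s in samples])
mean_sd_s = statistics.mean([statistics.stdev(s) for s in samples])

print("population variance:", pop_var)
print("average with n     :", mean_var_p)
print("average with n - 1 :", mean_var_s)
print("population stdev   :", pop_sd)
print("average STDEV.P    :", mean_sd_p)
print("average STDEV.S    :", mean_sd_s)
```

Averaged over all samples, dividing by $n-1$ recovers the population variance, while dividing by $n$ falls short by a factor of $(n-1)/n$. For the standard deviation both forms still underestimate on average, but the $n-1$ form is always the closer of the two.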


### References; STDEV.S & STDEV.P

Microsoft Support, STDEV.S function; https://support.microsoft.com/en-us/office/stdev-s-function-7d69cf97-0c1f-4acf-be27-f3e83904cc23

Microsoft Support, STDEV.P function; https://support.microsoft.com/en-us/office/stdev-p-function-6e917c05-31a0-496f-ade7-4f4e7462f285

Finxter, How to get the standard deviation of a Python list; https://blog.finxter.com/how-to-get-the-standard-deviation-of-a-python-list/

# References

### General
Markdown mathematical symbols, LKS90 GitHub; https://gist.github.com/LKS90/252ac41bd4a173be35b0 

Markdown Guide; https://www.markdownguide.org/

Numpy Documentation; https://numpy.org/doc/stable/user/quickstart.html

