# Machine Learning and Statistics Tasks

**Tatjana Staunton**

***

### Task 1







>Square roots are difficult to calculate. In Python, you typically
use the power operator (a double asterisk) or a package such
as `math`. In this task, you should write a function `sqrt(x)` to 
approximate the square root of a floating point number x without
using the power operator or a package.

>Rather, you should use Newton’s method. Start with an 
initial guess for the square root called $z_0$. You then repeatedly
improve it using the following formula, until the difference between some previous guess $z_i$ and the next $z_{i+1}$
is less than some
threshold, say 0.01.

$$ z_{i+1} = z_i - \frac{z_i^2 - x}{2 z_i} $$



In [1]:
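def sqrt(x):
    # The square root of zero is zero; returning early avoids dividing by zero below.
    if x == 0:
        return 0.0
    # Initial guess for the square root.
    z = x / 4.0
    # Apply a fixed number of Newton updates; 100 is far more than needed to converge.
    for i in range(100):
        # Newton's method for a better approximation.
        z = z - (((z * z) - x) / (2 * z))
    # Now z should be a good approximation for the square root.
    return z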

In [2]:
# Test the function on 5.
sqrt(5)

2.23606797749979

In [3]:
# Check Python's square root of 5 for comparison.
5 ** 0.5

2.23606797749979
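
The task statement asks for the loop to stop once two successive guesses differ by less than a threshold, whereas the function above always runs 100 iterations. A minimal sketch of the threshold-based variant might look like the following. The function name `sqrt_with_threshold` and the parameters `threshold` and `max_iterations` are illustrative assumptions, not part of the original task.

```python
def sqrt_with_threshold(x, threshold=0.01, max_iterations=100):
    """Approximate the square root of x with Newton's method.

    Hypothetical variant: stops when successive guesses differ by less
    than `threshold`, or after `max_iterations` updates as a safety net.
    """
    if x < 0:
        raise ValueError("x must be non-negative")
    # The square root of zero is zero; this also avoids dividing by zero.
    if x == 0:
        return 0.0
    # Initial guess, the same as in the function above.
    z = x / 4.0
    for _ in range(max_iterations):
        # One Newton update.
        z_next = z - ((z * z) - x) / (2 * z)
        # Stop once the guess has changed by less than the threshold.
        if abs(z_next - z) < threshold:
            return z_next
        z = z_next
    return z


print(sqrt_with_threshold(5))
```

With the default threshold of 0.01 the loop stops after only a few updates, so the result agrees with `5 ** 0.5` to a few decimal places rather than to full floating-point precision. A smaller threshold trades extra iterations for accuracy.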

#### Notes


>1. The expression $z^2 - x$ is exactly zero when $z$ is the square root of $x$. It is greater than zero when $z$ is too big and less than zero when $z$ is too small. Thus $(z^2 - x)^2$ is a good candidate for a cost function.
>2. The derivative of the numerator $z^2 - x$ with respect to $z$ is $2z$, which is the denominator in the update formula.
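
The two notes can be tied back to the update formula with a short derivation. Writing $f(z) = z^2 - x$ (the symbol $f$ is introduced here only for illustration), Newton's method subtracts $f(z_i) / f'(z_i)$ from the current guess, and the result simplifies to the average of $z_i$ and $x / z_i$:

$$
\begin{aligned}
f(z) &= z^2 - x, \qquad f'(z) = 2z \\
z_{i+1} &= z_i - \frac{f(z_i)}{f'(z_i)} = z_i - \frac{z_i^2 - x}{2 z_i} \\
        &= \frac{2 z_i^2 - z_i^2 + x}{2 z_i} = \frac{1}{2}\left(z_i + \frac{x}{z_i}\right)
\end{aligned}
$$

As a one-step check with $x = 5$ and $z_0 = 1.25$: $z_1 = \frac{1}{2}(1.25 + 5 / 1.25) = \frac{1}{2}(1.25 + 4) = 2.625$.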



#### References

https://atlantictu-my.sharepoint.com/personal/ian_mcloughlin_atu_ie/_layouts/15/stream.aspx?id=%2Fpersonal%2Fian%5Fmcloughlin%5Fatu%5Fie%2FDocuments%2Fstudent%5Fshares%2Fmachine%5Flearnning%5Fand%5Fstatistics%2F1%5Fgeneral%2Ft01v11%5Ftask%5Fone%5Fand%5Frepo%2Emkv&referrer=StreamWebApp%2EWeb&referrerScenario=AddressBarCopied%2Eview

***

### Task 2

>Consider the below contingency table based on a survey asking
respondents whether they prefer coffee or tea and whether they
prefer plain or chocolate biscuits. Use `scipy.stats` to perform
a `chi-squared` test to see whether there is any evidence of an association between drink preference and biscuit preference in this
instance.



\begin{array}{|c|c|c|}
\hline
\text{} & \text{Chocolate Biscuit} & \text{Plain Biscuit}\\
\hline
\text{Coffee}  & \text{43}& \text{57}\\
\hline
\text{Tea} & \text{56} & \text{45}\\
\hline
\end{array}


In [4]:
# Import the necessary libraries.
import numpy as np
import scipy.stats as ss

# Create the contingency table: rows are Coffee and Tea,
# columns are Chocolate Biscuit and Plain Biscuit.
contingency_table = np.array([[43, 57], [56, 45]])

# Run the chi-squared test of independence.
result = ss.chi2_contingency(contingency_table)

result

Chi2ContingencyResult(statistic=2.6359100836554257, pvalue=0.10447218120907394, dof=1, expected_freq=array([[49.25373134, 50.74626866],
       [49.74626866, 51.25373134]]))
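
To see where these numbers come from, the calculation can be sketched by hand in plain Python. This is an illustrative reconstruction rather than SciPy's actual implementation: it assumes the default behaviour of `chi2_contingency`, which applies Yates' continuity correction when there is one degree of freedom, and it uses the fact that for one degree of freedom the chi-squared tail probability equals `erfc(sqrt(statistic / 2))`. All variable names are chosen here for illustration.

```python
import math

# Observed counts: rows are Coffee and Tea,
# columns are Chocolate Biscuit and Plain Biscuit.
observed = [[43, 57], [56, 45]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: row total * column total / grand total.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Chi-squared statistic with Yates' continuity correction:
# each |observed - expected| is reduced by 0.5 (but never below zero).
statistic = 0.0
for obs_row, exp_row in zip(observed, expected):
    for o, e in zip(obs_row, exp_row):
        diff = abs(o - e)
        corrected = diff - min(0.5, diff)
        statistic += corrected * corrected / e

# Tail probability of the chi-squared distribution with 1 degree of freedom.
p_value = math.erfc(math.sqrt(statistic / 2.0))

print(expected)
print(statistic, p_value)
```

The printed statistic and p-value should agree with the `statistic` and `pvalue` fields reported by SciPy above, up to floating-point rounding.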

#### Notes
The test gives a p-value of about 0.104, which is above the conventional 0.05 significance level, so we fail to reject the null hypothesis that drink preference and biscuit preference are independent. In other words, the survey data does not provide sufficient statistical evidence of an association between respondents' preferences for coffee or tea and their preferences for plain or chocolate biscuits.

#### References

https://www.overleaf.com/learn/latex/Tables#Creating_a_simple_table_in_LaTeX

https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.chi2_contingency.html

https://atlantictu-my.sharepoint.com/personal/ian_mcloughlin_atu_ie/_layouts/15/onedrive.aspx?id=%2Fpersonal%2Fian%5Fmcloughlin%5Fatu%5Fie%2FDocuments%2Fstudent%5Fshares%2Fmachine%5Flearnning%5Fand%5Fstatistics%2F2%5Fchi%5Fsquare&ga=1