# I. Least Square Regression Line

### Task

A group of five students enrolls in Statistics immediately after taking a Math aptitude test. Each student's 
Math aptitude test score, x, and Statistics course grade, y, can be expressed as the following list of 
points: 
1. (95, 85)
2. (85, 95)
3. (80, 70)
4. (70, 65)
5. (60, 70)<br><br>
If a student scored an 80 on the Math aptitude test, what grade would we expect them to achieve in 
Statistics? Determine the equation of the best-fit line using the least squares method, then compute and 
print the value of y when x=80.

### Input format

There are five lines of input; each line contains two space-separated integers describing a student's 
respective x and y grades: <br>
95 85 <br>
85 95 <br>
80 70 <br>
70 65 <br>
60 70 <br><br>
If you do not wish to read this information from stdin, you can hard-code it into your program.

### Output Format

Print a single line denoting the answer, rounded to a scale of 3 decimal places (i.e., 1.234 format).

### Solution

With the least squares method, the coefficients of the regression line $y=a+bx$ are given by the following formulas:
$$b=\frac{n \sum x_i y_i - \left(\sum x_i\right) \left(\sum y_i\right)} {n \sum x_i^2-\left(\sum x_i\right)^2}$$
$$a=\bar{y}-b \cdot \bar{x}$$
where $\bar{y}$ is the mean of Y and $\bar{x}$ is the mean of X.
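
As a worked check of these formulas, the five points from the task can be substituted directly. The intermediate sums below are computed by hand from that data, with $n=5$, $\bar{x}=78$ and $\bar{y}=77$:
$$\sum x_i = 390, \quad \sum y_i = 385, \quad \sum x_i y_i = 30500, \quad \sum x_i^2 = 31150$$
$$b=\frac{5 \cdot 30500 - 390 \cdot 385}{5 \cdot 31150 - 390^2}=\frac{2350}{3650}=\frac{47}{73} \approx 0.6438$$
$$a=77-\frac{47}{73} \cdot 78 \approx 26.781, \qquad y(80)=a+80b \approx 78.288$$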

### Code

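```python
# Read the five (x, y) pairs from stdin, one pair per line.
x = []
y = []
n = 5
for _ in range(n):
    xi, yi = input().split()
    x.append(float(xi))
    y.append(float(yi))

# Sums needed by the least squares formulas.
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x = sum(x)
sum_y = sum(y)
sum_sq_x = sum(xi ** 2 for xi in x)

# Slope of the regression line.
b = (n * sum_xy - sum_x * sum_y) / (n * sum_sq_x - sum_x ** 2)

# Intercept, from the means of x and y.
mean_x = sum_x / n
mean_y = sum_y / n
a = mean_y - b * mean_x

# Predicted Statistics grade for a Math aptitude score of 80.
x_target = 80
y_target = a + b * x_target

print('{:.3f}'.format(y_target))
```

For the sample input, the program prints `78.288`.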


# II. Pearson Correlation Coefficient II

### Task

The regression line of y on x is $3x+4y+8=0$, and the regression line of x on y is $4x+3y+7=0$. What is the value of the Pearson correlation coefficient?
<ul>
<li>1</li>
<li>-1</li>
<li>3/4</li>
<li>4/3</li>
<li>-4/3</li>
<li><b>-3/4</b></li>
</ul>

### Solution

For the regression line of y on x, the slope b can also be written in terms of the Pearson correlation coefficient $\rho$:
$$b=\rho \frac{\sigma_Y} {\sigma_X}$$
For the regression line of x on y, the roles of $\sigma_X$ and $\sigma_Y$ are swapped, so multiplying the slopes of the two given lines yields:
$$b_Y \cdot b_X = \rho \cdot \frac{\sigma_Y} {\sigma_X} \cdot \rho \cdot \frac{\sigma_X} {\sigma_Y} = \rho ^2$$
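Solving the first line for y gives $y=-\frac{3}{4}x-2$, and solving the second line for x gives $x=-\frac{3}{4}y-\frac{7}{4}$, so the slopes of both regression lines are $-\frac{3}{4}$. Thus,
$$\rho=\pm \sqrt{\left(-\frac{3}{4}\right) \cdot \left(-\frac{3}{4}\right)} =\pm \frac{3}{4}$$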
Since standard deviations cannot be negative, $\rho$ has the same sign as the slopes b. So, the answer is $-\frac{3}{4}$.
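
The reasoning above can be sketched in Python. The helpers `slope_y_on_x` and `slope_x_on_y` are hypothetical names introduced here for illustration; they assume each line is given by the coefficients p and q of $px+qy+r=0$ (the constant r does not affect the slope).

```python
from fractions import Fraction
from math import copysign, sqrt

def slope_y_on_x(p, q):
    """Slope of y on x for p*x + q*y + r = 0, i.e. y = -(p/q)*x - r/q."""
    return Fraction(-p, q)

def slope_x_on_y(p, q):
    """Slope of x on y for p*x + q*y + r = 0, i.e. x = -(q/p)*y - r/p."""
    return Fraction(-q, p)

b_y = slope_y_on_x(3, 4)  # regression of y on x: 3x + 4y + 8 = 0
b_x = slope_x_on_y(4, 3)  # regression of x on y: 4x + 3y + 7 = 0

# rho^2 equals the product of the slopes; rho takes the sign of the slopes.
rho = copysign(sqrt(b_y * b_x), b_y)
print(rho)
```

Exact fractions are used for the slopes so that the product $\frac{9}{16}$ is formed without rounding before the square root is taken.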