# Equivalence, Invertible Functions, z-Scores, and Simple Linear Regression

## 1. Equivalence through invertible functions

In mathematics, we sometimes consider two entities as *equivalent* to each other.

What do we mean by equivalence?  There are many definitions, from different perspectives.  One of them is phrased in terms of (mathematical) *functions*.

For example, **consider $y = f(x) = 2 x$.**

**NOTE:** Here $f$ is a symbol for a functional entity that we define and discuss.  The symbol itself is less important than the meaning it conveys.  We can call it $f$, or we can rename it $g$.  The naming matters less than the fact that this function maps a quantity to double that quantity.

As $y$ is connected to $x$ through $f$, we know that $y$ is simply twice $x$.

As a concrete example, suppose we invite only couples to our party.  Then $x$ may stand for the number of couples attending, and $y$ stands for the number of guests in attendance.

Furthermore, the names of the variable inside the parentheses and of the value returned by $f$ are just conventions.  We do not have to stick to $x$ and $y$; we may use any names we like, or whatever is appropriate in the context.  For example, we may just as well write

number_of_guests = $f($ number_of_couples $)$

We say that the above $f$ establishes an equivalence between $x$ and $y$, for the following reasons.

**$f$, as a function, has an inverse function, which we usually denote as $f^{-1}$.**

$x = f^{-1}(y) = \frac{1}{2} y$

**NOTE:**  We may just as well define $u = f^{-1}(t) = \frac{1}{2} t$, which is the same inverse function.  The definition is not altered by renaming variables.

Note that the inverse of an inverse is the original.  Therefore, we say $f$ and $f^{-1}$ are mutually inverse functions.  In general we have:

$f^{-1}(f(x)) = x$

$f(f^{-1}(y)) = y$

Applying a function and then its inverse (in either order) gives the identity function ($I(x) = x$).
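A minimal Python sketch of this round trip, using the couples-and-guests example.  The names `f_inv`, `couples`, and `guests` are illustrative choices, not part of the notation above.

```python
def f(x):
    """Map a number of couples to a number of guests by doubling it."""
    return 2 * x


def f_inv(y):
    """Inverse of f: map a number of guests back to a number of couples."""
    return y / 2


couples = 7                      # an arbitrary example value
guests = f(couples)              # forward direction: x -> y

round_trip_x = f_inv(f(couples))   # f^{-1}(f(x)) should give back x
round_trip_y = f(f_inv(guests))    # f(f^{-1}(y)) should give back y

print(guests, round_trip_x, round_trip_y)
```

Both round trips return the value they started from, which is exactly what the two identities above state.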

Then why can we say that the above $f$ establishes an equivalence between $x$ and $y$?

Suppose you know $x$.  Can you deduce the value of $y$?  Yes: simply multiply it by 2.

On the other hand, if you are given the value of $y$, you may divide it by 2 to obtain $x$.

It does not matter which of $x$ and $y$ is provided to you, as you can always deduce the other.  Therefore, knowing $x$ is exactly the same as knowing $y$.


**NOTE**: Not all functions have an inverse.  As a trivial example, suppose $f(x) = 0$.  This function has no inverse: given the result $0$, you cannot deduce which value of $x$ produced it.

Another example is $y = f(x) = x^2$ (on the real numbers).  Because $(-2)^2 = 2^2 = 4$, given that $f(x) = 4$ you cannot say whether $x$ was $-2$ or $2$.  No single value can serve as $f^{-1}(4)$, so the inverse does not exist.
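The failure can be made concrete with a small Python sketch that groups a few inputs by the value $x^2$ sends them to.  The helper name `square` and the sample inputs are assumptions chosen for illustration.

```python
def square(x):
    """The non-invertible example f(x) = x^2."""
    return x ** 2


inputs = [-2, -1, 0, 1, 2]

# For each output value, collect every input that produces it.
preimages = {}
for x in inputs:
    preimages.setdefault(square(x), []).append(x)

for y, xs in sorted(preimages.items()):
    print(f"f(x) = {y} comes from x in {xs}")
```

The output value 4 is reached from both -2 and 2, so there is no way to define a single $f^{-1}(4)$.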

**Takeaway**: if a function has an inverse, then that function (or its inverse) establishes an equivalence between the input (independent variable) and the result (dependent variable).

## 2. Z-score is an invertible map

Consider the function below, where we treat the sample mean $\bar{x}$ and the sample standard deviation $S_x$ as known constants (with $S_x > 0$).

$z_x = z_x(x) = \frac{x - \bar{x}}{S_x}$

This is how the z-score of an observed sample point is computed.  We claim that this computation, viewed as a function, is invertible: if I give you $x$, you can compute $z_x$, and if I instead give you $z_x$, you can easily recover $x$ by

$x = \bar{x} + z_x S_x$

Therefore, knowing $x$ is equivalent to knowing the z-score of $x$, i.e., $z_x$.
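As a quick numerical check, here is a sketch that converts a small made-up sample to z-scores and back.  The data values and the helper names `z_score` and `from_z` are assumptions for illustration; `statistics.stdev` computes the sample standard deviation (dividing by $n - 1$).

```python
import statistics

# Made-up sample, for illustration only.
data = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]

x_bar = statistics.mean(data)
s_x = statistics.stdev(data)   # sample standard deviation, must be > 0


def z_score(x):
    """Forward map: original value -> z-score."""
    return (x - x_bar) / s_x


def from_z(z):
    """Inverse map: z-score -> original value."""
    return x_bar + z * s_x


z_values = [z_score(x) for x in data]
recovered = [from_z(z) for z in z_values]

# Largest round-trip error; expected to be at floating-point noise level.
max_error = max(abs(a - b) for a, b in zip(data, recovered))
print(max_error)
```

Up to floating-point rounding, every value survives the trip to z-scores and back.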

**Takeaway**: In statistics, when talking about a dataset, a data point, a measurement of a certain physical quantity, etc., it does not matter whether we refer to the original value or to the z-score that value corresponds to, provided $\bar{x}$ and $S_x$ are known.

## 3. Simple Linear Regression in z-scores

In class we saw that, for two variables $x$ and $y$, we can find a best-fitting line that models a linear association between the variables, which is given as

$\hat{y} = b_0 + b_1 x$  (where $y = \hat{y} + e$, and $e$ is the residual)

We also know that 

$b_1 = r \frac{S_y}{S_x}$

and $b_0$ is determined by the equation

$\bar{y} = b_0 + b_1 \bar{x}$
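The two formulas can be evaluated directly from summary statistics.  The sketch below does so on a small made-up dataset; the data and variable names are assumptions, and $r$ is computed from its usual sample definition with $n - 1$ in the denominator.

```python
import statistics

# Made-up paired data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)

x_bar, y_bar = statistics.mean(x), statistics.mean(y)
s_x, s_y = statistics.stdev(x), statistics.stdev(y)   # sample SDs (n - 1)

# Sample linear correlation coefficient r.
r = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ((n - 1) * s_x * s_y)

b_1 = r * s_y / s_x          # slope: b_1 = r * S_y / S_x
b_0 = y_bar - b_1 * x_bar    # intercept, from y_bar = b_0 + b_1 * x_bar

print(f"r = {r:.4f}, b_1 = {b_1:.4f}, b_0 = {b_0:.4f}")
```

For this particular dataset the slope works out to 0.6 and the intercept to 2.2, so the fitted line passes through $(\bar{x}, \bar{y}) = (3, 4)$.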

Now suppose we express the data by their z-scores.  In other words, instead of representing our quantities by the original measurements, we describe each data point by how many standard deviations it lies above or below the mean: the sign of the z-score gives the direction, and its absolute value gives the distance.  By the equivalence from the previous section, working with $z_x$ and $z_y$ is equivalent to working with $x$ and $y$.

The z-scores can be modeled by a linear relation of the same form.

$\hat{z}_y = \beta_0 + \beta_1 z_x$  (where $z_y = \hat{z}_y + \epsilon$, and $\epsilon$ is the residual)

By construction, the z-scores have mean 0 and standard deviation 1, and the linear correlation coefficient $r$ is unchanged by the conversion.  Applying the formulas for the slope and intercept to $z_x$ and $z_y$ therefore gives

$\beta_1 = r \frac{1}{1} = r$

$0 = \beta_0 + r \times 0$, so $\beta_0 = 0$

We have thus reached a very simple conclusion:

$\hat{z}_y = r z_x$  (*)

The correlation coefficient is the slope in this linear regression on the z-scores.
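A short numerical sketch of this result: convert made-up data to z-scores, fit a least-squares line to the z-scores, and compare the fitted slope with $r$.  The dataset and the helper names `z_scores` and `least_squares` are assumptions for illustration; the fit uses the textbook least-squares formulas rather than any particular library routine.

```python
import statistics


def z_scores(values):
    """Convert a sample to z-scores using the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def least_squares(xs, ys):
    """Return (intercept, slope) of the least-squares line of ys on xs."""
    x_bar, y_bar = statistics.mean(xs), statistics.mean(ys)
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    s_xx = sum((a - x_bar) ** 2 for a in xs)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


# Made-up paired data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

z_x, z_y = z_scores(x), z_scores(y)

# r written as the sum of z-score products divided by n - 1.
r = sum(a * b for a, b in zip(z_x, z_y)) / (len(x) - 1)

beta_0, beta_1 = least_squares(z_x, z_y)
print(f"r = {r:.4f}, beta_1 = {beta_1:.4f}, beta_0 = {beta_0:.4f}")
```

Up to floating-point rounding, the fitted slope matches $r$ and the fitted intercept is 0, in agreement with (*).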