![xkcd](https://imgs.xkcd.com/comics/machine_learning_2x.png)

_____

# Linear Algebra Homework

_____


### Problem 1: Interpolation versus Regression (5 points)

In your own words, describe the difference between regression and interpolation using the language of linear algebra. Does linear algebra provide an approach to extrapolation? Why or why not? 

____

### Problem 2: Polynomial Regression (10 points)

We have learned that linear regression refers to fitting data to a model in which the weights appear linearly. But the model itself need not use linear functions of $x$; the Gaussian RBFs we used were one example of that. Another very common choice is a polynomial. Let's code two cases: one in which the number of weights equals the number of data points, and one in which it does not.

We wish to model our data with a polynomial. The data you are given is:
$$x = [-2, -0.5, 0, 1] ,$$
$$y = [0, 0.9375, 1, 3] .$$
Because you have four data points, you are tempted to use a model with four parameters, such as:
$$y = w_0 + w_1x + w_2x^2 + w_3x^3 .$$
But you are also worried that the data may be noisy, so you **also** want to fit it with only three weights. You decide that the $x^3$ term could cause large excursions that might follow the noise, so your second model is:
$$y = w_0 + w_1x + w_2x^2.$$

Using only libraries from `linalg`, fit the data to both models. Plot the data and the two resulting models.

As we have seen, the coding for this is trivial - the hard part is setting up the vectors and matrices. Slow down and be sure you understand what you are doing: this will help you set up the problem so that it is very easy. I'll give you a hint:
$$\underbrace{\begin{bmatrix} y_1\\y_2\\y_3\\y_4\end{bmatrix}}_{4\times 1} = \underbrace{\begin{bmatrix} 0\\0.9375\\1\\3\end{bmatrix}}_{4\times 1} =\underbrace{\begin{bmatrix} 1 & -2 & 4 & -8 \\ 1 & -0.5 & 0.25 & -0.125 \\ \vdots & \vdots & \vdots & \vdots \end{bmatrix}}_{4\times 4}\underbrace{\begin{bmatrix} w_0\\w_1\\w_2\\w_3\end{bmatrix}}_{4\times 1}.$$
It is crucial that you know where this came from!! Explain in a markdown cell where I got these numbers from.

You then use Python to get the weight vector ${\bf w}$, which allows you to plot the resulting polynomial. In one case you will need the pseudoinverse, available as the `pinv` function in [NumPy](https://numpy.org/doc/stable/reference/generated/numpy.linalg.pinv.html) or in [SciPy](https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.pinv.html#scipy.linalg.pinv), because you have more data points than weights.

If you write your code in a general way, it is easy to implement a third model:
$$y = w_0 + w_1x.$$
Do that next.
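To illustrate what "general" could mean here, the following is a minimal sketch, not the required solution. It assumes a hypothetical helper `fit_polynomial` and hypothetical demo data (`x_demo`, `y_demo`) that lie exactly on a straight line; neither comes from the assignment. The design matrix is built with `np.vander`, and the same `pinv` call handles both the square and the tall case.

```python
import numpy as np


def fit_polynomial(x, y, degree):
    """Return least-squares weights [w_0, ..., w_degree] (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix: row i is [1, x_i, x_i**2, ..., x_i**degree].
    X = np.vander(x, N=degree + 1, increasing=True)
    # For a square, invertible X the pseudoinverse equals the ordinary inverse;
    # for more rows than columns it gives the least-squares solution.
    return np.linalg.pinv(X) @ y


# Hypothetical demo data on the line y = 1 + 2x (NOT the homework data).
x_demo = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_demo = 1.0 + 2.0 * x_demo

w_line = fit_polynomial(x_demo, y_demo, degree=1)  # two weights
w_quad = fit_polynomial(x_demo, y_demo, degree=2)  # three weights
print(w_line)
print(w_quad)
```

Because only `degree` changes between models, switching from a cubic to a quadratic to a line is a one-argument change. To plot a fitted model, one option is to evaluate `np.vander(x_grid, N=degree + 1, increasing=True) @ w` on a fine grid `x_grid`.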

In the world of machine learning, we would need to figure out which of these three models is the "best", a process called "_model selection_". We won't worry about it now, but I wanted you to at least be aware of the idea. A second idea I'll introduce here is "_regularization_", since it is connected. Regularization is penalizing weights that cause large excursions; here, we are doing this by hand by dropping the higher-order terms (e.g., setting $w_3=0$), thereby prohibiting predictions with large excursions.  There are very powerful techniques for automating this. There is a lot to learn from this problem other than just setting up regression problems and inverting a matrix! 🤓 In a markdown cell, comment on which model you think is best based on your plot(s) and explain your reasons. (If you take a machine learning course, you will learn powerful mathematical methods to make this judgement.)


____

### Problem 3: Inverting a Matrix By Hand (10 points)

I mentioned in class that we will mostly rely on Python libraries for performing linear algebra operations. Why not, isn't that why they were developed?!

However, it is also good practice to know how to do these by yourself for the simplest cases. This allows you to explore ideas without a computer and build your intuition for what the libraries are doing. We'll learn here how to invert a $2\times 2$ matrix by hand. Being able to do this by hand also gives you a test case to ensure you are using the Python libraries correctly.

Follow these steps:
1. Make a $2\times 2$ matrix $A = \begin{bmatrix} a & b \\ c & d\end{bmatrix}$ using a NumPy array (you might want to try several choices); if you don't know about these already, NumPy has [some nice functionality for creating arrays](https://numpy.org/doc/stable/user/basics.creation.html), which can be matrices, of various types.
2. Find the [determinant](https://en.wikipedia.org/wiki/Determinant) of your matrix, using:
$$ \mathrm{det}(A) = ad - cb.$$
Do this by hand, not with a library.
3. Form the inverse $A^{-1}$ with
$$A^{-1} = \frac{1}{\mathrm{det}(A)}\begin{bmatrix} d & -b \\ -c & a\end{bmatrix}.$$
Show all of your steps using $\LaTeX$ in a markdown cell.
4. Now that you have $A^{-1}$, use the rules of matrix multiplication to find the product $A^{-1}A$. Show your steps.
5. Vary the matrix $A$ and comment on anything interesting you see. In particular, what would $A$ look like if its determinant were $0$? As one such case, make a matrix for which $b=2a$ and $d = 2c$. What does this case correspond to?
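Once the hand calculation is done, a library check might look like the sketch below. The helper name `inverse_2x2` and the matrix `A_demo` are assumptions made for illustration; pick your own matrix for the homework. The function applies the formula from step 3 and refuses to divide when the determinant is (numerically) zero.

```python
import numpy as np


def inverse_2x2(A):
    """Invert a 2x2 matrix with the closed-form formula (hypothetical helper)."""
    (a, b), (c, d) = np.asarray(A, dtype=float)
    det = a * d - c * b
    if np.isclose(det, 0.0):
        raise ValueError("determinant is zero: this matrix has no inverse")
    return np.array([[d, -b], [-c, a]]) / det


# Hypothetical test matrix with determinant 4*6 - 2*7 = 10.
A_demo = np.array([[4.0, 7.0], [2.0, 6.0]])
A_demo_inv = inverse_2x2(A_demo)

print(A_demo_inv)
print(np.allclose(A_demo_inv, np.linalg.inv(A_demo)))  # compare with the library
print(np.allclose(A_demo_inv @ A_demo, np.eye(2)))     # product should be the identity

# A matrix with b = 2a and d = 2c has determinant zero.
singular_demo = np.array([[1.0, 2.0], [3.0, 6.0]])
try:
    inverse_2x2(singular_demo)
    singular_detected = False
except ValueError:
    singular_detected = True
print(singular_detected)
```

Comparing against `np.linalg.inv` with `np.allclose` rather than `==` avoids false alarms from floating-point rounding.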

____

### Problem 4: Inner and Outer Product (5 points)

Given the two vectors:
$$v_1 = \begin{bmatrix} 1 \\ 2 \\ 3\end{bmatrix}, \quad v_2 = \begin{bmatrix} 1 \\ 1 \\ 1\end{bmatrix},$$
compute the two outer products
$$v_1 v_2^T,$$
and
$$v_2 v_1^T.$$
Do this _both_ by hand, showing your work using $\LaTeX$, and with a NumPy or SciPy library. Next, do the same for the two possible **inner** products.


Does the order of the vectors matter for the outer product? That is, what is the commutativity relation for these operations? Show and explain all of the details.
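For the library half of this problem, one possible approach is sketched below using `np.outer` and `np.inner`. The vectors `u_demo` and `w_demo` are hypothetical and deliberately differ from $v_1$ and $v_2$, so the hand calculation is still yours to do.

```python
import numpy as np

# Hypothetical vectors (NOT the homework vectors).
u_demo = np.array([1.0, 0.0, 2.0])
w_demo = np.array([3.0, 1.0, 0.0])

# Outer products: each is a 3x3 matrix.
outer_uw = np.outer(u_demo, w_demo)
outer_wu = np.outer(w_demo, u_demo)

# The same outer product written as (column vector) @ (row vector).
outer_uw_matmul = u_demo.reshape(-1, 1) @ w_demo.reshape(1, -1)

# Inner products: each is a single number.
inner_uw = np.inner(u_demo, w_demo)
inner_wu = np.inner(w_demo, u_demo)

print(outer_uw)
print(outer_wu)
print(inner_uw, inner_wu)
```

Printing `outer_uw` next to `outer_wu.T` is a quick way to see how the two orderings are related.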




____

### Problem 5: Randomized Experiments (10 points)

In data science, we often work with existing datasets. However, some of the most crucial decisions in both personal and professional contexts depend on data that must first be collected through carefully designed experiments. From medical trials testing new treatments to A/B tests in tech companies, the quality of our conclusions depends fundamentally on the quality of our experimental design.

Read Chapter 11 ("Randomized Experiments") from "Thinking Clearly with Data." As you read, consider:
- How does randomization address the problem of bias? What is the connection to missingness? 
- What makes an experiment truly "controlled"?
- Why might seemingly well-designed experiments still fail?

1. Provide a concise summary of the chapter's main points.
2. Identify and explain what you consider to be the single most important conclusion from this chapter. Why did this particular point resonate with you?
3. Give a real-world example where proper experimental design is crucial (this could be from medicine, technology, social science, etc.).
4. Reflect on how this knowledge might influence your:
   - Future work as a data scientist
   - Personal decision-making (especially regarding medical or scientific claims)
   - Understanding of scientific literature and news

Use an AI assistant (like Claude 3.5 Sonnet (New), ChatGPT 4o or o1, etc.) to help you organize and synthesize the chapter's key concepts. Give the assistant your own summary first, and then try prompts such as:
- "Create a comparison table showing the differences between randomized experiments and observational studies"
- "Generate a decision tree using Mermaid for determining whether an experiment is properly randomized"
- "Organize my summary of the key threats to validity in experimental design and how to address them"

Include both your prompts and the AI's responses in your submission, along with your own analysis of how helpful (or not) the AI was in deepening your understanding.

Remember: The best data analysis cannot overcome poor experimental design. As future data scientists, understanding how to collect good data is just as important as knowing how to analyze it.

____

### Problem 6: Computational Linear Algebra (10 points)

First, run this code that generates some data for you.

```python
import numpy as np

# Set random seed for reproducibility
np.random.seed(42)

# Create these matrices:
A = np.random.randn(4, 3)  # 4x3 matrix from standard normal distribution
B = np.random.randn(3, 4)  # 3x4 matrix from standard normal distribution
C = np.random.randn(4, 4)  # 4x4 matrix from standard normal distribution
v = np.random.randn(3)     # vector of length 3
```

Before computing anything, write down the dimensions of $A$, $B$, $v$, $AB$, $BA$, $Av$, $vA$. Do all of these make sense? If any of these don't make sense, how can you "fix" them? 
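As a small illustration of the shape rule (inner dimensions must match), here is a sketch with hypothetical arrays `P`, `Q`, and `q` whose shapes differ from those in the problem. It only inspects `.shape` and catches the `ValueError` that NumPy raises for incompatible operands.

```python
import numpy as np

# Hypothetical arrays (shapes chosen to differ from the homework's A, B, v).
P = np.ones((2, 3))
Q = np.ones((3, 2))
q = np.ones(3)

shape_PQ = (P @ Q).shape  # (2x3)(3x2): inner dimensions 3 and 3 match
shape_QP = (Q @ P).shape  # (3x2)(2x3): inner dimensions 2 and 2 match
shape_Pq = (P @ q).shape  # a 1-D vector on the right acts like a column
print(shape_PQ, shape_QP, shape_Pq)

# With q on the left, it acts like a row of length 3, but P has 2 rows.
try:
    q @ P
    mismatch = False
except ValueError as err:
    mismatch = True
    print("shape mismatch:", err)

# One possible fix is to transpose the matrix so the inner dimensions agree.
shape_qPT = (q @ P.T).shape
print(shape_qPT)
```

Note that NumPy treats a 1-D array as a row or a column depending on which side of `@` it appears, which is why writing down the dimensions first is worthwhile.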

Now, use only [Python's NumPy](https://numpy.org/doc/stable/reference/routines.linalg.html).

Using NumPy, perform these operations:
1. use the `@` operator to perform matrix multiplication in Python, comparing $A @ B$ with $B @ A$ and $A @ v$ with $v @ A$; explain your findings - did you get what you expected? any errors?
2. use NumPy to compute the transpose of $C$; do this twice - do you get $C$ back again?
3. compute the trace of $C$
4. compute the inverse of $C$ if it has one; if it does, compute and print $CC^{-1}$
5. form the symmetric matrix $S = \frac{1}{2}\left( C + C^T\right)$; what is the transpose of $S$ and how is it related to $S$ itself? 
6. compute the eigenvalues of $S$ and $C$ - how do they differ, if they do? 
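The NumPy calls involved in this list can be tried first on a small matrix whose answers are easy to verify by hand. The sketch below assumes a hypothetical $2\times 2$ matrix `M` (not the `C` generated above) and uses `np.trace`, `np.linalg.inv`, `np.linalg.eigvals`, and `np.linalg.eigvalsh` (the last one is intended for symmetric matrices).

```python
import numpy as np

# Hypothetical upper-triangular matrix; its eigenvalues sit on the diagonal.
M = np.array([[2.0, 1.0], [0.0, 3.0]])

M_back = M.T.T                   # transpose applied twice
trace_M = np.trace(M)            # sum of the diagonal entries
M_inv = np.linalg.inv(M)         # raises LinAlgError if M is singular
identity_check = M @ M_inv       # should be close to the identity

S_M = 0.5 * (M + M.T)            # symmetric part of M
eig_M = np.linalg.eigvals(M)     # general matrix: may be complex in general
eig_S = np.linalg.eigvalsh(S_M)  # symmetric matrix: real, ascending order

print(trace_M)
print(identity_check)
print(eig_M)
print(eig_S)
```

Because of floating-point rounding, `identity_check` is best compared with `np.eye(2)` using `np.allclose` rather than exact equality.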