# Discussion Week 4

In this discussion, we review how to solve a least squares problem and how to solve linear systems of equations.

You can use the Shared Computing Cluster (SCC) or Google Colab to run this notebook.

The general instructions for running on the SCC are available under General Resources on [Piazza](https://piazza.com/bu/fall2025/ds722/resources).

## Problem: Least squares

In this exercise, you'll generate a synthetic dataset with one independent variable $x$ and one dependent variable $y$, and fit a linear least squares model using `scikit-learn`. You could alternatively use `numpy` or `scipy` to solve the least squares problem directly. In class, we used `numpy.linalg.lstsq` to solve this problem.
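
For comparison, here is a minimal sketch of the direct `numpy.linalg.lstsq` route mentioned above. The five data points are made up for illustration and are not part of the exercise; the design matrix gets an explicit column of ones so that the intercept is estimated along with the slope.

```python
import numpy as np

# Hypothetical toy data (assumption): y is roughly 2x + 1 with small perturbations.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

# Design matrix: one column of x values and one column of ones for the intercept.
A = np.column_stack([x, np.ones_like(x)])

# lstsq returns (solution, residual sum of squares, rank of A, singular values of A).
coef, residuals, rank, sing_vals = np.linalg.lstsq(A, y, rcond=None)
slope, intercept = coef

print(f"slope = {slope:.4f}, intercept = {intercept:.4f}")
```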

### Step 1: Generate Synthetic Data

- Use `sklearn.datasets.make_regression` to create a dataset with:
  - One feature (independent variable)
  - One target (dependent variable)
  - Gaussian noise added to simulate measurement error

The documentation for `make_regression` can be found [here](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.make_regression.html).
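
One possible call is sketched below. The values of `n_samples`, `noise`, `bias`, and `random_state` are arbitrary choices for illustration, not values prescribed by the exercise.

```python
from sklearn.datasets import make_regression

# Arbitrary illustrative settings (assumptions): 100 samples, noise standard
# deviation 10, a nonzero bias so the intercept is not trivially zero, and a
# fixed seed for reproducibility.
X, y = make_regression(
    n_samples=100,
    n_features=1,
    n_targets=1,
    noise=10.0,
    bias=5.0,
    random_state=0,
)

# X is a 2-D array of shape (n_samples, n_features); y is 1-D.
print(X.shape, y.shape)
```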

### Step 2: Fit a Linear Regression Model

- Use `sklearn.linear_model.LinearRegression` to fit a least squares model.
- Extract the slope (coefficient) and intercept.
- Predict the outputs using the fitted model.

The documentation for `LinearRegression` can be found [here](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html).
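
A minimal sketch of the fitting step follows. It regenerates a synthetic dataset with arbitrary settings so that it runs on its own; the variable names are illustrative.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression

# Synthetic data with arbitrary illustrative settings (assumptions).
X, y = make_regression(n_samples=100, n_features=1, noise=10.0,
                       bias=5.0, random_state=0)

# LinearRegression fits an intercept by default (fit_intercept=True).
model = LinearRegression()
model.fit(X, y)

slope = model.coef_[0]        # coef_ has one entry per feature
intercept = model.intercept_  # scalar for a single target
y_pred = model.predict(X)     # fitted values, shape (n_samples,)

print(f"slope = {slope:.4f}, intercept = {intercept:.4f}")
```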

### Step 3: Visualize the Fit

- Plot the original data points $(x, y)$
- Overlay the fitted regression line

### Step 4: Verify $b-Ax\perp \operatorname{Range}(A)$

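- Compute $r=b-Ax$, where $b$ is the vector of targets, $A$ is the design matrix, and $x$ here denotes the vector of fitted coefficients rather than the independent variable (be careful how you include the y-intercept: $A$ needs a column of ones)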
- Compute $A^{T}r$ and verify this is a vector of zeros (or near-zero numbers)
- Compute the quantities $\Vert Ax\Vert_{2}^{2}$, $\Vert r\Vert_{2}^{2}$, and $\Vert b\Vert_{2}^{2}$ and numerically confirm that $\Vert Ax\Vert_{2}^{2} + \Vert r\Vert_{2}^{2} = \Vert b\Vert_{2}^{2}$
- You will want to use the `numpy.linalg.norm` function to compute the 2-norm of a vector.
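
The checks above can be sketched as follows. The data settings are arbitrary, and the names `A`, `x`, `b`, and `r` are chosen to match the notation of this step; note that `x` holds the slope and intercept, and `A` carries a column of ones so that `A @ x` includes the intercept.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression

# Synthetic data and fit with arbitrary illustrative settings (assumptions).
X, b = make_regression(n_samples=100, n_features=1, noise=10.0,
                       bias=5.0, random_state=0)
model = LinearRegression().fit(X, b)

# Design matrix with a column of ones, and the matching coefficient vector.
A = np.column_stack([X[:, 0], np.ones(X.shape[0])])
x = np.array([model.coef_[0], model.intercept_])

# Residual and its inner products with the columns of A.
r = b - A @ x
print("A^T r =", A.T @ r)  # entries should be zero up to rounding error

# Pythagorean identity: ||Ax||^2 + ||r||^2 = ||b||^2.
lhs = np.linalg.norm(A @ x) ** 2 + np.linalg.norm(r) ** 2
rhs = np.linalg.norm(b) ** 2
print("||Ax||^2 + ||r||^2 =", lhs)
print("||b||^2            =", rhs)
print("identity holds:", np.isclose(lhs, rhs))
```

Leaving out the column of ones would drop the intercept from $Ax$, and the residual would then generally fail to be orthogonal to the columns of $A$.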

In [None]:
#TODO Steps 1-3

In [None]:
#TODO Step 4

## Problem: LU factorization for $4\times4$ matrices

Using the `scipy.linalg.lu` function, compute the LU factorization of the following two matrices:

$$
A_{1}
=
\begin{bmatrix}
4 & 2 & 3 & 1 \\
1 & 3 & 2 & 5 \\
3 & 1 & 4 & 2 \\
2 & 4 & 1 & 3 \\
\end{bmatrix},
\quad
A_{2}
=
\begin{bmatrix}
2 & 1 & 3 & 4 \\
4 & 2 & 6 & 8 \\
6 & 0 & 9 & 12 \\
1 & 1 & 1 & 1 \\
\end{bmatrix}.
$$

The documentation for the `scipy.linalg.lu` function can be found [here](https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.lu.html). Observe that this function returns a permutation matrix $P$ as well as the matrices $L$ and $U$, such that $A = PLU$.

Be sure to print out the matrices $P$, $L$ and $U$ for each case, and verify that $A=PLU$.
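
To illustrate the calling convention, here is a sketch on a small $3\times3$ matrix `M` that is made up for this example and is not one of the exercise matrices. The same pattern should carry over to $A_{1}$ and $A_{2}$.

```python
import numpy as np
from scipy.linalg import lu

# Hypothetical 3x3 example (assumption, not from the exercise).
M = np.array([[2.0, 1.0, 1.0],
              [4.0, 3.0, 3.0],
              [8.0, 7.0, 9.0]])

# scipy.linalg.lu returns P, L, U with M = P @ L @ U.
# L is unit lower triangular and U is upper triangular.
P, L, U = lu(M)

print("P =\n", P)
print("L =\n", L)
print("U =\n", U)
print("M equals P @ L @ U:", np.allclose(M, P @ L @ U))
```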

For $A_{2}$, what do you notice about the rows of $U$? What does this imply about the invertibility of $A_{2}$?
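
As a hint for what to look for, the sketch below factors a made-up singular $3\times3$ matrix `S` (again, not one of the exercise matrices) whose second row is twice its first. Since $\det(S) = \det(P)\det(L)\det(U)$, with $\det(P)=\pm 1$ and $\det(L)=1$, the matrix is singular exactly when the diagonal of $U$ contains a zero.

```python
import numpy as np
from scipy.linalg import lu

# Hypothetical singular matrix (assumption): row 2 is 2 * row 1.
S = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])

P, L, U = lu(S)
diag_U = np.diag(U)

print("U =\n", U)
print("diagonal of U:", diag_U)

# A (near-)zero pivot on the diagonal of U signals a singular matrix.
has_zero_pivot = bool(np.isclose(diag_U, 0.0).any())
print("zero pivot found:", has_zero_pivot)
print("rank of S:", np.linalg.matrix_rank(S))
```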


In [None]:
#TODO A1 LU factorization


In [None]:
# TODO A2 LU factorization
