**Instructions for Assignment 2: TLS Problem on Textile Data**

1. **Objective**:  
   In this assignment, you will write modular Python code in the provided Colab notebook to solve the total least squares (TLS) problem for textile data.

2. **Notebook Organization**:  
   - Ensure the notebook is well-structured and modular.  
   - Include all necessary library imports and helper functions in the earlier cells.  
   - Write the final function, `test_TLS(data_set)`, in the **last cell** of the notebook.

3. **Function Requirements**:  
   - Define the function `test_TLS(data_set)` to perform the following:
     - Compute the **regression coefficients**: `b2`, `b1`, `b0`.
     - Calculate the **Frobenius norm**: `norm`.  
   - The input `data_set` will be a pandas DataFrame created from a CSV file.  
   - The function must return a tuple in the format:  
     ```python
     (b2, b1, b0, norm)
     ```
   - Ensure all returned values are rounded to **2 decimal places**.

4. **Regression Coefficient Format**:  
   - The coefficients should correspond to a fitted model of the form:  
     `Price = b2 × Color + b1 × Quality + b0`

5. **Testing the Function**:  
   - Your function will be tested using a call like this:  
     ```python
     (b2, b1, b0, norm) = test_TLS(public_data_set)
     ```
   - The `public_data_set` is a pandas DataFrame created from the file `textile_data.csv`.

6. **Expected Output Ranges**:  
   For the provided **public dataset**, your function should produce results in the following ranges:  
   - `b2`: Between 2.0 and 2.1  
   - `b1`: Between 1.5 and 1.6  
   - `b0`: Between 0.19 and 0.21  
   - `norm`: Between 62 and 63  

7. **Evaluation Process**:  
   - The evaluation of your notebook will involve running **all the cells sequentially** and testing the function with **private test cases**.  
   - Ensure the following:  
     1. **Order of Functions and Declarations**: Maintain the proper sequence of functions and declarations in the notebook.  
     2. **No Additional or Unwanted Code**: Remove any extra cells, variables, or function declarations that are not relevant to the solution. Your final submission should only contain the necessary code to solve the problem.

8. **Submission Guidelines**:  
   - Ensure your notebook is clean, with appropriate comments and well-documented code.  
   - Verify your function’s output matches the expected ranges before submission.
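
The range check in step 8 can be automated. The following is a minimal sketch, assuming the result tuple is ordered `(b2, b1, b0, norm)` as in step 3; the helper name `check_ranges` and the example values are hypothetical and are not output from the real dataset.

```python
# Expected ranges for the public dataset, copied from step 6 of the instructions.
EXPECTED_RANGES = {
    "b2": (2.0, 2.1),
    "b1": (1.5, 1.6),
    "b0": (0.19, 0.21),
    "norm": (62, 63),
}


def check_ranges(result, ranges=EXPECTED_RANGES):
    """Return the names of any values that fall outside their expected range."""
    names = ("b2", "b1", "b0", "norm")
    failures = []
    for name, value in zip(names, result):
        low, high = ranges[name]
        if not (low <= value <= high):
            failures.append(name)
    return failures


# Illustrative values only (hypothetical, not computed from textile_data.csv).
example_result = (2.05, 1.55, 0.2, 62.5)
print(check_ranges(example_result))  # prints []
```

An empty list means every value is inside its range; otherwise the list names the values to investigate.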

In [56]:
import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler
from scipy.linalg import svd

In [57]:
def frobenius_norm(matrix):
    return np.linalg.norm(matrix, 'fro')

def get_corrected_estimates(data, scaler_X, scaler_Y):
    # Separate the features (X) and the target variable (Y)
    X = data[['Color', 'Quality']].values
    Y = data['Price'].values

    # Scale the data
    X_scaled = scaler_X.fit_transform(X)
    Y_scaled = scaler_Y.fit_transform(Y.reshape(-1, 1))

    # Construct the augmented data matrix with an intercept
    ones = np.ones((X_scaled.shape[0], 1))
    X_aug = np.hstack((ones, X_scaled, Y_scaled))

    # Perform a thin SVD on the augmented data matrix
    U, S, Vt = svd(X_aug, full_matrices=False)

    # Set the smallest singular value to zero
    S[-1] = 0

    # Reconstruct the augmented data matrix from the reduced singular values
    X_aug_corrected = (U * S) @ Vt

    # Extract the corrected estimates for X and Y (dropping the intercept column)
    X_corrected = X_aug_corrected[:, 1:-1]
    Y_corrected = X_aug_corrected[:, -1]

    # Transform corrected estimates back to original scale
    X_corrected_original = scaler_X.inverse_transform(X_corrected)
    Y_corrected_original = scaler_Y.inverse_transform(Y_corrected.reshape(-1, 1))

    return X_corrected_original, Y_corrected_original
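
The rank-reduction step used in `get_corrected_estimates` has a property that is easy to check on its own: zeroing the smallest singular value changes the matrix by a correction whose Frobenius norm equals that singular value. The sketch below illustrates this on a synthetic random matrix (an assumption made for illustration; it is not the textile data and does not reproduce the notebook's scaling).

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 3))  # synthetic stand-in for a scaled [X, Y] matrix

# Thin SVD: U is 50x3, S holds 3 singular values in descending order, Vt is 3x3
U, S, Vt = np.linalg.svd(A, full_matrices=False)

# Zero the smallest singular value and rebuild the matrix
S_reduced = S.copy()
S_reduced[-1] = 0.0
A_corrected = (U * S_reduced) @ Vt

# The Frobenius norm of the correction equals the dropped singular value
correction_norm = np.linalg.norm(A - A_corrected, "fro")
print(np.isclose(correction_norm, S[-1]))

# The corrected matrix has lost one rank
print(np.linalg.matrix_rank(A_corrected))
```

Note that this measures the norm of the correction `A - A_corrected`, whereas the notebook's `norm` is computed on the corrected matrix itself after inverse scaling.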


In [61]:
def test_TLS(data_set):

    # Separate the features (X) and the target variable (Y)
    X = data_set[['Color', 'Quality']].values
    Y = data_set['Price'].values

    # Scale the data
    scaler_X = StandardScaler()
    scaler_Y = StandardScaler()
    X_scaled = scaler_X.fit_transform(X)
    Y_scaled = scaler_Y.fit_transform(Y.reshape(-1, 1))

    # Construct the augmented data matrix with an intercept
    ones = np.ones((X_scaled.shape[0], 1))
    X_aug = np.hstack((ones, X_scaled, Y_scaled))

    # Perform a thin SVD on the augmented data matrix (only Vt is needed here)
    _, _, Vt = svd(X_aug, full_matrices=False)

    # The right singular vector for the smallest singular value is the last row of Vt
    v_n_plus_1 = Vt[-1, :]

    # Partition the singular vector; the intercept component (index 0) is not used,
    # because the intercept is recovered from the scaler means below
    v_X = v_n_plus_1[1:-1]
    v_Y = v_n_plus_1[-1]

    # Compute the regression coefficients in standardized units
    beta = -v_X / v_Y

    # Get corrected estimates for the data matrices
    X_corrected, Y_corrected = get_corrected_estimates(data_set, scaler_X, scaler_Y)

    # Calculate the total Frobenius Norm
    norm = frobenius_norm(np.hstack((X_corrected, Y_corrected)))

    # Inverse transform the coefficients
    beta_original = scaler_Y.scale_ * beta / scaler_X.scale_
    beta_0_original = scaler_Y.mean_ - np.dot(scaler_X.mean_, beta_original)

    # Round the results to 2 decimal places
    beta_0_original = np.round(beta_0_original, 2)
    beta_original = np.round(beta_original, 2)
    norm = np.round(norm, 2)

    return (beta_original[0], beta_original[1], beta_0_original[0], norm)
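
The core step in `test_TLS`, reading the coefficients off the right singular vector with the smallest singular value, can be sanity-checked on synthetic data with known coefficients. The following is a minimal sketch under stated assumptions: the slopes, intercept, and noise level are hypothetical, noise is added equally to features and target, and the data are only centered rather than standardized as in the notebook.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 500
true_beta = np.array([2.0, 1.5])  # hypothetical slopes
true_intercept = 0.2              # hypothetical intercept

X_clean = rng.normal(size=(n, 2))
y_clean = X_clean @ true_beta + true_intercept

# TLS assumes measurement noise in both the features and the target
noise = 0.05
X = X_clean + rng.normal(scale=noise, size=X_clean.shape)
y = y_clean + rng.normal(scale=noise, size=n)

# Center the data so the intercept drops out of the SVD step
X_mean, y_mean = X.mean(axis=0), y.mean()
Z = np.column_stack((X - X_mean, y - y_mean))

# Right singular vector for the smallest singular value is the last row of Vt
_, _, Vt = np.linalg.svd(Z, full_matrices=False)
v = Vt[-1]

# TLS slopes: -v_X / v_y; the intercept follows from the column means
beta = -v[:-1] / v[-1]
intercept = y_mean - X_mean @ beta

print(np.round(beta, 2), round(float(intercept), 2))
```

The recovered values should land close to the hypothetical ones used to generate the data, which gives some confidence in the sign convention and the partitioning of the singular vector.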


In [1]:
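# Run the import and function-definition cells above first: this cell relies on
# `pd` and `test_TLS`, and raises a NameError if run alone in a fresh kernel.
public_data_set = pd.read_csv('textile_data.csv')
test_TLS(public_data_set)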

NameError: name 'pd' is not defined