# ENGR 240: General Linear Least Squares (GLLS)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/WCC-Engineering/ENGR240/blob/main/Class%20Demos%20and%20Activities/Week%205/Worksheet%205-2%20General%20Linear%20Least%20Squares.ipynb)

## Introduction

General Linear Least Squares (GLLS) fits models that are linear in their coefficients, even when they are nonlinear in the independent variable, including nonlinear models that can be transformed into that form. In this worksheet, we'll implement this method in Python using NumPy's `np.linalg.lstsq()` function, applying it to a thermistor calibration problem.

### Learning Objectives
- Understand the principles of General Linear Least Squares
- Transform nonlinear models into linear form
- Implement GLLS using NumPy's `np.linalg.lstsq()` function
- Apply GLLS to a thermistor calibration problem
- Evaluate fit quality and visualize results

### The Thermistor Calibration Problem

Thermistors are resistors whose resistance changes strongly with temperature, which makes them useful as temperature sensors. The relationship between temperature and resistance is nonlinear, but can be modeled using the Steinhart-Hart equation:

$$\frac{1}{T} = A + B\ln(R) + C[\ln(R)]^3$$

Where:
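- $T$ is the temperature. The standard Steinhart-Hart equation uses absolute temperature in kelvin; in this worksheet we fit the Celsius data directly, so the resulting coefficients will differ from published Steinhart-Hart values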
- $R$ is the resistance in ohms
- $A$, $B$, and $C$ are coefficients specific to the thermistor

Our goal is to determine the coefficients $A$, $B$, and $C$ that provide the best fit to experimental data.

## Understanding General Linear Least Squares (GLLS)

GLLS is used to fit models of the form:

$$y = \sum_{j=1}^{m} c_j Z_j(x)$$

Where:
- $y$ is the dependent variable (possibly transformed)
- $x$ is the independent variable
- $Z_j(x)$ are basis functions
- $c_j$ are the coefficients we want to determine

The key insight of GLLS is that the model only has to be linear in the coefficients $c_j$. The basis functions $Z_j(x)$ can be arbitrarily nonlinear in $x$, and $y$ can itself be a transformation of the measured quantity. Because the unknowns still enter linearly, we can use standard linear least squares techniques.

The steps for implementing GLLS are:
1. Transform the model into a linear form
2. Define the basis functions $Z_j(x)$
3. Set up the Z matrix (each column corresponds to a basis function evaluated at all data points)
4. Solve the system using linear least squares
5. Evaluate the quality of the fit
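
As a rough illustration of these steps on a model that is not the thermistor problem, the sketch below fits a quadratic $y = c_1 + c_2 x + c_3 x^2$. The data are made up (generated from known coefficients, with no noise), and the variable names are chosen here only for illustration.

```python
import numpy as np

# Made-up illustrative data generated from y = 1 + 2x + 0.5x^2 (no noise)
x = np.linspace(0.0, 4.0, 9)
y = 1.0 + 2.0 * x + 0.5 * x**2

# Steps 1-2: the model is already linear in c1, c2, c3,
# with basis functions Z1 = 1, Z2 = x, Z3 = x^2
# Step 3: one column per basis function, one row per data point
Z = np.column_stack([np.ones_like(x), x, x**2])

# Step 4: least-squares solution of Z @ c ≈ y
c, residuals, rank, sing_vals = np.linalg.lstsq(Z, y, rcond=None)

# Step 5: compare the fitted values with the data
y_fit = Z @ c
max_abs_error = np.max(np.abs(y - y_fit))
print("Coefficients:", c)
print("Largest absolute error:", max_abs_error)
```

Because the data contain no noise, the recovered coefficients should match the generating values to within rounding error. With real measurements the fit would only be approximate.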

## Setup and Imports

Let's import the necessary libraries and set up our experimental data:

In [None]:
import numpy as np
import matplotlib.pyplot as plt

# Sample thermistor data: resistance (ohms) and temperature (Celsius)
thermistor_data = np.array([
    [4212, 10],
    [2815, 20],
    [1921, 30],
    [1340, 40],
    [944, 50],
    [682, 60],
    [496, 70]
])

# Extract resistance and temperature data
R_data = thermistor_data[:, 0]  # Resistance in ohms
T_data = thermistor_data[:, 1]  # Temperature in Celsius

print("Resistance data (ohms):", R_data)
print("Temperature data (°C):", T_data)

## Task 1: Linearize the Model and Define Basis Functions

For the thermistor model $T = \frac{1}{A + B\ln(R) + C[\ln(R)]^3}$, we need to linearize it to fit into the GLLS framework.

Your tasks:
1. Transform the model to express it in a linear form.
2. Identify the appropriate dependent variable after transformation.
3. Define the basis functions $Z_j(x)$.

<details>
<summary>Hint</summary>
Consider taking the reciprocal of both sides of the equation and see what you get.
</details>
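
The same reciprocal idea applies to other models. As an illustration that is separate from the thermistor problem, consider the saturation-growth-rate model $y = \frac{\alpha x}{\beta + x}$, where the symbols $\alpha$ and $\beta$ are chosen here only for this example. Taking the reciprocal of both sides gives:

$$\frac{1}{y} = \frac{\beta + x}{\alpha x} = \frac{1}{\alpha} + \frac{\beta}{\alpha}\cdot\frac{1}{x}$$

In GLLS terms, the transformed dependent variable is $1/y$, the basis functions are $Z_1(x) = 1$ and $Z_2(x) = 1/x$, and the coefficients are $c_1 = 1/\alpha$ and $c_2 = \beta/\alpha$.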

In [None]:
# Step 1: Transform the model
# Hint: Take the reciprocal of both sides

# Step 2: Define the transformed dependent variable
# Transformed_y = ?

# Step 3: Define the basis functions
# Z1(x) = ?
# Z2(x) = ?
# Z3(x) = ?

# Your code here:

## Task 2: Set Up the Z Matrix

Now, we need to set up the Z matrix where each column corresponds to a basis function evaluated at all data points.

Your tasks:
1. Create the Z matrix with dimensions (n × m), where n is the number of data points and m is the number of basis functions.
2. Fill each column with the values of the corresponding basis function evaluated at each data point.
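
One possible way to assemble such a matrix is sketched below, using a hypothetical set of basis functions ($1$, $\sin x$, $\cos x$) and made-up sample points rather than the thermistor basis. The list name `basis_functions` is an assumption made for this example.

```python
import numpy as np

# Hypothetical basis functions for illustration: Z1 = 1, Z2 = sin(x), Z3 = cos(x)
basis_functions = [
    lambda x: np.ones_like(x),
    np.sin,
    np.cos,
]

x = np.array([0.0, 0.5, 1.0, 1.5, 2.0])  # n = 5 made-up sample points

# Evaluate each basis function at every x and place the results side by side
Z = np.column_stack([f(x) for f in basis_functions])

print("Z matrix shape:", Z.shape)  # (n, m)
print(Z)
```

Keeping the basis functions in a list means the same line builds the Z matrix for any model; only the list changes.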

In [None]:
# Create Z matrix
# Z = ?

# Print Z matrix to verify
# print("Z matrix shape:", Z.shape)
# print("Z matrix:")
# print(Z)

# Your code here:

## Task 3: Solve Using np.linalg.lstsq()

Now, we'll use the `np.linalg.lstsq()` function to solve the linear system and find the coefficients.

Your tasks:
1. Set up the transformed dependent variable vector.
2. Use `np.linalg.lstsq()` to find the coefficients.
3. Extract the coefficients from the result.
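
`np.linalg.lstsq()` returns a tuple of four items: the solution vector, the sum of squared residuals, the rank of the matrix, and its singular values. The sketch below shows the unpacking on a small made-up system (a straight line through four points), not on the thermistor data.

```python
import numpy as np

# Made-up overdetermined system: 4 equations, 2 unknowns (intercept and slope)
Z = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])  # lies exactly on y = 1 + 2x

# Unpack all four return values; only the first is needed for the fit
coeffs, residuals, rank, singular_values = np.linalg.lstsq(Z, y, rcond=None)

print("Coefficients:", coeffs)
print("Rank of Z:", rank)
```

If only the coefficients are needed, indexing the result with `[0]` is a common shortcut.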

In [None]:
# Create the transformed dependent variable vector
# y_transformed = ?

# Solve the system using np.linalg.lstsq()
# result = np.linalg.lstsq(?, ?, rcond=None)

# Extract the coefficients
# coefficients = ?

# Print the coefficients
# print("Coefficients (A, B, C):", coefficients)

# Your code here:

## Task 4: Calculate Fit Quality Statistics

Now, let's evaluate how well our model fits the data. We'll calculate the standard error of the estimate ($S_{y/x}$) and the coefficient of determination ($R^2$).

Copy your fit quality statistics function from Worksheet 5-1. If you don't have it, create a function that calculates R² and standard error.
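
For reference, here is a minimal sketch of one common way to compute these two statistics. It assumes the usual definitions $R^2 = 1 - S_r/S_t$ and $S_{y/x} = \sqrt{S_r/(n - m)}$, where $m$ is the number of fitted coefficients. The function name `r2_and_syx` and its `n_params` argument are hypothetical, and your Worksheet 5-1 version may be organized differently.

```python
import numpy as np

def r2_and_syx(y_actual, y_predicted, n_params):
    """Return (R^2, standard error of the estimate) for a fitted model."""
    y_actual = np.asarray(y_actual, dtype=float)
    y_predicted = np.asarray(y_predicted, dtype=float)

    s_r = np.sum((y_actual - y_predicted) ** 2)        # residual sum of squares
    s_t = np.sum((y_actual - np.mean(y_actual)) ** 2)  # total sum of squares

    r_squared = 1.0 - s_r / s_t
    # Degrees of freedom: n data points minus the number of fitted coefficients
    std_error = np.sqrt(s_r / (len(y_actual) - n_params))
    return r_squared, std_error

# Made-up values for illustration only
y_obs = np.array([1.0, 2.0, 3.0, 4.0])
y_hat = np.array([1.1, 1.9, 3.1, 3.9])
r2, syx = r2_and_syx(y_obs, y_hat, n_params=2)
print(f"R-squared: {r2:.4f}, standard error: {syx:.4f}")
```

Note that the statistics here are computed on whatever quantities are passed in. For a linearized model, comparing actual and predicted values in the original units (temperature) usually gives the more meaningful picture.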

In [None]:
# Copy your fit_quality_stats function from Worksheet 5-1
def fit_quality_stats(y_actual, y_predicted):
    """
    Calculate fit quality statistics: R² and standard error.

    Parameters:
    -----------
    y_actual : array-like
        Actual values
    y_predicted : array-like
        Predicted values

    Returns:
    --------
    r_squared : float
        Coefficient of determination
    std_error : float
        Standard error of the estimate
    """
    # Your code from Worksheet 5-1 here
    pass

# Create a function to predict temperatures using the fitted model
def predict_temperature(R, coefficients):
    """
    Predict temperature from resistance using the fitted model.

    Parameters:
    -----------
    R : array-like
        Resistance values in ohms
    coefficients : array-like
        Model coefficients [A, B, C]

    Returns:
    --------
    T : array-like
        Predicted temperature values in Celsius
    """
    # Your code here
    pass

# Calculate predicted temperatures
# T_predicted = predict_temperature(R_data, coefficients)

# Calculate fit quality statistics
# r_squared, std_error = fit_quality_stats(T_data, T_predicted)

# Print results
# print(f"R-squared: {r_squared:.4f}")
# print(f"Standard Error: {std_error:.4f} °C")

# Your code here:

## Task 5: Visualize the Fit

Finally, let's create a visualization to see how well our model fits the data.

Your tasks:
1. Generate a range of resistance values for smooth plotting.
2. Calculate the predicted temperatures for these resistance values.
3. Create a scatter plot of the original data.
4. Add a line plot of the fitted model.
5. Include labels, a legend, and appropriate formatting.
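
One detail worth checking for the first task: `np.logspace()` takes base-10 exponents as its start and stop arguments, not the endpoint values themselves. The sketch below uses a made-up resistance range (the names `R_min` and `R_max` are assumptions for this example) and also shows `np.geomspace()`, which accepts the endpoints directly.

```python
import numpy as np

# Made-up resistance range for illustration (ohms)
R_min, R_max = 100.0, 10_000.0

# logspace expects exponents: points run from 10**start to 10**stop
R_grid = np.logspace(np.log10(R_min), np.log10(R_max), 50)

# geomspace takes the endpoints themselves and produces the same spacing
R_grid_alt = np.geomspace(R_min, R_max, 50)

print("First and last points:", R_grid[0], R_grid[-1])
```

Either call produces points that look evenly spaced on a log-scaled axis.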

In [None]:
# Generate resistance values for smooth plotting
# R_smooth = np.logspace(?, ?, 100)  # Logarithmic spacing; start and stop are base-10 exponents, not resistances

# Calculate predicted temperatures
# T_smooth = predict_temperature(R_smooth, coefficients)

# Create a figure
# plt.figure(figsize=(10, 6))

# Plot original data points
# plt.scatter(?, ?, label=?)

# Plot fitted curve
# plt.plot(?, ?, label=?, color='red')

# Add labels and title
# plt.xlabel(?)
# plt.ylabel(?)
# plt.title(?)

# Add legend
# plt.legend()

# Set x-axis to log scale (the model is a function of ln(R))
# plt.xscale('log')

# Add grid
# plt.grid(True, which='both', linestyle='--', alpha=0.7)

# Display the plot
# plt.show()

# Your code here:

## Conclusion

In this worksheet, you've implemented the General Linear Least Squares method to fit a nonlinear thermistor model. This approach can be applied to many other problems where the model is nonlinear but can be transformed into a linear form.

Key takeaways:
1. GLLS allows us to use linear least squares techniques for certain nonlinear models.
2. The key is transforming the model into a linear form and defining appropriate basis functions.
3. NumPy's `np.linalg.lstsq()` function is a powerful tool for solving linear least squares problems.
4. Evaluating the fit with statistics and visualization is crucial for assessing model quality.

## Discussion Questions

1. How would you modify this approach for a different mathematical model?
2. What are the advantages and limitations of linearizing a model compared with applying nonlinear optimization techniques directly, as you did in Worksheet 5-1?
