# Linear Regression – Functional Test Notebook

This notebook is used to **numerically test** the Rust-based `MyRustLinearRegression`
model exposed through the `rust_core` Python module.

The goal here is *correctness*, not speed:
- Verify that training runs without errors
- Check that predictions have the expected shape
- Compare predictions against scikit-learn's `LinearRegression` on simple synthetic data


## Imports


In [None]:
import importlib
import numpy as np

import rust_core
from sklearn.linear_model import LinearRegression

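# Attempt a reload after a rebuild. Note: importlib.reload() usually does not
# pick up a rebuilt compiled extension module; restart the kernel if in doubt.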
importlib.reload(rust_core)

print("rust_core loaded:", rust_core)


## Simple numerical test data

In [None]:
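# Toy example with two features. In every row x2 = x1 + 1, so the features are
# collinear and the fitted coefficients are not unique; compare predictions only.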

X_train = np.array([
    [1.0, 2.0],
    [3.0, 4.0],
    [5.0, 6.0],
    [7.0, 8.0],
], dtype=np.float64)

y_train = np.array([10.0, 20.0, 30.0, 40.0], dtype=np.float64)

X_test = np.array([
    [2.0, 3.0],
    [6.0, 7.0],
], dtype=np.float64)

print("X_train shape:", X_train.shape)
print("y_train shape:", y_train.shape)
print("X_test shape:", X_test.shape)


### Optional: feature scaling

In [None]:
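# Column-wise L2 normalization: divide each feature by its L2 norm on the training set
norms = np.linalg.norm(X_train, axis=0)
norms[norms == 0.0] = 1.0  # avoid division by zero for all-zero columns

X_train_scaled = X_train / norms
X_test_scaled = X_test / norms  # reuse the training norms for the test set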

print("X_train_scaled:\n", X_train_scaled)
print("X_test_scaled:\n", X_test_scaled)
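
The scaling step can be wrapped in a small helper so that the training norms are always reused for any other array. This is a minimal sketch; the name `l2_scale_columns` is hypothetical and is not part of `rust_core` or of this notebook.

```python
import numpy as np


def l2_scale_columns(X_fit, *others):
    """Divide each column by its L2 norm computed on X_fit only.

    Hypothetical helper for illustration. Returns the norms, the scaled
    X_fit, and every array in `others` scaled by the same norms.
    """
    norms = np.linalg.norm(X_fit, axis=0)
    norms[norms == 0.0] = 1.0  # leave all-zero columns unchanged
    return (norms, X_fit / norms, *(X / norms for X in others))


X_train = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]])
X_test = np.array([[2.0, 3.0], [6.0, 7.0]])

norms, X_train_scaled, X_test_scaled = l2_scale_columns(X_train, X_test)

# Training columns now have unit L2 norm; test columns generally do not,
# because they are divided by the training norms.
print("train column norms:", np.linalg.norm(X_train_scaled, axis=0))
print("test column norms: ", np.linalg.norm(X_test_scaled, axis=0))
```

Returning the norms makes it possible to apply the same scaling to data that arrives later.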


## Test: Rust `MyRustLinearRegression` with a small number of iterations

In [None]:
# Short training run – just to check that it runs and returns predictions
lr_rust = rust_core.MyRustLinearRegression(
    learning_rate=0.05,
    iterations=10,
    mode=rust_core.Mode.Regression,
)

lr_rust.fit(X_train_scaled, y_train)
print("Rust model (10 iters) fitted successfully.")

pred_rust_10 = lr_rust.predict(X_test_scaled)
print("Predictions (10 iters):", pred_rust_10)
print("Shape:", pred_rust_10.shape)


## Test: Rust model with more iterations

In [None]:
# More iterations – should converge closer to the optimal solution
lr_rust_long = rust_core.MyRustLinearRegression(
    learning_rate=0.05,
    iterations=100_000,
    mode=rust_core.Mode.Regression,
)

lr_rust_long.fit(X_train_scaled, y_train)
print("Rust model (100k iters) fitted successfully.")

pred_rust_100k = lr_rust_long.predict(X_test_scaled)
print("Predictions (100k iters):", pred_rust_100k)
print("Shape:", pred_rust_100k.shape)


## Reference: scikit-learn `LinearRegression` on the same data

In [None]:
lr_sklearn = LinearRegression()
lr_sklearn.fit(X_train_scaled, y_train)

pred_sklearn = lr_sklearn.predict(X_test_scaled)
print("sklearn predictions:", pred_sklearn)
print("Shape:", pred_sklearn.shape)
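
As a cross-check that depends on neither model, the expected predictions can be derived directly from the data. Every training target equals `5 * x2`, so an exact least-squares fit should predict 15 and 35 for the two test rows. The sketch below confirms this with `np.linalg.lstsq`; the explicit intercept column and the helper `with_intercept` are choices made for this example. Because the two features are collinear, the coefficients are not unique, so only predictions are compared.

```python
import numpy as np

X_train = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]])
y_train = np.array([10.0, 20.0, 30.0, 40.0])
X_test = np.array([[2.0, 3.0], [6.0, 7.0]])

# Same column-wise L2 scaling as in the notebook
norms = np.linalg.norm(X_train, axis=0)
norms[norms == 0.0] = 1.0


def with_intercept(X):
    """Append a column of ones so the last coefficient acts as the intercept."""
    return np.hstack([X, np.ones((X.shape[0], 1))])


A_train = with_intercept(X_train / norms)
A_test = with_intercept(X_test / norms)

# lstsq returns the minimum-norm solution when the system is rank-deficient
coef, _, rank, _ = np.linalg.lstsq(A_train, y_train, rcond=None)

pred_closed_form = A_test @ coef
print("rank of design matrix:", rank)
print("closed-form predictions:", pred_closed_form)
```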


## Compare Rust vs scikit-learn numerically

In [None]:
def print_diff(name, a, b):
    """Print the element-wise difference a - b with its L2 norm and max abs value."""
    diff = a - b
    print(f"{name} diff:")
    print("  values:", diff)
    print("  L2 norm:", np.linalg.norm(diff))
    print("  max abs:", np.max(np.abs(diff)))
    print()

print("Rust (10 iters) vs sklearn:")
print_diff("10 iters", pred_rust_10, pred_sklearn)

print("Rust (100k iters) vs sklearn:")
print_diff("100k iters", pred_rust_100k, pred_sklearn)
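
The Rust implementation is not shown here. Assuming it performs some form of gradient descent (suggested by its `learning_rate` and `iterations` parameters), a plain NumPy batch gradient descent can illustrate why a 10-iteration run is expected to be far from the reference while a long run is not. `gd_fit_predict` is a hypothetical stand-in written for this example, not the `rust_core` algorithm.

```python
import numpy as np


def gd_fit_predict(X, y, X_new, learning_rate=0.05, iterations=10):
    """Batch gradient descent on mean squared error, starting from zeros.

    Hypothetical stand-in for illustration; not the rust_core implementation.
    """
    n_samples, n_features = X.shape
    weights = np.zeros(n_features)
    bias = 0.0
    for _ in range(iterations):
        residual = X @ weights + bias - y
        # Gradients of mean((X @ w + b - y) ** 2) with respect to w and b
        weights -= learning_rate * (2.0 / n_samples) * (X.T @ residual)
        bias -= learning_rate * (2.0 / n_samples) * residual.sum()
    return X_new @ weights + bias


X_train = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]])
y_train = np.array([10.0, 20.0, 30.0, 40.0])
X_test = np.array([[2.0, 3.0], [6.0, 7.0]])

norms = np.linalg.norm(X_train, axis=0)
X_train_scaled, X_test_scaled = X_train / norms, X_test / norms

expected = np.array([15.0, 35.0])  # training targets follow y = 5 * x2 exactly
errors = {}
for n_iter in (10, 1_000, 10_000):
    pred = gd_fit_predict(X_train_scaled, y_train, X_test_scaled, iterations=n_iter)
    errors[n_iter] = np.max(np.abs(pred - expected))
    print(f"{n_iter:>6} iterations, max abs error: {errors[n_iter]:.3e}")
```

If the Rust model behaves very differently from this sketch at comparable settings, that is a hint to look at its update rule, initialization, or loss scaling.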


## Notes

- If `100k` iterations is overkill or too slow, reduce it to something like `10_000`.
- You can turn this into an automated test by asserting that the difference between
  Rust and scikit-learn is below some tolerance, for example:

```python
assert np.allclose(pred_rust_100k, pred_sklearn, atol=1e-2)
```

- This notebook is meant for **functional testing**, while a separate notebook
  can be used for **performance benchmarking**.
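
The tolerance check from the notes could be packaged as a reusable assertion that also covers the shape check listed in the goals. This is a minimal sketch, assuming predictions arrive as array-likes; `assert_predictions_match` is a hypothetical helper name, and the sample arrays are placeholders standing in for real model output.

```python
import numpy as np


def assert_predictions_match(pred, reference, atol=1e-2):
    """Raise AssertionError unless pred matches reference in shape and value."""
    pred = np.asarray(pred, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    assert pred.shape == reference.shape, (
        f"shape mismatch: {pred.shape} vs {reference.shape}"
    )
    assert np.all(np.isfinite(pred)), "predictions contain NaN or inf"
    max_abs = np.max(np.abs(pred - reference))
    assert max_abs <= atol, f"max abs difference {max_abs:.3g} exceeds atol={atol}"


# Placeholder values standing in for pred_rust_100k and pred_sklearn
reference = np.array([15.0, 35.0])
assert_predictions_match(np.array([15.004, 34.997]), reference)

try:
    assert_predictions_match(np.array([12.0, 30.0]), reference)
except AssertionError as exc:
    print("rejected:", exc)
```

Unlike `np.allclose`, this helper applies only an absolute tolerance, which keeps the failure message easy to interpret for small targets like the ones used here.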
