# Supervised linear regression

This notebook shows how to use the agent to perform supervised linear regression. The output is compared to results from the `scikit-learn` package.

==========================================================================

* **Notebook dependencies**:
    * ...

* **Content**: Jupyter notebook accompanying Chapter 3 of the textbook "Fundamentals of Active Inference"

* **Author**: Sanjeev Namjoshi (sanjeev.namjoshi@gmail.com)

* **Version**: 0.1

In [20]:
import matplotlib as mpl
import numpy as np

from types import SimpleNamespace

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

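# The "seaborn-deep" style was renamed to "seaborn-v0_8-deep" in Matplotlib 3.6
style = "seaborn-v0_8-deep" if "seaborn-v0_8-deep" in mpl.style.available else "seaborn-deep"
mpl.style.use(style)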

We now build a linear regression agent that can learn from training data and then predict $y$ given some new $X$. This is a classic use case of linear regression in a supervised learning setting.

We use the same environment as before:

In [3]:
class StaticEnvironment:
    def __init__(self, params: dict) -> None:
        self.params = SimpleNamespace(**params)
        
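    def _noise(self) -> float:
        # Zero-mean Gaussian observation noise
        return np.random.normal(loc=0, scale=self.params.y_star_std)
    
    def _generating_function(self, x_star: np.ndarray) -> float:
        # Linear generating function: inner product of inputs and true parameters
        return x_star.T @ self.params.theta_star
    
    def generate(self, x_star: np.ndarray) -> float:
        # Prepend a 1 so that theta_star[0] acts as the intercept
        x_star = np.insert(x_star, 0, 1)
        return self._generating_function(x_star) + self._noise()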

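The agent now has a `predict()` function that attempts to predict $y$ given some new $X$ input. All this function does is run the new inputs (`X_star_new`) through the agent's generating function, using the learned parameters, to obtain the predicted $y$. The prediction is stored in `agent.y_pred` rather than returned.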

In [32]:
class MultipleLinearRegressionAgent:
    def __init__(self) -> None:
        ...
        
    def mle_theta(self, X: np.ndarray, y: np.ndarray) -> np.ndarray:
        # Least-squares (maximum likelihood) estimate via the pseudoinverse
        return np.linalg.pinv(X) @ y
    
    def build_data_matrix(self, X_star: np.ndarray) -> np.ndarray:
        # Prepend a column of ones for the intercept term
        return np.insert(X_star, 0, 1, axis=1)
    
    def learn_parameters(self, X_star: np.ndarray, y: np.ndarray) -> None:
        X = self.build_data_matrix(X_star)
        self.theta = self.mle_theta(X, y)
        
    def _generating_function(self, X: np.ndarray) -> np.ndarray:
        return X @ self.theta
        
    def predict(self, X_star_new: np.ndarray) -> None:
        # Predictions are stored in self.y_pred rather than returned
        X = self.build_data_matrix(X_star_new)
        self.y_pred = self._generating_function(X)
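
The `mle_theta()` method relies on the Moore-Penrose pseudoinverse. When the data matrix has full column rank, this should coincide with the ordinary least-squares solution $\hat{\theta} = (X^\top X)^{-1} X^\top y$. The following is a minimal sketch of that equivalence; the toy data, the seed, and all variable names are illustrative assumptions and not part of the notebook.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility

# Toy data: 50 samples, 3 features, plus a leading column of ones for the intercept
X_star = rng.uniform(0.01, 5.0, size=(50, 3))
X = np.insert(X_star, 0, 1, axis=1)
theta_true = np.array([3.0, 2.0, 4.0, 5.0])
y = X @ theta_true + rng.normal(0.0, 1.0, size=50)

# Route 1: pseudoinverse, as in mle_theta()
theta_pinv = np.linalg.pinv(X) @ y

# Route 2: normal equations, solved without forming an explicit inverse
theta_normal = np.linalg.solve(X.T @ X, X.T @ y)

# Route 3: NumPy's least-squares solver
theta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.allclose(theta_pinv, theta_normal), np.allclose(theta_pinv, theta_lstsq))
```

The pseudoinverse route has the practical advantage that it still returns a (minimum-norm) solution when $X^\top X$ is singular, whereas solving the normal equations directly would fail in that case.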

We generate data from the environment first.

In [33]:
# Environment parameters
env_params = {
    "theta_star"  : np.array([3., 2., 4., 5., 6.]), 
    "y_star_std"  : 1.,                   # Standard deviation of sensory data
    "C"           : 5                     # Number of parameters
}

# Initialize environment with parameters
env = StaticEnvironment(params=env_params)

# Generate data 
N       = 1000                                        # Number of samples
C       = env_params["theta_star"].shape[0]          # Number of parameters
x_range = np.linspace(start=0.01, stop=5, num=500)   # Support of x
X_star  = np.random.choice(x_range, size=(N, C-1))   # N random external states
y       = np.zeros(N)                                # Empty array for N data samples

# Generate N samples
for idx, x in enumerate(X_star):
    y[idx] = env.generate(x)
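
Because the generating function is linear, the sample-by-sample loop above could also be written as a single matrix product. The sketch below is one possible vectorized version; it bypasses `StaticEnvironment` and uses a seeded `np.random.default_rng` generator, both of which are assumptions made for illustration rather than what the notebook does.

```python
import numpy as np

rng = np.random.default_rng(42)  # assumption: seeded generator for reproducibility

theta_star = np.array([3., 2., 4., 5., 6.])  # same values as env_params["theta_star"]
y_star_std = 1.0                             # same value as env_params["y_star_std"]

N = 1000
C = theta_star.shape[0]
x_range = np.linspace(start=0.01, stop=5, num=500)
X_star = rng.choice(x_range, size=(N, C - 1))

# Prepend the column of ones once, then generate all N samples in one product
X = np.insert(X_star, 0, 1, axis=1)
y = X @ theta_star + rng.normal(loc=0.0, scale=y_star_std, size=N)

print(X.shape, y.shape)
```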

Now we create the agent, learn parameters, generate more data, and then predict $y$.

In [34]:
# Initialize agent and learn parameters
agent = MultipleLinearRegressionAgent()
agent.learn_parameters(X_star, y)

# Generate new data
X_star_new = np.random.choice(x_range, size=(N, C-1))   # N random external states
y_new      = np.zeros(N)                                # Empty array for N data samples

# Generate N samples
for idx, x_new in enumerate(X_star_new):
    y_new[idx] = env.generate(x_new)

# Predict new y
agent.predict(X_star_new)

To evaluate the performance we use the root mean-squared error (RMSE), $\sqrt{\frac{1}{N}\sum_{i=1}^{N}(\hat{y}_i - y_i)^2}$.

In [35]:
def rmse(y_true, y_pred):
    return np.sqrt(((y_pred - y_true) ** 2).mean())

In [36]:
rmse(y_new, agent.y_pred)

0.9995016143212305

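An RMSE of about 1 means the agent's predictions typically miss the true value by roughly 1 unit of light intensity. This is close to the standard deviation of the observation noise (`y_star_std = 1`), which is the error we would expect even from a model with the true parameters.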
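
To put this RMSE value in context, it can help to compute the error that remains when predicting with the true parameters, since the observation noise cannot be predicted. The sketch below does this under the same `theta_star` and noise standard deviation as the environment; the seed, the sample size, and the name `noise_floor` are illustrative assumptions.

```python
import numpy as np

def rmse(y_true, y_pred):
    return np.sqrt(((y_pred - y_true) ** 2).mean())

rng = np.random.default_rng(1)  # assumption: seeded generator for reproducibility

theta_star = np.array([3., 2., 4., 5., 6.])
y_star_std = 1.0
N = 100_000  # large sample so the estimate is stable

X = np.insert(rng.uniform(0.01, 5.0, size=(N, 4)), 0, 1, axis=1)
y_clean = X @ theta_star
y_noisy = y_clean + rng.normal(0.0, y_star_std, size=N)

# Predicting with the true parameters leaves only the noise as error
noise_floor = rmse(y_noisy, y_clean)
print(round(noise_floor, 2))
```

The printed value should land close to `y_star_std`, which suggests that an RMSE near 1 is about as low as a well-fitted model can be expected to reach on these data.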

## Supervised learning comparison

Now we compare these results with `scikit-learn`'s built-in linear regression fitter. One can think of the `learn_parameters()` function in our agent as equivalent to the `fit()` function in `scikit-learn`. First we generate $X$ and $y$ from the environment.

In [37]:
# Environment parameters
env_params = {
    "theta_star"  : np.array([3., 2., 4., 5., 6.]), 
    "y_star_std"  : 1.,                   # Standard deviation of sensory data
    "C"           : 5                     # Number of parameters
}

# Initialize environment with parameters
env = StaticEnvironment(params=env_params)

# Generate data 
N       = 1000                                        # Number of samples
C       = env_params["theta_star"].shape[0]          # Number of parameters
x_range = np.linspace(start=0.01, stop=5, num=500)   # Support of x
X_star  = np.random.choice(x_range, size=(N, C-1))   # N random external states
y       = np.zeros(N)                                # Empty array for N data samples

# Generate N samples
for idx, x in enumerate(X_star):
    y[idx] = env.generate(x)

Next we split into training and testing sets.

In [38]:
X_train, X_test, y_train, y_test = train_test_split(X_star, y, test_size=0.3, random_state=4885)
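
As a quick check of what the split produces, the sketch below applies the same `train_test_split` call to placeholder arrays (`X_demo` and `y_demo` are hypothetical stand-ins, not the notebook's data). With `test_size=0.3` and 1000 samples, 700 rows go to training and 300 to testing, and each feature row stays paired with its target.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: row i of X_demo starts with 4*i, and y_demo[i] == i,
# so the pairing between features and targets is easy to verify
X_demo = np.arange(4000, dtype=float).reshape(1000, 4)
y_demo = np.arange(1000, dtype=float)

X_train, X_test, y_train, y_test = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=4885
)

print(X_train.shape, X_test.shape)
print(np.array_equal(X_train[:, 0], 4 * y_train))  # rows and targets still aligned
```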

Now we run linear regression with `sklearn` and compare the results to our model.

In [39]:
# scikit-learn
lr = LinearRegression()
lr.fit(X_train, y_train)
skl_y_pred = lr.predict(X_test)
skl_rmse = rmse(y_test, skl_y_pred)

# Our agent
agent = MultipleLinearRegressionAgent()
agent.learn_parameters(X_train, y_train)
agent.predict(X_test)
agent_y_pred = agent.y_pred
agent_rmse = rmse(y_test, agent_y_pred)

# Results
print(f"sklearn RMSE: {skl_rmse}.")
print(f"Agent RMSE: {agent_rmse}.")

sklearn RMSE: 1.0061497781750777.
Agent RMSE: 1.0061497781750837.


To see how far the agent's predictions deviate from sklearn's predictions, we can use the RMSE again.

In [41]:
rmse(skl_y_pred, agent_y_pred)

3.305980954218592e-14

As we can see, the two sets of predictions differ only at the level of floating-point round-off. Let's also examine the parameter estimates.

**Note**: Scikit-learn stores the intercept (`intercept_`) separately from the remaining coefficients (`coef_`), so we need to gather them together into a single vector.

In [51]:
print(f"sklearn parameter estimate: {np.round(np.insert(lr.coef_, 0, lr.intercept_),3)}.")
print(f"Agent parameter estimate  : {np.round(agent.theta, 3)}.")      

sklearn parameter estimate: [3.088 1.972 3.99  4.998 5.986].
Agent parameter estimate  : [3.088 1.972 3.99  4.998 5.986].
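
An alternative way to line up the two parameterizations is to hand scikit-learn the data matrix with the column of ones already prepended and set `fit_intercept=False`, so that `coef_` holds the full parameter vector. The sketch below illustrates this on toy data; the seed, sample size, and variable names are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(7)  # assumption: seeded generator for reproducibility
X_star = rng.uniform(0.01, 5.0, size=(200, 4))
X = np.insert(X_star, 0, 1, axis=1)  # data matrix with a leading column of ones
y = X @ np.array([3., 2., 4., 5., 6.]) + rng.normal(0.0, 1.0, size=200)

# Default: intercept estimated separately from the coefficients
lr_default = LinearRegression().fit(X_star, y)
theta_default = np.insert(lr_default.coef_, 0, lr_default.intercept_)

# Alternative: supply the ones column ourselves and disable the built-in intercept
lr_manual = LinearRegression(fit_intercept=False).fit(X, y)
theta_manual = lr_manual.coef_

print(np.allclose(theta_default, theta_manual))
```

This second form mirrors what the agent's `build_data_matrix()` and `mle_theta()` do, which may make the comparison of parameter vectors more direct.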
