#### Welcome to the coding portion of the SimOpt workshop!

In [None]:
# Ignore some warnings that pop up since this is running in a Jupyter notebook
# ruff: noqa: E402, F811

# Some setup...

import os

os.chdir("../")  # Move one level up to import simopt

import sys

sys.path.append("venv\\lib\\site-packages")
sys.path.append("..")

In [None]:
# CODE CELL [0]

# Import experiment_base module, which contains functions for experimentation.
import simopt.experiment_base as expbase
from simopt.experiment_base import PlotType

# Import ERM-Example problem and Random Search and ADAM solvers.
from simopt.models.ermexample import ERMExampleProblem
from simopt.solvers.adam import ADAM
from simopt.solvers.randomsearch import RandomSearch

In this portion of the workshop, we'll be working with a problem and two solvers.

**Problem:** Find the slope ($\beta_1$) and intercept ($\beta_0$) coefficients that minimize the training MSE of a simple linear regression model, i.e., least squares regression:

$$ \min_{\beta_0, \beta_1} \frac{1}{n}\sum_{i=1}^n (y_i - (\beta_0 + \beta_1 x_i))^2, $$

where $Y(x) = 1 + x + \epsilon$ and $\epsilon \sim N(0, 0.1)$.
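To make the objective concrete, here is a minimal standalone sketch (plain Python, not SimOpt code) that simulates data from $Y(x) = 1 + x + \epsilon$, evaluates the training MSE, and computes the closed-form least squares fit. Two things are assumptions, since the text above does not specify them: the inputs are drawn as $x \sim \text{Uniform}(0, 1)$, and the 0.1 in $N(0, 0.1)$ is read as a variance. The function names are illustrative only.

```python
import random


def make_data(n, rng, noise_var=0.1):
    """Simulate n (x, y) pairs from Y(x) = 1 + x + eps."""
    # Assumptions: x ~ Uniform(0, 1), and 0.1 is the noise variance.
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    ys = [1.0 + x + rng.gauss(0.0, noise_var**0.5) for x in xs]
    return xs, ys


def mse(beta0, beta1, xs, ys):
    """Training MSE of the line y = beta0 + beta1 * x."""
    return sum((y - (beta0 + beta1 * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def least_squares(xs, ys):
    """Closed-form minimizer of the training MSE."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    beta1 = sxy / sxx
    beta0 = y_bar - beta1 * x_bar
    return beta0, beta1


rng = random.Random(0)
xs, ys = make_data(200, rng)
b0, b1 = least_squares(xs, ys)
print(f"beta0 = {b0:.3f}, beta1 = {b1:.3f}, MSE = {mse(b0, b1, xs, ys):.4f}")
```

The fitted coefficients should land near the true values (1, 1). The solvers below search for the same minimizer using only noisy minibatch evaluations of the loss.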

**Solver:** Random Search
* Randomly samples ($\beta_0$, $\beta_1$) from a bivariate normal distribution with mean vector (1, 1) and variance-covariance matrix (1, 0; 0, 1).
* Draws a fixed number of observations from the training dataset (i.e., a minibatch) and computes the squared error loss.
* [Full documentation](https://simopt.readthedocs.io/en/latest/randomsearch.html)
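The two steps above can be sketched in plain Python. This is a simplified illustration, not SimOpt's `RandomSearch` implementation: it ignores the initial solution and common random numbers, and the helper names and the synthetic training set (with $x \sim \text{Uniform}(0, 1)$ and noise variance 0.1) are assumptions made for the example.

```python
import random


def minibatch_loss(beta, data, batch_size, rng):
    """Mean squared error of beta = (beta0, beta1) on a random minibatch."""
    batch = rng.sample(data, batch_size)
    return sum((y - (beta[0] + beta[1] * x)) ** 2 for x, y in batch) / batch_size


def random_search(data, budget, batch_size, rng):
    """Keep the sampled candidate with the lowest estimated loss."""
    best_beta, best_loss = None, float("inf")
    # Each candidate consumes batch_size observations from the budget.
    for _ in range(budget // batch_size):
        # With an identity covariance matrix, the coordinates are independent N(1, 1).
        beta = (rng.gauss(1.0, 1.0), rng.gauss(1.0, 1.0))
        loss = minibatch_loss(beta, data, batch_size, rng)
        if loss < best_loss:
            best_beta, best_loss = beta, loss
    return best_beta, best_loss


rng = random.Random(1)
# Hypothetical training set drawn from Y(x) = 1 + x + eps.
data = []
for _ in range(500):
    x = rng.uniform(0.0, 1.0)
    data.append((x, 1.0 + x + rng.gauss(0.0, 0.1**0.5)))

beta, loss = random_search(data, budget=2000, batch_size=100, rng=rng)
print(f"best beta = ({beta[0]:.3f}, {beta[1]:.3f}), estimated loss = {loss:.4f}")
```

With a budget of 2000 and 100 observations per solution, only 20 candidates are evaluated, which is why the budget and the sample size trade off against each other.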

**Solver:** ADAM
* A gradient-based search. Direct (IPA) gradient estimators are used if available; otherwise, a finite-difference estimator is used.
* Takes a fixed number of observations (replications) at each solution. This parameter is called `r`.
* [Full documentation](https://simopt.readthedocs.io/en/latest/adam.html)
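For intuition, here is a minimal sketch of the standard Adam update applied to this regression problem, using the exact minibatch gradient of the squared error. It is not SimOpt's `ADAM` solver: the step size `alpha`, the decay rates, and the synthetic data (with $x \sim \text{Uniform}(0, 1)$ and noise variance 0.1) are assumptions chosen for illustration, and the solver's own defaults may differ.

```python
import math
import random


def mse(beta, data):
    """Mean squared error of beta = (beta0, beta1) over all of data."""
    return sum((y - (beta[0] + beta[1] * x)) ** 2 for x, y in data) / len(data)


def minibatch_gradient(beta, batch):
    """Gradient of the minibatch MSE with respect to (beta0, beta1)."""
    g0 = g1 = 0.0
    for x, y in batch:
        residual = y - (beta[0] + beta[1] * x)
        g0 += -2.0 * residual
        g1 += -2.0 * residual * x
    return [g0 / len(batch), g1 / len(batch)]


def adam(data, start, r, n_steps, rng, alpha=0.05, beta_1=0.9, beta_2=0.999, eps=1e-8):
    """Run n_steps Adam updates, taking r observations per gradient estimate."""
    beta = list(start)
    m = [0.0, 0.0]  # first-moment (mean) estimate of the gradient
    v = [0.0, 0.0]  # second-moment (uncentered variance) estimate
    for t in range(1, n_steps + 1):
        grad = minibatch_gradient(beta, rng.sample(data, r))
        for i in range(2):
            m[i] = beta_1 * m[i] + (1 - beta_1) * grad[i]
            v[i] = beta_2 * v[i] + (1 - beta_2) * grad[i] ** 2
            m_hat = m[i] / (1 - beta_1**t)  # bias correction
            v_hat = v[i] / (1 - beta_2**t)
            beta[i] -= alpha * m_hat / (math.sqrt(v_hat) + eps)
    return tuple(beta)


rng = random.Random(2)
# Hypothetical training set drawn from Y(x) = 1 + x + eps.
data = []
for _ in range(500):
    x = rng.uniform(0.0, 1.0)
    data.append((x, 1.0 + x + rng.gauss(0.0, 0.1**0.5)))

start = (0.0, 0.0)
result = adam(data, start, r=100, n_steps=20, rng=rng)
print(f"MSE at start: {mse(start, data):.4f}, MSE after Adam: {mse(result, data):.4f}")
```

Here 20 steps with `r = 100` use the same 2000 observations as the budget in this notebook, so the result is roughly comparable to the Random Search sketch's effort.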

In [None]:
# CODE CELL [1]

# Instantiate the problem and the Random Search solver, with specifications.
my_problem = ERMExampleProblem(
    fixed_factors={"initial_solution": (0.0, 0.0), "budget": 2000}
)
my_rand_search_solver = RandomSearch(
    fixed_factors={"crn_across_solns": True, "sample_size": 100}
)

# Pair the problem and solver for experimentation.
myexperiment = expbase.ProblemSolver(problem=my_problem, solver=my_rand_search_solver)

Let's see how Random Search does on this problem.

In [None]:
# CODE CELL [2]

# Run 10 macroreplications of Random Search on the ERM-Example Problem.
myexperiment.run(n_macroreps=10)

# Post-process the results.
myexperiment.post_replicate(n_postreps=200)
expbase.post_normalize(experiments=[myexperiment], n_postreps_init_opt=200)

# [Results are saved in a file called experiments/<DATE-TIME>/outputs/RNDSRCH_on_ERM-EXAMPLE-1.pickle.]
# [The file is not human-readable, so we'll skip looking at it.]

# Plot the (unnormalized) progress curves from the 10 macroreplications.
expbase.plot_progress_curves(
    experiments=[myexperiment], plot_type=PlotType.ALL, normalize=False
)
# Plot the (unnormalized) mean progress curve with bootstrapped CIs.
expbase.plot_progress_curves(
    experiments=[myexperiment], plot_type=PlotType.MEAN, normalize=False
)
# [The plots should be displayed in the output produced below.]

#### Your turn.

### Exercise \#1

In CODE CELL [1], play around with the arguments used to initialize `my_problem` and `my_rand_search_solver`.

Vary factors of the ERM-Example problem:
- Change the initial solution.
- Change the budget, i.e., the maximum number of replications.

Vary factors of the Random Search solver:
- Change whether it uses CRN across solutions.
- Change the number of replications it takes at each solution.

Rerun CODE CELLS [1] and [2]. *What do you observe?*

#### Now let's work with the source code.

### Exercise \#2

1. Open the file `simopt/models/ermexample.py` in the VS Code editor.
2. Let's change how random search randomly samples solutions in $\mathbb{R}^2$. For starters, uncomment Line 193

    `beta = tuple([rand_sol_rng.uniform(-2, 2) for _ in range(self.dim)])`

    and comment out Lines 194-200
    
    `beta = tuple(rand_sol_rng.mvnormalvariate(mean_vec=[1.0] * self.dim, cov=np.eye(self.dim), factorized=False))`

3. Restart the kernel using the Restart Button at the top of this notebook. This will ensure the new version of the source code is being imported.
4. Run COMBO CODE CELL [0 + 1 + 2] below (this effectively reruns CODE CELLS [0], [1], and [2]). *How have the plots changed?*

**Extra for Experts:** Come up with your own sampling distribution. Documentation on the types of distributions available can be found [here](https://mrg32k3a.readthedocs.io/en/latest/mrg32k3a.html).
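As one possible starting point, the sketch below draws each coefficient from a triangular distribution on $[-2, 2]$ with its mode at 1. It uses Python's built-in `random.Random` as a stand-in for `rand_sol_rng`, on the assumption that the SimOpt generator offers the same sampling methods; check the linked documentation before relying on any particular method. The function name `sample_beta` is hypothetical.

```python
import random


def sample_beta(rng, dim=2):
    """Draw each coefficient from a triangular distribution on [-2, 2] with mode 1."""
    # random.Random.triangular takes (low, high, mode).
    return tuple(rng.triangular(-2.0, 2.0, 1.0) for _ in range(dim))


rng = random.Random(0)  # stand-in for rand_sol_rng
print(sample_beta(rng))
```

Compared with the uniform sampler, this keeps the same bounded search region but concentrates candidates near (1, 1).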

In [None]:
# COMBO CODE CELL [0 + 1 + 2]
import os

os.chdir("../")
import sys

sys.path.append("venv\\lib\\site-packages")
sys.path.append("..")
import simopt.experiment_base as expbase
from simopt.experiment_base import PlotType
from simopt.models.ermexample import ERMExampleProblem
from simopt.solvers.adam import ADAM
from simopt.solvers.randomsearch import RandomSearch

my_problem = ERMExampleProblem(
    fixed_factors={"initial_solution": (0.0, 0.0), "budget": 2000}
)
my_rand_search_solver = RandomSearch(
    fixed_factors={"crn_across_solns": True, "sample_size": 100}
)
myexperiment = expbase.ProblemSolver(problem=my_problem, solver=my_rand_search_solver)

myexperiment.run(n_macroreps=10)
myexperiment.post_replicate(n_postreps=200)
expbase.post_normalize(experiments=[myexperiment], n_postreps_init_opt=200)
myexperiment.log_experiment_results()
expbase.plot_progress_curves(
    experiments=[myexperiment], plot_type=PlotType.ALL, normalize=False
)
expbase.plot_progress_curves(
    experiments=[myexperiment], plot_type=PlotType.MEAN, normalize=False
)

#### Now let's bring the ADAM solver into the mix.

In [None]:
# CODE CELL [3]

my_adam_solver = ADAM(fixed_factors={"crn_across_solns": True, "r": 100})
# Create a grouping of ERM-Example-RandomSearch and ERM-Example-ADAM pairs.
mygroupexperiment = expbase.ProblemsSolvers(
    problems=[my_problem], solvers=[my_rand_search_solver, my_adam_solver]
)

# Run 10 macroreplications of each pair and post-process.
mygroupexperiment.run(n_macroreps=10)
mygroupexperiment.post_replicate(n_postreps=200)
mygroupexperiment.post_normalize(n_postreps_init_opt=200)

# Record a summary of the results in a human-readable way.
mygroupexperiment.log_group_experiment_results()
# [Go check out the file called
# experiments/<DATE-TIME>/logs/group_RNDSRCH_ADAM_on_ERM-EXAMPLE-1_group_experiment_results.txt]

# Plot the mean progress curve for each solver from the 10 macroreplications.
expbase.plot_progress_curves(
    experiments=[
        mygroupexperiment.experiments[0][0],
        mygroupexperiment.experiments[1][0],
    ],
    plot_type=PlotType.MEAN,
    normalize=False,
)
# [The plot should be displayed in the output produced below.]