# Setup

In [None]:
from google.colab import drive
drive.mount('/content/gdrive/')


In [None]:
%cd "/content/gdrive/My Drive/PerfPred/Experiment 1"


In [None]:
import sys
sys.path.append('/content/gdrive/My Drive/PerfPred/Experiment 1/src')


In [4]:
from trial import *
from expr.func import *
from expr.train_size import *
from expr.domain_div import *
from expr.language import *

# Documentation (for Reference)

## Trial Functions

### Single Variable

#### Linear

$$f(X) = \alpha X + C$$

```
def linear(c, x):
  """
  c: Array with dim 2, corresponding to alpha, C.
  x: Array with dim (n, 1).
  y: Array with dim n.
  """
```

#### Log

$$f(X) = C\log(\alpha X) + \beta$$
where $\alpha > 0$. \\
Idea from [Srinivasan's paper, p.4](https://arxiv.org/pdf/2110.08875.pdf).

```
def log_single(c, x):
  """
  c: Array with dim 3, corresponding to C, alpha, beta.
  x: Array with dim (n, 1).
  y: Array with dim n.
  """
```

#### Google Paper Law

$$f(X) = \alpha\left(\frac1X + C\right)^p$$
where $C > 0$. \\
Idea from [Bansal's paper](https://arxiv.org/pdf/2202.01994.pdf), p.3.

```
def recip_single(c, x):
  """
  c: Array with dim 3, corresponding to alpha, C, p.
  x: Array with dim (n, 1).
  y: Array with dim n.
  """
```
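The actual implementations live in `expr/func.py`, which is not shown in this notebook. As a minimal sketch, assuming plain NumPy and the coefficient ordering given in the docstrings above, the three single-variable functions could look roughly like this. The `_sketch` names are hypothetical and only serve to avoid shadowing the real functions.

```python
import numpy as np

def linear_sketch(c, x):
    """f(X) = alpha * X + C, with c = (alpha, C) and x of shape (n, 1)."""
    return c[0] * x[:, 0] + c[1]

def log_single_sketch(c, x):
    """f(X) = C * log(alpha * X) + beta, with c = (C, alpha, beta)."""
    return c[0] * np.log(c[1] * x[:, 0]) + c[2]

def recip_single_sketch(c, x):
    """f(X) = alpha * (1 / X + C) ** p, with c = (alpha, C, p)."""
    return c[0] * (1.0 / x[:, 0] + c[1]) ** c[2]

# Each function maps an (n, 1) input to a length-n prediction.
x = np.array([[1.0], [2.0], [4.0]])
y_lin = linear_sketch(np.array([2.0, 1.0]), x)
```

Each function takes `x` with shape `(n, 1)` and returns an array of length `n`, matching the `x`/`y` shapes in the docstrings.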

### Double Variable

#### Linear

$$f(X) = \beta_1 X_1 + \beta_2 X_2 + C$$
Idea: Simple linear.

```
def linear(c, x):
  """
  c: Array with dim 3, corresponding to beta1, beta2, C.
  x: Array with dim (n, 2).
  y: Array with dim n.
  """
```

#### Product

$$f(X) = \alpha (X_1)^{-p_1} \cdot (X_2)^{-p_2} + C$$
where $\alpha < 0, p_1, p_2, C > 0$. \\
See curve-fitting > equation in [Anthony's work](https://colab.research.google.com/drive/1Rx6sExWQ9RsNQeoHwBSmzIP2D-XvtMRy#scrollTo=aC47KqM31nLO).

```
def product_double(c, x):
  """
  c: Array with dim 4, corresponding to alpha, p1, p2, C.
  x: Array with dim (n, 2).
  y: Array with dim n.
  """
```

#### Anthony's Paper Law

$$f(X) = \alpha_1 (X_1X_2)^{-p_1} + \alpha_2 X_2 ^ {-p_2} + C$$
where $\alpha_1, \alpha_2 < 0, p_1, p_2, C > 0$. \\
See curve-fitting -> equation in [Anthony's work](https://colab.research.google.com/drive/1Rx6sExWQ9RsNQeoHwBSmzIP2D-XvtMRy#scrollTo=aC47KqM31nLO).

```
def depend_double(c, x):
  """
  c: Array with dim 5, corresponding to alpha1, alpha2, p1, p2, C.
  x: Array with dim (n, 2).
  y: Array with dim n.
  """
```
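As with the single-variable functions, the real two-variable implementations are not shown here. The following is a sketch only, assuming NumPy and the coefficient ordering from the docstrings above; the `_sketch` names are hypothetical.

```python
import numpy as np

def product_double_sketch(c, x):
    """f(X) = alpha * X1**(-p1) * X2**(-p2) + C, with c = (alpha, p1, p2, C)."""
    alpha, p1, p2, C = c
    return alpha * x[:, 0] ** (-p1) * x[:, 1] ** (-p2) + C

def depend_double_sketch(c, x):
    """f(X) = alpha1 * (X1 * X2)**(-p1) + alpha2 * X2**(-p2) + C,
    with c = (alpha1, alpha2, p1, p2, C)."""
    alpha1, alpha2, p1, p2, C = c
    return alpha1 * (x[:, 0] * x[:, 1]) ** (-p1) + alpha2 * x[:, 1] ** (-p2) + C

# x has shape (n, 2): column 0 is X1, column 1 is X2.
x = np.array([[4.0, 2.0], [4.0, 1.0]])
y_prod = product_double_sketch(np.array([-1.0, 0.5, 1.0, 10.0]), x)
y_dep = depend_double_sketch(np.array([-1.0, -2.0, 0.5, 1.0, 10.0]), x)
```

With negative `alpha` values and positive exponents, both functions increase toward `C` as the inputs grow, which is the saturating shape the sign constraints are meant to enforce.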

## Trial Classes

```SingleSizeTrial```: Train set 1 size (D1) *or* train set 2 size (D2). \\
```DoubleSizeTrial```: Both train set sizes (D1+D2). \\
```SingleDomainTrial```: Train set 1 jsd (j1) *or* train set 2 jsd (j2). \\
```DoubleDomainTrial```: Both train set jsds (j1+j2). \\
```SingleLanguageTrial```: Uses one of the 6 l2v distances. \\
```DoubleLanguageTrial```: Uses two of the 6 l2v distances. \\

### Usage

1. Create an instance of the class. \\
Ex: ```expr = SingleSizeTrial(1, Model(linear, np.zeros(2), pars=["alpha", "C"]), trial="trial1")```
2. Get fits & costs, in one of two ways: \\
  2a) Run the fitting function: ```fits, costs = expr.fit_all()``` \\
  2b) *(Needs to be debugged)* Read fits & costs from the saved file: ```fits, costs = expr.read_all_fits()``` \\
  *Note:* You can only use ```read_all_fits``` if you have run ```fit_all``` at least once. \\
3. *(Optional)* Plot the fitted function for each slice: ```expr.plot_all()```
4. *(Optional, Preliminary)* Analyze rmse costs: ```expr.analyze_all()```

### Arguments for ```Trial.__init__```






All trial subclasses:
```
model: Model for the trial. (More info below.)
trial: Name of trial. Used as subdirectory name.
```
SingleSizeTrial and SingleDomainTrial:
```
n: Which variable to use. Must be 1 or 2.
   (If 1, uses size1/jsd1. If 2, uses size2/jsd2).
```
SingleLanguageTrial and DoubleLanguageTrial:
```
dist/dists: l2v distance(s) to use.
```

### Arguments for ```Model.__init__```

```
f: Trial func, i.e. function used for fitting.
         f takes coefficients c & data points x as input, and returns prediction.
         (More info in pre-conditions.)
init: Initial values for coefficients (parameters) of f per slice.
      (More info in pre-conditions.)
bounds: Bounds for each coefficient.
        (More info in pre-conditions.)
loss: Loss function for the regression.
      Allowed Values: 'linear', 'soft_l1', 'huber', 'cauchy', 'arctan'
      (More info: https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.least_squares.html)
pars: Names of coefficients of f.

== Pre-Conditions ==
    * Let N be the number of slices in the slice group.
    * Let K be the number of xvars.
    * Let C be the number of coefficients of the model specified by f. (=par_num)
    - f must have two inputs c, x and an output y (with any name).
      - f must take input array x with shape (n, K) for any n.
        Each row of x corresponds to a data entry.
        Each column of x corresponds to an xvar.
      - f must return array y of len n (same n as input), with entry i of y
        being the prediction for row i of x.
    - If fixed_init is True:
      - init must be an array with the same shape as c (i.e. len C).
      - For every slice, init will be used as the initial value for fitting f.
      If fixed_init is False:
      - init must be a list of N arrays, each satisfying the property above.
      - Each element (array) in init will be the initial value for its
        corresponding slice.
    - If bounds is None, every coefficient will be unbounded.
      Otherwise, bounds must be a tuple (mins, maxes), where:
      - mins and maxes are each an array of len C.
      - The model will obey mins[i] <= c[i] <= maxes[i] for each i, i.e. mins[i]
        and maxes[i] define the bounds for the i-th coefficient.
```


*Note:* Linear models can be created with the static helper function ```Model.linear(n)```, where n is the number of variables of the model.
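The `loss` and `bounds` arguments above follow `scipy.optimize.least_squares`. As a rough sketch of how a function satisfying these pre-conditions might be fitted for a single slice, assume the fit minimizes the residuals `f(c, x) - y`. The real `Model` and `Trial` code is not shown in this notebook, so `fit_slice`, `linear_f`, and the synthetic data below are hypothetical.

```python
import numpy as np
from scipy.optimize import least_squares

def linear_f(c, x):
    # Trial function: c has len C = 2, x has shape (n, K) with K = 1.
    return c[0] * x[:, 0] + c[1]

def fit_slice(f, init, x, y, bounds=None, loss="linear"):
    """Fit f to one slice and return (coefficients, rmse)."""
    if bounds is None:
        # No bounds given: every coefficient is unbounded.
        bounds = (-np.inf, np.inf)
    result = least_squares(lambda c: f(c, x) - y, init, bounds=bounds, loss=loss)
    # result.fun holds the residuals at the solution.
    rmse = np.sqrt(np.mean(result.fun ** 2))
    return result.x, rmse

# Synthetic slice: y = 3 * x + 2, with the slope constrained to be non-negative.
x = np.linspace(1.0, 10.0, 20).reshape(-1, 1)
y = 3.0 * x[:, 0] + 2.0
fit, cost = fit_slice(linear_f, np.array([1.0, 0.0]), x, y,
                      bounds=([0.0, -np.inf], [np.inf, np.inf]))
```

Here `bounds` is the `(mins, maxes)` tuple described in the pre-conditions, and `init` has the same length as `c`.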

# Expr 1A: Size

## Var = $D_1$
Keep train set 1, train set 2, language, and $D_2$ constant \\
Big idea: $\text{sp-BLEU}$ should increase with $D_1$


### Trial 1: Linear D1
$$ \text{sp-BLEU} (D_1) = \alpha D_1 + C$$
Idea: Simple linear

In [7]:
expr = SingleSizeTrial(1, Model.linear(1), trial="trial1")
fits, costs = expr.fit_all() # Fit.
# fits, costs = expr.read_all_fits()
expr.plot_all() # Plot "slice plots".
expr.analyze_all() # Plot "analysis plots".

Done plotting.
Done analyzing.


<Figure size 640x480 with 0 Axes>

### Trial 2: Log D1
$$ \text{sp-BLEU} (D_1) = C \log (\alpha D_1) + \beta$$
where $\alpha > 0$. \\
Idea from [Srinivasan's paper, p.4](https://arxiv.org/pdf/2110.08875.pdf).


In [8]:
expr = SingleSizeTrial(1, Model(func.log_single, np.array([0.1, 0.1, 0.1]),
                           bounds=([-np.inf, 0, -np.inf], [np.inf, np.inf, np.inf]),
                           pars=["C", "alpha", "beta"]), trial="trial2")
fits, costs = expr.fit_all()
# fits, costs = expr.read_all_fits()
expr.plot_all()
expr.analyze_all()

Done plotting.
Done analyzing.


<Figure size 640x480 with 0 Axes>

### Trial 3: Google paper law D1
$$\text{sp-BLEU} (D_1) = \alpha \left(\frac{1}{D_1} + C \right) ^{p}$$
where $C > 0$. \\
Idea from [Bansal's paper](https://arxiv.org/pdf/2202.01994.pdf), p.3.

In [9]:
expr = SingleSizeTrial(1, Model(func.recip_single, np.array([0, 0, -1]),
                           bounds=([-np.inf, 0, -np.inf], [np.inf, np.inf, np.inf]),
                           pars=["alpha", "C", "p"]), trial="trial3")
fits, costs = expr.fit_all()
# fits, costs = expr.read_all_fits()
expr.plot_all()
expr.analyze_all()

Done plotting.
Done analyzing.


<Figure size 640x480 with 0 Axes>

## Var = $D_2$
Keep train set 1, train set 2, language, and $D_1$ constant \\
Big idea: $\text{sp-BLEU}$ should increase with $D_2$

### Trial 1: Linear D2
$$ \text{sp-BLEU} (D_2) = \alpha D_2 + C$$
Idea: Simple linear

In [10]:
expr = SingleSizeTrial(2, Model.linear(1), trial="trial1")
fits, costs = expr.fit_all()
# fits, costs = expr.read_all_fits()
expr.plot_all()
expr.analyze_all()

Done plotting.
Done analyzing.


<Figure size 640x480 with 0 Axes>

### Trial 2: Log D2
$$ \text{sp-BLEU} (D_2) = C \log (\alpha D_2) + \beta$$
where $\alpha > 0$. \\
Idea from [Srinivasan's paper, p.4](https://arxiv.org/pdf/2110.08875.pdf).


In [11]:
expr = SingleSizeTrial(2, Model(func.log_single, np.array([0.1, 0.1, 0.1]),
                           bounds=([-np.inf, 0, -np.inf], [np.inf, np.inf, np.inf]),
                           pars=["C", "alpha", "beta"]), trial="trial2")
fits, costs = expr.fit_all()
# fits, costs = expr.read_all_fits()
expr.plot_all()
expr.analyze_all()

Done plotting.
Done analyzing.


<Figure size 640x480 with 0 Axes>

### Trial 3: Google paper law D2
$$\text{sp-BLEU} (D_2) = \alpha \left(\frac{1}{D_2} + C \right) ^{p}$$
where $C > 0$. \\
Idea from [Bansal's paper](https://arxiv.org/pdf/2202.01994.pdf), p.3.

In [12]:
expr = SingleSizeTrial(2, Model(func.recip_single, np.array([0, 0, -1]),
                           bounds=([-np.inf, 0, -np.inf], [np.inf, np.inf, np.inf]),
                           pars=["alpha", "C", "p"]), trial="trial3")
fits, costs = expr.fit_all()
# fits, costs = expr.read_all_fits()
expr.plot_all()
expr.analyze_all()

Done plotting.
Done analyzing.


<Figure size 640x480 with 0 Axes>

## Var = $D_1, D_2$

### Trial 1: Simple Linear Regression
$$\text{sp-BLEU} (D_1, D_2) = \beta_1 D_1 + \beta_2 D_2 + C $$


In [13]:
expr = DoubleSizeTrial(Model.linear(2), trial="trial1")
fits, costs = expr.fit_all()
# fits, costs = expr.read_all_fits()
expr.plot_all()
expr.analyze_all()

Done plotting.
Done analyzing.


<Figure size 640x480 with 0 Axes>

### Trial 2: Product
$$\text{sp-BLEU} (D_1, D_2) = \alpha (D_1)^{-p_1} \cdot (D_2)^{-p_2} + C $$
where $\alpha < 0, p_1, p_2, C > 0$  \\
See curve-fitting > equation in [Anthony's work](https://colab.research.google.com/drive/1Rx6sExWQ9RsNQeoHwBSmzIP2D-XvtMRy#scrollTo=aC47KqM31nLO).

In [14]:
expr = DoubleSizeTrial(Model(func.product_double, np.zeros(4),
                        bounds=([-np.inf, 0, 0, 0], [0, np.inf, np.inf, np.inf]),
                        pars=["alpha", "p1", "p2", "C"]), trial="trial2")
fits, costs = expr.fit_all()
# fits, costs = expr.read_all_fits()
expr.plot_all()
expr.analyze_all()

Done plotting.
Done analyzing.


<Figure size 640x480 with 0 Axes>

### Trial 3: Anthony's Paper Law D1 D2
$$\text{sp-BLEU}(D_1,D_2) = \alpha_1 (D_1D_2)^{-p_1} + \alpha_2 D_2 ^ {-p_2} + C$$
where $\alpha_1, \alpha_2 < 0, p_1, p_2, C > 0$. \\
See curve-fitting -> equation in [Anthony's work](https://colab.research.google.com/drive/1Rx6sExWQ9RsNQeoHwBSmzIP2D-XvtMRy#scrollTo=aC47KqM31nLO).


In [15]:
expr = DoubleSizeTrial(Model(func.depend_double, np.zeros(5),
                        bounds=([-np.inf, -np.inf, 0, 0, 0], [0, 0, np.inf, np.inf, np.inf]),
                        pars=["alpha1", "alpha2", "p1", "p2", "C"]), trial="trial3")
fits, costs = expr.fit_all()
# fits, costs = expr.read_all_fits()
expr.plot_all()
expr.analyze_all()

Done plotting.
Done analyzing.


<Figure size 640x480 with 0 Axes>

### Trial 4: Simple Decision D1D2
$$
\text{sp-BLEU}(D_1, D_2) = \begin{cases}
  c_1 D_1 + c_2 D_2 + C &, D_1 > 10k \\
  c_2 D_2 + C &, \text{otherwise}
\end{cases}
$$


In [16]:
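def simple_decision_size(c, x):
  """ See above.
  c: Array with dim 3, corresponding to c1, c2, and C.
  x: Array with dim (n, 2).
  y: Array with dim n.
  """
  # Decide row by row: only rows with D1 above the threshold get the c1 * D1 term.
  # The threshold 10 is kept from the original code; the formula above states 10k,
  # so this presumably assumes sizes are given in thousands.
  use_d1 = x[:, 0] > 10
  return np.where(use_d1, c[0] * x[:, 0], 0) + c[1] * x[:, 1] + c[2]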

In [17]:
expr = DoubleSizeTrial(Model(simple_decision_size, np.zeros(3),
                             pars=["c1", "c2", "C"]), trial = "trial4")
fits, costs = expr.fit_all()
# fits, costs = expr.read_all_fits()
expr.plot_all()
expr.analyze_all()

Done plotting.
Done analyzing.


<Figure size 640x480 with 0 Axes>

# Expr 1B: Domain Relatedness

## Var = $j_1$

### Trial 1: Linear j1
$$\text{sp-BLEU}(j_1) = \alpha j_1 + C$$



In [18]:
expr = SingleDomainTrial(1, Model.linear(1), trial="trial1")
expr.fit_all()
# expr.read_all_fits()
expr.plot_all()
expr.analyze_all()

Done plotting.
Done analyzing.


<Figure size 640x480 with 0 Axes>

## Var = $j_2$

### Trial 1: Linear j2
$$\text{sp-BLEU}( j_2) = \alpha j_2 + C$$

In [19]:
expr = SingleDomainTrial(2, Model.linear(1), trial="trial1")
expr.fit_all()
# expr.read_all_fits()
expr.plot_all()
expr.analyze_all()

Done plotting.
Done analyzing.


<Figure size 640x480 with 0 Axes>

## Var = $j_1, j_2$

### Trial 1: Simple Linear Regression
$$\text{sp-BLEU}(j_1, j_2) = \beta_1 j_1 + \beta_2 j_2 + C$$

In [20]:
expr = DoubleDomainTrial(Model.linear(2), trial="trial1")
expr.fit_all()
# expr.read_all_fits()
expr.plot_all()
expr.analyze_all()

Done plotting.
Done analyzing.


<Figure size 640x480 with 0 Axes>

# Expr 1C: Dataset Independent Language Relatedness

## Var = $d_{fea}$

### Trial 1: Linear dfea
$$\text{sp-BLEU}(d_{fea}) = \alpha d_{fea} + C$$

In [7]:
expr = LanguageTrial([Var.FEA_DIST], Model.linear(1), trial="trial1")
expr.fit_all()
# expr.read_all_fits()
expr.plot_all()
expr.analyze_all()

Done plotting.
Done analyzing.


<Figure size 640x480 with 0 Axes>

## Var = $d_{inv}$


### Trial 1: Linear dinv
$$\text{sp-BLEU}(d_{inv}) = \alpha d_{inv} + C$$

In [8]:
expr = LanguageTrial([Var.INV_DIST], Model.linear(1), trial="trial1")
expr.fit_all()
# expr.read_all_fits()
expr.plot_all()
expr.analyze_all()

Done plotting.
Done analyzing.


<Figure size 640x480 with 0 Axes>

## Var = $d_{pho}$


### Trial 1: Linear dpho
$$\text{sp-BLEU}(d_{pho}) = \alpha d_{pho} + C$$

In [9]:
expr = LanguageTrial([Var.PHO_DIST], Model.linear(1), trial="trial1")
expr.fit_all()
# expr.read_all_fits()
expr.plot_all()
expr.analyze_all()

Done plotting.
Done analyzing.


<Figure size 640x480 with 0 Axes>

## Var = $d_{syn}$

### Trial 1: Linear dsyn
$$\text{sp-BLEU}(d_{syn}) = \alpha d_{syn} + C$$

In [10]:
expr = LanguageTrial([Var.SYN_DIST], Model.linear(1), trial="trial1")
expr.fit_all()
# expr.read_all_fits()
expr.plot_all()
expr.analyze_all()

Done plotting.
Done analyzing.


<Figure size 640x480 with 0 Axes>

## Var = $d_{gen}$

### Trial 1: Linear dgen
$$\text{sp-BLEU}(d_{gen}) = \alpha d_{gen} + C$$

In [11]:
expr = LanguageTrial([Var.GEN_DIST], Model.linear(1), trial="trial1")
expr.fit_all()
# expr.read_all_fits()
expr.plot_all()
expr.analyze_all()

Done plotting.
Done analyzing.


<Figure size 640x480 with 0 Axes>

## Var = $d_{geo}$

### Trial 1: Linear dgeo
$$\text{sp-BLEU}(d_{geo}) = \alpha d_{geo} + C$$

In [12]:
expr = LanguageTrial([Var.GEO_DIST], Model.linear(1), trial="trial1")
expr.fit_all()
# expr.read_all_fits()
expr.plot_all()
expr.analyze_all()

Done plotting.
Done analyzing.


<Figure size 640x480 with 0 Axes>

## Var = $d_{inv}, d_{pho}$

In [13]:
expr = LanguageTrial([Var.INV_DIST, Var.PHO_DIST], Model.linear(2), trial="trial1")
expr.fit_all()
# expr.read_all_fits()
expr.plot_all()
expr.analyze_all()

Done plotting.
Done analyzing.


<Figure size 640x480 with 0 Axes>

## Var = $d_{inv}, d_{syn}$

In [14]:
expr = LanguageTrial([Var.INV_DIST, Var.SYN_DIST], Model.linear(2), trial="trial1")
expr.fit_all()
# expr.read_all_fits()
expr.plot_all()
expr.analyze_all()

Done plotting.
Done analyzing.


<Figure size 640x480 with 0 Axes>

## Var = $d_{pho}, d_{syn}$

In [15]:
expr = LanguageTrial([Var.PHO_DIST, Var.SYN_DIST], Model.linear(2), trial="trial1")
expr.fit_all()
# expr.read_all_fits()
expr.plot_all()
expr.analyze_all()

Done plotting.
Done analyzing.


<Figure size 640x480 with 0 Axes>

## Var = $d_{gen}, d_{geo}$

In [19]:
expr = LanguageTrial([Var.GEN_DIST, Var.GEO_DIST], Model.linear(2), trial="trial1")
expr.fit_all()
# expr.read_all_fits()
expr.plot_all()
expr.analyze_all()

Done plotting.
Done analyzing.


<Figure size 640x480 with 0 Axes>

## Var = $d_{fea}, d_{inv}$

In [20]:
expr = LanguageTrial([Var.FEA_DIST, Var.INV_DIST], Model.linear(2), trial="trial1")
expr.fit_all()
# expr.read_all_fits()
expr.plot_all()
expr.analyze_all()

Done plotting.
Done analyzing.


<Figure size 640x480 with 0 Axes>

## Var = $d_{fea}, d_{pho}$

In [21]:
expr = LanguageTrial([Var.FEA_DIST, Var.PHO_DIST], Model.linear(2), trial="trial1")
expr.fit_all()
# expr.read_all_fits()
expr.plot_all()
expr.analyze_all()

Done plotting.
Done analyzing.


<Figure size 640x480 with 0 Axes>

## Var = $d_{fea}, d_{syn}$

In [23]:
expr = LanguageTrial([Var.FEA_DIST, Var.SYN_DIST], Model.linear(2), trial="trial1")
expr.fit_all()
# expr.read_all_fits()
expr.plot_all()
expr.analyze_all()

Done plotting.
Done analyzing.


<Figure size 640x480 with 0 Axes>

## Var = $d_{fea}, d_{gen}$

In [26]:
expr = LanguageTrial([Var.FEA_DIST, Var.GEN_DIST], Model.linear(2), trial="trial1")
expr.fit_all()
# expr.read_all_fits()
expr.plot_all()
expr.analyze_all()

Done plotting.
Done analyzing.


<Figure size 640x480 with 0 Axes>

## Var = $d_{fea}, d_{geo}$

In [27]:
expr = LanguageTrial([Var.FEA_DIST, Var.GEO_DIST], Model.linear(2), trial="trial1")
expr.fit_all()
# expr.read_all_fits()
expr.plot_all()
expr.analyze_all()

Done plotting.
Done analyzing.


<Figure size 640x480 with 0 Axes>

## Var = $d_{inv}, d_{pho}, d_{syn}$

In [None]:
expr = LanguageTrial([Var.INV_DIST, Var.PHO_DIST, Var.SYN_DIST], Model.linear(3), trial="trial1")
expr.fit_all()
# expr.read_all_fits()
expr.plot_all()
expr.analyze_all()

## Var = $d_{fea}, d_{inv}, d_{pho}$

In [None]:
expr = LanguageTrial([Var.FEA_DIST, Var.INV_DIST, Var.PHO_DIST], Model.linear(3), trial="trial1")
expr.fit_all()
# expr.read_all_fits()
expr.plot_all()
expr.analyze_all()

## Var = $d_{fea}, d_{inv}, d_{syn}$

In [None]:
expr = LanguageTrial([Var.FEA_DIST, Var.INV_DIST, Var.SYN_DIST], Model.linear(3), trial="trial1")
expr.fit_all()
# expr.read_all_fits()
expr.plot_all()
expr.analyze_all()

## Var = $d_{fea}, d_{pho}, d_{syn}$

In [None]:
expr = LanguageTrial([Var.FEA_DIST, Var.PHO_DIST, Var.SYN_DIST], Model.linear(3), trial="trial1")
expr.fit_all()
# expr.read_all_fits()
expr.plot_all()
expr.analyze_all()

## Var = $d_{fea}, d_{gen}, d_{geo}$

In [None]:
expr = LanguageTrial([Var.FEA_DIST, Var.GEN_DIST, Var.GEO_DIST], Model.linear(3), trial="trial1")
expr.fit_all()
# expr.read_all_fits()
expr.plot_all()
expr.analyze_all()

## Var = $d_{fea}, d_{inv}, d_{pho}, d_{syn}$

In [None]:
expr = LanguageTrial([Var.FEA_DIST, Var.INV_DIST, Var.PHO_DIST, Var.SYN_DIST], Model.linear(4), trial="trial1")
expr.fit_all()
# expr.read_all_fits()
expr.plot_all()
expr.analyze_all()

## Var = $d_{inv}, d_{pho}, d_{syn}, d_{gen}, d_{geo}$

In [None]:
expr = LanguageTrial([Var.INV_DIST, Var.PHO_DIST, Var.SYN_DIST, Var.GEN_DIST, Var.GEO_DIST], Model.linear(5), trial="trial1")
expr.fit_all()
# expr.read_all_fits()
expr.plot_all()
expr.analyze_all()

## Var = $d_{fea}, d_{inv}, d_{pho}, d_{syn}, d_{gen}, d_{geo}$

In [None]:
expr = LanguageTrial([Var.FEA_DIST, Var.INV_DIST, Var.PHO_DIST, Var.SYN_DIST, Var.GEN_DIST, Var.GEO_DIST], Model.linear(6), trial="trial1")
expr.fit_all()
# expr.read_all_fits()
expr.plot_all()
expr.analyze_all()

## Var = All Dataset Independent Language Features

### Trial 1: Simple Linear Regression
$$\text{sp-BLEU}(d_{geo}, d_{gen}, d_{inv}, d_{syn}, d_{pho}) = \beta_1 d_{geo} + \beta_2 d_{gen} + \beta_3 d_{inv} + \beta_4 d_{syn} + \beta_5 d_{pho} + C$$

### Trial 2: Stepwise Regression from Linear Single


```
candidate_factors = [geo, gen, inv, syn, pho]
# Sort ascending by the average RMSE of each single-variable linear fit.
candidate_factors.sort(key=lambda factor: rmse(linear_reg([factor])))

MAX_FACTORS = 5

# Start with the single-variable linear model with the lowest RMSE.
selected_factors = [candidate_factors.pop(0)]
current_model = linear_reg(selected_factors)
best_rmse = rmse(current_model)

# Perform stepwise regression.
while len(selected_factors) < MAX_FACTORS and candidate_factors:

  best_factor = None

  # Iterate over all remaining factors.
  for factor in candidate_factors:

    # Add the candidate factor to a copy of the current selection.
    subset_factors = selected_factors + [factor]
    updated_model = linear_reg(subset_factors)
    updated_rmse = rmse(updated_model)

    if updated_rmse < best_rmse:
      best_factor = factor
      best_rmse = updated_rmse

  # Stop when no remaining factor improves the RMSE.
  if best_factor is None:
    break

  selected_factors.append(best_factor)
  candidate_factors.remove(best_factor)

  current_model = linear_reg(selected_factors)

final_model = current_model # Do whatever analysis with this
final_rmse = best_rmse

```
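To make the pseudocode above concrete, here is a self-contained sketch of forward stepwise selection on synthetic data. It assumes ordinary least squares via `np.linalg.lstsq` as the linear fit and in-sample RMSE as the score. `fit_rmse`, `forward_stepwise`, `min_gain`, and the synthetic features are hypothetical stand-ins for the notebook's models and l2v distances. One deviation from the pseudocode: in-sample RMSE of a least-squares fit never increases when a factor is added, so the sketch requires a minimum improvement (`min_gain`) instead of a strict `<` comparison.

```python
import numpy as np

def fit_rmse(X, y, factors):
    """RMSE of a least-squares fit of y on the given columns plus an intercept."""
    A = np.column_stack([X[:, factors], np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return np.sqrt(np.mean((A @ coef - y) ** 2))

def forward_stepwise(X, y, max_factors, min_gain=1e-6):
    """Greedily add the column that lowers RMSE the most until nothing helps."""
    remaining = list(range(X.shape[1]))
    selected, best_rmse = [], np.inf
    while remaining and len(selected) < max_factors:
        # Score every remaining column when added to the current selection.
        scores = {f: fit_rmse(X, y, selected + [f]) for f in remaining}
        best_factor = min(scores, key=scores.get)
        if best_rmse - scores[best_factor] <= min_gain:
            break  # No candidate improves the fit enough.
        selected.append(best_factor)
        remaining.remove(best_factor)
        best_rmse = scores[best_factor]
    return selected, best_rmse

# Synthetic data: only columns 1 and 3 influence y.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 2.0 * X[:, 1] - 3.0 * X[:, 3] + 0.5
selected, final_rmse = forward_stepwise(X, y, max_factors=5)
```

On this noiseless example the procedure should stop after picking the two informative columns, since adding any further column no longer lowers the RMSE by more than `min_gain`.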




### Trial 3: Reverse Stepwise Regression from Linear Single


```
candidate_factors = [geo, gen, inv, syn, pho]
selected_factors = candidate_factors.copy()
current_model = linear_reg(selected_factors)
best_rmse = rmse(current_model)

for factor in candidate_factors:

  # Always keep at least one factor in the model.
  if len(selected_factors) == 1:
    break

  # Temporarily remove a factor
  subset_factors = selected_factors.copy()
  subset_factors.remove(factor)
  updated_model = linear_reg(subset_factors)
  current_rmse = rmse(updated_model)

  # Keep the removal only if it lowers the RMSE.
  if current_rmse < best_rmse:
    best_rmse = current_rmse
    selected_factors = subset_factors
    current_model = updated_model


final_model = current_model # Do whatever analysis with this
final_rmse = best_rmse


```
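A runnable counterpart to the reverse procedure above, again as a sketch on synthetic data with least squares via `np.linalg.lstsq`. `fit_rmse`, `backward_elimination`, and `tol` are hypothetical names. Because in-sample RMSE of a least-squares fit can never decrease when a factor is removed, the sketch drops a factor when the RMSE rises by at most `tol`, rather than requiring a strict decrease; a held-out RMSE would be another option.

```python
import numpy as np

def fit_rmse(X, y, factors):
    """RMSE of a least-squares fit of y on the given columns plus an intercept."""
    A = np.column_stack([X[:, factors], np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return np.sqrt(np.mean((A @ coef - y) ** 2))

def backward_elimination(X, y, tol=1e-6):
    """Start from all columns and drop each one whose removal barely hurts."""
    selected = list(range(X.shape[1]))
    best_rmse = fit_rmse(X, y, selected)
    for factor in list(selected):
        if len(selected) == 1:
            break  # Always keep at least one factor.
        subset = [f for f in selected if f != factor]
        subset_rmse = fit_rmse(X, y, subset)
        # Accept the removal if the fit is (almost) as good without the factor.
        if subset_rmse <= best_rmse + tol:
            selected, best_rmse = subset, subset_rmse
    return selected, best_rmse

# Synthetic data: only columns 1 and 3 influence y.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 2.0 * X[:, 1] - 3.0 * X[:, 3] + 0.5
selected, final_rmse = backward_elimination(X, y)
```

Each factor is considered for removal exactly once, in order, which mirrors the single pass over `candidate_factors` in the pseudocode.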

