# Working with Programs

In Brush, a *Program* is an executable data structure. 
You may think of it as a *model* or a *function* that maps feature inputs to data labels. 
We call them programs because that is what they are, 
and because the genetic algorithm literature uses the term to distinguish them from approaches that optimize bits or strings. 

The Brush Program class operates similarly to a [sklearn](https://scikit-learn.org) estimator: it has `fit` and `predict` methods that are called during training and inference, respectively. 


## Types of Programs 

There are four fundamental "types" of Brush programs:

- **Regressors**: map inputs to a continuous endpoint 
- **Binary Classifiers**: map inputs to a binary endpoint, as well as a continuous value in $[0, 1]$  
- **Multi-class Classifiers**: map inputs to a category
    - Under development
- **Representors**: map inputs to a lower-dimensional space. 
    - Under development

## Representation 

Internally, the programs are represented as syntax trees. 
We use the [tree.hh tree class](https://github.com/kpeeters/tree.hh), which gives trees an STL-like feel. 
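
To make the idea of a program as a syntax tree concrete, here is a minimal Python sketch. It is not Brush's implementation (which is built on tree.hh in C++). The `Node` class, the `OPS` table, and the feature names `x1` and `x2` are hypothetical and exist only for this illustration.

```python
import operator
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical operator table for this sketch; all operators are binary.
OPS = {"Add": operator.add, "Sub": operator.sub, "Mul": operator.mul}


@dataclass
class Node:
    # Operator name for internal nodes; feature name or numeric literal for leaves.
    name: str
    children: List["Node"] = field(default_factory=list)

    def to_string(self) -> str:
        """Render the tree in prefix form, e.g. Add(x1,Mul(2.0,x2))."""
        if not self.children:
            return self.name
        args = ",".join(child.to_string() for child in self.children)
        return f"{self.name}({args})"

    def evaluate(self, row: Dict[str, float]) -> float:
        """Evaluate the tree on one sample given as a feature-name -> value dict."""
        if self.children:
            left, right = (child.evaluate(row) for child in self.children)
            return OPS[self.name](left, right)
        if self.name in row:
            return row[self.name]      # leaf is a feature
        return float(self.name)        # leaf is a constant


tree = Node("Add", [Node("x1"), Node("Mul", [Node("2.0"), Node("x2")])])
print(tree.to_string())
print(tree.evaluate({"x1": 1.0, "x2": 3.0}))
```

Each internal node holds an operator and each leaf holds a feature or a constant, so printing and evaluating a program are both simple recursive walks over the tree.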



## Generation

We generate random programs using Sean Luke's PTC2 algorithm.  
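
The sketch below illustrates the core loop of PTC2 under simplifying assumptions. It is not Brush's generator: the names `FUNCTIONS`, `TERMINALS`, `ptc2`, and `tree_size` are hypothetical, the target size is passed in directly rather than sampled from a distribution, and the depth limit from the original algorithm is omitted. Trees are plain nested lists of the form `[name, child, ...]`.

```python
import random

# Hypothetical primitive sets for this sketch: function name -> arity, plus terminals.
FUNCTIONS = {"Add": 2, "Sub": 2, "Mul": 2, "Div": 2}
TERMINALS = ["x1", "x2", "x3", "1.0"]


def ptc2(target_size, rng):
    """Grow a random tree with approximately `target_size` nodes."""
    if target_size <= 1:
        return [rng.choice(TERMINALS)]

    # Start from a random function node whose argument slots are all open.
    name = rng.choice(sorted(FUNCTIONS))
    root = [name] + [None] * FUNCTIONS[name]
    open_slots = [(root, i) for i in range(1, len(root))]
    count = 1  # number of function nodes placed so far

    # Fill randomly chosen open slots with functions while there is room:
    # every open slot will eventually cost at least one node.
    while open_slots and count + len(open_slots) < target_size:
        parent, i = open_slots.pop(rng.randrange(len(open_slots)))
        name = rng.choice(sorted(FUNCTIONS))
        node = [name] + [None] * FUNCTIONS[name]
        parent[i] = node
        count += 1
        open_slots.extend((node, j) for j in range(1, len(node)))

    # Close every remaining slot with a terminal.
    for parent, i in open_slots:
        parent[i] = [rng.choice(TERMINALS)]
    return root


def tree_size(tree):
    """Count the nodes in a nested-list tree."""
    return 1 + sum(tree_size(child) for child in tree[1:])


rng = random.Random(0)
tree = ptc2(7, rng)
print(tree, tree_size(tree))
```

Because open slots are filled in random order rather than depth-first, the resulting trees vary in shape while their size stays close to the target. With only binary functions, as here, the final size exceeds the target by at most one node.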


## Evaluation

TODO




## Visualizing Programs

Programs in Brush are symbolic tree structures, and can be viewed in a few ways: 


1. As a string using `get_model()`
2. As a string-like tree using `get_model("tree")`
3. As a graph using `graphviz` and `get_model("dot")`. 

Let's look at a regression example.

In [7]:
import pandas as pd
from brush import BrushRegressor
from pmlb import fetch_data

# load data
df = pd.read_csv('../examples/datasets/d_enc.csv')
X = df.drop(columns='label')
y = df['label']
# X, y = fetch_data('1027_ESL', return_X_y = True)
# X, y = fetch_data('192_vineyard', return_X_y = True)
# X, y = fetch_data('594_fri_c2_100_5', return_X_y = True)
# X, y = fetch_data('607_fri_c4_1000_50', return_X_y = True)
# X, y = fetch_data('503_wind', return_X_y = True)



In [8]:
# make a regressor (BrushRegressor was imported above)
est = BrushRegressor(
    max_depth=3, 
    max_size=20,
    functions=['SplitBest','Add','Sub','Mul','Div']
)

# use like you would a sklearn regressor
est.fit(X,y)
y_pred = est.predict(X)



best model: Add(Sub(Sub(3.92*x4,12.66),-0.04*x2),Mul(4.11*x4,0.68*x6))


In [9]:
print('score:', est.score(X,y))

score: 0.8823622777154134


Now that we have trained a model, `est.best_estimator_` contains our symbolic model. 
We can view it as a string:

In [10]:
print(est.best_estimator_.get_model())

Add(Sub(Sub(3.92*x4,12.66),-0.04*x2),Mul(4.11*x4,0.68*x6))


We can view it as a tree:

In [11]:
print(est.best_estimator_.get_model("tree"))

Add
|-Sub
  |-Sub
    |-3.92*x4
    |-12.66
  |--0.04*x2
|-Mul
|  |-4.11*x4
|  |-0.68*x6


We can also view it as a graph in dot format. 
Let's import graphviz and make a nicer plot.

In [None]:
import graphviz

model = est.best_estimator_.get_model("dot")
print(model)
g = graphviz.Source(model)
g