![QuantConnect Logo](https://cdn.quantconnect.com/web/i/icon.png)
<hr>

### Random Forest Regression

As another installment in our mini-series of examples on moving your work from the research environment into production, this notebook shows how to implement a basic random forest regression model using sklearn's RandomForestRegressor. Briefly, a random forest is a supervised learning algorithm built from an ensemble of decision trees. Here we use it for regression in order to identify the important features of our dataset and turn their importances into weights for a tradeable portfolio.
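
As a minimal sketch of the idea, the cell below fits a RandomForestRegressor on synthetic data and reads its feature importances as weights. The data, the four hypothetical "assets", and the target are assumptions made up for illustration; only the regressor settings mirror the ones used later in this notebook.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for asset returns: 200 observations of 4 hypothetical assets
X = rng.normal(0.0, 0.01, size=(200, 4))
# Hypothetical target driven mostly by the first asset, plus a little noise
y = 5.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0.0, 0.001, size=200)

model = RandomForestRegressor(n_estimators=100, min_samples_split=5, random_state=1990)
model.fit(X, y)

# Importances are non-negative and normalized, so they can be read as long-only weights
weights = model.feature_importances_
print(weights)
print(weights.sum())
```

Because the importances are non-negative and normalized, they map naturally onto a long-only allocation, with the most informative feature receiving the largest weight.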

To start, we continue to use the US Treasuries ETF basket and request its historical data. We'll use the most recent 500 hourly bars to create our train / test data sets.
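
The reshaping step can be sketched on a small synthetic frame. This assumes the history frame is indexed by (symbol, time) with a `close` column, which is what the reshaping code in this notebook suggests; the tickers `AAA` and `BBB` and their prices are hypothetical.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a history frame indexed by (symbol, time)
times = pd.date_range("2020-01-01 10:00", periods=6, freq=pd.Timedelta(hours=1))
idx = pd.MultiIndex.from_product([["AAA", "BBB"], times], names=["symbol", "time"])
close = np.array([100, 101, 102, 103, 104, 105,
                  50, 49, 48, 47, 46, 45], dtype=float)
history = pd.DataFrame({"close": close}, index=idx)

# Pivot to one column per symbol and one row per timestamp
prices = history["close"].unstack(level="symbol")

# Bar-to-bar percentage returns; the first row is NaN and gets dropped
returns = prices.pct_change().dropna()
print(returns)
```

Each row of `returns` is one observation and each column is one feature, which is the layout the regressor expects for `X`.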

In [None]:
# QuantBook Analysis Tool
# For more information see https://www.quantconnect.com/docs/research/overview
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
qb = QuantBook()

symbols = {}
assets = ["SHY", "TLT", "SHV", "TLH", "EDV", "BIL",
          "SPTL", "TBT", "TMF", "TMV", "TBF", "VGSH", "VGIT",
          "VGLT", "SCHO", "SCHR", "SPTS", "GOVT"]

# Subscribe to each ETF and keep its Symbol object for later lookups
for ticker in assets:
    symbols[ticker] = qb.AddEquity(ticker, Resolution.Minute).Symbol

#Copy Paste Region For Backtesting.
#==========================================
# Set up the model
# Initialize an instance of the random forest regressor
regressor = RandomForestRegressor(n_estimators=100, min_samples_split=5, random_state=1990)

# Fetch history on our universe
df = qb.History(qb.Securities.Keys, 500, Resolution.Hour)

# Get train/test data: bar-to-bar returns on the close, one column per symbol
returns = df.unstack(level=1).close.transpose().pct_change().dropna()
X = returns
# Placeholder target drawn at random for research purposes only.
# Use the real portfolio value in the algo: y = [x for x in qb.portfolioValue][-X.shape[0]:]
y = np.random.normal(100000, 5, X.shape[0])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1990)

In [None]:
# Fit regressor
regressor.fit(X_train, y_train)

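# Use the feature importances as long-only portfolio weights
# (importances are non-negative, so no short positions result)
weights = regressor.feature_importances_
# Filter symbols and weights with the same mask so the pairs stay aligned
mask = weights > 0
selected = zip(returns.columns[mask], weights[mask])
for symbol, weight in selected:
    print(f'Symbol: {symbol}, Weight: {weight}')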