# Using Sklearn Evaluation Functions with Statsmodels

## Lesson objectives:
By the end of this lesson, students will be able to:

- Get predictions from a statsmodels OLS
- Use sklearn evaluation functions with statsmodels OLS
- Format f-strings for display (without changing the values)

## The OLS Summary doesn't do everything
In the prior lessons, we demonstrated the power of a statsmodels OLS model and the informative built-in summary. 

You may have noticed that we have not evaluated our model on the test data yet. While the statsmodels OLS summary is very powerful, it can only report results for the training data. 

- To get metrics for our test data, we need to use the OLS result variable (the fitted model) to get predictions for our X_test data.

In [None]:
import statsmodels.api as sm
## Fit an OLS model on the training data
model = sm.OLS(y_train, X_train_df)
result = model.fit()
## Use the fitted result (not the model) to .predict
test_preds = result.predict(X_test_df)

In [None]:
from sklearn.metrics import r2_score, mean_squared_error
# We can then use any of the regression metrics from the sklearn.metrics module!
# e.g. r2_score, mean_squared_error
test_r2 = r2_score(y_test, test_preds)
test_mse = mean_squared_error(y_test, test_preds)

# Now that we have saved our metrics, we will want to report the scores in a clean, easy-to-read print statement. We can use f-strings to do this!

print(f'The testing r-square value is {test_r2} and the testing mean squared error is {test_mse}.')


## F-strings reFresher

Any time you print a string, you have the option to use an f-string. We want to make sure everyone is comfortable with f-strings, as they are our preference for writing easy-to-read and more easily reproducible code!

You can use an f-string to print a string that includes the value of any defined variable.

- Prior to the opening quotation mark, put an "f" to indicate this is an f-string.
- Place the desired variable into curly brackets: {desired variable here}.
- Continue with the rest of your string (in this case, we show just a final period).
- Close the statement with the closing quotation mark.
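The steps above can be sketched with a single variable. The value of test_r2 below is a made-up placeholder, not a result from the lesson's model.

```python
# Hypothetical value standing in for a computed metric
test_r2 = 0.8123

# "f" before the opening quote, the variable in curly brackets, then the final period
message = f'The testing r-square value is {test_r2}.'
print(message)
```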

We can easily add multiple variables to a single statement:

In [None]:
print(f'The testing r-square value is {test_r2} and the testing mean squared error is {test_mse}.')

## Fancier F-Strings (Displaying fewer digits)

We could round our value by using something like {round(test_r2, 2)} within our f-string, but a better approach is to control how the value is displayed with a format specifier.

To control how many digits are displayed when a variable within an f-string is numeric, we can add :.xf after the variable name, where x is how many decimal places to display. To display 2 decimal places:

In [None]:
print(f'Our testing r-squared value, displayed with 2 decimal places, is {test_r2:.2f}.')

In [None]:
# To display 5 decimal places:

print(f'Our testing r-squared value, displayed with 5 decimal places, is {test_r2:.5f}.')
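One reason to prefer the format specifier over round() is that it always shows the requested number of decimal places. The short sketch below uses made-up metric values to show the difference, and to confirm that formatting changes only the display, not the stored value.

```python
# Hypothetical metric values for illustration
test_r2 = 0.5
test_mse = 1234.56789

# round() returns the float 0.5, so the trailing zero is not displayed
rounded = f'{round(test_r2, 2)}'
# The format specifier always displays exactly 2 decimal places
formatted = f'{test_r2:.2f}'
print(rounded, formatted)

# Formatting affects only the text that is produced
mse_text = f'{test_mse:.2f}'
print(mse_text)
print(test_mse)  # the variable itself keeps its full precision
```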

## More to explore

See this resource for an excellent summary table of string formats: https://mkaz.blog/code/python-string-format-cookbook/ and this article for more information about string formatting: https://thepythonguru.com/python-string-formatting/



## Summary
This lesson showed you how to evaluate your statsmodels OLS for the test data using scikit-learn metrics. We also reviewed f-strings and showed you the recommended way to adjust how many decimal places are shown in your output. 

Using f-strings with print statements will make your code easier to read and will allow you to provide professional-looking output! It's certainly not as professional-looking as an OLS summary, but it's a start!



For more information on reproducing more of the statsmodels OLS summary with scikit-learn, check out the optional lesson: "Optional - Reproducing (parts of) statsmodels Summary in Sklearn"