<h1 style="color:blue;">Scenario 12 - Final Model</h1> 


- C1.S12.Py01 - Label encoding Grade data (Scenario12_Presentation.ipynb)
- **C1.S12.Py02 - Building a regression model with backwards elimination (Scenario12_DataModeling.ipynb)**
- **C1.S12.Py03 - Reviewing the final regression model (Scenario12_DataModeling.ipynb)**
- C1.S12.Py04 - The CRISP approach in brief
- C1.S12.Py05 - Using Markdown to emphasize CRISP approach in code (CRISP_DataModeling_Template.ipynb)




In [None]:
#Code Block 1

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns



#style options 

%matplotlib inline  
#if you want graphs to display automatically without plt.show

pd.set_option('display.max_columns',500) #allows for up to 500 columns to be displayed when viewing a dataframe

plt.style.use('seaborn') #a style that can be used for plots - see style reference above
                         #note: on matplotlib 3.6 and later this style is named 'seaborn-v0_8'



In [None]:
#Code Block 2
df = pd.read_csv('data/Scenario12.csv', index_col = 0, header=0) 
    #DOES set the first column to the index
    # and the top row as the headers

<h2 style="color:blue;">Building a regression model with backwards elimination </h2>

## A Parsimonious Model
**Definition:** 
- Parsimonious models are simple models with great explanatory predictive power. They explain data with a minimum number of parameters, or predictor variables.
- https://www.statisticshowto.com/parsimonious-model/


### Convert or Exclude all features that should not be included:
- Convert to dummy variables ***(ex. Home Ownership - Already Complete)***
- Convert qualitative features to linear numerical values ***(ex. Grade - Already Complete)***
- Drop features that do not add value or are not feasible for the model
     - Do not include features that are not available at the time of decision ***(Grade - Still needed)***
     - Do not include features that do not have a logical linear relationship to the target variable ***(ex. MemberID or Date - Already Complete)***
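
The two conversions listed above can be sketched with pandas. This is a minimal illustration on a hypothetical toy frame: the rows and the A/B/C grade mapping are assumptions made for the example, not values taken from Scenario12.csv. Dropping one dummy column as the reference level mirrors the way the notebook leaves `MORTGAGE` out of X.

```python
import pandas as pd

# Hypothetical toy data; the values are illustrative only
toy = pd.DataFrame({
    'Home Ownership': ['RENT', 'OWN', 'MORTGAGE', 'RENT'],
    'Grade': ['A', 'C', 'B', 'A'],
})

# Dummy variables: one 0/1 column per category
dummies = pd.get_dummies(toy['Home Ownership'], dtype=int)

# Drop one category as the reference level so the dummies
# are not perfectly collinear with the regression constant
dummies = dummies.drop(columns='MORTGAGE')

# Label encoding: map ordered categories to numbers (assumed mapping)
grade_map = {'A': 1, 'B': 2, 'C': 3}
toy['Grade_num'] = toy['Grade'].map(grade_map)

encoded = pd.concat([toy, dummies], axis=1)
print(encoded)
```

Label encoding only makes sense when the categories have a natural order and roughly even spacing; otherwise dummy variables are the safer choice.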

### Create the model 
1. Create an initial model with ALL **independent features (X)** and your **target variable (y)**
2. Review all p-values for each individual coefficient to see which ones are not significant ***(p-value > 0.05)***
3. Eliminate only the single feature with the highest p-value, and only if that p-value is above 0.05. Do not remove every feature above 0.05 at once. Re-run the regression after each removal.
4. Reassess the new model for non-significant p-values (above 0.05). If any remain, go back to step 3. If none remain, proceed to step 5.
5. For the final model, check for multi-collinearity with a correlation matrix and VIF. You may want to eliminate one or two more features based on multi-collinearity.
6. Once there are no more non-significant variables and you are satisfied with the interaction between features (multi-collinearity), you have your ***final parsimonious model***.

In [None]:
#Code Block 3
df.info() #prints its summary directly and returns None, so wrapping it in display() is not needed
df.head()

### Drop
- MemberID              ***(DROP - not linear to target variable)***           
- LoanID                ***(DROP - not linear to target variable)***
- Grade                 ***(DROP - not known when interest rate is determined)*** 
- Day                   ***(DROP - not linear to target variable)***           
- Month                 ***(DROP - not linear to target variable)*** 
- Year                  ***(DROP - not linear to target variable)*** 

In [None]:
#Code Block 4
df_reg = df.copy()

In [None]:
#Code Block 5
df_reg.columns

### Create X and y by DROPPING features

In [None]:
#Code Block 6
df_reg = df_reg.drop(['Origination Date','Member ID', 'Loan ID','Employee Title', 'Grade', 'TermString','Zip Code of Residence', 'State of Residence',
                      'Day', 'Month', 'Year', 'MORTGAGE', 'debt_consolidation'], axis = 1)
df_reg.info()

In [None]:
#Code Block 7
X = df_reg.drop('Interest Rate', axis = 1)
y = df_reg[['Interest Rate']]

In [None]:
#Code Block 8
df_reg = df.copy()

### Create X and y by INCLUDING features

- By including instead of dropping, you can document the features used in the model in a more efficient way.

In [None]:
#Code Block 9
X = df_reg[['Amount Funded', 'Total Debt', 'Annual Income', 'Revolving Balance',
      'Total Revolving Credit Line', 'Term', 'Length of Employment', 
      'Delinquencies Past 24 Months', 'Credit Inquires Last 6 Months',
      'Open Accounts', 'Loan_Income','Debt_Income', 'RevBal_Line', 'RevBal_Income', 
      'Income Verification','IncVer_Income_act',  'OTHER', 'OWN', 'RENT', 'Own_RevLine_act', 
      'car', 'credit_card', 'home_improvement', 'house', 'major_purchase', 'medical', 'moving',
      'other', 'renewable_energy', 'small_business', 'vacation', 'wedding']]

y = df_reg[['Interest Rate']]
X.info()

<h3 style="color:blue;">Initial Regression Model </h3>


In [None]:
#Code Block 10
import statsmodels
import statsmodels.api as sm

In [None]:
#Code Block 11
X = sm.add_constant(X) # adding a constant

reg = sm.OLS(y, X).fit()

predictions = reg.predict(X) 
resid = reg.resid
reg.summary()

<h3 style="color:blue;">Step 1: Remove ['major_purchase'] </h3>

In [None]:
#Code Block 12

#Step 1: exclude 'major_purchase'


X = df_reg[['Amount Funded', 'Total Debt', 'Annual Income', 'Revolving Balance',
      'Total Revolving Credit Line', 'Term', 'Length of Employment', 
      'Delinquencies Past 24 Months', 'Credit Inquires Last 6 Months',
      'Open Accounts', 'Loan_Income','Debt_Income', 'RevBal_Line', 'RevBal_Income', 
      'Income Verification','IncVer_Income_act',  'OTHER', 'OWN', 'RENT', 'Own_RevLine_act', 
      'car', 'credit_card', 'home_improvement', 'house', 'medical', 'moving',
      'other', 'renewable_energy', 'small_business', 'vacation', 'wedding']]

X_excluded = df_reg['major_purchase']

In [None]:
#Code Block 13
X = sm.add_constant(X) # adding a constant

reg1 = sm.OLS(y, X).fit()

predictions1 = reg1.predict(X) 
resid1 = reg1.resid
reg1.summary()

<h3 style="color:blue;">Step 2: Remove ['Length of Employment'] </h3>

In [None]:
#Code Block 14

#Step 1: exclude 'major_purchase'
#Step 2: exclude 'Length of Employment'


X = df_reg[['Amount Funded', 'Total Debt', 'Annual Income', 'Revolving Balance',
      'Total Revolving Credit Line', 'Term', 
      'Delinquencies Past 24 Months', 'Credit Inquires Last 6 Months',
      'Open Accounts', 'Loan_Income','Debt_Income', 'RevBal_Line', 'RevBal_Income', 
      'Income Verification','IncVer_Income_act',  'OTHER', 'OWN', 'RENT', 'Own_RevLine_act', 
      'car', 'credit_card', 'home_improvement', 'house', 'medical', 'moving',
      'other', 'renewable_energy', 'small_business', 'vacation', 'wedding']]

X_excluded = df_reg[['major_purchase', 'Length of Employment']]

In [None]:
#Code Block 15
X = sm.add_constant(X) # adding a constant

reg2 = sm.OLS(y, X).fit()

predictions2 = reg2.predict(X) 
resid2 = reg2.resid
reg2.summary()

<h3 style="color:blue;">Step 3: Remove ['home_improvement'] </h3>

In [None]:
#Code Block 16

#Step 1: exclude 'major_purchase'
#Step 2: exclude 'Length of Employment'
#Step 3: exclude 'home_improvement'


X = df_reg[['Amount Funded', 'Total Debt', 'Annual Income', 'Revolving Balance',
      'Total Revolving Credit Line', 'Term', 
      'Delinquencies Past 24 Months', 'Credit Inquires Last 6 Months',
      'Open Accounts', 'Loan_Income','Debt_Income', 'RevBal_Line', 'RevBal_Income', 
      'Income Verification','IncVer_Income_act',  'OTHER', 'OWN', 'RENT', 'Own_RevLine_act', 
      'car', 'credit_card', 'house', 'medical', 'moving',
      'other', 'renewable_energy', 'small_business', 'vacation', 'wedding']]

X_excluded = df_reg[['major_purchase', 'Length of Employment', 'home_improvement']]


In [None]:
#Code Block 17
X = sm.add_constant(X) # adding a constant

reg3 = sm.OLS(y, X).fit()

predictions3 = reg3.predict(X) 
resid3 = reg3.resid
reg3.summary()

<h3 style="color:blue;">Step 4: Remove ['house'] </h3>

In [None]:
#Code Block 18

#Step 1: exclude 'major_purchase'
#Step 2: exclude 'Length of Employment'
#Step 3: exclude 'home_improvement'
#Step 4: exclude 'house'


X = df_reg[['Amount Funded', 'Total Debt', 'Annual Income', 'Revolving Balance',
      'Total Revolving Credit Line', 'Term', 
      'Delinquencies Past 24 Months', 'Credit Inquires Last 6 Months',
      'Open Accounts', 'Loan_Income','Debt_Income', 'RevBal_Line', 'RevBal_Income', 
      'Income Verification','IncVer_Income_act',  'OTHER', 'OWN', 'RENT', 'Own_RevLine_act', 
      'car', 'credit_card', 'medical', 'moving',
      'other', 'renewable_energy', 'small_business', 'vacation', 'wedding']]

X_excluded = df_reg[['major_purchase', 'Length of Employment', 'home_improvement', 'house' ]]

In [None]:
#Code Block 19

X = sm.add_constant(X) # adding a constant

reg4 = sm.OLS(y, X).fit()

predictions4 = reg4.predict(X) 
resid4 = reg4.resid
reg4.summary()

<h3 style="color:blue;">Step 5: Remove ['Own_RevLine_act'] </h3>

In [None]:
#Code Block 20

#Step 1: exclude 'major_purchase'
#Step 2: exclude 'Length of Employment'
#Step 3: exclude 'home_improvement'
#Step 4: exclude 'house'
#Step 5: exclude 'Own_RevLine_act'

X = df_reg[['Amount Funded', 'Total Debt', 'Annual Income', 'Revolving Balance',
      'Total Revolving Credit Line', 'Term', 
      'Delinquencies Past 24 Months', 'Credit Inquires Last 6 Months',
      'Open Accounts', 'Loan_Income','Debt_Income', 'RevBal_Line', 'RevBal_Income', 
      'Income Verification','IncVer_Income_act',  'OTHER', 'OWN', 'RENT', 
      'car', 'credit_card', 'medical', 'moving',
      'other', 'renewable_energy', 'small_business', 'vacation', 'wedding']]

X_excluded = df_reg[['major_purchase', 'Length of Employment', 'home_improvement', 'house' ,'Own_RevLine_act']]

In [None]:
#Code Block 21

X = sm.add_constant(X) # adding a constant

reg5 = sm.OLS(y, X).fit()

predictions5 = reg5.predict(X) 
resid5 = reg5.resid
reg5.summary()

<h3 style="color:blue;">Step 6: Remove ['car'] </h3>

In [None]:
#Code Block 22

#Step 1: exclude 'major_purchase'
#Step 2: exclude 'Length of Employment'
#Step 3: exclude 'home_improvement'
#Step 4: exclude 'house'
#Step 5: exclude 'Own_RevLine_act'
#Step 6: exclude 'car'

X = df_reg[['Amount Funded', 'Total Debt', 'Annual Income', 'Revolving Balance',
            'Total Revolving Credit Line', 'Term', 
            'Delinquencies Past 24 Months', 'Credit Inquires Last 6 Months',
            'Open Accounts', 'Loan_Income','Debt_Income', 'RevBal_Line', 'RevBal_Income', 
            'Income Verification','IncVer_Income_act',  'OTHER', 'OWN', 'RENT', 
            'credit_card', 'medical', 'moving',
            'other', 'renewable_energy', 'small_business', 'vacation', 'wedding']]

X_excluded = df_reg[['major_purchase', 'Length of Employment', 'home_improvement', 'house' ,'Own_RevLine_act', 'car']]

In [None]:
#Code Block 23
X = sm.add_constant(X) # adding a constant

reg6 = sm.OLS(y, X).fit()

predictions6 = reg6.predict(X) 
resid6 = reg6.resid
reg6.summary()

<h3 style="color:red;">Should we remove ['IncVer_Income_act']? </h3>

<h3 style="color:blue;">Step 7: Remove ['IncVer_Income_act'] </h3>

In [None]:
#Code Block 24

#Step 1: exclude 'major_purchase'
#Step 2: exclude 'Length of Employment'
#Step 3: exclude 'home_improvement'
#Step 4: exclude 'house'
#Step 5: exclude 'Own_RevLine_act'
#Step 6: exclude 'car'

#Additional Regression Check
##Step 7: exclude 'IncVer_Income_act'

X = df_reg[['Amount Funded', 'Total Debt', 'Annual Income', 'Revolving Balance',
            'Total Revolving Credit Line', 'Term', 
            'Delinquencies Past 24 Months', 'Credit Inquires Last 6 Months',
            'Open Accounts', 'Loan_Income','Debt_Income', 'RevBal_Line', 'RevBal_Income', 
            'Income Verification',  'OTHER', 'OWN', 'RENT', 
            'credit_card', 'medical', 'moving',
            'other', 'renewable_energy', 'small_business', 'vacation', 'wedding']]

X_excluded = df_reg[['major_purchase', 'Length of Employment', 'home_improvement', 'house' ,
                     'Own_RevLine_act', 'car', 'IncVer_Income_act']]

In [None]:
#Code Block 25
X = sm.add_constant(X) # adding a constant

reg7 = sm.OLS(y, X).fit()

predictions7 = reg7.predict(X) 
resid7 = reg7.resid
reg7.summary()

### Look at AIC and BIC to choose between the models

- AIC is an estimate of a constant plus the relative distance between the unknown true likelihood function of the data and the fitted likelihood function of the model, so that a lower AIC means a model is considered to be closer to the truth. 
- BIC is an estimate of a function of the posterior probability of a model being true, under a certain Bayesian setup, so that a lower BIC means that a model is considered to be more likely to be the true model. 
- https://www.methodology.psu.edu/resources/AIC-vs-BIC/
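
Both criteria combine a goodness-of-fit term with a penalty for the number of parameters: AIC = 2k - 2 ln(L) and BIC = k ln(n) - 2 ln(L), where L is the maximized likelihood, k the number of estimated coefficients, and n the number of observations. The sketch below computes them by hand for a Gaussian OLS fit using only NumPy. The function name and the toy data are hypothetical, and counting k as the regression coefficients including the constant is an assumption about the convention (it is believed to match what statsmodels reports).

```python
import numpy as np

def gaussian_aic_bic(y, X):
    """AIC and BIC for an OLS fit; X must already include the constant column."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    # Maximized Gaussian log-likelihood, with the error variance estimated as rss / n
    loglik = -0.5 * n * (np.log(2 * np.pi) + np.log(rss / n) + 1)
    aic = 2 * k - 2 * loglik
    bic = k * np.log(n) - 2 * loglik
    return aic, bic

# Hypothetical toy data: one real predictor and one pure-noise column
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
noise_col = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

X_small = np.column_stack([np.ones(n), x])
X_full = np.column_stack([np.ones(n), x, noise_col])

aic_small, bic_small = gaussian_aic_bic(y, X_small)
aic_full, bic_full = gaussian_aic_bic(y, X_full)
print('small model:', round(aic_small, 1), round(bic_small, 1))
print('full model: ', round(aic_full, 1), round(bic_full, 1))
```

An extra feature can never worsen the fit term, so a model earns a lower AIC or BIC only when the improvement in fit outweighs the added penalty. BIC's penalty grows with ln(n), so on large samples it favors smaller models more strongly than AIC does.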

### Compare Step 6 to Step 7

#### Initial Model 
- R-squared:	0.521
- Adj. R-squared:	0.520
- AIC:	1.486e+05 or 148,600
- BIC:	1.488e+05 or 148,800

#### Step 6
- R-Squared: 0.521 **(stayed the same)**
- Adj. R-squared:	0.520 **(stayed the same)**
- AIC:	1.486e+05 or 148,600 **(stayed the same)**
- BIC:	1.488e+05 or 148,800 **(stayed the same)**

#### Step 7
- R-Squared: 0.518 **(decreased)**
- Adj. R-squared:	0.517 **(decreased)**
- AIC:	1.488e+05 or 148,800 **(increased)**
- BIC:	1.490e+05 or 149,000 **(increased)**



<h3 style="color:blue;">FINAL MODEL - Step 6</h3>

In [None]:
#Code Block 26

#Step 1: exclude 'major_purchase'
#Step 2: exclude 'Length of Employment'
#Step 3: exclude 'home_improvement'
#Step 4: exclude 'house'
#Step 5: exclude 'Own_RevLine_act'
#Step 6: exclude 'car'

X = df_reg[['Amount Funded', 'Total Debt', 'Annual Income', 'Revolving Balance',
            'Total Revolving Credit Line', 'Term', 
            'Delinquencies Past 24 Months', 'Credit Inquires Last 6 Months',
            'Open Accounts', 'Loan_Income','Debt_Income', 'RevBal_Line', 'RevBal_Income', 
            'Income Verification','IncVer_Income_act',  'OTHER', 'OWN', 'RENT', 
            'credit_card', 'medical', 'moving',
            'other', 'renewable_energy', 'small_business', 'vacation', 'wedding']]

X_excluded = df_reg[['major_purchase', 'Length of Employment', 'home_improvement', 'house' ,'Own_RevLine_act', 'car']]

In [None]:
#Code Block 27

X = sm.add_constant(X) # adding a constant

reg6 = sm.OLS(y, X).fit()

predictions6 = reg6.predict(X) 
resid6 = reg6.resid
reg6.summary()

In [None]:
#Code Block 28

#Create Predictions dataframe
df_predictions6 = pd.DataFrame(predictions6)
df_predictions6=df_predictions6.rename(columns = {0:'Int_Pred_6'})

#Create Residuals dataframe
df_resid6 = pd.DataFrame(resid6)
df_resid6=df_resid6.rename(columns = {0:'Resid_6'})

In [None]:
#Code Block 29
from statsmodels.stats.outliers_influence import variance_inflation_factor

In [None]:
#Code Block 30

vif = pd.DataFrame()
vif["VIF Factor"] = [variance_inflation_factor(X.values, i) for i in range(X.shape[1])]
vif["features"] = X.columns #adds a column with the labels
round(vif, 1).sort_values(by = 'VIF Factor', ascending = False)
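
For reference, the VIF of feature j is 1 / (1 - R²_j), where R²_j comes from regressing feature j on all the other features. The NumPy sketch below computes it by hand on hypothetical toy data. The function name `vif_by_hand` is an assumption for illustration, and it adds its own constant to each auxiliary regression, so it expects features only. The statsmodels function appears to regress each column on the others exactly as passed, which is presumably why the notebook passes X with the constant included.

```python
import numpy as np

def vif_by_hand(features):
    """VIF per column; `features` is a 2-D array WITHOUT a constant column."""
    n, p = features.shape
    out = np.empty(p)
    for j in range(p):
        target = features[:, j]
        # Regress feature j on a constant plus every other feature
        others = np.column_stack([np.ones(n), np.delete(features, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ beta
        r_squared = 1 - resid.var() / target.var()
        out[j] = 1 / (1 - r_squared)
    return out

# Hypothetical toy data: c is nearly a copy of a, b is independent
rng = np.random.default_rng(0)
n = 1000
a = rng.normal(size=n)
b = rng.normal(size=n)
c = a + 0.1 * rng.normal(size=n)

vifs = vif_by_hand(np.column_stack([a, b, c]))
print(np.round(vifs, 1))
```

A VIF near 1 means the feature is almost unrelated to the others. Common rules of thumb treat values above roughly 5 or 10 as a sign of problematic multi-collinearity, though the cutoff is a convention rather than a formal test.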

In [None]:
#Code Block 31

X_corr = X.drop(['const'], axis=1)
corrMatrix = X_corr.corr()
df_corrMatrix = pd.DataFrame(corrMatrix)
round(df_corrMatrix,3)

In [None]:
#Code Block 32

colormap = plt.cm.viridis
plt.figure(figsize=(14,10))
plt.title('Correlation Heat Map', y=1.05, size=15)
sns.heatmap(df_corrMatrix,linewidths=0.1,vmax=1.0, square=True, cmap=colormap, linecolor='white', annot=False)

### Create Final Model Results with all features, predictions6, and resid6

In [None]:
#Code Block 33

df_reg_results = pd.concat([df_predictions6, df_resid6, df_reg], axis =1)
df_reg_results.head()

### Graphically looking at residuals using lowess

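- LOWESS (Locally Weighted Scatterplot Smoothing) is a tool used in regression analysis that draws a smooth line through a scatter plot to help you see the relationship between variables and anticipate trends. On a residual plot, a LOWESS line that stays close to zero suggests the model has not missed a systematic pattern.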
- https://www.statisticshowto.com/lowess-smoothing/
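
To show what the smoother does on its own, the sketch below applies the `lowess` function from statsmodels to hypothetical noisy data. The data and the `frac=0.2` setting are assumptions chosen for illustration. Note the argument order: the y values come first, then the x values.

```python
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

# Hypothetical noisy data around a known curve
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 10, 200))
y = np.sin(x) + rng.normal(0, 0.3, 200)

# frac is the share of points used for each local fit:
# smaller values follow the data more closely, larger values smooth more
smoothed = lowess(y, x, frac=0.2)

# The result has two columns: sorted x values and the smoothed y values
x_smooth, y_smooth = smoothed[:, 0], smoothed[:, 1]
print(smoothed[:5])
```

Each smoothed value comes from a weighted regression on nearby points only, so the line can bend wherever the data bend, which is what makes it useful for spotting curvature in residuals.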

In [None]:
#Code Block 34

df_move_col = df_reg_results[['Member ID', 'Loan ID', 'Origination Date', 'Interest Rate']]
df_reg_results = df_reg_results.drop(['Member ID', 'Loan ID', 'Origination Date', 'Interest Rate'], axis=1)
df_reg_results = pd.concat([df_move_col, df_reg_results], axis =1)
df_reg_results.head()

In [None]:
#Code Block 35

plt.figure(figsize=(20,10)) #changes area of scatterplot
sns.regplot(x='Int_Pred_6', y='Resid_6', lowess=True,
              data = df_reg_results, scatter_kws={"color":"blue","alpha":0.15, "s":100,"linewidth":2,"edgecolor":"white"}, 
              line_kws={'color': 'black'})

In [None]:
#Code Block 36

df_reg_results.sort_values(by='Resid_6').head(10)

In [None]:
#Code Block 37

df_reg_results.sort_values(by='Resid_6', ascending=False).head(10)

In [None]:
#Code Block 38

X = sm.add_constant(X) # adding a constant

reg6 = sm.OLS(y, X).fit()

predictions6 = reg6.predict(X) 
resid6 = reg6.resid
reg6.summary()

### View coefficients of final model

In [None]:
#Code Block 39

print(reg6.params)

### View t-values of final model

In [None]:
#Code Block 40 
reg6.tvalues

### View t-values in absolute value order

In [None]:
#Code Block 41
t_values_reg6 = pd.DataFrame(reg6.tvalues).reset_index()
t_values_reg6=t_values_reg6.rename(columns = {'index': "Coefficient", 0:'t_value'})
t_values_reg6['abs'] = abs(t_values_reg6['t_value'])
t_values_reg6 = t_values_reg6.sort_values(by='abs', ascending=False)
t_values_reg6 = t_values_reg6.drop(['abs'], axis=1)
t_values_reg6

### Graphically view a few select features
- Amount Funded
- Term
- RevBal_Line
- Revolving Balance
- Total Revolving Credit Line 
- renewable_energy

In [None]:
#Code Block 42

sns.set(style='dark')
plt.figure(figsize=(20,40))

#  Amount Funded - Predicted
ax1 = plt.subplot2grid((6, 2), (0, 0))
plt.title('Amount Funded - Predicted Interest Rate', fontweight='bold', color = 'blue', fontsize='17', horizontalalignment='center')
ax1 = sns.regplot(x='Amount Funded', y='Int_Pred_6', 
              data = df_reg_results, scatter_kws={"color":"blue","alpha":0.15, "s":100,"linewidth":2,"edgecolor":"white"}, 
              line_kws={'color': 'black'})


#  Amount Funded - Actual
ax2 = plt.subplot2grid((6, 2), (0, 1))
plt.title('Amount Funded - Actual Interest Rate', fontweight='bold', color = 'green', fontsize='17', horizontalalignment='center')
ax2 = sns.regplot(x='Amount Funded', y='Interest Rate', 
              data = df_reg_results, scatter_kws={"color":"green","alpha":0.15, "s":100,"linewidth":2,"edgecolor":"white"}, 
              line_kws={'color': 'black'})


#  Term - Predicted
ax3 = plt.subplot2grid((6, 2), (1, 0))
plt.title('Term - Predicted Interest Rate', fontweight='bold', color = 'blue', fontsize='17', horizontalalignment='center')
ax3 = sns.regplot(x='Term', y='Int_Pred_6', 
              data = df_reg_results, scatter_kws={"color":"blue","alpha":0.15, "s":100,"linewidth":2,"edgecolor":"white"}, 
              line_kws={'color': 'black'})


#  Term - Actual
ax4 = plt.subplot2grid((6, 2), (1, 1))
plt.title('Term - Actual Interest Rate', fontweight='bold', color = 'green', fontsize='17', horizontalalignment='center')
ax4 = sns.regplot(x='Term', y='Interest Rate', 
              data = df_reg_results, scatter_kws={"color":"green","alpha":0.15, "s":100,"linewidth":2,"edgecolor":"white"}, 
              line_kws={'color': 'black'})

#  RevBal_Line - Predicted
ax1 = plt.subplot2grid((6, 2), (2, 0))
plt.title('RevBal_Line - Predicted Interest Rate', fontweight='bold', color = 'blue', fontsize='17', horizontalalignment='center')
ax1 = sns.regplot(x='RevBal_Line', y='Int_Pred_6', 
              data = df_reg_results, scatter_kws={"color":"blue","alpha":0.15, "s":100,"linewidth":2,"edgecolor":"white"}, 
              line_kws={'color': 'black'})


#  RevBal_Line - Actual
ax2 = plt.subplot2grid((6, 2), (2, 1))
plt.title('RevBal_Line - Actual Interest Rate', fontweight='bold', color = 'green', fontsize='17', horizontalalignment='center')
ax2 = sns.regplot(x='RevBal_Line', y='Interest Rate', 
              data = df_reg_results, scatter_kws={"color":"green","alpha":0.15, "s":100,"linewidth":2,"edgecolor":"white"}, 
              line_kws={'color': 'black'})

#  Revolving Balance - Predicted
ax1 = plt.subplot2grid((6, 2), (3, 0))
plt.title('Revolving Balance - Predicted Interest Rate', fontweight='bold', color = 'blue', fontsize='17', horizontalalignment='center')
ax1 = sns.regplot(x='Revolving Balance', y='Int_Pred_6', 
              data = df_reg_results, scatter_kws={"color":"blue","alpha":0.15, "s":100,"linewidth":2,"edgecolor":"white"}, 
              line_kws={'color': 'black'})


#  Revolving Balance - Actual
ax2 = plt.subplot2grid((6, 2), (3, 1))
plt.title('Revolving Balance - Actual Interest Rate', fontweight='bold', color = 'green', fontsize='17', horizontalalignment='center')
ax2 = sns.regplot(x='Revolving Balance', y='Interest Rate', 
              data = df_reg_results, scatter_kws={"color":"green","alpha":0.15, "s":100,"linewidth":2,"edgecolor":"white"}, 
              line_kws={'color': 'black'})

#  Total Revolving Credit Line - Predicted
ax1 = plt.subplot2grid((6, 2), (4, 0))
plt.title('Total Revolving Credit Line - Predicted Interest Rate', fontweight='bold', color = 'blue', fontsize='17', horizontalalignment='center')
ax1 = sns.regplot(x='Total Revolving Credit Line', y='Int_Pred_6', 
              data = df_reg_results, scatter_kws={"color":"blue","alpha":0.15, "s":100,"linewidth":2,"edgecolor":"white"}, 
              line_kws={'color': 'black'})


#  Total Revolving Credit Line - Actual
ax2 = plt.subplot2grid((6, 2), (4, 1))
plt.title('Total Revolving Credit Line - Actual Interest Rate', fontweight='bold', color = 'green', fontsize='17', horizontalalignment='center')
ax2 = sns.regplot(x='Total Revolving Credit Line', y='Interest Rate', 
              data = df_reg_results, scatter_kws={"color":"green","alpha":0.15, "s":100,"linewidth":2,"edgecolor":"white"}, 
              line_kws={'color': 'black'})

#  renewable_energy - Predicted
ax1 = plt.subplot2grid((6, 2), (5, 0))
plt.title('renewable_energy - Predicted Interest Rate', fontweight='bold', color = 'blue', fontsize='17', horizontalalignment='center')
ax1 = sns.regplot(x='renewable_energy', y='Int_Pred_6', 
              data = df_reg_results, scatter_kws={"color":"blue","alpha":0.15, "s":100,"linewidth":2,"edgecolor":"white"}, 
              line_kws={'color': 'black'})


#  renewable_energy - Actual
ax2 = plt.subplot2grid((6, 2), (5, 1))
plt.title('renewable_energy - Actual Interest Rate', fontweight='bold', color = 'green', fontsize='17', horizontalalignment='center')
ax2 = sns.regplot(x='renewable_energy', y='Interest Rate', 
              data = df_reg_results, scatter_kws={"color":"green","alpha":0.15, "s":100,"linewidth":2,"edgecolor":"white"}, 
              line_kws={'color': 'black'})
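
The twelve panels above repeat one pattern: for each feature, plot it against the predicted rate (blue) and the actual rate (green). A loop is one possible way to express that more compactly. The sketch below uses a hypothetical toy frame in place of `df_reg_results`, with made-up values and only two features. It also sets `ci=None` to skip the confidence band, which the original cell does not do.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Hypothetical stand-in for df_reg_results
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    'Amount Funded': rng.uniform(1000, 35000, 100),
    'Term': rng.choice([36, 60], 100),
})
toy['Interest Rate'] = (8 + 0.0001 * toy['Amount Funded']
                        + 0.05 * toy['Term'] + rng.normal(0, 1, 100))
toy['Int_Pred_6'] = toy['Interest Rate'] - rng.normal(0, 1, 100)

features = ['Amount Funded', 'Term']
# (column to plot on y, label for the title, color)
targets = [('Int_Pred_6', 'Predicted', 'blue'),
           ('Interest Rate', 'Actual', 'green')]

fig, axes = plt.subplots(len(features), len(targets),
                         figsize=(20, 7 * len(features)), squeeze=False)

for row, feature in enumerate(features):
    for col, (target, label, color) in enumerate(targets):
        ax = axes[row, col]
        sns.regplot(x=feature, y=target, data=toy, ax=ax, ci=None,
                    scatter_kws={'color': color, 'alpha': 0.15, 's': 100,
                                 'linewidth': 2, 'edgecolor': 'white'},
                    line_kws={'color': 'black'})
        ax.set_title(f'{feature} - {label} Interest Rate',
                     fontweight='bold', color=color, fontsize=17)
```

With this layout, adding or removing a feature means editing only the `features` list.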


### Export final data as a csv file

In [None]:
#Code Block 43
df_reg_results.to_csv('data/FinalDataResults.csv')