## Regression demo 1
This Jupyter notebook runs the regression demo from Lecture 7 of the DS2000 course. 

Under the hood, it uses Plotly and Dash to make the notebook interactive.
These tools are well worth learning about, but they are not required for the course. 

Plotly: 
Dash

You should, however, play around with the interactive features of this notebook to strengthen your intuition about regression. 


In [2]:
# Make sure that you have the following libraries installed
# %pip installs into the environment of the running kernel and does not
# depend on a `pip` executable being on the shell's PATH
%pip install dash plotly


In [3]:
import dash
from dash import dcc # Dash core components 
from dash import html # Dash html components
from dash.dependencies import Input, Output # Dash callback inputs and outputs
import plotly.express as px # Plotly express
import plotly.graph_objects as go # Plotly graph objects (used for the regression line)
import pandas as pd
import numpy as np

In [4]:
# Read the data frame as global variable
D = pd.read_csv('RunningSpeeds.csv')

# Generate the x-axis for the regression line
predx = np.linspace(18, 83, 4)

# Build the App layout:
app = dash.Dash()
app.layout = html.Div([
    dcc.Graph(id='strava_fig'),
    html.P('intercept: 0.05',id='intercept_val'),
    dcc.Slider(
        id='intercept_slide',
        min=0.0,
        max=8.0,
        value=5.0,
        step=0.01,
        updatemode='drag'),
    html.P('slope: 0.01',id='slope_val'),
    dcc.Slider(
        id='slope_slide',
        min=-0.1,
        max=0.2,
        value=0,
        step=0.001,
        updatemode='drag'),
    html.P(id='Loss_val', style={'fontSize': '20px'}),  # Dash style keys are camelCased
    dcc.Checklist(
        id='check',
        options=[
            {'label': 'Subtract 40', 'value': 'Sub'},
            {'label': 'Show Loss', 'value': 'Loss'},
            {'label': 'Show Deriv', 'value': 'Deriv'}
        ],
        value=['Sub'],
        labelStyle={'display': 'inline-block'})
    ],style={'width':'500px',
            'background':'#ffffff',
            'fontFamily': 'sans-serif',
            'fontSize': '16px'})

# Define the callback function
@app.callback(
    Output('strava_fig', 'figure'),
    Output('Loss_val',  'children'),
    Output('intercept_val', 'children'),
    Output('slope_val', 'children'),
    Input('intercept_slide', 'value'),
    Input('slope_slide', 'value'),
    Input('check','value'))

# This function updates the graph
def update_graph(b0, b1, checkitems):
    # Depending on whether 'Subtract 40' is toggled,
    # subtract 40 from the age.
    if 'Sub' in checkitems:
        predy = b0+b1*(predx-40)
        xx = D.age - 40
    else:
        predy = b0+b1*(predx)
        xx = D.age

    # Calculate the residuals and the squared loss
    res = D.pace - b0 - b1*xx
    Loss = np.sum((res)**2)

    # Generate the scatter plot
    fig = px.scatter(D,x='age',y='pace')
    fig.add_trace(go.Scatter(x=predx, y=predy,
                    mode='lines',
                    name='prediction'))
    fig.update_layout(title="Running Speeds from Strava",
        autosize=False,
        width=500, height=350,margin=dict(l=50,r=50,b=50,t=50,pad=4),
        paper_bgcolor="#fdfdff",
        xaxis_title="Age [years]",
        yaxis_title="Pace [min/km]")
    fig.update_xaxes(range=[17, 85],fixedrange=True)
    fig.update_yaxes(range=[2.5, 8.5],fixedrange=True)

    if 'Loss' in checkitems:
        Loss_str = f'Loss: {Loss:.1f}'
    else:
        Loss_str = ''

    # Format the intercept and slope labels
    Intercept_str = f'Intercept: {b0:.2f}'
    # Three decimals, since the slope slider moves in steps of 0.001
    Slope_str = f'Slope: {b1:.3f}'
    if 'Deriv' in checkitems:
        dLb0 = -2*np.sum(res)
        dLb1 = -2*np.sum(res * xx)
        Intercept_str = Intercept_str + f'  [{dLb0:.2f}]'
        Slope_str = Slope_str + f'  [{dLb1:.2f}]'
    return fig, Loss_str, Intercept_str, Slope_str
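
The loss and derivative formulas used in the callback can be checked outside of Dash. The following is a minimal sketch, not part of the original demo: it uses made-up synthetic data in place of `RunningSpeeds.csv`, and the helper names `squared_loss` and `loss_gradient` are hypothetical. It compares the analytic derivatives with central finite differences of the loss.

```python
import numpy as np

# Synthetic stand-in for the 'age' and 'pace' columns
# (made-up numbers, NOT the Strava data used in the demo)
rng = np.random.default_rng(0)
age = rng.uniform(18, 83, size=200)
pace = 5.0 + 0.02 * (age - 40) + rng.normal(0, 0.5, size=200)

def squared_loss(b0, b1, x, y):
    """Sum of squared residuals for the line y = b0 + b1 * x."""
    res = y - b0 - b1 * x
    return np.sum(res ** 2)

def loss_gradient(b0, b1, x, y):
    """Analytic derivatives of the squared loss with respect to b0 and b1."""
    res = y - b0 - b1 * x
    return -2 * np.sum(res), -2 * np.sum(res * x)

# Evaluate at an arbitrary parameter setting, with the age centred at 40
x = age - 40
b0, b1, eps = 4.0, 0.05, 1e-5
dLb0, dLb1 = loss_gradient(b0, b1, x, pace)

# Central finite differences of the loss in each parameter
num_dLb0 = (squared_loss(b0 + eps, b1, x, pace)
            - squared_loss(b0 - eps, b1, x, pace)) / (2 * eps)
num_dLb1 = (squared_loss(b0, b1 + eps, x, pace)
            - squared_loss(b0, b1 - eps, x, pace)) / (2 * eps)

print('dL/db0:', dLb0, 'numerical:', num_dLb0)
print('dL/db1:', dLb1, 'numerical:', num_dLb1)
```

Because the loss is quadratic in both parameters, the two estimates should agree up to floating-point rounding.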


# Dash example for the optimization of a regression line 
The following Dash app lets you vary the slope and intercept of a regression line and see the immediate effect on the fit. 

`Subtract 40`: 
Selected: the regression line is y = slope * (age - 40) + intercept 
Not selected: the regression line is y = slope * age + intercept 

`Show Loss`: 
Shows the squared loss (the sum of squared residuals). Try to adjust the line to minimize this quantity.

`Show Deriv`: 
Shows the derivative of the loss with respect to the intercept and the slope, in square brackets next to each value. Note that you have to move each parameter in the direction *opposite* to its derivative, as you are trying to *minimize* the loss function. 
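
Moving each parameter opposite to its derivative, over and over, is gradient descent. The sketch below automates what you do by hand with the sliders. It is an illustration under stated assumptions, not the method used in the lecture: the data are synthetic (made up, not `RunningSpeeds.csv`), and the step size `lr` and the number of steps are arbitrary choices. The result is compared with the least-squares fit from `np.polyfit`.

```python
import numpy as np

# Synthetic stand-in for the 'age' and 'pace' columns (made-up numbers)
rng = np.random.default_rng(1)
age = rng.uniform(18, 83, size=200)
pace = 5.0 + 0.02 * (age - 40) + rng.normal(0, 0.5, size=200)
x = age - 40  # centred age, as with 'Subtract 40' selected

# Start from the sliders' initial values
b0, b1 = 5.0, 0.0

# Assumed step size: the loss curvature is bounded by 2 * (n + sum(x**2)),
# so a step of one over that bound cannot overshoot
lr = 1.0 / (2 * (len(x) + np.sum(x ** 2)))

for _ in range(20000):
    res = pace - b0 - b1 * x
    dLb0 = -2 * np.sum(res)      # derivative with respect to the intercept
    dLb1 = -2 * np.sum(res * x)  # derivative with respect to the slope
    # Step in the direction opposite to the derivative
    b0 -= lr * dLb0
    b1 -= lr * dLb1

# Closed-form least-squares fit for comparison (highest degree first)
slope_ls, intercept_ls = np.polyfit(x, pace, 1)
print('gradient descent:', b0, b1)
print('least squares:   ', intercept_ls, slope_ls)
```

Both derivatives shrink towards zero as the line approaches the best fit. The same thing happens to the bracketed values in the app when the loss is near its minimum.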

In [5]:
app.run(jupyter_mode='inline')