### Comedy Show Lab

Imagine that you are the producer for a comedy show at your school.  Use your knowledge of linear regression to predict how successful the show will be.

### Working through a linear regression 

The comedy show is trying to figure out how much money to spend on advertising in the student newspaper.  The newspaper tells the show that:
 * For every two dollars spent on advertising, three students attend the show.  
 * If no money is spent on advertising, no one will attend the show.  

Write a linear regression function called `attendance` that shows the relationship between advertising and attendance expressed by the newspaper.  

In [17]:
def attendance(advertising):
    return (3/2)*advertising

In [18]:
attendance(100) # 150

150.0

In [19]:
attendance(50) # 75

75.0

Despite what the student newspaper says, the comedy show knows from experience that they'll still have a crowd even without an advertising budget.  Some of the comedians in the show have friends (believe it or not), and twenty of those friends will show up.  Write a function called `attendance_with_friends` that models the following: 

 * **When the advertising budget is zero, 20 friends still attend.**
 * **For every two dollars spent on advertising, three additional people attend the show.**

In [20]:
def attendance_with_friends(advertising):
    return (3/2)*advertising + 20

In [21]:
attendance_with_friends(100) # 170

170.0

In [22]:
attendance_with_friends(50) # 95

95.0
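Both functions above follow the same pattern, $y = mx + b$, where $m$ is the slope (attendees gained per dollar) and $b$ is the y-intercept (attendance at a budget of zero). As a minimal sketch of that pattern, the hypothetical helper `linear_model` below (not part of the lab) builds a prediction function from any $m$ and $b$:

```python
def linear_model(slope, intercept):
    """Return a function that computes slope * x + intercept."""
    def predict(x):
        return slope * x + intercept
    return predict

# m = 3/2 attendees per dollar, b = 20 friends who attend regardless
attendance_model = linear_model(3 / 2, 20)
print(attendance_model(0))    # 20.0
print(attendance_model(100))  # 170.0
```

With $m = 3/2$ and $b = 20$ this reproduces `attendance_with_friends`; with $b = 0$ it reproduces `attendance`.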

Let's help plot this line so you can get a sense of what your $m$ and $b$ values look like in graph form.

First we import the `plotly` library and its `graph_objs` module, and set up `plotly` to be used without uploading our plots to its website.

In [23]:
import plotly
from plotly import graph_objs
plotly.offline.init_notebook_mode(connected=True)

Then, we set a variable `initial_sample_budgets` equal to a list of our budgets.  

In [24]:
initial_sample_budgets = [0, 50, 100]

Finally, we plot our regression line using our `attendance_with_friends` function.  The values in `initial_sample_budgets` will be our x values.  For our y values, we call `attendance_with_friends` on each budget to build a list of the corresponding attendances.

In [25]:
trace_of_attendance_with_friends = graph_objs.Scatter(
    x=initial_sample_budgets,
    y=[attendance_with_friends(budget) for budget in initial_sample_budgets],
)

plotly.offline.iplot([trace_of_attendance_with_friends])

In [26]:
trace_of_attendance_with_friends

{'type': 'scatter', 'x': [0, 50, 100], 'y': [20.0, 95.0, 170.0]}

### Calculating slopes

The comedy show decides to use advertising with two different shows.  The attendance looks like the following.

| Budgets (dollars)        | Attendance           | 
| ------------- |:-------------:| 
| 200       |400 | 
| 400       |700 | 

In code, we represent the shows as the following:

In [27]:
first_show = {'budget': 200, 'attendance': 400}
second_show = {'budget': 400, 'attendance': 700}

Using the formula for the slope of a line through two points, write a function called `marginal_return_on_budget` that takes these two shows and returns the slope of the regression line.

In [28]:
def marginal_return_on_budget(first_show, second_show):
    return (second_show['attendance'] - first_show['attendance'])/(second_show['budget'] - first_show['budget'])

In [29]:
marginal_return_on_budget(first_show, second_show) # 1.5

1.5
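As a worked check of that result, treating budget as $x$ and attendance as $y$ (the subscripts 1 and 2 are labels introduced here for the first and second show):

```latex
$$ m = \frac{y_2 - y_1}{x_2 - x_1} = \frac{700 - 400}{400 - 200} = \frac{300}{200} = 1.5 $$
```

So each additional dollar of advertising corresponds to 1.5 additional attendees.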

Let's make sure that our function properly calculates the slope of the line with different data.

In [30]:
imaginary_third_show = {'budget': 300, 'attendance': 500}
imaginary_fourth_show = {'budget': 600, 'attendance': 900}
marginal_return_on_budget(imaginary_third_show, imaginary_fourth_show) # 1.33

1.3333333333333333

The comedy show spends zero dollars on advertising for the next show.  Now the attendance chart looks like the following:

| Budgets (dollars)        | Attendance           | 
| ------------- |:-------------:| 
| 0       |100 | 
| 200       |400 | 
| 400       |700 | 

In [31]:
first_show = {'budget': 200, 'attendance': 400}
second_show = {'budget': 400, 'attendance': 700}
third_show = {'budget': 0, 'attendance': 100}

shows = [first_show, second_show, third_show]

Write a function called `y_intercept`.  It should find the show with a budget of zero, and then return the corresponding attendance.

In [32]:
def y_intercept(shows):
    show_no_budget = list(filter(lambda show: show['budget'] == 0,shows))[0]
    return show_no_budget['attendance']

In [33]:
y_intercept(shows) # 100

100
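Note that `y_intercept` as written raises an `IndexError` when no show has a budget of zero, because indexing `[0]` into an empty list fails. One possible alternative, sketched here under the hypothetical name `y_intercept_or_none` (not part of the lab), returns `None` in that case instead:

```python
def y_intercept_or_none(shows):
    """Return the attendance of the first zero-budget show, or None if there is none."""
    return next((show['attendance'] for show in shows if show['budget'] == 0), None)

# Sample data mirroring the tables above
shows_with_zero = [
    {'budget': 200, 'attendance': 400},
    {'budget': 0, 'attendance': 100},
]
shows_without_zero = [
    {'budget': 200, 'attendance': 400},
    {'budget': 400, 'attendance': 700},
]
print(y_intercept_or_none(shows_with_zero))     # 100
print(y_intercept_or_none(shows_without_zero))  # None
```

Whether returning `None` or raising an error is preferable depends on how the caller wants to handle missing data.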

Now write a function called `comedy_show_regression_line` that, given a list of shows that have already occurred and a proposed budget, returns the expected attendance for the show.  The function should use the `marginal_return_on_budget` and `y_intercept` functions to calculate the attendance.

In [34]:
def comedy_show_regression_line(previous_shows, budget):
    first_show = previous_shows[0]
    second_show = previous_shows[1]
    # y = m*x + b: slope times the proposed budget, plus the intercept
    m = marginal_return_on_budget(first_show, second_show)
    b = y_intercept(previous_shows)
    return m*budget + b

In [35]:
budget = 350
comedy_show_regression_line(shows, budget) # 625

625.0

Now, let's write a function that returns the expected attendance given previous shows and a budget, even when the previous shows do not include a value for when x is zero.  The function will be called `regression_line_two_points` and take as inputs a list of two shows and a budget.

In [39]:
first_show = {'budget': 300, 'attendance': 700}
second_show = {'budget': 400, 'attendance': 900}

shows = [first_show, second_show]

In [40]:
def regression_line_two_points(shows, budget):
    first_show = shows[0]
    second_show = shows[1]
    m = marginal_return_on_budget(first_show, second_show)
    b = first_show['attendance'] - m*first_show['budget']   
    return m*budget + b

In [44]:
regression_line_two_points(shows, 350)

800.0
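The intercept in `regression_line_two_points` comes from rearranging $y_1 = m x_1 + b$ into $b = y_1 - m x_1$. As a rough sanity check, the sketch below recomputes $m$ and $b$ for the same two shows and confirms that the resulting line passes through both of them. The helper names `slope` and `line_through` are illustrative, not part of the lab.

```python
def slope(p1, p2):
    """Rise over run between two shows."""
    return (p2['attendance'] - p1['attendance']) / (p2['budget'] - p1['budget'])

def line_through(p1, p2):
    """Return (m, b) for the line through two shows, using b = y1 - m * x1."""
    m = slope(p1, p2)
    b = p1['attendance'] - m * p1['budget']
    return m, b

show_a = {'budget': 300, 'attendance': 700}
show_b = {'budget': 400, 'attendance': 900}
m, b = line_through(show_a, show_b)
print(m, b)  # 2.0 100.0

# The line should reproduce both known points
for show in (show_a, show_b):
    assert m * show['budget'] + b == show['attendance']
```

With $m = 2$ and $b = 100$, a budget of 350 gives $2 \cdot 350 + 100 = 800$, matching the output above.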