# Descending into ML

Linear regression is a method for finding the straight line or hyperplane that best fits a set of points. This module explores linear regression intuitively before laying the groundwork for a machine learning approach to linear regression.

$y = w_1 x_1 + b$
- where *w = weights, x = features, b = bias*

It has long been known that crickets (an insect species) chirp more frequently on hotter days than on cooler days. For decades, professional and amateur scientists have cataloged data on chirps-per-minute and temperature. As a birthday gift, your Aunt Ruth gives you her cricket database and asks you to learn a model to predict this relationship. You start by using the data to explore the relationship.

First, examine your data by plotting it:

<img src="./images/CricketPoints.svg">

Figure 1. Chirps per Minute vs. Temperature in Celsius.

As expected, the plot shows the temperature rising with the number of chirps. Is this relationship between chirps and temperature linear? Yes, you could draw a single straight line like the following to approximate this relationship:

<img src="./images/CricketLine.svg">

Figure 2. A linear relationship.

True, the line doesn't pass through every dot, but the line does clearly show the relationship between chirps and temperature. Using the equation for a line, you could write down this relationship as follows:

$y = mx + b$

where:

- $y$ is the temperature in Celsius, the value we're trying to predict.
- $m$ is the slope of the line.
- $x$ is the number of chirps per minute, the value of our input feature.
- $b$ is the y-intercept.

By convention in machine learning, you'll write the equation for a model slightly differently:

$y' = b + w_1 x_1$

where:

- $y'$ is the predicted label (a desired output).
- $b$ is the bias (the y-intercept), sometimes referred to as $w_0$.
- $w_1$ is the weight of feature 1. Weight is the same concept as the "slope" $m$ in the traditional equation of a line.
- $x_1$ is a feature (a known input).

To infer (predict) the temperature $y'$ for a new chirps-per-minute value $x_1$, just substitute the $x_1$ value into this model.
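
As a minimal sketch of that substitution, the snippet below evaluates the bias plus weight times feature for one chirps-per-minute value. The function name `predict_temperature` and the numeric weight and bias are hypothetical, chosen only so the arithmetic is easy to follow. They are not fitted to Aunt Ruth's data.

```python
def predict_temperature(chirps_per_minute: float, weight: float, bias: float) -> float:
    """Return the prediction y' = b + w1 * x1 for a single feature."""
    return bias + weight * chirps_per_minute


# Hypothetical parameters for illustration only (NOT learned from the cricket data).
example_weight = 0.25  # w1: degrees Celsius per chirp-per-minute
example_bias = 5.0     # b: predicted temperature when the chirp rate is zero

predicted = predict_temperature(100.0, example_weight, example_bias)
print(predicted)  # 5.0 + 0.25 * 100.0 = 30.0
```

In a real workflow the weight and bias would come from training on the data rather than being typed in by hand.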

Although this model uses only one feature, a more sophisticated model might rely on multiple features, each having a separate weight ($w_1$, $w_2$, etc.). For example, a model that relies on three features might look as follows:

$y' = b + w_1 x_1 + w_2 x_2 + w_3 x_3$

More generally, a model with $n$ features adds one weighted term per feature:

$y' = b + w_1 x_1 + w_2 x_2 + \dots + w_n x_n$
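
A multi-feature linear model is the bias plus a weighted sum of the features, which can be sketched in a few lines of plain Python. The function name `predict` and all numeric values below are assumptions made up for illustration. They do not come from any trained model.

```python
from typing import Sequence


def predict(features: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Return y' = b + w1*x1 + w2*x2 + ... + wn*xn."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    # Pair each weight with its feature and add up the products.
    return bias + sum(w * x for w, x in zip(weights, features))


# Hypothetical three-feature example (values are made up).
example_features = [2.0, 4.0, 6.0]   # x1, x2, x3
example_weights = [0.5, 0.25, 1.0]   # w1, w2, w3
example_bias = 1.0                   # b

print(predict(example_features, example_weights, example_bias))  # 1.0 + 1.0 + 1.0 + 6.0 = 9.0
```

The same function covers the single-feature case when both lists have length one.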