# Linear Regression
###### (https://samcgardner.github.io/2018/10/06/linear-regression-in-haskell.html)

The point of linear regression is to model the relationship between two continuous values by assuming that one is a linear function of the other. We want to find the line that best fits our data.

Such a line can be written as `y = mx + b`, so we need a way to find the best values for the slope `m` and the intercept `b`. For this we will use gradient descent.

Gradient descent can be applied to any differentiable function. At a minimum, all the partial derivatives are equal to zero, and by repeatedly taking small steps in the direction of the negative gradient we move towards a (local) minimum, provided one exists and the step size is small enough.
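As a minimal sketch of the idea, separate from the regression code in this notebook, the following minimises the one-variable function `f(x) = (x - 3)^2` by repeatedly stepping against its derivative. The names `gradientDescent`, `dfExample`, and `exampleMinimum`, and the step size of 0.1, are illustrative assumptions.

```haskell
-- Hypothetical helper: take `steps` gradient descent steps from `x0`,
-- using the derivative `df` and the learning rate `alpha`.
gradientDescent :: (Double -> Double) -> Double -> Int -> Double -> Double
gradientDescent df alpha steps x0 = iterate step x0 !! steps
  where
    step x = x - alpha * df x

-- Derivative of f(x) = (x - 3)^2, whose minimum is at x = 3.
dfExample :: Double -> Double
dfExample x = 2 * (x - 3)

-- Starting from 0, 100 steps with alpha = 0.1 end up very close to 3.
exampleMinimum :: Double
exampleMinimum = gradientDescent dfExample 0.1 100 0
```

Each step here shrinks the distance to the minimum by a constant factor, which is why a fixed number of iterations is enough for this function. A step size that is too large would overshoot instead of converging.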

What we need is a function whose value is at a minimum when the values for `m` and `b` are optimal. This is called the 'Cost Function.'
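The notebook does not write the cost function out. The update rule in the code that follows is consistent with the half mean squared error, so that is the form assumed here, with $n$ the number of training examples and $(x_i, y_i)$ the $i$-th example (both symbols are introduced for this sketch):

$$
J(b, m) = \frac{1}{2n} \sum_{i=1}^{n} \left( m x_i + b - y_i \right)^2
$$

Under that assumption, its partial derivatives are

$$
\frac{\partial J}{\partial b} = \frac{1}{n} \sum_{i=1}^{n} \left( m x_i + b - y_i \right), \qquad
\frac{\partial J}{\partial m} = \frac{1}{n} \sum_{i=1}^{n} \left( m x_i + b - y_i \right) x_i
$$

and one gradient descent step with learning rate $\alpha$ is

$$
b \leftarrow b - \alpha \frac{\partial J}{\partial b}, \qquad
m \leftarrow m - \alpha \frac{\partial J}{\partial m}
$$

In the code, `t0` plays the role of $b$ and `t1` the role of $m$.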

In [3]:
-- genericLength is needed by avg
import Data.List (genericLength)

-- holds (t0, t1): the intercept b and the slope m, in that order
newtype Coefficients = Coefficients (Float, Float) deriving Show

-- a data point in our training set
newtype Example = Example (Float, Float)

newtype TrainingSet = TrainingSet [Example]

-- Finds the coefficients for linear regression using gradient descent
linearRegression :: Coefficients -> Float -> TrainingSet -> Int -> Coefficients
linearRegression coefficients alpha dataset iterations
  | iterations == 0 = coefficients
  | otherwise =
    let thetas = newThetas coefficients alpha dataset
    in linearRegression thetas alpha dataset (iterations - 1)
    
-- calculate new values for t0 and t1
newThetas :: Coefficients -> Float -> TrainingSet -> Coefficients
newThetas thetas@(Coefficients (t0, t1)) alpha (TrainingSet examples) =
  let deltas = map (calculateDelta thetas) examples
      adjustedDeltas = adjustDeltas deltas examples
      newt0 = t0 - alpha * avg deltas
      newt1 = t1 - alpha * avg adjustedDeltas
  in Coefficients (newt0, newt1)

-- Calculates the difference between the prediction h(x) = t0 + t1 * x and the observed y
calculateDelta :: Coefficients -> Example -> Float
calculateDelta (Coefficients (t0, t1)) (Example (x, y)) = t0 + t1 * x - y

-- The partial derivative with respect to t1 carries an extra factor of x,
-- so multiply each delta by the x of its example
adjustDeltas :: [Float] -> [Example] -> [Float]
adjustDeltas deltas examples =
  let xs = map (\(Example (x, _)) -> x) examples
      zipped = zip deltas xs
  in map (uncurry (*)) zipped
  
-- Arithmetic mean of a non-empty list
avg :: [Float] -> Float
avg xs = realToFrac (sum xs) / genericLength xs

-- now test it
let thetas = Coefficients (0, 0)
let alpha = 0.3
let trainingset =
      TrainingSet [Example (0, 50), Example (1, 60), Example (2, 70)]
let iterations = 500

linearRegression thetas alpha trainingset iterations

Coefficients (49.99999,10.000008)

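The training points lie exactly on the line `y = 10x + 50`, and the result recovers it almost exactly: the intercept (`b`, the first coefficient) is about 50 and the slope (`m`, the second coefficient) is about 10.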
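One way to check the fit numerically is to evaluate the cost at the starting guess and at the line through the data. The sketch below assumes the half mean squared error as the cost and uses plain tuples rather than the notebook's newtypes. The names `cost`, `points`, `initialCost`, and `fittedCost` are hypothetical.

```haskell
-- Hypothetical helper: half mean squared error of the line y = t1 * x + t0
-- over a list of (x, y) pairs.
cost :: (Double, Double) -> [(Double, Double)] -> Double
cost (t0, t1) pts =
  sum [(t0 + t1 * x - y) ^ 2 | (x, y) <- pts] / (2 * fromIntegral (length pts))

-- The same three points used in the test above.
points :: [(Double, Double)]
points = [(0, 50), (1, 60), (2, 70)]

initialCost, fittedCost :: Double
initialCost = cost (0, 0) points   -- the starting guess
fittedCost  = cost (50, 10) points -- the exact line the result approximates
```

With these points, `initialCost` is about 1833.33 and `fittedCost` is exactly 0, so the descent has moved the coefficients from a very poor fit to an essentially perfect one.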