# 4.7.3. Linear Discriminant Analysis

Now we will perform LDA on the `Smarket` data. In `R`, we fit an LDA model using the `lda()` function, which is part of the `MASS` library. Notice that the syntax for the `lda()` function is identical to that of `lm()`, and to that of `glm()` except for the absence of the `family` option. We fit the model using only the observations before 2005.

In [1]:
library(ISLR2)
library(MASS)
attach(Smarket)
train <- (Year < 2005)
Smarket.2005 <- Smarket[!train,]
Direction.2005 <- Direction[!train]


Attaching package: ‘MASS’


The following object is masked from ‘package:ISLR2’:

    Boston




In [2]:
lda.fit <- lda(Direction ~ Lag1 + Lag2, data = Smarket, subset = train)
lda.fit

Call:
lda(Direction ~ Lag1 + Lag2, data = Smarket, subset = train)

Prior probabilities of groups:
    Down       Up 
0.491984 0.508016 

Group means:
            Lag1        Lag2
Down  0.04279022  0.03389409
Up   -0.03954635 -0.03132544

Coefficients of linear discriminants:
            LD1
Lag1 -0.6420190
Lag2 -0.5135293

The LDA output indicates that $\hat{\pi}_1 = 0.492$ and $\hat{\pi}_2 = 0.508$; in other words, $49.2\%$ of the training observations correspond to days during which the market went down. It also provides the group means; these are the average of each predictor within each class, and are used by LDA as estimates of $\mu_k$. These suggest that there is a tendency for the previous two days' returns to be negative on days when the market increases, and a tendency for the previous two days' returns to be positive on days when the market declines. The _coefficients of linear discriminants_ output provides the linear combination of `Lag1` and `Lag2` that is used to form the LDA decision rule. In other words, these are the multipliers of the elements of $X=x$ in (4.24). If $-0.642 \times \textcolor{brown}{Lag1} - 0.514 \times \textcolor{brown}{Lag2}$ is large, then the LDA classifier will predict a market increase, and if it is small, then the LDA classifier will predict a market decline.
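
As a minimal sketch of how these coefficients act, the snippet below evaluates the linear combination for two hypothetical days. The coefficient values are copied from the output above; the data frame `new_days` and its lag values are invented for illustration and are not taken from `Smarket`. As far as we understand the `MASS` implementation, `predict()` centers the predictors before applying the coefficients, so the linear discriminants it reports should differ from this raw combination by a constant shift.

```r
# Coefficients of the linear discriminant, copied from the lda.fit output above
coefs <- c(Lag1 = -0.6420190, Lag2 = -0.5135293)

# Hypothetical percentage returns for two days (illustrative values only)
new_days <- data.frame(Lag1 = c(-1.0, 1.0), Lag2 = c(-0.5, 0.5))

# Raw linear combination: -0.642 * Lag1 - 0.514 * Lag2, one value per day
scores <- drop(as.matrix(new_days) %*% coefs)
scores
```

The day with negative lagged returns receives the larger score and so leans toward `Up`, while the day with positive lagged returns receives the smaller score and leans toward `Down`, in line with the group means.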

The `plot()` function produces plots of the _linear discriminants_, obtained by computing $-0.642 \times \textcolor{brown}{Lag1} - 0.514 \times \textcolor{brown}{Lag2}$ for each of the training observations. The `Up` and `Down` observations are displayed separately.  

The `predict()` function returns a list with three elements. The first element, `class`, contains LDA's predictions about the movement of the market. The second element, `posterior`, is a matrix whose $k$th column contains the posterior probability that the corresponding observation belongs to the $k$th class, computed from (4.15). Finally, `x` contains the linear discriminants, described earlier.
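
To make the structure of this list concrete without relying on `Smarket`, here is a small sketch on simulated two-class data. The data frame `sim`, its columns `x1`, `x2`, and `y`, and the seed are all assumptions made for illustration. The last lines check our understanding that, in `MASS`, `x` is the linear combination of the predictors after centering them at the prior-weighted average of the group means.

```r
library(MASS)

set.seed(1)  # arbitrary seed, for reproducibility only
n <- 100

# Simulated two-class data (hypothetical, not the Smarket data)
sim <- data.frame(
  x1 = c(rnorm(n, mean = 0), rnorm(n, mean = 2)),
  x2 = c(rnorm(n, mean = 0), rnorm(n, mean = 1)),
  y  = factor(rep(c("A", "B"), each = n))
)

fit  <- lda(y ~ x1 + x2, data = sim)
pred <- predict(fit, sim)

names(pred)                    # the three elements: class, posterior, x
dim(pred$posterior)            # one row per observation, one column per class
rowSums(pred$posterior)[1:5]   # the posterior probabilities in each row sum to 1

# Recompute the linear discriminants by hand: center, then apply the coefficients
centre <- colSums(fit$prior * fit$means)
manual <- scale(as.matrix(sim[, c("x1", "x2")]), center = centre, scale = FALSE) %*%
  fit$scaling
all.equal(as.numeric(manual), as.numeric(pred$x))
```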

In [4]:
lda.pred <- predict(lda.fit, Smarket.2005)
names(lda.pred)

As we observed in Section 4.5, the LDA and logistic regression predictions are almost identical.

In [5]:
lda.class <- lda.pred$class
table(lda.class, Direction.2005)

         Direction.2005
lda.class Down  Up
     Down   35  35
     Up     76 106

In [6]:
mean(lda.class == Direction.2005)
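
The accuracy computed by `mean(lda.class == Direction.2005)` can be cross-checked directly from the confusion matrix above, since the correct predictions sit on its diagonal. The sketch below re-enters the four counts by hand rather than refitting the model; the object name `conf_mat` is ours.

```r
# Confusion matrix copied from the table above (rows: predicted, columns: actual).
# matrix() fills column by column, so the first two values are the "Down" column.
conf_mat <- matrix(c(35, 76, 35, 106), nrow = 2,
                   dimnames = list(lda.class = c("Down", "Up"),
                                   Direction.2005 = c("Down", "Up")))

correct  <- sum(diag(conf_mat))  # 35 + 106 correctly classified days
total    <- sum(conf_mat)        # all test days in the table
accuracy <- correct / total
round(accuracy, 4)
```

With these counts, 141 of the 252 days in the table are classified correctly, an accuracy of about 0.56.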

Applying a $50\%$ threshold to the posterior probabilities allows us to recreate the predictions contained in `lda.pred$class`.

In [7]:
sum(lda.pred$posterior[,1] >= .5)

In [8]:
sum(lda.pred$posterior[,1] < .5)

In [9]:
lda.pred$posterior[1:20,1]

In [10]:
lda.class[1:20]

If we wanted to use a posterior probability threshold other than $50\%$ in order to make predictions, then we could easily do so. For instance, suppose that we wish to predict a market decrease only if we are very certain that the market will indeed decrease on that day, say, if the posterior probability is at least $90\%$.

In [11]:
sum(lda.pred$posterior[,1] > .9)

No days in 2005 meet that threshold! In fact, the greatest posterior probability of decrease in all of 2005 was $52.02\%$.
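
The thresholding logic can be written as a small helper. This is only a sketch under assumptions: the function name `classify_by_threshold` and the posterior values in `post_down` are hypothetical, chosen so that the largest value matches the maximum reported above. With real output, `lda.pred$posterior[, 1]` would presumably take the place of `post_down`.

```r
# Predict "Down" only when the posterior probability of a decrease reaches the
# threshold; otherwise predict "Up". (Hypothetical helper, not part of MASS.)
classify_by_threshold <- function(post_down, threshold = 0.5) {
  factor(ifelse(post_down >= threshold, "Down", "Up"), levels = c("Down", "Up"))
}

# Hypothetical posterior probabilities of "Down" (illustrative values only)
post_down <- c(0.49, 0.48, 0.5202, 0.51, 0.47)

table(classify_by_threshold(post_down))                   # default 50% threshold
table(classify_by_threshold(post_down, threshold = 0.9))  # stricter 90% threshold
```

Because no value in `post_down` comes close to 0.9, the stricter rule predicts `Up` for every day, mirroring what happens with the 2005 data.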