---
title: "Chapter 4 - Classification"
author: "Dan"
date: "4 February 2018"
output: html_document
editor_options: 
  chunk_output_type: console
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
```{r}
library(ISLR) # the Smarket data set ships with the ISLR package
data(Smarket)
names(Smarket)
dim(Smarket)
summary(Smarket)
```

```{r}
cor.prob <- function (X, dfr = nrow(X) - 2) {
  R <- cor(X, use="pairwise.complete.obs")
  above <- row(R) < col(R)
  r2 <- R[above]^2
  Fstat <- r2 * dfr/(1 - r2)
  R[above] <- 1 - pf(Fstat, 1, dfr)
  R[row(R) == col(R)] <- NA
  R
}

flattenSquareMatrix <- function(m) {
  if (!is.matrix(m) || nrow(m) != ncol(m)) stop("Must be a square matrix.")
  if(!identical(rownames(m), colnames(m))) stop("Row and column names must be equal.")
  ut <- upper.tri(m)
  data.frame(i = rownames(m)[row(m)[ut]],
             j = rownames(m)[col(m)[ut]],
             cor=t(m)[ut],
             p=m[ut])
}

mydata <- Smarket[-9]
corr <- flattenSquareMatrix(cor.prob(mydata))

library(dplyr)

corr %>%
  arrange(p)

## Year and Volume are the only pair with a p-value < 0.05; as we would expect, the other variables show no significant correlation ##

library(ggplot2)

ggplot(data = mydata, aes(x=Year, y=Volume, color = Year)) + geom_point() + geom_smooth(method="lm", se=0) + geom_jitter(height=0, width=0.05)


```
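
The p-values returned by `cor.prob()` come from an F statistic on 1 and n - 2 degrees of freedom, which should agree with the two-sided t-test that `cor.test()` runs for a Pearson correlation. A minimal check of that equivalence, assuming the built-in `mtcars` data as a stand-in (it is not part of this chapter's analysis):

```{r}
# Illustrative check on built-in data; x and y are arbitrary stand-in columns
x <- mtcars$mpg
y <- mtcars$qsec
r <- cor(x, y)
dfr <- length(x) - 2                              # same default as cor.prob()
p_from_f <- 1 - pf(r^2 * dfr / (1 - r^2), 1, dfr) # formula used inside cor.prob()
p_from_t <- cor.test(x, y)$p.value                # two-sided Pearson t-test
all.equal(p_from_f, p_from_t)
```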
```{r}
## Let's see if we can predict the direction of movement given the changes over the previous 5 days

glm_fit <- glm(Direction ~ Lag1+Lag2+Lag3+Lag4+Lag5+Volume, data=Smarket, family=binomial)

summary(glm_fit)

## Doesn't look like any of these predictors have any significance.

coef(glm_fit)
glm_probs <- predict(glm_fit, type="response")
attach(Smarket)
contrasts(Direction)

## In order to make a prediction, we need to turn these probabilities into the class labels "Up" and "Down"

library(broom)
glm_probs <- augment(glm_fit, type.predict = "response")

library(dplyr)
glm_probs %>%
  mutate(direction_hat = round(.fitted)) %>%
  select (Direction, direction_hat) %>%
  table() ## Creates a confusion matrix

145+507 # the diagonal entries of the confusion matrix are the correct predictions
correct_predictions_pct <- 652/1250
correct_predictions_pct
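
## A hedged sketch: rather than typing the counts by hand, accuracy can be read off any confusion matrix as the diagonal sum over the total.
## The toy vectors below are made up for illustration and are not the Smarket results.
toy_truth <- factor(c("Up", "Down", "Up", "Up", "Down"), levels = c("Down", "Up"))
toy_pred  <- factor(c("Up", "Up", "Up", "Down", "Down"), levels = c("Down", "Up"))
toy_cm <- table(toy_pred, toy_truth)            # rows = predicted, columns = actual
toy_accuracy <- sum(diag(toy_cm)) / sum(toy_cm) # 3 of the 5 toy predictions match
toy_accuracy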

## At first glance, it appears that the logistic regression model is working a little better than random guessing. However, this result is misleading because we trained and tested the model on the same set of 1250 observations.

## 100-52.2 = 47.8% training error rate. As we have seen previously, the training error rate is often overly optimistic: it tends to underestimate the test error rate.

## To better assess the accuracy of the logistic regression model in this setting, we can fit the model using part of the data, then examine how well it predicts on the held out data. 

train <- (Year<2005) # 1250 element boolean vector
Smarket.2005 <- Smarket[!train,] # 252 x 9 data frame of the 2005 observations
dim(Smarket.2005)
Direction.2005 <- Direction[!train] # true 2005 directions, used as the test labels

glm.fits <- glm(Direction~Lag1+Lag2+Lag3+Lag4+Lag5+Volume, data = Smarket, family = binomial, subset = train) # fit the model with the training set

glm.probs <- predict(glm.fits, type = "response", newdata = Smarket.2005) # add new data to see how well we predict


glm.pred <- rep("Down",252) # 252 element vector of "Down"
glm.pred[glm.probs>0.5] = "Up" # Change the indexes where glm.probs >0.5 to "Up"
table(glm.pred,Direction.2005) # tabulate the two vectors of equal length

# (77+44)/252 = 0.48 # Correct predictions
# 1-0.48 = 0.52 = test error rate, worse than chance

# We recall that the logistic regression model had very underwhelming p-values
# associated with all of the predictors, and that the smallest p-value,
# though not very small, corresponded to Lag1. Perhaps by removing the
# variables that appear not to be helpful in predicting Direction, we can
# obtain a more effective model. After all, using predictors that have no
# relationship with the response tends to cause a deterioration in the test
# error rate (since such predictors cause an increase in variance without a
# corresponding decrease in bias), and so removing such predictors may in
# turn yield an improvement. Below we have refit the logistic regression using
# just Lag1 and Lag2, which seemed to have the highest predictive power in
# the original logistic regression model.

glm.fits <- glm(Direction~Lag1+Lag2, data=Smarket, family = binomial, subset = train)

glm.probs <- predict(glm.fits, Smarket.2005, type = "response")

glm.pred <- rep("Down", 252)
glm.pred[glm.probs > 0.5] <- "Up"
table(glm.pred,Direction.2005)

mean(glm.pred!=Direction.2005) # Test error rate
  # (35+106)/252 = correct 56% of the time
 # Remember that if we just guessed increase every day we will also be correct 56% of the time! Hence, in terms of the overall error rate, the logistic regression model is no better than the naive approach. 

# 106/(106+76) = how often the model's predictions that the market would increase were accurate = 58%

# This suggests a possible trading strategy of buying on days where the model predicts an increasing market, and avoiding trades on days when a decrease is predicted. Of course, one would need to investigate more carefully whether this small improvement was real or just due to random chance.
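
# A sketch of the arithmetic above, rebuilding the 2005 confusion matrix from the counts quoted in these comments
# (35, 106 and 76, with the fourth cell implied by the 252 test days). The object names are illustrative only.
sketch_cm <- matrix(c(35, 76, 252 - 35 - 76 - 106, 106), nrow = 2,
                    dimnames = list(predicted = c("Down", "Up"), actual = c("Down", "Up")))
sketch_accuracy <- sum(diag(sketch_cm)) / sum(sketch_cm)              # overall share of correct predictions
sketch_up_precision <- sketch_cm["Up", "Up"] / sum(sketch_cm["Up", ]) # share of predicted "Up" days that actually rose
round(c(accuracy = sketch_accuracy, up_precision = sketch_up_precision), 3)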

# Suppose that we want to predict the returns associated with particular values of Lag1 and Lag2. In particular, we want to predict Direction on a day when Lag1 and Lag2 equal 1.2 and 1.1 respectively, and on a day when they equal 1.5 and -0.8. We can do this with the predict() function

predict(glm.fits, newdata=data.frame(Lag1=c(1.2, 1.5), Lag2=c(1.1, -0.8)), type="response") # The model predicts that for these numbers of Lag1 and Lag2, the market direction will be down in both cases.

## Linear Discriminant Analysis ##

# lda() function is part of the MASS package

library(MASS)
lda.fit <- lda(Direction~Lag1+Lag2, data=Smarket, subset=train)

# Prior probabilities = pi1 and pi2 values = 0.492 and 0.508
# Group means are the mu values, the mean value of each predictor within each class. Shows that on days where the market goes Down, the previous two days tended to be positive, and when the market goes Up, the previous two days tended to be negative

# The coefficients of linear discriminants output provides the linear combination of Lag1 and Lag2 that are used to form the LDA decision rule. In other words, these are the multipliers of the elements X=x in (4.19)

# If -0.642 x Lag1 - 0.514 x Lag2 is large, then the LDA classifier will predict a market increase, and if it is small, then the LDA classifier will predict a decline.

plot(lda.fit) # plots the linear discriminants, obtained by computing -0.642 x Lag1 - 0.514 x Lag2 for each of the training observations

lda.pred <- predict(lda.fit, newdata = Smarket.2005) # Returns a list with three elements. The first, class, is a factor vector of Ups and Downs. The second, posterior, is a 252x2 matrix whose kth column holds the posterior probability that the corresponding observation belongs to the kth class, computed from (4.10). Finally, x contains the linear discriminants described above.

lda.class <- lda.pred$class
table(lda.class, Direction.2005)

mean(lda.class==Direction.2005) # = 56% correct, LDA and logistic regression predictions are almost identical

# Applying a 50% threshold to the posterior probabilities allows us to re-create the predictions contained in lda.pred$class

sum(lda.pred$posterior[,1]>=.5) # number of test days whose posterior probability of Down (column 1 of the posterior matrix) is at least 50%, i.e. days predicted Down = 70

sum(lda.pred$posterior[,1] < 0.5) # 252 minus above

# If we wanted to use a posterior probability threshold other than 50% in order to make predictions, then we could easily do so. For instance, suppose that we wish to predict a market decrease only if we are very certain that the market will indeed decrease on that day, say, if the posterior probability is at least 90%

sum(lda.pred$posterior[,1]>0.9) # =0. No days met the threshold
max(lda.pred$posterior[,1]) # max posterior probability was 52.02%
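
# A minimal sketch of thresholding, using a made-up vector of posterior probabilities of "Down"
# (these are not the values stored in lda.pred; the toy_ names are illustrative).
toy_post_down <- c(0.52, 0.48, 0.91, 0.30, 0.50)
toy_class_50 <- ifelse(toy_post_down >= 0.5, "Down", "Up") # default 50% rule
toy_flag_90  <- toy_post_down > 0.9                        # stricter rule: call "Down" only above 90%
sum(toy_class_50 == "Down") # 3 of the toy days are called Down at 50%
sum(toy_flag_90)            # only 1 toy day clears the 90% threshold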

## Quadratic Discriminant Analysis ##

#Same syntax as LDA, also in the MASS package

library(MASS)
qda.fit <- qda(Direction~Lag1+Lag2, data=Smarket, subset=train)
qda.class <- predict(qda.fit, newdata = Smarket.2005)$class
table(qda.class, Direction.2005)
mean(qda.class == Direction.2005) # 59.92% test success rate!

# Suggests that the quadratic form assumed by QDA might capture the true relationship more accurately than the linear forms assumed by LDA and logistic regression. However, we recommend evaluating this method's performance on a larger set before betting that this approach will consistently beat the market!

## K-Nearest Neighbours ##

# knn () function from the 'class' library

# This function works differently from the other model fitting functions. Rather than a two step approach of fitting the model then using the model to make predictions, knn() forms predictions using a single command. The function requires 4 inputs

# 1. A matrix containing the predictors associated with the training data, labelled train.X below

# 2. A matrix containing the predictors associated with the data for which we wish to make predictions

# 3. A vector containing the class labels for the training observations, labelled train.Direction below

# 4. A value for K, the number of nearest neighbours to be used by the classifier

library(class)

train.X <- cbind(Lag1,Lag2)[train,] # cbind Lag1 and Lag2, keeping the rows where the boolean vector 'train' is TRUE

test.X <- cbind(Lag1,Lag2)[!train,] # cbind Lag1 and Lag2, keeping the rows where the boolean vector 'train' is FALSE

train.Direction <- Direction[train] # vector containing class labels for the training observations

## KNN uses random numbers to break ties if two points are exactly the same distance away, so set.seed(1) to make results replicable

set.seed(1)
knn.pred <- knn(train.X, test.X, train.Direction, k=1) # returns a vector of predictions, hence uninterpretable for inference
table(knn.pred, Direction.2005)

mean(knn.pred != Direction.2005) # Test error rate of 50%

knn.pred2 <- knn(train.X, test.X, train.Direction, k=3)
table(knn.pred2, Direction.2005)
mean(knn.pred2 != Direction.2005) # Test error rate of 46.42%

# No improvement on further increasing K. It appears that for this data, QDA provides the best results of the methods that we have examined so far.
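
# A hedged sketch of comparing several values of K in one pass. It uses the built-in iris data and an
# arbitrary 100-row training sample, so the error rates are illustrative and unrelated to Smarket.
library(class)
set.seed(1)
sketch_idx <- sample(nrow(iris), 100) # row indices of the illustrative training sample
sketch_ks <- c(1, 3, 5, 10)
sketch_err <- sapply(sketch_ks, function(k) {
  pred <- knn(iris[sketch_idx, 1:4], iris[-sketch_idx, 1:4], iris$Species[sketch_idx], k = k)
  mean(pred != iris$Species[-sketch_idx]) # held-out error rate for this K
})
names(sketch_err) <- paste0("k=", sketch_ks)
sketch_err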

## Applying KNN to the Caravan data set from the ISLR library

library(ISLR)
data(Caravan)
attach(Caravan)
dim(Caravan)
?Caravan

summary(Purchase)

# Response variable is Purchase; only 348 of 5822 customers bought insurance, a rate of 6%

# Because the KNN classifier predicts the class of a given test observation by identifying the observations that are nearest to it, the scale of the variables matters. Any variables that are on a large scale will have a much larger effect on the distance between observations, and hence on the KNN classifier, than variables that are on a small scale.

# For instance, imagine a data set that contains two variables, Salary and Age (measured in dollars and years, respectively). As far as KNN is concerned, a difference of $1000 in salary is enormous compared to a difference of 50 years in age. Consequently, Salary will drive the KNN classification results, and Age will have almost no effect.

# This is contrary to our intuition that a salary difference of $1000 is quite small compared to an age difference of 50 years. Furthermore, the importance of scale to the KNN classifier leads to another issue: if we measured Salary in Japanese Yen, or if we measured Age in minutes, then we'd get quite different classification results from what we get if these two variables are measured in dollars and years.

# A good way to handle this problem is to standardize the data so that all variables are given a mean of zero and a standard deviation of one. Then all variables will be on a comparable scale. The scale() function does just this. In standardizing the data, we exclude column 86 because it is qualitative.

standardized.X <- scale(Caravan[,-86])
var(Caravan[,1]) # variance of column 1 before standardizing
var(Caravan[,2]) # variance of column 2
var(standardized.X[,1]) # after standardizing
var(standardized.X[,2])
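
# A small sketch of what scale() does by default: subtract each column mean, then divide by each column's
# sample standard deviation. The salary/age values are invented for illustration.
toy_X <- cbind(salary = c(30000, 45000, 52000, 61000), age = c(25, 40, 33, 58))
toy_centered <- sweep(toy_X, 2, colMeans(toy_X))                 # subtract column means
toy_manual <- sweep(toy_centered, 2, apply(toy_X, 2, sd), "/")   # divide by column standard deviations
all.equal(toy_manual, scale(toy_X), check.attributes = FALSE)    # compares values, ignoring scale()'s extra attributes
apply(toy_manual, 2, sd) # both columns now have standard deviation 1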

#Create a test set of the first 1,000 observations, the rest will be the training set

test <- 1:1000 # Numeric vector of 1 to 1,000
train.X <- standardized.X[-test,] 
test.X <- standardized.X[test,]
train.Y <- Purchase[-test]
test.Y <- Purchase[test]
set.seed(1)
knn.pred <- knn(train.X, test.X, train.Y, k=1)

mean(test.Y != knn.pred) # test error rate is 11.8% but we could get down to an error rate of just 6% by predicting "No" for every observation.

# Suppose that there is some non-trivial cost to trying to sell insurance to a given individual. For instance, perhaps a salesperson must visit each potential customer. If the company tries to sell insurance to a random selection of people, then the success rate will be only 6%, which may be far too low given the costs involved. Instead, the company would like to try to sell insurance only to customers who are likely to buy it. So the overall error rate is not of interest. Instead, the fraction of individuals who buy insurance GIVEN THAT THE MODEL PREDICTS THAT THEY WILL is the number of interest.

table(knn.pred, test.Y)

# 9 people bought insurance out of the predicted 77, for a success rate of 11.7%, which is a big improvement on the 6% from just approaching random people

knn.pred2 <- knn(train.X, test.X, train.Y, k=3)
table(knn.pred2,test.Y) # at K=3 we're up to 19.2%

knn.pred3 <- knn(train.X, test.X, train.Y, k=5)
table(knn.pred3,test.Y) # at K=5 we're up to 26.7%. This is over four times the rate of random guessing. It appears that KNN is finding some real patterns in a difficult data set!

# As a comparison, let's fit a logistic regression model to the data set

glm.fit <- glm(Purchase ~. , data=Caravan, family=binomial, subset=-test) # subset takes row indices of the data frame given in data, so -test keeps only the training rows
glm.probs <- predict(glm.fit, newdata = Caravan[test,], type="response")
glm.pred <- rep("No", 1000)
glm.pred[glm.probs > 0.25] = "Yes" # All values over 0.25 probability are predicted yes to Purchase
table(glm.pred,test.Y) # 11/33 = 33.3% success rate, over 5 times better than guessing!


## Exercises Q5 ##

# a) If the Bayes decision boundary is linear, we expect LDA to perform better on the test set, because QDA's extra flexibility adds variance without any corresponding decrease in bias. On the training set, QDA will fit at least as closely because it is the more flexible model.

# b) If the Bayes decision boundary is non-linear, then QDA will do better on the test set, since LDA would have high bias. On the training set, QDA will also do better.

# c) As sample size n increases, we expect the test prediction accuracy of QDA relative to LDA to improve, because QDA will suffer less from over-fitting when it has lots of observations.

# d) False. With a linear decision boundary, QDA's extra flexibility adds variance with no improvement in bias when compared to LDA. QDA might achieve a better error rate on the training set, but only by fitting noise, so we expect its test error rate to be no better than LDA's.

# 6a) Logistic Regression

exp(-6+0.05*40+1*3.5)/(1+exp(-6+0.05*40+1*3.5)) # 0.38

# 6b) log(P(X)/(1-P(X))) = B0 + B1X1 + B2X2
#     (log(0.5 / (1-0.5)) + 6 - 3.5)/0.05 = X1 = 50
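
# A sketch of the same two calculations with the built-in logistic helpers plogis() and qlogis().
# The names b0, b1 and b2 are labels chosen here for the coefficients -6, 0.05 and 1 used above.
b0 <- -6; b1 <- 0.05; b2 <- 1
sketch_p6a <- plogis(b0 + b1 * 40 + b2 * 3.5)          # inverse logit of the linear predictor, about 0.38
sketch_x6b <- (qlogis(0.5) - b0 - b2 * 3.5) / b1       # solve the log-odds equation for the first predictor
c(probability = sketch_p6a, first_predictor = sketch_x6b)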

# 7) 

# mu1 = dividend YES group = 10
# mu2 = dividend NO group = 0
# pi1 = dividend YES prior probability = 0.8
# pi2 = dividend NO prior probability = 0.2
# var is shared and = 36

(0.8*exp(-1/(2*36)*(4-10)^2))/(0.8*exp(-1/(2*36)*(4-10)^2)+(1-0.8)*exp(-1/(2*36)*(4-0)^2)) # = 75.2%
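
# A sketch of the same Bayes calculation written with dnorm(); because both classes share sd = 6,
# the normalising constants cancel and the result matches the hand-expanded formula above.
sketch_num <- 0.8 * dnorm(4, mean = 10, sd = 6)             # prior x density for the YES group
sketch_den <- sketch_num + 0.2 * dnorm(4, mean = 0, sd = 6) # add prior x density for the NO group
sketch_posterior <- sketch_num / sketch_den
sketch_posterior # about 0.752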

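# 8) Prefer logistic regression. With K=1 each training observation is its own nearest neighbour, so the KNN training error rate is 0%. An average error rate of 18% over the training and test sets therefore implies a KNN test error rate of 2 x 18% = 36%, which is worse than the 30% test error rate of logistic regression.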

# 9a) x/(1-x) = 0.37
#     x = 0.37 - 0.37x # add 0.37x to both sides
#     1.37x = 0.37
#     x = 0.37/1.37 = 0.27 = 27%

# b) odds = p(x)/(1-p(x)) = 0.16/(1-0.16) = 0.19
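
# A sketch of the two conversions as small helper functions; the function names are made up for this note.
odds_to_prob <- function(odds) odds / (1 + odds) # rearranged from odds = p/(1-p)
prob_to_odds <- function(p) p / (1 - p)
odds_to_prob(0.37) # about 0.27
prob_to_odds(0.16) # about 0.19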

## 10a)

data(Weekly)
?Weekly

mydata <- Weekly[-9]
cor(mydata)
cormaster <- flattenSquareMatrix(cor.prob(mydata))
cormaster %>%
  arrange(p)

attach(Weekly)
plot(Year, Volume)
plot(Lag2, Volume)
pairs(Weekly)

# Volume and Year seem to be closely correlated

#b) 

glm.fit <- glm(Direction ~ Lag1+Lag2+Lag3+Lag4+Lag5+Volume, data = Weekly, family=binomial)

summary(glm.fit)

## Lag2 appears to be statistically significant

#c)

glm.probs <- predict(glm.fit, type = "response") # with no newdata argument, predict() returns fitted probabilities for the training data

glm.preds <- rep("Down", 1089)
glm.preds[glm.probs > 0.5] <- "Up"
table(glm.preds, Direction)

# Correct predictions = (557+54)/1089 = 56.1%

#d)

test <- (Year>2008)
dim(Weekly[!test,])

training.set <- Weekly[!test,]
test.set <- Weekly[test,]
glm.train <- glm(Direction ~ Lag2, data=training.set, family=binomial)
glm.probs <- predict(glm.train, newdata = test.set, type="response")

glm.preds <- rep("Down", 104)
glm.preds[glm.probs > 0.5] <- "Up"
table(glm.preds, Direction[test])

# Correct predictions = 65/104 = 62.5%

#e)

lda.train <- lda(Direction ~ Lag2, data=training.set)
lda.probs <- predict(lda.train, newdata=test.set)$class # predict outputs a list of 3 elements, by putting $class on the end we just get a vector of predictions to throw straight into a table
table(lda.probs, Direction[test])

# Correct predictions = 65/104 = 62.5%

#f)

qda.fit <- qda(Direction ~ Lag2, data=training.set)
qda.probs <- predict(qda.fit, newdata = test.set)$class
table(qda.probs, Direction[test])



# Correct predictions = 61/104 = 58.7%

#g)

set.seed(1)
train.X <- as.matrix(Lag2[!test]) # matrix of the X values in the training set, if only one predictor we have to use as.matrix
test.X <- as.matrix(Lag2[test])
train.Y <- Direction[!test]

knn.fit <- knn(train.X, test.X, train.Y, k=1)

table(knn.fit, Direction[test])

# Correct predictions = 52/104 =  50%

#h) Logistic regression and LDA give the best test accuracy of the methods above, both at 62.5%.

#i)

knn.fit2 <- knn(train.X, test.X, train.Y, k=3)
table(knn.fit2, Direction[test])

# Correct predictions = 58/104 = 55.8%

knn.fit10 <- knn(train.X, test.X, train.Y, k=10)
table(knn.fit10, Direction[test]) # Correct 59/104 = 56.7%

knn.fit15 <- knn(train.X, test.X, train.Y, k=15)
table(knn.fit15, Direction[test]) # Correct 61/104 = 58.7%

knn.fit20 <- knn(train.X, test.X, train.Y, k=20)
table(knn.fit20, Direction[test]) # Correct 61/104 = 58.7%

attach(Weekly)

test.set <- Year>2008

training.set <- Weekly[!test.set,]
test.set <- Weekly[test.set,]



lda.fit2 <- lda(Direction ~ Lag2 + I(Lag2^2), data = training.set)
lda.pred2 <- predict(lda.fit2, newdata = test.set)$class
table(lda.pred2, test.set$Direction)

# Adding a quadratic term (Lag2^2) = 64/104 = 61.5%

#11)

data(Auto)
myAuto <- Auto
dim(Auto)
myAuto
myAuto2 <-  mutate(myAuto, mpg01 = as.numeric(mpg > median(mpg)))
  
str(myAuto2)
cormaster <- flattenSquareMatrix(cor.prob(myAuto2[-9]))

cormaster %>%
  filter(j=="mpg01")

pairs(myAuto2[-9])

# mpg01 is highly correlated with every variable apart from acceleration

myAuto3 <- myAuto2[-9]

#Splitting into training set and test set

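test <- seq(301:392) # caution: seq() of a vector returns 1:length, so this is 1:92 (the first 92 rows); use 301:392 to hold out the last 92 rows instead
mydf2 <- myAuto2[-9] # drop the name column; myAuto2 already holds mpg01 (mydf is not defined until later in this file)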

training.set <- mydf2[-test,]
test.set <- mydf2[test,]

lda.fit <- lda(mpg01 ~ displacement+horsepower+weight+acceleration, data = training.set)
lda.preds <- predict(lda.fit, newdata = test.set)
table(lda.preds$class, mydf2$mpg01[test])
mean(lda.preds$class != mydf2$mpg01[test])

# Test error rate = 14.1%

qda.fit <- qda(mpg01 ~ displacement+horsepower+weight+acceleration, data = training.set)
qda.preds <- predict(qda.fit, newdata = test.set) 
table(qda.preds$class, mydf2$mpg01[test])
mean(qda.preds$class != mydf2$mpg01[test])

# Test Error Rate = 8.7%

glm.fit <- glm(mpg01 ~ displacement+horsepower+weight+acceleration, data = training.set, family=binomial)
glm.probs <- predict(glm.fit, newdata = test.set, type = "response")
glm.preds <- rep("0", 92)
glm.preds[glm.probs > 0.5] = "1"
table(glm.preds, mydf2$mpg01[test])
mean(glm.preds != mydf2$mpg01[test])

# Test error rate = 13%

standardized.X <- scale(mydf2[-9])

train.X <- standardized.X[-test,]
test.X <- as.matrix(standardized.X[test,])
train.Y <- mydf2$mpg01[-test]
test.Y <- mydf2$mpg01[test]

knn.fit <- knn(train.X, test.X, train.Y, k=1)
knn.fit <- knn(train.X, test.X, train.Y, k=5)
knn.fit <- knn(train.X, test.X, train.Y, k=10)
knn.fit <- knn(train.X, test.X, train.Y, k=6)

mean(test.Y!=knn.fit)

# K=1 Test Error rate of 9.8%
# K=5 Test Error rate of 6.5%
# K=10 Test Error rate of 6.5%
# K=30 Test Error rate of 8.7%
# K=6 Test Error rate of 5.4%

####################################

#11 his way

mpg01 <- ifelse(Auto$mpg > median(Auto$mpg), 1, 0)
mydf <- data.frame(Auto, mpg01)

pairs (mydf)

#12)

Power <- function(x) {
  y <- 2^x
  print(y)
}

Power(3)

Power2 <- function(x, a) {
  y <- x^a
  print(y)
}

Power2(3,8)
Power2(10,3)
Power2(8,17)
Power2(131,3)

Power3 <- function(x, a) {
  result <- x^a
  return(plot(result))
}

Power3(1:10,2)

#13

data(Boston)
crimmed <- ifelse(Boston$crim > median(Boston$crim), 1, 0)

mydf <- data.frame(Boston, crimmed)

str(mydf)

pairs(mydf)

#correlated with zn, indus, nox, rm, dis, ptratio, lstat

train <- 1:406
test <- 407:nrow(mydf)

training.set <- mydf[train,]
test.set <- mydf[-train,]

glm.fit <- glm(crimmed ~ zn+indus+nox+rm+dis+ptratio+lstat, data=training.set, family = "binomial")
glm.pred <- predict(glm.fit, newdata = test.set, type="response")
glm.probs <- rep("0", 100)
glm.probs[glm.pred >= 0.5] <- "1"
table(glm.probs, mydf$crimmed[test])
mean(glm.probs != mydf$crimmed[test])

# Test error rate of 19%
# Removing rm increases test error rate to 20%

summary(glm.fit)

# Let's remove lstat and zn

glm.fit2 <- glm(crimmed ~ indus+nox+rm+dis+ptratio, data=training.set, family = "binomial")
glm.pred2 <- predict(glm.fit2, newdata = test.set, type="response") # predict from the reduced model, not the original glm.fit
glm.probs2 <- rep("0", 100)
glm.probs2[glm.pred2 >= 0.5] <- "1"
table(glm.probs2, mydf$crimmed[test])
mean(glm.probs2 != mydf$crimmed[test])
summary(glm.fit2)

# Remove dis

glm.fit2 <- glm(crimmed ~ indus+nox+rm+ptratio, data=training.set, family = "binomial")
glm.pred2 <- predict(glm.fit2, newdata = test.set, type="response") # predict from the reduced model, not the original glm.fit
glm.probs2 <- rep("0", 100)
glm.probs2[glm.pred2 >= 0.5] <- "1"
table(glm.probs2, mydf$crimmed[test])
mean(glm.probs2 != mydf$crimmed[test])
summary(glm.fit2)
plot(glm.fit2)

## LDA

lda.fit <- lda(crimmed ~ indus+nox+rm+ptratio, data=training.set)
lda.pred <- predict(lda.fit, newdata = test.set)
table(lda.pred$class, mydf$crimmed[test])
mean(lda.pred$class != mydf$crimmed[test])

# Test error rate of 16%

## QDA

qda.fit <- qda(crimmed ~ indus+nox+rm+ptratio, data=training.set)
qda.pred <- predict(qda.fit, newdata = test.set)
table(qda.pred$class, mydf$crimmed[test])
mean(qda.pred$class != mydf$crimmed[test])

# Test error rate of 18%

## KNN

standardized.X <- scale(mydf[-15])

train.X <- standardized.X[-test,]
test.X <- standardized.X[test,]
train.Y <- mydf$crimmed[-test]
test.Y <- mydf$crimmed[test]

knn.fit1 <- knn(train.X, test.X, train.Y, k=1)
table(knn.fit1, test.Y)
mean(knn.fit1 != test.Y)

# Test error rate = 12%

knn.fit5 <- knn(train.X, test.X, train.Y, k=5)
table(knn.fit5, test.Y)
mean(knn.fit5 != test.Y)

# Test error rate = 9% is as low as we can do

crim01 <- ifelse(Boston$crim > median(Boston$crim), 1, 0)
mydf <- data.frame(Boston, crim01)
pairs(mydf)
sort(cor(mydf)[1,])

cormaster <- flattenSquareMatrix(cor.prob(mydf))
cormaster %>%
  filter (i=="crim") %>%
  arrange (p)

```


