Seren Dances 5/24/2022
The working Problem:
Using the data of gas vapor, found on page 183 of the Linear Models in
Statistics 2nd Edition (Rencher), and with the assumption that where
. When
gasoline is pumped into the tank of a car, vapors are vented into the
atmosphere. An experiment was conducted to determine whether y, the
amount of vapor, can be predicted using the following four variables
based on initial conditions of the tank and the dispensed gasoline:
= tank temperature
(!^\circ F)
= gasoline temperature
(!^\circ F)
= vapor pressure in tank (psi)
= vapor pressure of gasoline (psi)
Loading Data…
X <- read.csv('443dancesdata.csv')
X <- unlist(X)
X <- array(X, dim = c(32,5))
y = X[,1]The data set where the first col is y and the rest are the x values:
y x1 x2 x3 x4
[1,] 29 33 53 3.32 3.42
[2,] 24 31 36 3.10 3.26
[3,] 26 33 51 3.18 3.18
[4,] 22 37 51 3.39 3.08
[5,] 27 36 54 3.20 3.41
[6,] 21 35 35 3.03 3.03
[7,] 33 59 56 4.78 4.57
[8,] 34 60 60 4.72 4.72
[9,] 32 59 60 4.60 4.41
[10,] 34 60 60 4.53 4.53
[11,] 20 34 35 2.90 2.95
[12,] 36 60 59 4.40 4.36
[13,] 34 60 62 4.31 4.42
[14,] 23 60 36 4.27 3.94
[15,] 24 62 38 4.41 3.49
[16,] 32 62 61 4.39 4.39
[17,] 40 90 64 7.32 6.70
[18,] 46 90 60 7.32 7.20
[19,] 55 92 92 7.45 7.45
[20,] 52 91 92 7.27 7.26
[21,] 29 61 62 3.91 4.08
[22,] 22 59 42 3.75 3.45
[23,] 31 88 65 6.48 5.80
[24,] 45 91 89 6.70 6.60
[25,] 37 63 62 4.30 4.30
[26,] 37 60 61 4.02 4.10
[27,] 33 60 62 4.02 3.89
[28,] 27 59 62 3.98 4.02
[29,] 34 59 62 4.39 4.53
[30,] 19 37 35 2.75 2.64
[31,] 16 35 35 2.59 2.59
[32,] 22 37 37 2.73 2.59
Creating the design matrix.
design = X[,c(2:5)]
j = rep(1,32)
#unrestricted design matrix
design <- cbind(j, design)
#restricted design matrix
design.R <- cbind(design[,c(1:2)],design[,5])To find the coefficient estimates for the unrestricted
. Find
.
#UR for unrestricted
coef.UR <- solve(t(design) %*% design) %*% t(design) %*% y
coef.UR [,1]
j 1.01501756
x1 -0.02860886
x2 0.21581693
x3 -4.32005167
x4 8.97488928
To find the coefficient estimates for the restricted model, Find
.
coef.R <- solve(t(design.R) %*% design.R) %*% t(design.R) %*% y
coef.R [,1]
j 4.4730173
x1 -0.1284081
7.8838074
To find the F statistic use the following formula:
#degrees of freedom
h = 2
df = (32-4-1)
#F statistic
numerator = ((t(coef.UR) %*% t(design) %*% y) - (t(coef.R) %*% t(design.R) %*% y))/h
denomenator = (((t(y) %*% y)-(t(coef.UR) %*% t(design) %*% y))/ df)
F = numerator/denomenator
#Critical value
critF = qf(.95,h,df)The F statistic is:
[1] 12.42426
And the critical value is:
[1] 3.354131
Based on the F statistic, reject
. Conclude that the selected values have a significant effect on
the dependent variable y.
To find the 95% CI for
, we must first find some values.
SSE = (t(y) %*% y - (t(coef.UR) %*% t(design) %*%y))
s = sqrt(SSE/(32-5-1))XXinv = solve((t(X) %*% X))
XXinv y x1 x2 x3 x4
y 0.0049153593 0.0001614946 -0.0011527478 0.019561241 -0.042570540
x1 0.0001614946 0.0010985764 -0.0002604604 -0.020579143 0.008512408
x2 -0.0011527478 -0.0002604604 0.0007278156 0.006481214 -0.004138046
x3 0.0195612413 -0.0205791434 0.0064812143 1.116142039 -1.087828015
x4 -0.0425705398 0.0085124083 -0.0041380464 -1.087828015 1.355635986
So .From the coefficients of the unrestricted model
we can find
.
print(coef.UR[3,]) x2
0.2158169
tval = pt(0.025,27)
UpperCI = (coef.UR[3,]) + tval*s*sqrt(0.0007278156)
LowerCI = (coef.UR[3,]) - tval*s*sqrt(0.0007278156)So the 95% CI for
is, (0.1775489, 0.254085).
- Construct a 90% CI for the mean amount of vapor when the tank
temperature is
F, the fasoline temperature is
F, the vapor pressure in tank is 5.5 psi, and the vapor pressure in gasoline is 5.2 psi.
xNaught = c(1,85,78,5.5,5.2)Then the expected value of y at
:
expectedValueY <- (t(xNaught) %*% coef.UR)Now to build the CI:
#t-value
tval = pt(0.05,27)
#preforming matrix multiplication
m <- t(xNaught) %*% XXinv %*% xNaught
#calculating bounds
nintyUpper = expectedValueY + tval*s*sqrt(m)
nintyLower = expectedValueY - tval*s*sqrt(m)So the 90% CI for the mean using the given parameter values of (1,85,78,5.5,5.2) is (34.44082, 42.21143). The interval found represents the theorized range of the mean amount of vapor at the given parameters.
- Under the same conditions as the previous step, find the 90% prediction interval for the amount of vapor.
Many of the same variables are used as in the previous step.
predUpper = expectedValueY + tval*s*sqrt((1+m))
predLower = expectedValueY - tval*s*sqrt((1+m))The result of the 90% PI for amount of vapor is (34.18048, 42.47177). This interval is very similar to the CI found in the last step however it is slightly wider.
- Interpretation of the prediction interval vs the confidence interval:
As stated in part d, the prediction interval is wider. This makes sense
as it is because the
value is smaller than 1, and therefore the addition of 1
under the square root sign increaded the magnitude of the interval.