TASKS:
Perform multiple linear regression analysis to identify which variables in the dataset predict the mpg of MechaCar prototypes
Collect summary statistics on the pounds per square inch (PSI) of the suspension coils from the manufacturing lots
Run t-tests to determine if the manufacturing lots are statistically different from the population mean
Design a statistical study to compare vehicle performance of the MechaCar vehicles against vehicles from other manufacturers. For each statistical analysis, you’ll write a summary interpretation of the findings.
- Which variables/coefficients provided a non-random amount of variance to the mpg values in the dataset?
The length of the vehicle and its ground clearance provide a non-random amount of variance to the mpg values, based on their p-values.
- ground clearance: p ≈ 0 < .05
- vehicle length: p ≈ 0 < .05
- Is the slope of the linear model considered to be zero?
This model's p-value is far below the typical significance level of .05, so the null hypothesis can be rejected: the slope of the linear model is not zero.
- Does this linear model predict mpg of MechaCar prototypes effectively?
Yes and no. The model scores .7149, or roughly 71% prediction efficiency, which still leaves a lot of breathing room; it comes down to how 'effective' we need it to be.
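To make the regression answers above more concrete, here is a minimal sketch of a multiple linear regression fit and its r-squared value, written in plain Python. It is not the analysis that produced the numbers in this write-up: the helper names (`solve_linear_system`, `fit_ols`, `r_squared`) and the toy values are assumptions made up for illustration, and the p-values for each coefficient are left out.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b] so that rows are reduced together.
    m = [list(map(float, a[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        # Pick the largest remaining pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, y):
    """Return [intercept, b1, b2, ...] from the normal equations (X'X) b = X'y."""
    x = [[1.0] + list(r) for r in rows]  # leading 1.0 is the intercept column
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def r_squared(rows, y, coefs):
    """Share of the variance in y that the fitted model accounts for."""
    predicted = [coefs[0] + sum(c * v for c, v in zip(coefs[1:], r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical (vehicle_length, ground_clearance) rows and mpg values,
# invented for illustration only; this is not the MechaCar dataset.
toy_rows = [(14.0, 12.0), (15.0, 10.0), (16.0, 14.0),
            (13.0, 11.0), (17.0, 13.0), (15.5, 12.5)]
toy_mpg = [30.1, 28.4, 35.2, 27.0, 36.9, 32.5]

toy_coefs = fit_ols(toy_rows, toy_mpg)
print("coefficients:", toy_coefs)
print("r-squared:", r_squared(toy_rows, toy_mpg, toy_coefs))
```

An r-squared close to 1 means the predictors account for most of the variation in mpg; a value near .71, as reported above, leaves a noticeable share unexplained.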
All lots:
Individual lots:
The design specifications for the MechaCar suspension coils dictate that the variance of the suspension coils must not exceed 100 pounds per square inch. Does the current manufacturing data meet this design specification for all manufacturing lots in total and each lot individually?
Total manufacturing variance sits well within the 100 PSI limit at 62 PSI. However, looking at the individual lots, 'lot 3' sits at a variance of 170. As a whole the summary statistics appear to show an acceptable PSI range, but examining the lots individually shows that not all of them meet the design specification.
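The comparison above, overall variance versus per-lot variance against the 100 PSI limit, can be sketched as follows. This is a minimal illustration using the sample variance from Python's `statistics` module; the function name `variance_report`, the lot labels, and the readings are hypothetical and are not the manufacturing data.

```python
import statistics

# Hypothetical PSI readings for illustration only; not the MechaCar data.
psi_by_lot = {
    "Lot1": [1498, 1500, 1502, 1499, 1501],
    "Lot2": [1495, 1500, 1505, 1498, 1502],
    "Lot3": [1480, 1500, 1520, 1490, 1510],
}


def variance_report(readings_by_lot, limit=100.0):
    """Map each lot (plus "all") to (sample variance, meets_limit)."""
    report = {}
    all_readings = []
    for lot, readings in readings_by_lot.items():
        all_readings.extend(readings)
        var = statistics.variance(readings)  # sample variance (n - 1 denominator)
        report[lot] = (var, var <= limit)
    total = statistics.variance(all_readings)
    report["all"] = (total, total <= limit)
    return report


for name, (var, ok) in variance_report(psi_by_lot).items():
    print(f"{name}: variance={var:.1f} meets_spec={ok}")
```

Checking each lot separately matters because a single high-variance lot can be masked when all readings are pooled.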
Based on the p-value, one can assume that the lots as a whole fall within the normal range. (.60 > .05)
Similarly, lot 1 falls into the same category with a p-value of 1. (1 > .05)
Lot 2 is the same, with little difference in distribution: its p-value of .60 is above the .05 threshold we're comparing against. (.60 > .05)
Lot 3 has a p-value below our .05 threshold, so one can conclude that this lot is abnormal; interestingly, though, the mean still rests in the 95 percent confidence interval. (.04 < .05)
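For reference, the one-sample t-test behind p-values like those above can be sketched in plain Python. This is an illustrative sketch, not the code used for this analysis: the function names, the population mean of 1500, and the readings are assumptions, and the two-sided p-value is obtained by numerically integrating the Student t density with Simpson's rule rather than by calling a statistics library.

```python
import math
import statistics


def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value: 1 minus the area between -|t| and +|t|."""
    upper = abs(t_stat)
    if upper == 0.0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half_area = total * h / 3  # area between 0 and |t|
    return max(0.0, 1.0 - 2.0 * half_area)


def one_sample_t_test(sample, population_mean):
    """Return (t statistic, degrees of freedom, two-sided p-value)."""
    n = len(sample)
    std_err = statistics.stdev(sample) / math.sqrt(n)
    t_stat = (statistics.mean(sample) - population_mean) / std_err
    return t_stat, n - 1, two_sided_p(t_stat, n - 1)


# Hypothetical lot readings and population mean, for illustration only.
lot_readings = [1496, 1503, 1489, 1510, 1498, 1492, 1505, 1487]
t_stat, df, p_value = one_sample_t_test(lot_readings, population_mean=1500)
print(f"t={t_stat:.3f} df={df} p={p_value:.3f}")
print("differs from population mean" if p_value < 0.05 else "no significant difference")
```

A p-value above .05 means the lot's mean is not statistically distinguishable from the population mean at that significance level; a p-value below .05 means it is.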
Write a short description of a statistical study that can quantify how the MechaCar performs against the competition. In your study design, think critically about what metrics would be of interest to a consumer:
Metrics I'd consider are horsepower and safety ratings. A null hypothesis could be that the mean safety rating relative to horsepower is one star higher in the competitors' lineup.
A multiple linear regression can show the relationship between horsepower and safety ratings across the companies' product lines and test whether they correlate. A minimum of 30 sample records from each company, containing cost, horsepower, and safety ratings, would be needed to run the analysis.
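As a first step toward such a study, the sketch below checks the 30-record minimum per company and computes a Pearson correlation between horsepower and safety rating for each one. It is only an outline under assumptions: the record fields (`manufacturer`, `horsepower`, `safety_rating`), the function names, and the company name in the usage example are hypothetical, and a full study would go on to fit the regression itself.

```python
import math
from collections import defaultdict


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_by_manufacturer(records, min_samples=30):
    """Map manufacturer -> correlation, or None if it has too few records."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec["manufacturer"]].append(rec)
    results = {}
    for maker, recs in grouped.items():
        if len(recs) < min_samples:
            results[maker] = None  # not enough data to analyse this company
            continue
        hp = [r["horsepower"] for r in recs]
        safety = [r["safety_rating"] for r in recs]
        results[maker] = pearson_r(hp, safety)
    return results
```

For example, `correlation_by_manufacturer(records)` on a list with only ten records for a company called "ExampleMotors" would return `{"ExampleMotors": None}`, flagging that more data is needed before drawing conclusions.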