MechaCar_Statistical_Analysis

Overview

This analysis reviews MechaCar production data for insights that may help the manufacturing team overcome the production troubles that are blocking its progress.

Linear Regression to Predict MPG

Which variables/coefficients provided a non-random amount of variance to the mpg values in the dataset?

Vehicle length and vehicle ground clearance provided a non-random amount of variance to the mpg values in the dataset. Their t-values are relatively far from zero, which indicates that a relationship exists and that these two variables have a significant impact on mpg.

Is the slope of the linear model considered to be zero? Why or why not?

The p-value of 5.35e-11 is much smaller than the assumed significance level of 0.05. This is sufficient evidence to reject the null hypothesis, so the slope of the linear model is not zero.

Does this linear model predict mpg of MechaCar prototypes effectively? Why or why not?

This linear model has an R-squared of 0.7149, which means that approximately 71% of the variability in mpg is explained by the model. The model therefore predicts the mpg of MechaCar prototypes effectively.
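
To make the R-squared interpretation concrete, here is a minimal Python sketch of how the statistic is computed for a fitted line. It uses a single predictor and made-up numbers, not the MechaCar dataset or the original analysis script; the names `fit_line`, `r_squared`, `length`, and `mpg` are illustrative assumptions.

```python
# Hypothetical data for illustration only; not the MechaCar dataset.

def fit_line(x, y):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(observed, predicted):
    """Return R^2 = 1 - SS_res / SS_tot for paired observations and predictions."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


length = [12.0, 13.5, 14.0, 15.5, 16.0, 17.5]  # hypothetical predictor values
mpg = [38.0, 41.0, 47.0, 46.0, 52.0, 55.0]     # hypothetical response values

slope, intercept = fit_line(length, mpg)
predicted = [intercept + slope * v for v in length]
print(f"slope={slope:.3f}, R^2={r_squared(mpg, predicted):.3f}")
```

An R-squared of 1 means the fitted values reproduce the observations exactly, while 0 means the model does no better than always predicting the mean.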

Summary Statistics on Suspension Coils

Total manufacturing lot

Individual manufacturing lot

The design specifications for the MechaCar suspension coils dictate that the variance of the suspension coils must not exceed 100 pounds per square inch. Does the current manufacturing data meet this design specification for all manufacturing lots in total and each lot individually? Why or why not?

The variance for the entire manufacturing lot is 62.293, which meets the design specification. The variance is 0.9795 for Lot1 and 7.4693 for Lot2, so these two lots also meet the specification. However, the variance for Lot3 is 170.286, which exceeds the limit of 100 pounds per square inch, so the Lot3 suspension coils may not be accepted.
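
The check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `readings` list is invented, and the helper names `variance_by_lot` and `meets_spec` are hypothetical. Only the limit of 100 comes from the design specification.

```python
from collections import defaultdict
from statistics import variance

MAX_VARIANCE = 100.0  # limit stated in the design specification

# Hypothetical (lot, PSI) readings for illustration; not the real production data.
readings = [
    ("Lot1", 1499.0), ("Lot1", 1500.0), ("Lot1", 1501.0), ("Lot1", 1500.0),
    ("Lot2", 1497.0), ("Lot2", 1503.0), ("Lot2", 1500.0), ("Lot2", 1501.0),
    ("Lot3", 1480.0), ("Lot3", 1515.0), ("Lot3", 1492.0), ("Lot3", 1508.0),
]


def variance_by_lot(rows):
    """Group PSI values by lot and return the sample variance of each group."""
    groups = defaultdict(list)
    for lot, psi in rows:
        groups[lot].append(psi)
    return {lot: variance(values) for lot, values in groups.items()}


def meets_spec(var, limit=MAX_VARIANCE):
    """True when the variance does not exceed the limit."""
    return var <= limit


total_var = variance([psi for _, psi in readings])
print(f"All lots: variance={total_var:.2f}, within spec={meets_spec(total_var)}")
for lot, var in sorted(variance_by_lot(readings).items()):
    print(f"{lot}: variance={var:.2f}, within spec={meets_spec(var)}")
```

`statistics.variance` uses the n - 1 denominator (sample variance). As in the real data, a pooled variance can sit under the limit while one lot on its own exceeds it, which is why each lot is checked separately.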

T-Tests on Suspension Coils

All lots

A t-test can be used to compare the mean of a sample to a population mean, or to test for a difference between two means. The null hypothesis assumes that there is no meaningful difference between the two means. The t-test therefore checks whether the data provide enough evidence to reject the null hypothesis.

Here the sample mean is 1498.78 with a p-value of 0.06. Because this is higher than the common significance level of 0.05, there is NOT enough evidence to reject the null hypothesis. That is to say, the mean across all three manufacturing lots is not statistically different from the presumed population mean of 1500.

Individual lots

Lot 1 has a sample mean of 1500 with a p-value of 1, so there is not enough evidence to reject the null hypothesis: there is no statistical difference between the observed sample mean and the presumed population mean.
Lot 2 has a sample mean of 1500.02 with a p-value of 0.61, so the null hypothesis cannot be rejected, and the sample mean is not statistically different from the population mean of 1500.
Lot 3, however, has a sample mean of 1496.14 and a p-value of 0.04, which is lower than the common significance level of 0.05. The null hypothesis is therefore rejected: there is a statistical difference between the sample mean and the population mean. The suspension coils from this lot need to be inspected to remove those not meeting quality criteria.
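
For readers who want to see the mechanics of the one-sample t-test used above, here is a minimal standard-library Python sketch. It is not the original analysis script: the `sample` values are invented, and the p-value is obtained by numerically integrating the t density with Simpson's rule, a simplification chosen to avoid external dependencies. The function names are hypothetical.

```python
import math
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value, integrating the density from 0 to |t| (Simpson's rule)."""
    upper = abs(t_stat)
    if upper == 0.0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * (h / 3) * total  # area between -|t| and +|t|
    return max(0.0, 1.0 - central_area)


def one_sample_t_test(sample, mu):
    """Return (t statistic, two-sided p-value) for H0: population mean == mu."""
    n = len(sample)
    t_stat = (mean(sample) - mu) / (stdev(sample) / math.sqrt(n))
    return t_stat, two_sided_p(t_stat, n - 1)


# Hypothetical PSI readings for illustration; not the real production data.
sample = [1498.0, 1502.0, 1495.0, 1501.0, 1499.0, 1497.0, 1503.0, 1496.0]
t_stat, p_value = one_sample_t_test(sample, mu=1500.0)
print(f"t={t_stat:.3f}, p={p_value:.3f}, reject H0 at 0.05: {p_value < 0.05}")
```

A sample whose mean equals the presumed population mean exactly gives t = 0 and a p-value of 1, which is the situation reported for Lot 1.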

Study Design: MechaCar vs Competition

A statistical study can quantify how the MechaCar performs against the competition.

What metric or metrics are you going to test?

Selling price - Dependent variable

Highway fuel efficiency - Independent variable

Maintenance cost - Independent variable

MPG - Independent variable

Safety rating - Independent variable

What is the null hypothesis or alternative hypothesis?

Null Hypothesis (Ho): MechaCar's selling price is consistent with its performance on the key factors.

Alternative Hypothesis (Ha): MechaCar's selling price is NOT consistent with its performance on the key factors.

What statistical test would you use to test the hypothesis? And why?

Multiple linear regression, because it models one dependent variable as a function of several independent variables and helps identify the factors most strongly related to selling price.

What data is needed to run the statistical test?

The test requires data collected in a statistically valid manner, either through an experiment or through observations made using probability sampling methods. For this study, that means the metrics listed above for the MechaCar and for competing vehicles. For the test to be valid, the sample size needs to be large enough to approximate the true distribution of the population being studied.
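
As a rough illustration of the multiple linear regression proposed in this study design, the sketch below fits selling price against several predictors by solving the normal equations in plain Python. Everything in it is an assumption made for illustration: the rows, the prices, and the helper names `solve_linear_system` and `fit_multiple_regression` are invented. In practice a statistics library, or R's `lm()`, would normally be used, and would also report the p-values needed to test the hypotheses.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; no unique solution")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def fit_multiple_regression(predictors, response):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    design = [[1.0] + list(row) for row in predictors]
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(design, response)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical rows: (fuel efficiency in mpg, annual maintenance cost, safety rating).
predictors = [
    (30.0, 500.0, 4.0), (35.0, 450.0, 5.0), (28.0, 650.0, 3.0),
    (40.0, 400.0, 5.0), (25.0, 700.0, 4.0), (33.0, 550.0, 3.0),
]
# Hypothetical selling prices, one per row above.
prices = [24000.0, 29500.0, 20500.0, 32000.0, 21000.0, 23500.0]

coefs = fit_multiple_regression(predictors, prices)
print("intercept and slopes:", [round(c, 3) for c in coefs])
```

Each slope estimates how selling price changes with one metric while the others are held fixed, which is what makes the approach suitable for comparing vehicles across several factors at once.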
