wengsengh/Exploratory_Data_Analysis

Exploratory-Data-Analysis

This project was completed as part of the Udacity Data Analyst Nanodegree program requirements.

Project Overview

In this project, I use R and apply exploratory data analysis techniques to explore relationships ranging from one variable to multiple variables, and to examine a selected data set for distributions, outliers, and anomalies. I chose the White Wine Quality Data Set for this project.
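As a rough illustration of the kind of first-pass checks described above, the sketch below summarizes one variable and flags candidate outliers with the common 1.5 × IQR rule. The file name `wineQualityWhites.csv`, the column name `alcohol`, and the helper `iqr_outliers` are assumptions made for illustration, not details taken from the analysis itself.

```r
# Flag values that fall outside the 1.5 * IQR fences (Tukey's rule).
# Hypothetical helper, shown only as an example of an outlier check.
iqr_outliers <- function(x) {
  q <- quantile(x, c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  fence <- 1.5 * (q[2] - q[1])
  x < q[1] - fence | x > q[2] + fence
}

# Assumed file name; replace it with the path to your copy of the data.
csv_path <- "wineQualityWhites.csv"
if (file.exists(csv_path)) {
  wine <- read.csv(csv_path)
  str(wine)                                  # variable names and types
  print(summary(wine$alcohol))               # one-variable summary (assumed column)
  print(table(iqr_outliers(wine$alcohol)))   # count of flagged values
}
```

The outlier rule is only one possible choice; flagged points are candidates to inspect, not values to drop automatically.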

What do I need to install?

You will need to install R. After installing R, download and install RStudio. Finally, you will need to install a few packages. We recommend opening RStudio and installing the following packages from the R console.

  • install.packages("ggplot2", dependencies = TRUE)
  • install.packages("knitr", dependencies = TRUE)
  • install.packages("dplyr", dependencies = TRUE)
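If you rerun the setup often, a small helper can skip packages that are already installed. This is a minimal sketch, not part of the original instructions; `missing_packages` is a hypothetical helper name, and the install step is guarded so that it only runs in an interactive session.

```r
required <- c("ggplot2", "knitr", "dplyr")

# Return the subset of pkgs that cannot be loaded, i.e. is not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Install only what is missing, and only when run interactively.
if (interactive()) {
  to_install <- missing_packages(required)
  if (length(to_install) > 0) {
    install.packages(to_install, dependencies = TRUE)
  }
}
```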

Project Details

Document your Analysis

  1. A stream-of-consciousness analysis and exploration of the data.

a. Headings and text should organize your thoughts and reflect your analysis as you explored the data.

b. Plots in this analysis do not need to be polished with labels, units, and titles; these plots are exploratory (quick and dirty). They should, however, be of the appropriate type and effectively convey the information you glean from them.

c. You can iterate on a plot in the same R chunk, but you don’t need to show every plot iteration in your analysis.

  2. A section at the end called “Final Plots and Summary”

You will select three plots from your analysis to polish and share in this section. The three plots should show different trends and should be polished with appropriate labels, units, and titles (see the Project Rubric for more information).

  3. A final section called “Reflection”

This should contain a few sentences about your struggles, successes, and ideas for future exploration on the data set (see the Project Rubric for more information).
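To make the difference between an exploratory plot and a plot for “Final Plots and Summary” concrete, here is a hedged sketch using ggplot2. The column names `quality` and `alcohol`, the unit shown on the y-axis, the file name, and both function names are assumptions for illustration; check them against your copy of the data set.

```r
# Quick exploratory version: default labels, enough to see the pattern.
quick_plot <- function(df) {
  ggplot2::ggplot(df, ggplot2::aes(x = factor(quality), y = alcohol)) +
    ggplot2::geom_boxplot()
}

# Polished version of the same plot, with a title, axis labels, and units.
polished_plot <- function(df) {
  quick_plot(df) +
    ggplot2::labs(
      title = "Alcohol content by quality rating",
      x = "Quality (score)",
      y = "Alcohol (% by volume)"
    )
}

# Assumed file name; replace it with the path to your copy of the data.
csv_path <- "wineQualityWhites.csv"
if (file.exists(csv_path)) {
  wine <- read.csv(csv_path)
  print(polished_plot(wine))
}
```

Building the polished plot on top of the quick one keeps the two versions consistent while you iterate.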

  • The RMD file containing the analysis (final plots and summary, and reflection)
  • The HTML file knitted from the RMD file using the knitr package
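The HTML file can also be produced from the R console rather than with the Knit button in RStudio. The sketch below assumes the rmarkdown package is installed (it calls knitr to run the R chunks); the wrapper `knit_report` and the file name in the usage comment are hypothetical.

```r
# Render an R Markdown file to HTML; stops early if the file is missing.
knit_report <- function(rmd_path) {
  stopifnot(file.exists(rmd_path))
  rmarkdown::render(rmd_path, output_format = "html_document")
}

# Example call with a placeholder file name:
# knit_report("analysis.Rmd")
```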