Udacity Data Analyst Nanodegree Project 6 - Exploring a Bicycle Company Sales Transactions using R

latinacode/Explore-and-Summarize-Data

Explore and Summarize Data with R

Udacity Data Analyst Nanodegree, December 2017 - May 2018. Project 6: Explore and Summarize Data using R.

Profitability Analysis of a Bicycle Company.

Click here to see the preview

Project Overview

In this project, I use R and exploratory data analysis techniques to explore relationships among one or more variables and to examine a selected data set for distributions, outliers, and anomalies.

What do I need to install?

R and RStudio. Then install a few packages from the R console:

install.packages("ggplot2", dependencies = TRUE)
install.packages("knitr", dependencies = TRUE)
install.packages("dplyr", dependencies = TRUE)

Why this Project?

Exploratory Data Analysis (EDA) is the numerical and graphical examination of data characteristics and relationships before formal, rigorous statistical analyses are applied.

What will I learn?

After completing the project, I will:

  • Understand the distribution of a variable and check for anomalies and outliers
  • Learn how to quantify and visualize individual variables within a data set by using appropriate plots such as scatter plots, histograms, bar charts, and box plots
  • Explore variables to identify the most important variables and relationships within a data set before building predictive models; calculate correlations, and investigate conditional means
  • Learn powerful methods and visualizations for examining relationships among multiple variables, such as reshaping data frames and using aesthetics like color and shape to uncover more information
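The techniques in the list above can be sketched in a few lines of R. This is only an illustration under stated assumptions: the data frame `sales` and its columns `category`, `quantity`, and `revenue` are synthetic and hypothetical, not the project's actual data set, and the plots are unpolished on purpose.

```r
library(ggplot2)
library(dplyr)

# Synthetic, hypothetical data standing in for a real sales table.
set.seed(42)
n_rows <- 200
sales <- data.frame(
  category = sample(c("Bikes", "Accessories"), n_rows, replace = TRUE),
  quantity = rpois(n_rows, lambda = 5) + 1
)
unit_price <- ifelse(sales$category == "Bikes", 1500, 40)
sales$revenue <- sales$quantity * unit_price + rnorm(n_rows, sd = 50)

# One variable: numeric summary and a histogram of its distribution.
summary(sales$quantity)
p_hist <- ggplot(sales, aes(x = quantity)) +
  geom_histogram(binwidth = 1)

# Box plots per group make outliers easy to spot.
p_box <- ggplot(sales, aes(x = category, y = revenue)) +
  geom_boxplot()

# Two variables: Pearson correlation between quantity and revenue.
r <- cor(sales$quantity, sales$revenue)

# Conditional means: average revenue within each category.
cond_means <- sales %>%
  group_by(category) %>%
  summarise(mean_revenue = mean(revenue), orders = n())

# Several variables: color encodes a third variable on a scatter plot.
p_scatter <- ggplot(sales, aes(x = quantity, y = revenue, color = category)) +
  geom_point(alpha = 0.6)

print(cond_means)
```

Printing any of the plot objects (for example `print(p_scatter)`) renders it. In an RMD file each plot would typically sit in its own chunk.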

Why is this Important to my Career?

In this project, you learn skills to frame and present data. Data, by itself, is "ubiquitous and cheap," says Google's Chief Economist and UC Berkeley professor Hal Varian. What you do as a data analyst is take that data and turn it into insights.

Project Details

Introduction

For the final project, you will conduct your own exploratory data analysis and create an RMD file that explores the variables, structure, patterns, oddities, and underlying relationships of a data set of your choice.

The analysis should be almost like a stream-of-consciousness as you ask questions, create visualizations, and explore your data.

Step One - Choose your Data Set

The data was obtained through the SAP University Alliance for study purposes. The data contains sales transactions of a bicycle company called GBI. The bicycle company is managed by SAP ERP.
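A first look at a data set like this usually starts with loading it and checking its structure. The sketch below assumes the data is available as a CSV export. The column names (`order_id`, `product`, `quantity`, `revenue`) and the values are hypothetical placeholders, and a temporary file stands in for the real data file so the example runs on its own.

```r
# Hypothetical rows; the real GBI export will have its own columns and values.
sample_rows <- data.frame(
  order_id = 1:6,
  product  = c("Road Bike", "Helmet", "Road Bike",
               "Touring Bike", "Helmet", "Touring Bike"),
  quantity = c(2, 10, 1, 3, 25, 2),
  revenue  = c(5200, 350, 2600, 9000, 875, 6000),
  stringsAsFactors = FALSE
)

# Write to a temporary CSV to mimic an exported file, then read it back.
csv_path <- tempfile(fileext = ".csv")
write.csv(sample_rows, csv_path, row.names = FALSE)
gbi <- read.csv(csv_path, stringsAsFactors = FALSE)

str(gbi)                            # column types and a preview of the values
summary(gbi)                        # basic summaries for each column
missing_per_column <- colSums(is.na(gbi))  # count of missing values per column
print(missing_per_column)
```

With the actual data file, only the path passed to `read.csv()` would change.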

Step Two - Get Organized

This repository contains:

  1. The RMD file that contains the analysis, final plots and summary, and reflection (in that order).
  2. The HTML file knitted from the RMD file.
  3. The data set.

Step Three - Explore your Data

This means keeping track of the thoughts as I go (in an RMD file).

Step Four - Document your Analysis

The file should be formatted in markdown and should contain (in order):

  1. A stream-of-consciousness analysis and exploration of the data.

a. Headings and text that reflect the analysis as I explored the data.

b. Plots in this analysis do not need to be polished.

c. I can iterate on a plot in the same R chunk.

  2. A section at the end called “Final Plots and Summary”

I will select three plots from the analysis to polish and share in this section. The three plots should show different trends and should be polished with appropriate labels, units, and titles.

  3. A final section called “Reflection”

This contains a few sentences about the struggles, successes, and ideas for future exploration on the data set.
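For the “Final Plots and Summary” section, a polished plot carries a title, axis labels, and units. The following is a minimal sketch with ggplot2, not one of the project's actual plots: the `monthly` data frame, its values, and the USD currency are assumptions made up for illustration.

```r
library(ggplot2)

# Hypothetical monthly revenue figures (illustrative only, currency assumed).
monthly <- data.frame(
  month = factor(month.abb[1:6], levels = month.abb[1:6]),
  revenue_usd = c(120, 135, 160, 210, 260, 240) * 1000
)

# Title, axis labels, and units are set explicitly with labs().
final_plot <- ggplot(monthly, aes(x = month, y = revenue_usd / 1000)) +
  geom_col(fill = "steelblue") +
  labs(
    title = "Monthly Revenue (illustrative data)",
    x = "Month",
    y = "Revenue (thousand USD)"
  ) +
  theme_minimal()

print(final_plot)
```

Storing the month as an ordered factor keeps the bars in calendar order rather than alphabetical order.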
