# Preliminary Findings and Hypotheses Sprint Journal for [Author]

## Part 1 (Week 1): Pair EDA and Concept Demonstration

### 1. Exploratory Data Analysis Concepts to use or test on this project
List the concepts and their definitions, and explain why you are interested in them or think they will be useful. These are the concepts you will illustrate in the sections below. We recommend selecting 3-5, but choose as many as you think you can explore comprehensively.

### 2. Professionally Relevant Datasets
Find a dataset that would be useful for you to become more familiar with professionally. You can identify more than one (a maximum of 3 is recommended):
1. Facebook campaign data
2. AdWords data
3. Email marketing data
4. Website traffic data

### 3. EDA Steps for your partner to explore your dataset(s)
The first part of this sprint project is to trade datasets with a partner. The idea of the exercise is to think about what might be interesting about your dataset and to create an EDA plan to explore it. On the first pass, however, your partner carries out your EDA plan and reports back to you with preliminary findings.

Create a new Jupyter notebook (Python or R) that loads your dataset. Include cells that describe the steps you want your partner to take, empty cells for the code that completes each step, and empty cells for the preliminary findings.

### 4. Carrying out your Partner's EDA Steps
Complete the EDA process notebook your partner created.

### 5. Applying your EDA Concepts (step 1) to your partner's dataset
Use your partner's dataset as a basis to explore the concepts for this week.

## Part 2 (Week 2): EDA of your Dataset
Walk through a polished Exploratory Data Analysis exercise in your personal repo. This can include the steps that you laid out originally, any new steps and new ideas you have, or any new techniques you developed from the concepts in (5.) above. 

Your final deliverable repo should include either
- a README and a script file, or
- a notebook

that illustrates your analysis and leads to preliminary findings and hypotheses about the dataset for future investigation or more thorough analysis.

In [7]:
# Load up necessary packages
library(readr)
require(devtools)
install_github("Displayr/flipTime")
library(flipTime)

# Import data file
aalii_facebook <- read_csv("Aalii-Campaigns-Dec-15-2017-Feb-4-2018.csv")

# Clean up data set
aalii_facebook$reporting_start <- AsDate(aalii_facebook$reporting_start)
aalii_facebook$reporting_end <- AsDate(aalii_facebook$reporting_end)
aalii_facebook$end_date <- AsDate(aalii_facebook$end_date)
head(aalii_facebook)

Skipping install of 'flipTime' from a github remote, the SHA1 (2888936a) has not changed since last install.
  Use `force = TRUE` to force installation
Parsed with column specification:
cols(
  reporting_start = col_character(),
  reporting_end = col_character(),
  campaign_name = col_character(),
  delivery = col_character(),
  results = col_integer(),
  result_indicator = col_character(),
  reach = col_integer(),
  impressions = col_integer(),
  cost_per_result = col_double(),
  amount_spent = col_double(),
  end_date = col_character(),
  frequency = col_double(),
  link_clicks = col_integer(),
  button_clicks = col_integer(),
  comments = col_integer(),
  reactions = col_integer(),
  shares = col_integer(),
  engagement = col_integer()
)


reporting_start,reporting_end,campaign_name,delivery,results,result_indicator,reach,impressions,cost_per_result,amount_spent,end_date,frequency,link_clicks,button_clicks,comments,reactions,shares,engagement
2018-02-04,2018-02-04,20171215-20180103 Lead Gen Turnkey,completed,,,0,0,,0,2018-01-07,0,,0,,,,
2018-02-04,2018-02-04,20171221-20171231 'A'ali'i Blog Retargeting,completed,,,0,0,,0,2017-12-31,0,,0,,,,
2018-02-04,2018-02-04,20180109-20180111 Lead Gen Turnkey,completed,,,0,0,,0,2018-01-11,0,,0,,,,
2018-02-04,2018-02-04,"Post: ""Names have power. Learn more about the origins of...""",inactive,,,0,0,,0,2018-01-15,0,,0,,,,
2018-02-04,2018-02-04,"Post: ""Shaken, not stirred: Bar Leather Apron provides...""",completed,,,0,0,,0,2018-01-08,0,,0,,,,
2018-02-04,2018-02-04,"Post: ""Catch a movie, take a free yoga class or meet up...""",completed,,,0,0,,0,2017-12-31,0,,0,,,,


In [None]:
# The Basics
# How many variables/columns?
ncol(aalii_facebook)

# What are the unique Facebook campaigns in the data set?
unique(aalii_facebook$campaign_name)

# How much money was spent in total over the course of this data set?
sum(aalii_facebook$amount_spent)
# 16601.66

# What was the max amount of money spent?
max(aalii_facebook$amount_spent)
# 644.75

# What was the minimum amount spent?
min(aalii_facebook$amount_spent)
# 0
 
# What was the maximum reach recorded?
max(aalii_facebook$reach)
# 21263

# What was the minimum reach recorded?
min(aalii_facebook$reach)
# 0
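
The `head()` output above shows empty values in columns such as `results` and `engagement`. In R, `sum()`, `max()` and `min()` return `NA` when any input is missing, so the same summaries on those columns would need `na.rm = TRUE`. The following is a minimal sketch of that behaviour; the `toy` data frame and its values are hypothetical and are not taken from the campaign export.

```r
# Hypothetical toy data (NOT from the campaign file): one missing engagement value
toy <- data.frame(
  amount_spent = c(0, 12.5, 30),
  engagement   = c(NA, 40L, 95L)
)

# Without na.rm, a single missing value makes the whole summary NA
total_naive <- sum(toy$engagement)

# With na.rm = TRUE, missing values are dropped before summing
total_clean <- sum(toy$engagement, na.rm = TRUE)

# Count the missing values per column to see how much data is affected
n_missing <- colSums(is.na(toy))

print(total_naive)
print(total_clean)
print(n_missing)
```

Counting missing values first makes it easier to judge whether dropping them is reasonable for a given column.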

In [None]:
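# Practice plots
library(ggplot2)

# Engagement against reach, one point per row
ggplot(aalii_facebook, aes(reach, engagement)) + geom_point()

# Engagement against amount spent
ggplot(aalii_facebook, aes(amount_spent, engagement)) + geom_point()

# Total engagement per reporting day (`fun` replaces the deprecated `fun.y` in ggplot2 >= 3.3.0)
ggplot(aalii_facebook) + geom_bar(aes(x = reporting_start, y = engagement), stat = "summary", fun = "sum", fill = "grey50")

# Trying to repurpose this graph: http://ggplot.yhathq.com/
# In R, aes() takes bare column names; quoted strings are treated as constants.
# The data has no `clicks` column, so link_clicks is used here.
ggplot(aalii_facebook, aes(x = amount_spent, y = link_clicks, color = campaign_name)) +
  geom_point() +
  scale_color_brewer(type = "div", palette = 3) +
  xlab("Amount Spent") + ylab("Link Clicks") + ggtitle("Facebook_Campaigns")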

# Key Concepts

## Covariance (Concept 1)
*Explanation of this concept relative to your data set here*
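
As a starting point for this section, here is a minimal sketch of how sample covariance could be computed in R for two campaign metrics such as `amount_spent` and `engagement`. The `toy` data frame and its values are hypothetical and only stand in for the real columns. The sketch assumes rows with a missing value in either column should be dropped, which may or may not suit the actual data.

```r
# Hypothetical toy data (NOT from the campaign file)
toy <- data.frame(
  amount_spent = c(10, 20, 30, 40, 50),
  engagement   = c(12, 25, 31, 48, NA)
)

# Built-in sample covariance, using only rows where both values are present
cov_builtin <- cov(toy$amount_spent, toy$engagement, use = "complete.obs")

# The same quantity by hand: sum of products of deviations from the mean,
# divided by (n - 1)
complete <- toy[complete.cases(toy), ]
n <- nrow(complete)
cov_manual <- sum(
  (complete$amount_spent - mean(complete$amount_spent)) *
  (complete$engagement   - mean(complete$engagement))
) / (n - 1)

# Correlation rescales covariance by both standard deviations, so it is unit-free
cor_builtin <- cor(toy$amount_spent, toy$engagement, use = "complete.obs")

print(cov_builtin)
print(cov_manual)
print(cor_builtin)
```

A positive covariance here would suggest that spend and engagement tend to rise together, but its size depends on the units of both columns, which is why the correlation is printed alongside it.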

In [1]:
## Code to Illustrate the concept goes here

## Concept 2

## Concept 3

## Concept 4

## Concept 5