Baseball Dataset by Patrick Flynn
========================================================

> **Tip**: You will see quoted sections like this throughout the template to
help you construct your report. Make sure that you remove these notes before
you finish and submit your project!

> **Tip**: One of the requirements of this project is that your code follows
good formatting techniques, including limiting your lines to 80 characters or
fewer. If you're using RStudio, go into Preferences \> Code \> Display to set
up a margin line to help you keep track of this guideline!

```{r echo=FALSE, message=FALSE, warning=FALSE, packages}
# Load all of the packages that you end up using in your analysis in this code
# chunk.

# Notice that the parameter "echo" was set to FALSE for this code chunk. This
# prevents the code from displaying in the knitted HTML output. You should set
# echo=FALSE for all code chunks in your file, unless it makes sense for your
# report to show the code that generated a particular plot. The other
# parameters for "message" and "warning" should also be set to FALSE for other
# code chunks once you have verified that each plot comes out as you want it
# to. This will clean up the flow of your report.

library(ggplot2)
```


```{r echo=FALSE, Load_the_Data}
# Load the Data
baseball <- read.csv('baseball_data.csv')
```

> **Tip**: Before you create any plots, it is a good idea to provide a short
introduction to the dataset that you are planning to explore. Replace this
quoted text with that general information!

```{r echo=FALSE, Inspect_the_Data}
# Players with more than 350 home runs
baseball[baseball$HR > 350, ]

# First three rows, then the dimensions of the data frame
head(baseball, 3)
dim(baseball)
```

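Players with more than 350 home runs (`HR > 350`):

| Row | name | handedness | height | weight | avg | HR |
|-----|------|------------|--------|--------|-----|----|
| 17 | Darrell Evans | L | 74 | 200 | 0.248 | 414 |
| 46 | Dave Kingman | R | 78 | 210 | 0.236 | 442 |
| 204 | Tony Perez | R | 74 | 175 | 0.279 | 379 |
| 296 | Graig Nettles | L | 72 | 180 | 0.248 | 390 |
| 423 | Carl Yastrzemski | L | 71 | 175 | 0.285 | 452 |
| 657 | Reggie Jackson | L | 72 | 195 | 0.262 | 563 |
| 762 | Johnny Bench | R | 73 | 197 | 0.267 | 389 |
| 814 | Lee May | R | 75 | 195 | 0.267 | 354 |
| 969 | Dick Allen | R | 71 | 187 | 0.292 | 351 |
| 973 | Willie Stargell | L | 74 | 188 | 0.282 | 475 |

First three rows (`head(baseball, 3)`):

| name | handedness | height | weight | avg | HR |
|------|------------|--------|--------|-----|----|
| Tom Brown | R | 73 | 170 | 0.0 | 0 |
| Denny Lemaster | R | 73 | 182 | 0.13 | 4 |
| Joe Nolan | L | 71 | 175 | 0.263 | 27 |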


```{r echo=FALSE, Summarize_the_Data}
# Count players by handedness, then summarize every column
table(baseball$handedness)
summary(baseball)
```


      B   L   R 
    104 316 737 

                  name      handedness     height          weight     
     Bobby Mitchell :   2   B:104      Min.   :65.00   Min.   :140.0  
     Dave Roberts   :   2   L:316      1st Qu.:71.00   1st Qu.:175.0  
     Dave Stapleton :   2   R:737      Median :73.00   Median :185.0  
     Jim Wright     :   2              Mean   :72.76   Mean   :184.5  
     Mel Stottlemyre:   2              3rd Qu.:74.00   3rd Qu.:195.0  
     Mike Brown     :   2              Max.   :80.00   Max.   :245.0  
     (Other)        :1145                                             
          avg               HR        
     Min.   :0.0000   Min.   :  0.00  
     1st Qu.:0.1380   1st Qu.:  1.00  
     Median :0.2380   Median : 15.00  
     Mean   :0.1868   Mean   : 45.36  
     3rd Qu.:0.2580   3rd Qu.: 55.00  
     Max.   :0.3280   Max.   :563.00  
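
The handedness counts printed above can also be read as shares of the dataset. The following is a minimal sketch that re-enters the three counts by hand, an assumption made only so the snippet runs without `baseball_data.csv`; the variable names are illustrative and not part of the original analysis.

```r
# Counts copied by hand from the table(baseball$handedness) output
handedness_counts <- c(B = 104, L = 316, R = 737)

# Total number of rows and each group's share in percent
total_players <- sum(handedness_counts)
handedness_share <- round(100 * handedness_counts / total_players, 1)

print(total_players)
print(handedness_share)
```

With the full data frame loaded, `prop.table(table(baseball$handedness))` should give the same proportions without retyping the counts.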
                                  

# Univariate Plots Section

> **Tip**: In this section, you should perform some preliminary exploration of
your dataset. Run some summaries of the data and create univariate plots to
understand the structure of the individual variables in your dataset. Don't
forget to add a comment after each plot or closely-related group of plots!
There should be multiple code chunks and text sections; the first one below is
just to help you get started.

```{r echo=FALSE, Univariate_Plots}

```
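
As one possible starting point for the univariate plots, the sketch below draws a histogram of `HR` with ggplot2. It is illustrative only: `sample_players` is a hypothetical data frame built from the 13 rows printed earlier in this report so that the snippet runs on its own, and the bin width of 50 is an arbitrary first guess. In the report itself the full `baseball` data frame would take its place.

```r
library(ggplot2)

# Illustrative rows only: the 13 players printed earlier in this report.
# Replace sample_players with the full baseball data frame in practice.
sample_players <- data.frame(
  handedness = c("L", "R", "R", "L", "L", "L", "R",
                 "R", "R", "L", "R", "R", "L"),
  HR = c(414, 442, 379, 390, 452, 563, 389,
         354, 351, 475, 0, 4, 27)
)

# Histogram of home runs; binwidth = 50 is an arbitrary starting point
hr_histogram <- ggplot(sample_players, aes(x = HR)) +
  geom_histogram(binwidth = 50, color = "white") +
  labs(title = "Distribution of home runs",
       x = "Home runs (HR)",
       y = "Number of players")
```

Printing `hr_histogram` draws the plot. The same pattern should work for `height`, `weight`, and `avg` by changing the `x` aesthetic and choosing a bin width suited to each variable's range.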

> **Tip**: Make sure that you leave a blank line between the start / end of
each code block and the end / start of your Markdown text so that it is
formatted nicely in the knitted text. Note as well that a line break between
consecutive lines of text is treated as a single space. Make sure you have a
blank line between your paragraphs so that they too are formatted for easy
readability.

# Univariate Analysis

> **Tip**: Now that you've completed your univariate explorations, it's time to
reflect on and summarize what you've found. Use the questions below to help you
gather your observations and add your own if you have other thoughts!

### What is the structure of your dataset?

### What is/are the main feature(s) of interest in your dataset?

### What other features in the dataset do you think will help support your \
investigation into your feature(s) of interest?

### Did you create any new variables from existing variables in the dataset?

### Of the features you investigated, were there any unusual distributions? \
Did you perform any operations on the data to tidy, adjust, or change the form \
of the data? If so, why did you do this?


# Bivariate Plots Section

> **Tip**: Based on what you saw in the univariate plots, what relationships
between variables might be interesting to look at in this section? Don't limit
yourself to relationships between a main output feature and one of the
supporting variables. Try to look at relationships between supporting variables
as well.

```{r echo=FALSE, Bivariate_Plots}

```
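
A relationship between two supporting variables could be explored along the following lines. This is a sketch under stated assumptions: `sample_players` is a hypothetical data frame holding only the 13 rows printed earlier in this report, and the axis labels deliberately omit units because the dataset's units are not stated here. With the real data, `baseball` would replace `sample_players`.

```r
library(ggplot2)

# Illustrative rows only: the 13 players printed earlier in this report.
# Replace sample_players with the full baseball data frame in practice.
sample_players <- data.frame(
  handedness = c("L", "R", "R", "L", "L", "L", "R",
                 "R", "R", "L", "R", "R", "L"),
  height = c(74, 78, 74, 72, 71, 72, 73, 75, 71, 74, 73, 73, 71),
  weight = c(200, 210, 175, 180, 175, 195, 197,
             195, 187, 188, 170, 182, 175)
)

# Pearson correlation between the two supporting variables
height_weight_cor <- cor(sample_players$height, sample_players$weight)

# Scatter plot of weight against height, colored by handedness;
# alpha below 1 helps when many points overlap in the full dataset
height_weight_plot <- ggplot(sample_players,
                             aes(x = height, y = weight,
                                 color = handedness)) +
  geom_point(alpha = 0.6, size = 2) +
  labs(title = "Weight versus height",
       x = "Height", y = "Weight", color = "Handedness")
```

Printing `height_weight_plot` draws the scatter plot, and `height_weight_cor` gives a single number to quote alongside it. On 13 hand-picked rows the value says little; it becomes informative only once the full dataset is used.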

# Bivariate Analysis

> **Tip**: As before, summarize what you found in your bivariate explorations
here. Use the questions below to guide your discussion.

### Talk about some of the relationships you observed in this part of the \
investigation. How did the feature(s) of interest vary with other features in \
the dataset?

### Did you observe any interesting relationships between the other features \
(not the main feature(s) of interest)?

### What was the strongest relationship you found?


# Multivariate Plots Section

> **Tip**: Now it's time to put everything together. Based on what you found in
the bivariate plots section, create a few multivariate plots to investigate
more complex interactions between variables. Make sure that the plots that you
create here are justified by the plots you explored in the previous section. If
you plan on creating any mathematical models, this is the section where you
will do that.

```{r echo=FALSE, Multivariate_Plots}

```

# Multivariate Analysis

### Talk about some of the relationships you observed in this part of the \
investigation. Were there features that strengthened each other in terms of \
looking at your feature(s) of interest?

### Were there any interesting or surprising interactions between features?

### OPTIONAL: Did you create any models with your dataset? Discuss the \
strengths and limitations of your model.

------

# Final Plots and Summary

> **Tip**: You've done a lot of exploration and have built up an understanding
of the structure of and relationships between the variables in your dataset.
Now select three plots from your previous exploration to present here as a
summary of your most interesting findings. Make sure that you have refined your
selected plots for good titling, axis labels (with units), and good aesthetic
choices (e.g. color, transparency). After each plot, justify why you chose it
by describing what it shows.

### Plot One
```{r echo=FALSE, Plot_One}

```

### Description One


### Plot Two
```{r echo=FALSE, Plot_Two}

```

### Description Two


### Plot Three
```{r echo=FALSE, Plot_Three}

```

### Description Three

------

# Reflection

> **Tip**: Here's the final step! Reflect on the exploration you performed and
the insights you found. What were some of the struggles that you went through?
What went well? What was surprising? Make sure you include an insight into
future work that could be done with the dataset.

> **Tip**: Don't forget to remove this and the other **Tip** sections before
saving your final work and knitting the final report!