# DS102 Statistical Programming in R: Lesson Four - Vectors and Sample Statistics

### Table of Contents <a class="anchor" id="DS102L4_toc"></a>

* [Table of Contents](#DS102L4_toc)
    * [Page 1 - Introduction](#DS102L4_page_1)
    * [Page 2 - Creating and Accessing Vectors](#DS102L4_page_2)
    * [Page 3 - Vector Arithmetic and Functions](#DS102L4_page_3)
    * [Page 4 - Logical Variables](#DS102L4_page_4)
    * [Page 5 - Logical Operations on Vectors](#DS102L4_page_5)
    * [Page 6 - Sample Statistics](#DS102L4_page_6)
    * [Page 7 - Key Terms](#DS102L4_page_7)
    * [Page 8 - Hands-On](#DS102L4_page_8)
    * [Page 9 - Solution](#DS102L4_page_9)
    

<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 1 - Introduction<a class="anchor" id="DS102L4_page_1"></a>

[Back to Top](#DS102L4_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">

In [2]:
from IPython.display import VimeoVideo
# Tutorial Video Name: Introduction to R
VimeoVideo('247057914', width=720, height=480)

You recently learned how to do simple computations with individual numbers. This is a useful skill. But data science and statistics deal with samples; a sample is a collection of numbers or some other type of data. Samples consisting of only one number are generally not that useful. So R has several powerful ways to allow you to work with samples. The simplest of these ways is a vector.

Previously, you were briefly introduced to vectors. Vectors are useful in many more ways than you have seen so far: R can do arithmetic directly on vectors, and much of what you did with for loops can be done with vectors instead.

You will also learn how to compute sample statistics for a sample stored in a vector. R has functions to compute most common sample statistics.

In this lesson, you will: 

* Create vectors
* Call functions on vectors
* Learn about logical variables
* Apply logical variables to vectors
* Compute sample statistics on vectors

This lesson will culminate in a Hands-On in which you analyze vector data from Old Faithful, the geyser in Yellowstone.

<div class="panel panel-success">
    <div class="panel-heading">
        <h3 class="panel-title">Additional Info!</h3>
    </div>
    <div class="panel-body">
        <p>Before you start working in R too heavily, it's a good idea to watch <a href="https://vimeo.com/410435344"> this recorded live workshop on how to address common errors in R. It will save you lots of headaches later! </a></p>
    </div>
</div>


<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 2 - Creating and Accessing Vectors<a class="anchor" id="DS102L4_page_2"></a>

[Back to Top](#DS102L4_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">


In [3]:
from IPython.display import VimeoVideo
# Tutorial Video Name: Introduction to R
VimeoVideo('327137068', width=720, height=480)

The transcript for the above topic tutorial video **[is located here](https://repo.exeterlms.com/documents/V2/DataScience/Video-Transcripts/DSO102-L04-pg2tutorial.zip)**.

As you learned previously, a numerical vector is an ordered collection of numbers. Ordered means that there is a first number, a second number, and so on to the last number. Each number in the vector is called an element. You can create a vector of the following measured heights in centimeters of six people: 171, 192, 183, 177, 154, and 176.

```{r}
heights <- c(171, 192, 183, 177, 154, 176)
```

You can access specific elements of the vector ```heights``` as follows:

```{r}
heights[1]
```

[1] 171

This gives you the first element in the vector. You can access the second element of ```heights``` with

```{r}
heights[2]
```

[1] 192

In general, to get a particular element of a vector, you follow the variable name with square brackets containing the index of the element you want; the index of an element is just the number describing its location in the vector. R is "1 indexed," meaning that the indexing numbers start with the number 1, as opposed to the number 0. Later you will encounter other programming languages, like Python, that are "0 indexed," meaning that the indexing numbers start at 0.
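
To make 1-based indexing concrete, here is a minimal sketch that reuses the ```heights``` vector from above. The names ```first```, ```sixth```, and ```pair``` are chosen only for illustration, and passing a vector of indices inside the brackets goes slightly beyond what the text has covered so far.

```{r}
heights <- c(171, 192, 183, 177, 154, 176)

first <- heights[1]         # index 1 is the first element
sixth <- heights[6]         # index 6 is the last of the six elements
pair <- heights[c(1, 3)]    # a vector of indices selects several elements at once

first
sixth
pair
```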

---

## Sequences

As you previously learned, you can make sequences of numbers using the ```:``` operator:

```{r}
1:10
```

[1] 1 2 3 4 5 6 7 8 9 10

You are now at a point where you can explain the mysterious [1] that appears before almost everything R outputs. You will make a long sequence of numbers going backward:

```{r}
100:1
```

[1] 100 99 98 97 96 95 94 93 92 91 90 89 88

[14] 87 86 85 84 83 82 81 80 79 78 77 76 75

[27] 74 73 72 71 70 69 68 67 66 65 64 63 62

[40] 61 60 59 58 57 56 55 54 53 52 51 50 49

[53] 48 47 46 45 44 43 42 41 40 39 38 37 36

[66] 35 34 33 32 31 30 29 28 27 26 25 24 23

[79] 22 21 20 19 18 17 16 15 14 13 12 11 10

[92] 9 8 7 6 5 4 3 2 1

Your output in the Console pane will probably be different than the output above unless your Console pane is the same width as the one in which this output was generated. The command ```100:1``` generates a vector with 100 elements. Element 1 is 100, element 2 is 99, and so on.

The number in the square bracket at the beginning of each line is the index of the first element displayed on that line. So element 14 is 87; you know this because the line that begins with [14] has the first value of 87. Element 15 is 86.

When R displays a single number, it places a [1] in front of it because R actually treats all single numbers as vectors with only one element. So every single number that is displayed is the first element in a vector with only one number.
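
You can check this claim yourself. The short sketch below uses an arbitrary number stored in a variable ```x``` (a name chosen here for illustration) and asks R for its length and its first element; ```is.vector()``` is a base R function that has not been introduced in this lesson.

```{r}
x <- 42

length(x)       # a single number is a vector with one element
x[1]            # its first (and only) element is the number itself
is.vector(x)    # R reports that x is a vector
```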

The ```:``` operator creates sequences of integers. You can also create sequences of arbitrarily spaced values using the ```seq()``` function.

For example, you can create a sequence of numbers starting at 0 and ending at 1 and going up by 0.1 between numbers:

```{r}
seq(0, 1, by = 0.1)
```

[1] 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0

The first argument to ```seq()``` is the starting number of the sequence; the second argument to ```seq()``` is the ending number; ```by = 0.1``` means that each element of the sequence should be 0.1 larger than the previous element.

A sequence with a specified length starting and ending at given values is created with the ```length.out``` argument to ```seq()```. So if you want a sequence starting at 3 and ending at 5 with 7 elements, you would use the following form of ```seq()```:

```{r}
seq(3, 5, length.out = 7)
```

[1] 3.000000 3.333333 3.666667 4.000000 4.333333 4.666667 5.000000
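
The two forms of ```seq()``` are closely related: a sequence with 7 elements has 6 equal gaps, so the spacing is the distance from start to end divided by 6. The following sketch, with variable names chosen here for illustration, works out that spacing and compares the two results. It uses ```all.equal()```, a base R function not covered in this lesson, because it compares decimal numbers with a small tolerance.

```{r}
from <- 3
to <- 5
n <- 7

step <- (to - from) / (n - 1)    # 7 elements means 6 equal gaps
by.version <- seq(from, to, by = step)
length.version <- seq(from, to, length.out = n)

step
all.equal(by.version, length.version)
```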

---

## Review
Below is a quiz to review the recently covered material. Quizzes are _not_ graded.

In [1]:
try:
    from DS_Students import MultipleChoice
    from ipynb.fs.full.DS102Questions import *
except:
    !pip install DS_Students
    from DS_Students import MultipleChoice
    from ipynb.fs.full.DS102Questions import *

In [2]:
try:
    display(L4P2Q1, L4P2Q2, L4P2Q3)
except:
    pass


<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 3 - Vector Arithmetic and Functions<a class="anchor" id="DS102L4_page_3"></a>

[Back to Top](#DS102L4_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">


In [4]:
from IPython.display import VimeoVideo
# Tutorial Video Name: Introduction to R
VimeoVideo('327137118', width=720, height=480)

The transcript for the above topic tutorial video **[is located here](https://repo.exeterlms.com/documents/V2/DataScience/Video-Transcripts/DSO102-L04-pg3tutorial.zip)**.

# Vector Arithmetic

R can do arithmetic on vectors of numbers. As an example, you will use ```heights```, the vector of heights defined previously. To convert the heights in the vector ```heights``` from centimeters to inches, you divide each height by 2.54 (the number of centimeters in an inch). Because the heights are stored in the vector ```heights```, R can do this in a single statement:

```{r}
heights / 2.54
```

[1] 67.32283 75.59055 72.04724 69.68504 60.62992 69.29134

You can verify that the first element in this vector is the quotient of 171 divided by 2.54, the second is the quotient of 192 divided by 2.54, and so on. The expression ```heights / 2.54``` divides each element of ```heights``` by 2.54. You could store the heights in inches in a variable ```h.inch```:

```{r}
h.inch <- heights / 2.54
```

Suppose that each person whose height you measured was wearing shoes at the time of the measurement, and the soles of their shoes had the following different thicknesses (in centimeters): 4, 1, 2, 6, 3, and 5. You can make a vector of shoe sole thicknesses:

```{r}
soles <- c(4, 1, 2, 6, 3, 5)
```

Now you can find the height of each person without their shoes by subtracting the thickness of their shoe soles from their measured height. You do this by subtracting the vector ```soles``` from ```heights```. R computes the difference of each element of ```heights``` and the corresponding element of ```soles```:

```{r}
heights - soles
```

[1] 167 191 181 171 151 171

Arithmetic operations on vectors are done element by element. An operation on two vectors of different sizes is done by recycling the elements of the shorter vector. In the following code that subtracts the vector ```x``` from the vector ```y```, ```x``` has only two elements, while ```y``` has six. The elements of the shorter vector ```x``` will be used three times each while the elements of ```y``` will be used only once:

```{r}
x <- c(1, 2)
y <- c(4, 5, 6, 7, 8, 9)
y - x
```

[1] 3 3 5 5 7 7

<div class="panel panel-danger">
    <div class="panel-heading">
        <h3 class="panel-title">Caution!</h3>
    </div>
    <div class="panel-body">
        <p>Recycling can be a problem if you want each element to be used only once but one vector has missing data! Make sure both vectors are the same length if you want to work element by element.</p>
    </div>
</div>
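
One way to picture recycling is that R repeats the shorter vector until it is as long as the longer one. The sketch below makes that repetition explicit with ```rep()``` and then shows a simple length check you could run before doing arithmetic. ```rep()``` and ```identical()``` are base R functions not covered in this lesson, and the names ```x.repeated``` and ```same.length``` are chosen here for illustration. If the longer length is not a multiple of the shorter one, R still recycles but prints a warning.

```{r}
x <- c(1, 2)
y <- c(4, 5, 6, 7, 8, 9)

# Recycling behaves like repeating the shorter vector out to the longer length
x.repeated <- rep(x, length.out = length(y))
x.repeated
identical(y - x, y - x.repeated)

# A defensive check before element-by-element arithmetic
same.length <- length(x) == length(y)
same.length
```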

---

# Functions on Vectors

Most of the R functions that do computations on numbers can also do computations on vectors, as you saw previously.  As a reminder, if you wanted to compute the square roots of the integers between 1 and 10, you could create a vector of these integers and call ```sqrt()``` with this vector as its argument:

```{r}
n <- 1:10
sqrt(n)
```

[1] 1.000000 1.414214 1.732051 2.000000 2.236068 2.449490 2.645751

[8] 2.828427 3.000000 3.162278

The first number output is the square root of 1, the second is the square root of 2, and so on.
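
To see how a vectorized function replaces a for loop, the following sketch computes the same square roots both ways and compares the results. The names ```roots.loop``` and ```roots.vector``` are chosen here for illustration; ```numeric()```, ```seq_along()```, and ```all.equal()``` are base R functions not covered in this lesson.

```{r}
n <- 1:10

# Loop version: fill a result vector one element at a time
roots.loop <- numeric(length(n))
for (i in seq_along(n)) {
  roots.loop[i] <- sqrt(n[i])
}

# Vectorized version: one call handles every element
roots.vector <- sqrt(n)

all.equal(roots.loop, roots.vector)
```

The vectorized version is shorter and leaves less room for indexing mistakes.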

---

## Review
Below is a quiz to review the recently covered material. Quizzes are _not_ graded.

In [3]:
try:
    display(L4P3Q1, L4P3Q2)
except:
    pass


<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 4 - Logical Variables<a class="anchor" id="DS102L4_page_4"></a>

[Back to Top](#DS102L4_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">


In [5]:
from IPython.display import VimeoVideo
# Tutorial Video Name: Introduction to R
VimeoVideo('327137105', width=720, height=480)

The transcript for the above topic tutorial video **[is located here](https://repo.exeterlms.com/documents/V2/DataScience/Video-Transcripts/DSO102-L04-pg4tutorial.zip)**.

# Logical Variables

In R, logical variables and expressions are often used to select certain components of a vector or data table. This is very helpful when doing exploratory data analysis. In this section, you will learn the basics of logical values and logical operators and apply them to select elements in a vector. 

A logical value is ```TRUE``` or ```FALSE```. You can assign a logical value to a variable:

```{r}
Logic <- TRUE
Logic
```

[1] TRUE

However, it is usually more useful to create logical values using a condition; an example of a condition is ```3 < 5```. Since 3 is less than 5, this condition has a value of TRUE:

```{r}
3 < 5
```

[1] TRUE

If you change the condition to be 3 > 5, you get

```{r}
3 > 5
```

[1] FALSE

A condition uses a logical operator to compare two values; in the examples above, you used the logical operators ```>``` and ```<``` to do the comparison. (R's own documentation calls these comparison operators *relational operators*.)

The logical operators in R are the following:

* **```<``` :** Less than
* **```<=``` :** Less than or equal to
* **```>``` :** Greater than
* **```>=``` :** Greater than or equal to
* **```==``` :** Equal to
* **```!=``` :** Not equal to

<div class="panel panel-info">
    <div class="panel-heading">
        <h3 class="panel-title">Tip!</h3>
    </div>
    <div class="panel-body">
        <p>Since the equals sign ( = ) is used for variable assignment, it cannot be used as a logical operator.  Thus the double equals ( == ) is used instead.  This is common in programming.</p>
    </div>
</div>
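
As a quick illustration, the sketch below applies each of the six operators to the same pair of numbers. The variable names ```a``` and ```b``` are chosen here for illustration.

```{r}
a <- 3
b <- 5

a < b     # TRUE:  3 is less than 5
a <= b    # TRUE:  3 is less than or equal to 5
a > b     # FALSE: 3 is not greater than 5
a >= b    # FALSE: 3 is not greater than or equal to 5
a == b    # FALSE: 3 is not equal to 5
a != b    # TRUE:  3 is not equal to 5
```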

---

## Review
Below is a quiz to review the recently covered material. Quizzes are _not_ graded.

In [4]:
try:
    display(L4P4Q1, L4P4Q2, L4P4Q3)
except:
    pass


<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 5 - Logical Operations on Vectors<a class="anchor" id="DS102L4_page_5"></a>

[Back to Top](#DS102L4_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">


In [6]:
from IPython.display import VimeoVideo
# Tutorial Video Name: Introduction to R
VimeoVideo('327137089', width=720, height=480)

The transcript for the above topic tutorial video **[is located here](https://repo.exeterlms.com/documents/V2/DataScience/Video-Transcripts/DSO102-L04-pg5tutorial.zip)**.

---

## Using Logical Operators with Vectors

You can use logical operators with vectors as well. For example, using the vector ```heights```, you can find which heights are less than 180 as follows:

```{r}
heights < 180
```

[1] TRUE FALSE FALSE TRUE TRUE TRUE

This condition returns a vector whose elements are ```TRUE``` when the corresponding element of ```heights``` is less than 180 and ```FALSE``` when the corresponding element of ```heights``` is greater than or equal to 180. This kind of vector is called a *logical vector*.

You can use a logical vector to select elements of ```heights``` that meet a given condition. This is a very powerful capability when doing data analysis, where it can be advantageous to focus on values that meet a given condition.

As an example, suppose you want to separate ```heights``` into two vectors: one with the heights of the "tall" people, and one with the heights of the "short" people, with the dividing line between short and tall arbitrarily set at 180. To create a vector ```short``` that contains the values of ```heights``` that are less than 180, do the following:

```{r}
short <- heights[heights < 180]
short
```

[1] 171 177 154 176

This statement works as follows. From above, you know that the condition ```heights < 180``` creates a logical vector of ```TRUE``` and ```FALSE``` values.

The first element of this vector is ```TRUE```, since the first element of ```heights``` has a value of 171.

This means that the first element of `heights` is in the ```short``` vector.

The second element of the logical vector created by the condition is ```FALSE```, since the second element of ```heights``` is 192.

This means that ```short``` does not include the second element of ```heights```.

This process is repeated for every element in ```heights```; only the elements less than 180 are included in ```short```.
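
The same selection can be written in two steps, which makes the role of the logical vector easier to see. In this sketch the logical vector is stored in a variable named ```is.short``` (a name chosen here for illustration) before it is used inside the brackets. The last line relies on the fact that R counts each ```TRUE``` as 1 when it adds up a logical vector, something this lesson has not covered.

```{r}
heights <- c(171, 192, 183, 177, 154, 176)

is.short <- heights < 180    # logical vector, one TRUE/FALSE per height
is.short

short <- heights[is.short]   # keeps only positions where is.short is TRUE
short

sum(is.short)                # TRUE counts as 1, so this counts the short heights
```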

You can create a ```tall``` vector as follows:

```{r}
tall <- heights[heights >= 180]
tall
```

[1] 192 183

You can remove a specific value from ```heights``` as follows. You can take out the value 177 with this command:

```{r}
new.heights <- heights[heights != 177]
new.heights
```

[1] 171 192 183 154 176
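
Keep in mind that a condition such as ```!= 177``` is tested against every element, so every occurrence of the value is removed, not just the first one. The sketch below uses a small made-up vector, ```with.repeats```, in which 177 appears twice; both the data and the variable names are hypothetical.

```{r}
with.repeats <- c(171, 177, 183, 177)    # made-up data with 177 twice

without.177 <- with.repeats[with.repeats != 177]
without.177                              # both 177s are dropped
```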

---

## Review
Below is a quiz to review the recently covered material. Quizzes are _not_ graded.

In [5]:
try:
    display(L4P5Q1, L4P5Q2)
except:
    pass


<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 6 - Sample Statistics<a class="anchor" id="DS102L4_page_6"></a>

[Back to Top](#DS102L4_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">


In [7]:
from IPython.display import VimeoVideo
# Tutorial Video Name: Introduction to R
VimeoVideo('327137138', width=720, height=480)

The transcript for the above topic tutorial video **[is located here](https://repo.exeterlms.com/documents/V2/DataScience/Video-Transcripts/DSO102-L04-pg6tutorial.zip)**.

# Sample Statistics

R has functions to compute all of the commonly used sample statistics, and many less common ones as well. In this section, you will compute the statistics of a sample stored in a vector.

R has many built-in data sets that can be used to learn about features of R. Most of these data sets are stored as data frames. You will go through a more extensive introduction to data frames later, but for now you will learn just enough about data frames to get numerical data from them into a vector, from which you can compute sample statistics.

![The old faithful geyser erupting in Yellowstone National Park in the U S.](Media/L04-OldFaithful.jpg)

One set of data is the ```faithful``` dataset, which has the length in minutes of 272 eruptions of the Old Faithful geyser as well as the waiting time in minutes to the next eruption. You can get access to this data frame using the following command:

```{r}
library(datasets)
```

This command makes several data sets available to you, including the faithful data set. You can see this set as follows:

```{r}
faithful
```

```text
  eruptions waiting
1     3.600      79
2     1.800      54
3     3.333      74
4     2.283      62
```

Notice that this data set has two columns; one column is labeled ```eruptions``` and the other is labeled ```waiting```. There are 272 rows of data.

You can create a vector of eruption times and store it in the variable ```eruption.times``` with the command

```{r}
eruption.times <- faithful$eruptions
```
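
Before computing statistics, it can be reassuring to confirm that the ```$``` operator really gave you an ordinary numeric vector. The following sketch runs a few quick checks; ```is.numeric()``` is a base R function not covered in this lesson.

```{r}
eruption.times <- faithful$eruptions

is.numeric(eruption.times)    # the column is a plain numeric vector
length(eruption.times)        # one element per row of the data frame
eruption.times[1:4]           # the first four values match the rows shown above
```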

R has many useful functions that can be used to understand a numerical sample stored as a vector. They include functions to find maximum and minimum values, as well as commonly used sample statistics. Here is a list of some commonly used functions:

* **max() :** Returns the largest element of the vector.
* **min() :** Returns the smallest element of the vector.
* **length() :** Returns the number of elements in the vector.
* **sort() :** Returns a vector with the elements arranged from smallest to largest.
* **mean() :** Computes the average of the values in the vector.
* **median() :** Computes the median of the values in the vector.
* **var() :** Computes the variance of the values in the vector.
* **sd() :** Computes the sample standard deviation of the values in the vector.
* **summary() :** Shows the following values for a vector:
    * Minimum.
    * 1st quartile (the value that has 25% of the sample values below it).
    * Median.
    * Mean.
    * 3rd quartile (the value with 75% of the sample values below it).
    * Maximum.
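
To get a feel for a few of these functions on a sample small enough to check by hand, the sketch below applies them to the six values in the ```heights``` vector from earlier in the lesson. The names ```sorted``` and ```middle.average``` are chosen here for illustration.

```{r}
heights <- c(171, 192, 183, 177, 154, 176)

sorted <- sort(heights)
sorted                       # 154 171 176 177 183 192

# With six values, the median is the average of the two middle sorted values
middle.average <- (sorted[3] + sorted[4]) / 2
middle.average == median(heights)

mean(heights)                # same as the sum divided by the number of values
sum(heights) / length(heights)
```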

Using these functions, you can learn about the sample in ```eruption.times```. The number of values in the sample is given by:

```{r}
length(eruption.times)
```

[1] 272

There are 272 values in the sample. The largest and smallest values in the sample are given by:

```{r}
max(eruption.times)
min(eruption.times)
```

[1] 5.1
[1] 1.6

So the longest eruption time is 5.1 minutes, and the shortest eruption time is 1.6 minutes. The mean and median are given by: 

```{r}
mean(eruption.times)
median(eruption.times)
```

[1] 3.487783
[1] 4

The variance and standard deviation are given by:

```{r}
var(eruption.times)
sd(eruption.times)
```

[1] 1.302728
[1] 1.141371
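
It may help to see what ```var()``` and ```sd()``` compute. The sample variance is the sum of the squared deviations from the mean, divided by one less than the number of values, and the standard deviation is its square root. The sketch below reproduces both by hand; the names ```manual.var``` and ```manual.sd``` are chosen here for illustration, and ```all.equal()``` is a base R function not covered in this lesson.

```{r}
x <- faithful$eruptions
n <- length(x)

# Sample variance: squared deviations from the mean, divided by n - 1
manual.var <- sum((x - mean(x))^2) / (n - 1)
manual.sd <- sqrt(manual.var)

all.equal(manual.var, var(x))
all.equal(manual.sd, sd(x))
```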

You can get a summary of the sample statistics using the ```summary()``` function:

```{r}
summary(eruption.times)
```

Min. 1st Qu. Median Mean 3rd Qu. Max.

1.600 2.163 4.000 3.488 4.454 5.100

Note that these values agree with the values that you computed using individual functions above. Huzzah!
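
The quartiles are the only values in the summary that none of the individual functions above produced. As a sketch, they can be computed with ```quantile()```, a base R function not covered in this lesson. ```summary()``` rounds its output for display, so the values printed here may show more digits.

```{r}
x <- faithful$eruptions

quantile(x, 0.25)    # 1st quartile
quantile(x, 0.75)    # 3rd quartile

# The 50% quantile is the median
all.equal(unname(quantile(x, 0.5)), median(x))
```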

---

## Review
Below is a quiz to review the recently covered material. Quizzes are _not_ graded.

In [6]:
try:
    display(L4P6Q1, L4P6Q2, L4P6Q3, L4P6Q4, L4P6Q5)
except:
    pass


<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 7 - Key Terms<a class="anchor" id="DS102L4_page_7"></a>

[Back to Top](#DS102L4_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">


Below is a list and short description of the important keywords learned in this lesson. Please read through and go back and review any concepts you do not fully understand. Great Work!

<table class="table table-striped">
    <tr>
        <th>Keyword</th>
        <th>Description</th>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>Logical Variable</td>
        <td>A variable that has a value of either TRUE or FALSE.</td>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>Logical Value</td>
        <td>TRUE or FALSE.</td>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>Logical Operators</td>
        <td>Operators used to compare values: less than (<code>&lt;</code>), greater than (<code>&gt;</code>), less than or equal to (<code>&lt;=</code>), greater than or equal to (<code>&gt;=</code>), equal to (<code>==</code>), and not equal to (<code>!=</code>).</td>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>Subset</td>
        <td>To isolate certain information from a dataset.</td>
    </tr>
</table>


---

# Key R Functions

<table class="table table-striped">
    <tr>
        <th>Function</th>
        <th>Description</th>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>seq()</td>
        <td>Creates a vector using the arguments of a start number and end number.  Can include the arguments by= to increment or length.out to specify how long the vector should be. </td>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>max()</td>
        <td>Finds the maximum value in a vector.</td>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>min() </td>
        <td>Returns the minimum value in a vector. </td>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>length()</td>
        <td>Provides the number of elements in a vector.</td>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>sort()</td>
        <td>Sorts the vector from smallest to largest.</td>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>mean()</td>
        <td>Finds the average of values in a vector.</td>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>median()</td>
        <td>Finds the median of values in a vector.</td>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>var()</td>
        <td>Finds the variance for values in a vector.</td>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>sd()</td>
        <td>Finds the standard deviation for values in a vector.</td>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>summary()</td>
        <td>Provides minimum, first quartile, median, mean, third quartile, and maximum for values in a vector.</td>
    </tr>
</table>



<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 8 - Hands-On<a class="anchor" id="DS102L4_page_8"></a>

[Back to Top](#DS102L4_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">


For your Lesson 4 Practice Hands-On, you will be calculating the answers to several questions. This Hands-On will not be graded, but you are encouraged to complete it. The best way to become a great data scientist is to practice! Once you have submitted your project, you will be able to access the solution on the next page. Please write your answers in a single Word document (or an equivalent) and submit this file, along with your R file, in the area below.

<div class="panel panel-success">
    <div class="panel-heading">
        <h3 class="panel-title">Additional Info!</h3>
    </div>
    <div class="panel-body">
        <p>You may want to watch <a href="https://vimeo.com/438392224">this recorded live workshop </a> before beginning the hands-on, which goes over a similar example.</p>
    </div>
</div>

---

## Requirements

The eruption times for Old Faithful are clustered into two different groups. One group is short eruptions, and the other group is long eruptions. Short eruptions last three minutes or less, while long eruptions last more than three minutes. As shown below, define the vector ```eruptions.times``` as:

```eruptions.times <- faithful$eruptions```

Split the vector ```eruptions.times``` into two vectors: a vector ```short``` that contains the times less than or equal to three minutes, and a vector ```long``` that contains the times greater than three minutes. Answer the following questions:

1. How many elements are in the vector ```short```?

2. How many elements are in the vector ```long```?

3. What is the mean eruption time of the short eruptions?

4. What is the mean eruption time of the long eruptions?

5. What is the standard deviation of the short eruption times?

6. What is the standard deviation of the long eruption times?

<div class="panel panel-danger">
    <div class="panel-heading">
        <h3 class="panel-title">Caution!</h3>
    </div>
    <div class="panel-body">
        <p>Be sure to zip and submit your entire document when finished!</p>
    </div>
</div>

<div class="panel panel-info">
    <div class="panel-heading">
        <h3 class="panel-title">Tip!</h3>
    </div>
    <div class="panel-body">
        <p>To zip your file on <b>Windows</b>, right click on the file and select "Send to", then select "Compressed (zipped) folder". For <b>Mac</b> users, right click on the file and select "Compress", then select your file from the options.</p>
    </div>
</div>

<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 9 - Solution<a class="anchor" id="DS102L4_page_9"></a>

[Back to Top](#DS102L4_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">



Below is the solution to your Lesson 4 Practice Hands-On.

1.  How many elements are in the vector short?

    * 97

2.  How many elements are in the vector long?

    * 175

3.  What is the mean eruption time of the short eruptions?

    * 2.038134

4.  What is the mean eruption time of the long eruptions?

    * 4.291303

5.  What is the standard deviation of the short eruption times?

    * 0.2668655

6.  What is the standard deviation of the long eruption times?

    * 0.4108516

---

## Code to Achieve Solution


# This splits it into a short and long vector

short <- eruptions.times[eruptions.times <= 3]
long <- eruptions.times[eruptions.times > 3]

### Question 1 - How many elements are in the short vector?

length(short)

### Question 2 - How many elements are in the long vector?

length(long)

### Question 3 - Mean eruption time of short?

mean(short)

### Question 4 - Mean eruption time of long?

mean(long)

### Question 5 - Standard deviation of short?

sd(short)

### Question 6 - Standard deviation of long?

sd(long)
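
As an optional sanity check on the split, the sketch below confirms that every eruption ended up in exactly one of the two vectors. It repeats the definitions so that it can be run on its own and is not part of the required solution.

```{r}
eruptions.times <- faithful$eruptions
short <- eruptions.times[eruptions.times <= 3]
long <- eruptions.times[eruptions.times > 3]

# Every eruption falls into exactly one group, so the counts must add up
length(short) + length(long) == length(eruptions.times)

# The two groups never overlap
max(short) <= 3
min(long) > 3
```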