# DS102 Statistical Programming in R : Lesson Three - Variables, Functions, and For Loops

### Table of Contents <a class="anchor" id="DS102L3_toc"></a>

* [Table of Contents](#DS102L3_toc)
    * [Page 1 - Introduction](#DS102L3_page_1)
    * [Page 2 - Variables](#DS102L3_page_2)
    * [Page 3 - Strings](#DS102L3_page_3)
    * [Page 4 - Arithmetic Operations](#DS102L3_page_4)
    * [Page 5 - Functions](#DS102L3_page_5)
    * [Page 6 - Creating Functions](#DS102L3_page_6)
    * [Page 7 - Vectors](#DS102L3_page_7)
    * [Page 8 - The For Loop](#DS102L3_page_8)
    * [Page 9 - Key Terms](#DS102L3_page_9)
    * [Page 10 - Hands-On](#DS102L3_page_10)
    

<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 1 - Introduction<a class="anchor" id="DS102L3_page_1"></a>

[Back to Top](#DS102L3_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">

In [1]:
from IPython.display import VimeoVideo
# Tutorial Video Name: Introduction to R
VimeoVideo('247057893', width=720, height=480)

In this lesson, you will begin to learn the fundamentals of using R. You will learn about how the following concepts and operations work in R:

* Variables and assigning values to variables
* Character strings
* Numerical operations
* Vectors
* Functions 
* For loops

By the end of this lesson, you will be able to create your very own function and for loop to compute sphere diameter for a vector in R.

As you go through, type the examples into RStudio and verify that you get the results shown. This will help you understand how R works.  


<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 2 - Variables<a class="anchor" id="DS102L3_page_2"></a>

[Back to Top](#DS102L3_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">


In [2]:
from IPython.display import VimeoVideo
# Tutorial Video Name: Introduction to R
VimeoVideo('326671179', width=720, height=480)

The transcript for the above topic tutorial video **[is located here](https://repo.exeterlms.com/documents/V2/DataScience/Video-Transcripts/DSO102-L03-pg2tutorial.zip)**.

# A Note About Errors

Everybody who codes makes errors. Don't be afraid of them. When you are learning to program, the goal should not be to avoid errors entirely, but to make plenty of them so that you learn what they look like and how to fix them.

One of the most difficult parts of learning to program in any language is to connect the error messages produced by running your code to the error in your code, understand what the error is, and fix it.

So, as you go through this and subsequent lessons, when something works, congratulate yourself, but don't stop there. Take some time to break your code and see what happens. Remove a quote or a semicolon. Change a variable name. Change a number. As you do this, you will learn what errors look like, and will be better able to fix them.

---

## Variables and Assignments

The R language operates on named objects. Objects can be as simple as a single number, or as complex as the information generated by a complicated statistical analysis. You will start with very simple objects and work with more complex ones as you advance. The name of an object is referred to as a *variable*. Conceptually, a variable is a named place to store something. [Examples covered in this lesson](./Examples/DS102L3-Examples-variables-functions-forLoops.ipynb)

---

## Naming Conventions for Variables

R is frighteningly permissive about variable names. It is not a good idea, however, to use some of the variable names that R will allow you to use, since this will teach you bad habits in other languages you'll use later on.  You will be assured that your variable names will work on any implementation of R if you stick to the following rules:

* Use only the letters a-z (as well as capital letters A-Z), the digits 0-9, periods, and underscores. 

    <div class="panel panel-info">
        <div class="panel-heading">
            <h3 class="panel-title">Tip!</h3>
        </div>
        <div class="panel-body">
            <p>R does distinguish between lower-case letters and upper-case letters.</p>
        </div>
    </div>

* Variable names must start with a letter or a period.

* If the variable name starts with a period, the second character cannot be a digit.
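One way to check these rules for yourself is sketched below. It relies on base R's ```make.names()``` function, which rewrites a string into a syntactically valid name; comparing the result with the original string is used here as an informal validity check (a shortcut chosen for this illustration, not a dedicated validator). The example names are made up for illustration.

```{r}
# make.names() leaves a valid name unchanged and alters an invalid one,
# so comparing input and output serves as an informal validity check.
valid.underscore <- make.names("my_var") == "my_var"    # TRUE: letters and an underscore
valid.period     <- make.names(".hidden") == ".hidden"  # TRUE: period followed by a letter
starts.digit     <- make.names("2fast") == "2fast"      # FALSE: starts with a digit
period.digit     <- make.names(".2way") == ".2way"      # FALSE: period followed by a digit

# R is case-sensitive, so these are two separate variables.
total <- 1
Total <- 2
same.value <- total == Total                            # FALSE
```

Note that this check also rejects reserved words such as ```if```, which R will not accept as ordinary variable names either.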

Variable names can be up to 10,000 characters in length, but it is generally considered good practice to use much shorter ones. The shorter, the better, as long as you can still deduce the meaning from the name. Shorter names make you less likely to make mistakes as you type them repeatedly.

Many other programming languages (Python, for example) do not allow periods in variable names, but R does. Traditionally in R, variable names made up of multiple words are connected by periods: 

```this.is.a.long.variable.name.made.of.words.connected.by.periods```

The above is a valid variable name. In many other programming languages, variable names are created by joining words with underscores. This can also be done in R: 

```this_is_a_long_variable_name_made_of_words_connected_by_underscores```

So the above is a valid variable name, too.  Even though R will allow periods, you are encouraged to use underscores or just capitalization to denote new words, so that you're not having to change how you operate once you learn Python.

Here's what capitalization might look like: 

```ThisIsALongVariableNameMadeOfWords```

<div class="panel panel-success">
    <div class="panel-heading">
        <h3 class="panel-title">Fun Fact!</h3>
    </div>
    <div class="panel-body">
        <p>Capitalization like what is done above is referred to as "Camel Case" because it's like looking at the humps of a camel.</p>
    </div>
</div>

One way to think about variables is to imagine a row of cubbies at the back of a grade school classroom, where each cubby is labeled by a name; each cubby represents a variable. Each cubby (variable) can hold an object. As a program executes, it gets objects from cubbies, does computations with these objects, and perhaps creates new objects from these computations, which are then stored in a new cubby.

![Cubbies in a classroom holding various items.](Media/calculating1.jpg)

An assignment statement assigns an object to a variable; in other words, it gives the object a name. Assignment statements can be built using the following symbols: 

* <-
* =
* -> 

For example: 

* The assignment statement x <- 3 creates a variable x and stores 3 in it.
* The assignment statement y = 7 creates a variable y and stores 7 in it.
* The assignment statement 11 -> z creates a variable z and stores 11 in it.

R is quite unusual in that it has three different ways to indicate assignment; most programming languages only have one. In most programming languages, you think of the variable on the left being assigned the value on the right, as in the top two assignment statements. But R also allows you to assign the value on the left to a variable on the right, as in the last assignment statement.

To implement your first assignment statement, start RStudio if it is not already started. Then enter the following assignment into the Console window:

```{r}
x <- 3
```

Your RStudio window should look something like this:

![An R studio window showing three panes, one for console, one for environment and history, and one for files, plots, packages, help, and viewer.](Media/L02-SimpleAssignment.png)

The list of values in the ```Environment``` tab shows the variable x and the value that is assigned to it, namely 3. Try the other two assignment statements above and verify that the correct number is assigned to the correct variable. Now try to generate an error message by changing the direction of the -> in one of your assignment statements.
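The three assignment forms, and the error you get from reversing the rightward arrow, can be sketched in a single script. The ```tryCatch()```, ```eval()```, and ```parse()``` calls are not covered in this lesson; they are used here only so the script can continue after the error, and the variable name ```outcome``` is an assumption made for this illustration.

```{r}
x <- 3      # leftward arrow: variable on the left, value on the right
y = 7       # equals sign: same direction as the leftward arrow
11 -> z     # rightward arrow: value on the left, variable on the right

# Reversing the rightward arrow asks R to store x in the number 3, which fails.
# tryCatch() turns the error into the string "error" so the script keeps running.
outcome <- tryCatch(
    eval(parse(text = "x -> 3")),
    error = function(e) "error"
)
outcome
```

Typing ```x -> 3``` directly into the Console shows the actual error message, which is the more useful thing to look at while learning.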

From here on, in the examples, you will see the statement you should type into the Console window of RStudio and the response you will get, but you typically won't see images of RStudio. 

---

## Review
Below is a quiz to review the recently covered material. Quizzes are _not_ graded.

In [1]:
try:
    from DS_Students import *
    from ipynb.fs.full.DS102Questions import *
except:
    !pip install DS_Students
    from DS_Students import *
    from ipynb.fs.full.DS102Questions import *

In [2]:
try:
    display(L3P2Q1, L3P2Q2, L3P2Q3, L3P2Q4, L3P2Q5, L3P2Q6)
except:
    pass


<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 3 - Strings<a class="anchor" id="DS102L3_page_3"></a>

[Back to Top](#DS102L3_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">


In [3]:
from IPython.display import VimeoVideo
# Tutorial Video Name: Introduction to R
VimeoVideo('326671296', width=720, height=480)


The transcript for the above topic tutorial video **[is located here](https://repo.exeterlms.com/documents/V2/DataScience/Video-Transcripts/DSO102-L03-pg3tutorial.zip)**.

# Strings

So far, you have only assigned numbers to variables. But R has the capability to represent many more types of data in addition to numbers. In this section, strings, or character data, are introduced. 

In computer programming, a sequence of characters is often called a *string*. Strings are used for many reasons, including to represent categorical data values, to represent column names in data tables, and to represent labels in charts and graphs.

Strings can be created in R by enclosing the string in single quotes ```'``` or double quotes ```"```. For example, this is a valid string: "Hello World!" If you enter this string into the R Console, R will print it (with the quotes):

```{r}
"Hello World!"
```

[1] "Hello World!"

Strings can be assigned to variables just like numbers. This statement assigns the string to the variable ```h```:

```{r}
h <- "Hello World!"
```
### [Errata4](Errata/DS102-Change-Log.ipynb)
<a class="anchor" id="Run_button"></a>

As with assignments of numbers, R does not print the result of an assignment. R will print the value of a variable when you type it into the Console. So entering ```h``` into the console and pressing the Run button produces the following:

```{r}
h
```

[1] "Hello World!"

You can also use the ```print()``` function to print the value of a variable. If you enter ```print(h)``` into the console, you get the same output as you got when just entering ```h```:

```{r}
print(h)
```

[1] "Hello World!"

You can combine two strings into a single one using the ```paste()``` function:

```{r}
f <- "Hello"
g <- "World!"
paste(f, g)
```

[1] "Hello World!"

By default, ```paste()``` inserts a single space between the strings it joins, so the result is one string printed on one line instead of two separate strings.
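The separator that ```paste()``` places between strings can be changed with its ```sep``` argument. The following is a minimal sketch; the variable names ```with.space```, ```no.space```, and ```dashed``` are chosen only for this illustration.

```{r}
f <- "Hello"
g <- "World!"

with.space <- paste(f, g)             # "Hello World!" (default separator is a space)
no.space   <- paste(f, g, sep = "")   # "HelloWorld!"  (no separator)
dashed     <- paste(f, g, sep = "-")  # "Hello-World!" (custom separator)

print(with.space)
print(no.space)
print(dashed)
```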

---

## Review
Below is a quiz to review the recently covered material. Quizzes are _not_ graded.

In [3]:
try:
    display(L3P3Q1, L3P3Q2)
except:
    pass


<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 4 - Arithmetic Operations<a class="anchor" id="DS102L3_page_4"></a>

[Back to Top](#DS102L3_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">


In [4]:
from IPython.display import VimeoVideo
# Tutorial Video Name: Introduction to R
VimeoVideo('326671154', width=720, height=480)

The transcript for the above topic tutorial video **[is located here](https://repo.exeterlms.com/documents/V2/DataScience/Video-Transcripts/DSO102-L03-pg4tutorial.zip)**.

# Arithmetic Operations

Like most other computational environments, R can do computations on data values, including those stored in variables. You can compute the sum of two numbers as follows; R will provide the following output:

```{r}
2+3
```

[1] 5

You can also compute the sum of two numbers that are stored as variables:

```{r}
x <- 4
y <- 7
x + y
```

[1] 11

R implements all of the usual arithmetic operators:

* **Addition:** x + y
* **Subtraction:** x - y
* **Multiplication:** x * y
* **Division:** x / y
* **Raising to a Power:** x ^ y

<div class="panel panel-info">
    <div class="panel-heading">
        <h3 class="panel-title">Tip!</h3>
    </div>
    <div class="panel-body">
        <p>The spaces on both sides of the operator are optional; R generally ignores spaces that are not inside quotation marks. x+y works exactly like x + y. Most programmers put spaces around operators to improve readability.</p>
    </div>
</div>

---

## Order of Operations

These arithmetic operators have the following precedence. Raising to a power is done first, followed by multiplication and division, and finally addition and subtraction.

You may recall the acronym "PEMDAS" from math classes, where PEMDAS indicates the order of operation. PEMDAS stands for:

* Parentheses
* Exponents
* Multiplication
* Division
* Addition
* Subtraction

You can easily remember this with a mnemonic that is still often taught: Please Excuse My Dear Aunt Sally, where the first letter of each word reminds you which operation comes next.  

PEMDAS is the same order that R (and most programming languages) uses to determine which operation to do when. Multiplication and division share the same precedence and are evaluated from left to right, as are addition and subtraction.

So, when evaluating this:

```text
3^4 * 2 + 1
```

R first computes 3 ^ 4, which is 81. It then multiplies 81 by 2 to get 162, then adds 1 to get 163.

You can change the order in which operations are performed by using parentheses: ```(``` and ```)```. For example, in the below: 

```text
3 ^ (4 * 2 + 1)
```

R first computes 4 * 2, which is 8, then adds 1 to get 9, and finally computes 3 ^ 9, which is 19683.
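The two expressions above, along with a case where operators of equal precedence are evaluated from left to right, can be checked with a short sketch. The variable names are assumptions chosen for this illustration.

```{r}
no.parens   <- 3 ^ 4 * 2 + 1     # 163: power first, then multiply, then add
with.parens <- 3 ^ (4 * 2 + 1)   # 19683: the parentheses are evaluated first

# Multiplication and division have equal precedence, so R works left to right.
left.to.right <- 8 / 4 * 2       # 4: (8 / 4) * 2
forced.order  <- 8 / (4 * 2)     # 1: parentheses force the multiplication first

print(no.parens)
print(with.parens)
print(left.to.right)
print(forced.order)
```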

A sequence of variables, numbers, arithmetic operations, and functions (more on functions in a bit) is called an expression. If an expression is used without an assignment, the result is printed, but not saved (sort of):

```{r}
4 ^ 2 + 3 ^ 2
```

[1] 25

The result of the computation, 25, is printed, but it is not easily accessible going forward. You can save the result of an expression by assigning it to a variable:

```{r}
s <- 4 ^ 2 + 3 ^ 2
```

This statement computes the value of the expression and stores it in the variable ```s```, but it does not print it. As you learned earlier, you can see the value that was computed by typing the variable ```s```:

```{r}
s
```

[1] 25

---

## Example: Computing a *z*-Score

Next, you will use R to do a statistical calculation from the Basic Statistics module, namely computing a *z*-score.

In the Basic Statistics module you were informed that in Ghana, the height of a young adult woman is normally distributed with a mean of 159.0 cm and a standard deviation of about 4.9 cm.

Gabianu is a college student originally from Ghana, and she stands 169.0 cm tall. You can compute the *z*-score for Gabianu's height using this formula:

![The formula to find the z score. Z equals x minus mu divided by sigma.](Media/L02-zscore.png)

You first define R variables for each of the quantities on the right side of the formula:

```{r}
x <- 169.0
mu <- 159.0
sigma <- 4.9
```

You can now compute the *z*-score with this expression:

```{r}
(x - mu) / sigma
```

[1] 2.040816

Note that the parentheses are necessary in the computation of the *z*-score. The reason for this is that you must first subtract mu from x, then divide the difference by sigma.

If you leave the parentheses out of the expression, it would look like this: x - mu / sigma. Because division has a higher precedence than subtraction, R would divide mu by sigma first, then subtract this quotient from x. If you type this expression into R without the parentheses, you get the following:

```{r}
x - mu / sigma
```

[1] 136.551

You can clearly see that this is different from the correct answer above.

---

## Review
Below is a quiz to review the recently covered material. Quizzes are _not_ graded.
<p style="text-align: center">
  <img src="Media/L02-circleArea.png" alt="Drawing" style="width: 100px;"/>
</p> 

In [5]:
try:
    display(L3P4Q1, L3P4Q2, L3P4Q3)
except:
    pass


<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 5 - Functions<a class="anchor" id="DS102L3_page_5"></a>

[Back to Top](#DS102L3_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">


In [5]:
from IPython.display import VimeoVideo
# Tutorial Video Name: Introduction to R
VimeoVideo('326671254', width=720, height=480)

The transcript for the above topic tutorial video **[is located here](https://repo.exeterlms.com/documents/V2/DataScience/Video-Transcripts/DSO102-L03-pg5tutorial.zip)**.

# Functions

In most programming languages, a function relates inputs (also called arguments) to an output; in other words, a function produces a value for each possible value of its argument(s).  You can think of a function as the main "do-er": it is a command that will be applied to the data you feed it. A function performs a specific programming task.

You have already worked with functions in the Basic Statistics module, even though they weren't thought of as such. For example, you had to compute the square root of 16. R has a function built in that will compute a square root; this function is ```sqrt()```. Notice that a function has a name followed by parentheses; the function's argument goes in the parentheses.

You can compute the square root of 16 in R as follows:

```{r}
sqrt(16)
```

[1] 4

In this computation, 16 is the argument of the ```sqrt()``` function. The ```sqrt()``` function returns the value 4, which is printed. You could save the returned value in a variable ```t```:

```{r}
t <- sqrt(16)
t
```

[1] 4

The argument of a function can also be a variable; in the following, you will set the value of ```w``` to be 2, then compute its square root:

```{r}
w <- 2
sqrt(w)
```

[1] 1.414214

R provides almost every mathematical function you are ever likely to want to compute. 

In addition, R provides many functions specifically for statistical analysis. You'll be introduced to these in the example below.

Many of the R commands you have learned are actually implemented in R as functions. For example, ```print()``` is actually a function that prints its argument in the Console window. ```help()``` is a function that brings up help information about its argument. ```paste()```, used earlier in this lesson, is a function that takes two or more strings as arguments and combines them into a single string.
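A few more built-in functions are sketched below to show that the value returned by one function can be used as the argument of another. The functions ```abs()```, ```round()```, and ```log10()``` are part of base R but are not otherwise covered in this lesson, and the variable names are chosen only for this illustration.

```{r}
distance <- abs(-16)           # 16: absolute value
root     <- sqrt(abs(-16))     # 4: abs() runs first, then sqrt() uses its result
rounded  <- round(sqrt(2), 3)  # 1.414: the second argument is the number of decimal places
digits   <- log10(100)         # 2: base-10 logarithm

print(distance)
print(root)
print(rounded)
print(digits)
```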

---

## Example: Computing a Percentile from a *z*-Score

In the Basic Statistics module, you used an applet to compute a percentile from a *z*-score. You had the following problem:

Suppose you took a standardized exam, and the mean score is 440 with a standard deviation of 23. Your score is 472...what percentile is that?

The first step in solving this problem is to compute the *z*-score. From the previous example, you know how to do this:

```{r}
x <- 472
mu <- 440
sigma <- 23
z.score <- (x - mu) / sigma
z.score
```

[1] 1.391304

So the *z*-score for this example is 1.391304.

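Here is a side question: why can't you give the variable that stores the *z*-score the name ```z-score```? Try doing this in R and see what happens.  You get an error because variable names in R cannot contain dashes: R reads the dash as a minus sign, so it interprets ```z-score``` as the variable ```z``` minus the variable ```score```.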

```pnorm()``` is an R function that computes percentiles from *z*-scores for the normal distribution. You want to know what percentage of scores will be below 472. This is the same as the area under the standard normal density to the left of 1.391304, and is the blue part in this graph:

![A normal distribution with z scores at negative one point three nine one and one point three nine one. The z scores are indicated with vertical lines on the distribution. The area to the left of the one point three nine one vertical line is shaded in a different color.](Media/L02-PercentileBelow.png)

You can compute this area using ```pnorm()``` as follows. The argument to ```pnorm()``` is the ```z.score``` variable that was computed above:

```{r}
pnorm(z.score)
```

[1] 0.9179334

You can convert this number to a percent by multiplying by 100:

```{r}
pnorm(z.score) * 100
```

[1] 91.79334

This tells you that the score of 472 is greater than 91.79334% of the scores.

```pnorm()``` can also compute the area under the standard normal density to the right of 1.391304, or the blue part in the graph below:

![A normal distribution with z scores at negative one point three nine one and one point three nine one. The z scores are indicated with vertical lines on the distribution. The area to the right of the one point three nine one vertical line is shaded in a different color.](Media/L02-PercentileAbove.png)

You compute this as follows:

```{r}
pnorm(z.score, lower.tail = FALSE)
```

[1] 0.08206658

The argument ```lower.tail = FALSE``` indicates to ```pnorm()``` that you want the upper tail (the area to the right), rather than the lower tail (the area to the left). You can convert this result to a percent by multiplying it by 100:

```{r}
pnorm(z.score, lower.tail = FALSE) * 100
```

[1] 8.206658

This tells you that the score of 472 is less than 8.206658% of the scores.

```pnorm()``` can even save you the effort of computing the *z*-score. You can supply the mean and standard deviation of the normal distribution as arguments to ```pnorm()```, and it will compute the lower tail area directly from the score:

```{r}
pnorm(472, mean=440, sd=23)
```

[1] 0.9179334

```mean``` is the name ```pnorm()``` uses for the mean, and ```sd``` is the name ```pnorm()``` uses for the standard deviation. You can see that the value returned by ```pnorm()``` is the same as the one you obtained using the *z*-score. You can compute the upper tail area as well:

```{r}
pnorm(472, mean=440, sd=23, lower.tail = FALSE)
```

[1] 0.08206658
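Because the lower and upper tails together cover the whole area under the normal density, the two results should add up to 1. The sketch below checks this for the exam example; the variable names ```lower```, ```upper```, and ```direct``` are assumptions made for this illustration.

```{r}
z.score <- (472 - 440) / 23

lower <- pnorm(z.score)                      # area to the left of the z-score
upper <- pnorm(z.score, lower.tail = FALSE)  # area to the right of the z-score

lower + upper   # the two tails together cover the whole distribution
1 - lower       # another way to get the upper tail

# Supplying the mean and standard deviation gives the same lower tail.
direct <- pnorm(472, mean = 440, sd = 23)
direct
```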

---

## Review
Below is a quiz to review the recently covered material. Quizzes are _not_ graded.

In [6]:
try:
    display(L3P5Q1, L3P5Q2, L3P5Q3)
except:
    pass


<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 6 - Creating Functions<a class="anchor" id="DS102L3_page_6"></a>

[Back to Top](#DS102L3_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">


In [6]:
from IPython.display import VimeoVideo
# Tutorial Video Name: Introduction to R
VimeoVideo('326671206', width=720, height=480)

The transcript for the above topic tutorial video **[is located here](https://repo.exeterlms.com/documents/V2/DataScience/Video-Transcripts/DSO102-L03-pg6tutorial.zip)**.

# Creating Functions

You have used a few functions, such as ```sqrt()```.  However, it can be helpful to define your own functions that take one or more arguments and compute a value. R can do this.

---

## A Function to Convert Temperatures

As an example, suppose you have a data set that includes temperatures measured in degrees Fahrenheit. And suppose you need to convert these temperatures to degrees Celsius. Do a bit of searching online, and you'll discover that the formula to convert a temperature from Fahrenheit to Celsius is the following:

![The formula to convert Fahrenheit to Celsius. T C equals open parentheses T F minus thirty two close parentheses times five ninths.](Media/L03-FahrenheitToCelsius.png)

In this formula, TC is the temperature in degrees Celsius, and TF is the temperature in degrees Fahrenheit. With this formula and mastery of Lesson 2, you can convert a given temperature from degrees Fahrenheit to degrees Celsius.

But suppose you want to convert many temperatures. You can make a function that will do this conversion for you. You will call the function ```f.to.c()```. It will have one argument (the temperature in degrees Fahrenheit) and return the corresponding temperature in degrees Celsius.

Start by creating a new, empty script file by using the ```File -> New File``` menu and selecting ```R Script```. You can then enter your function into the editor window as follows:

```{r}
f.to.c <- function(TF){
    (TF - 32) * 5 / 9
}
```

Now add a line that calls ```f.to.c()``` with an argument of 73 and assigns the result to the variable ```temp.in.c```.

```{r}
temp.in.c <- f.to.c(73)
```

Then execute the whole script file by clicking on the Source button. R runs the script, and you see in the Environment pane that a new variable, ```temp.in.c``` has been created. You can find the value of ```temp.in.c``` by typing it in the Console pane:

```{r}
temp.in.c
```

[1] 22.77778

Now that the overall function is laid out, you will examine the function in detail. The first line is:

```{r}
f.to.c <- function(TF) {
```

```f.to.c``` is the name of the function. ```f.to.c``` is actually a variable, and rather than assigning a number or a string to it, you are assigning a function to it. That is indicated by the assignment operator ```<-```. 

<div class="panel panel-info">
    <div class="panel-heading">
        <h3 class="panel-title">Tip!</h3>
    </div>
    <div class="panel-body">
        <p>In addition to holding numbers and strings, variables can hold many different objects, including functions.</p>
    </div>
</div>

Then, you have ```function(TF)```. ```function()``` indicates that you are creating a function. ```(TF)``` shows that the function will have one argument, named ```TF```. This line finishes with an open brace ```{```, which indicates that you are beginning the definition of the function.

The second line of the function is:

```{r}
(TF - 32) * 5 / 9
```

This is the expression that actually converts from Fahrenheit to Celsius; if you look at the formula, you will see that this expression is the right-hand side of the formula. This line of the function is the last line of code in the function. (It is also the first line of code, and actually the only line of code.) The value computed by the last line of code is the value that is returned by the function.

The last line of the function is just a closing brace ```}```, which signifies the end of the function.

When you call the function with ```f.to.c(73)```, the function sets ```TF``` to a value of 73. It then executes the next line of code, which subtracts 32 from TF and multiplies the result by 5/9, giving the value 22.77778. This is the value that is returned by the function.
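The same pattern can be used to write the reverse conversion. The function ```c.to.f()``` below is a hypothetical companion that is not part of the lesson; it is a sketch obtained by solving the conversion formula for the Fahrenheit temperature. Converting a value to Celsius and back again is one informal way to check that both functions agree.

```{r}
# Fahrenheit to Celsius, as defined in the lesson.
f.to.c <- function(TF) {
    (TF - 32) * 5 / 9
}

# Hypothetical companion function: Celsius to Fahrenheit.
c.to.f <- function(TC) {
    TC * 9 / 5 + 32
}

freezing.in.c  <- f.to.c(32)          # 0: water freezes at 32 degrees Fahrenheit
boiling.in.f   <- c.to.f(100)         # 212: water boils at 100 degrees Celsius
temp.back.in.f <- c.to.f(f.to.c(73))  # converts 73 to Celsius and back again

print(freezing.in.c)
print(boiling.in.f)
print(temp.back.in.f)
```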

---

## Review
Below is a quiz to review the recently covered material. Quizzes are _not_ graded.

In [7]:
try:
    display(L3P6Q1, L3P6Q2)
except:
    pass


<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 7 - Vectors<a class="anchor" id="DS102L3_page_7"></a>

[Back to Top](#DS102L3_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">


In [7]:
from IPython.display import VimeoVideo
# Tutorial Video Name: Introduction to R
VimeoVideo('326671276', width=720, height=480)

The transcript for the above topic tutorial video **[is located here](https://repo.exeterlms.com/documents/V2/DataScience/Video-Transcripts/DSO102-L03-pg7tutorial.zip)**.

# Introduction to Vectors

A numerical *vector* is an ordered collection of numbers. Ordered means that there is a first number, a second number, and so on to the last number. Each number in the vector is called an *element*.

You can create a vector of numbers using the ```c()``` function; its default operation is to combine its arguments into a vector. Suppose you measured the height in centimeters of six people and got the following six measurements: 171, 192, 183, 177, 154, and 176. You can create a vector of these heights as follows:

```{r}
c(171, 192, 183, 177, 154, 176)
```

[1] 171 192 183 177 154 176

You can assign this vector to the variable ```heights```:

```{r}
heights <- c(171, 192, 183, 177, 154, 176)
```

R allows you to easily create vectors that are sequences of numbers. These types of vectors are especially useful in for loops. You can use the ```:``` operator to create a sequence of numbers going from 1 to 10:

```{r}
1:10
```

[1] 1 2 3 4 5 6 7 8 9 10

You can make descending sequences of numbers as well:

```{r}
7:2
```

[1] 7 6 5 4 3 2

You can assign the vectors created by the ```:``` operator to variables, just like any other vector:

```{r}
s <- 2:5
s
```

[1] 2 3 4 5
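A few base R functions that work with vectors are sketched below. ```length()```, ```seq()```, and ```rev()``` are not otherwise covered in this lesson, and the variable names ```how.many```, ```evens```, and ```descending``` are chosen only for this illustration.

```{r}
heights <- c(171, 192, 183, 177, 154, 176)

how.many   <- length(heights)    # 6: the number of elements in the vector
evens      <- seq(2, 10, by = 2) # 2 4 6 8 10: a sequence with a step other than 1
descending <- rev(2:7)           # 7 6 5 4 3 2: the same values as 7:2

print(how.many)
print(evens)
print(descending)
```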

---

## Review
Below is a quiz to review the recently covered material. Quizzes are _not_ graded.

In [8]:
try:
    display(L3P7Q1)
except:
    pass


<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 8 - The For Loop<a class="anchor" id="DS102L3_page_8"></a>

[Back to Top](#DS102L3_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">


In [8]:
from IPython.display import VimeoVideo
# Tutorial Video Name: Introduction to R
VimeoVideo('326671216', width=720, height=480)


The transcript for the above topic tutorial video **[is located here](https://repo.exeterlms.com/documents/V2/DataScience/Video-Transcripts/DSO102-L03-pg8tutorial.zip)**.


# The For Loop

R, like most programming languages, has commands that allow you to repeat computations or actions. The most widely used of these is called a *for loop*. Its format is *for* this, in *that*, do *something*. 

---

## An Everyday Example

Sometimes the need exists to perform a certain task multiple times until some condition is met. For example, when baking cookies, a recipe may call for 4 cups of flour. The bowl starts with 0 cups of flour, then a baker will scoop a cup and pour it into the bowl. At this point, the bowl now contains 1 cup of flour. The baker needs to repeat the steps of scooping and pouring flour until the number of cups of flour in the bowl is equal to the number of cups needed for the recipe.

The scooping of flour is an example of a loop. The baker performs a series of operations until they meet a certain condition. The operations are the scooping and pouring of flour, and the condition is that the cups of flour in the bowl need to match the quantity the recipe needs.

If you were to write this out, then you would have something like:

```text
for flour_cups in bowl
scoop until flour_cups=4
```
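The flour example can be written as runnable R, previewing the for loop syntax that is explained in the next section. This is only one possible translation; the variable names ```cups_needed``` and ```scoop``` are assumptions made for this illustration.

```{r}
cups_needed <- 4   # the recipe calls for 4 cups of flour
flour_cups <- 0    # the bowl starts empty

# Each pass through the loop is one scoop poured into the bowl.
for (scoop in 1:cups_needed) {
    flour_cups <- flour_cups + 1
}

flour_cups         # 4: the bowl now holds what the recipe needs
```

Here the condition "until the bowl holds 4 cups" is expressed by the length of the sequence ```1:cups_needed```: the loop stops once it has run once for each value in that sequence.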

---

## Your First For Loop

Suppose that you want to count from 1 to 6, printing each number on its own line. You open a new script file in the editor, and create a for loop that looks like this:

```{r}
for (n in 1:6){
    print(n)
}
```

When you click on the Source button, the for loop executes and prints the following:

```text
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
```

When the for loop executes, it does the following: 

1. ```1:6``` creates a vector of six values starting with one and going to six.

2. *n* is assigned to be the first value in the vector, which is 1. 

3. The statement ```print(n)``` is executed; since *n* is 1, it prints 1.

4. Then *n* is assigned to be the second value in the vector, which is 2. The print statement is executed again; since *n* is 2, it prints 2.

This process repeats four more times as *n* is assigned 3, 4, 5, and 6 in turn. Finally, there are no more values in the vector, so the for loop is done.
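
To see what the loop is doing for you, the same steps can be written out by hand. This is only an illustrative sketch: it prints the same six lines as the for loop above, using one assignment and one print statement per pass.

```{r}
# Pass 1: n takes the first value in 1:6
n <- 1
print(n)
# Pass 2: n takes the second value
n <- 2
print(n)
# Passes 3 through 6 follow the same pattern
n <- 3
print(n)
n <- 4
print(n)
n <- 5
print(n)
n <- 6
print(n)
```

The for loop saves you from typing, and later maintaining, all of these repeated lines.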

The statements in the for loop between the open brace ```{``` and close brace ```}``` are called the body of the loop. There may be more than one statement in the body. Suppose you wish to extend your for loop to print both the number and the number squared. You could do that with the following for loop:

```{r}
for (n in 1:6){
    print(n)
    print(n^2)
}
```

![An RStudio editor pane showing a for loop. For open parentheses n in one colon six close parentheses open bracket print open parentheses n close parentheses print open parentheses n caret 2 close parentheses close bracket.](Media/L03-ForLoop2.png)

When you click on the Source button, the for loop prints the following:

```text
[1] 1
[1] 1
[1] 2
[1] 4
[1] 3
[1] 9
[1] 4
[1] 16
[1] 5
[1] 25
[1] 6
[1] 36
```

At this point, you may have become weary of R printing a [1] in front of every number you print. You can use the ```cat()``` function to print numbers without the [1] in front of them.

Unfortunately, the ```cat()``` function prints only its arguments; unless you include spaces or new lines specifically as arguments, it will not print them. You can change your for loop to print each number and its square on the same line, without the [1] values, as follows:

```{r}
for (n in 1:6){
    cat(n, " ")
    cat(n^2, "\n")
}
```

When you execute this for loop, it prints the following:

```text
1 1
2 4
3 9
4 16
5 25
6 36
```

The strange string ```"\n"``` inserts a new line into the output; without it, all of the numbers would be printed on the same line.
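
As a variation, ```cat()``` accepts several arguments in one call and, by default, puts a space between them. Assuming you still want each number and its square on one line, the two calls can be combined into one. This is a minor rewrite of the loop above, not a different technique, and the spacing of the output may differ slightly.

```{r}
for (n in 1:6) {
    # cat() separates its arguments with a space by default;
    # "\n" ends the line after the square is printed
    cat(n, n^2, "\n")
}
```

If you remove ```"\n"``` from this call, all twelve numbers run together on a single line.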

---

## Example: Converting Many Temperatures

The previous example, while instructive, is not very useful (unless you have a need to know the first six integers and their squares and cannot remember them).

In this example, you will use a for loop to convert many temperatures from Fahrenheit to Celsius using the function you created earlier. You will first create a vector of temperatures, then use a for loop to iterate through each temperature and convert it.

Later, you'll learn an easier and more efficient way to do this. However, this is an excellent example of how a for loop can be used to compute many things.

Create the following script file:

```{r}
f.to.c <- function(TF) {
    (TF - 32) * 5 / 9
}

f.temps <- c(-40, 0, 32, 100, 212, 400)
for (f in f.temps) {
    c <- f.to.c(f)
    cat(f, " ", c, "\n")
}
```

Lines 1-3 define the function ```f.to.c()```; this definition is exactly the same as when you introduced it earlier in this lesson.

Line 5 creates a vector of temperatures in degrees Fahrenheit and assigns this vector to the variable ```f.temps```. These temperatures are the values that will be converted from Fahrenheit to Celsius.

Line 6 starts the for loop.

Lines 7 and 8 are the body of the for loop.

The for loop operates as follows. When Line 6 is executed, the variable ```f``` is assigned the first temperature in the vector ```f.temps```, which is -40.

Then, in Line 7, the ```f.to.c()``` function is called with this value of ```f```; it converts this value to Celsius and assigns it to the variable ```c```.

In Line 8, the Fahrenheit temperature in ```f``` and the Celsius temperature in ```c``` are printed on the same line.

After this is done, the for loop goes back to Line 6 and assigns the next value in ```f.temps``` to ```f```; this value is 0. This value is converted to Celsius and printed in Lines 7 and 8. This process is repeated for all the values in ```f.temps```.

When you click on the Source button, you get the following output:

```text
-40 -40
0 -17.77778
32 0
100 37.77778
212 100
400 204.4444
```

It is interesting to note that -40 degrees Fahrenheit is the same temperature as -40 degrees Celsius.
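
You can confirm this with the ```f.to.c()``` function from the script above. The comments sketch why -40 is the only temperature at which the two scales agree. The variable name ```same.temp``` is introduced here purely for illustration.

```{r}
f.to.c <- function(TF) {
    (TF - 32) * 5 / 9
}

# If the Fahrenheit and Celsius readings are equal, then
#       F = (F - 32) * 5 / 9
#   9 * F = 5 * F - 160
#   4 * F = -160
#       F = -40
same.temp <- f.to.c(-40)
cat(same.temp, "\n")
```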

---

## Summary

In this lesson, you learned how to create variables, perform arithmetic using R like a calculator, make your first vector, use and create functions, and create for loops. All of these basics will serve you well as you move forward to start doing more work with data in R.

* Functions allow us to perform complex computations with a simple function call.
* For loops allow us to repeat computations with different numbers.
* A variable is a named place to store something.
* Variable names should be composed of letters, digits, periods, and underscores.
* An assignment statement creates a variable and stores a number or string in it.
* Strings are sequences of characters enclosed in single quotes (') or double quotes (").
* R can do computations on numbers using arithmetic operators.
* Functions produce a value from their argument(s).
* R implements most mathematical functions that you would need to use.
* R has many functions that are useful in computing statistical values.
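
The points above can be pulled together in one short script. This is only a recap sketch: the names ```radius```, ```greeting```, ```area```, ```root```, ```square()```, and ```sides``` are made up for illustration and do not come from earlier pages.

```{r}
# A variable is a named place to store a value
radius <- 2.5

# A string is a sequence of characters in quotes
greeting <- "Hello, R"

# R can do arithmetic on numbers
area <- pi * radius^2

# A built-in function produces a value from its argument
root <- sqrt(16)

# A function you create yourself
square <- function(x) {
    x^2
}

# A vector, and a for loop that repeats a computation for each value
sides <- c(1, 2, 3)
for (s in sides) {
    cat(s, square(s), "\n")
}
```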

---

## Review
Below is a quiz to review the recently covered material. Quizzes are _not_ graded.

In [9]:
try:
    display(L3P8Q1, L3P8Q2)
except:
    pass

1. What is the purpose of a for loop?

2. What does the command ```"\n"``` do?

<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 9 - Key Terms<a class="anchor" id="DS102L3_page_9"></a>

[Back to Top](#DS102L3_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">


# Key Terms

Below is a list and short description of the important keywords learned in this lesson. Please read through the list and go back to review any concepts you do not fully understand. Great work!

<table class="table table-striped">
    <tr>
        <th>Keyword</th>
        <th>Description</th>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>Variable</td>
        <td>A place to store a piece of information. Must be named with A-Z, a-z, 0-9, underscore, or period.</td>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>Assignment</td>
        <td>Putting a value in a variable. Done with <- , =, or -> . </td>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>Environment Pane</td>
        <td>Area of RStudio where you can see variables and data.</td>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>String</td>
        <td>Character data; denoted with single or double quotes.</td>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>Order of Operations</td>
        <td>The order in which R computes arithmetic: Parentheses, Exponents, Multiplication, Division, Addition, Subtraction. </td>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>Vector</td>
        <td>Ordered collection of numbers. </td>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>Element</td>
        <td>One number in a vector. </td>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>For Loop</td>
        <td>An easy way to repeat a task over a series of data.  Takes the form "for this in that, do something." </td>
    </tr>
</table>

---

# Key Functions in R

<table class="table table-striped">
    <tr>
        <th>Keyword</th>
        <th>Description</th>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>paste()</td>
        <td>Combines two or more strings into one.</td>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>sqrt()</td>
        <td>Takes the square root of a number.</td>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>pnorm()</td>
        <td>Finds the percentile for a z score.</td>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>function()</td>
        <td>Makes a function.</td>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>c()</td>
        <td>Creates a vector of numbers listed in the parentheses. </td>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>1:10</td>
        <td>Makes a sequential vector, starting at the first number and incrementing by one to the last number. </td>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>cat()</td>
        <td>Concatenates its arguments and prints them, without the [1] prefix. </td>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>"\n"</td>
        <td>Adds a line break. </td>
    </tr>
</table>
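
As a quick reference, each entry in the table can be tried in a script like the one below. The values and variable names here (```label```, ```root```, ```half```, ```weights```, ```one.to.ten```) are illustrative assumptions, not part of the lesson's examples.

```{r}
# paste() joins strings, separated by a space by default
label <- paste("ice", "sphere")

# sqrt() takes the square root of a number
root <- sqrt(49)

# pnorm() gives the proportion of a standard normal distribution
# that falls below a z score; a z score of 0 is the midpoint
half <- pnorm(0)

# c() builds a vector from the listed values; 1:10 builds a sequence
weights <- c(2.5, 4, 7.25)
one.to.ten <- 1:10

# cat() prints its arguments; "\n" ends the line
cat(label, root, half, "\n")
```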


<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 10 - Hands-On<a class="anchor" id="DS102L3_page_10"></a>

[Back to Top](#DS102L3_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">



For your Lesson 3 Hands-On, you will be solving the following word problem. This Hands-On **will** be graded, so be sure you complete all requirements. Please complete this Hands-On within an R script file and submit it below when completed. 

<div class="panel panel-success">
    <div class="panel-heading">
        <h3 class="panel-title">Additional Info!</h3>
    </div>
    <div class="panel-body">
        <p>You may want to watch <a href="https://vimeo.com/408696478">this recorded live workshop </a> before beginning the hands-on, which goes over a similar example.</p>
    </div>
</div>

---

## Requirements

Premier restaurants and bars are now serving drinks with ice spheres rather than ice cubes to chill the drink. Suppose you work for an ice vendor, and are tasked with measuring the size of manufactured ice spheres. Rather than measure the diameter directly, you weigh each sphere. You consulted with your company's ice scientist, and they gave you the following formula to convert the ice sphere's weight in grams to its diameter in inches (assuming it is weighed at 0 degrees Fahrenheit and is perfectly spherical); d is the sphere's diameter, and w is the sphere's weight.

---
![d equals two divided by two point five four open parentheses w divided by zero point nine two times four thirds times pi close parentheses to one third power.](Media/L03-Exam.png)

---
Create a function called ```diam()``` that computes the diameter of the sphere from its argument.

Create a for loop that will use your ```diam()``` function to compute the diameters of spheres with the following weights in grams: 0.96, 1.51, 2.17, 3.85, 4.45, and 6.02.
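
One detail is worth checking before you translate the formula into R: the exponent operator is applied before division, so a fractional power needs its own parentheses. The sketch below uses the number 8 only as an illustration; it is not part of the Hands-On data, and the variable names are made up for this example.

```{r}
# Without parentheses, R computes (8^1) / 3
no.parens <- 8^1/3

# With parentheses, R raises 8 to the one-third power (the cube root)
cube.root <- 8^(1/3)

cat(no.parens, cube.root, "\n")
```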

<div class="panel panel-danger">
    <div class="panel-heading">
        <h3 class="panel-title">Caution!</h3>
    </div>
    <div class="panel-body">
        <p>Be sure to zip and submit your entire document when finished!</p>
    </div>
</div>

<div class="panel panel-info">
    <div class="panel-heading">
        <h3 class="panel-title">Tip!</h3>
    </div>
    <div class="panel-body">
        <p>To zip your file on <b>Windows</b>, right click on the file and select "Send to", then select "Compressed (zipped) folder". For <b>Mac</b> users, right click on the file and select "Compress", then select your file from the options.</p>
    </div>
</div>