
# Using R
## Introduction

In this lab we will learn how to:

 * Create variables
 * Use basic arithmetic
 * Import 'csv' data files
 * Create basic data visualizations

## Creating Variables in R

A variable is a named value (or set of values) saved in the workspace. At the simplest level, a variable consists of a name and the value or values it holds. Once a variable is created, it appears in the environment window (the third orange box).

In R you can create variables using either the '<-' or the '=' symbol.

Below is an example, which creates a variable called 'x' equaling the sum of 4 and 5 and a variable called 'y' equaling the product of 4 and 5.

To run the following code block, click on it and then press Ctrl+Enter.

In [None]:
x <- 4 + 5 # Creating the x value
y = 4 * 5  # Creating the y value
# Printing the values
cat("The value of x is: ",x, "\n") # '\n' is the same as pressing enter to go to the next line
cat("The value of y is: ",y, "\n")

Above you should see the printed values of x and y, where x is the sum of 4 and 5 and y is the product of 4 and 5.

In R you can also create vectors. Below are two common ways of creating vectors manually. The first command (setting the variable x) "stores" a vector of numbers in the variable x. The second command creates a sequence of numbers that starts at 0 and ends at 10 in increments of 1.

Try editing the code so that the y sequence runs from 0 to 5 in increments of 0.1. To check your change, look at the output of the print statements in the code.

In [None]:
# Creates a vector of numbers
x <- c(1,5,10,20,40)
# Creates a sequence of numbers from 0 to 10 in steps of 1
y <- seq(0,10,1)
# Printing the values
cat("The value of x is: ",x, "\n")
cat("The value of y is: ",y, "\n")

To select certain values in a vector, put the index of the value inside square brackets after the variable name. For example, x[1] selects the first value in x, and x[-1] selects everything but the first value.
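As a small sketch of these indexing rules, the cell below uses a hypothetical vector `v` (not part of the lab data), so the exercise that follows is still yours to complete. Note that R counts positions from 1, not 0.

```r
# Hypothetical vector, used only for illustration
v <- c(2, 4, 6, 8, 10)

first_value      <- v[1]        # the first value
all_but_first    <- v[-1]       # everything except the first value
second_to_fourth <- v[2:4]      # a range of positions
first_and_fifth  <- v[c(1, 5)]  # several positions at once

cat("First value:", first_value, "\n")
cat("All but the first:", all_but_first, "\n")
cat("Second to fourth:", second_to_fourth, "\n")
cat("First and fifth:", first_and_fifth, "\n")
```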

In [None]:
### Try printing the third value of x

### Try printing the fifth value of y


## Basic Arithmetic in R

The basic arithmetic operations are shown below. Take careful note of how arithmetic on the vectors is calculated.

In [None]:
3 + 5.5 ## Addition
3 - 5.5 ## Subtraction
3 * 5.5 ## Multiplication
3 / 5.5 ## Division
x*5
y+10
x+y

The last line produces the warning message 'longer object length is not a multiple of shorter object length'. This means that the length of the longer vector (y) is not a whole multiple of the length of the shorter vector (x).

Can you work out how R has produced the sum of x and y if they are not of the same length?
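If you want to check your answer, one way is to experiment with two short vectors whose sum can be worked out by hand. The sketch below uses hypothetical vectors `a` and `b` (not part of the lab) and compares R's result with a sum in which the shorter vector is repeated explicitly using `rep`.

```r
# Hypothetical vectors, used only for illustration
a <- c(1, 2)            # length 2
b <- c(10, 20, 30, 40)  # length 4

# R reuses (recycles) the shorter vector: a is treated as 1, 2, 1, 2
recycled_sum <- b + a

# The same calculation with the repetition written out explicitly
manual_sum <- b + rep(a, length.out = length(b))

print(recycled_sum)
print(identical(recycled_sum, manual_sum))
```

Here the length of `b` is a whole multiple of the length of `a`, so no warning appears. When it is not, R still repeats the shorter vector but stops partway through a repetition, which is what the warning reports.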

### Exercise

***

Below create 3 variables:

  * a variable called 'g' that is equal to a sequence of numbers from 10 to 20.
  * a variable called 'h' that is equal to the product of 3 and 25.
  * a variable called 'i' that is set to the string "Potato".

To check, use the 'cat' function to print each variable name and the values that the variable holds.

In [None]:
## Enter the answer


## Basic Math Functions in R

Above we looked at basic arithmetic operations in R. Now we will look at the basic math functions you can also use in R:

* *log(x)*: Natural log of x
* *exp(x)*: Exponential of x
* *max(x)*: Largest element
* *min(x)*: Smallest element
* *sum(x)*: Sum all values in x
* *mean(x)*: Find the mean of x
* *median(x)*: Find the median of x
* *round(x,n)*: Round x to n decimal places
* *length(x)*: Number of values in x
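As a quick illustration of how these functions are called, the sketch below applies them to a hypothetical vector `w` (invented for this example, not the lab's `z`), so the exercise that follows is left for you.

```r
# Hypothetical values, used only for illustration
w <- c(1.5, 2.25, 10, 4)

log_first <- log(w[1])        # natural log of a single element
exp_all   <- exp(w)           # exponential of every element
largest   <- max(w)
smallest  <- min(w)
total     <- sum(w)
average   <- mean(w)
middle    <- median(w)
rounded   <- round(average, 2)  # mean rounded to 2 decimal places
n_values  <- length(w)

cat("Largest:", largest, " Smallest:", smallest, "\n")
cat("Sum:", total, " Mean:", average, " Median:", middle, "\n")
cat("Rounded mean:", rounded, " Number of values:", n_values, "\n")
```

Functions such as `log` and `exp` work element by element and return a vector of the same length, while `max`, `sum`, `mean` and the like reduce the whole vector to a single value.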

Using the list of values in variable 'z' find:

* The log of the third element
* Maximum of z
* Mean of z
* Median of z
* Exponential of all elements

In [None]:
z <- c(6.0780998, 8.8926461, 2.3485093, 2.8511061, 5.7849268, 0.3145299, 1.2683729, 8.4636561, 8.9633093, 8.4354148, 9.0497666, 5.9046839, 4.1272612, -4.7378608, -1.3372997, 2.9143154, 9.0774442, 7.1303078, 9.9144950, 10.1626416)

## Calculate the log of the third value

## Maximum of z

## Mean of z

## Median of z

## Exponential of all elements


## Reading in csv (data) files

It is very common to read in data from a separate file on the computer, for example a .csv (comma-separated values) file. To do this, use the 'read.csv' function. Make sure that this notebook is saved in the same location as the data file 'perth_mean_max_temp.csv'.

In [None]:
temp <- read.csv("perth_mean_max_temp.csv")
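If `read.csv` reports that it cannot open the file, it can help to check whether R can see the file at all. The sketch below is self-contained: it writes a small hypothetical CSV to a temporary file (the values are invented for illustration and are not the Perth data), confirms the file exists, and reads it back.

```r
# Hypothetical example data, written to a temporary file for illustration
csv_path <- tempfile(fileext = ".csv")
writeLines(c("month,temp",
             "Jan,31.2",
             "Feb,31.7",
             "Mar,29.6"), csv_path)

file_found   <- file.exists(csv_path)  # TRUE if R can see the file
example_data <- read.csv(csv_path)     # the first line supplies the column names

print(file_found)
print(dim(example_data))               # number of rows, then number of columns
```

For the lab file, `file.exists("perth_mean_max_temp.csv")` and `getwd()` give the same kind of check against the current working directory.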

Now, to see what is in the data file, we will use the head function to print the first few observations (rows) of temp.

In [None]:
head(temp)

The '$' symbol can be used to select one variable (column) from the temp data frame in R:

In [None]:
temp$temp[1:10] #Only printing the first 10 values

Can you change the above statement to print the first 20 values of the month variable?

Below we will explore some basic plots of the data. Before plotting, consider: what is the data type of each of the variables?
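One way to inspect data types is with `class` and `str`. The sketch below uses a small hypothetical data frame, `example_df`, with invented values; the same calls can be applied to `temp`.

```r
# Hypothetical data frame, used only for illustration
example_df <- data.frame(
  month = c("Jan", "Feb", "Mar"),
  temp  = c(31.2, 31.7, 29.6),
  stringsAsFactors = FALSE  # keep text columns as character in all R versions
)

# class() applied to each column gives one type per variable
column_types <- sapply(example_df, class)
print(column_types)

# str() prints a compact overview: dimensions, column types and first values
str(example_df)
```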

First, we will look at a histogram of the temperature values. To create this histogram we will use the package 'ggplot2', which has to be loaded into the workspace.

In [None]:
library(ggplot2)

Using the ggplot function and geom_histogram together, you can create a histogram of the temp values. Try playing around with the bins argument to change the appearance of the histogram. What happens when you increase the bins value? What happens when you decrease it?

In [None]:
ggplot(data = temp, mapping = aes(x = temp)) + geom_histogram(bins = 2)

The ggplot function also allows you to change the appearance of the histogram, for example changing the labels, adding a title and changing the colours.

In [None]:
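# after_stat(density) puts density, not count, on the y axis.
# Recent ggplot2 versions use after_stat() and linewidth in place of stat() and size for lines.
ggplot(data = temp, mapping = aes(x = temp, y = after_stat(density))) +
  geom_histogram(binwidth = 2, col = "black", fill = "red") +
  # Dashed vertical line at the mean temperature
  geom_vline(aes(xintercept = mean(temp)), color = "blue", linewidth = 1, linetype = "dashed") +
  labs(title = "Histogram Example", x = "Temperature")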


What if we wanted to compare the temperature values for each month? Adding facet_wrap draws a separate histogram for each month:

In [None]:
ggplot(data = temp, mapping = aes(x = temp)) +
  geom_histogram(binwidth = 2, col = "black", fill = "grey") +
  labs(x = "Temperature") +
  facet_wrap(~month)

## Exercise

Using the iris data set (the first 6 rows are shown below), try creating different visualizations of the data and numerical summaries.

In [None]:
head(iris)
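As one possible starting point for the numerical summaries (a sketch, not the only approach), the cell below uses base R functions on the built-in iris data set. The choice of `Sepal.Length` is arbitrary; any of the numeric columns could be used instead.

```r
# iris ships with R, so no file needs to be read in

# Five-number summary plus the mean for one numeric column
sepal_summary <- summary(iris$Sepal.Length)
print(sepal_summary)

# Mean sepal length within each species
species_means <- aggregate(Sepal.Length ~ Species, data = iris, FUN = mean)
print(species_means)

# Number of observations per species
species_counts <- table(iris$Species)
print(species_counts)
```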