<div class="alert alert-warning"><strong>CFT Data Science With R Tutorials</strong><br>By: Seiya David 
 </div>

 <div class="alert alert-success"><strong>1.0 R dplyr package</strong>
 </div>
 Now that we have seen how to slice datasets using base R, we will look at a very useful package called dplyr.<br>
dplyr is a package for querying and transforming datasets quickly and readably. It provides five core data manipulation functions (often called the dplyr grammar of data manipulation):
<ul>
<li>select()     - extracts columns from a dataset</li>
<li>mutate()     - builds new columns in the dataset</li>
<li>filter()     - extracts rows from a dataset</li>
<li>arrange()    - re-orders rows in a dataset</li>
<li>summarize()  - calculates summary statistics</li>
</ul>
Two further tools are used together with these verbs:
<ul>
<li>%>%: the "pipe" operator connects multiple verb actions together into a pipeline</li>
<li>group_by(): groups rows so that summaries are computed per group</li>
</ul>
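
To make the idea of a pipeline concrete, the sketch below writes the same two steps once as nested calls and once with the pipe. The `scores` data frame is a hypothetical toy table invented for this illustration; it is not part of the tutorial data.

```r
library(dplyr)

# Hypothetical toy data, used only for illustration
scores <- data.frame(name = c("A", "B", "C"), score = c(70, 30, 18))

# Nested calls: read from the inside out
nested <- arrange(filter(scores, score > 20), score)

# The same steps as a pipeline: read from top to bottom
piped <- scores %>%
  filter(score > 20) %>%
  arrange(score)

# Both forms produce the same data frame
identical(nested, piped)
```

The pipe passes the result on its left as the first argument of the call on its right, which is why the data frame is not repeated inside filter() and arrange().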

 <div class="alert alert-success"><strong>1.1 logical operators</strong>
 </div>
 <ul>
<li>x < y, TRUE if x is less than y</li>
<li>x <= y, TRUE if x is less than or equal to y</li>
<li>x == y, TRUE if x equals y</li>
<li>x != y, TRUE if x does not equal y</li>
<li>x >= y, TRUE if x is greater than or equal to y</li>
<li>x > y, TRUE if x is greater than y</li>
<li>x %in% c(a, b, c), TRUE if x is in the vector c(a, b, c)</li>
 </ul>

<strong>Boolean operators</strong>
 <ul>
<li>& (and), | (or), and ! (not)</li>
 </ul>
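
The operators listed above can be tried directly on a vector. This is a small sketch in base R; the `ages` vector is a hypothetical example, not taken from the tutorial data.

```r
ages <- c(12, 19, 35)          # hypothetical values, for illustration only

at_least_19 <- ages >= 19              # comparison, evaluated element by element
in_set      <- ages %in% c(12, 35)     # membership test against a set of values
teen_range  <- ages > 10 & ages < 20   # both conditions must hold
child_or_30 <- ages < 13 | ages > 30   # at least one condition must hold
not_19      <- !(ages == 19)           # negation

at_least_19
teen_range
```

Each expression returns a logical vector with one TRUE or FALSE per element, which is the form that filter() expects for its conditions.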

 <div class="alert alert-success"><strong>1.2 select()</strong>
 </div>
The select() function can be used to select fields/columns of a data frame.

<strong>dplyr</strong> also comes with a set of helper functions that can assist in selecting groups of variables inside the select() function.
<ul>
<li>starts_with("X"): every name that starts with "X"</li>
<li>ends_with("X"): every name that ends with "X"</li>
<li>contains("X"): every name that contains "X"</li>
<li>matches("X"): every name that matches "X", where "X" can be a regular expression</li>
<li>num_range("x", 1:5): the variables named x1, x2, x3, x4 and x5 (pass width = 2 to match x01 to x05)</li>
<li>one_of(x): every name that appears in x, which should be a character vector (newer dplyr releases recommend any_of() or all_of() instead)</li>
</ul>

In [5]:
# Load the dplyr package
library(dplyr)

# import dataset required for this session
txtData <- read.table("D:/CFT DataScienceHub/makeOver50K.txt",sep = ",", 
                     header = FALSE)

# insert column names to the imported table
colnames(txtData) <- c("Age","workclass","fnlwgt","education",
                       "education-num","marital-status",
                       "occupation","relationship",
                       "race","sex","capital-gain",
                       "capital-loss","hours-per-week",
                       "native-country","eaning")
head(txtData, 3)

Age,workclass,fnlwgt,education,education-num,marital-status,occupation,relationship,race,sex,capital-gain,capital-loss,hours-per-week,native-country,eaning
39,State-gov,77516,Bachelors,13,Never-married,Adm-clerical,Not-in-family,White,Male,2174,0,40,United-States,<=50K
50,Self-emp-not-inc,83311,Bachelors,13,Married-civ-spouse,Exec-managerial,Husband,White,Male,0,0,13,United-States,<=50K
38,Private,215646,HS-grad,9,Divorced,Handlers-cleaners,Not-in-family,White,Male,0,0,40,United-States,<=50K


In [8]:
# Select a subset of columns
age_race_sex <- select(txtData, c(Age, race, sex))

In [7]:
# To omit fields, use the code snippet below.
# The code indicates that we should include every column except Age, race and sex
head(select(txtData, -c(Age, race, sex)),4)

workclass,fnlwgt,education,education-num,marital-status,occupation,relationship,capital-gain,capital-loss,hours-per-week,native-country,eaning
State-gov,77516,Bachelors,13,Never-married,Adm-clerical,Not-in-family,2174,0,40,United-States,<=50K
Self-emp-not-inc,83311,Bachelors,13,Married-civ-spouse,Exec-managerial,Husband,0,0,13,United-States,<=50K
Private,215646,HS-grad,9,Divorced,Handlers-cleaners,Not-in-family,0,0,40,United-States,<=50K
Private,234721,11th,7,Married-civ-spouse,Handlers-cleaners,Husband,0,0,40,United-States,<=50K


In [11]:
# The select() function also has helper functions that let you pick columns
# whose names match a pattern
columns_ending_pattern <- select(txtData, ends_with("p"))
head(columns_ending_pattern)

relationship
Not-in-family
Husband
Not-in-family
Husband
Wife
Wife


In [13]:
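# Select every column whose name starts with "c"
# (named capital_columns so that it does not mask the base R subset() function)
capital_columns <- select(txtData, starts_with("c"))
head(capital_columns)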

capital-gain,capital-loss
2174,0
0,0
0,0
0,0
0,0
0,0
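
The remaining helpers work the same way. The sketch below applies three of them to a hypothetical `survey` data frame (its column names are invented for this illustration) and prints only the names of the selected columns.

```r
library(dplyr)

# Hypothetical survey table, used only to illustrate the helpers
survey <- data.frame(id = 1:2,
                     q1 = c(3, 4),
                     q2 = c(5, 1),
                     q10 = c(2, 2),
                     notes_text = c("ok", "late"))

text_cols  <- names(select(survey, contains("text")))      # names containing "text"
digit_cols <- names(select(survey, matches("^q[0-9]$")))   # regular expression: q plus one digit
range_cols <- names(select(survey, num_range("q", 1:2)))   # q1 and q2

text_cols
digit_cols
range_cols
```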


<div class="alert alert-success"><strong>1.3 mutate()</strong>
 </div>
The mutate() function creates a new column (variable) or overwrites an existing one.
mutate() takes the data frame as its first argument, followed by one or more name = expression pairs that define the new columns.

In [14]:
# Let's use this toy dataset to make the concept clear
sample_data <- data.frame(school = c("Goodwind","High_grade","High_grade","Windrush"),
                        senior_students_age = c(20, 23, 21, 22), 
                        junior_students_age = c(15,14, 12,NA))
sample_data

school,senior_students_age,junior_students_age
Goodwind,20,15.0
High_grade,23,14.0
High_grade,21,12.0
Windrush,22,


In [15]:
new_sample_data <- mutate(sample_data, age_difference = senior_students_age - junior_students_age)
new_sample_data

school,senior_students_age,junior_students_age,age_difference
Goodwind,20,15.0,5.0
High_grade,23,14.0,9.0
High_grade,21,12.0,9.0
Windrush,22,,
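
The empty age_difference for Windrush comes from the missing junior age: arithmetic with NA returns NA. As a small sketch, the call below creates two columns in one mutate() and flags the rows where the junior age is known. The column name `has_junior_age` is a hypothetical name chosen for this example.

```r
library(dplyr)

sample_data <- data.frame(school = c("Goodwind", "High_grade", "High_grade", "Windrush"),
                          senior_students_age = c(20, 23, 21, 22),
                          junior_students_age = c(15, 14, 12, NA))

# Several name = expression pairs can be passed to a single mutate() call
result <- mutate(sample_data,
                 age_difference = senior_students_age - junior_students_age,
                 has_junior_age = !is.na(junior_students_age))
result
```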


<div class="alert alert-success"><strong>1.4 filter()</strong>
 </div>
The filter() function is used to extract subsets of rows from a data frame. This function is similar
to the existing subset() function in R but is quite a bit faster in my experience.

In [17]:
filter(sample_data, school == "Windrush")

school,senior_students_age,junior_students_age
Windrush,22,


<div class="alert alert-danger"><i class="icon-attention-alt"></i>**Try it out!**<br>Return only the records of people with a Bachelors degree using filter().<br></div>

In [20]:
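# Re-import the dataset, this time with strip.white = TRUE so that leading and
# trailing whitespace is trimmed from each field and exact comparisons such as
# education == "Bachelors" can match
txtData <- read.table("D:/CFT DataScienceHub/makeOver50K.txt",sep = ",", 
                     header = FALSE,strip.white = TRUE)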

# insert column names to the imported table
colnames(txtData) <- c("Age","workclass","fnlwgt","education",
                       "education-num","marital-status",
                       "occupation","relationship",
                       "race","sex","capital-gain",
                       "capital-loss","hours-per-week",
                       "native-country","eaning")
head(txtData, 3)

Age,workclass,fnlwgt,education,education-num,marital-status,occupation,relationship,race,sex,capital-gain,capital-loss,hours-per-week,native-country,eaning
39,State-gov,77516,Bachelors,13,Never-married,Adm-clerical,Not-in-family,White,Male,2174,0,40,United-States,<=50K
50,Self-emp-not-inc,83311,Bachelors,13,Married-civ-spouse,Exec-managerial,Husband,White,Male,0,0,13,United-States,<=50K
38,Private,215646,HS-grad,9,Divorced,Handlers-cleaners,Not-in-family,White,Male,0,0,40,United-States,<=50K


In [21]:
filter(txtData, education == "Bachelors")

Age,workclass,fnlwgt,education,education-num,marital-status,occupation,relationship,race,sex,capital-gain,capital-loss,hours-per-week,native-country,eaning
39,State-gov,77516,Bachelors,13,Never-married,Adm-clerical,Not-in-family,White,Male,2174,0,40,United-States,<=50K
50,Self-emp-not-inc,83311,Bachelors,13,Married-civ-spouse,Exec-managerial,Husband,White,Male,0,0,13,United-States,<=50K
28,Private,338409,Bachelors,13,Married-civ-spouse,Prof-specialty,Wife,Black,Female,0,0,40,Cuba,<=50K
42,Private,159449,Bachelors,13,Married-civ-spouse,Exec-managerial,Husband,White,Male,5178,0,40,United-States,>50K
30,State-gov,141297,Bachelors,13,Married-civ-spouse,Prof-specialty,Husband,Asian-Pac-Islander,Male,0,0,40,India,>50K
23,Private,122272,Bachelors,13,Never-married,Adm-clerical,Own-child,White,Female,0,0,30,United-States,<=50K
56,Local-gov,216851,Bachelors,13,Married-civ-spouse,Tech-support,Husband,White,Male,0,0,40,United-States,>50K
45,Private,386940,Bachelors,13,Divorced,Exec-managerial,Own-child,White,Male,0,1408,40,United-States,<=50K
53,Self-emp-not-inc,88506,Bachelors,13,Married-civ-spouse,Prof-specialty,Husband,White,Male,0,0,40,United-States,<=50K
24,Private,172987,Bachelors,13,Married-civ-spouse,Tech-support,Husband,White,Male,0,0,50,United-States,<=50K


<div class="alert alert-danger"><i class="icon-attention-alt"></i>**Try it out!**<br>Return only the records of people with education-num greater than 13 using filter().<br></div>

In [22]:
filter(txtData, `education-num` > 13)

Age,workclass,fnlwgt,education,education-num,marital-status,occupation,relationship,race,sex,capital-gain,capital-loss,hours-per-week,native-country,eaning
37,Private,284582,Masters,14,Married-civ-spouse,Exec-managerial,Wife,White,Female,0,0,40,United-States,<=50K
31,Private,45781,Masters,14,Never-married,Prof-specialty,Not-in-family,White,Female,14084,0,50,United-States,>50K
43,Self-emp-not-inc,292175,Masters,14,Divorced,Exec-managerial,Unmarried,White,Female,0,0,45,United-States,>50K
40,Private,193524,Doctorate,16,Married-civ-spouse,Prof-specialty,Husband,White,Male,0,0,60,United-States,>50K
44,Private,128354,Masters,14,Divorced,Exec-managerial,Unmarried,White,Female,0,0,40,United-States,<=50K
47,Private,51835,Prof-school,15,Married-civ-spouse,Prof-specialty,Wife,White,Female,0,1902,60,Honduras,>50K
42,Private,116632,Doctorate,16,Married-civ-spouse,Prof-specialty,Husband,White,Male,0,0,45,United-States,>50K
33,Private,202051,Masters,14,Married-civ-spouse,Prof-specialty,Husband,White,Male,0,0,50,United-States,<=50K
43,Federal-gov,410867,Doctorate,16,Never-married,Prof-specialty,Not-in-family,White,Female,0,0,50,United-States,>50K
48,Self-emp-not-inc,191277,Doctorate,16,Married-civ-spouse,Prof-specialty,Husband,White,Male,0,1902,60,United-States,>50K


<div class="alert alert-danger"><i class="icon-attention-alt"></i>**Try it out!**<br>Return only the records of people with education-num greater than or equal to 13 using filter().<br></div>

<div class="alert alert-danger"><i class="icon-attention-alt"></i>**Try it out!**<br>Return only the records of people with education-num equal to 9 or 13 using filter().<br></div>

In [24]:
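# Pitfall: == recycles c(9, 13) down the rows, so matches are dropped (see the warning; the first record, Age 39 with education-num 13, is missing from the table below)
filter(txtData, `education-num` == c(9,13))
# Correct: compare against each value, or use `education-num` %in% c(9, 13)
filter(txtData, `education-num` == 9 | `education-num` == 13)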

"longer object length is not a multiple of shorter object length"

Age,workclass,fnlwgt,education,education-num,marital-status,occupation,relationship,race,sex,capital-gain,capital-loss,hours-per-week,native-country,eaning
50,Self-emp-not-inc,83311,Bachelors,13,Married-civ-spouse,Exec-managerial,Husband,White,Male,0,0,13,United-States,<=50K
38,Private,215646,HS-grad,9,Divorced,Handlers-cleaners,Not-in-family,White,Male,0,0,40,United-States,<=50K
42,Private,159449,Bachelors,13,Married-civ-spouse,Exec-managerial,Husband,White,Male,5178,0,40,United-States,>50K
30,State-gov,141297,Bachelors,13,Married-civ-spouse,Prof-specialty,Husband,Asian-Pac-Islander,Male,0,0,40,India,>50K
25,Self-emp-not-inc,176756,HS-grad,9,Never-married,Farming-fishing,Own-child,White,Male,0,0,35,United-States,<=50K
59,Private,109015,HS-grad,9,Divorced,Tech-support,Unmarried,White,Female,0,0,40,United-States,<=50K
56,Local-gov,216851,Bachelors,13,Married-civ-spouse,Tech-support,Husband,White,Male,0,0,40,United-States,>50K
19,Private,168294,HS-grad,9,Never-married,Craft-repair,Own-child,White,Male,0,0,40,United-States,<=50K
39,Private,367260,HS-grad,9,Divorced,Exec-managerial,Not-in-family,White,Male,0,0,80,United-States,<=50K
53,Self-emp-not-inc,88506,Bachelors,13,Married-civ-spouse,Prof-specialty,Husband,White,Male,0,0,40,United-States,<=50K
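
The difference between == and %in% against a vector of values can be seen on a plain vector. This is a sketch with hypothetical values standing in for the education-num column.

```r
education_num <- c(13, 13, 9, 7, 13, 9)   # hypothetical values, for illustration only

# == recycles c(9, 13) along the vector:
# 13 == 9, 13 == 13, 9 == 9, 7 == 13, 13 == 9, 9 == 13
recycled <- education_num == c(9, 13)

# %in% checks every value against the whole set
member <- education_num %in% c(9, 13)

sum(recycled)   # number of positions matched by the recycled comparison
sum(member)     # number of values that are 9 or 13
```

Because this vector has an even length, the recycled comparison does not even raise a warning, which makes the mistake easy to miss.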


<div class="alert alert-danger"><i class="icon-attention-alt"></i>**Try it out!**<br>Return only the records of people with education-num equal to 9, sex as Female and race as Black using filter().<br></div>

In [26]:
filter(txtData, `education-num` == 9 & sex == "Female" & race == "Black")

Age,workclass,fnlwgt,education,education-num,marital-status,occupation,relationship,race,sex,capital-gain,capital-loss,hours-per-week,native-country,eaning
54,Private,302146,HS-grad,9,Separated,Other-service,Unmarried,Black,Female,0,0,20,United-States,<=50K
31,Private,185814,HS-grad,9,Never-married,Transport-moving,Unmarried,Black,Female,0,0,30,United-States,<=50K
41,Private,130408,HS-grad,9,Divorced,Sales,Unmarried,Black,Female,0,0,38,United-States,<=50K
24,Private,241951,HS-grad,9,Never-married,Handlers-cleaners,Unmarried,Black,Female,0,0,40,United-States,<=50K
44,Private,290521,HS-grad,9,Widowed,Exec-managerial,Unmarried,Black,Female,0,0,40,United-States,<=50K
19,Private,206399,HS-grad,9,Never-married,Machine-op-inspct,Own-child,Black,Female,0,0,40,United-States,<=50K
47,Local-gov,543162,HS-grad,9,Separated,Adm-clerical,Unmarried,Black,Female,0,0,40,United-States,<=50K
34,?,190027,HS-grad,9,Never-married,?,Unmarried,Black,Female,0,0,40,United-States,<=50K
35,Private,229328,HS-grad,9,Married-civ-spouse,Machine-op-inspct,Wife,Black,Female,0,0,40,United-States,<=50K
24,Private,82804,HS-grad,9,Never-married,Handlers-cleaners,Unmarried,Black,Female,0,0,40,United-States,<=50K


<div class="alert alert-success"><strong>1.5 arrange()</strong>
 </div>
The arrange() function takes two main arguments: the name of the data frame, and the column(s) to sort the rows by.
Note: Results are returned in ascending order by default; wrap a column in desc() to sort in descending order.

In [27]:
sample_data

school,senior_students_age,junior_students_age
Goodwind,20,15.0
High_grade,23,14.0
High_grade,21,12.0
Windrush,22,


In [28]:
arrange(sample_data, senior_students_age)

school,senior_students_age,junior_students_age
Goodwind,20,15.0
High_grade,21,12.0
Windrush,22,
High_grade,23,14.0


In [29]:
arrange(sample_data, desc(senior_students_age))

school,senior_students_age,junior_students_age
High_grade,23,14.0
Windrush,22,
High_grade,21,12.0
Goodwind,20,15.0
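
arrange() also accepts more than one column; later columns are used to break ties in the earlier ones. A minimal sketch on the same toy data, sorting by school first and then by senior age from highest to lowest within each school:

```r
library(dplyr)

sample_data <- data.frame(school = c("Goodwind", "High_grade", "High_grade", "Windrush"),
                          senior_students_age = c(20, 23, 21, 22),
                          junior_students_age = c(15, 14, 12, NA))

# Sort by school (ascending), then by senior_students_age (descending) within each school
sorted <- arrange(sample_data, school, desc(senior_students_age))
sorted
```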


dplyr provides several helpful aggregate functions for use inside summarize():
<ul>
<li>first(x) - the first element of vector x</li>
<li>last(x) - the last element of vector x</li>
<li>nth(x, n) - the nth element of vector x</li>
<li>n() - the number of rows in the data frame or group of observations that summarize() describes</li>
<li>n_distinct(x) - the number of unique values in vector x</li>
</ul>

In [30]:
# Generate summarizing statistics
summarize(sample_data, 
          number_records = n(), 
          distinct_school_count = n_distinct(school))

number_records,distinct_school_count
4,3


In [31]:
High_grade_Analysis <- filter(sample_data, school == "High_grade")

# Generate summarizing statistics
summarize(High_grade_Analysis, 
          number_records = n(), 
          senior_students_age_sum = sum(senior_students_age),
          senior_students_mean_age = mean(senior_students_age))

number_records,senior_students_age_sum,senior_students_mean_age
2,44,22


<div class="alert alert-danger"></i><i class="icon-attention-alt"></i>**Try it out!**<br>Run the code above but this time replace senior_students_age with junior_students_age.<br>

In [34]:
# group_by(): compute the summaries separately for each workclass

txtData %>%
   group_by(workclass) %>%
   summarize(
     Age = mean(Age, na.rm = TRUE), 
     hrs_workedper_week = mean(`hours-per-week`, na.rm = TRUE)
   ) %>%
   arrange(Age, hrs_workedper_week)

workclass,Age,hrs_workedper_week
Never-worked,20.57143,28.42857
Private,36.79759,40.2671
State-gov,39.43606,39.03159
?,40.96024,31.91939
Local-gov,41.75108,40.9828
Federal-gov,42.59063,41.37917
Self-emp-not-inc,44.9697,44.42188
Self-emp-inc,46.01703,48.8181
Without-pay,47.78571,32.71429
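
The same group_by() and summarize() pattern can be tried without the census file. The sketch below uses the toy sample_data from earlier; the summary column names are hypothetical names chosen for this example.

```r
library(dplyr)

sample_data <- data.frame(school = c("Goodwind", "High_grade", "High_grade", "Windrush"),
                          senior_students_age = c(20, 23, 21, 22),
                          junior_students_age = c(15, 14, 12, NA))

# One summary row per school
by_school <- sample_data %>%
  group_by(school) %>%
  summarize(n_records = n(),
            mean_senior_age = mean(senior_students_age),
            mean_junior_age = mean(junior_students_age, na.rm = TRUE))
by_school
```

Note that na.rm = TRUE removes the missing junior age before averaging; a group with no remaining values, such as Windrush here, gets NaN rather than a number.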


In [None]:
# This code is optional, as a configured database is required to run it.

# Connecting to a database
library(RODBC) # this package is required for connecting to the database
library(dplyr)

# The first argument is the name given to your ODBC data source (DSN),
# followed by your database user id and password
con <- odbcConnect("CFTdemo", uid = "seiya", pwd="")

# Query the database using native SQL, then summarise the result with dplyr
sqlQuery(con, "select * from Employees") %>%
  group_by(Gender) %>%
  summarize(n_Gender = n(), avg_salary = mean(Salary, na.rm = TRUE)) %>%
  arrange(avg_salary)

# Close the connection when finished
odbcClose(con)

<div class="alert alert-success"><strong>1.0 Control structures</strong><br><br>Control structures let you control the flow of execution of a series of R expressions.<br>
Commonly used control structures in R are:
<ul>
    <li>if and else: test a condition and act on it</li>
    <li>for: execute a loop a fixed number of times</li>
    <li>while: execute a block of code repeatedly for as long as a condition is true</li>
    <li>break: exit a loop immediately</li>
    <li>next: skip the rest of the current iteration of a loop</li>
    </ul>
 </div>

<div class="alert alert-success"><strong>1.1 if-else</strong><br>
The if-else control structure is one of the most commonly used in programming. It tests whether a condition is TRUE or FALSE and acts on the result accordingly. The general format of an if and else statement, and of an if, else if and else statement, is given below.
</div>

In [None]:
# if and else statement
if(<condition>) {
    ## do something
} else {
    ## do something else
}

In [3]:
# Example of if and else statement
x <- 201
if (x %% 2 == 0) {
    print(paste("Number", x, "is even")) ## runs when x is divisible by 2
} else {
    print(paste("Number", x, "is Not even")) ## runs otherwise
}

[1] "Number 201 is Not even"


<div class="alert alert-info"><i class="icon-lightbulb"></i><strong>Discussion</strong><br>
We just used a function called paste above. What would our output be if we replaced paste with paste0?</div>
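
One way to explore the question is to build both strings side by side. This is a small sketch in base R; the variable names `with_paste` and `with_paste0` are hypothetical names chosen for the comparison.

```r
x <- 201

with_paste  <- paste("Number", x, "is Not even")    # pieces joined with a single space
with_paste0 <- paste0("Number", x, "is Not even")   # pieces joined with no separator

print(with_paste)
print(with_paste0)
```

paste() inserts a separator (a space by default, changeable with the sep argument) between its arguments, while paste0() joins them directly.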

In [None]:
# General format of an if, else if and else conditional statement
if(<condition1>) {
    ## do something
} else if(<condition2>) {
    ## do something different
} else {
    ## do something very different
}

In [6]:
# Example of an if, else if and else statement
Age <- as.integer(readline(prompt="Enter your age: ")) # prompt for text input
if (Age < 13) {
    print(paste0("A ", Age, "-year-old is a child")) ## Age below 13
} else if (Age < 20) {
    print(paste0("A ", Age, "-year-old is a teenager")) ## Age from 13 to 19
} else {
    print(paste0("A ", Age, "-year-old is an adult")) ## Age 20 or above
}

Enter your age: 40
[1] "A 40-year-old is an adult"


<div class="alert alert-success"><strong>1.2 for loops</strong><br>
for loops are commonly used for iterating over the elements of an object. A for loop is made up of three parts: the sequence, the body and the output.
</div>

In [7]:
# Example of for loop
x <- c("a", "b", "c", "d")
for(i in x) {
    ## Print out each element of 'x'
    print(i)
}

[1] "a"
[1] "b"
[1] "c"
[1] "d"


In [9]:
list_of_numbers <- list(2,4,7,8,9,10)
list_of_numbers[3]   # single brackets return a list of length one; use [[3]] for the element itself

In [17]:
# Try it: loop over a list whose elements have different types and lengths
listx <- list(c('CFT', 'Data', 'Science'), FALSE, TRUE, FALSE, c(2,8,5))
for (i in seq_along(listx)){
    print(listx[[i]])
}

[1] "CFT"     "Data"    "Science"
[1] FALSE
[1] TRUE
[1] FALSE
[1] 2 8 5


In [4]:
length(list_of_numbers)

In [7]:
# Example of a for loop with if-else statements
numbers <- 1:5
for (k in seq_along(numbers)) {
    if (numbers[k] %% 2 == 0) {
        print(paste("Number", numbers[k], "is even")) ## do something
    } else {
        print(paste("Number", numbers[k], "is Not even")) ## do something else
    }
}

[1] "Number 1 is Not even"
[1] "Number 2 is even"
[1] "Number 3 is Not even"
[1] "Number 4 is even"
[1] "Number 5 is Not even"


In [19]:
# Example of a for loop: compute the mean of each column in a data frame
dataframe <- data.frame(Age = c(90,13,18,15), test_score = c(70,30,18, 17), name = LETTERS[1:4])
dataframe

for (i in seq_len(ncol(dataframe))) {
    print(mean(dataframe[[i]]))
}

Age,test_score,name
90,70,A
13,30,B
18,18,C
15,17,D


[1] 34
[1] 33.75


"argument is not numeric or logical: returning NA"

[1] NA


In [None]:
# The code above raised a warning because one of the data frame
# columns (name) is not numeric, so its mean cannot be calculated.
# To solve this, we check that a column holds numeric values
# before we evaluate the mean.

# Start from NA so that skipped (non-numeric) columns are easy to spot
output <- rep(NA_real_, ncol(dataframe))

for (i in seq_len(ncol(dataframe))) {
    if (is.numeric(dataframe[[i]])) {
        # Store the column mean in the matching position of output
        output[[i]] <- mean(dataframe[[i]])
    }
}

# Print output
output
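
As an alternative sketch (not the approach used in the cell above), the same result can be obtained without an explicit loop by keeping only the numeric columns and applying mean() to each with sapply(). The names `numeric_columns` and `column_means` are hypothetical names chosen here.

```r
dataframe <- data.frame(Age = c(90, 13, 18, 15),
                        test_score = c(70, 30, 18, 17),
                        name = LETTERS[1:4])

# Keep only the numeric columns, then apply mean() to each of them
numeric_columns <- dataframe[sapply(dataframe, is.numeric)]
column_means <- sapply(numeric_columns, mean)
column_means
```

The result is a named numeric vector with one mean per numeric column, so the non-numeric name column simply does not appear.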

<div class="alert alert-success"><strong>1.3 while loop</strong><br>
A while loop executes a block of code repeatedly for as long as its condition is true. A major concern when working with a while loop is to ensure that the condition eventually becomes false, otherwise the loop will run forever. If you find yourself in an endless loop, press <strong>Esc</strong> in RStudio or <strong>Ctrl-C</strong> in a terminal R session to stop it.
</div>

In [None]:
# The general syntax of a while loop is given as:
while (condition){
    # Code executed here 
    # while condition is true
}

In [17]:
# This while loop checks whether the condition (x < 10) is true.
# While it is true, it prints 15 randomly sampled letters followed by a new line,
# then adds one to x so as to avoid an endless loop. The loop stops as soon as
# the condition becomes false, i.e. when x reaches 10.
x <- 0

while(x < 10){
    
    cat(sample(LETTERS,15),"\n")
    
    # add one to x
    x <- x+1
}

X Z P L N T S I M K W A E G O 
H M L X C G S W F U V N E B Q 
F L S H Z T Y X Q P A I W M C 
X O Q W M H B V R Y J N U C G 
H Y L A P S Z E V T Q B W K F 
X F S P T K D L N J A O Q I W 
G K U O X D E J B R H C T P Q 
K J A N M Y Z W L H X I V D U 
G X Y U N R O M E H Q A B P V 
S Q U R E T C O W I J X N V A 


<div class="alert alert-success"><strong>1.4 next</strong><br>
next is used to skip the rest of the current iteration of a loop when a condition holds.
</div>

In [None]:
for(i in LETTERS) {
    if(i < "V") {
        ## Skip every letter that sorts before "V"
        next
    }
    print(i)
}

<div class="alert alert-success"><strong>1.5 break</strong><br>
break is used to exit a loop immediately when a condition holds, regardless of which iteration the loop is on.
</div>

In [None]:
for(i in LETTERS) {
    print(i)
    if(i == "D") {
        ## Stop the loop once "D" has been printed (after 4 iterations)
        break
    }
}

In [None]:
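# The same loop wrapped in a function: print each element of letters
# and stop once the element equal to letter has been printed
f <- function(letters, letter) {
    for(i in letters) {
        print(i)
        if(i == letter) {
            ## Stop the loop once the target letter has been printed
            break
        }
    }
}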


<div class="alert alert-info"><i class="icon-lightbulb"></i><strong>Discussion</strong><br>
Let's work through an integer number guessing game. 
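
As a starting point for the discussion, here is a minimal sketch of such a game. It is only one possible design: the function name `play_guessing_game` and its arguments are hypothetical, and the guesses are passed in as a vector so that the code runs without keyboard input.

```r
# secret:  the number to be guessed
# guesses: hypothetical vector standing in for the numbers a player would type
play_guessing_game <- function(secret, guesses) {
    attempts <- 0
    for (guess in guesses) {
        attempts <- attempts + 1
        if (guess < secret) {
            print(paste(guess, "is too low"))
        } else if (guess > secret) {
            print(paste(guess, "is too high"))
        } else {
            print(paste("Correct! You needed", attempts, "attempts"))
            break   # stop as soon as the number is found
        }
    }
    attempts
}

attempts_used <- play_guessing_game(secret = 7, guesses = c(5, 9, 7, 3))
attempts_used
```

In an interactive session, the secret could be drawn with sample(1:10, 1) and each guess read with readline(), as in the age example earlier.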

<div class="alert alert-success"><strong>2.0 Functions</strong><br>
A function is a piece of code written to carry out a specified task; it may or may not accept arguments. A function is created by choosing a name, as you would when creating an object, followed by the assignment operator and the keyword function. There are also functions that are <font color = "red"><strong>NOT</strong></font> assigned names when created; these are called "anonymous functions".<br>Functions are made up of three parts:
<ul>
    <li>Arguments</li>
    <li>Body</li>
    <li>Environment</li>
</ul>
One rule to keep in mind when creating your own functions is to avoid using R keywords (and, ideally, the names of built-in functions) as names for your user-defined functions.
</div>

In [None]:
# The basic structure of a function is given as:
func.one <- function(arg1, arg2){ # Arguments
                body              # Body
        }
# and the Environment, which is where R looks for variables; it is not visible in the code.
# More on this when we create our first function.

<div class="alert alert-success"><strong>2.0.1 return value</strong><br>
When return is called within a function, it forces the function to stop execution and return a value. An example will make this point clearer.
</div>

In [None]:
rm(list=ls())
# We will now create our first function, which does not use the return keyword.
# This function simply adds 3 and 2 and also prints a text. As you can see, this
# function does not take any arguments.
addnumbers <- function(){
            print(3 + 2)
            print("You can read me because no return is called above")
   }
addnumbers()

In [None]:
# This function takes two arguments and returns their ratio
ratio <- function(x, y) {
  return (x / y)
}

# Call ratio() with arguments 1 and 2.
# Calling ratio(y = 2, x = 1) with named arguments would give the
# same output as the code snippet below
ratio(1, 2)

In [None]:
rm(list=ls())
# We will now create our second function, which uses the return keyword. Did you notice
# anything in the output?
# This function simply adds 3 and 2 and ignores the print part since
# return was called before it.
addnumbersv1 <- function(){
            return(3 + 2)
            print("You will never read me because return is called above")
   }
addnumbersv1()

In [None]:
# An example of an anonymous function, which does not have a name assigned to it.
# Anonymous functions are often short enough to be written on a single line, as below
(function(x) {x + 5})(2)

In [None]:
# A function with one default argument. This means that when
# this function is called you may or may not pass/change the value of y
addxy <- function(x, y = 3){
    x + y
}
addxy(2)

In [None]:
# Override the default value of y by passing it explicitly
addxy(x = 7, y = 19)

<div class="alert alert-success"><strong>2.0.2 Scoping</strong><br>
Scoping in R describes how and where (Note: NOT WHEN) R looks up values by name. There are two types of scope in R:
<ul>
<li>local scope</li> 
<li>global scope</li>
</ul>
</div>

In [3]:
# If a name is not defined inside a function, R will look one level up.
# The example below makes this point clearer.

# The function below returns the sum of x and y.
# There are two variables, x and y, but y is not defined within the function.
# When the function is called, R uses the value of x passed in as an argument and
# looks one level up to find the name/variable y.
# x is said to be a local variable, while in this case y is a global variable.
# Any global variable may be used in a function, but local variables CAN'T be
# accessed outside the function that defines them.
rm(list=ls())
y <- 5

fc <- function(x) {
  x + y
}

# call the function fc
fc(7)

<div class="alert alert-info"><i class="icon-lightbulb"></i><strong>Discussion</strong><br>
Given our earlier explanation, what do you think the output would be? Let's talk about the code below:<br>
y <- 10<br>
fz <- function(x) {<br>
  y <- 5<br>
  x + y<br>
}

In [4]:
y <- 10
fz <- function(x) {
    y <- 5
    x + y
}
fz(2)
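
To check your answer, the sketch below stores the result of the call and then inspects the global y afterwards. The variable name `result` is a hypothetical name used only here.

```r
y <- 10
fz <- function(x) {
    y <- 5      # creates a local y that exists only while fz() runs
    x + y
}

result <- fz(2)   # the local y is found first, so the global y is not used
result
y                 # the global y is unchanged by the call
```

The assignment inside the function creates a new local variable rather than modifying the global one, which is why the two values of y never interfere.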

<div class="alert alert-info"><i class="icon-lightbulb"></i><strong>Discussion</strong><br>
How do we write a function that checks whether a given number, or each number in a list of numbers, is even or odd?</div>
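
One possible answer, offered as a sketch rather than the only solution: because %% works on whole vectors, a single expression can label every number at once. The function name `check_parity` is a hypothetical name chosen for this example.

```r
# check_parity is a hypothetical name chosen for this sketch
check_parity <- function(numbers) {
    numbers <- unlist(numbers)                  # accept a list as well as a vector
    ifelse(numbers %% 2 == 0, "even", "odd")    # label each number
}

single_result <- check_parity(4)
list_result   <- check_parity(list(1, 2, 7, 10))

single_result
list_result
```

A for loop with if and else, as shown in the control structures section, would work as well; the vectorised form is simply shorter.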

<div class="alert alert-danger"><i class="icon-attention-alt"></i>**Try it out!**<br>Given the list below,<br>
circle_rdaii = list(2,24,5,6,2,8,9,1,5,9,3,2,12,67,89,45,23,78,12)<br>write a function that computes the area of each circle from the list of radii. Your solution should be of the form shown below.<br></div>

In [10]:
circle_rdaii = list(2,24,5,6,2,8,9,1,5,9,3,2,12,67,89,45,23,78,12)
area_of_circle <- function(radius){
    # Loop over the radii passed in as the argument, not the global list
    for(i in seq_along(radius)){
        area <- pi * (radius[[i]])^2
        print(paste("The area of a circle with radius", radius[[i]], "is", area))
    }
}

In [11]:
area_of_circle(circle_rdaii)

[1] "The area of a circle with radius 2 is 12.5663706143592"
[1] "The area of a circle with radius 24 is 1809.55736846772"
[1] "The area of a circle with radius 5 is 78.5398163397448"
[1] "The area of a circle with radius 6 is 113.097335529233"
[1] "The area of a circle with radius 2 is 12.5663706143592"
[1] "The area of a circle with radius 8 is 201.061929829747"
[1] "The area of a circle with radius 9 is 254.469004940773"
[1] "The area of a circle with radius 1 is 3.14159265358979"
[1] "The area of a circle with radius 5 is 78.5398163397448"
[1] "The area of a circle with radius 9 is 254.469004940773"
[1] "The area of a circle with radius 3 is 28.2743338823081"
[1] "The area of a circle with radius 2 is 12.5663706143592"
[1] "The area of a circle with radius 12 is 452.38934211693"
[1] "The area of a circle with radius 67 is 14102.6094219646"
[1] "The area of a circle with radius 89 is 24884.5554090848"
[1] "The area of a circle with radius 45 is 6361.72512351933"
[1] "The area of a c