# **INTRODUCTION**

- Welcome to the second part of this course. If you have no prior knowledge of R programming, you should check out the [first lesson](https://github.com/Tess-hacker/INTRODUCTION-TO-PROGRAMMING-IN-R/blob/main/INTRODUCTION%20TO%20PROGRAMMING%20IN%20R.ipynb) of the course, as I will be referring to it as we move on.


- In this course, the following are the topics to be treated:

    - Dealing with Advanced Arithmetic Operations
    
    - Using Complex Arithmetic Expressions
    
    - Dealing with Operator Priority Rules
    
    - Identifying an Expression's Data Type
    
    - Dealing with Variable Naming Rules
    
    - Updating variables
    
    - Identifying a Variable's Data Type
    
    - R Syntax: Best Practice
    
    
- Looks like a lot, right? Don't worry, you'll have crazy fun while learning. Without further ado, let us get this thing going!

## **DEALING WITH ADVANCED ARITHMETIC OPERATORS**

- In the previous lesson, we learnt four basic arithmetic operations we can use in R programming. Here, I will introduce you to three more advanced arithmetic operations that R can perform. They are:

    - Exponentiation represented with the `^` sign
    
    - Integer division represented by `%/%`
    
    - Modulo represented by `%%`
    
    
- **Exponentiation** - This involves raising a number to a power `n`. In other words, it is the multiplication of a number by itself `n` times. For example, `2 ^ 3` is `2 * 2 * 2`.


- **Integer Division** - Given the integer division `x %/% n`, the result is the maximum number of whole times that `n` fits into `x`. Remember that `x` and `n` can be represented by integers.


- **Modulo** - `x %% n` returns the remainder of the integer division of `x` by `n`. Therefore, when a division leaves a remainder, the modulo operation returns that value.

In [1]:
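128 %/% 5   # integer division: 5 fits into 128 twenty-five whole times
128 ^ 5     # exponentiation: 128 * 128 * 128 * 128 * 128
128 %% 5    # modulo: the remainder of 128 divided by 5, which is 3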
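
- To see how integer division and modulo fit together, here is a small sketch. The "boxes" story is only an illustration I made up, not part of the dataset; the numbers are the same ones used in the cell above. Lines or parts of lines starting with `#` are comments that R ignores.

```r
# Hypothetical scenario: pack 128 items into boxes that hold 5 items each.
128 %/% 5     # number of completely filled boxes: 25
128 %% 5      # items left over once those boxes are full: 3

# The two results rebuild the original number: 25 * 5 + 3 is 128
(128 %/% 5) * 5 + 128 %% 5

# Exponentiation is repeated multiplication: both lines give 1024
2 ^ 10
2 * 2 * 2 * 2 * 2 * 2 * 2 * 2 * 2 * 2
```

- A handy use of modulo: `x %% 2` is `0` for even numbers and `1` for odd numbers.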

## **USING COMPLEX ARITHMETIC OPERATIONS**

- Now that we have learnt new advanced arithmetic operations in R programming, we should apply them to a dataset. Below is a table of data which we can use:


| **S/NO** | **Day** | **Supply Day** | **Item** | **Purchase Price** | **Selling Price** | **Item Available** | **Item Sold** |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 1L | TRUE | "Apple" | 10.50 | 12.10 | 200L | 0L |
| 2 | 1L | TRUE | "Mangoes" | 21.15 | 24.90 | 50L | 0L |
| 3 | 1L | TRUE | "Lemon" | 3.20 | 4.99 | 500L | 0L |
| 4 | 2L | FALSE | "Apple" | 10.50 | 12.10 | 188L | 12L |
| 5 | 2L | FALSE | "Mangoes" | 21.15 | 24.90 | 47L | 3L |
| 6 | 2L | FALSE | "Lemon" | 3.20 | 4.99 | 476L | 24L |


- Let us evaluate some complex expressions using the dataset above:

In [2]:
"The total cost of purchasing apples on day 1:"
10.50 * 200L
"The total cost of purchasing mangoes on day 1:"
21.15 * 50
"The total cost of purchasing lemons on day 1:"
3.20 * 500
"The sum total of all figures:"
10.50 * 200L + 21.15 * 50 + 3.20 * 500
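
- The same idea works for the day 2 rows of the table. The sketch below is my own extension of the exercise, not part of the original cell: it takes the Selling Price, Purchase Price and Item Sold columns and works out the income and the profit on the items sold by day 2. The words "income" and "profit" are my labels for these figures.

```r
# Income from the items sold by day 2: selling price * items sold
12.10 * 12L    # apples
24.90 * 3L     # mangoes
4.99 * 24L     # lemons

# Income from all three fruits together
12.10 * 12L + 24.90 * 3L + 4.99 * 24L

# Profit on those sales: (selling price - purchase price) * items sold
(12.10 - 10.50) * 12L + (24.90 - 21.15) * 3L + (4.99 - 3.20) * 24L
```

- The parentheses in the last line make R work out each price difference before multiplying by the number of items sold.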

## **DEALING WITH OPERATOR PRIORITY RULES**

- R evaluates complex expressions with the same order of operations used in mathematics. We refer to this order as the **operator priority rules**. They are as follows:

    - Parentheses are calculated first, then exponentiation, then division and multiplication, and finally, addition and subtraction.


- We can use parentheses `()` to override the priority rules. For example, the expression `(2 + 3) * 5` yields `25`, whereas `2 + 3 * 5` yields `17`. In the first case, the expression in parentheses is evaluated first to `5`, and then the expression `5 * 5` yields `25`.

In [3]:
"The predicted selling cost of apples on day 1:"
12.10 * 200L
"The predicted selling cost of mangoes on day 1:"
24.90 * 50L
"The predicted selling cost of lemons on day 1:"
4.99 * 500L
"The overall predicted selling cost of all fruits on day 1:"
(12.10 * 200L) + (24.90 * 50L) + (4.99 * 500L)
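
- Here is a short sketch of the priority rules in action, using small numbers so the results are easy to check by hand. The last two lines go slightly beyond the rules listed above: R's own help page on operator syntax (`?Syntax`) places `%%` and `%/%` above `*` and `/`.

```r
2 + 3 * 5       # multiplication first: 2 + 15 gives 17
(2 + 3) * 5     # parentheses first: 5 * 5 gives 25
2 * 3 ^ 2       # exponentiation first: 2 * 9 gives 18
-2 ^ 2          # exponentiation before the minus sign: -(2 ^ 2) gives -4
2 * 5 %% 3      # %% before *: 2 * (5 %% 3) gives 4
(2 * 5) %% 3    # parentheses force the multiplication first: 10 %% 3 gives 1
```

- When in doubt, add parentheses. They cost nothing and make the intended order obvious to the reader.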

## **IDENTIFYING AN EXPRESSION'S DATA TYPE**

- Did you see how the operator priority rules worked up there? Now, you are probably wondering what this sub-topic is all about, or what an expression's data type is. Let me help you out.


- The values used in an expression will determine its data type. The output from that expression will also have the same data type. We can identify it using the data type transformation rules. There are two rules:

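    - Operations between values of the same data type generally yield that **same data type**. For example, `2L * 2L * 2L` yields an Integer data type because `2L` is an integer. (Two exceptions worth knowing: `/` and `^` always return a Numeric, even when both values are integers.)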

    - Operations between values of different data types will result in the **highest data type**. From highest to lowest, the data types are ranked: 
    
        - Numeric, 
        
        - Integer, and 
        
        - Logical.
    
    - For example, the expression `12.10 * 12L` between a numeric and an integer yields a Numeric data type because it is the higher data type.

In [4]:
"What is the type of the following expression: 2 * 2 * 2?"
"ANSWER: Numeric"

"What is the type of the following expression: 2L * 2L * 2?"
"ANSWER: Numeric"

"What is the type of the following expression: 2L * 2L * 2L?"
"ANSWER: Integer"

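- If you would like R to confirm these answers, here is a small sketch using the `class()` function, which is covered later in this lesson under identifying a variable's data type. The first three expressions are the ones from the questions above; the last three are extra examples I added to illustrate the ranking.

```r
class(2 * 2 * 2)       # "numeric": every value is numeric
class(2L * 2L * 2)     # "numeric": integer mixed with numeric
class(2L * 2L * 2L)    # "integer": every value is an integer

class(TRUE + 1L)       # "integer": logical mixed with integer
class(TRUE + 1.5)      # "numeric": logical mixed with numeric
class(4L / 2L)         # "numeric": division always returns a numeric
```
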
## **DEALING WITH VARIABLE NAMING RULES**

- We have done an excellent job so far! However, we would like to do things more automatically, without having to copy values and repeat the same operations. To do that, we need to create variables.


- To create a variable, we need to follow two major steps:

    - Create the variable name.
    
    - Assign values (or expressions) to the variable name using the **symbol `<-` (less-than, minus)**. We create a variable in order to be able to reuse it. Behind the scenes, a variable declaration allocates memory space in your computer to store the value it contains. We want to do the same with the results of our earlier calculations so that we can reuse them afterward.
    
    
- To name a variable, there are five naming rules to follow.

    - Variable names consist of letters (upper or lower case), numbers, a dot (.), or an underscore (_).
    
    - We can begin a variable name with a letter or a dot, but a leading dot cannot be followed by a number.
    
    - We cannot begin a variable name with a number.
    
    - No special characters are allowed, e.g., spaces, operator symbols, ", or parentheses.
        
    - We cannot use a word that R reserves for its own syntax (a keyword such as `if`, `function`, or `TRUE`) as a variable name.
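
- One way to check a name against these rules is the base R function `make.names()`, which turns any text into a syntactically valid name. If it hands the text back unchanged, the name was already valid. This is only a sketch, and the example names below are made up for illustration; they are not the ones from the exercise.

```r
make.names("total_cost")   # "total_cost": unchanged, so the name is valid
make.names(".hidden")      # ".hidden": unchanged, a leading dot before a letter is fine
make.names("2fast")        # "X2fast": changed, a name cannot begin with a number
make.names("my var")       # "my.var": changed, spaces are not allowed
make.names("if")           # "if.": changed, `if` is a reserved keyword
```
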


- Below is a list of variable names, numbered from 1 to 5. Using the naming rules, **answer TRUE if the variable name is valid and FALSE otherwise**. Store each answer (TRUE/FALSE) in variables named `q_1` to `q_5`, respectively (one answer per line).


- Ready? LET'S GO!!!

In [5]:
".2var"
q_1 <- FALSE
"numberOfFruit"
q_2 <- TRUE    
"apple variable"
q_3 <- FALSE    
"var"
q_4 <- TRUE    # valid: var is the name of a built-in function, not a reserved keyword
"age20+"
q_5 <- FALSE

## **UPDATING VARIABLES**

- In the previous sub-topic, we learned how to name a variable. Here, we learn how to use and update a variable. An assignment stores the value on the right of the assignment symbol in the memory space allocated to the variable named on the left. If that memory already holds a value, the new one overwrites it.


- Also, an expression can be assigned to a variable. For instance, `var_2 <- 10.50 * 2`. When an expression is assigned to a variable, it is not the expression itself that is stored in the computer memory but rather **the output value after the evaluation of that expression.**
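
- Here is a minimal sketch of both ideas. The variable name `item_available` is one I picked for illustration, with values taken from the Apple rows of the table; `var_2` is the example from the paragraph above.

```r
# Create a variable, then update it: the old value is overwritten
item_available <- 200L                    # apples available on day 1
item_available <- item_available - 12L    # 12 apples sold by day 2
item_available                            # now holds 188

# Assigning an expression stores its result, not the expression
var_2 <- 10.50 * 2
var_2                                     # holds 21
```

- In the second line, R first evaluates the right-hand side using the current value of `item_available`, and only then stores the result back under the same name.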


- Let's practice the use of variables by repeating the exercise of calculating the overall predicted profit in a cleaner way using variables.

In [6]:
total_day1_purchase_cost <- 10.50 * 200L + 21.15 * 50L + 3.20 * 500L
total_day1_selling_cost <- 12.10 * 200 + 24.90 * 50 + 4.99 * 500
total_day1_profit <- total_day1_selling_cost - total_day1_purchase_cost
total_day1_profit

## **IDENTIFYING A VARIABLE'S DATA TYPE**

- The power of R comes essentially from the availability of a bunch of built-in programs that allow you to perform several kinds of (elementary) data science tasks. These small programs are called **functions.** Functions are grouped by topic or utility into **packages.** R is installed by default with some packages containing useful functions, such as the **base and stats packages.** In this lesson, we will only learn how to use a few of these functions. Moving forward, we will be learning others.


- The first function we will see is the `class()` function. It allows us to **identify the data type of a literal or variable.** We use it by writing the function name followed, in parentheses, by the variable whose data type we want. Let us experiment with this function on the variables we defined in the previous exercise:


In [7]:
class1 <- class(total_day1_purchase_cost)
class2 <- class(total_day1_selling_cost)
class3 <- class(total_day1_profit)
class1
class2
class3
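
- `class()` works on literals as well as on variables. As a small sketch, here it is applied to values of the kinds that appear in the table from earlier in this lesson:

```r
class(10.50)      # "numeric": a number with a decimal part
class(200L)       # "integer": the L suffix marks an integer
class(TRUE)       # "logical"
class("Apple")    # "character": text written between quotes
```
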

## **R SYNTAX : BEST PRACTICE**

- So far we've used space characters between numbers and operators (+, -, *, /, ^ are operators). For instance, we've used `2 * 2 * 2` instead of `2*2*2`. But R's syntax rules do not enforce this, so both `2 * 2 * 2` and `2*2*2` will run correctly.


- You will also have noticed that we use `_` to separate the words in our variable names. For instance, we've used `total_day1_purchase_cost` rather than a name without separators. Once again, R's syntax rules do not impose this style on you. As long as you respect the naming rules we have learned, your code will run correctly.


- The computer executes code from the first line downwards and ignores blank lines. Besides blank lines, the computer also ignores any sequence of characters that comes to the right of the `#` symbol. This sequence of characters that follows the `#` symbol is called a code comment. We use code comments to add information about our code.
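
- To round things off, here is a short sketch that puts these practices together: spaces around operators, words separated with `_`, and comments. The variable name `apple_purchase_cost` is made up for this example; the numbers come from the Apple row of the table.

```r
# Purchase cost of all the apples supplied on day 1
apple_purchase_cost <- 10.50 * 200L    # price per apple * number of apples

apple_purchase_cost                    # a comment can follow code on the same line

# The next line is commented out, so R does not run it:
# apple_purchase_cost <- 0
```

- Because the last assignment sits behind a `#`, `apple_purchase_cost` still holds `2100` at the end.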