# **Data Types and Operations**

In R, data types and operations are fundamental concepts that define how data is represented and manipulated. Data types determine the nature of variables and dictate the operations that can be performed on them. R supports various data types, including:

1.  **Numeric:** Represents real or decimal numbers, both integers and floating-point values. Numeric data allows for arithmetic operations like addition, subtraction, multiplication, and division. 

2.  **Character:** Consists of text data enclosed in single or double quotes. Character data is used to store textual information and perform operations like concatenation (combining strings).
    
3.  **Logical:** Holds Boolean values (`TRUE` or `FALSE`) and is utilized for logical operations such as AND, OR, and NOT.
    
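4.  **Integer:** Represents whole numbers (no decimal point). Integers count as numeric data (`is.numeric()` returns `TRUE` for them) but are stored as 32-bit values, so an integer vector uses less memory than the equivalent double vector, which can matter for large datasets.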
    
5.  **Complex:** Stores complex numbers in the form of `real + imaginary` parts. Complex data types enable mathematical operations involving complex arithmetic.
    
6.  **Factor:** Used for categorical data with predefined levels. Factors facilitate data analysis with qualitative variables.
    
7.  **Date and Time:** R supports specific data types to handle dates and times, allowing for date arithmetic and manipulation.
    

**In R, operators are used to perform various operations on data. Common types of operators in R include:**

1.  **Arithmetic Operators:** Perform mathematical computations like addition (`+`), subtraction (`-`), multiplication (`*`), division (`/`), and exponentiation (`^`).
    
2.  **Relational Operators:** Compare values and return logical results (`TRUE` or `FALSE`). Examples include equality (`==`), inequality (`!=`), greater than (`>`), less than (`<`), etc.
    
3.  **Logical Operators:** Combine logical expressions. Common logical operators are AND (`&`), OR (`|`), and NOT (`!`).
    
4.  **Assignment Operator:** Used to assign values to variables. The assignment operator in R is `<-` or `=`.
    
5.  **Membership Operator:** Checks if an element belongs to a set. It is represented by `%in%`.
    
6.  **Special Operators:** R also has other specialized operators for specific tasks, such as indexing (`[]`) to access elements from vectors and matrices.
    

Understanding data types and operations is crucial in R programming as it forms the foundation for data manipulation, analysis, and visualization tasks.
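As a brief illustration of the operator families listed above, the sketch below applies each one to small made-up values. The variable names (`a`, `b`, `fruits`) are illustrative only and are not used elsewhere in this notebook.

```r
a <- 7
b <- 2

# Arithmetic operators
total <- a + b                    # 9
power <- a ^ b                    # 49

# Relational operators return logical values
is_equal <- a == b                # FALSE
is_more  <- a > b                 # TRUE

# Logical operators combine logical values
both    <- is_equal & is_more     # FALSE
either  <- is_equal | is_more     # TRUE
negated <- !is_equal              # TRUE

# Membership operator
fruits   <- c("apple", "banana", "orange")
has_kiwi <- "kiwi" %in% fruits    # FALSE

# Indexing operator (positions start at 1)
second <- fruits[2]               # "banana"

print(c(total, power))
print(c(is_equal, is_more, both, either, negated, has_kiwi))
print(second)
```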


## Topics to be covered
- Assignment Operation
- Data Types
- Vectors
- Lists
- Matrices
- Factors
- Data Frames
- Missing Values

# Assignment Operation

R has two commonly used assignment operators, `<-` and `=`; either can be used for ordinary assignments.

- The example in this section assigns numeric values to the variables `x` and `y`.
- A variable is a named container that holds a value.


In R, the assignment operation is a fundamental concept that allows you to assign values to variables. It is denoted by the `<-` symbol or the `=` symbol. The assignment operation is used to store data or the results of computations in named objects (variables), making it easy to reference and manipulate the data throughout the R script.

**Syntax:**

```
variable_name <- value
```
or
```
variable_name = value
```

**Description:**

-   **Variable Name:** Choose a valid name for the variable, adhering to R's naming rules. Variable names can consist of letters, numbers, periods, and underscores but should start with a letter or a period followed by a letter.
    
-   **Assignment Operator:** Use either `<-` or `=` to perform the assignment operation. Both symbols have the same effect, but `<-` is more commonly used in R to assign values to variables.
    
-   **Value:** The value to be assigned to the variable. It can be a numeric value, character, logical value (`TRUE` or `FALSE`), vector, matrix, or any other data type supported by R.
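One way to check the naming rules described above is to compare a candidate name with the result of base R's `make.names()`, which rewrites invalid names into valid ones. The helper `is_valid_name()` and the candidate names below are hypothetical and exist only for this sketch.

```r
# Hypothetical helper: a name is syntactically valid if make.names()
# leaves it unchanged.
is_valid_name <- function(name) {
  name == make.names(name)
}

candidates <- c("total_sales", ".hidden", "2nd_value", ".2x", "my var")
validity <- is_valid_name(candidates)

# Names starting with a digit, a period followed by a digit,
# or containing a space are reported as invalid.
print(setNames(validity, candidates))
```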

In [1]:
x <- 5     # after this runs, x contains 5
y = 8      # after this runs, y contains 8
print(x)   # print() displays the value stored in x on the console
print(y)   # print() displays the value stored in y on the console

[1] 5
[1] 8


# R – Data Types

Each variable in R has an associated data type. Each data type requires different amounts of memory and has some specific operations which can be performed over it.

| BASIC DATA TYPES |                        VALUES                       |
|:----------------:|:---------------------------------------------------:|
| Numeric          | Set of all real numbers                             |
| Integer          | Set of all integers, Z                              |
| Logical          | TRUE and FALSE                                      |
| Complex          | Set of complex numbers                              |
| Character        | “a”, “b”, “c”, …, “@”, “#”, “$”, …., “1”, “2”, …etc |



## 1. Numeric Datatype
Decimal values are called numerics in R. It is the default data type for numbers in R. If you assign a decimal value to a variable x as follows, x will be of numeric type.

In [2]:
# Assign a decimal value to x 
x = 5.6 
  
# print the class name of variable 
print(class(x))       # class() returns the class of the object
  
# print the type of variable 
print(typeof(x))      # typeof() returns the internal storage type

[1] "numeric"
[1] "double"


When R stores a number in a variable, it stores it as a “double” (a double-precision floating-point value) unless told otherwise. This means that even a whole number such as 5, written without a decimal point, has a type of double and a class of numeric.
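The short sketch below checks how a whole number is stored and shows one consequence of floating-point storage. The variable name `whole` is illustrative.

```r
whole <- 5
print(typeof(whole))          # "double": no L suffix, so not an integer
print(is.integer(whole))      # FALSE
print(whole == 5L)            # TRUE: the values compare equal
print(identical(whole, 5L))   # FALSE: the storage types differ

# Doubles are binary floating-point values, so some decimals are inexact
print(0.1 + 0.2 == 0.3)                    # FALSE
print(isTRUE(all.equal(0.1 + 0.2, 0.3)))   # TRUE: comparison with a tolerance
```

For this reason, `all.equal()` is usually a safer choice than `==` when comparing computed decimal values.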

In [3]:
z = as.integer(11)    # now z holds 11 as an integer
h = 9L                # the "L" suffix (no space) marks the literal as an integer

print(typeof(z))      # typeof() returns the internal storage type
print(typeof(h))      # typeof() returns the internal storage type

[1] "integer"
[1] "integer"


In [4]:
# There is also a special number Inf which represents infinity
I = Inf

# print the class name of variable 
print(class(I))       # class() returns the class of the object
  
# print the type of variable 
print(typeof(I))      # typeof() returns the internal storage type

[1] "numeric"
[1] "double"
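A few operations that produce or test for infinite values are sketched below. The variable names are illustrative.

```r
pos_inf   <- 1 / 0       # Inf
neg_inf   <- -1 / 0      # -Inf
undefined <- Inf - Inf   # NaN (not a number)

values <- c(1, pos_inf, neg_inf, undefined)
print(is.finite(values))     # TRUE FALSE FALSE FALSE
print(is.infinite(values))   # FALSE TRUE TRUE FALSE
print(is.nan(undefined))     # TRUE
```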


## 2. Logical Datatype
The logical data type takes one of two values, `TRUE` or `FALSE`. A logical value is often created by comparing values.

In [5]:
# Comparing two values 
s = x > y 
  
# print the logical value 
print(s) 
  
# print the class name of s
print(class(s))     # class() returns the class of the object
  
# print the type of s
print(typeof(s))    # typeof() returns the internal storage type

[1] FALSE
[1] "logical"
[1] "logical"
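Logical vectors can be combined and summarised element by element. The sketch below uses a made-up `scores` vector to show the logical operators together with `any()`, `all()`, and `sum()`.

```r
scores <- c(45, 72, 88, 51)     # hypothetical data
passed <- scores >= 50          # FALSE TRUE TRUE TRUE

print(passed & scores < 80)     # element-wise AND: FALSE TRUE FALSE TRUE
print(!passed)                  # element-wise NOT: TRUE FALSE FALSE FALSE
print(any(passed))              # TRUE: at least one element is TRUE
print(all(passed))              # FALSE: not every element is TRUE
print(sum(passed))              # 3, because each TRUE counts as 1
```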


## 3. Complex Datatype
R supports complex numbers. The complex data type stores numbers that have an imaginary component.

In [6]:
# Assigning a complex value 
c = 4 + 3i 
  

# print the class name of c
print(class(c))     # class() returns the class of the object
  
# print the type of c
print(typeof(c))    # typeof() returns the internal storage type

[1] "complex"
[1] "complex"
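Base R provides functions for taking a complex number apart. The sketch below uses a separate variable `z` (an illustrative name) so that the built-in `c()` function is not shadowed.

```r
z <- 4 + 3i

print(Re(z))     # real part: 4
print(Im(z))     # imaginary part: 3
print(Mod(z))    # modulus: 5
print(Conj(z))   # complex conjugate: 4-3i

# The square root of a negative number needs a complex input
root <- sqrt(as.complex(-1))
print(root)      # 0+1i
```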


## 4. Character Datatype
The character data type stores text: letters, digits, and special characters, enclosed in quotes.

In [7]:
# Assigning character to variable a and b
a = "Data"
b = "userdata@gmail.com"

# print the class name of a, b
print(class(a))     # class() returns the class of the object
print(class(b))     # class() returns the class of the object
  
# print the type of a, b
print(typeof(a))    # typeof() returns the internal storage type
print(typeof(b))    # typeof() returns the internal storage type

[1] "character"
[1] "character"
[1] "character"
[1] "character"
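A few common operations on character data are sketched below, reusing the two strings from the cell above.

```r
a <- "Data"
b <- "userdata@gmail.com"

joined     <- paste(a, "Types")    # "Data Types" (joined with a space)
glued      <- paste0(a, "Frame")   # "DataFrame" (no separator)
n_chars    <- nchar(b)             # 18 characters
upper      <- toupper(a)           # "DATA"
local_part <- substr(b, 1, 8)      # "userdata"

print(c(joined, glued, upper, local_part))
print(n_chars)
```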


# Vectors
A vector is a sequence of elements that all belong to the same data type. 
A list, by contrast, can hold elements of different data types, including vectors and other lists.

## 1. Creating Vector

In R, a vector is a fundamental data structure that represents a collection of elements of the same data type. It allows you to store and manipulate data efficiently. Creating vectors in R is a straightforward process, and there are multiple methods to do so.

**Using c() function:** The most common way to create a vector in R is by using the `c()` function, which stands for "combine" or "concatenate." This function takes individual elements separated by commas and combines them into a vector. For example:

    # Creating a numeric vector
    numeric_vector <- c(10, 20, 30, 40, 50)
    # Creating a character vector
    character_vector <- c("apple", "banana", "orange")

In [8]:
# Creating a Vector

k <- c(8L, 9L)         ## Integer
m <- c(0.5, 0.6)       ## numeric
n <- c(TRUE, FALSE)    ## logical
o <- c(T, F)           ## logical
p <- c("a", "b", "c")  ## character
q <- 9:29              ## integer
r <- c(1+0i, 2+4i)     ## complex

print(class(k))
print(class(m))
print(class(n))
print(class(o))
print(class(p))
print(class(q))
print(class(r))

[1] "integer"
[1] "numeric"
[1] "logical"
[1] "logical"
[1] "character"
[1] "integer"
[1] "complex"


There are different ways of creating vectors. Generally, we use `c()` to combine different elements, while `seq()` and the colon operator (`:`) generate sequences.

In [9]:
# R program to create Vectors 
  
# we can use the c function 
# to combine the values as a vector. 
# By default the type will be double 
X <- c(61, 4, 21, 67, 89, 2) 
cat('using c function', X, '\n') 
  
# seq() function for creating 
# a sequence of evenly spaced values. 
# length.out defines the length of vector. 
Y <- seq(1, 10, length.out = 5)  
cat('using seq() function', Y, '\n')  
  
# use ':' to create a vector  
# of consecutive integers. 
Z <- 2:7
cat('using colon', Z) 

using c function 61 4 21 67 89 2 
using seq() function 1 3.25 5.5 7.75 10 
using colon 2 3 4 5 6 7
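Once a vector exists, most operations apply to all of its elements at once. The sketch below uses a made-up `prices` vector to show element-wise arithmetic and a few summary functions.

```r
prices <- c(10, 20, 30)            # hypothetical data

doubled  <- prices * 2             # 20 40 60
adjusted <- prices + c(1, 2, 3)    # 11 22 33 (added position by position)

print(doubled)
print(adjusted)
print(length(prices))              # 3
print(sum(prices))                 # 60
print(mean(prices))                # 20
```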

## 2. Accessing vector elements
Accessing an element means selecting one or more elements of a vector by position or by condition. There are several ways to do this; the most common is the `[]` operator.

**Note:** Vectors in R use 1-based indexing, unlike C or Python, where indexing starts at 0.

In [10]:
# R program to access elements of a Vector 
  
# accessing elements with an index number. 
X <- c(2, 5, 18, 1, 12) 
cat('Using Subscript operator', X[2], '\n') 
  
# by passing a vector of positions 
# inside the square brackets. 
Y <- c(4, 8, 2, 1, 17) 
cat('Using combine() function', Y[c(4, 1)], '\n') 
  
# using logical expressions 
Z <- c(5, 2, 1, 4, 4, 3) 
cat('Using Logical indexing', Z[Z>4]) 

Using Subscript operator 5 
Using combine() function 1 4 
Using Logical indexing 5
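A few more indexing forms are sketched below: negative positions, a range of positions, `which()`, and names. The named vector `ages` is made up for this example.

```r
X <- c(2, 5, 18, 1, 12)

print(X[-1])            # everything except the first element: 5 18 1 12
print(X[2:4])           # positions 2 to 4: 5 18 1
positions <- which(X > 4)
print(positions)        # positions where the condition holds: 2 3 5

# Elements can also be looked up by name
ages <- c(amy = 31, raj = 25)   # hypothetical named vector
print(ages["raj"])
```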

## 3. Modifying a vector
Modifying a vector means changing the value of one or more of its elements by assigning to the selected positions. There are different ways to do this:

In [11]:
# R program to modify elements of a Vector 
  
# Creating a vector 
X <- c(2, 7, 9, 7, 8, 2) 
  
# modify a specific element 
X[3] <- 1
X[2] <- 9
cat('subscript operator', X, '\n') 
  
# Modify every element that meets a condition. 
X[X>5] <- 0
cat('Logical indexing', X, '\n') 
  
# Keep only the elements at the given positions,  
# in the given order. 
X <- X[c(3, 2, 1)] 
cat('combine() function', X) 

subscript operator 2 9 1 7 8 2 
Logical indexing 2 0 1 0 0 2 
combine() function 1 0 2

## 4. Deleting a vector
Deleting a vector's contents means discarding all of its elements. This can be done by assigning `NULL` to the variable.

In [12]:

# R program to delete a Vector 
  
# Creating a Vector 
M <- c(8, 10, 2, 5) 
  
# set NULL to the vector 
M <- NULL  
cat('Output vector', M) 

Output vector
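Assigning `NULL` empties a variable but keeps its name defined, whereas `rm()` removes the name itself. The sketch below contrasts the two; the name `scores_tmp` is hypothetical.

```r
scores_tmp <- c(8, 10, 2, 5)   # hypothetical vector name

# Assigning NULL empties the variable, but the name still exists
scores_tmp <- NULL
emptied      <- is.null(scores_tmp)
still_exists <- exists("scores_tmp", inherits = FALSE)

# rm() removes the name from the current environment
rm(scores_tmp)
gone <- !exists("scores_tmp", inherits = FALSE)

print(c(emptied = emptied, still_exists = still_exists, gone = gone))
```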

## 5. Mixing Objects
What happens when values of different types are combined in one vector?

In [13]:
d <- c(1.7, "a")    ## character --> 1.7 is converted to the string "1.7"
e <- c(TRUE, 2)     ## numeric   --> TRUE is converted to 1 (TRUE = 1, FALSE = 0)
f <- c("a", TRUE)   ## character --> TRUE is converted to the string "TRUE"

print(d)
print(e)
print(f)

[1] "1.7" "a"  
[1] 1 2
[1] "a"    "TRUE"


When different objects are mixed in a vector, coercion occurs so that every element in the vector is
of the same class.
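Implicit coercion always moves toward the more general type, in the order logical, integer, double, complex, character. The sketch below checks each step with `typeof()`.

```r
print(typeof(c(TRUE, 1L)))   # "integer"
print(typeof(c(1L, 2.5)))    # "double"
print(typeof(c(2.5, 1i)))    # "complex"
print(typeof(c(1i, "a")))    # "character"

# With several types mixed, the most general one wins
mixed <- c(FALSE, 3L, 2.5, "x")
print(mixed)                 # "FALSE" "3" "2.5" "x"
```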

## 6. Explicit Coercion
Objects can be explicitly coerced from one class to another using the as.* functions, if available.

In [14]:
x <- 0:6
print(class(x))          # Will print -> "integer"

print( as.numeric(x) )   # will print -> 0 1 2 3 4 5 6

print( as.logical(x) )   # will print -> FALSE TRUE TRUE TRUE TRUE TRUE TRUE

print( as.character(x) ) # will print -> "0" "1" "2" "3" "4" "5" "6"

[1] "integer"
[1] 0 1 2 3 4 5 6
[1] FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
[1] "0" "1" "2" "3" "4" "5" "6"


Nonsensical coercion results in NAs.

In [15]:
x <- c("a", "b", "c")

print( as.numeric(x) )    # NA NA NA
                          # Warning message:
                          # NAs introduced by coercion

print( as.logical(x) )    # NA NA NA

print( as.complex(x) )    # NA NA NA
                          # Warning message:
                          # NAs introduced by coercion

Warning message: NAs introduced by coercion
[1] NA NA NA
[1] NA NA NA
Warning message: NAs introduced by coercion
[1] NA NA NA


# Lists
Lists are a special type of vector that can contain elements of different classes. Lists are a very
important data type in R and you should get to know them well.  
A list in R is a generic object consisting of an ordered collection of objects. Lists are one-dimensional, heterogeneous data structures. A list can contain vectors, matrices, character strings, functions, and even other lists.

In [16]:
g <- list(1, "a", TRUE, 1 + 4i)
print(g)    # prints each component under its index, [[1]], [[2]], ...

[[1]]
[1] 1

[[2]]
[1] "a"

[[3]]
[1] TRUE

[[4]]
[1] 1+4i
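To confirm that each component keeps its own type, the list can be inspected with `length()`, `sapply()`, and `str()`. This is a small sketch using the same list as above.

```r
g <- list(1, "a", TRUE, 1 + 4i)

print(length(g))                  # 4 components
component_classes <- sapply(g, class)
print(component_classes)          # "numeric" "character" "logical" "complex"

# str() gives a compact summary of the structure
str(g)
```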



## 1. Access components by names
All the components of a list can be named, and those names can be used to access the components with the `$` operator.

In [17]:
# R program to access 
# components of a list 
  
# Creating a list by naming all its components 
empId = c(1, 2, 3, 4) 
empName = c("Debi", "Sandeep", "Subham", "Shiba") 
numberOfEmp = 4

empList = list( 
  "ID" = empId, 
  "Names" = empName, 
  "Total Staff" = numberOfEmp 
  )

print(empList) 
  
# Accessing components by names 
cat("Accessing name components using $ command\n") 
print(empList$Names)

cat("Accessing ID components using $ command\n") 
print(empList$ID)

$ID
[1] 1 2 3 4

$Names
[1] "Debi"    "Sandeep" "Subham"  "Shiba"  

$`Total Staff`
[1] 4

Accessing name components using $ command
[1] "Debi"    "Sandeep" "Subham"  "Shiba"  
Accessing ID components using $ command
[1] 1 2 3 4


## 2. Access components by indices
We can also access the components of the list using indices.  
To access a top-level component of a list, use the double-bracket operator `[[ ]]`. To access an element inside that component, follow the double brackets with a single-bracket index `[ ]`, as in `empList[[2]][2]`.

In [18]:
# Accessing a top-level component by index 
cat("Accessing name components using indices\n") 
print(empList[[2]]) 
  
# Accessing an inner-level element by index 
cat("Accessing Sandeep from name using indices\n") 
print(empList[[2]][2]) 
  
# Accessing another inner-level element by index 
cat("Accessing 4 from ID using indices\n") 
print(empList[[1]][4]) 

Accessing name components using indices
[1] "Debi"    "Sandeep" "Subham"  "Shiba"  
Accessing Sandeep from name using indices
[1] "Sandeep"
Accessing 4 from ID using indices
[1] 4
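Single and double brackets return different things: `[ ]` returns a smaller list, while `[[ ]]` returns the component itself. The sketch below shows the difference on a small made-up list named `staff`.

```r
# Hypothetical list, similar in shape to empList
staff <- list(ID = c(1, 2, 3), Names = c("Debi", "Sandeep", "Subham"))

as_sublist <- staff["Names"]      # single brackets: a list of length 1
as_vector  <- staff[["Names"]]    # double brackets: the character vector

print(class(as_sublist))          # "list"
print(class(as_vector))           # "character"

# Names work inside double brackets too, and can be chained with [ ]
print(staff[["Names"]][2])        # "Sandeep"
```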


## 3. Modifying components of a list
A list can also be modified by accessing the components and replacing them with the ones which you want.

In [19]:
# Modifying the top-level component 
empList$`Total Staff` = 5
  
# Modifying inner-level components 
empList[[1]][5] = 5
empList[[2]][5] = "Kamala"
  
cat("After modifying the list\n") 
print(empList) 

After modifying the list
$ID
[1] 1 2 3 4 5

$Names
[1] "Debi"    "Sandeep" "Subham"  "Shiba"   "Kamala" 

$`Total Staff`
[1] 5



## 4. Deleting components of a list
A negative index selects everything except the indexed components. Indexing with a negative number therefore returns a copy of the list without those components; the original list is unchanged unless the result is assigned back to it.

Use **"-"** (negative sign) to mark the components to leave out.

In [20]:
# Leaving out a top-level component 
cat("After Deleting Total staff components\n") 
print(empList[-3])           # prints the list without its 3rd component, "Total Staff"
  
# Leaving out an inner-level element 
cat("After Deleting sandeep from name\n") 
print(empList[[2]][-2])      

After Deleting Total staff components
$ID
[1] 1 2 3 4 5

$Names
[1] "Debi"    "Sandeep" "Subham"  "Shiba"   "Kamala" 

After Deleting sandeep from name
[1] "Debi"   "Subham" "Shiba"  "Kamala"
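To remove a component from the list itself, either assign the negatively indexed result back or assign `NULL` to the component. The sketch below uses a small made-up list named `staff`.

```r
# Hypothetical list used only for this sketch
staff <- list(ID = c(1, 2), Names = c("Debi", "Sandeep"), Total = 2)

# Negative indexing returns a shortened copy; staff is untouched
trimmed <- staff[-3]
print(length(trimmed))   # 2
print(length(staff))     # 3

# Assigning NULL removes the component from staff in place
staff$Total <- NULL
print(names(staff))      # "ID" "Names"
```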


# Matrices
Matrices are vectors with a dimension attribute. The dimension attribute is itself an integer vector of
length 2 (nrow, ncol)

In [21]:
m <- matrix(nrow = 2, ncol = 3)
print(m)
cat("\n")

print(dim(m))
cat("\n")

print(attributes(m))

     [,1] [,2] [,3]
[1,]   NA   NA   NA
[2,]   NA   NA   NA

[1] 2 3

$dim
[1] 2 3
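Because a matrix is just a vector with a `dim` attribute, setting that attribute on an ordinary vector turns it into a matrix. The sketch below uses an illustrative vector `v`.

```r
v <- 1:6
print(is.matrix(v))   # FALSE: a plain integer vector

# Adding a dimension attribute reshapes it into 2 rows and 3 columns
dim(v) <- c(2, 3)
print(is.matrix(v))   # TRUE
print(v)

# Values fill down the columns, so row 2, column 3 holds the last value
print(v[2, 3])        # 6
```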



By default, matrices are filled column-wise, so entries can be thought of as starting in the “upper left” corner and running down the columns.
- The argument `byrow` controls the fill order: `byrow = FALSE` (the default) fills the matrix by columns, and `byrow = TRUE` fills it by rows.

In [22]:
m <- matrix(1:6, nrow = 2, ncol = 3, byrow=FALSE)
print(m)
cat("\n")

# to fill by row instead, set byrow = TRUE
n <- matrix(1:6, nrow = 2, ncol = 3, byrow=TRUE)
print(n)

     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6

     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6


## 1. cbind-ing and rbind-ing
Matrices can be created by column-binding or row-binding with cbind() and rbind().

In [23]:
x <- 1:3
y <- 10:12

print(cbind(x, y))
cat("\n")
print(rbind(x, y))

     x  y
[1,] 1 10
[2,] 2 11
[3,] 3 12

  [,1] [,2] [,3]
x    1    2    3
y   10   11   12


## 2. Accessing columns

In [24]:
# R program to illustrate 
# accessing columns in a matrix 
  
# Create a 3x3 matrix 
A = matrix( 
  c(1, 2, 3, 4, 5, 6, 7, 8, 9),  
  nrow = 3,              
  ncol = 3,              
  byrow = TRUE           
) 
cat("The 3x3 matrix:\n") 
print(A) 
  
# Accessing first and second column 
cat("\nAccessing first and second column\n") 
print(A[, 1:2]) 

The 3x3 matrix:
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9

Accessing first and second column
     [,1] [,2]
[1,]    1    2
[2,]    4    5
[3,]    7    8
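When a single column (or row) is selected, R returns a plain vector by default. Passing `drop = FALSE` keeps the result as a matrix, as sketched below on the same 3x3 values.

```r
A <- matrix(1:9, nrow = 3, ncol = 3, byrow = TRUE)

col_vec <- A[, 1]                 # plain vector: 1 4 7
col_mat <- A[, 1, drop = FALSE]   # still a matrix with 3 rows and 1 column

print(is.matrix(col_vec))         # FALSE
print(dim(col_mat))               # 3 1
```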


## 3. Accessing elements of a matrix

In [25]:
# Accessing 2 
print(A[1, 2]) 
  
# Accessing 6 
print(A[2, 3]) 

[1] 2
[1] 6


## 4. Accessing Submatrices

A submatrix can be extracted by giving row and column ranges with the colon (`:`) operator.

In [26]:
cat("Accessing the first three rows and the first two columns\n") 
print(A[1:3, 1:2]) 

Accessing the first three rows and the first two columns
     [,1] [,2]
[1,]    1    2
[2,]    4    5
[3,]    7    8


## 5. Deleting rows and columns of a Matrix
To delete a row or a column, index the matrix with a negative row or column number and assign the result back. The negative sign means that row or column is left out.

In [27]:
# 2nd-row deletion 
A = A[-2, ] 
  
cat("After deleting the 2nd row\n") 
print(A) 

After deleting the 2nd row
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    7    8    9


In [28]:
# 2nd-column deletion 
A = A[, -2] 
  
cat("After deleting the 2nd column\n") 
print(A) 

After deleting the 2nd column
     [,1] [,2]
[1,]    1    3
[2,]    7    9


# Factors
Factors are used to represent categorical data. Factors can be unordered or ordered. One can think
of a factor as an integer vector where each integer has a label. 

A factor stores a categorical variable as a set of levels. Factors are used mainly in data analysis and statistical modelling. Because each value is stored as an integer code that points to a level label, a factor can also use less memory than a character vector with many repeated values.

**Note:** 
- A categorical variable takes its values from a fixed set of labels or names. For example, the blood type of a human can be A, B, AB, or O.
- Similarly, a data field such as marital status may contain only the values single, married, separated, divorced, or widowed.

In [29]:
gender <- c("Male","Female","Female","Male","Female")
gender<- factor(gender)

print(gender)
cat("\n")

print(unclass(gender))

[1] Male   Female Female Male   Female
Levels: Female Male

[1] 2 1 1 2 1
attr(,"levels")
[1] "Female" "Male"  


As the output shows, "Female" is stored as `1` and "Male" is stored as `2`, because levels are sorted alphabetically by default.  
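The levels, the integer codes, and the count per level can each be read directly from a factor. The sketch below rebuilds the same `gender` factor and inspects it.

```r
gender <- factor(c("Male", "Female", "Female", "Male", "Female"))

print(levels(gender))        # "Female" "Male"
print(nlevels(gender))       # 2
print(as.integer(gender))    # underlying codes: 2 1 1 2 1

# table() counts how often each level occurs
counts <- table(gender)
print(counts)                # Female 3, Male 2
```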

## 1. Setting the order of the levels
The order of the levels can be set using the `levels` argument to `factor()`. This can be important in linear modelling because the first level is used as the baseline level.

In [30]:
gender <- c("Male","Female","Female","Male","Female")
gender<- factor(gender,
                levels = c("Male", "Female"))

print(unclass(gender))

[1] 1 2 2 1 2
attr(,"levels")
[1] "Male"   "Female"


## 2. Renaming Factor Levels
To rename the levels, pass the existing values as the `levels` argument (here "Male" and "Female") and the new names, in the same order, as the `labels` argument (here "Gen_Male" and "Gen_Female").

In [31]:
renamed_gender = factor(gender,levels = c("Male","Female"),labels = c("Gen_Male","Gen_Female"))
print(renamed_gender)

[1] Gen_Male   Gen_Female Gen_Female Gen_Male   Gen_Female
Levels: Gen_Male Gen_Female


The output shows the levels renamed from "Male" to "Gen_Male" and from "Female" to "Gen_Female".

## 3. Ordering a Categorical Variable  
For ordinal categorical values, the order of the levels matters. For instance, the pant sizes Large ("L"), Extra Large ("XL"), and Extra Extra Large ("XXL") have a natural ascending order.  

The code below stores a character vector of the sizes "L", "XL", and "XXL" in `pant`.  
`pant.factor` is created with the levels listed in ascending order, `levels = c("L", "XL", "XXL")`, and with `ordered = TRUE`, which makes the factor ordered so that its values can be compared and sorted.

In [32]:
pant <- c("XL","L","XL","XXL","L","XL")
pant.factor <- factor(pant, ordered = TRUE, levels = c("L","XL","XXL"))

print(pant.factor)
cat("\n")

print("is XL > L")
print( pant.factor[1] > pant.factor[2] )

[1] XL  L   XL  XXL L   XL 
Levels: L < XL < XXL

[1] "is XL > L"
[1] TRUE
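An ordered factor also supports sorting, `max()`, and comparison against a level name. The sketch below rebuilds the same sizes under the illustrative name `sizes`.

```r
sizes <- factor(c("XL", "L", "XL", "XXL", "L", "XL"),
                levels = c("L", "XL", "XXL"), ordered = TRUE)

# Sorting follows the level order, not alphabetical order
sorted_sizes <- as.character(sort(sizes))
print(sorted_sizes)                  # "L" "L" "XL" "XL" "XL" "XXL"

largest <- as.character(max(sizes))
print(largest)                       # "XXL"

# Count the values that are at least "XL"
at_least_xl <- sum(sizes >= "XL")
print(at_least_xl)                   # 4
```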


# Data Frames
A data frame can be thought of as a matrix in which each column can have a different data type. A data frame is made up of three principal components: the data, the rows, and the columns.  
<img src="https://media.geeksforgeeks.org/wp-content/uploads/20200414224825/f115.png" width="500" height="500">

Data frames are used to store tabular data:
- They are represented as a special type of list where every element of the list has to have the same length
- Each element of the list can be thought of as a column and the length of each element of the list is the number of rows
- Unlike matrices, data frames can store different classes of objects in each column (just like lists); matrices must have every element be the same class
- Data frames also have a special attribute called **row.names**
- Data frames are usually created by calling **read.table()** or **read.csv()**
- Can be converted to a matrix by calling **data.matrix()**
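Several of the properties listed above can be checked directly. The sketch below builds a small data frame with `data.frame()` under the hypothetical name `people`, then inspects its list nature, dimensions, row names, column classes, and matrix conversion.

```r
# Hypothetical data frame used only for this sketch
people <- data.frame(
  Name = c("Amiya", "Raj", "Asish"),
  Age  = c(22, 25, 45),
  Rank = c(3, 5, 1),
  stringsAsFactors = FALSE   # keep Name as character in every R version
)

print(is.list(people))       # TRUE: a data frame is a special list
print(nrow(people))          # 3 rows
print(ncol(people))          # 3 columns
print(row.names(people))     # "1" "2" "3"

# Each column keeps its own class
column_classes <- sapply(people, class)
print(column_classes)

# The numeric columns convert cleanly to a matrix
num_matrix <- data.matrix(people[, c("Age", "Rank")])
print(dim(num_matrix))       # 3 2
```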

## 1. Creating a data frame using Vectors
To create a data frame, call the `data.frame()` function and pass each of the vectors you have created as an argument.

In [33]:
# A vector which is a character vector 
Name = c("Amiya", "Raj", "Asish") 
  
# A vector which is a character vector 
Language = c("R", "Python", "Java") 
  
# A vector which is a numeric vector 
Age = c(22, 25, 45) 

df = data.frame(Name, Language, Age) 
  
print(df) 

   Name Language Age
1 Amiya        R  22
2   Raj   Python  25
3 Asish     Java  45


## 2. Accessing rows and columns
Syntax:
```R
df[val1, val2]

# df   = data frame object
# val1 = rows of the data frame
# val2 = columns of the data frame
```

In [34]:
# Accessing first and second row 
cat("Accessing first and second row\n") 
print(df[1:2, ]) 

# Accessing second and third column 
cat("\nAccessing second and third column\n") 
print(df[,2:3]) 

Accessing first and second row
   Name Language Age
1 Amiya        R  22
2   Raj   Python  25

Accessing second and third column
  Language Age
1        R  22
2   Python  25
3     Java  45
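Columns can also be selected by name, and rows by a logical condition. The sketch below rebuilds the same data frame so that it runs on its own.

```r
df <- data.frame(
  Name     = c("Amiya", "Raj", "Asish"),
  Language = c("R", "Python", "Java"),
  Age      = c(22, 25, 45),
  stringsAsFactors = FALSE
)

print(df$Age)                     # column by name with $: 22 25 45
print(df[["Language"]])           # column by name with [[ ]]
print(df[2, "Age"])               # one cell: row 2, column Age

# Rows where a condition holds, restricted to one column
older_names <- df[df$Age > 23, "Name"]
print(older_names)                # "Raj" "Asish"
```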


## 3. Selecting the subset of the DataFrame
A subset of a DataFrame can also be created based on certain conditions with the help of following syntax.
```R
newDF = subset(df, conditions)

# df         = original data frame
# conditions = logical conditions on the columns
```

In [35]:
newDf = subset(df, Name =="Amiya"|Age>30) 
  
cat("After Selecting the subset of the data frame\n") 
print(newDf) 


newDf2 = subset(df, Age<30) 
  
cat("\nAfter Selecting the subset of the data frame\n") 
print(newDf2) 

After Selecting the subset of the data frame
   Name Language Age
1 Amiya        R  22
3 Asish     Java  45

After Selecting the subset of the data frame
   Name Language Age
1 Amiya        R  22
2   Raj   Python  25
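`subset()` can also pick columns through its `select` argument, and the same result can be written with plain logical indexing. The sketch below compares the two forms on a rebuilt copy of the data frame.

```r
df <- data.frame(
  Name     = c("Amiya", "Raj", "Asish"),
  Language = c("R", "Python", "Java"),
  Age      = c(22, 25, 45),
  stringsAsFactors = FALSE
)

# Rows with Age below 30, keeping only two columns
by_subset   <- subset(df, Age < 30, select = c(Name, Age))
by_indexing <- df[df$Age < 30, c("Name", "Age")]

print(by_subset)
print(by_indexing)
```

`subset()` is convenient for interactive work; explicit indexing is the more predictable choice inside functions.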


## 4. Editing DataFrames
Much like a list, a data frame can be edited by direct assignment.

In [36]:
# Editing data frames by direct assignment 
# [[3]] accesses the top-level component, 
# here the Age column 
# [[3]][3] accesses an element inside that component, 
# here the Age of Asish 
df[[3]][3] = 30
  
cat("After editing the dataframe\n") 
print(df) 

After editing the dataframe
   Name Language Age
1 Amiya        R  22
2   Raj   Python  25
3 Asish     Java  30


## 5. Adding extra columns
We can add an extra column using `cbind()`. The syntax is given below:
```
newDF = cbind(df, new_column)
# df         = original data frame
# new_column = the entries for the new column
```

In [37]:
# Add a new column using cbind() 
newDf = cbind(df, Rank=c(3, 5, 1)) 
  
cat("After adding a column\n") 
print(newDf) 

After adding a column
   Name Language Age Rank
1 Amiya        R  22    3
2   Raj   Python  25    5
3 Asish     Java  30    1
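A column can also be added by assigning to a new name with `$`, and rows can be appended with `rbind()`. In the sketch below, the extra row's values are made up for illustration.

```r
df <- data.frame(
  Name     = c("Amiya", "Raj", "Asish"),
  Language = c("R", "Python", "Java"),
  Age      = c(22, 25, 30),
  stringsAsFactors = FALSE
)

# Add a column by assigning to a new name
df$Rank <- c(3, 5, 1)

# Append a row; its columns must match the existing ones (values are hypothetical)
new_row <- data.frame(Name = "Kamala", Language = "Go", Age = 28, Rank = 2,
                      stringsAsFactors = FALSE)
df <- rbind(df, new_row)

print(df)
```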


## 6. Deleting rows and columns from a data frame
To delete a row or a column, index the data frame with a negative row or column number. The negative sign means that row or column is left out of the result.

Syntax:
```
newDF = df[-rowNo, -colNo]

# df = original data frame
```

In [38]:
# delete the third row and the second column 
newDF = df[-3, -2] 
  
cat("After deleting the 3rd row and 2nd column\n") 
print(newDF) 

After deleting the 3rd row and 2nd column
   Name Age
1 Amiya  22
2   Raj  25


# Missing Values
Missing values are denoted by `NA`. `NaN` ("not a number") denotes the result of an undefined mathematical operation.
- **is.na()** is used to test objects if they are **NA**
- **is.nan()** is used to test for **NaN**
- **NA** values have a class also, so there are integer **NA**, character **NA**, etc.
- A **NaN** value is also **NA** but the converse is not true

In [39]:
m = c(2L, NA)
print(typeof(m))  # integer

n = c(4, NA)
print(typeof(n))  # double

o = c("h", NA)
print(typeof(o))  # character

p = c(TRUE, NA)
print(typeof(p))  # logical

[1] "integer"
[1] "double"
[1] "character"
[1] "logical"


In [40]:
i <- c(1, 2, NA, 10, 3)
print(is.na(i))              # Returns a Logical vector: FALSE FALSE TRUE FALSE FALSE
print(is.nan(i))             # Returns a Logical vector: FALSE FALSE FALSE FALSE FALSE


j <- c(1, 2, NaN, NA, 4)
print(is.na(j))              # Returns a Logical vector: FALSE FALSE TRUE TRUE FALSE
print(is.nan(j))             # Returns a Logical vector: FALSE FALSE TRUE FALSE FALSE

[1] FALSE FALSE  TRUE FALSE FALSE
[1] FALSE FALSE FALSE FALSE FALSE
[1] FALSE FALSE  TRUE  TRUE FALSE
[1] FALSE FALSE  TRUE FALSE FALSE
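In practice, missing values usually need to be counted, skipped, or removed before a calculation. The sketch below shows a few base R options on the same vector `i`.

```r
i <- c(1, 2, NA, 10, 3)

n_missing <- sum(is.na(i))            # 1 missing value
print(n_missing)

print(mean(i))                        # NA: any NA makes the result NA
clean_mean <- mean(i, na.rm = TRUE)   # 4: NA values are skipped
print(clean_mean)

complete <- i[!is.na(i)]              # 1 2 10 3
print(complete)

# NA has typed variants
print(typeof(NA_integer_))            # "integer"
print(typeof(NA_character_))          # "character"
```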
