### Introduction


* Overview
- R is a powerful tool for statistical computing and data analysis.
- Widely used by statisticians, data scientists, and researchers.
- Offers extensive packages and libraries for data manipulation, statistical modeling, and visualization.
- Originated as an implementation of the S programming language; the R project was conceived in 1992, and the first stable version (1.0.0) was released in 2000.

* Key Features
1. **Comprehensive Statistical Analysis**: Supports a wide range of statistical techniques.
2. **Advanced Data Visualization**: Tools like ggplot2 and plotly for creating detailed visualizations.
3. **Extensive Packages and Libraries**: Thousands of packages available via CRAN for various applications.
4. **Open Source and Free**: Accessible to everyone without licensing costs.
5. **Platform Independence**: Runs on Windows, macOS, and Linux.
6. **Integration with Other Languages**: Compatible with C, C++, Python, Java, and SQL.
7. **Powerful Data Handling**: Supports various data types and structures.
8. **Robust Community Support**: Active user community providing resources and assistance.
9. **Interactive Development Environment (IDE)**: RStudio offers a user-friendly interface.
10. **Reproducible Research**: Tools like R Markdown for creating dynamic reports.

* Advantages
- Comprehensive statistical analysis capabilities.
- Open-source nature allows for widespread use and contributions.
- Cross-platform compatibility.
- Encourages community contributions for package development.

* Disadvantages
- Some packages may have inconsistent quality.
- High memory consumption can occur.
- No official vendor support; troubleshooting relies mainly on community resources.
- Can be slower than languages like Python and MATLAB for some workloads, particularly unvectorized loops.

* Applications
- Used extensively in Data Science for statistical computing and design.
- Popular among quantitative analysts for data importing and cleaning.
- Fundamental tool in finance and analytics.
- Employed by major tech companies like Google, Facebook, and Twitter.



1. **Origin and Naming**: 
   - R is an implementation of the S programming language, with influences from Scheme.
   - Named after the first names of its authors, Ross Ihaka and Robert Gentleman, and as a play on the name "S."

2. **Programming Paradigms**: 
   - Supports both procedural and object-oriented programming.
   - Procedural programming includes procedures, records, modules, and calls; object-oriented programming includes classes, objects, and generic functions.

3. **Interpreted Language**: 
   - R is an interpreted language: code runs without a separate compilation step, which shortens the edit-and-run cycle, although interpreted code often executes more slowly than compiled code.

4. **Extensive Packages**: 
   - Thousands of R packages are available via CRAN, with many more on GitHub, enabling complex tasks with minimal code.

5. **Rapid Growth**: 
   - Some surveys have reported R growing faster than other data science languages and ranking as the second most-used language after SQL, with about 70% of data miners using it; such figures depend on the survey and the year.

6. **Reproducible Documents**: 
   - The rmarkdown package allows users to create reproducible Word documents and PowerPoint presentations from R markdown code with a simple YAML change.

7. **Database Connectivity**: 
   - Connect to common database types with the DBI package and query them with dplyr syntax through the dbplyr package, which translates the code to SQL.

8. **Interactive Web Apps**: 
   - Create interactive dashboards and web apps with minimal code using the flexdashboard package (optionally together with Shiny); deploy them to hosted services such as shinyapps.io with rsconnect.

9. **Game Development**: 
   - The nessy package enables the creation of NES-style Shiny apps, allowing for game development in R.

10. **API Development**: 
    - The plumber package converts R functions into web APIs for integration with other applications.

11. **Popularity**: 
    - R has ranked #7 in the PYPL Popularity of Programming Languages index and has been reported as the top Google search for advanced analytics software, with over 3 million users globally; these figures change over time.

12. **Historical Background**: 
    - Introduced in 1993 by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand.

13. **Open Source**: 
    - R is free and open-source, available for statistical and graphical purposes.

14. **Supportive Community**: 
    - R has a large, enthusiastic user community providing resources and assistance.

15. **Widespread Usage**: 
    - Commonly used in data science, machine learning, and statistical modeling, making it a sought-after programming language.

16. **Diverse Applications**: 
    - Utilized in industries like finance, healthcare, pharmaceuticals, and marketing for data analysis and modeling.

17. **Academic Research**: 
    - A crucial tool in various academic disciplines, including biology, psychology, and economics.

18. **Cross-Platform Compatibility**: 
    - Operates seamlessly on Windows, macOS, and Linux, ensuring accessibility for all users.


19. **Language Integration**: 
    - Integrates with C, C++, .NET, Python, and Fortran.

20. **Licensing and Distribution**: 
    - Freely available under the GNU GPL (a copyleft license) as part of the GNU project, where it is known as GNU S; precompiled binary versions are available for different operating systems.

### Basic Syntax

In [85]:
# start R command prompt
# $ R

# first program
print("Helloworld")

[1] "Helloworld"


In [86]:
# BASIC SYNTAX

# ASSIGNMENTS (SIMPLE, LEFTWARD, RIGHTWARD)

v1 = "yo"
v2 <- "yo" # <<- also assigns leftward, but in the enclosing (often global) environment
"yo" -> v3 # ->> is the rightward counterpart of <<-
print(v1)
print(v2)
print(v3)

# MULTI-LINE COMMENTS: R has no block-comment syntax; one workaround is to put the text in strings inside if (FALSE) so it is never evaluated
if (FALSE)
{
'hello
how are you'
  
"hello
how are you"
}

[1] "yo"
[1] "yo"
[1] "yo"
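
The difference between `<-` and `<<-` only shows up inside a function. The following is a minimal sketch of that behaviour; `counter`, `bump_local` and `bump_global` are hypothetical names chosen for illustration and are not part of the notes above.

```R
# 'counter', 'bump_local' and 'bump_global' are hypothetical names for illustration
counter <- 0

bump_local <- function() {
  counter <- counter + 1    # creates a new local 'counter'; the outer one is untouched
  counter
}

bump_global <- function() {
  counter <<- counter + 1   # updates 'counter' in the enclosing (here global) environment
  counter
}

local_result  <- bump_local()
after_local   <- counter    # still the original value
global_result <- bump_global()
after_global  <- counter    # now updated

print(c(local_result, after_local, global_result, after_global))
```

At the top level of a script both operators behave the same, which is why the cell above shows no difference between them.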


![image.png](attachment:image.png)

### Operators

In [87]:
# ARITHMETIC OPERATORS
a <- 2
b <- 4

print(a+b)
print(a-b)
print(a*b)
print(a/b)
print(b^a) #POWER
print(a%%b) #MODULUS


[1] 6
[1] -2
[1] 8
[1] 0.5
[1] 16
[1] 2


In [88]:
# LOGICAL OPERATORS: TRUE FALSE NA
vec1 <- c(0,2)
vec2 <- c(TRUE,FALSE)

cat ("Element wise AND :", vec1 & vec2, "\n") # any nonzero number (real or complex) is treated as TRUE; zero is FALSE
cat ("Element wise OR :", vec1 | vec2, "\n")
cat ("Logical AND :", vec1[1] && vec2[1], "\n")
cat ("Logical OR :", vec1[1] || vec2[1], "\n")
cat ("Negation :", !vec1)

Element wise AND : FALSE FALSE 
Element wise OR : TRUE TRUE 
Logical AND : FALSE 
Logical OR : TRUE 
Negation : TRUE FALSE
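
The cell above uses both the element-wise operators (`&`, `|`) and the single-value operators (`&&`, `||`). A short sketch of how they differ, using illustrative vectors `x` and `y` that are assumptions of this example:

```R
x <- c(TRUE, FALSE, TRUE)
y <- c(TRUE, TRUE, FALSE)

elementwise_and <- x & y      # compares position by position
elementwise_or  <- x | y
print(elementwise_and)
print(elementwise_or)

# && and || work on single values and stop as soon as the result is known,
# so the stop() call below is never evaluated
short_circuit <- FALSE && stop("never evaluated")
print(short_circuit)

# any() and all() reduce a logical vector to a single value
print(any(elementwise_and))
print(all(elementwise_or))
```

This is why `&&` and `||` are the usual choice inside `if` conditions, while `&` and `|` are used for filtering vectors.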

In [89]:
# RELATIONAL OPERATORS
vec1 <- c(0, 2)
vec2 <- c(2, 3)

# element wise operations
cat ("Vector1 less than Vector2 :", vec1 < vec2, "\n")
cat ("Vector1 less than equal to Vector2 :", vec1 <= vec2, "\n")
cat ("Vector1 greater than Vector2 :", vec1 > vec2, "\n")
cat ("Vector1 greater than equal to Vector2 :", vec1 >= vec2, "\n")
cat ("Vector1 not equal to Vector2 :", vec1 != vec2, "\n")

Vector1 less than Vector2 : TRUE TRUE 
Vector1 less than equal to Vector2 : TRUE TRUE 
Vector1 greater than Vector2 : FALSE FALSE 
Vector1 greater than equal to Vector2 : FALSE FALSE 
Vector1 not equal to Vector2 : TRUE TRUE 


In [90]:
# MISCELLANEOUS OPERATORS

mat <- matrix (1:4, nrow = 1, ncol = 4) #create a matrix
print("Matrix elements using : ")
print(mat)

product = mat %*% t(mat) # multiply the matrix by its transpose
print("Product of matrices")
print(product)
cat ("does 1 exist in prod matrix :", 1 %in% product) # check whether the matrix contains the value 1

[1] "Matrix elements using : "
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[1] "Product of matrices"
     [,1]
[1,]   30
does 1 exist in prod matrix : FALSE

### Datatypes

* dynamically typed, so no explicit declaration of the data type (dt)
* variables are assigned an R object, and the dt of that object becomes the dt of the variable
* a vector is the simplest object; there are 6 dts/classes of atomic vectors:
    `logical, numeric, integer, complex, character, raw`
* important functions:
    * class(): kind of object (high level)
    * typeof(): kind of object (low level, the internal storage type)
    * attributes(): metadata attached to the object, if any
![image.png](attachment:image.png)

In [91]:
# NUMERIC DATATYPE: default dt for numbers, with or without a decimal part; stored as double-precision floating point
x = 5 #numeric
y = 5.6 #numeric
cat(class(x), typeof(x), "\n") # prints: numeric double (double = double-precision floating point, not a number of decimal places)

# INTEGER DATATYPE: subset of numeric
z = 4L
cat(class(z), typeof(z), "\n")

# LOGICAL DATATYPE
l = y>x
cat(l, class(l), typeof(l), "\n")

# COMPLEX DATATYPE: real and imaginary parts; sqrt(-1) gives NaN with a warning, but sqrt(as.complex(-1)) gives 0+1i (as.complex(-1) itself is -1+0i)
c = 4+3i
cat(class(c), typeof(c), "\n")

#CHARACTER DATATYPE: strings, paste(char1,char2) function to concatenate these
char = "hello"
cat(class(char), typeof(char), "\n")

# RAW DATATYPE: holds raw bytes for low-level operations; values are displayed as unprocessed bytes
raw <- as.raw(c(0x1, 0x2, 0x3, 0x4, 0x5)) # raw vector: each byte is shown as a pair of hex digits
# as.raw() returns 00 (with a warning) for values it cannot convert
# is.raw(raw_var) is TRUE <=> typeof(raw_var) is "raw"
cat(class(raw), typeof(raw), "\n")

# KNOWING THE DATATYPE: class(object) 
# Logical
print(class(TRUE))
# Integer
print(class(3L))
# Numeric
print(class(10.5))
# Complex
print(class(1+2i))
# Character
print(class("12-04-2020"))

# TYPE VERIFICATION: is.datatype(object)
# Logical
print(is.logical(TRUE))
# Integer
print(is.integer(3L))
# Numeric
print(is.numeric(10.5))
# Complex
print(is.complex(1+2i))
# Character
print(is.character("12-04-2020"))

# COERCING DTS: as.datatype(object); the labels below name the type being converted
# Logical
print(as.numeric(TRUE))
# Integer
print(as.complex(3L))
# Numeric
print(as.logical(10.5))
# Complex
print(as.character(1+2i))
# Not possible: as.numeric("12-04-2020") returns NA with a warning (not an error), because the string is not a number


numeric double 
integer integer 
TRUE logical logical 
complex complex 
character character 
raw raw 
[1] "logical"
[1] "integer"
[1] "numeric"
[1] "complex"
[1] "character"
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] 1
[1] 3+0i
[1] TRUE
[1] "1+2i"
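
When values of different types meet in one vector, R coerces them to the most general type present. The sketch below illustrates that ordering (logical, integer, numeric, complex, character) along with the NA produced by a failed conversion; all variable names are illustrative assumptions.

```R
mixed_1 <- c(TRUE, 2L)        # logical + integer -> integer
mixed_2 <- c(2L, 3.5)         # integer + double  -> numeric
mixed_3 <- c(3.5, 1+2i)       # double + complex  -> complex
mixed_4 <- c(3.5, "text")     # anything + string -> character

print(c(class(mixed_1), class(mixed_2), class(mixed_3), class(mixed_4)))

# A string that is not a number converts to NA (R also emits a warning,
# suppressed here to keep the output clean)
not_a_number <- suppressWarnings(as.numeric("12-04-2020"))
print(is.na(not_a_number))

# The square root of -1 is only defined for the complex type
imaginary_unit <- sqrt(as.complex(-1))
print(imaginary_unit)
```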


In [5]:
# CHARACTER DATATYPE: strings, 
char = "hello"
cat(class(char), typeof(char), "\n")
# paste(char1, char2) concatenates strings, separated by a space by default
char2 <- "sam"
paste(char, char2)
# sprintf() builds a formatted string using C-style format specifiers
sprintf("%s has %d dollars","raj",100)
# substr(str, start =, stop =) extracts a substring
substr("ABCDEFGHIJKLMNOPQRSTUVWXYZ", start = 5, stop =10) # characters 5 through 10, inclusive
# sub() replaces the first occurrence of a substring with another substring
sub("big", "small", "raj has big car") # replace big with small

character character 
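
A few related base R string functions are worth seeing side by side. This is a small sketch with illustrative values (`first_name`, `item`); it contrasts the separators of `paste()` and `paste0()` and the first-match behaviour of `sub()` with the all-matches behaviour of `gsub()`.

```R
first_name <- "raj"   # illustrative values
item <- "car"

joined_space <- paste(first_name, item)              # default separator is a space
joined_dash  <- paste(first_name, item, sep = "-")   # custom separator
joined_none  <- paste0(first_name, item)             # no separator

first_only <- sub("a", "o", "banana")    # replaces the first match only
every_one  <- gsub("a", "o", "banana")   # replaces every match

print(c(joined_space, joined_dash, joined_none))
print(c(first_only, every_one))
print(nchar("banana"))        # number of characters
print(toupper(first_name))    # upper-case copy
```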


### Variables

* named storage that our programs can manipulate.
* can store atomic vector(s) or a combination of many objects.
* dynamically typed language; no explicit declaration of dt; the dt of a variable can change on reassignment
* creation: leftward, rightward, or simple (=) assignment operator
* case sensitive
* naming conventions: 
    * letters, digits, and the special characters . and _
    * must start with a letter or . ; not a digit or _
    * if it starts with . the next character must NOT be a digit
    * must not be a reserved keyword
* important functions: 
    * class(): knowing datatype
    * ls(pattern = "pattern_name", all.names = TRUE): lists the variables present in the workspace; pattern restricts the result to matching variable names; all.names = TRUE also lists hidden variables, whose names start with a dot (.) 
    * rm(): delete unwanted variables in workspace
* variables can be local (visible only inside a function) or global (visible in the workspace)

* note: the vector `c(TRUE, 1)` mixes the logical and numeric classes, so the logical value is coerced to numeric, making TRUE 1.

In [92]:
# class()
var1 = "hello"
print(class(var1))

# ls()
var2 <- "hello"
"hello" -> var3
print(ls())

# rm()
# rm(list = ls()) removes all variables in workspace
rm(var3)

[1] "character"
 [1] "a"             "A"             "accdf"         "b"            
 [5] "B"             "c"             "char"          "D"            
 [9] "empAge"        "empId"         "empList"       "empName"      
[13] "l"             "lst"           "lst1"          "lst2"         
[17] "M"             "mat"           "My.nums"       "my_named_list"
[21] "new_list"      "newdf"         "newfactor"     "newmatrix"    
[25] "newmatrix2"    "numberOfEmp"   "product"       "raw"          
[29] "S"             "thisarray"     "thisdf"        "thisdf2"      
[33] "thisfactor"    "thismatrix"    "v1"            "v2"           
[37] "v3"            "var1"          "var2"          "var3"         
[41] "vec"           "vec1"          "vec2"          "x"            
[45] "X"             "y"             "Y"             "z"            
[49] "Z"             "z1"            "z2"           
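
The naming rules listed above can be checked programmatically. The sketch below relies on base R's `make.names()`, which returns a name unchanged only when it is already syntactically valid; `is_valid_name` is a hypothetical helper written for this example, and the candidate names are illustrative.

```R
# 'is_valid_name' is a hypothetical helper, not a base R function
is_valid_name <- function(name) make.names(name) == name

candidates <- c("my_var", "var.name", ".hidden", "2var", ".2var", "_var", "if")
validity <- vapply(candidates, is_valid_name, logical(1))
print(validity)

# exists() reports whether a variable is currently defined; rm() removes it
temp_value <- 42
defined_before <- exists("temp_value")
rm(temp_value)
defined_after <- exists("temp_value")
print(c(defined_before, defined_after))
```

The first three candidates follow the rules; the rest start with a digit, a dot followed by a digit, an underscore, or are a reserved keyword.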


### Data Structures

* a way of organising data for effective use; reduces space and time complexity
* organised by: dimensionality (1D, 2D, ..., nD) & type of elements in a DS (homogeneous or heterogeneous)
* ds types:
    * vectors: ordered, homogeneous, 1D; c()
    * lists: ordered, heterogeneous, 1D; list()
    * dataframes: tabular, most popular, heterogeneous, 2D; data.frame()
    * matrices: rows and cols, homogeneous, 2D; matrix()
    * arrays: any number of dimensions, homogeneous, nD; array()
    * factors: categorise and store data as levels (built from strings or numbers, stored as integer codes), 1D; factor()
    * tibbles: enhanced dataframe and part of tidyverse

#### Vectors

* similar to 1D arrays in other languages; a sequence of data elements
* indexing starts from 1, not 0
* ordered, homogeneous and 1D
* c(components/members)

##### BASIC OPERATIONS


* creating: c(comma-separated elements), seq(), or the colon operator as in 1:10 
* length: length(vector object)
* accessing: 
    * use subscript operator: vector_obj[2]
    * use the c() (combine) function: vector_obj[c(4,1)]
* modifying:
    * using subscript operator: vector_obj[3] <- 2; 3rd element updated to 2
    * using a range of indices: vector_obj[1:5] <- 0; elements 1 to 5 set to 0
    * using logical indexing: vector_obj[vector_obj > 9] <- 0; elements greater than 9 set to 0
    * using the c() function: vector_obj <- vector_obj[c(3,2,1)]; the 3rd, 2nd and 1st elements form the new vector
* deleting:
    * using NULL keyword: vector_obj <- NULL
* arithmetic: using arithmetic operators
* sorting:
    * using sort(vector_obj, decreasing = TRUE)

In [2]:
# CREATING 
# by default the type will be double
X <- c(1, 4, 5, 2, 6, 7) 
print('using c function')
print(X)
  
# using the seq() function to generate
# a sequence of evenly spaced values.
# by sets the step size; alternatively,
# length.out sets the number of elements.
Y <- seq(1, 10, by = 0.5) 
print('using seq() function') 
print(Y)
  
# using ':' operator/range operator to create
# a vector of consecutive values. 
# only applicable for numeric data
Z <- 5:10
print('using colon')
print(Z)

[1] "using c function"
[1] 1 4 5 2 6 7
[1] "using seq() function"
 [1]  1.0  1.5  2.0  2.5  3.0  3.5  4.0  4.5  5.0  5.5  6.0  6.5  7.0  7.5  8.0
[16]  8.5  9.0  9.5 10.0
[1] "using colon"
[1]  5  6  7  8  9 10
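
`seq()` can be driven either by a step size or by a target length, and `rep()` covers repeated values. A brief sketch using only base R, with illustrative variable names:

```R
by_step   <- seq(1, 10, by = 3)            # fixed step size
by_length <- seq(1, 10, length.out = 4)    # fixed number of elements
counting  <- seq_len(5)                    # 1, 2, ..., 5
repeated  <- rep(c(1, 2), times = 3)       # whole vector repeated three times
each_rep  <- rep(c(1, 2), each = 2)        # each element repeated twice

print(by_step)
print(by_length)
print(counting)
print(repeated)
print(each_rep)
```

In this example `by = 3` and `length.out = 4` happen to describe the same sequence, which shows that the two arguments are alternative ways of specifying spacing.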


In [7]:
# ACCESSING

X <- c(1, 4, 5, 2, 6, 7) 
print('using c function')
print(X)
cat("accessing element in second position\n")
print(X[2])
cat("accessing all elements except element in second position\n")
print(X[-2])
# an out-of-range index returns NA
cat("accessing elements from 2nd to 4thposition\n")
print(X[2:4]) # the end index 4 is included, unlike Python slices

# indexing multiple elements
days <- c("mon","tues","thurs","wedn")
print(days[c(1,3,4)])
print(days[c(TRUE,FALSE,TRUE,TRUE)]) # same as above
print(days[c(-2,-4)]) # without 2nd and 4th element

[1] "using c function"
[1] 1 4 5 2 6 7
accessing element in second position
[1] 4
accessing all elements except element in second position
[1] 1 5 2 6 7
accessing elements from 2nd to 4thposition
[1] 4 5 2
[1] "mon"   "thurs" "wedn" 
[1] "mon"   "thurs" "wedn" 
[1] "mon"   "thurs"
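
Two points from the cell above, the NA returned for an out-of-range index and selection by logical vector, can be sketched together as follows. The variable names on the left are illustrative.

```R
X <- c(1, 4, 5, 2, 6, 7)

out_of_range <- X[10]          # no error: the result is NA
above_four   <- X[X > 4]       # logical indexing keeps elements where the condition is TRUE
positions    <- which(X > 4)   # positions of those elements
last_element <- X[length(X)]   # a negative index drops elements, so use length() for the last one

print(out_of_range)
print(above_four)
print(positions)
print(last_element)
```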


In [95]:
# MODIFYING 

# Creating a vector
X <- c(2, 5, 1, 7, 8, 2)
  
# modify a specific element
X[3] <- 11
print('Using subscript operator')
print(X)
  
# Modify the elements that satisfy a logical condition.
X[X>9] <- 0
print('Logical indexing')
print(X)
  
# Modify by specifying the position or elements.
X <- X[c(5, 2, 1)]
print('using c function')
print(X)

[1] "Using subscript operator"
[1]  2  5 11  7  8  2
[1] "Logical indexing"
[1] 2 5 0 7 8 2
[1] "using c function"
[1] 8 5 2


In [96]:
# DELETING

# Creating a vector
X <- c(5, 2, 1, 6)
  
# Assigning NULL empties the vector; rm(X) would remove the variable itself
X <- NULL
print('Deleted vector')
print(X)

[1] "Deleted vector"
NULL


In [97]:
# ARITHMETIC OPERATIONS

# Creating Vectors
X <- c(5, 2, 5, 1, 51, 2)
Y <- c(7, 9, 1, 5, 2, 1)
# Addition
Z <- X + Y
print('Addition')
print(Z) 
# Subtraction
S <- X - Y
print('Subtraction')
print(S)
# Multiplication
M <- X * Y
print('Multiplication')
print(M)
# Division
D <- X / Y
print('Division')
print(D)
# recycling rule: 
if(FALSE){
    "If two vectors are of unequal length, the shorter one is recycled (repeated) to
match the length of the longer one. For example, c(1, 2) + c(10, 20, 30, 40) is
computed as c(1, 2, 1, 2) + c(10, 20, 30, 40)."
}

[1] "Addition"
[1] 12 11  6  6 53  3
[1] "Subtraction"
[1] -2 -7  4 -4 49  1
[1] "Multiplication"
[1]  35  18   5   5 102   2
[1] "Division"
[1]  0.7142857  0.2222222  5.0000000  0.2000000 25.5000000  2.0000000
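
The recycling rule mentioned at the end of the cell above can be seen in a short runnable sketch. The vectors `u` and `v` and their values are assumptions made for this example.

```R
u <- c(10, 20, 30)            # shorter vector (illustrative values)
v <- c(1, 2, 3, 4, 5, 6)      # longer vector; its length is a multiple of length(u)

recycled_sum <- u + v         # computed as c(10, 20, 30, 10, 20, 30) + v
print(recycled_sum)

# The same rule is why a single number can be applied to a whole vector
scaled <- v * 2
print(scaled)
```

When the longer length is not a multiple of the shorter one, R still recycles but issues a warning, which is usually a sign of a mistake.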


In [98]:
# SORTING VECTORS

# Creating a Vector
X <- c(5, 2, 5, 1, 51, 2)
# Sort in ascending order
A <- sort(X)
print('sorting done in ascending order')
print(A)
# sort in descending order.
B <- sort(X, decreasing = TRUE)
print('sorting done in descending order')
print(B)

[1] "sorting done in ascending order"
[1]  1  2  2  5  5 51
[1] "sorting done in descending order"
[1] 51  5  5  2  2  1


In [None]:
# NAMED VECTOR
v1 <- c("may","june")
names(v1) <- c("first", "last") # naming each member
print(v1)
print(v1["first"]) # using names for indexing

 first   last 
 "may" "june" 
first 
"may" 


##### Append

* using c()
* using append()
* using indexing

In [99]:
# USING c()
x <- 1:4
y <- 5:10
a <- letters[1:5]
z <- c(x,y)
z1 <- c(x,a)
print(z)
print(z1)

# USING append()
z2 <- append(x, y)
print(z2)

# USING INDEXING
x[5] <- 5
x[6] <- 6
print(x)

 [1]  1  2  3  4  5  6  7  8  9 10
[1] "1" "2" "3" "4" "a" "b" "c" "d" "e"
 [1]  1  2  3  4  5  6  7  8  9 10
[1] 1 2 3 4 5 6


#### Lists

* ordered collection of objects, heterogeneous and 1D
* creating 
* naming a list
* accessing a list:
    * by names
    * by indices
* modifying
* concatenation
* appending
* deleting component
* merging 
* converting to a vector: unlist()

In [100]:
# CREATING 
empId = c(1, 2, 3, 4)
empName = c("Debi", "Sandeep", "Subham", "Shiba")
numberOfEmp = 4

empList = list(empId, empName, numberOfEmp)
  
print(empList)
# in the output, [[n]] is the position of a component in the list and [n] is the index within that component

[[1]]
[1] 1 2 3 4

[[2]]
[1] "Debi"    "Sandeep" "Subham"  "Shiba"  

[[3]]
[1] 4



In [73]:
# NAMING LIST COMPONENTS
my_named_list <- list(name = "Sudheer", age = 25, city = "Delhi")
print(my_named_list)
# named components are printed as $name instead of [[n]]

$name
[1] "Sudheer"

$age
[1] 25

$city
[1] "Delhi"



In [74]:
# ACCESSING BY NAME
empId = c(1, 2, 3, 4)
empName = c("Debi", "Sandeep", "Subham", "Shiba")
numberOfEmp = 4
empList = list(
  "ID" = empId,
  "Names" = empName,
  "Total Staff" = numberOfEmp
  )
print(empList)
cat("Accessing name components using $ command\n")
print(empList$Names)

# ACCESSING BY INDICES
empId = c(1, 2, 3, 4)
empName = c("Debi", "Sandeep", "Subham", "Shiba")
numberOfEmp = 4
empList = list(
  "ID" = empId,
  "Names" = empName,
  "Total Staff" = numberOfEmp
  )
print(empList)
cat("Accessing name components using indices\n")
print(empList[[2]])
cat("Accessing Sandeep from name using indices\n")
print(empList[[2]][2])
cat("Accessing 4 from ID using indices\n")
print(empList[[1]][4])


$ID
[1] 1 2 3 4

$Names
[1] "Debi"    "Sandeep" "Subham"  "Shiba"  

$`Total Staff`
[1] 4

Accessing name components using $ command
[1] "Debi"    "Sandeep" "Subham"  "Shiba"  
$ID
[1] 1 2 3 4

$Names
[1] "Debi"    "Sandeep" "Subham"  "Shiba"  

$`Total Staff`
[1] 4

Accessing name components using indices
[1] "Debi"    "Sandeep" "Subham"  "Shiba"  
Accessing Sandeep from name using indices
[1] "Sandeep"
Accessing 4 from ID using indices
[1] 4
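
Single and double brackets return different things when applied to a list. A minimal sketch of the distinction, reusing the employee data from the cell above in a shortened form (the variable names `as_sublist` and `as_vector` are illustrative):

```R
empList <- list(ID = c(1, 2, 3, 4), Names = c("Debi", "Sandeep", "Subham", "Shiba"))

as_sublist <- empList["Names"]     # single brackets: a list with one component
as_vector  <- empList[["Names"]]   # double brackets: the component itself

print(class(as_sublist))
print(class(as_vector))
print(identical(as_vector, empList$Names))   # $ is shorthand for [[ ]] with a name
```

This is why `empList[[2]][2]` works in the cell above: the double brackets first extract the vector, and the single brackets then index into it.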


In [75]:
# MODIFYING 
empId = c(1, 2, 3, 4)
empName = c("Debi", "Sandeep", "Subham", "Shiba")
numberOfEmp = 4
empList = list(
  "ID" = empId,
  "Names" = empName,
  "Total Staff" = numberOfEmp
)
cat("Before modifying the list\n")
print(empList)
# Modifying the top-level component
empList$`Total Staff` = 5
# Modifying inner level component
empList[[1]][5] = 5
empList[[2]][5] = "Kamala"
cat("After modified the list\n")
print(empList)

Before modifying the list
$ID
[1] 1 2 3 4

$Names
[1] "Debi"    "Sandeep" "Subham"  "Shiba"  

$`Total Staff`
[1] 4

After modified the list
$ID
[1] 1 2 3 4 5

$Names
[1] "Debi"    "Sandeep" "Subham"  "Shiba"   "Kamala" 

$`Total Staff`
[1] 5



In [76]:
# CONCATENATION
empId = c(1, 2, 3, 4)
empName = c("Debi", "Sandeep", "Subham", "Shiba")
numberOfEmp = 4
empList = list(
  "ID" = empId,
  "Names" = empName,
  "Total Staff" = numberOfEmp
)
cat("Before concatenation of the new list\n")
print(empList)
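empAge = c(34, 23, 18, 45)
# wrap the new vector in list() so it is added as a component;
# c(empName, empAge) would instead merge two vectors into one character vector
empList = c(empList, list("Age" = empAge))
cat("After concatenation of the new list\n")
print(empList)

Before concatenation of the new list
$ID
[1] 1 2 3 4

$Names
[1] "Debi"    "Sandeep" "Subham"  "Shiba"  

$`Total Staff`
[1] 4

After concatenation of the new list
$ID
[1] 1 2 3 4

$Names
[1] "Debi"    "Sandeep" "Subham"  "Shiba"  

$`Total Staff`
[1] 4

$Age
[1] 34 23 18 45
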


In [77]:
# APPENDING 
My.nums = list(1,2,3,4)
print(My.nums)
print("new list appended")
z = append(My.nums,6)
print(z)


[[1]]
[1] 1

[[2]]
[1] 2

[[3]]
[1] 3

[[4]]
[1] 4

[1] "new list appended"
[[1]]
[1] 1

[[2]]
[1] 2

[[3]]
[1] 3

[[4]]
[1] 4

[[5]]
[1] 6



In [78]:
# DELETING A COMPONENT
empId = c(1, 2, 3, 4)
empName = c("Debi", "Sandeep", "Subham", "Shiba")
numberOfEmp = 4
empList = list(
"ID" = empId,
"Names" = empName,
"Total Staff" = numberOfEmp
)
cat("Before deletion the list is\n")
print(empList)
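# Dropping a top-level component (the result is only printed; empList itself is unchanged)
cat("After Deleting Total staff components\n")
print(empList[-3])
# Dropping an element inside a component (again, only the printed copy changes)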
cat("After Deleting sandeep from name\n")
print(empList[[2]][-2])


Before deletion the list is
$ID
[1] 1 2 3 4

$Names
[1] "Debi"    "Sandeep" "Subham"  "Shiba"  

$`Total Staff`
[1] 4

After Deleting Total staff components
$ID
[1] 1 2 3 4

$Names
[1] "Debi"    "Sandeep" "Subham"  "Shiba"  

After Deleting sandeep from name
[1] "Debi"   "Subham" "Shiba" 
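
The cell above prints copies of the list with a component left out, so `empList` itself does not change. To delete in place, one common approach is to assign `NULL` to the component or to store the shortened vector back. A small sketch with a shortened version of the same data (the component name `Total` is an assumption of this example):

```R
empList <- list(
  ID = c(1, 2, 3, 4),
  Names = c("Debi", "Sandeep", "Subham", "Shiba"),
  Total = 4
)

empList$Total <- NULL                  # removes the component from the list
empList$Names <- empList$Names[-2]     # stores the vector without its 2nd element

print(names(empList))
print(empList$Names)
```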


In [79]:
# MERGING LISTS
lst1 <- list(1,2,3)
lst2 <- list("Sun","Mon","Tue")
 
# Merge the two lists.
new_list <- c(lst1,lst2)
 
# Print the merged list.
print(new_list)

[[1]]
[1] 1

[[2]]
[1] 2

[[3]]
[1] 3

[[4]]
[1] "Sun"

[[5]]
[1] "Mon"

[[6]]
[1] "Tue"



In [80]:
# CONVERTING LIST TO VECTOR
lst <- list(1:5)
print(lst)
vec <- unlist(lst)
 
print(vec)

[[1]]
[1] 1 2 3 4 5

[1] 1 2 3 4 5


#### Matrices

* 2D, homogeneous with rows and cols
* rows are the horizontal and cols the vertical representation
* function: matrix(data, nrow=, ncol=)

In [15]:
# creating 
thismatrix <- matrix(c(1,2,3,4), nrow = 2, ncol = 2) # can use vector of strings too
print(thismatrix)

#accessing
thismatrix[1,2] # syntax = matrixobj[rowpos, colpos]
thismatrix[1,] # whole row
thismatrix[,2] # whole col
thismatrix[c(1,2), ] # more than one row
thismatrix[, c(1,2)] # more than one column
# appending/combining
newmatrix <- cbind(thismatrix, c(5,6)) # add columns using cbind()
newmatrix2 <- rbind(thismatrix, c(5,6)) # add rows using rbind()
# removing: drop the first row (use newmatrix[, -1] to drop a column)
newmatrix <- newmatrix[-1, , drop = FALSE]
# num of row cols
dim(thismatrix)
# length
length(newmatrix)
# naming
A <- matrix(1:10, nrow = 2, ncol = 5, byrow = FALSE)
print(A) # to name rows and cols: dimnames(A) <- list(row_names, col_names), or pass dimnames = to matrix()
# transpose
print(t(A))
# deconstruction: flatten the matrix into a vector, column by column
print(c(A))
# simple arithmetic (+, -, *, /) works element-wise; dimensions must be the same

     [,1] [,2]
[1,]    1    3
[2,]    2    4


     [,1] [,2]
[1,]    1    3
[2,]    2    4


     [,1] [,2]
[1,]    1    3
[2,]    2    4


     [,1] [,2] [,3] [,4] [,5]
[1,]    1    3    5    7    9
[2,]    2    4    6    8   10
     [,1] [,2]
[1,]    1    2
[2,]    3    4
[3,]    5    6
[4,]    7    8
[5,]    9   10
 [1]  1  2  3  4  5  6  7  8  9 10
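
The last comments in the cell above mention naming and arithmetic without showing them. A minimal sketch, assuming a small 2 x 2 matrix and illustrative row and column names:

```R
A <- matrix(1:4, nrow = 2, ncol = 2)   # filled column by column: rows are (1, 3) and (2, 4)

elementwise <- A * A      # multiplies matching cells
matrix_prod <- A %*% A    # linear-algebra matrix product

dimnames(A) <- list(c("r1", "r2"), c("c1", "c2"))   # illustrative row and column names
by_name <- A["r1", "c2"]  # same cell as A[1, 2]

print(elementwise)
print(matrix_prod)
print(A)
print(by_name)
```

Note the difference between `*`, which requires matching dimensions and works cell by cell, and `%*%`, which follows the rules of matrix multiplication.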


#### Arrays

* nD, homogeneous (can have only one datatype)
* function: array(data, dim = c(rows, cols, pages))

In [82]:
# creating multidim array; c(1:10) is a vector whose values are recycled to fill the 4 x 3 x 2 = 24 cells
thisarray <- array(c(1:10),dim=c(4,3,2))
print(thisarray)
# access items
thisarray[2,3,2] # arrayobj[rowpos, colpos, page]
thisarray[c(1),,1] # all items from the first row of page 1
thisarray[,c(1),1] # all items from the first col of page 1
# dimensions (rows, cols, pages)
dim(thisarray)
# length
length(thisarray)


, , 1

     [,1] [,2] [,3]
[1,]    1    5    9
[2,]    2    6   10
[3,]    3    7    1
[4,]    4    8    2

, , 2

     [,1] [,2] [,3]
[1,]    3    7    1
[2,]    4    8    2
[3,]    5    9    3
[4,]    6   10    4
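
For comparison, here is a sketch of an array whose data exactly fills its dimensions, so no recycling takes place. It also uses base R's `apply()` to summarise each page; the variable names are illustrative.

```R
arr <- array(1:24, dim = c(4, 3, 2))   # 4 rows, 3 cols, 2 pages

one_cell   <- arr[2, 3, 2]             # row 2, col 3, page 2
page_sums  <- apply(arr, 3, sum)       # one sum per page (the 3rd dimension)
first_page <- arr[, , 1]               # a 4 x 3 matrix

print(one_cell)
print(page_sums)
print(dim(first_page))
```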



#### Factors

* to categorise data into levels; 1D; built from strings or numbers and stored as integer codes
* function: factor(data)

In [83]:
# creating
thisfactor <- factor(c(1,2,3,4,5))
print(thisfactor)
print(levels(thisfactor))
# setting levels 
newfactor <- factor(c(1,2,3,4,5,6,7), levels = c(1,2,3,4,5,6,7,'others'))
print(levels(newfactor))
# length
length(newfactor)
# access
print(newfactor[3])
# modify item
newfactor[3] <- 5
print(newfactor[3])
# ASSIGNING A VALUE THAT IS NOT ONE OF THE LEVELS GIVES NA WITH A WARNING (NOT AN ERROR)
# to use a new value, first add it to the levels: levels(newfactor) <- c(levels(newfactor), "new")


[1] 1 2 3 4 5
Levels: 1 2 3 4 5
[1] "1" "2" "3" "4" "5"
[1] "1"      "2"      "3"      "4"      "5"      "6"      "7"      "others"


[1] 3
Levels: 1 2 3 4 5 6 7 others
[1] 5
Levels: 1 2 3 4 5 6 7 others
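
The behaviour described in the closing comments of the cell above can be tried out directly. This is a minimal sketch with an illustrative factor named `sizes`; the warning is suppressed only to keep the output short.

```R
sizes <- factor(c("low", "high", "low"), levels = c("low", "high"))

# "medium" is not a level yet: the assignment yields NA (with a warning, suppressed here)
suppressWarnings(sizes[2] <- "medium")
was_na <- is.na(sizes[2])
print(was_na)

# Extend the levels first, then the assignment succeeds
levels(sizes) <- c(levels(sizes), "medium")
sizes[2] <- "medium"
print(as.character(sizes))
print(table(sizes))   # counts per level, including levels with no entries
```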


#### Dataframes

* 2D, heterogeneous, tabular representation: each row is an observation and each column is a variable
* all values within a column must have the same type; different columns may have different types
* function: data.frame()


In [84]:
# create
thisdf <- data.frame(empid= c(1,2,3), empname = c("A","B","C"), salary=c(100,200,300))
print(thisdf)
# summary
print(summary(thisdf))
# add cols
thisdf$dpet <- c("IT","Civ","Com")
print(thisdf)
# add rows: create a new df
thisdf2 <- data.frame (
   empid= c(4,5),
  empname = c("E", "F"),
  salary = c(600, 3000), dpet = c("IT","Com")
)
newdf <- rbind(thisdf,thisdf2)
print(newdf)
# access
# specific col
accdf <- data.frame(newdf$empid) # newdf$empid[1] gives the first entry; newdf$empid[1:5] gives the first 5 entries
print(accdf)
newdf[,3] # all the rows and the third col
# specific row
newdf[1:2,] # first two rows and all the cols

  empid empname salary
1     1       A    100
2     2       B    200
3     3       C    300
     empid       empname              salary   
 Min.   :1.0   Length:3           Min.   :100  
 1st Qu.:1.5   Class :character   1st Qu.:150  
 Median :2.0   Mode  :character   Median :200  
 Mean   :2.0                      Mean   :200  
 3rd Qu.:2.5                      3rd Qu.:250  
 Max.   :3.0                      Max.   :300  
  empid empname salary dpet
1     1       A    100   IT
2     2       B    200  Civ
3     3       C    300  Com
  empid empname salary dpet
1     1       A    100   IT
2     2       B    200  Civ
3     3       C    300  Com
4     4       E    600   IT
5     5       F   3000  Com
  newdf.empid
1           1
2           2
3           3
4           4
5           5


  empid empname salary dpet
1     1       A    100   IT
2     2       B    200  Civ
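
Beyond selecting by position, rows are often selected by a condition on a column. The sketch below uses a small illustrative data frame named `staff` that mirrors the one above; it assumes R 4.0 or later, where strings stay as character columns by default.

```R
staff <- data.frame(                       # illustrative data
  empid   = c(1, 2, 3),
  empname = c("A", "B", "C"),
  salary  = c(100, 200, 300)
)

well_paid   <- staff[staff$salary > 150, ]              # rows where the condition is TRUE
top_earner  <- staff$empname[which.max(staff$salary)]   # name on the row with the highest salary
name_salary <- staff[, c("empname", "salary")]          # select columns by name

print(well_paid)
print(top_earner)
str(staff)   # compact view of column names and types
```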


### Decision Making Statements

#### if statement


* syntax: 
```R
if (boolean expression){
    statement
}
```


In [103]:
# if decision-making statement
a <- 1
b <- 2
if (a<b){
    print("nah man")
}

[1] "nah man"


#### if-else statement

* syntax:
```R
if (boolean expression){
    statement
}else {
    statement
}
```


In [104]:
# if else code block
a <- 1
b <- 2
if (a>b){
    print("nah man")
}else{
    print("yeah man")
}

[1] "yeah man"


#### else if

* syntax:
```R
if (boolean expression){
    statement
} else if (boolean expression 2){
    statement
} else{
    statement
}
```

In [105]:
# use of else if 

a <- 1
b <- 2
if (a<b){
    print("nah man")
} else if (a>b){
    print("yeah man")
} else{
    print("whatever")
}

[1] "nah man"
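
An `if` statement examines a single condition. When the same decision has to be made for every element of a vector, base R offers the vectorised `ifelse()` function. A brief sketch with illustrative marks and a hypothetical helper `grade_one`:

```R
marks <- c(35, 72, 50)   # illustrative values

# 'grade_one' is a hypothetical helper: if / else if / else handles one value at a time
grade_one <- function(m) {
  if (m >= 60) {
    "first"
  } else if (m >= 40) {
    "pass"
  } else {
    "fail"
  }
}

single <- grade_one(marks[2])
labels <- ifelse(marks >= 40, "pass", "fail")   # evaluates every element at once

print(single)
print(labels)
```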


### Iterative Loops

#### repeat

* executes the code block repeatedly until a stop condition (break) is reached
* syntax:
```R
statement
repeat{
    executing statement
    update expression
    stop condition
}
```


In [106]:
# use of repeat

i <- 3

repeat{
    print(i)
    i <- i+3

    if (i > 30){
        break
    }
}

[1] 3
[1] 6
[1] 9
[1] 12
[1] 15
[1] 18
[1] 21
[1] 24
[1] 27
[1] 30


#### while

* the body is executed as long as the condition holds true; no explicit stop condition (break) is required; useful when the number of iterations is not known in advance; for n iterations the condition is checked n+1 times
* syntax:
```R
statement
while (boolean expression){
    executing statement
    updating statement
}
```

In [115]:
# while loop

i <- 3
while (i<30){
    print(i)
    i <- i + 3
}

[1] 3
[1] 6
[1] 9
[1] 12
[1] 15
[1] 18
[1] 21
[1] 24
[1] 27


#### for

* iterates over the elements of an object such as a vector, list, matrix, or dataframe (a string counts as a single element); the number of iterations depends on the number of elements
* syntax:
```R
iterable object
for (iterator/counter/iterating variable in iterable object){
    statement
}
```

In [116]:
# for loop

v1 <- c(1,2,3,4,5,6,7,8,9)

for (i in v1){
    print(i)
}

[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
[1] 7
[1] 8
[1] 9
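
The loop above iterates over the values themselves. When the position of each element is needed as well, a common idiom is to loop over `seq_along()`. A small sketch with illustrative data:

```R
prices <- c(10, 20, 30)   # illustrative values

weighted_total <- 0
for (i in seq_along(prices)) {            # i takes the values 1, 2, 3
  weighted_total <- weighted_total + i * prices[i]
}
print(weighted_total)

# seq_along() is safe for empty vectors, whereas 1:length(x) would give c(1, 0)
empty <- numeric(0)
iterations <- 0
for (i in seq_along(empty)) {
  iterations <- iterations + 1
}
print(iterations)
```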


### Decision/Loop control statements

* Control structures in R: loops execute a block of code multiple times.
* Definition of looping: the term "looping" refers to cycling or iterating over code.
* Purpose of jump statements: break and next alter the flow of loops in R.
* Jump statements in R:
    * break statement: terminates the loop when a specific condition is met; the program then continues with the code after the loop.
    * next statement: skips the rest of the current iteration and continues with the next one; it plays the role of the "continue" statement in other languages.
* use case:
    * break: exit a loop early when a certain condition is met.
    * next: bypass specific cases within a loop without exiting it.
* Condition-based execution: jump statements are placed inside conditions to control the flow of execution in R programs.

* syntax:
```R
if (boolean expression){
    control/jump statements
}
```


In [117]:
# for and break

no <- 1:10

for (val in no) 
{ 
	if (val == 5) 
	{ 
		print(paste("Coming out from for loop Where i = ", val)) 
		break
	} 
	print(paste("Values are: ", val)) 
} 

# while and break
a<-1    
while (a < 10) 
{     
    print(a)     
    if(a==5)     
        break   
    a = a + 1    
}     

[1] "Values are:  1"
[1] "Values are:  2"
[1] "Values are:  3"
[1] "Values are:  4"
[1] "Coming out from for loop Where i =  5"
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5


In [118]:
# for and next
no <- 1:10
 
for (val in no) 
{ 
    if (val == 6) 
    { 
        print(paste("Skipping for loop Where i = ", val)) 
        next
    } 
    print(paste("Values are: ", val)) 
} 

# while and next
x <- 1
while(x < 5) 
{ 
    x <- x + 1; 
    if (x == 3) 
        next; 
    print(x); 
} 

[1] "Values are:  1"
[1] "Values are:  2"
[1] "Values are:  3"
[1] "Values are:  4"
[1] "Values are:  5"
[1] "Skipping for loop Where i =  6"
[1] "Values are:  7"
[1] "Values are:  8"
[1] "Values are:  9"
[1] "Values are:  10"
[1] 2
[1] 4
[1] 5
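
Both jump statements can appear in the same loop. The following sketch, with an illustrative accumulator named `total`, skips even numbers with `next` and leaves the loop with `break` once the value exceeds 7:

```R
total <- 0
for (n in 1:10) {
  if (n %% 2 == 0) {
    next          # skip even numbers
  }
  if (n > 7) {
    break         # stop at the first odd number above 7
  }
  total <- total + n
}
print(total)      # sums 1, 3, 5 and 7
```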


### Data Reshaping

* intro:
    * changing how data is organised into rows and columns
    * data processing in R usually takes a dataframe (df) as input, because extracting data from a df is much easier
    * issue: sometimes we need the df in a specific format, different from the one we received it in; hence we use data reshaping


![image.png](attachment:image.png)

In [1]:
# Transpose of a matrix or a df using t()

m1 <- matrix(c(4:12), nrow=3, byrow=TRUE)
print(m1)
cat("\nMatrix after transpose\n" )
m2 <- t(m1)
print(m2)

     [,1] [,2] [,3]
[1,]    4    5    6
[2,]    7    8    9
[3,]   10   11   12

Matrix after transpose
     [,1] [,2] [,3]
[1,]    4    7   10
[2,]    5    8   11
[3,]    6    9   12


In [4]:
# Joining rows and cols using rbind() and cbind()
# cbind() on plain vectors returns a matrix (all values coerced to one type), not a df;
# use data.frame() to keep the type of each column

# create vectors
v1 <- c(1,2,3,4)
v2 <- c("A","B","C", "D")
v3 <- c("yes", "no","yes", "no")
# use cbind() to combine these vectors column-wise (the result is a character matrix)
df1 <- cbind(v1,v2,v3)
cat("\nThe first df by joining rows and cols\n\n")
print(df1)

# another df using data.frame()
df2 <- data.frame(v1=c(5,6), 
v2=c("E","F"), v3=c("yes", "no"), 
stringsAsFactors = FALSE)
cat("\nThe second df by using data.frame() function\n\n")
print(df2)

# combine df1 and df2 row-wise using rbind(); because df2 is a data frame, the result is a data frame
cat("\nThe third df by combining prev two\n\n")
df3 <- rbind(df1, df2)
print(df3)



The first df by joining rows and cols

     v1  v2  v3   
[1,] "1" "A" "yes"
[2,] "2" "B" "no" 
[3,] "3" "C" "yes"
[4,] "4" "D" "no" 

The second df by using data.frame() function

  v1 v2  v3
1  5  E yes
2  6  F  no

The third df by combining prev two

  v1 v2  v3
1  1  A yes
2  2  B  no
3  3  C yes
4  4  D  no
5  5  E yes
6  6  F  no
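
The quoted values in the first output above show that `cbind()` on plain vectors produced a character matrix. A short sketch contrasting it with `data.frame()`, using the same two vectors (`as_matrix` and `as_frame` are illustrative names):

```R
v1 <- c(1, 2, 3, 4)
v2 <- c("A", "B", "C", "D")

as_matrix <- cbind(v1, v2)                 # one type for everything: numbers become strings
as_frame  <- data.frame(v1 = v1, v2 = v2)  # each column keeps its own type

print(is.matrix(as_matrix))
print(typeof(as_matrix))
print(is.numeric(as_frame$v1))
print(dim(as_frame))
```

If numeric columns need to stay numeric for later calculations, building the table with `data.frame()` from the start avoids an extra conversion step.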
